Manually implementing your own data loading functions is hard work and can result in bugs. The ImageDataGenerator class, while a perfectly fine option, wasn’t the fastest method either.

The TensorFlow v2 API has gone through a number of changes, and arguably one of the most important is the introduction of the tf.data module. The tf.data API enables you to build complex input pipelines from simple, reusable pieces. For example, the pipeline for an image model might aggregate data from files in a distributed file system, apply random perturbations to each image, and merge randomly selected images into a batch for training. The pipeline for a text model might involve extracting symbols from raw text data, converting them to embedding identifiers with a lookup table, and batching together sequences of different lengths. The tf.data API makes it possible to handle large amounts of data, read from different data formats, and perform complex transformations.

Working with data is now significantly easier using tf.data, and as we’ll see, it’s also far faster and more efficient than relying on the old ImageDataGenerator class.

Is tf.data more efficient for building data pipelines?

Figure 2: The “tf.data” module is significantly faster than the “ImageDataGenerator” class due to an optimized producer/consumer relationship.

The short answer is yes: using tf.data is significantly faster and more efficient than using ImageDataGenerator. As the results of this tutorial will show you, we’re able to obtain a ≈6.1x speedup when working with in-memory datasets and a ≈38x increase in efficiency when working with image data residing on disk.

The “secret sauce” to tf.data lies in TensorFlow’s multi-threading/multi-processing implementation, and more specifically, the concept of “autotuning.” Training a neural network, especially on a large dataset, is nothing more than a producer/consumer relationship: the data pipeline produces batches, and the training loop consumes them.
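To make the producer/consumer idea concrete, here is a minimal sketch in plain Python, using only the standard library. It is not how tf.data is implemented; it only illustrates the general pattern of a background thread filling a bounded buffer while the training loop drains it. The names `prefetch`, `load_batches`, and `train` are hypothetical, and the `time.sleep` calls are stand-ins for real disk reads and training steps.

```python
import queue
import threading
import time

_SENTINEL = object()  # marks the end of the stream


def prefetch(source, buffer_size=2):
    """Yield items from `source`, loading up to `buffer_size` items ahead
    of the consumer in a background (producer) thread."""
    buf = queue.Queue(maxsize=buffer_size)

    def producer():
        try:
            for item in source:
                buf.put(item)  # blocks while the buffer is full
        finally:
            buf.put(_SENTINEL)  # always signal completion to the consumer

    # Daemon thread: it will not keep the process alive if the consumer stops early.
    threading.Thread(target=producer, daemon=True).start()

    while True:
        item = buf.get()  # blocks while the buffer is empty
        if item is _SENTINEL:
            return
        yield item


def load_batches(n, load_time=0.01):
    """Hypothetical data source: pretends each batch takes `load_time` seconds to load."""
    for i in range(n):
        time.sleep(load_time)  # stand-in for reading and preprocessing
        yield i


def train(batches, step_time=0.01):
    """Hypothetical training loop: pretends each step takes `step_time` seconds."""
    seen = []
    for batch in batches:
        time.sleep(step_time)  # stand-in for a forward/backward pass
        seen.append(batch)
    return seen


if __name__ == "__main__":
    start = time.perf_counter()
    train(load_batches(20))
    print(f"sequential: {time.perf_counter() - start:.2f}s")

    start = time.perf_counter()
    train(prefetch(load_batches(20)))
    print(f"prefetched: {time.perf_counter() - start:.2f}s")
```

Without the buffer, the consumer waits for each batch to load before it can start a step. With it, loading the next batch overlaps with the current step. In tf.data the buffer size and degree of parallelism can be chosen automatically, which is what “autotuning” refers to; in this sketch `buffer_size` is fixed by hand.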