3D Data-Augmentation Using tf.data and Volumentations-3D Library

Fakrul Islam Tushar
3 min readOct 5, 2020

Data augmentation has proven very useful for avoiding over-fitting and introducing variability while training deep neural networks. Almost all deep learning frameworks provide a ready-to-use data-augmentation pipeline (e.g., tf.keras.layers.experimental.preprocessing) for 2D data. Applying this type of augmentation to 3D medical imaging is more challenging owing to the 3D nature of the data (e.g., CT, MRI), and not every augmentation is useful for every task. In the past few years, we have seen a number of open-source platforms/libraries supporting 3D medical imaging classification/segmentation/analysis.
Here I am listing a few:

  1. NiftyNet is a TensorFlow-based open-source convolutional neural networks platform for research in medical image analysis and image-guided therapy.
  2. DLTK is an open-source library that makes deep learning on medical images easier, built on TensorFlow.
  3. MONAI framework is the open-source foundation being created by Project MONAI. MONAI is a freely available, community-supported, PyTorch-based framework for deep learning in healthcare imaging.
  4. MedicalTorch is an open-source framework for PyTorch, implementing an extensive set of loaders, pre-processors, and datasets for medical imaging.
  5. SimpleITK is an abstraction layer and wrapper around the Insight Segmentation and Registration Toolkit (ITK). It is available in the following programming languages: C++, Python, R, Java, C#, Lua, Tcl, and Ruby. [http://insightsoftwareconsortium.github.io/SimpleITK-Notebooks/Python_html/70_Data_Augmentation.html ]

You will also find useful augmentation content and functionality in these libraries. In this blog, I will introduce a library for 3D augmentations called volumentations-3D. This version is developed by @ZFTurbo and is a fork of volumentations, a 3D volume data-augmentation package by @ashawkey, initially inspired by the albumentations library for augmentation of 2D images.
I found this library very useful for my task of doing data augmentation on the fly with tfrecords and a tf.data pipeline in TF 2.x.

3D data augmentation using volumentations-3D and tf.data

volumentations-3D

You can install the library using pip:

pip install volumentations-3D

@ZFTurbo provides clean, easy-to-use examples in his GitHub repo showing how to use the library.

Here I will simply explain how to use this library for on-the-fly data augmentation with tfrecords and tf.data in TensorFlow 2.x.

Volumentations-3D with tf.data

You can use it in a few simple steps. Let's go through them step by step.

  1. First, set up the augmentations you want to apply:

2. All these augmentations happen in NumPy and plain Python functions. Since we intend to use tfrecords and tf.data, however, our inputs will be tensors. TF 2.x provides a wonderful function, tf.numpy_function(), that we can use to wrap such a function so that tensors can flow through NumPy operations.

So we will call the get_augmentation() function inside an augmentor() function, and wrap that with tf.numpy_function() inside the decoding function, which will later be used in the mapping.

3. Finally, we can build our data pipeline with tf.data to decode and map our tfrecords for training.
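The decoding and mapping might look like the sketch below. The feature names ('volume', 'label'), the raw-float32 encoding, the fixed volume shape, and the filename train.tfrecords are all assumptions about how the records were written; the tf.numpy_function-wrapped augmentor from step 2 would be applied with one more .map() call:

```python
import tensorflow as tf

# Assumed layout of each serialized example: a raw float32 volume plus an integer label.
DEPTH, HEIGHT, WIDTH = 64, 128, 128
FEATURE_SPEC = {
    'volume': tf.io.FixedLenFeature([], tf.string),
    'label': tf.io.FixedLenFeature([], tf.int64),
}

def parse_example(serialized):
    parsed = tf.io.parse_single_example(serialized, FEATURE_SPEC)
    volume = tf.io.decode_raw(parsed['volume'], tf.float32)
    volume = tf.reshape(volume, (DEPTH, HEIGHT, WIDTH))
    return volume, parsed['label']

AUTOTUNE = tf.data.experimental.AUTOTUNE
dataset = (
    tf.data.TFRecordDataset('train.tfrecords')  # hypothetical filename
    .map(parse_example, num_parallel_calls=AUTOTUNE)
    # .map(decode_and_augment, num_parallel_calls=AUTOTUNE)  # wrapped augmentor from step 2
    .shuffle(buffer_size=64)
    .batch(2)
    .prefetch(AUTOTUNE)
)
```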

For more details on best practices for tf.data, please see TensorFlow's wonderful guide, Better performance with the tf.data API.

As you can see, it is pretty simple to do augmentation with the tf.data API and a little effort using Volumentations-3D. I hope you find it useful.


Fakrul Islam Tushar

Ph.D. Student at Duke University | Research Assistant at Center for Virtual Imaging Trials (CVIT)