Register for our upcoming hands-on workshop on LangChain | June 17th | Virtual

Information Preserving Frame-based Image Interpretation

Author(s): Vasudeva Kilaru


Interpreting image data to neural networks is challenging. Deep convolutional neural network methods have shown promising results in interpreting image data to neural networks, with convolution and pooling operations over the traditional fully connected dense layers. The performance of these deep convolutional methods, however, is often compromised by the constraint that the convolution and pooling operations interpret the image data to neural networks by compressing the data into lower dimensions, leading to an information bottleneck. To mitigate this, neural networks tend to be larger with more parameters (filters), thus increasing the computational cost (GPU Resources). In this paper, I present an information-preserving way to interpret image data to a convolutional neural net without any information bottleneck and with relatively fewer parameters that also ensure no loss in information due to convolution or pooling operations. My method uniquely adds two additional operations, one over the first regular convolution operation and the other next to the deeper convolution operation in the neural net. The first operation is to divide the image into individual frames by frame-based crop operation and then apply regular convolution, pooling operations to interpret individual frames of the image into low dimensional tensors that preserve information from being bottlenecked since operations use frames of the image instead of using total image data. The second operation is a convolution operation applied after joining all low dimensional tenors of individual frames to extract information that later passes through the rest of the layers. This idea of using frames can better help model tasks like image generation and completion as well. Using frames of data instead of the total image helps in parallelizing computations, thereby drastically decreasing the computational cost and depth of a neural network. Experiments conducted on inception networks worked better with relatively small network architecture on vision tasks.

Deep Learning DevCon 2023

The Flagship conference of Association of Data Scientists.
May 27 | Virtual

The Chartered Data Scientist Designation

Achieve the highest distinction in the data science profession.

Explore more from Association of Data Scientists