Histogram Layers for Texture Analysis

Problem Statement: Shortcomings of Convolutional Neural Networks

Convolutional neural networks (CNNs) have been vital for a variety of applications. Despite their success, these models capture structural textures well but struggle with statistical textures. To illustrate this point, consider a set of images that combine different structural and statistical textures, where the differences between the texture combinations are visually distinct. The structural textures are a checkerboard, a cross, and a stripe, while the statistical textures are pixels sampled from multinomial, binomial, and constant distributions. A CNN could easily distinguish the structural textures but would struggle with the statistical textures.

Why would a CNN struggle with statistical textures?

Structural texture approaches consist of defining a set of texture exemplars and an order of spatial positions for each exemplar (Materka et al., 1998). Convolution is a weighted sum operator that uses spatial information to learn local relationships between pixels. For the statistical textures in this example, given enough samples from each distribution, the mean values are approximately the same. The average operation is a special case of convolution in which all of the weights are equal to 1/N, where N is the number of data points. As a result, a CNN will struggle to find a linear combination of pixels that captures the statistical information of the data (i.e., it cannot learn weights that discriminate between statistical exemplars).

For example, with a 3 by 3 convolution, the model can easily learn weights that tell the cross and the checkerboard apart. However, if we sample the pixels from a different distribution and retain the same shape, a convolution cannot learn weights that distinguish this change, because it is unable to account for changes in individual pixel intensities. To capture statistical textures, instead of modeling the structure of each texture, the data can be represented through parameters that characterize the distributions of, and correlations between, the intensity and/or feature values in an image (Humeau-Heurtier, 2019).
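
To make the argument above concrete, here is a minimal Python sketch. It draws pixels from three hypothetical distributions that share a mean of 2: a constant value, a two-valued distribution over {1, 3}, and a three-valued distribution over {1, 2, 3}. The specific values, the sample size, and the helper names are illustrative assumptions, not taken from the original experiments. An averaging filter (a convolution with equal weights) responds almost identically to all three, while a normalized histogram separates them.

```python
import random
from collections import Counter


def sample_pixels(values, n, rng):
    """Draw n pixel intensities uniformly from the given set of values."""
    return [rng.choice(values) for _ in range(n)]


def mean_filter(pixels):
    """Convolution with equal weights 1/N, i.e. the plain average."""
    weight = 1.0 / len(pixels)
    return sum(weight * p for p in pixels)


def histogram(pixels, bins=(1, 2, 3)):
    """Fraction of pixels that fall on each integer intensity in `bins`."""
    counts = Counter(pixels)
    return [counts[b] / len(pixels) for b in bins]


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 10_000

# Hypothetical statistical textures, all with mean intensity 2.
textures = {
    "constant": sample_pixels([2], n, rng),
    "binomial": sample_pixels([1, 3], n, rng),
    "multinomial": sample_pixels([1, 2, 3], n, rng),
}

means = {name: mean_filter(px) for name, px in textures.items()}
hists = {name: histogram(px) for name, px in textures.items()}

for name in textures:
    print(f"{name:12s} mean={means[name]:.3f} histogram={hists[name]}")
```

The three means land close to 2, so the averaging filter cannot tell the textures apart, whereas the histograms differ clearly in how the mass is spread over the intensities 1, 2, and 3.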

Method: Histogram Layer

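As a preview of the idea this section describes, the following minimal Python sketch compares hard counting in local windows with a smooth radial basis function (RBF) vote. Several details are assumptions made for illustration only: the Gaussian-shaped vote `exp(-((x - center) / width) ** 2)`, the non-overlapping 2 by 2 windows, the toy image, and the function name `local_histogram`. It is not the authors' implementation.

```python
import math


def local_histogram(image, centers, width, window=2, soft=True):
    """Normalized histogram for each non-overlapping window x window patch.

    soft=True  : each pixel casts a smooth RBF vote for every bin center.
    soft=False : each pixel is counted in the bin whose center is within 0.5.
    """
    rows, cols = len(image), len(image[0])
    output = []
    for r in range(0, rows - window + 1, window):
        output_row = []
        for c in range(0, cols - window + 1, window):
            patch = [image[r + i][c + j]
                     for i in range(window) for j in range(window)]
            bins = []
            for center in centers:
                if soft:
                    # Smooth in x, center, and width, so gradients exist.
                    votes = [math.exp(-((x - center) / width) ** 2)
                             for x in patch]
                else:
                    # Hard count: a step function, not differentiable.
                    votes = [1.0 if abs(x - center) < 0.5 else 0.0
                             for x in patch]
                bins.append(sum(votes) / len(patch))
            output_row.append(bins)
        output.append(output_row)
    return output


# Toy 4 x 4 image containing only the intensities 1, 2, and 3.
image = [
    [1, 1, 2, 3],
    [1, 2, 3, 3],
    [2, 2, 1, 1],
    [3, 3, 1, 2],
]
centers = [1.0, 2.0, 3.0]  # hypothetical bin centers
width = 0.25               # hypothetical bin width

hard = local_histogram(image, centers, width, soft=False)
soft = local_histogram(image, centers, width, soft=True)

print("hard, top-left window:", hard[0][0])
print("soft, top-left window:", [round(v, 4) for v in soft[0][0]])
```

With a narrow width, the soft votes nearly reproduce the hard counts, but they vary smoothly with the centers and the width, which is what would allow those parameters to be estimated by backpropagation. In a deep learning framework, one plausible mapping onto existing layers is an elementwise RBF followed by average pooling over the local window.
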
The proposed solution is a local histogram layer. Instead of computing global histograms as done previously, the proposed histogram layer directly computes the local, spatial distribution of features for texture analysis, and the parameters of the layer are estimated during backpropagation. Histograms perform a counting operation for values that fall within a certain range, for example counting the number of 1s, 2s, and 3s in local windows of an image. The standard histogram operation is not differentiable; however, a smooth approximation (i.e., a radial basis function) can be used instead.

Another advantage of the proposed method is that the histogram layer is easily implemented using pre-existing layers! Any deep learning framework (e.g., PyTorch, TensorFlow) can be used to integrate the histogram layer into deep learning models.

Applications of Histogram Layer

There are several real-world applications for the histogram layer! Examples include tasks in the health domain and in remote sensing, such as disease detection and crop quality management.

Check Out the Code and Paper!

This work was accepted to the IEEE Transactions on Artificial Intelligence! Our code and paper are available!