Generalized stacking of layerwise-trained Deep Convolutional Neural Networks for document image classification

Journal article

Saikat Roy, Arindam Das, U. Bhattacharya
International Conference on Pattern Recognition, 2016

Cite

APA
Roy, S., Das, A., & Bhattacharya, U. (2016). Generalized stacking of layerwise-trained Deep Convolutional Neural Networks for document image classification. International Conference on Pattern Recognition.

Chicago/Turabian
Roy, Saikat, Arindam Das, and U. Bhattacharya. “Generalized Stacking of Layerwise-Trained Deep Convolutional Neural Networks for Document Image Classification.” International Conference on Pattern Recognition (2016).

MLA
Roy, Saikat, et al. “Generalized Stacking of Layerwise-Trained Deep Convolutional Neural Networks for Document Image Classification.” International Conference on Pattern Recognition, 2016.

BibTeX

@article{saikat2016a,
  title = {Generalized stacking of layerwise-trained Deep Convolutional Neural Networks for document image classification},
  year = {2016},
  journal = {International Conference on Pattern Recognition},
  author = {Roy, Saikat and Das, Arindam and Bhattacharya, U.}
}

Abstract

This article presents our recent study of a lightweight Deep Convolutional Neural Network (DCNN) architecture for document image classification. We concentrate on training a committee of generalized, compact, and powerful base DCNNs, and use a support vector machine (SVM) to combine the outputs of the individual DCNNs. The main novelty of the study is the introduction of supervised layerwise training of the DCNN architecture in document classification tasks, for better initialization of the weights of the individual DCNNs. Each DCNN in the committee is trained on either a specific part of the document or the whole document. The normalized outputs of all committee members are then combined using the principle of generalized stacking. The proposed document classification strategy has been tested on the well-known Tobacco3482 document image dataset. Our experimental results show that the proposed strategy, despite using a considerably smaller network architecture, achieves classification accuracies comparable to those of state-of-the-art architectures, making it more suitable for mobile devices with comparatively low hardware configurations.
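
The stacking step described in the abstract can be illustrated with a minimal sketch. It shows only how the normalized outputs of several committee members might be concatenated into one meta-feature vector per document, which a meta-classifier such as an SVM would then be trained on. The abstract does not specify the normalization, the document regions, or the number of classes, so the softmax normalization, the member names (`whole_page`, `region_a`, `region_b`), and the scores below are all assumptions made for illustration, not the authors' implementation.

```python
import math
from typing import Dict, List, Sequence


def softmax(scores: Sequence[float]) -> List[float]:
    """Normalize raw class scores so they are positive and sum to 1.

    Assumption: the abstract only says outputs are "normalized";
    softmax is one common choice, used here for illustration.
    """
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def stack_features(committee_scores: Dict[str, Sequence[float]]) -> List[float]:
    """Concatenate the normalized outputs of every committee member.

    Keys are sorted so the feature layout is identical for every document.
    """
    features: List[float] = []
    for member in sorted(committee_scores):
        features.extend(softmax(committee_scores[member]))
    return features


# Hypothetical raw scores for one document from three committee members
# over three classes. Member names and numbers are illustrative only.
example_scores = {
    "whole_page": [2.0, 0.5, 0.1],
    "region_a": [1.2, 1.1, 0.3],
    "region_b": [0.2, 0.2, 0.2],
}
meta_vector = stack_features(example_scores)
print(len(meta_vector))  # 3 members x 3 classes
```

In a full pipeline, one such vector would be built per training document (ideally from held-out base-model predictions, as is usual in stacking) and passed with the document labels to the meta-classifier.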