Industrialized Capsule Networks for Text Analytics
Multi-label text classification is the problem of associating multiple tags or categories with a given text or document. It occurs in numerous real-world scenarios, for instance in news categorization and in bioinformatics (the gene classification problem; see [Zafer Barutcuoglu et. al 2006]). A Kaggle data set representative of the problem is available at https://www.kaggle.com/jhoward/nb-svm-strong-linear-baseline/data.
Several other interesting problems in text analytics exist, such as abstractive summarization [Chen, Yen-Chun 2018], sentiment analysis, search and information retrieval, entity resolution, document categorization, document clustering, and machine translation. Deep learning has been applied to many of these problems. For instance, [Rie Johnson et. al 2015] gives an early approach to applying a convolutional network to make effective use of word order in text categorization. Recurrent Neural Networks (RNNs) have been effective in various text analytics tasks. Significant progress has been achieved in machine translation with an encoder-decoder approach in which the encoder is a neural network [Dzmitry Bahdanau et. al 2014].
However, as shown in [Dan Rosa de Jesus et. al 2018], certain cases require modelling the hierarchical relationships in text data, which is difficult to achieve with traditional deep learning networks because linguistic knowledge may have to be incorporated into these networks to achieve high accuracy. Moreover, conventional deep learning networks do not consider hierarchical relationships between local features, since the pooling operations of CNNs lose information about these relationships.
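The information loss under pooling can be seen in a toy case. The sketch below is illustrative only: the activation values and the helper name `global_max_pool` are assumptions made for this example and are not taken from the cited papers. Two feature maps that contain the same activations at different positions collapse to the same pooled value.

```python
def global_max_pool(feature_map):
    """Global max pooling: keep only the strongest activation."""
    return max(feature_map)


# Hypothetical activations of one n-gram detector at four text positions.
detector_fires_early = [0.9, 0.1, 0.2, 0.1]
detector_fires_late = [0.1, 0.2, 0.1, 0.9]

pooled_early = global_max_pool(detector_fires_early)
pooled_late = global_max_pool(detector_fires_late)

# Both inputs pool to the same value, so layers after the pooling step
# cannot tell where in the text the feature occurred.
print(pooled_early == pooled_late)
```

Capsule layers are motivated by this limitation: instead of discarding position through pooling, they pass vector-valued outputs to the next layer.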
We show one industrial-scale use case of capsule networks that we have implemented for our client in the realm of text analytics: news categorization. We explain how traditional deep learning methods may not be useful when only single-label data is available for training (as in many real-life cases) while the test data set is multi-labelled; this is the sweet spot for capsule networks. We also discuss the key challenges faced in the industrialization of capsule networks. Starting from a scalable implementation of capsule networks in TensorFlow, we show how capsule networks can be industrialized with an implementation on top of KubeFlow, which helps in productionization.
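One way to see why capsule outputs suit multi-label prediction is that each class capsule reports its own length in [0, 1), so several classes can be active at once. The sketch below is a minimal pure-Python illustration, not the implementation presented in the talk. It uses the squash nonlinearity from the original CapsNet formulation; the label names, the input vectors, the 0.5 threshold, and the helper names `capsule_length` and `predict_labels` are assumptions chosen for this example.

```python
import math


def squash(s):
    """Squash nonlinearity: keep the direction of s, map its length into [0, 1)."""
    sq_norm = sum(x * x for x in s)
    if sq_norm == 0.0:
        return [0.0 for _ in s]
    # v = (|s|^2 / (1 + |s|^2)) * (s / |s|)
    scale = sq_norm / (1.0 + sq_norm) / math.sqrt(sq_norm)
    return [scale * x for x in s]


def capsule_length(v):
    """Euclidean length of a capsule output vector."""
    return math.sqrt(sum(x * x for x in v))


def predict_labels(class_capsule_inputs, threshold=0.5):
    """Return every label whose squashed capsule length exceeds the threshold.

    Each class is scored independently, so zero, one, or many labels
    may be returned. The threshold value is an assumption.
    """
    return [
        label
        for label, s in class_capsule_inputs.items()
        if capsule_length(squash(s)) > threshold
    ]


# Hypothetical pre-squash vectors for three news categories.
raw_class_capsules = {
    "politics": [2.0, 1.0, 0.5],
    "sports": [0.1, 0.2, 0.0],
    "business": [1.5, -1.0, 0.5],
}

predicted = predict_labels(raw_class_capsules)
print(predicted)
```

By contrast, a softmax output layer makes the classes compete for a single unit of probability mass, which fits single-label prediction but not the multi-label test setting described above.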
1. History of the impact of machine learning and deep learning on NLP.
2. Motivation for capsule networks and how they can be used in text analytics.
3. Implementation of capsule networks in TensorFlow.
4. Industrialization of capsule nets with KubeFlow.
[Zafer Barutcuoglu et. al 2006] Zafer Barutcuoglu, Robert E. Schapire, and Olga G. Troyanskaya. 2006. Hierarchical multi-label prediction of gene function. Bioinformatics 22, 7 (April 2006), 830-836. DOI=http://dx.doi.org/10.1093/bioinformatics/btk048
[Rie Johnson et. al 2015] Rie Johnson, Tong Zhang: Effective Use of Word Order for Text Categorization with Convolutional Neural Networks. HLT-NAACL 2015: 103-112.
[Dzmitry Bahdanau et. al 2014] Bahdanau, Dzmitry et al. “Neural Machine Translation by Jointly Learning to Align and Translate.” CoRR abs/1409.0473 (2014).
[Dan Rosa de Jesus et. al 2018] Dan Rosa de Jesus, Julian Cuevas, Wilson Rivera, Silvia Crivelli (2018). “Capsule Networks for Protein Structure Classification and Prediction”, available at https://arxiv.org/abs/1808.07475.
[Yequan Wang et. al 2018] Yequan Wang, Aixin Sun, Jialong Han, Ying Liu, and Xiaoyan Zhu. 2018. Sentiment Analysis by Capsules. In Proceedings of the 2018 World Wide Web Conference (WWW '18). International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, Switzerland, 1165-1174. DOI: https://doi.org/10.1145/3178876.3186015
[Chen, Yen-Chun 2018] Chen, Yen-Chun and Bansal, Mohit (2018), “Fast Abstractive Summarization with Reinforce-Selected Sentence Rewriting”, eprint arXiv:1805.11080.
Outline/Structure of the Talk
5. Capsule Networks
6. Reference to Hinton’s work [CapsNet for Vision]
7. What is a Capsule?
8. Capsule Network for Text
N-Gram Convolutional Layer
Primary Capsule Layer
Conv. Capsule Layer
Fully Connected Layer
9. Code Snippet
Implementation of Each Layer in TensorFlow
All Code and Results on Dataset (Text Classification)
11. Industrialization Attempt using Kubeflow
12. Challenges in Industrialization
13. Capsule for Text Summarization
Basics of capsule networks and how they can be applied to text analytics as well as basics of NLP.
CxOs, data scientists, data engineers, software engineers, architects
Prerequisites for Attendees
Basics of deep learning.