Multi-label text classification is the problem of assigning multiple tags or categories to a given text or document. It occurs in numerous real-world scenarios, for instance in news categorization and in bioinformatics (the gene classification problem, see [Zafer Barutcuoglu et al. 2006]). This Kaggle data set is representative of the problem: https://www.kaggle.com/jhoward/nb-svm-strong-linear-baseline/data.
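As a minimal illustration (the tags and scores below are hypothetical, not taken from the Kaggle data set), multi-label classification is typically framed with one independent sigmoid per tag, so that several tags can fire for the same document, unlike a softmax single-label setup:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Hypothetical per-tag logits produced by some classifier for one news article.
tags = ["politics", "economy", "sports", "technology"]
logits = np.array([2.1, 0.8, -3.0, 1.5])

# Independent sigmoids: each tag is an independent yes/no decision,
# so multiple tags can be assigned at once.
probs = sigmoid(logits)
predicted = [t for t, p in zip(tags, probs) if p > 0.5]
print(predicted)  # ['politics', 'economy', 'technology']
```

With a softmax over the same logits, only the single highest-scoring tag would be selected; the sigmoid formulation is what makes the problem multi-label.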

Several other interesting problems in text analytics exist, such as abstractive summarization [Chen, Yen-Chun 2018], sentiment analysis [Yequan Wang et al. 2018], search and information retrieval, entity resolution, document categorization, document clustering, and machine translation. Deep learning has been applied to many of these problems; for instance, [Rie Johnson et al. 2015] gives an early approach to applying a convolutional network to make effective use of word order in text categorization. Recurrent Neural Networks (RNNs) have also been effective in various text analytics tasks. Significant progress has been achieved in language translation by modelling machine translation with an encoder-decoder approach, where the encoder is formed by a neural network [Dzmitry Bahdanau et al. 2014].

However, as shown in [Dan Rosa de Jesus et al. 2018], certain cases require modelling hierarchical relationships in text data, which is difficult to achieve with traditional deep learning networks because linguistic knowledge may have to be incorporated into these networks to achieve high accuracy. Moreover, such networks do not consider hierarchical relationships between local features, as the pooling operation of CNNs loses information about these relationships.
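The pooling argument can be made concrete with a toy example (synthetic feature maps, not from any real network): max pooling maps differently arranged inputs to the same output, discarding where within each window a feature occurred:

```python
import numpy as np

def max_pool_2x2(x):
    """Max-pool a 4x4 feature map with 2x2 windows, stride 2."""
    return x.reshape(2, 2, 2, 2).max(axis=(1, 3))

a = np.zeros((4, 4))
a[0, 0] = 1.0  # feature at the top-left corner of the first window
b = np.zeros((4, 4))
b[1, 1] = 1.0  # same window, different position

# Both inputs pool to the same 2x2 map: positional information is gone.
print(np.array_equal(max_pool_2x2(a), max_pool_2x2(b)))  # True
```

This positional blindness is exactly what capsule networks try to address by replacing pooling with routing between vector-valued capsules.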

We show one industrial-scale use case of capsule networks that we have implemented for a client in the realm of text analytics: news categorization. We explain how traditional deep learning methods may fall short when only single-label data is available for training (as in many real-life cases) while the test data is multi-labelled; this is the sweet spot for capsule networks. We also discuss the key challenges faced in industrializing capsule networks: starting from a scalable implementation of capsule networks in TensorFlow, we show how they can be industrialized with an implementation on top of Kubeflow, which helps in productionization. The talk covers:

1. History of impact of machine learning and deep learning on NLP.

2. Motivation for capsule networks and how they can be used in text analytics.

3. Implementation of capsule networks in TensorFlow.

4. Industrialization of capsule nets with KubeFlow.

References:

[Zafer Barutcuoglu et al. 2006] Zafer Barutcuoglu, Robert E. Schapire, and Olga G. Troyanskaya. 2006. Hierarchical multi-label prediction of gene function. Bioinformatics 22, 7 (April 2006), 830-836. DOI: http://dx.doi.org/10.1093/bioinformatics/btk048

[Rie Johnson et al. 2015] Rie Johnson and Tong Zhang. 2015. Effective Use of Word Order for Text Categorization with Convolutional Neural Networks. HLT-NAACL 2015: 103-112.

[Dzmitry Bahdanau et al. 2014] Dzmitry Bahdanau, Kyunghyun Cho, and Yoshua Bengio. 2014. "Neural Machine Translation by Jointly Learning to Align and Translate." CoRR abs/1409.0473.

[Dan Rosa de Jesus et al. 2018] Dan Rosa de Jesus, Julian Cuevas, Wilson Rivera, and Silvia Crivelli. 2018. "Capsule Networks for Protein Structure Classification and Prediction." Available at https://arxiv.org/abs/1808.07475.

[Yequan Wang et al. 2018] Yequan Wang, Aixin Sun, Jialong Han, Ying Liu, and Xiaoyan Zhu. 2018. Sentiment Analysis by Capsules. In Proceedings of the 2018 World Wide Web Conference (WWW '18). International World Wide Web Conferences Steering Committee, Republic and Canton of Geneva, Switzerland, 1165-1174. DOI: https://doi.org/10.1145/3178876.3186015

[Chen, Yen-Chun 2018] Chen, Yen-Chun and Bansal, Mohit. 2018. "Fast Abstractive Summarization with Reinforce-Selected Sentence Rewriting." arXiv:1805.11080.

Outline/Structure of the Talk

1. Text Analytics & NLP and why they are important (Why text analytics?)
2. Applications
3. The traditional way (attempts so far at word understanding)
• Bag-of-words (BOW) model
• Challenges of the BOW model
• Distributed representations (word embeddings)
• Challenges still left (the missing "what": how to learn high-level constructs)
4. CNN- and LSTM-based deep learning techniques
• Challenges:
1. Little spatial invariance
2. More variation requires more labelled data
3. Not learning deep concepts with less data
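To make the bag-of-words limitation in item 3 concrete (toy vocabulary, illustrative only): two sentences with opposite meanings map to exactly the same count vector, because word order is discarded:

```python
from collections import Counter

def bow(sentence, vocab):
    """Represent a sentence as a vector of word counts over a fixed vocabulary."""
    counts = Counter(sentence.split())
    return [counts[w] for w in vocab]

vocab = ["dog", "bites", "man"]
v1 = bow("dog bites man", vocab)
v2 = bow("man bites dog", vocab)
print(v1 == v2)  # True: order information is lost
```

This is one motivation for the word-order-aware convolutional approaches of item 4, and ultimately for capsule networks.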

5. Capsule Networks

•Promise of learning deep concepts with less data

6. Reference to Hinton's work (CapsNets for vision)

• How computer vision is affected by the limitations of CNNs, especially pooling
• Invariance vs. equivariance

7. What is a capsule?

• Encoding object representations
• Dynamic routing
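As a hedged NumPy sketch of the two bullets above (dimensions are illustrative; the procedure follows the routing-by-agreement algorithm of Sabour et al., which the talk builds on): each capsule is a vector whose length encodes the probability that an entity is present, and dynamic routing iteratively sends each input capsule's prediction to the output capsule that agrees with it most:

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    """Shrink vector s so its norm lies in [0, 1) while preserving direction."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def dynamic_routing(u_hat, iterations=3):
    """u_hat: prediction vectors of shape (num_in, num_out, dim_out)."""
    num_in, num_out, _ = u_hat.shape
    b = np.zeros((num_in, num_out))               # routing logits
    for _ in range(iterations):
        c = softmax(b, axis=1)                    # coupling coefficients
        s = (c[..., None] * u_hat).sum(axis=0)    # weighted sum per output capsule
        v = squash(s)                             # output capsule vectors
        b = b + (u_hat * v[None]).sum(axis=-1)    # agreement update
    return v

rng = np.random.default_rng(0)
v = dynamic_routing(rng.normal(size=(6, 3, 8)))   # 6 input caps -> 3 output caps
print(v.shape)  # (3, 8); each row has norm < 1 thanks to squash
```

The squash non-linearity is what lets a capsule's length be read as a probability, which in turn is what makes multi-label prediction natural: several output capsules can have long vectors at once.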

8. Capsule Network for Text

• Architecture
• Explanation of each component

N-Gram Convolutional Layer

Primary Capsule Layer

Conv. Capsule Layer

Fully Connected Layer
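As an illustrative sketch of the first of these components (toy dimensions, plain NumPy rather than the talk's TensorFlow code): the N-gram convolutional layer slides a window of N word embeddings over the sequence and applies each filter to the window, i.e. a 1-D convolution over the token axis:

```python
import numpy as np

def ngram_conv(embeddings, filters, bias):
    """embeddings: (seq_len, emb_dim); filters: (num_filters, n, emb_dim)."""
    seq_len, emb_dim = embeddings.shape
    num_filters, n, _ = filters.shape
    out = np.zeros((seq_len - n + 1, num_filters))
    for i in range(seq_len - n + 1):
        window = embeddings[i:i + n]  # one n-gram of word embeddings
        # Dot each filter with the window (sum over n-gram and embedding axes).
        out[i] = np.tensordot(filters, window, axes=([1, 2], [0, 1])) + bias
    return np.maximum(out, 0.0)       # ReLU

rng = np.random.default_rng(1)
x = rng.normal(size=(10, 16))         # 10 tokens, 16-dim embeddings
w = rng.normal(size=(32, 3, 16))      # 32 trigram (n=3) filters
feats = ngram_conv(x, w, bias=np.zeros(32))
print(feats.shape)  # (8, 32): one 32-dim feature vector per trigram position
```

These per-position feature vectors are what the primary capsule layer then groups into capsule vectors.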

9. Code Snippet

Implementation of each layer in TensorFlow
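As one hedged example of such a layer (NumPy instead of TensorFlow for brevity; all shapes are illustrative, not the talk's actual configuration), the fully connected capsule layer multiplies each input capsule by a learned transformation matrix to produce its prediction, or "vote", for every output capsule; these votes are what dynamic routing then combines:

```python
import numpy as np

def capsule_predictions(u, W):
    """u: (num_in, dim_in) input capsules.
    W: (num_in, num_out, dim_out, dim_in) learned transformation matrices.
    Returns u_hat: (num_in, num_out, dim_out), each input capsule's vote
    for each output capsule."""
    return np.einsum('ijok,ik->ijo', W, u)

rng = np.random.default_rng(3)
u = rng.normal(size=(32, 8))          # 32 primary capsules of dimension 8
W = rng.normal(size=(32, 4, 16, 8))   # map to 4 output capsules of dimension 16
u_hat = capsule_predictions(u, W)
print(u_hat.shape)  # (32, 4, 16)
```

The same einsum pattern translates directly to `tf.einsum` in a TensorFlow implementation, with `W` as a trainable variable.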

10. Putting it all together

All code and results on a dataset (text classification)

11. Industrialization Attempt using Kubeflow

• Kubeflow: brief intro
• Comparison of results on Kubeflow (CPU vs. GPU)
• Hyper-parameter optimization through Katib

12. Challenges in Industrialization

13. Capsules for Text Summarization

• Architecture
• Results

Learning Outcome

The basics of capsule networks and how they can be applied to text analytics, as well as the basics of NLP.

Target Audience

CxOs, data scientists, data engineers, software engineers, architects

Prerequisites for Attendees

Basics of deep learning.
