Papers

This chapter collects papers on natural language processing (NLP) using deep learning.

Data Representation

One-Hot Representation

  • Effective Use of Word Order for Text Categorization with Convolutional Neural Networks : Exploiting the 1D structure (namely, word order) of text data for prediction (a toy one-hot encoding sketch follows this list). [Paper link, Code implementation]

    ../_images/progress-overall-60.png
  • Neural Responding Machine for Short-Text Conversation : The Neural Responding Machine is proposed to generate content-wise appropriate responses to input text. [Paper link, Paper summary]

    ../_images/progress-overall-60.png
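
For readers new to the representation itself, here is a minimal, illustrative sketch (not taken from either paper above) of how a one-hot representation maps each token to a sparse indicator vector over the vocabulary, using only NumPy:

    import numpy as np

    def one_hot_encode(tokens, vocab):
        """Map each token to a |vocab|-dimensional indicator vector."""
        index = {word: i for i, word in enumerate(vocab)}
        matrix = np.zeros((len(tokens), len(vocab)), dtype=np.float32)
        for row, token in enumerate(tokens):
            matrix[row, index[token]] = 1.0
        return matrix

    vocab = ["the", "cat", "sat", "on", "mat"]
    print(one_hot_encode(["the", "cat", "sat"], vocab))

In practice these sparse vectors are rarely used directly; they usually feed an embedding lookup, as in the models of the following sections.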

Continuous Bag of Words (CBOW)

  • Distributed Representations of Words and Phrases and their Compositionality : Not specifically about CBOW, but the techniques presented in this paper can be used for training the continuous bag-of-words model (a toy CBOW sketch follows this list). [Paper link, Code implementation 1, Code implementation 2]

    ../_images/progress-overall-100.png
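
As a rough, illustrative sketch of the continuous bag-of-words idea (predicting a center word from the average of its context embeddings), assuming PyTorch is available; this is not the implementation from the paper above:

    import torch
    import torch.nn as nn

    class CBOW(nn.Module):
        """Predict a center word from the average of its context embeddings."""
        def __init__(self, vocab_size, embed_dim):
            super().__init__()
            self.embeddings = nn.Embedding(vocab_size, embed_dim)
            self.output = nn.Linear(embed_dim, vocab_size)

        def forward(self, context_ids):           # context_ids: (batch, window)
            context_vectors = self.embeddings(context_ids).mean(dim=1)
            return self.output(context_vectors)   # logits over the vocabulary

    model = CBOW(vocab_size=5000, embed_dim=100)
    logits = model(torch.randint(0, 5000, (8, 4)))  # 8 examples, 4 context words each
    print(logits.shape)                             # torch.Size([8, 5000])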

Word-Level Embedding

  • Efficient Estimation of Word Representations in Vector Space : Two novel model architectures, CBOW and skip-gram, for computing continuous vector representations of words (a gensim training sketch follows this list). [Paper link, Official code implementation]

    ../_images/progress-overall-100.png
  • GloVe: Global Vectors for Word Representation : Combines the advantages of the two major model families, global matrix factorization and local context-window methods, and efficiently leverages the statistical information of the corpus. [Paper link, Official code implementation]

    ../_images/progress-overall-100.png
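
Word-level embeddings of this kind are commonly trained with off-the-shelf tooling; below is a minimal usage sketch with the gensim library (assuming gensim 4.x, where the parameter is vector_size; older releases call it size):

    from gensim.models import Word2Vec

    # Toy corpus: each sentence is a list of tokens.
    sentences = [["the", "cat", "sat", "on", "the", "mat"],
                 ["the", "dog", "sat", "on", "the", "rug"]]

    # sg=1 selects the skip-gram model; sg=0 would select CBOW.
    model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1, epochs=50)

    print(model.wv["cat"].shape)         # (50,)
    print(model.wv.most_similar("cat"))  # nearest neighbours in the embedding space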

Character-Level Embedding

  • Learning Character-level Representations for Part-of-Speech Tagging : CNNs are successfully used for learning character-level embeddings (a toy char-CNN sketch follows this list). [Paper link]

    ../_images/progress-overall-60.png
  • Deep Convolutional Neural Networks for Sentiment Analysis of Short Texts : A new deep convolutional neural network is proposed that exploits character- to sentence-level information for sentiment analysis of short texts. [Paper link]

    ../_images/progress-overall-80.png
  • Finding Function in Form: Compositional Character Models for Open Vocabulary Word Representation : Two LSTMs operate over the characters of a word to generate its embedding. [Paper link]

    ../_images/progress-overall-60.png
  • Improved Transition-Based Parsing by Modeling Characters instead of Words with LSTMs : Demonstrates the effectiveness of modeling characters instead of words for dependency parsing. [Paper link]

    ../_images/progress-overall-40.png
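
A minimal sketch of the char-CNN pattern shared by several of the papers above (convolve over character embeddings, then max-pool over character positions to obtain a word vector), assuming PyTorch; filter widths, multiple filter sizes, and any highway layers vary by paper:

    import torch
    import torch.nn as nn

    class CharCNNEmbedding(nn.Module):
        """Build a word vector by convolving over its character embeddings
        and max-pooling over character positions."""
        def __init__(self, num_chars, char_dim=25, num_filters=50, kernel_size=3):
            super().__init__()
            self.char_embeddings = nn.Embedding(num_chars, char_dim)
            self.conv = nn.Conv1d(char_dim, num_filters, kernel_size, padding=1)

        def forward(self, char_ids):            # char_ids: (batch_of_words, max_word_len)
            x = self.char_embeddings(char_ids)  # (batch, len, char_dim)
            x = self.conv(x.transpose(1, 2))    # (batch, num_filters, len)
            return x.max(dim=2).values          # (batch, num_filters)

    layer = CharCNNEmbedding(num_chars=128)
    word_vectors = layer(torch.randint(0, 128, (32, 12)))  # 32 words, 12 characters each
    print(word_vectors.shape)                               # torch.Size([32, 50])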

Applications

Part-of-Speech Tagging

  • Learning Character-level Representations for Part-of-Speech Tagging : A deep neural network (DNN) architecture that joins word-level and character-level representations to perform POS tagging. [Paper]

    ../_images/progress-overall-100.png
  • Bidirectional LSTM-CRF Models for Sequence Tagging : A variety of neural-network-based models are proposed for sequence tagging tasks (a minimal BiLSTM tagger sketch follows this list). [Paper, Code Implementation 1, Code Implementation 2]

    ../_images/progress-overall-80.png
  • Globally Normalized Transition-Based Neural Networks : A globally normalized, transition-based neural network model applied to part-of-speech tagging. [Paper]

    ../_images/progress-overall-80.png
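
A minimal BiLSTM tagger sketch in PyTorch, illustrating the common backbone of the models above (the CRF layer and character-level features used in some of the papers are omitted):

    import torch
    import torch.nn as nn

    class BiLSTMTagger(nn.Module):
        """Score one tag per token with a bidirectional LSTM over word embeddings."""
        def __init__(self, vocab_size, num_tags, embed_dim=100, hidden_dim=128):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, embed_dim)
            self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True,
                                bidirectional=True)
            self.tag_scores = nn.Linear(2 * hidden_dim, num_tags)

        def forward(self, token_ids):          # token_ids: (batch, seq_len)
            states, _ = self.lstm(self.embed(token_ids))
            return self.tag_scores(states)     # (batch, seq_len, num_tags)

    tagger = BiLSTMTagger(vocab_size=10000, num_tags=17)  # e.g. 17 universal POS tags
    scores = tagger(torch.randint(0, 10000, (4, 20)))
    print(scores.argmax(dim=-1).shape)                    # torch.Size([4, 20])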

Parsing

  • A fast and accurate dependency parser using neural networks : A novel way of learning a neural network classifier for use in a greedy, transition-based dependency parser (a toy transition-system skeleton follows this list). [Paper, Code Implementation 1]

    ../_images/progress-overall-100.png
  • Simple and Accurate Dependency Parsing Using Bidirectional LSTM Feature Representations : A simple and effective scheme for dependency parsing based on bidirectional LSTMs. [Paper]

    ../_images/progress-overall-60.png
  • Transition-Based Dependency Parsing with Stack Long Short-Term Memory : A technique for learning representations of parser states in transition-based dependency parsers. [Paper]

    ../_images/progress-overall-80.png
  • Deep Biaffine Attention for Neural Dependency Parsing : Using neural attention in a simple graph-based dependency parser. [Paper]

    ../_images/progress-overall-20.png
  • Joint RNN-Based Greedy Parsing and Word Composition : A greedy parser based on neural networks, which leverages a new compositional sub-tree representation. [Paper]

    ../_images/progress-overall-20.png
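
Most of the parsers above are transition-based: a neural classifier repeatedly chooses the next action over a stack and a buffer. The toy arc-standard skeleton below is illustrative only (unlabeled arcs, no classifier); the papers differ in the features, labels, and learning used to pick actions:

    def apply_transition(action, stack, buffer, arcs):
        """One step of an arc-standard transition system (items are word positions)."""
        if action == "SHIFT":
            stack.append(buffer.pop(0))
        elif action == "LEFT-ARC":      # second-from-top becomes a child of the top
            child = stack.pop(-2)
            arcs.append((stack[-1], child))
        elif action == "RIGHT-ARC":     # top becomes a child of the second-from-top
            child = stack.pop()
            arcs.append((stack[-1], child))
        return stack, buffer, arcs

    # "She eats fish": SHIFT, SHIFT, LEFT-ARC (She <- eats), SHIFT, RIGHT-ARC (eats -> fish)
    stack, buffer, arcs = [], [0, 1, 2], []
    for action in ["SHIFT", "SHIFT", "LEFT-ARC", "SHIFT", "RIGHT-ARC"]:
        stack, buffer, arcs = apply_transition(action, stack, buffer, arcs)
    print(arcs)   # [(1, 0), (1, 2)]  as (head, dependent) pairs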

Named Entity Recognition

  • Neural Architectures for Named Entity Recognition : Bidirectional LSTMs and conditional random fields for NER (a toy Viterbi decoder for the CRF layer follows this list). [Paper]

    ../_images/progress-overall-100.png
  • Boosting named entity recognition with neural character embeddings : A language-independent NER system that uses automatically learned features. [Paper]

    ../_images/progress-overall-60.png
  • Named Entity Recognition with Bidirectional LSTM-CNNs : A novel neural network architecture that automatically detects word- and character-level features. [Paper]

    ../_images/progress-overall-80.png
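
Several of the models above add a CRF layer on top of the BiLSTM; at test time the best tag sequence is recovered with the Viterbi algorithm. A minimal NumPy sketch of that decoding step (the emission and transition scores are random placeholders here):

    import numpy as np

    def viterbi_decode(emissions, transitions):
        """Best tag sequence given per-token emission scores (seq_len, num_tags)
        and tag-to-tag transition scores (num_tags, num_tags)."""
        seq_len, num_tags = emissions.shape
        score = emissions[0].copy()
        backpointers = []
        for t in range(1, seq_len):
            total = score[:, None] + transitions + emissions[t][None, :]
            backpointers.append(total.argmax(axis=0))
            score = total.max(axis=0)
        best_tag = int(score.argmax())
        path = [best_tag]
        for bp in reversed(backpointers):
            best_tag = int(bp[best_tag])
            path.append(best_tag)
        return path[::-1]

    emissions = np.random.randn(6, 5)     # 6 tokens, 5 tags (e.g. BIO labels)
    transitions = np.random.randn(5, 5)
    print(viterbi_decode(emissions, transitions))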

Semantic Role Labeling

  • End-to-end learning of semantic role labeling using recurrent neural networks : The use of a deep bidirectional recurrent network as an end-to-end system for SRL. [Paper]

    ../_images/progress-overall-60.png

Text Classification

  • A Convolutional Neural Network for Modelling Sentences : The Dynamic Convolutional Neural Network (DCNN) architecture, a CNN with a dynamic k-max pooling operation, is proposed for the semantic modeling of sentences (a k-max pooling sketch follows this list). [Paper link, Code implementation]

    ../_images/progress-overall-80.png
  • Very Deep Convolutional Networks for Text Classification : Very Deep Convolutional Neural Networks (VDCNNs) operating at the character level are presented, demonstrating the effect of network depth on classification tasks. [Paper link]

    ../_images/progress-overall-20.png
  • Multichannel Variable-Size Convolution for Sentence Classification : The Multichannel Variable-Size Convolutional Neural Network (MV-CNN) architecture combines different versions of word embeddings and employs variable-size convolutional filters for sentence classification. [Paper link]

    ../_images/progress-overall-20.png
  • A Sensitivity Analysis of (and Practitioners’ Guide to) Convolutional Neural Networks for Sentence Classification : A practical sensitivity analysis of CNNs, exploring the effect of architectural choices on performance. [Paper link]

    ../_images/progress-overall-60.png
  • Generative and Discriminative Text Classification with Recurrent Neural Networks : RNN-based discriminative and generative models are investigated for text classification, and their robustness to data distribution shifts is examined as well. [Paper link]

    ../_images/progress-overall-20.png
  • Deep sentence embedding using long short-term memory networks: Analysis and application to information retrieval : An LSTM-RNN architecture is used for sentence embedding, showing particular strength on a defined web search task. [Paper link]

    ../_images/progress-overall-60.png
  • Recurrent Convolutional Neural Networks for Text Classification : A combination of RNNs and CNNs for text classification: a recurrent architecture with max-pooling over an effective word representation, which outperforms simple window-based neural network approaches. [Paper link, Code implementation 1, Code implementation 2, Summary]

    ../_images/progress-overall-60.png
  • A C-LSTM Neural Network for Text Classification : A unified architecture for sentence and document modeling for classification. [Paper link]

    ../_images/progress-overall-20.png
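
As a small illustration of the pooling operation behind the DCNN entry above, k-max pooling keeps the k largest activations of a feature map while preserving their original order; a minimal PyTorch sketch (the dynamic, per-layer choice of k is omitted):

    import torch

    def k_max_pooling(feature_map, k):
        """Keep the k largest activations in each row, in their original left-to-right order."""
        values, indices = feature_map.topk(k, dim=-1)
        return values.gather(-1, indices.argsort(dim=-1))

    x = torch.tensor([[3.0, 1.0, 5.0, 2.0, 4.0]])
    print(k_max_pooling(x, k=3))   # tensor([[3., 5., 4.]])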

Sentiment Analysis

  • Domain adaptation for large-scale sentiment classification: A deep learning approach : A deep learning approach which learns to extract a meaningful representation for each online review. [Paper link]

    ../_images/progress-overall-80.png
  • Sentiment analysis: Capturing favorability using natural language processing : A sentiment analysis approach to extract sentiments associated with polarities of positive or negative for specific subjects from a document. [Paper link]

    ../_images/progress-overall-80.png
  • Document-level sentiment classification: An empirical comparison between SVM and ANN : A comparison study. [Paper link]

    ../_images/progress-overall-60.png
  • Learning semantic representations of users and products for document level sentiment classification : Incorporating user- and product-level information into a neural network approach for document-level sentiment classification. [Paper]

    ../_images/progress-overall-40.png
  • Document modeling with gated recurrent neural network for sentiment classification : A neural network model is proposed to learn vector-based document representations. [Paper, Implementation]

    ../_images/progress-overall-60.png
  • Semi-supervised recursive autoencoders for predicting sentiment distributions : A novel machine learning framework based on recursive autoencoders for sentence-level prediction. [Paper]

    ../_images/progress-overall-80.png
  • A convolutional neural network for modelling sentences : A convolutional architecture adopted for the semantic modelling of sentences. [Paper]

    ../_images/progress-overall-80.png
  • Recursive deep models for semantic compositionality over a sentiment treebank : The Recursive Neural Tensor Network for sentiment analysis (a toy recursive composition step follows this list). [Paper]

    ../_images/progress-overall-60.png
  • Adaptive recursive neural network for target-dependent twitter sentiment classification : AdaRNN adaptively propagates the sentiments of words to the target depending on context and syntactic relationships. [Paper]

    ../_images/progress-overall-20.png
  • Aspect extraction for opinion mining with a deep convolutional neural network : A deep learning approach to aspect extraction in opinion mining. [Paper]

    ../_images/progress-overall-20.png
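
The recursive models above (recursive autoencoders, the RNTN, AdaRNN) share one basic step: composing two child vectors into a parent vector along a parse tree. A toy NumPy sketch of that composition step (the tensor term of the RNTN and all training details are omitted):

    import numpy as np

    def compose(left, right, W, b):
        """Parent vector = nonlinearity of the concatenated child vectors."""
        return np.tanh(W @ np.concatenate([left, right]) + b)

    d = 8
    rng = np.random.default_rng(0)
    W, b = rng.standard_normal((d, 2 * d)), rng.standard_normal(d)
    very, good = rng.standard_normal(d), rng.standard_normal(d)
    phrase = compose(very, good, W, b)   # vector for the phrase "very good"
    print(phrase.shape)                  # (8,)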

Machine Translation

  • Learning phrase representations using RNN encoder-decoder for statistical machine translation : The proposed RNN Encoder–Decoder with a novel gated hidden unit is empirically evaluated on the task of machine translation. [Paper, Code, Blog post]

    ../_images/progress-overall-100.png
  • Sequence to Sequence Learning with Neural Networks : A showcase by Google that an end-to-end NMT system can be comparable to the traditional pipeline. [Paper, Code]

    ../_images/progress-overall-100.png
  • Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation : This work presents the design and implementation of GNMT, a production NMT system at Google. [Paper, Code]

    ../_images/progress-overall-100.png
  • Neural Machine Translation by Jointly Learning to Align and Translate : An extension to the encoder–decoder model which learns to align and translate jointly via an attention mechanism. [Paper]

    ../_images/progress-overall-100.png
  • Effective Approaches to Attention-based Neural Machine Translation : Improvements to the attention mechanism for NMT. [Paper, Code]

    ../_images/progress-overall-60.png
  • On the Properties of Neural Machine Translation: Encoder-Decoder Approaches : Analyzing the properties of neural machine translation using two models: the RNN Encoder–Decoder and a newly proposed gated recursive convolutional neural network. [Paper]

    ../_images/progress-overall-60.png
  • On Using Very Large Target Vocabulary for Neural Machine Translation : A method that allows using a very large target vocabulary without increasing training complexity. [Paper]

    ../_images/progress-overall-40.png
  • Convolutional sequence to sequence learning : An architecture based entirely on convolutional neural networks. [Paper, Code[Torch], Code[Pytorch], Post]

    ../_images/progress-overall-60.png
  • Attention Is All You Need : The Transformer: a novel neural network architecture based on a self-attention mechanism (a scaled dot-product attention sketch follows this list). [Paper, Code, Accelerating Deep Learning Research with the Tensor2Tensor Library, Transformer: A Novel Neural Network Architecture for Language Understanding]

    ../_images/progress-overall-100.png
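
For the Transformer entry above, the core operation is scaled dot-product attention, Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V; a minimal PyTorch sketch (masking and the multi-head projections are omitted):

    import torch
    import torch.nn.functional as F

    def scaled_dot_product_attention(query, key, value):
        """softmax(Q K^T / sqrt(d_k)) V, the attention function of the Transformer."""
        d_k = query.size(-1)
        scores = query @ key.transpose(-2, -1) / d_k ** 0.5
        weights = F.softmax(scores, dim=-1)
        return weights @ value

    q = torch.randn(2, 5, 64)   # (batch, target_len, d_k)
    k = torch.randn(2, 7, 64)   # (batch, source_len, d_k)
    v = torch.randn(2, 7, 64)
    print(scaled_dot_product_attention(q, k, v).shape)   # torch.Size([2, 5, 64])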

Summarization

  • A Neural Attention Model for Abstractive Sentence Summarization : A fully data-driven approach to abstractive sentence summarization based on a local attention model. [Paper, Code, A Read on “A Neural Attention Model for Abstractive Sentence Summarization”, Blog Post, Paper notes]

    ../_images/progress-overall-100.png
  • Get To The Point: Summarization with Pointer-Generator Networks : A hybrid pointer-generator network that augments the standard sequence-to-sequence attentional model: it can copy words from the source text via pointing and uses coverage to keep track of what has already been summarized (the copy/generate mixing step is sketched after this list). [Paper, Code, Video, Blog Post]

    ../_images/progress-overall-100.png
  • Abstractive Sentence Summarization with Attentive Recurrent Neural Networks : A conditional recurrent neural network (RNN) with a convolutional attention-based encoder that generates a summary of an input sentence. [Paper]

    ../_images/progress-overall-60.png
  • Abstractive Text Summarization Using Sequence-to-Sequence RNNs and Beyond : Abstractive text summarization using attentional encoder-decoder recurrent neural networks. [Paper]

    ../_images/progress-overall-60.png
  • A Deep Reinforced Model for Abstractive Summarization : A neural network model with a novel intra-attention that attends over the input and the continuously generated output separately, and a new training method that combines standard supervised word prediction and reinforcement learning (RL). [Paper]

    ../_images/progress-overall-60.png
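
The pointer-generator entry above mixes generating from the vocabulary with copying from the source; a minimal PyTorch sketch of the final-distribution step (the extended vocabulary for out-of-vocabulary source words is omitted):

    import torch

    def final_distribution(vocab_dist, attention, source_ids, p_gen):
        """Mix the generator's vocabulary distribution with a copy distribution
        built by scattering attention weights onto the source-token ids."""
        copy_dist = torch.zeros_like(vocab_dist)
        copy_dist.scatter_add_(-1, source_ids, attention)
        return p_gen * vocab_dist + (1.0 - p_gen) * copy_dist

    vocab_dist = torch.full((1, 10), 0.1)         # uniform over a toy 10-word vocabulary
    attention = torch.tensor([[0.7, 0.2, 0.1]])   # attention over 3 source tokens
    source_ids = torch.tensor([[4, 2, 4]])        # their vocabulary ids
    print(final_distribution(vocab_dist, attention, source_ids, p_gen=0.5))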

Question Answering

  • Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks : An argument for the usefulness of a set of proxy tasks that evaluate reading comprehension via question answering. [Paper]

    ../_images/progress-overall-60.png
  • Teaching Machines to Read and Comprehend : Addressing the lack of real natural language training data by introducing a novel approach to building a supervised reading comprehension data set. [Paper]

    ../_images/progress-overall-80.png
  • Ask Me Anything: Dynamic Memory Networks for Natural Language Processing : Introducing the dynamic memory network (DMN), a neural network architecture that processes input sequences and questions, forms episodic memories, and generates relevant answers. [Paper]

    ../_images/progress-overall-80.png