Reading List and Notes for Natural Language Generation

Machine Translation (MT)

  • Sequence to Sequence Learning with Neural Networks [arXiv]
  • Neural Machine Translation by Jointly Learning to Align and Translate [arXiv]
  • Attention Is All You Need [arXiv]

Low-resource MT tasks

Papers I found interesting while exploring machine translation in low-resource settings, listed by publication year, newest first.
[Google Slides]

2021

  • A Comparison of Different NMT Approaches to Low-Resource Dutch-Albanian Machine Translation [arXiv]
  • Adapting High-resource NMT Models to Translate Low-resource Related Languages without Parallel Data [arXiv] [Notes]
  • Samanantar: The Largest Publicly Available Parallel Corpora Collection for 11 Indic Languages [arXiv]
  • IndicBART: A Pre-trained Model for Natural Language Generation of Indic Languages [arXiv] [Notes]
  • YANMTT: Yet Another Neural Machine Translation Toolkit [arXiv]
  • Itihāsa: A large-scale corpus for Sanskrit to English translation [arXiv] [Notes]
  • AugVic: Exploiting BiText Vicinity for Low-Resource NMT [arXiv]
  • Unsupervised Translation of German–Lower Sorbian: Exploring Training and Novel Transfer Methods on a Low-Resource Language [arXiv]
  • Optimal Word Segmentation for Neural Machine Translation into Dravidian Languages [aclweb] [Notes]
  • Improving Low-Resource NMT through Relevance Based Linguistic Features Incorporation [aclweb]
  • MuRIL: Multilingual Representations for Indian Languages [arXiv]

2020

  • Leveraging Monolingual Data with Self-Supervision for Multilingual Neural Machine Translation, Siddhant et al. [arXiv]
  • AI4Bharat-IndicNLP Corpus: Monolingual Corpora and Word Embeddings for Indic Languages [arXiv]
  • IndicNLPSuite: Monolingual Corpora, Evaluation Benchmarks and Pre-trained Multilingual Language Models for Indian Languages [aclweb]
  • Semi-Supervised Low-Resource Style Transfer of Indonesian Informal to Formal Language with Iterative Forward-Translation [arXiv]

2019

  • The Missing Ingredient in Zero-Shot Neural Machine Translation, Arivazhagan et al. [arXiv]
  • Sanskrit Sandhi Splitting using seq2(seq)² [arXiv] [Notes]
  • Domain Adaptive Text Style Transfer [arXiv]

2018

  • Rapid Adaptation of Neural Machine Translation to New Languages, Neubig et al. [arXiv] [Notes]
  • Universal Neural Machine Translation for Extremely Low Resource Languages [aclweb] [Notes]
  • Meta-Learning for Low-Resource Neural Machine Translation, Gu et al. [arXiv]
  • Neural machine translation for low-resource languages without parallel corpora, Karakanta et al. [springer] [Notes]
  • Style Transfer as Unsupervised Machine Translation [arXiv]

2017

  • Google's Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation [arXiv]

2016

  • Transfer Learning for Low-Resource Neural Machine Translation, Zoph et al. [arXiv] [Notes]
  • Exploiting Source-side Monolingual Data in Neural Machine Translation [aclweb] [Notes]
  • Improving Neural Machine Translation Models with Monolingual Data, Sennrich et al. [arXiv]

2014

  • Improving Machine Translation via Triangulation and Transliteration, Durrani and Koehn [aclweb]