# Reading List and Notes for Natural Language Generation

# Machine Translation (MT)

- [x] Sequence to Sequence Learning with Neural Networks **[[arXiv]]()**
- [x] Neural Machine Translation by Jointly Learning to Align and Translate **[[arXiv]]()**
- [x] Attention Is All You Need **[[arXiv]]()**

## Low-resource MT tasks

Papers I found interesting while exploring machine translation in low-resource settings, listed in descending order of publication year. [**[Google Slides]**](https://docs.google.com/presentation/d/1oWpU-3UGvh6xf_P8z_LmJulEeX8B0UWYFoCn9NDeyyA/edit?usp=sharing)

### 2021

- [x] A Comparison of Different NMT Approaches to Low-Resource Dutch-Albanian Machine Translation **[[aclweb]](https://aclanthology.org/2021.mtsummit-loresmt.7/)**
- [x] Adapting High-resource NMT Models to Translate Low-resource Related Languages without Parallel Data **[[arXiv]](https://arxiv.org/pdf/2105.15071.pdf) [[Notes]](https://hackmd.io/@Thanmay/By6NOU5Zt)**
- [x] Samanantar: The Largest Publicly Available Parallel Corpora Collection for 11 Indic Languages **[[arXiv]](https://arxiv.org/pdf/2104.05596v2.pdf)**
- [x] IndicBART: A Pre-trained Model for Natural Language Generation of Indic Languages **[[arXiv]](https://arxiv.org/pdf/2109.02903.pdf) [[Notes]](https://hackmd.io/@Thanmay/indic-bart)**
- [x] YANMTT: Yet Another Neural Machine Translation Toolkit **[[arXiv]](https://arxiv.org/pdf/2108.11126.pdf)**
- [x] Itihāsa: A large-scale corpus for Sanskrit to English translation **[[arXiv]](https://arxiv.org/pdf/2106.03269.pdf) [[Notes]](https://hackmd.io/@Thanmay/itihasa)**
- [x] AugVic: Exploiting BiText Vicinity for Low-Resource NMT **[[arXiv]](https://arxiv.org/pdf/2106.05141.pdf)**
- [x] Unsupervised Translation of German–Lower Sorbian: Exploring Training and Novel Transfer Methods on a Low-Resource Language **[[arXiv]](https://arxiv.org/pdf/2109.12012.pdf)**
- [x] Optimal Word Segmentation for Neural Machine Translation into Dravidian Languages **[[aclweb]](https://aclanthology.org/2021.wat-1.21.pdf) [[Notes]](https://hackmd.io/@Thanmay/SJbcHmONF)**
- [ ] Improving Low-Resource NMT through Relevance Based Linguistic Features Incorporation **[[aclweb]](https://aclanthology.org/2020.coling-main.376.pdf)**
- [ ] MuRIL: Multilingual Representations for Indian Languages **[[arXiv]](https://arxiv.org/pdf/2103.10730.pdf)**

### 2020

- [ ] Leveraging Monolingual Data with Self-Supervision for Multilingual Neural Machine Translation, Siddhant et al. **[[arXiv]](https://arxiv.org/pdf/2005.04816.pdf)**
- [x] AI4Bharat-IndicNLP Corpus: Monolingual Corpora and Word Embeddings for Indic Languages **[[arXiv]](https://arxiv.org/pdf/2005.00085.pdf)**
- [x] IndicNLPSuite: Monolingual Corpora, Evaluation Benchmarks and Pre-trained Multilingual Language Models for Indian Languages **[[aclweb]](https://aclanthology.org/2020.findings-emnlp.445.pdf)**
- [ ] Semi-Supervised Low-Resource Style Transfer of Indonesian Informal to Formal Language with Iterative Forward-Translation **[[arXiv]](https://arxiv.org/pdf/2011.03286.pdf)**

### 2019

- [ ] The Missing Ingredient in Zero-Shot Neural Machine Translation, Arivazhagan et al. **[[arXiv]](https://arxiv.org/pdf/1903.07091.pdf)**
- [x] Sanskrit Sandhi Splitting using *seq2(seq)²* **[[arXiv]](https://arxiv.org/pdf/1801.00428.pdf) [[Notes]](https://hackmd.io/@Thanmay/S1O6TdbVF)**
- [ ] Domain Adaptive Text Style Transfer **[[aclweb]](https://aclanthology.org/D19-1325.pdf)**

### 2018

- [x] Rapid Adaptation of Neural Machine Translation to New Languages, Neubig et al. **[[arXiv]](https://arxiv.org/pdf/1808.04189.pdf) [[Notes]](https://hackmd.io/@Thanmay/H1FVtF51t)**
- [x] Universal Neural Machine Translation for Extremely Low Resource Languages **[[aclweb]](https://aclanthology.org/N18-1032.pdf) [[Notes]](https://hackmd.io/@Thanmay/B1GKO0ult)**
- [ ] Meta-Learning for Low-Resource Neural Machine Translation, Gu et al. (2018) **[[arXiv]](https://arxiv.org/pdf/1808.08437.pdf)**
- [x] Neural machine translation for low-resource languages without parallel corpora, Karakanta et al. **[[springer]](https://link.springer.com/content/pdf/10.1007/s10590-017-9203-5.pdf) [[Notes]](https://hackmd.io/@Thanmay/ryeJGfhkY)**
- [ ] Style Transfer as Unsupervised Machine Translation **[[arXiv]](https://arxiv.org/pdf/1808.07894.pdf)**

### 2017

- [x] Google's Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation **[[arXiv]](https://arxiv.org/pdf/1611.04558.pdf)**

### 2016

- [x] Transfer Learning for Low-Resource Neural Machine Translation, Zoph et al. **[[arXiv]](https://arxiv.org/pdf/1604.02201.pdf) [[Notes]](https://hackmd.io/@Thanmay/xfer-nmt)**
- [x] Exploiting Source-side Monolingual Data in Neural Machine Translation **[[aclweb]](https://aclanthology.org/D16-1160.pdf) [[Notes]](https://hackmd.io/@Thanmay/Hyp-qHSgF)**
- [x] Improving Neural Machine Translation Models with Monolingual Data, Sennrich et al. (2016) **[[arXiv]](https://arxiv.org/pdf/1511.06709.pdf)**

### 2014

- [x] Improving Machine Translation via Triangulation and Transliteration, Durrani N, Koehn P **[[aclweb]](https://aclanthology.org/2014.eamt-1.17.pdf)**