NLP/ML Reading Group

[TOC]

# Past Meetings

- May 30, 2024: **Karl** presents [Efficient End-to-End Visual Document Understanding with Rationale Distillation](https://arxiv.org/pdf/2311.09612)
    - <img src="https://hackmd.io/_uploads/Hy62OzuM0.png" width="300"> <img src="https://hackmd.io/_uploads/rya4FzOfC.png" width="300">
- April 26, 2024: **Shreya** presents [Ghostbuster: Detecting Text Ghostwritten by Large Language Models](https://arxiv.org/pdf/2305.15047)
- April 19, 2024: **Vincent** presents [Long-form factuality in large language models](https://arxiv.org/pdf/2403.18802.pdf)
- April 5, 2024: **Govind** presents [SWEA: Changing Factual Knowledge in Large Language Models via Subject Word Embedding Altering](https://arxiv.org/pdf/2401.17809.pdf)
- Mar 29, 2024: **Yash** presents [LLM2LLM: Boosting LLMs with Novel Iterative Data Enhancement](https://arxiv.org/abs/2403.15042)
- Mar 22, 2024: **Karl** presents [Making Retrieval-Augmented Language Models Robust to Irrelevant Context](https://arxiv.org/pdf/2310.01558.pdf)
- Mar 15, 2024: **Shreya** presents [Metric-aware LLM inference](https://arxiv.org/pdf/2403.04182.pdf)
- Mar 8, 2024: **Vincent** presents [Fine-grained Hallucination Detection and Editing for Language Models](https://arxiv.org/pdf/2401.06855.pdf)
- Mar 1, 2024: **Govind** presents [Self-Rewarding Language Models](https://arxiv.org/pdf/2401.10020.pdf)
- Feb 23, 2024: **Karl** presents [FACTSCORE: Fine-grained Atomic Evaluation of Factual Precision in Long Form Text Generation](https://arxiv.org/pdf/2305.14251.pdf)
- Dec 15, 2023: **Shreya** presents [Open Problems and Fundamental Limitations of Reinforcement Learning from Human Feedback](https://arxiv.org/pdf/2307.15217.pdf)
- Dec 8, 2023: **Karl** presents [MAP's not dead yet: Uncovering true language model modes by conditioning away degeneracy](https://arxiv.org/pdf/2311.08817.pdf)
- Dec 1, 2023: **Govind** presents [In-Context Learning Creates Task Vectors](https://arxiv.org/pdf/2310.15916.pdf)
- Nov 17, 2023: **Vincent** presents [RA-DIT: Retrieval-Augmented Dual Instruction Tuning](https://arxiv.org/pdf/2310.01352.pdf)
- Nov 10, 2023: **Yash** presents [Contrastive Preference Learning: Learning from Human Feedback without RL](https://arxiv.org/abs/2310.13639)
- Nov 3, 2023: **Shuhang** presents [Larger language models do in-context learning differently](https://arxiv.org/abs/2303.03846)
- Oct 27, 2023: **Shreya** presents [DetectGPT: Zero-Shot Machine-Generated Text Detection using Probability Curvature](https://arxiv.org/pdf/2301.11305.pdf)
- Oct 20, 2023: **Karl** presents [Text Embeddings Reveal (Almost) As Much As Text](https://arxiv.org/pdf/2310.06816.pdf)
- Oct 6, 2023: **Govind** presents [Fast Model Editing at Scale](https://arxiv.org/pdf/2110.11309.pdf)
- Sep 29, 2023: **Vincent** presents [Grammar Prompting for Domain-Specific Language Generation with Large Language Models](https://arxiv.org/pdf/2305.19234.pdf)
- Sep 22, 2023: **Yash** presents [Let's Verify Step by Step](https://arxiv.org/abs/2305.20050)
- Sep 15, 2023: **Shreya** presents [Pretraining Language Models with Human Preferences](https://arxiv.org/pdf/2302.08582.pdf)
- Sep 8, 2023: **Karl** presents [Linearity of Relation Decoding in Transformer Language Models](https://arxiv.org/pdf/2308.09124.pdf)
- Aug 10, 2023: **Vincent** presents [Active Retrieval Augmented Generation](https://arxiv.org/pdf/2305.06983.pdf)
- Aug 4, 2023: **Yash** presents [RT-2: Vision-Language-Action Models Transfer Web Knowledge to Robotic Control](https://arxiv.org/abs/2307.15818)
- July 28, 2023: **Shreya** presents [Model evaluation for extreme risks](https://arxiv.org/pdf/2305.15324.pdf)
- July 21, 2023: **Karl** presents [Llama 2: Open Foundation and Fine-Tuned Chat Models](https://arxiv.org/pdf/2307.09288v2.pdf)
- July 14, 2023: **Anindita** presents [Measuring Inductive Biases of In-Context Learning with Underspecified Demonstrations](https://arxiv.org/pdf/2305.13299.pdf)
- July 7, 2023: **Vincent** presents [Tree of Thoughts: Deliberate Problem Solving with Large Language Models](https://arxiv.org/pdf/2305.10601.pdf)
- Jun 30, 2023: **Yash** presents [Can Large Language Models Infer Causation from Correlation?](https://arxiv.org/abs/2306.05836)
- Jun 23, 2023: **Shreya** presents [What In-Context Learning “Learns” In-Context: Disentangling Task Recognition and Task Learning](https://arxiv.org/pdf/2305.09731.pdf)
- Jun 16, 2023: **Karl** presents [From Pixels to UI Actions: Learning to Follow Instructions via Graphical User Interfaces](https://arxiv.org/pdf/2306.00245.pdf)
- Jun 9, 2023: **Anindita** presents [Enabling Large Language Models to Generate Text with Citations](https://arxiv.org/pdf/2305.14627.pdf)
- Jun 2, 2023: **Vincent** presents [Evaluating Open-Domain Question Answering in the Era of Large Language Models](https://arxiv.org/pdf/2305.06984.pdf)
- May 5, 2023: **Yash** presents [Segment Anything](https://arxiv.org/abs/2304.02643)
- April 14, 2023: **Karl** presents [Multilingual Machine Translation with Large Language Models: Empirical Results and Analysis](https://arxiv.org/pdf/2304.04675.pdf)
- April 7, 2023: **Anindita** presents [Recommender Systems with Generative Retrieval](https://shashankrajput.github.io/Generative.pdf)
- March 17, 2023: **Vincent** presents [UL2: Unifying Language Learning Paradigms](https://arxiv.org/pdf/2205.05131.pdf)
- March 10, 2023: **Yash** presents [Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs](https://arxiv.org/abs/2210.12283)
- March 3, 2023: **Govind** presents [What learning algorithm is in-context learning? Investigations with linear models](https://openreview.net/pdf?id=0g0X4H8yN4I)
- February 24, 2023: **Karl** presents [Toolformer: Language Models Can Teach Themselves to Use Tools](https://arxiv.org/pdf/2302.04761.pdf)
- February 17, 2023: **Anindita** presents [Selection Inference: Exploiting Large Language Models For Interpretable Logical Reasoning](https://openreview.net/pdf?id=3Pf3Wg6o-A4)
- February 10, 2023: **Vincent** presents [Retrieval as Attention: End-to-end Learning of Retrieval and Reading within a Single Transformer](https://arxiv.org/pdf/2212.02027.pdf)
- January 27, 2023: **Yash** presents [Hungry Hungry Hippos: Towards Language Modeling with State Space Models](https://arxiv.org/abs/2212.14052)
- January 20, 2023: **Rahul** presents [Why do Nearest Neighbor Language Models Work?](https://arxiv.org/pdf/2301.02828.pdf)
- January 6, 2023: **Karl** presents [Attributed Question Answering: Evaluation and Modeling for Attributed Large Language Models](https://arxiv.org/pdf/2212.08037.pdf)
- December 30, 2022: **Govind** presents [Controllable Text Generation with Language Constraints](https://arxiv.org/pdf/2212.10466.pdf)
- December 16, 2022: **Anindita** presents [Don’t Prompt, Search! Mining-based Zero-Shot Learning with Language Models](https://arxiv.org/pdf/2210.14803.pdf)
- December 9, 2022: **Vincent** presents [How Much Does Attention Actually Attend? Questioning the Importance of Attention in Pretrained Transformers](https://arxiv.org/pdf/2211.03495.pdf)
- November 18, 2022: **Govind** presents [Training Language Models with Memory Augmentation](https://arxiv.org/pdf/2205.12674.pdf)
- November 4, 2022: **Vedang** presents [A-OKVQA: A Benchmark for Visual Question Answering using World Knowledge](https://arxiv.org/pdf/2206.01718.pdf)
- October 28, 2022: **Karl** presents [Markup-to-Image Diffusion Models with Scheduled Sampling](https://arxiv.org/abs/2210.05147)
- October 21, 2022: **Yash** presents [A Unified Sequence Interface for Vision Tasks](https://arxiv.org/abs/2206.07669)
- October 14, 2022: **Rahul** presents [Explaining Patterns in Data with Language Models via Interpretable Autoprompting](https://arxiv.org/abs/2210.01848)
- October 7, 2022: **Anindita** presents [Nested Named Entity Recognition as Latent Lexicalized Constituency Parsing](https://arxiv.org/pdf/2203.04665.pdf)
- September 30, 2022: **Vincent** presents [MS MARCO](https://arxiv.org/pdf/1611.09268v3.pdf) and [TREC 2019](https://www.microsoft.com/en-us/research/uploads/prod/2020/03/trec2019-deeplearning-overview-5e725254303da.pdf)
- September 9, 2022: **Govind** presents [Chain of Thought Prompting Elicits Reasoning in Large Language Models](https://arxiv.org/pdf/2201.11903.pdf)
- September 2, 2022: **Vedang** presents [Few-shot Learning with Retrieval Augmented Language Models](https://arxiv.org/pdf/2208.03299.pdf)
- August 26, 2022: **Karl** presents [Questions Are All You Need to Train a Dense Passage Retriever](https://arxiv.org/pdf/2206.10658.pdf)
- August 5, 2022: **Yash** presents [Anticorrelated Noise Injection for Improved Generalization](https://arxiv.org/abs/2202.02831)
- July 29, 2022: **Vincent** presents [No Parameters Left Behind: Sensitivity Guided Adaptive Learning Rate for Training Large Transformer Models](https://arxiv.org/pdf/2202.02664.pdf)
- July 21, 2022: **Anindita** presents [Modeling Task Interactions in Document-Level Joint Entity and Relation Extraction](https://arxiv.org/pdf/2205.01909.pdf)
- July 15, 2022: **Govind** presents [DiffCSE: Difference-based Contrastive Learning for Sentence Embeddings](https://openreview.net/pdf?id=SzGgMLQfSb5)
- July 9, 2022: **Lucy** presents [Clustering-based Inference for Biomedical Entity Linking](https://aclanthology.org/2021.naacl-main.205.pdf) and [Entity Linking and Discovery via Arborescence-based Supervised Clustering](https://arxiv.org/pdf/2109.01242.pdf)
- July 2, 2022: **Karl** presents [Evaluating Explanations: How Much Do Explanations from the Teacher Aid Students?](https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00465/110436/Evaluating-Explanations-How-Much-Do-Explanations)
- June 25, 2022: **Vrinda** presents [Gold Doesn’t Always Glitter: Spectral Removal of Linear and Nonlinear Guarded Attribute Information](https://arxiv.org/pdf/2203.07893.pdf)
- June 3, 2022: **Rahul** presents [Word2Box: Capturing Set-Theoretic Semantics of Words using Box Embeddings](https://aclanthology.org/2022.acl-long.161.pdf)
- May 23, 2022: **Anindita** presents [Unsupervised Parsing via Constituency Tests](https://arxiv.org/pdf/2010.03146.pdf)
- May 12, 2022: **Vincent** presents [Generative Multi-hop Retrieval](https://arxiv.org/pdf/2204.13596.pdf)
- May 5, 2022: **Anindita** presents [Knowledge Base Question Answering by Case-based Reasoning over Subgraphs](https://arxiv.org/pdf/2202.10610.pdf)
- Apr 26, 2022: **Vedang** presents [Improving Passage Retrieval with Zero-Shot Question Generation](https://arxiv.org/pdf/2204.07496.pdf)
- Apr 21, 2022: **Govind** presents [Commonsense Reasoning for Question Answering with Explanations](https://openreview.net/pdf?id=rg-zrfteOZc)
- Apr 11, 2022: **Dhruv** presents [QuALITY: Question Answering with Long Input Texts, Yes!](https://arxiv.org/pdf/2112.08608.pdf)
- Apr 1, 2022: **Yash** presents [DINO: DETR with Improved DeNoising Anchor Boxes for End-to-End Object Detection](https://arxiv.org/abs/2203.03605)
- Feb 5, 2022: **Wenyue** presents [GreaseLM: Graph Reasoning Enhanced Language Models For Question Answering](https://arxiv.org/pdf/2201.08860.pdf)
- Jan 29, 2022: **Shuning** presents [How Does SimSiam Avoid Collapse Without Negative Samples?](https://openreview.net/forum?id=bwq6O4Cwdl)
- Jan 23, 2022: **Wenzheng** presents [End-to-End Training of Multi-Document Reader and Retriever for Open-Domain Question Answering](https://arxiv.org/pdf/2106.05346.pdf)
- Jan 15, 2022: **Rajarshi** presents [Improving language models by retrieving from trillions of tokens](https://arxiv.org/pdf/2112.04426.pdf)
- Dec 18, 2021: **Wenzheng** presents [Focus on what matters: Applying Discourse Coherence Theory to Cross Document Coreference](https://arxiv.org/pdf/2110.05362.pdf)
- Dec 11, 2021: **Karl** presents [Rethinking Search: Making Domain Experts out of Dilettantes](https://arxiv.org/pdf/2105.02274.pdf)
- Dec 4, 2021: **Wenyue** presents [Answering Open-Domain Questions of Varying Reasoning Steps from Text](https://arxiv.org/pdf/2010.12527.pdf)
- Nov 20, 2021: **Shuning** presents [Efficient Training of Retrieval Models Using Negative Cache](https://proceedings.neurips.cc/paper/2021/file/2175f8c5cd9604f6b1e576b252d4c86e-Paper.pdf)
- Nov 13, 2021: **Rajarshi** presents [Multi-Vector Attention Models for Deep Re-ranking](https://aclanthology.org/2021.emnlp-main.443.pdf)
- Nov 6, 2021: **Wenzheng** presents [Highly Parallel Autoregressive Entity Linking with Discriminative Correction](https://arxiv.org/pdf/2109.03792.pdf)
- Oct 30, 2021: **Karl** presents [Mention Memory: incorporating textual knowledge into Transformers through entity mention attention](https://arxiv.org/pdf/2110.06176.pdf)
- Oct 23, 2021: **Wenyue** presents [Pay Attention to MLPs](https://arxiv.org/pdf/2105.08050.pdf)
- Oct 16, 2021: **Shuning** presents [Less is More: Pre-train a Strong Text Encoder for Dense Retrieval Using a Weak Decoder](https://arxiv.org/pdf/2102.09206.pdf)
- Oct 9, 2021: **Wenzheng** presents [Joint Passage Ranking for Diverse Multi-Answer Retrieval](https://arxiv.org/pdf/2104.08445.pdf)
- Sep 25, 2021: **Rajarshi** presents [How Can We Know When Language Models Know? On the Calibration of Language Models for Question Answering](https://direct.mit.edu/tacl/article/doi/10.1162/tacl_a_00407/107277/How-Can-We-Know-When-Language-Models-Know-On-the)
- Sep 11, 2021: **Karl** presents [WikiGraphs: A Wikipedia Text - Knowledge Graph Paired Dataset](https://arxiv.org/pdf/2107.09556.pdf)
- Sep 4, 2021: **Wenyue** presents [Generation-Augmented Retrieval for Open-Domain Question Answering](https://aclanthology.org/2021.acl-long.316.pdf)
- Aug 21, 2021: **Shuning** presents [Domain-matched Pre-training Tasks for Dense Retrieval](https://arxiv.org/pdf/2107.13602.pdf)
- Aug 14, 2021: **Wenzheng** presents [UnitedQA: A Hybrid Approach for Open Domain Question Answering](https://arxiv.org/pdf/2101.00178.pdf)
- July 31, 2021: **Rajarshi** presents [The Curse of Dense Low-Dimensional Information Retrieval for Large Index Sizes](https://aclanthology.org/2021.acl-short.77.pdf)
- July 24, 2021: **Karl** presents [Thinking Like Transformers](https://arxiv.org/pdf/2106.06981.pdf)
- July 17, 2021: **Wenyue** presents [QA-GNN: Reasoning with Language Models and Knowledge Graphs for Question Answering](https://arxiv.org/pdf/2104.06378.pdf)
- July 10, 2021: **Shuning** presents [Coreference Resolution without Span Representations](https://arxiv.org/abs/2101.00434)
- Jul 3, 2021: **Wenzheng** presents [Efficient Passage Retrieval with Hashing for Open-domain Question Answering](https://arxiv.org/pdf/2106.00882.pdf)
- June 26, 2021: **Rajarshi** presents [Learning Dense Representations of Phrases at Scale](https://arxiv.org/pdf/2012.12624.pdf)
- May 1, 2021: **Karl** presents [How Many Data Points is a Prompt Worth?](https://www.aclweb.org/anthology/2021.naacl-main.208.pdf)
- June 12, 2021: **Wenyue** presents [Few-Shot Question Answering by Pretraining Span Selection](https://arxiv.org/pdf/2101.00438.pdf)
- May 29, 2021: **Shuning** presents [FlowPrior: Learning Expressive Priors for Latent Variable Sentence Models](https://www.aclweb.org/anthology/2021.naacl-main.259/)
    - [hackmd](https://hackmd.io/4rC2VEn_SXKtWgUfRezGiw?view#Discussion)
- May 22, 2021: **Wenzheng** presents [SimCSE: Simple Contrastive Learning of Sentence Embeddings](https://arxiv.org/pdf/2104.08821.pdf)
- May 8, 2021: **Rajarshi** presents [Answering Complex Open-Domain Questions With Multi-Hop Dense Retrieval](https://openreview.net/pdf?id=EMHoBG0avc1)
- May 1, 2021: **Karl** presents [CorefQA: Coreference Resolution as Query-based Span Prediction](https://arxiv.org/pdf/1911.01746.pdf)
- Apr 24, 2021: **Wenyue** presents [Cooperative Learning of Zero-Shot Machine Reading Comprehension](https://arxiv.org/pdf/2103.07449.pdf)
- Apr 17, 2021: **Shuning** presents [Unsupervised Data Augmentation for Consistency Training](https://arxiv.org/abs/1904.12848)
    - [hackmd](https://hackmd.io/vaOdU9fSQqeg3Fry-jGVLg?view)
- Apr 10, 2021: **Wenzheng** presents [Multi-task Retrieval for Knowledge-Intensive Tasks](https://arxiv.org/pdf/2101.00117.pdf)
    - [hackmd](https://hackmd.io/QPDQo4bCRiWDymAS6z_0kQ)
- April 3, 2021: **Rajarshi** presents [Syntax-Enhanced Pretrained Model](https://arxiv.org/abs/2012.14116)
- Mar 27, 2021: **Karl** presents [Natural Questions: a Benchmark for Question Answering Research](https://research.google/pubs/pub47761/)
    - [hackmd](https://hackmd.io/XTh1jw3HSjGGs7b-ar6GPw?both)
- Mar 20, 2021: **Wenyue** presents [SENSEI: Sensitive Set Invariance for Enforcing Individual Fairness](https://openreview.net/pdf?id=DktZb97_Fx)
- Mar 13, 2021: **Shuning** presents [Pre-training via Paraphrasing](https://arxiv.org/abs/2006.15020)
    - [Note](https://drive.google.com/file/d/13fjBXV80LE_1RUcgu1EK7RMXahqhORHZ/view?usp=sharing)
- Mar 6, 2021: **Rajarshi** presents [Empirical Analysis of Unlabeled Entity Problem in Named Entity Recognition](https://openreview.net/forum?id=5jRVa89sZk)
- Feb 27, 2021: **Wenzheng** presents [Representations for Question Answering from Documents with Tables and Text](https://arxiv.org/pdf/2101.10573.pdf)
- Jan 30, 2021: **Karl** presents [KILT: a Benchmark for Knowledge Intensive Language Tasks](https://arxiv.org/pdf/2009.02252.pdf)
- Jan 23, 2021: **Wenyue** presents [Invariant Risk Minimization](https://arxiv.org/pdf/1907.02893.pdf)
    - [hackmd](https://hackmd.io/EWaT8wBgQ8mMswbIQQwUmA?view)
- Jan 16, 2021: **Shuning** presents [Efficient Transformers: A Survey](https://arxiv.org/abs/2009.06732)
    - [hackmd](https://hackmd.io/XMIjmETiT0W4X2pwuFfdZw?view)
- Dec 25, 2020: **Wenzheng** presents [Compositional Generalization and Natural Language Variation: Can a Semantic Parsing Approach Handle Both?](https://arxiv.org/pdf/2010.12725.pdf)
- Dec 18, 2020: **Rajarshi** presents [Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks](https://proceedings.neurips.cc/paper/2020/file/6b493230205f780e1bc26945df7481e5-Paper.pdf)
- Dec 11, 2020: **Shijie** presents [Investigating Gender Bias in Language Models Using Causal Mediation Analysis](https://papers.nips.cc/paper/2020/file/92650b2e92217715fe312e6fa7b90d82-Paper.pdf)
    - [Slides PDF](https://drive.google.com/file/d/1eR1KIw2XPhqREfVvMyZI29isyK2Z2OGD/view?usp=sharing)
- Dec 5, 2020: **Karl** presents [Latent Template Induction with Gumbel-CRFs](https://arxiv.org/pdf/2011.14244.pdf)
- Oct 30, 2020: **Wenyue** presents [“Less Than One”-Shot Learning: Learning N Classes From M<N Samples](https://arxiv.org/pdf/2009.08449.pdf)
    - [hackmd](https://hackmd.io/RtlvKM38SiiYF1CvWtE0_w)
- Oct 23, 2020: **Shuning** presents [A Primer in BERTology: What we know about how BERT works](https://arxiv.org/abs/2002.12327)
    - [hackmd](https://hackmd.io/oxDp2Ge8Rs-KnTc6EFaLZg?view)
- Oct 9, 2020: **Wenzheng** presents [AdapterFusion: Non-Destructive Task Composition for Transfer Learning](https://arxiv.org/pdf/2005.00247.pdf), and [MultiCQA: Zero-Shot Transfer of Self-Supervised Text Matching Models on a Massive Scale](https://arxiv.org/pdf/2010.00980v1.pdf)
- Oct 2, 2020: **Rajarshi** presents [Evaluating NLP Models via Contrast Sets](https://arxiv.org/pdf/2004.02709.pdf)
- September 25, 2020: **Shijie** presents [Weight Poisoning Attacks on Pre-trained Models (ACL2020)](https://arxiv.org/pdf/2004.06660.pdf)
    - [hackmd](https://hackmd.io/m6ftjVQ_SLSdyqWnP8MmXg?view)
- September 18, 2020: **Karl** presents [Sparse, Dense, and Attentional Representations for Text Retrieval](https://arxiv.org/pdf/2005.00181.pdf)
- August 29, 2020: **Wenyue** presents [Hopfield Network is all you need](https://arxiv.org/abs/2008.02217)
    - [hackmd](https://hackmd.io/3ZHPndVEQXGXbNYCCJpMNQ?view)
- August 22, 2020: **Shuning** presents [What Makes for Good Views for Contrastive Learning?](https://arxiv.org/abs/2005.10243)
    - [hackmd](https://hackmd.io/2pSCPfASQPq5CcNHbYU_Sw?view)
- Aug 8, 2020: **Wenzheng** presents [Deep Learning for Symbolic Mathematics (Lample and Charton, 2020)](https://openreview.net/pdf?id=S1eZYeHFDS)
    - [hackmd](https://hackmd.io/sdjukMsuQGuNi8yBPamzXw?view)
- Aug 1, 2020: **Rajarshi** presents [Beyond Accuracy: Behavioral Testing of NLP Models with CheckList](https://arxiv.org/pdf/2005.04118.pdf) [[talk](https://virtual.acl2020.org/paper_main.442.html)]
    - [hackmd](https://hackmd.io/4dQJFZs2RY2vOA2jLrkB5A?view)
- July 25, 2020: **Shijie** presents [Normalized Attention Without Probability Cage (Oliver Richter et al., 2020)](https://arxiv.org/pdf/2005.09561.pdf)
    - [hackmd](https://hackmd.io/9z7Aj20lTBWXE9IIjwG1eQ?view)
- July 18, 2020: **Karl** presents
    - [Null It Out: Guarding Protected Attributes by Iterative Nullspace Projection (Ravfogel et al., 2020)](https://arxiv.org/pdf/2004.07667.pdf)
    - [hackmd](https://hackmd.io/yPuO_Mf4TG20oaabT0N57Q?view)
    - Related: [video](https://slideslive.com/38929453/null-it-out-guarding-protected-attributes-by-iterative-nullspace-projection), [acl page](https://virtual.acl2020.org/paper_main.647.html), [linear algebra prereq](http://karlstratos.com/notes/projection.pdf) (section 1)
- July 11, 2020: **Wenyue** presents [Longformer: The Long-document Transformer (Beltagy et al, 2020)](https://arxiv.org/pdf/2004.05150.pdf)
    - [hackmd](https://hackmd.io/sj2C8vHHSSOcTzIdIGu_pA#Autoregressive-Language-Modeling)
- July 4, 2020: **Shuning** presents [Language Models are Few-Shot Learners (Brown et al., 2020)](https://arxiv.org/pdf/2005.14165.pdf)
    - [hackmd](https://hackmd.io/jHj4OZfkRja0J1pLOLrT_A?view)