# ASQA: Long-Form Answers for Ambiguous Open-domain Factoid Questions

## Summary

* [Introduction](#introduction)
* [Dataset Structure](#dataset-structure)
* [Reference](#reference)
* [License](#license)
* [Citation](#citation)

## Introduction

ASQA is the first long-form question answering dataset that focuses on ambiguous factoid questions. Unlike previous long-form answer datasets, each question is annotated with both long-form answers and extractive question-answer pairs, which should be answerable from the generated passage. A generated long-form answer is evaluated using both ROUGE and QA accuracy. The paper shows that these evaluation metrics correlate well with human judgments.

## Dataset Structure

### Data Instances

```
{
    "ambiguous_question": "Where does the civil liberties act place the blame for the internment of u.s. citizens?",
    "qa_pairs": [
        {
            "context": "No context provided",
            "question": "Where does the civil liberties act place the blame for the internment of u.s. citizens by apologizing on behalf of them?",
            "short_answers": [
                "the people of the United States"
            ],
            "wikipage": None
        },
        {
            "context": "No context provided",
            "question": "Where does the civil liberties act place the blame for the internment of u.s. citizens by making them pay reparations?",
            "short_answers": [
                "United States government"
            ],
            "wikipage": None
        }
    ],
    "wikipages": [
        {
            "title": "Civil Liberties Act of 1988",
            "url": "https://en.wikipedia.org/wiki/Civil%20Liberties%20Act%20of%201988"
        }
    ],
    "annotations": [
        {
            "knowledge": [
                {
                    "content": "The Civil Liberties Act of 1988 (Pub.L. 100–383, title I, August 10, 1988, 102 Stat. 904, 50a U.S.C. § 1989b et seq.) is a United States federal law that granted reparations to Japanese Americans who had been interned by the United States government during World War II.",
                    "wikipage": "Civil Liberties Act of 1988"
                }
            ],
            "long_answer": "The Civil Liberties Act of 1988 is a United States federal law that granted reparations to Japanese Americans who had been interned by the United States government during World War II. In the act, the blame for the internment of U.S. citizens was placed on the people of the United States, by apologizing on behalf of them. Furthermore, the blame for the internment was placed on the United States government, by making them pay reparations."
        }
    ],
    "sample_id": -4557617869928758000
}
```

### Data Fields

* `ambiguous_question`: ambiguous question from AmbigQA.
* `annotations`: long-form answers to the ambiguous question constructed by ASQA annotators.
* `annotations`/`knowledge`: list of additional knowledge pieces.
* `annotations`/`knowledge`/`content`: a passage from Wikipedia.
* `annotations`/`knowledge`/`wikipage`: title of the Wikipedia page the passage was taken from.
* `annotations`/`long_answer`: the long-form answer written by the annotator.
* `qa_pairs`: Q&A pairs from AmbigQA which are used for disambiguation.
* `qa_pairs`/`context`: additional context provided.
* `qa_pairs`/`question`: disambiguated question from AmbigQA.
* `qa_pairs`/`short_answers`: list of short answers from AmbigQA.
* `qa_pairs`/`wikipage`: title of the Wikipedia page the additional context was taken from.
* `sample_id`: the unique id of the sample.
* `wikipages`: list of Wikipedia pages visited by AmbigQA annotators.
* `wikipages`/`title`: title of the Wikipedia page.
* `wikipages`/`url`: link to the Wikipedia page.

### Data Splits

|       | Instances |
|:----- |:---------:|
| Train |   4353    |
| Dev   |    948    |

## Reference

We would like to acknowledge Ivan Stelmakh et al. for creating and maintaining the ASQA dataset as a valuable resource for the natural language processing and machine learning research community.
For more information about the ASQA dataset and its creators, please visit [the ASQA website](https://github.com/google-research/language/tree/master/language/asqa).

## License

The dataset has been released under the Apache License 2.0.

## Citation

```
@inproceedings{aumiller-gertz-2022-klexikon,
    title = "ASQA: Factoid Questions Meet Long-Form Answers",
    author = "Ivan Stelmakh and Yi Luan and Bhuwan Dhingra and Ming-Wei Chang",
    month = Jan,
    year = "2023",
    url = "https://arxiv.org/abs/2204.06092",
    pages = "2204--06092"
}
```
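
## Usage Sketch

To make the field layout under Data Fields concrete, the following is a minimal sketch of how a single record might be traversed once it has been loaded as a Python dictionary. The helper `short_answer_coverage` is a hypothetical name introduced here for illustration and is not part of any ASQA tooling. It performs only a crude case-insensitive substring check, which is not the ROUGE or QA-accuracy evaluation mentioned in the Introduction. The record is trimmed from the example instance above.

```python
from typing import Any


def short_answer_coverage(record: dict[str, Any]) -> list[float]:
    """Hypothetical helper: for each annotation, return the fraction of
    QA pairs with at least one short answer appearing verbatim
    (case-insensitive) in that annotation's long answer."""
    qa_pairs = record["qa_pairs"]
    scores = []
    for annotation in record["annotations"]:
        text = annotation["long_answer"].lower()
        # Count QA pairs for which any listed short answer is a substring.
        hits = sum(
            any(answer.lower() in text for answer in pair["short_answers"])
            for pair in qa_pairs
        )
        scores.append(hits / len(qa_pairs) if qa_pairs else 0.0)
    return scores


# Trimmed copy of the example instance; only the fields used above are kept.
record = {
    "qa_pairs": [
        {"short_answers": ["the people of the United States"]},
        {"short_answers": ["United States government"]},
    ],
    "annotations": [
        {
            "long_answer": (
                "In the act, the blame for the internment of U.S. citizens "
                "was placed on the people of the United States, by apologizing "
                "on behalf of them. Furthermore, the blame for the internment "
                "was placed on the United States government, by making them "
                "pay reparations."
            )
        }
    ],
}

print(short_answer_coverage(record))  # prints [1.0]
```

Both short answers occur in the long answer, so the single annotation scores 1.0. A substring check like this is only a quick sanity test of the record structure; it says nothing about answer quality.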