# 1200 Words

This dataset is a collection of the 1200 most common words in Turkish literature for children.

## Dataset Details

The words in this dataset are the 1200 most commonly used words among the 0-6 age group. The dataset contains the words, their frequencies, and their first and second super categories.

| Label | Description |
|-------|-------------|
| genel kelimeler | General words |
| bitkiler | Plants |
| fiiller | Verbs |
| ... | ... |

### Samples

```
{
  "word": "hava",
  "frequency": "55",
  "first_super_category": {"category_name": "Genel isimler", "category_id": "7", "sub_category_id": "a"},
  "second_super_category": {"category_name": "Genel kelimeler", "category_id": "7"}
}
```

### Fields

| field | dtype |
|-------|-------|
| word | string |
| frequency | integer |
| first_super_category | dictionary |
| second_super_category | dictionary |
| category_name | string |
| category_id | integer |

`category_name` and `category_id` are keys of the two category dictionaries. In the sample above, `frequency` and `category_id` are serialized as strings, and `first_super_category` also carries a `sub_category_id` key.

### Splits

Train/validation/test split sizes are not indicated.

## Dataset Creation

### Curation Rationale

The dataset is part of Saadettin Keklik's research on Turkish literature for children.

### Data Source

### Annotations

### Quality

### Personal and Sensitive Information

## Considerations

### Social Impact of Dataset

This dataset is part of an effort to research Turkish literature. It can be used for information extraction or conceptual networking studies.

## Additional Information

### Dataset Curators

Published by Saadettin Keklik.

### Citation Information

Please cite the following paper if you find this dataset useful:

Saadettin Keklik, "Türkçede 0-6 Yaş Çocuklarına Öğretilmesi Gereken, En Sık Kullanılan 1200 Kelime", Türkiye Sosyal Araştırmalar Dergisi, December 2010.
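
### Usage Sketch

The sample record stores `frequency` and `category_id` as strings, while the field list gives them as integers. The sketch below shows one way to parse the sample and cast those fields. `normalize_record` is a hypothetical helper, not part of the dataset. It assumes that every record has the same keys as the sample and that the numeric fields are digit strings, neither of which is confirmed here.

```python
import json

# The sample record from the "Samples" section, copied verbatim.
SAMPLE = """
{
  "word": "hava",
  "frequency": "55",
  "first_super_category": {"category_name": "Genel isimler", "category_id": "7", "sub_category_id": "a"},
  "second_super_category": {"category_name": "Genel kelimeler", "category_id": "7"}
}
"""


def normalize_record(raw: dict) -> dict:
    """Return a copy of a record with numeric fields cast to int.

    Hypothetical helper: assumes the record has the same keys as the
    sample and that frequency and category_id are digit strings.
    """
    record = dict(raw)
    record["frequency"] = int(raw["frequency"])
    for key in ("first_super_category", "second_super_category"):
        category = dict(raw[key])  # copy so the input is left untouched
        category["category_id"] = int(category["category_id"])
        record[key] = category
    return record


record = normalize_record(json.loads(SAMPLE))
print(record["word"], record["frequency"])
```

Other keys, such as `sub_category_id`, are copied through unchanged, since the sample gives it as a letter rather than a number.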