# 140 Poem
The 140 Poem dataset contains 140 Turkish poems by 7 different authors, with 20 poems per author. The dataset was generated by the [Kemik Natural Language Processing Group](http://www.kemik.yildiz.edu.tr/).
## Dataset Details
The dataset consists of 140 singly authored documents written by 7 different authors, with 20 different texts written by each author. The average length of texts is 109 words.
### Samples
A sample instance is presented below.
```
Acının tutanakçısıyım
Acının tutanakçısıyım
Anlatıp dururum aşkları
Ayrılıkları ve o destan
Yalnızlığını ömrümüzün
Göçebe, Gezgin ve Aylak
Birmiydim aklıma gelmedi
ir çingeneyle bir bilici
Hep ayni şeydi bildiğim
Ve serseriliğimdi aşklar
Bir masalcıydım belki de
Yaşadım o büyük serüvenleri
Yolculuklar tarihimdi benim
Acılar yaşanıyordu yurdumda
Pespese yakılıyordu kentler
Bense hep oralardaydım
Daha yangın başlamadan önce
```
### Fields
Each file contains a single poem, and poems by the same author are stored in the same directory.
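The directory layout described above can be read with a few lines of Python. This is a minimal sketch, not an official loader: the function name `load_poems`, the root path, and the default `utf-8` encoding are assumptions, and the files may be stored in a different encoding (pass another value for `encoding` if so).

```python
from pathlib import Path


def load_poems(root, encoding="utf-8"):
    """Return a list of (text, author) pairs.

    Assumes one subdirectory per author under `root`, with one poem per file.
    The directory name is used as the author label.
    """
    samples = []
    for author_dir in sorted(Path(root).iterdir()):
        if not author_dir.is_dir():
            continue  # skip stray files at the top level
        for poem_file in sorted(author_dir.iterdir()):
            if poem_file.is_file():
                # errors="replace" keeps loading even if the encoding guess is wrong
                text = poem_file.read_text(encoding=encoding, errors="replace")
                samples.append((text, author_dir.name))
    return samples
```

Calling `load_poems("path/to/140_poem")` (a hypothetical path) would be expected to yield 140 pairs if the layout matches the description.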
### Splits
No split is provided by the dataset creators.
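Since no official split exists, users need to create their own. The sketch below shows one possible approach, a per-author (stratified) hold-out, so that every author appears in both parts. The function name `stratified_split`, the hold-out size of 4 poems per author, and the seed are illustrative assumptions, not choices made by the dataset creators.

```python
import random
from collections import defaultdict


def stratified_split(samples, test_per_author=4, seed=0):
    """Split (text, author) pairs so each author contributes
    `test_per_author` poems to the test set and the rest to training."""
    by_author = defaultdict(list)
    for text, author in samples:
        by_author[author].append(text)

    rng = random.Random(seed)  # fixed seed for a reproducible split
    train, test = [], []
    for author in sorted(by_author):
        texts = by_author[author][:]
        rng.shuffle(texts)
        test += [(t, author) for t in texts[:test_per_author]]
        train += [(t, author) for t in texts[test_per_author:]]
    return train, test
```

With 20 poems per author and the assumed default of 4 held-out poems, this gives 112 training and 28 test poems across the 7 authors.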
## Dataset Creation
### Curation Rationale
The main goal of this dataset is to support text classification by author, i.e. identifying the poet of a given poem.
### Data Source
The authors gathered the poems from poetry websites.
## Considerations
### Social Impact of Dataset
This dataset is part of an effort to encourage text classification research in languages other than English. Such work increases the accessibility of natural language technology to more regions and cultures. It is also important for studies of non-formal representations of the language.
### Dataset Curators
Published by Banu Diri and Fatih Amasyali.
### Citation Information
```
@article{diri2006identifying,
author = {Diri, B. and Amasyalı, M. F.},
title = {Identifying the poets of the anonymous poems},
journal = {TAINN},
address = {Canakkale},
year = {2003}
}
```