###### tags: `Common Voice`,`CC0-Corpus`
# 台語 Common Voice 計畫共筆
Shared notes for planning Taiwanese in Common Voice
## Language Code: nan-tw
Specific to "Minnan in Taiwan" (Taiwanese).
- ISO 639-3 defines [nan](https://iso639-3.sil.org/code/nan) as Minnan (Hokkien); the code is shared by Taiwanese Hokkien, Hokkien in southeastern China, and Hokkien in Southeast Asia.
- Because of vocabulary differences, sentences that are politically sensitive in China, and the different Latin/phonetic systems used in Taiwan and China, we suggest appending tw to nan to specify that the locale is for Taiwanese.
## 網站介面 UI translation on Pontoon
### options
1. *(preferred)* Translate the UI into Taiwanese
    - Language code: nan-tw
    - Han characters (漢字) with the Tâi-lô (TL, 台羅) phonetic system
    - Fork zh-tw into nan-tw at the start; the Taiwanese community will then translate the Taiwan Mandarin strings into Taiwanese one by one
    - Translation progress does not need to be a blocker.
2. Use the zh-tw UI directly, without a new Pontoon locale
    - This would not be a problem for website users, because more than 99% of potential recorders should be able to read Chinese
    - (Irvin) I believe there won't be any users who can only read Taiwanese
    - People present from the Taiwanese language community strongly disagree with this option, because it implies that Taiwanese is a sub-language of Taiwan Mandarin.
## 句庫 Text Corpus
### Writing system options
1. All Han characters
    - Follow the MOE (Ministry of Education) standard
2. All Latin alphabet (Lô-má-jī)
    - Problem: when a word has multiple accents, which one should be spelled?
        - Should we list every accent as a separate sentence and ask people to record according to the spelling?
        - No, we shouldn't ask people to pronounce sentences in a non-native accent.
3. *(preferred)* Mainly Han characters, supplemented by Lô-má-jī
    - Lô-má-jī is used only when the word
        - is not in the MOE dictionary
        - is ambiguous in pronunciation when written only in Han characters
        - has no corresponding Han characters
        - cannot be displayed on mobile or older devices (...tbd)
        - has a variety of readings that must be marked
### examples
<!-- some Taiwanese proverbs-->
- 上愛食番仔番薯
- 嫁著做田翁,無法梳頭鬃
- 人肉鹹鹹,袂食得
- 媽祖宮起毋著面,痟的出袂盡
- 一千銀,毋值四兩
- 七月頓頓飽,八月攏無巧(khá)
- 三个錢尪仔,栽四个錢喙鬚
- 了錢生理無人做,刣頭生理有人做
- 五支指頭仔咬起來逐支嘛疼
- 二更更,三暝暝
- 四算錢,五燒香,六拜年
- 七七四十九(sù/sìr-si̍p-kiú)
- 問娘(mn̄g/muī)何月(hô gue̍h/ge̍h/ge̍rh)有(iú)
- 除卻(tû/tîr/tî-khioh)母(bó)生年(senn/sinn-nî)
- 再添(tsài thiam)一十九(it-si̍p-kiú)
- 交陪醫生腹肚做藥櫥,交陪牛販仔駛瘦牛
- 人無橫財袂富,馬無野草袂肥
- 偷食袂瞞得喙齒,討翁袂瞞得鄉里
- 棚頂做甲流汗,棚跤嫌甲流瀾
- 兄弟若手足,某囝若衫褲
- 勸人𬦰(peh)上樹,樓梯夯咧走
- 南斗註生,北斗註死
- 呂洞賓葫蘆內的藥,醫別人無醫家己
- 和好人行,有布通經;和歹人行,有囝通生
- 善的掠來縛,惡的放伊去
- 你這款病,是腹肚內有應聲蟲咧作怪
- 這馬症頭已經誠嚴重矣
- 若閣拖落去毋趕緊共治予好,早慢會穢著你的某囝
- 你提轉去了後,就一項一項共伊讀出來
- 若拄著應聲蟲毋敢應的,你就用彼帖藥仔來治伊
- 我散赤閣頇顢,萬項代誌都袂曉
- 只會當靠這來趁食過日
- 論真講,想欲對付伊是真簡單
- 沓沓仔觀察伊的反應,就知影伊有啥物臭空矣
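
Several of the examples above mark a reading in parentheses right after the Han characters it applies to, e.g. `巧(khá)`. The following is a minimal Python sketch of how such annotations could be separated from the sentence text when preparing the corpus. It assumes readings are always wrapped in ASCII or full-width parentheses and are never nested; `ANNOTATION` and `split_annotations` are hypothetical names used for illustration, not part of any Common Voice tooling.

```python
import re

# Assumption: a reading is any run of text inside ASCII () or full-width （） parentheses.
ANNOTATION = re.compile(r"[(（]([^()（）]*)[)）]")


def split_annotations(sentence):
    """Return (plain_text, readings) for one corpus sentence.

    plain_text is the sentence with every parenthesised reading removed;
    readings is the list of annotation strings in order of appearance.
    """
    readings = [match.group(1) for match in ANNOTATION.finditer(sentence)]
    plain_text = ANNOTATION.sub("", sentence)
    return plain_text, readings


if __name__ == "__main__":
    text, readings = split_annotations("勸人𬦰(peh)上樹")
    print(text)      # the Han-character sentence without the annotation
    print(readings)  # the list of readings found
```

For example, `split_annotations("勸人𬦰(peh)上樹")` returns `("勸人𬦰上樹", ["peh"])`. The sketch deliberately leaves alternatives such as `sù/sìr-si̍p-kiú` as a single string, because the `/` there covers only part of the reading and the notes do not define how such variants should be expanded.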