###### tags: `Common Voice`,`CC0-Corpus`

# 台語 Common Voice 計畫共筆 Shared notes on planning Taiwanese in Common Voice

## Language Code: nan-tw

Specific to "Minnan in Taiwan" (Taiwanese)

- ISO 639-3 defines [nan](https://iso639-3.sil.org/code/nan) as Minnan (Hokkien). The code is shared by Taiwanese Hokkien, Hokkien in southeastern China, and Hokkien in Southeast Asia.
- Because of vocabulary differences, sentences that are politically sensitive in China, and the different romanization/phonetic systems used in Taiwan and China, we suggest adding tw to nan to specify that the locale is for Taiwanese.

## 網站介面 UI translation on Pontoon

### options

1. *(prefer)* Translate UI into Taiwanese
    - language code: nan-tw
    - Han characters (漢字) with the Tâi-lô (TL, 台羅) phonetic system
    - Fork zh-tw into nan-tw at the beginning; the Taiwanese community will then translate the Taiwan Mandarin strings into Taiwanese one by one
    - Translation progress does not need to be a blocker.
2. Use the zh-tw UI directly without a new Pontoon locale
    - This won't be a problem for website users, because more than 99% of potential recorders should be able to read Chinese
        - (Irvin) I believe there won't be any user who can only read Taiwanese
    - Members of the Taiwanese language community who were present strongly disagree with this option, because it implies that Taiwanese is a sub-language of Taiwan Mandarin.

## 句庫 Text Corpus

### Writing system options

1. All Han characters
    - Follow the MOE (Ministry of Education) standard
2. All Latin alphabet (Lô-má-jī)
    - Problem: when multiple accents exist, which one should be spelled out?
        - Should we list all the accents as different sentences and ask people to record according to the spelling?
            - No, we shouldn't ask people to pronounce sentences in non-native accents.
3. *(prefer)* Mainly Han characters, supplemented by Lô-má-jī
    - Lô-má-jī is only used when the word
        - is not listed in the MOE dictionary
        - has an ambiguous pronunciation when written only in Han characters
        - has no corresponding Han characters
        - cannot be displayed on mobile or older devices (...tbd)
        - has several readings, so the intended one must be marked

### examples

<!-- some Taiwanese proverbs-->

- 上愛食番仔番薯
- 嫁著做田翁,無法梳頭鬃
- 人肉鹹鹹,袂食得
- 媽祖宮起毋著面,痟的出袂盡
- 一千銀,毋值四兩
- 七月頓頓飽,八月攏無巧(khá)
- 三个錢尪仔,栽四个錢喙鬚
- 了錢生理無人做,刣頭生理有人做
- 五支指頭仔咬起來逐支嘛疼
- 二更更,三暝暝
- 四算錢,五燒香,六拜年
- 七七四十九(sù/sìr-si̍p-kiú)
- 問娘(mn̄g/muī)何月(hô gue̍h/ge̍h/ge̍rh)有(iú)
- 除卻(tû/tîr/tî-khioh)母(bó)生年(senn/sinn-nî)
- 再添(tsài thiam)一十九(it-si̍p-kiú)
- 交陪醫生腹肚做藥櫥,交陪牛販仔駛瘦牛
- 人無橫財袂富,馬無野草袂肥
- 偷食袂瞞得喙齒,討翁袂瞞得鄉里
- 棚頂做甲流汗,棚跤嫌甲流瀾
- 兄弟若手足,某囝若衫褲
- 勸人𬦰(peh)上樹,樓梯夯咧走
- 南斗註生,北斗註死
- 呂洞賓葫蘆內的藥,醫別人無醫家己
- 和好人行,有布通經;和歹人行,有囝通生
- 善的掠來縛,惡的放伊去
- 你這款病,是腹肚內有應聲蟲咧作怪
- 這馬症頭已經誠嚴重矣
- 若閣拖落去毋趕緊共治予好,早慢會穢著你的某囝
- 你提轉去了後,就一項一項共伊讀出來
- 若拄著應聲蟲毋敢應的,你就用彼帖藥仔來治伊
- 我散赤閣頇顢,萬項代誌都袂曉
- 只會當靠這來趁食過日
- 論真講,想欲對付伊是真簡單
- 沓沓仔觀察伊的反應,就知影伊有啥物臭空矣
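
The mixed writing in the examples above (Han characters with Lô-má-jī in parentheses) could be checked or split automatically when preparing the corpus. The following is only a minimal sketch, not part of the plan itself. It assumes that each Lô-má-jī annotation sits in ASCII parentheses directly after the word it marks, as in the examples, and that parentheses are never nested. The function name `split_annotations` is hypothetical.

```python
import re

# Assumption: one annotation = text between ASCII parentheses, not nested,
# e.g. "(khá)" or "(mn̄g/muī)". Variant readings inside stay as one raw string.
ANNOTATION = re.compile(r"\(([^()]*)\)")


def split_annotations(sentence: str) -> tuple[str, list[str]]:
    """Return the Han-only text and the raw Lô-má-jī annotations, in order."""
    readings = ANNOTATION.findall(sentence)
    han_only = ANNOTATION.sub("", sentence)
    return han_only, readings


if __name__ == "__main__":
    samples = [
        "七月頓頓飽,八月攏無巧(khá)",
        "再添(tsài thiam)一十九(it-si̍p-kiú)",
        "南斗註生,北斗註死",
    ]
    for s in samples:
        han, readings = split_annotations(s)
        print(han, readings)
```

The annotations are kept as raw strings because a slash may separate variants of a single syllable rather than of the whole reading (as in `sù/sìr-si̍p-kiú`), so splitting on `/` would need a rule the notes have not yet defined.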