Amai voice API

---
title: Amai voice API
description: Human like text to speech
robots: index, follow
lang: en
dir: ltr
tags: tts, api, rest, text to speech
image: https://demo.amai.io/static/media/logo.0cee2053.png
breaks: true
---

<style>
.footer-show { display: none !important; }
.ui-infobar__actions { display: none !important; }
</style>

# AMAI TTS API

[TOC]

# How it works

With this programming interface you can teach your application to speak with a human voice in different emotional tones and languages, and give your users a memorable voice UX with the Amai adaptive audio player.

The main entity in the TTS API is a `Stream`. After creating one, you can add text, change its content, manage settings, control access, view statistics, listen to the stream in [HTTP Live](https://developer.apple.com/streaming/) format in any compatible web or desktop media player, download the stream as MP3, and much more.

# How do I add a voice to my app?

1. Get information about available speakers
2. Create an audio stream with the default speaker
3. Add text to the stream with any speakers or emotions
4. Listen to the stream in any HLS compatible media player

# Bearer Authentication

Bearer authentication (also called token authentication) is an HTTP authentication scheme that involves security tokens. When accessing protected endpoints, set the following HTTP header:

```HTTP
Authorization: Bearer <token>
```

**At the moment, to get a token you need to send an e-mail to m@amai.io or write on [telegram](https://t.me/maxbaluev).**

# Endpoints

## Public

---

Public endpoints are available without authorization.

### Getting information about all available speakers
```typescript
enum ELang {
  'en' = 'en',
  'ru' = 'ru',
}

enum ESpeaker {
  'Kathrine' = 'Kathrine',
}

enum EEmotion {
  'Standard' = 'Standard',
  'Happiness' = 'Happiness',
  'Love' = 'Love',
  'Fear' = 'Fear',
  'Sadness' = 'Sadness',
  'Curious' = 'Curious',
  'Disrespect' = 'Disrespect',
  'Disappointment' = 'Disappointment',
  'Anger' = 'Anger',
}

// Mapped types are used because an index signature cannot be keyed by an enum.
// Keys are optional: a speaker may support only some languages (see the example).
type IVoices = {
  [speaker in ESpeaker]?: {
    description: string;
    langs: {
      [lang in ELang]?: { description: string };
    };
    emotions: {
      [emotion in EEmotion]?: { description: string };
    };
  };
};
```

#### Example

```JSON
{
  "Kathrine": {
    "description": "Kathrine female voice",
    "langs": {
      "en": { "description": "english" }
    },
    "emotions": {
      "Standard": { "description": "Standard" },
      "Happiness": { "description": "Happiness" },
      "Love": { "description": "Love" },
      "Fear": { "description": "Fear" },
      "Sadness": { "description": "Sadness" },
      "Curious": { "description": "Curious" },
      "Disrespect": { "description": "Disrespect" },
      "Disappointment": { "description": "Disappointment" },
      "Anger": { "description": "Anger" }
    }
  }
}
```

#### Request

```HTTP
GET /audio/voices HTTP/1.0
Content-Type: application/json
```

#### Response

```HTTP
200 OK: instanceof IVoices
400 Bad Request
```

## Protected

---

### Getting a list of all available synthesis streams

```typescript
interface IStream {
  id: string; // maybe xxhash of content? But emotions?
  description: string;
  name: string;
  blocks: IBlocks[]; // blocks of text (block == paragraph or line???)
  link: string; // link to the HLS manifest
}

interface IBlocks {
  id: string; // maybe xxhash of content?
  text: string | string[];
  status: TBlockStatus;
  duration?: number;
  accents?: number[]; // indexes of accent positions within the block
  replace?: IReplace[]; // replacements applied at the synthesis step
}

interface IReplace {
  startPosition: number;
  length: number;
  text: string;
  accents?: number[];
}

type TBlockStatus = 'processing' | 'completed';
```

#### Request

```HTTP
GET /audio/synthesis/list HTTP/1.0
Content-Type: application/json
```

#### Response

```HTTP
200 OK: instanceof IStream[]
404 Not Found: 'List of streams is empty'
400 Bad Request
```

### Create synthesis stream

> TODO The stream name must be unique to the current account/key.

```typescript
interface IDefaultSpeaker {
  id: number; // IVoice.id
  emotion?: EEmotion; // one of the emotions
}

type TCreateStreamRequest = {
  name?: string;
  speaker?: IDefaultSpeaker['id']; // IVoice.id
  public?: boolean; // true by default; for private streams, pass accessToken in the GET for manifest.m3u8
}
```

#### Request

```HTTP pseudo
POST /audio/synthesis/create HTTP/1.0
Content-Type: application/json

TCreateStreamRequest
```

#### Response

> TODO stream already exists

```HTTP pseudo
200 OK: instanceof IStream
400 Bad Request
```

### Add text to the stream

```typescript
interface IAddTextRequest {
  item: string | ITextBlock | ITextBlock[];
  blocks?: EBlocksResponse; // format of the blocks returned in the server response; 'none' by default
}

interface IEmotion {
  type: EEmotion;
  startPosition: number;
  length: number; // >= 1
}

interface ITextBlock {
  id?: string; // must be unique for the current stream
  text?: string;
  nextBlock: string; // next block id
  speaker?: IDefaultSpeaker['id'];
  emotion?: IEmotion | IEmotion[]; // each character of the text can have its own emotion, but only one
}

enum EBlocksResponse {
  'none', // without blocks
  'meta', // without texts
  'full', // full blocks JSON
  'text', // text, human-readable format
}
```

#### Request

```HTTP pseudo
POST /audio/synthesis/add/{:id} HTTP/1.0
Content-Type: application/json

IAddTextRequest
```

#### Response

```HTTP pseudo
200 OK: instanceof IStream
400 Bad Request
404 Not Found: 'Stream not found'
```

### Modify the stream

```typescript
interface IModifyTextRequest {
  item: IUpdateTextBlock | IRemoveTextBlock | IStreamSettings | IUpdateTextBlock[] | IRemoveTextBlock[];
  blocks?: EBlocksResponse;
}

interface IStreamSettings {
  changeAccessToken?: boolean; // issue a new manifest access token; the old tokens are revoked
  name?: string;
  description?: string;
  speaker?: IDefaultSpeaker;
}

// Same fields as ITextBlock
interface IUpdateTextBlock extends ITextBlock {}

interface IRemoveTextBlock {
  id: string;
}
```

#### Request

```HTTP pseudo
POST /audio/synthesis/modify/{:id} HTTP/1.0
Content-Type: application/json

IModifyTextRequest
```

#### Response

```HTTP pseudo
200 OK: instanceof IStream
400 Bad Request
404 Not Found: 'Stream not found'
```

### Get stream by id or name

> TODO the name must be unique to the account

#### Request

```HTTP pseudo
GET /audio/synthesis/{id: number | name: string} HTTP/1.0
Content-Type: application/json
```

#### Response

```HTTP pseudo
200 OK: instanceof IStream
400 Bad Request
404 Not Found
```

### Get stream statistics

```typescript
// TODO @lazovix add IStreamStatistic
```

#### Request

```HTTP pseudo
GET /audio/synthesis/statistic/{id: number} HTTP/1.0
Content-Type: application/json
```

#### Response

```HTTP pseudo
200 OK: instanceof IStreamStatistic
400 Bad Request
404 Not Found: 'Stream not found'
```

## Public or Private

---

### Get HLS Manifest

Normally you don't need to call this endpoint manually. The link to the manifest is returned when you create or get a stream (`IStream.link`); pass that link to the audio player. Private streams can be protected with an `accessToken`.
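If you do need to assemble the manifest link yourself, the route documented in this section can be filled in roughly as follows. This is only a sketch: `buildManifestUrl` is a hypothetical helper name and `https://api.example.com` is a placeholder host, neither of which comes from the API itself.

```typescript
// Local copy of the playlist type so the sketch is self-contained.
type TPlaylist = 'playlist' | 'manifest';

// Hypothetical helper: builds the .m3u8 URL for a stream.
// `baseUrl` is an assumption; use the host your token was issued for.
function buildManifestUrl(
  baseUrl: string,
  streamId: number,
  playlist: TPlaylist = 'manifest',
  accessToken?: string,
): string {
  const root = baseUrl.replace(/\/+$/, ''); // drop trailing slashes
  const path = `${root}/audio/synthesis/${streamId}/${playlist}.m3u8`;
  // Public streams need no token; private streams pass it as a query parameter.
  return accessToken === undefined
    ? path
    : `${path}?accessToken=${encodeURIComponent(accessToken)}`;
}
```

In most cases the `link` field of the stream object should be preferred over a hand-built URL, since it is what the server itself hands out.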
```typescript
type TPlaylist = 'playlist' | 'manifest';
```

#### Request

```HTTP pseudo
GET /audio/synthesis/{id: number}/{playlist: TPlaylist}.m3u8?accessToken={token} HTTP/1.0
Content-Type: application/json
```

#### Response

```HTTP
200 OK: text
204 No Content: 'Stream is empty or data in the synthesis process'
400 Bad Request
404 Not Found: 'Forbidden or not exist'
```

# Access control

Our backend is designed so that you can flexibly manage the access rights of your application's users. This lets you use TTS inside web3 dapps, build applications without a backend of your own, and charge users based on their API usage statistics.

## How does it work?

TODO After registering, you authenticate with a login/password pair and receive a master JWT key in response. With the master key you can create `Personal Access Tokens` with limited access to data and/or endpoints. To access @amai

/*
```sequence
Title: Keys management
Customer user-->User: Normal line
User-->Server: Dashed line
Customer user-->Server: Normal line
```
*/
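
# Example: request sketch

The first three steps from "How do I add a voice to my app?" can be sketched as plain request descriptions. Everything named here (`ApiRequest`, `BASE_URL`, `authHeaders`, `listVoices`, `createStream`, `addText`) is hypothetical and exists only for illustration; the paths, the `Authorization: Bearer` header, and the body fields are taken from the endpoint descriptions above. Treat it as a starting point under those assumptions, not as an official client.

```typescript
// Plain description of an HTTP request; sending it is left to the caller.
interface ApiRequest {
  method: 'GET' | 'POST';
  url: string;
  headers: Record<string, string>;
  body?: string;
}

const BASE_URL = 'https://api.example.com'; // placeholder host (assumption)

// Adds the Bearer header only when a token is supplied.
function authHeaders(token?: string): Record<string, string> {
  const headers: Record<string, string> = { 'Content-Type': 'application/json' };
  if (token !== undefined) headers['Authorization'] = `Bearer ${token}`;
  return headers;
}

// Step 1: public endpoint, no token required.
function listVoices(): ApiRequest {
  return { method: 'GET', url: `${BASE_URL}/audio/voices`, headers: authHeaders() };
}

// Step 2: create a stream (protected endpoint).
function createStream(token: string, name: string, isPublic = true): ApiRequest {
  return {
    method: 'POST',
    url: `${BASE_URL}/audio/synthesis/create`,
    headers: authHeaders(token),
    body: JSON.stringify({ name, public: isPublic }),
  };
}

// Step 3: add plain text to an existing stream (protected endpoint).
function addText(token: string, streamId: string, text: string): ApiRequest {
  return {
    method: 'POST',
    url: `${BASE_URL}/audio/synthesis/add/${encodeURIComponent(streamId)}`,
    headers: authHeaders(token),
    body: JSON.stringify({ item: text }),
  };
}

// Step 4 needs no request builder: pass the stream's `link` to an HLS player.
```

Each returned object can be passed to any HTTP client, for example `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })`. Keeping the builders free of network calls makes them easy to inspect and test.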

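
# Example: validating emotion ranges

The `emotion` field of a text block in "Add text to the stream" lets each character carry its own emotion, but only one. A client could check that rule before sending a block, along the lines of the sketch below. `validateEmotions` is a hypothetical helper, and the sketch assumes that `startPosition` is a zero-based character index, which the description does not state.

```typescript
// Minimal local copy of the documented shape; `type` is widened to string for brevity.
interface IEmotion {
  type: string;
  startPosition: number;
  length: number; // >= 1
}

// Returns a list of problems; an empty list means the ranges look usable.
// Assumption: positions are zero-based indexes into `text`.
function validateEmotions(text: string, emotions: IEmotion[]): string[] {
  const problems: string[] = [];
  // Sort a copy by start position so overlaps can be found in one pass.
  const sorted = [...emotions].sort((a, b) => a.startPosition - b.startPosition);
  let previousEnd = 0; // first index not covered by any earlier range
  for (const e of sorted) {
    const end = e.startPosition + e.length;
    if (e.length < 1) {
      problems.push(`${e.type}: length must be >= 1`);
    }
    if (e.startPosition < 0 || end > text.length) {
      problems.push(`${e.type}: range is outside the text`);
    }
    if (e.startPosition < previousEnd) {
      problems.push(`${e.type}: overlaps an earlier range`);
    }
    previousEnd = Math.max(previousEnd, end);
  }
  return problems;
}
```

For the text `Hello, world`, the ranges `{ startPosition: 0, length: 5 }` and `{ startPosition: 7, length: 5 }` pass, while a second range starting at position 3 would be reported as overlapping the first.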