---
title: Amai voice API
description: Human like text to speech
robots: index, follow
lang: en
dir: ltr
tags: tts, api, rest, text to speech
image: https://demo.amai.io/static/media/logo.0cee2053.png
breaks: true
---
<style>
.footer-show {
display: none !important;
}
.ui-infobar__actions {
display: none !important;
}
</style>
# AMAI TTS API
[TOC]
# How it works
With this API you can teach your application to speak in a human voice, with different emotional tones and languages, and give your users a memorable voice UX with the AMAI adaptive audio player.
The main entity in the TTS API is a `Stream`. After creating one, you can add text, change its content, manage settings, control access, view statistics, listen to the stream in [HTTP Live Streaming](https://developer.apple.com/streaming/) (HLS) format in any compatible web or desktop media player, download the stream as MP3, and more.
# How do I add a voice to my app?
1. Get information about available speakers
2. Create an audio stream with the default speaker
3. Add text to the stream with any speakers or emotions
4. Listen to the stream in any HLS compatible media player
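The four steps above can be sketched as a single function. This is a minimal sketch, not an official client: `Send` and `createSpokenStream` are hypothetical names, and the HTTP transport is injected so that no base URL or HTTP library is assumed. Only the paths and the `id` and `link` fields come from the endpoint descriptions in this document.

```typescript
// Hypothetical transport abstraction: performs one request and resolves to the parsed JSON body.
type Send = (method: "GET" | "POST", path: string, body?: unknown) => Promise<any>;

// Runs the four steps and resolves to the HLS manifest link of the new stream.
async function createSpokenStream(send: Send, name: string, text: string): Promise<string> {
  // 1. Get information about available speakers.
  const voices = await send("GET", "/audio/voices");
  if (Object.keys(voices).length === 0) throw new Error("No speakers available");

  // 2. Create an audio stream with the default speaker.
  const stream = await send("POST", "/audio/synthesis/create", { name });

  // 3. Add text to the stream (plain string form of `item`).
  await send("POST", `/audio/synthesis/add/${stream.id}`, { item: text });

  // 4. Pass this link to any HLS-compatible media player.
  return stream.link;
}
```

Injecting `send` keeps the sketch testable and leaves authentication and error handling to the caller.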
# Bearer Authentication
Bearer authentication (also called token authentication) is an HTTP authentication scheme that involves security tokens. When accessing protected endpoints, set the following HTTP header:
```HTTP
Authorization: Bearer <token>
```
**At the moment, to get a token you need to send an e-mail to m@amai.io or a message on [Telegram](https://t.me/maxbaluev).**
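As a small illustration of the header above, the helper below builds the headers for a protected request. `authHeaders` is a hypothetical name, and rejecting an empty token is an assumption made for this sketch, not documented server behavior.

```typescript
// Builds the headers for a protected endpoint from a previously issued token.
function authHeaders(token: string): Record<string, string> {
  // Assumption: fail early on an empty token instead of sending a useless request.
  if (token.trim() === "") throw new Error("Bearer token must not be empty");
  return {
    Authorization: `Bearer ${token}`,
    "Content-Type": "application/json",
  };
}
```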
# Endpoints
## Public
---
Public endpoints are available without authorization.
### Getting information about all available speakers
```typescript
enum ELang {
'en' = 'en',
'ru' = 'ru',
}
enum ESpeaker {
'Kathrine' = 'Kathrine',
}
enum EEmotion {
'Standard' = 'Standard',
'Happiness' = 'Happiness',
'Love' = 'Love',
'Fear' = 'Fear',
'Sadness' = 'Sadness',
'Curious' = 'Curious',
'Disrespect' = 'Disrespect',
'Disappointment' = 'Disappointment',
'Anger' = 'Anger',
}
// Mapped types are used because enum members cannot be index-signature keys.
type IVoices = {
[speaker in ESpeaker]: {
description: string;
langs: {
[lang in ELang]?: {
description: string;
}
};
emotions: {
[emotion in EEmotion]?: {
description: string;
}
};
}
}
```
#### Example
```JSON
{
"Kathrine": {
"description": "Kathrine female voice",
"langs": {
"en": {
"description": "english"
}
},
"emotions": {
"Standard": {
"description": "Standard"
},
"Happiness": {
"description": "Happiness"
},
"Love": {
"description": "Love"
},
"Fear": {
"description": "Fear"
},
"Sadness": {
"description": "Sadness"
},
"Curious": {
"description": "Curious"
},
"Disrespect": {
"description": "Disrespect"
},
"Disappointment": {
"description": "Disappointment"
},
"Anger": {
"description": "Anger"
}
}
}
}
```
#### Request
```HTTP
GET /audio/voices HTTP/1.0
Content-Type: application/json
```
#### Response
```HTTP
200 OK: instanceof IVoices
400 Bad Request
```
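Once the voices object is loaded, an application usually needs to pick a speaker for a given language or check that an emotion is supported. The following is a minimal sketch over the response shape shown above; `Voices`, `speakersForLang`, and `supportsEmotion` are hypothetical names, and plain string keys are used instead of the enums to keep the block self-contained.

```typescript
type VoiceInfo = {
  description: string;
  langs: Record<string, { description: string }>;
  emotions: Record<string, { description: string }>;
};
type Voices = Record<string, VoiceInfo>;

// Returns the names of speakers that list the given language code.
function speakersForLang(voices: Voices, lang: string): string[] {
  return Object.keys(voices).filter((name) => {
    const info = voices[name];
    return info !== undefined && lang in info.langs;
  });
}

// True when the speaker exists and lists the requested emotion.
function supportsEmotion(voices: Voices, speaker: string, emotion: string): boolean {
  const info = voices[speaker];
  return info !== undefined && emotion in info.emotions;
}
```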
## Protected
---
### Getting a list of all available synthesis streams
```typescript
interface IStream {
id: string; // maybe xxhash of content? But emotions?
description: string;
name: string;
blocks: IBlocks[]; // blocks of text (block == paragraph or line???)
link: string; // link to hls manifest
}
interface IBlocks {
id: string; // maybe xxhash of content?
text: string | string[];
status: TBlockStatus;
duration?: number;
accents?: number[]; // indexes of block accents positions
replace?: IReplace[]; // replace data on synthesis step
}
interface IReplace {
startPosition: number;
length: number;
text: string;
accents?: number[];
}
type TBlockStatus = 'processing' | 'completed';
```
#### Request
```HTTP
GET /audio/synthesis/list HTTP/1.0
Content-Type: application/json
```
#### Response
```HTTP
200 OK: instanceof IStream[]
404 Not Found: 'List of streams is empty'
400 Bad Request
```
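Because an empty list is reported as `404` rather than as an empty array, client code may want to normalize the result. The sketch below shows one way to do that. `StreamSummary` and `parseStreamList` are hypothetical names, and it assumes the `200` body is either an array of stream objects or a single stream object, since the exact response shape is not confirmed here.

```typescript
type StreamSummary = { id: string; name: string; link: string };

// Maps the documented status codes to a plain array, treating the documented
// 404 ('List of streams is empty') as an empty result instead of an error.
function parseStreamList(status: number, body: unknown): StreamSummary[] {
  if (status === 404) return [];
  if (status !== 200) throw new Error(`Unexpected status ${status}`);
  // Assumption: a single stream object is wrapped so callers always get an array.
  return Array.isArray(body) ? (body as StreamSummary[]) : [body as StreamSummary];
}
```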
### Create synthesis stream
> TODO The stream name must be unique to the current account/key.
```typescript
interface IDefaultSpeaker {
id: number; // IVoice.id
emotion?: EEmotion; // one of emotions
}
type TCreateStreamRequest = {
name?: string;
speaker?: number; // IVoice.id
public?: boolean; // true by default, set accessToken inside GET for manifest.m3u8
}
```
#### Request
```HTTP pseudo
POST /audio/synthesis/create HTTP/1.0
Content-Type: application/json
TCreateStreamRequest
```
#### Response
> TODO stream exist
```HTTP pseudo
200 OK: instanceof IStream
400 Bad Request
```
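A create request can be assembled as plain data before it is handed to an HTTP client. This is a minimal sketch: `buildCreateStreamRequest` is a hypothetical helper, and the returned object shape (method, path, headers, body) is an illustration rather than a documented format. Only the path and the `name` and `public` fields come from `TCreateStreamRequest` above.

```typescript
type CreateStreamRequest = { name?: string; public?: boolean };

// Describes the POST request that creates a stream; sending it is left to the caller.
function buildCreateStreamRequest(token: string, name: string, isPublic: boolean = true) {
  const payload: CreateStreamRequest = { name, public: isPublic };
  return {
    method: "POST",
    path: "/audio/synthesis/create",
    headers: {
      Authorization: `Bearer ${token}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(payload),
  };
}
```

Passing `false` for `isPublic` corresponds to a private stream whose manifest is then requested with an `accessToken`.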
### Add text to the stream
```typescript
interface IAddTextRequest {
item: string | ITextBlock | ITextBlock[];
blocks?: EBlocksResponse; // default none, return blocks format from server response
}
interface IEmotion {
type: EEmotion;
startPosition: number;
length: number; // >= 1
}
interface ITextBlock {
id?: string; // must be unique for the current stream
text?: string;
nextBlock: string; // next block id
speaker?: number;
emotion?: IEmotion | IEmotion[] // each character of the text can have its own emotion, but each has only one.
}
enum EBlocksResponse {
'none', // without blocks
'meta', // without texts
'full', // full blocks json
'text' // text, human like format
}
```
#### Request
```HTTP pseudo
POST /audio/synthesis/add/{:id} HTTP/1.0
Content-Type: application/json
IAddTextRequest
```
#### Response
```HTTP pseudo
200 OK: instanceof IStream
400 Bad Request
404 Not Found: 'Stream not found'
```
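Since each character of a block can carry only one emotion, it can help to check emotion spans on the client before sending them. The sketch below is one possible check, not the server's validation: `EmotionSpan` and `validateEmotions` are hypothetical names, and it assumes `startPosition` is a zero-based character index into `text`.

```typescript
type EmotionSpan = { type: string; startPosition: number; length: number };

// Returns a list of problems; an empty list means the spans are consistent.
function validateEmotions(text: string, spans: EmotionSpan[]): string[] {
  const problems: string[] = [];
  // Sort a copy by start position so overlaps can be found in one pass.
  const sorted = spans.slice().sort((a, b) => a.startPosition - b.startPosition);
  let previousEnd = 0;
  for (const span of sorted) {
    const end = span.startPosition + span.length;
    if (span.length < 1) problems.push(`${span.type}: length must be >= 1`);
    if (span.startPosition < 0 || end > text.length) {
      problems.push(`${span.type}: span is outside the text`);
    }
    if (span.startPosition < previousEnd) {
      problems.push(`${span.type}: overlaps the previous span`);
    }
    previousEnd = Math.max(previousEnd, end);
  }
  return problems;
}
```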
### Modify the stream
```typescript
interface IModifyTextRequest {
item: IUpdateTextBlock | IRemoveTextBlock | IStreamSettings | IUpdateTextBlock[] | IRemoveTextBlock[];
blocks?: EBlocksResponse;
}
interface IStreamSettings {
changeAccessToken?: boolean; // Update manifest access token. The old tokens will be revoked.
name?: string;
description?: string;
speaker?: IDefaultSpeaker;
}
interface IUpdateTextBlock extends ITextBlock {}
interface IRemoveTextBlock {
id: string;
}
```
#### Request
```HTTP pseudo
POST /audio/synthesis/modify/{:id} HTTP/1.0
Content-Type: application/json
IModifyTextRequest
```
#### Response
```HTTP pseudo
200 OK: instanceof IStream
400 Bad Request
404 Not Found: 'Stream not found'
```
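Two common modifications are removing blocks and changing stream settings. The following sketch builds the corresponding request bodies. `removeBlocks` and `updateSettings` are hypothetical helpers, and it is an assumption that an `item` entry carrying only `id` is treated as a removal, which is inferred from the shape of `IRemoveTextBlock`.

```typescript
type RemoveBlock = { id: string };
type StreamSettings = { changeAccessToken?: boolean; name?: string; description?: string };

// Body that removes the given blocks from a stream.
function removeBlocks(blockIds: string[]): { item: RemoveBlock[] } {
  return { item: blockIds.map((id) => ({ id })) };
}

// Body that renames a stream and optionally rotates the manifest access token.
function updateSettings(name: string, rotateToken: boolean = false): { item: StreamSettings } {
  const item: StreamSettings = { name };
  // Rotating the token revokes the old one, so it is opt-in here.
  if (rotateToken) item.changeAccessToken = true;
  return { item };
}
```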
### Get stream by id or name
> TODO the name must be unique to the account
#### Request
```HTTP pseudo
GET /audio/synthesis/{id: number | name: string} HTTP/1.0
Content-Type: application/json
```
#### Response
```HTTP pseudo
200 OK: instanceof IStream
400 Bad Request
404 Not Found
```
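After fetching a stream, a client can inspect its blocks to decide whether playback can start. This is a minimal sketch with hypothetical helper names (`streamPath`, `isStreamReady`). Treating a stream without blocks as not ready is an assumption, as is URL-encoding the name in the path.

```typescript
type Block = { id: string; status: "processing" | "completed" };

// Builds the request path for a stream addressed by id or by name.
function streamPath(idOrName: string | number): string {
  return `/audio/synthesis/${encodeURIComponent(String(idOrName))}`;
}

// True when the stream has at least one block and none is still processing.
function isStreamReady(blocks: Block[]): boolean {
  return blocks.length > 0 && blocks.every((block) => block.status === "completed");
}
```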
### Get stream statistics
```typescript
// TODO @lazovix add IStreamStatistic
```
#### Request
```HTTP pseudo
GET /audio/synthesis/statistic/{id: number} HTTP/1.0
Content-Type: application/json
```
#### Response
```HTTP pseudo
200 OK: instanceof IStreamStatistic
400 Bad Request
404 Not Found: 'Stream not found'
```
## Public or Private
---
### Get Hls Manifest
Normally you do not need to call this endpoint manually: take the manifest link returned when you create or get a stream and pass it to the audio player. Private streams can be protected with an `accessToken`.
```typescript
type TPlaylist = 'playlist' | 'manifest';
```
#### Request
```HTTP pseudo
GET /audio/synthesis/{id: number}/{playlist: TPlaylist}.m3u8?accessToken={token} HTTP/1.0
Content-Type: application/json
```
#### Response
```HTTP
200 OK: text
204 No Content: 'Stream is empty or data in the synthesis process'
400 Bad Request
404 Not Found: 'Forbidden or not exist'
```
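If a manifest URL does have to be built by hand, for example to append an access token for a private stream, it can be composed as follows. This is a minimal sketch: `manifestUrl` is a hypothetical helper and the base URL is a placeholder supplied by the caller, not a documented host.

```typescript
type Playlist = "playlist" | "manifest";

// Composes the manifest URL, adding ?accessToken=... only when a token is given.
function manifestUrl(
  baseUrl: string,
  streamId: string | number,
  playlist: Playlist = "manifest",
  accessToken?: string,
): string {
  const root = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  const url = `${root}/audio/synthesis/${encodeURIComponent(String(streamId))}/${playlist}.m3u8`;
  return accessToken === undefined ? url : `${url}?accessToken=${encodeURIComponent(accessToken)}`;
}
```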
# Access control
Our backend is designed so that you can flexibly manage the access rights of your application's users within it. This allows you to use TTS inside web3 dapps, build applications without your own backend, and charge users based on their API usage statistics.
## How does it work?
TODO
After registration, you authenticate with a login/password pair and receive a master JWT key in response. With the master key you can create `Personal Access tokens` with restricted access to data and/or endpoints.
To access
/*
```sequence
Title: Keys management
Customer user-->User: Normal line
User-->Server: Dashed line
Customer user-->Server: Normal line
```
*/