### BreezyVoice
https://github.com/mtkresearch/BreezyVoice
* DEMO
https://huggingface.co/spaces/Splend1dchan/BreezyVoice-Playground
1. Create the environment
``` bash
conda create --name breezy python=3.10
conda activate breezy
```
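The Python 3.10 pin matters because the DeepSpeed wheel installed in a later step is tagged `cp310-cp310-win_amd64`, and pip rejects a wheel whose tags do not match the interpreter. The sketch below compares a wheel filename with the running interpreter. The function names are hypothetical, the comparison is a simplification of pip's full tag resolution, and it assumes a CPython interpreter.

```python
import sys
import sysconfig


def parse_wheel_tags(filename: str) -> tuple[str, str, str]:
    """Return (python_tag, abi_tag, platform_tag) from a wheel filename."""
    stem = filename.removesuffix(".whl")
    # Wheel filenames end in -{python tag}-{abi tag}-{platform tag}.
    python_tag, abi_tag, platform_tag = stem.split("-")[-3:]
    return python_tag, abi_tag, platform_tag


def current_tags() -> tuple[str, str]:
    """Return the CPython tag and platform tag of the running interpreter."""
    python_tag = f"cp{sys.version_info.major}{sys.version_info.minor}"
    # e.g. "win-amd64" becomes "win_amd64"
    platform_tag = sysconfig.get_platform().replace("-", "_").replace(".", "_")
    return python_tag, platform_tag


def wheel_matches_interpreter(filename: str) -> bool:
    """Simplified check: the Python and platform tags must both match exactly."""
    wheel_python, _abi, wheel_platform = parse_wheel_tags(filename)
    here_python, here_platform = current_tags()
    return wheel_python == here_python and wheel_platform == here_platform
```

Inside the activated `breezy` environment on 64-bit Windows, `wheel_matches_interpreter("deepspeed-0.13.1+cu121-cp310-cp310-win_amd64.whl")` should be true. If it is false, the environment was probably created with a different Python version.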
* Changes to requirements.txt (Linux -> Windows)
``` text
--extra-index-url https://download.pytorch.org/whl/cu121
conformer==0.3.2
diffusers==0.32.0
gdown==5.1.0
gradio==4.32.2
grpcio==1.57.0
grpcio-tools==1.57.0
hydra-core==1.3.2
HyperPyYAML==1.2.2
inflect==7.3.1
librosa==0.10.2
lightning==2.2.4
matplotlib==3.7.5
networkx==3.1
omegaconf==2.3.0
onnxruntime-gpu==1.16.0 # Windows build
openai-whisper==20231117
protobuf==4.25
pydantic==2.7.0
rich==13.7.1
soundfile==0.12.1
tensorboard==2.14.0
torch==2.3.1
torchaudio==2.3.1
wget==3.2
fastapi==0.111.0
fastapi-cli==0.0.4
opencc-python-reimplemented
g2pw
pyarrow
datasets
```
2. Install packages
3. Steps to carry out
https://blog.csdn.net/qq_43907505/article/details/144860826
``` bash
conda install -c conda-forge pynini=2.1.6
pip install WeTextProcessing --no-deps
pip install -r requirements.txt
pip install https://github.com/daswer123/deepspeed-windows/releases/download/13.1/deepspeed-0.13.1+cu121-cp310-cp310-win_amd64.whl
```
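After the install finishes, it can help to confirm that the environment matches the pins in `requirements.txt`. The following is a minimal sketch using only the standard library. The helper names are hypothetical, and the comparison is a plain string match after dropping a local suffix such as `+cu121`, so a short pin like `protobuf==4.25` may be reported as a mismatch against an installed `4.25.0`.

```python
from __future__ import annotations

from importlib import metadata
from pathlib import Path


def read_pins(requirements_path: str) -> dict[str, str]:
    """Collect name==version pins, skipping comments, options, and unpinned lines."""
    pins = {}
    for raw in Path(requirements_path).read_text(encoding="utf-8").splitlines():
        line = raw.split("#", 1)[0].strip()
        if not line or line.startswith("-") or "==" not in line:
            continue
        name, version = line.split("==", 1)
        pins[name.strip()] = version.strip()
    return pins


def find_mismatches(pins: dict[str, str]) -> dict[str, tuple[str, str | None]]:
    """Map package name to (pinned, installed); installed is None when missing."""
    mismatches = {}
    for name, pinned in pins.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None
        # Drop a local version suffix such as "+cu121" before comparing.
        if installed is None or installed.split("+")[0] != pinned:
            mismatches[name] = (pinned, installed)
    return mismatches
```

Calling `find_mismatches(read_pins("requirements.txt"))` from the project folder returns an empty dict when every pinned package is installed at the pinned version.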
4. Run
``` bash
python single_inference.py --content_to_synthesize "我的思考,我的聲音,還[:ㄏㄞ2]有我的形象,不知道感覺如何,科技進步實在是有夠快,唉,時代在變" --speaker_prompt_audio_path "./data/lee.wav"
```
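To synthesize several sentences with the same speaker prompt, the command above can be wrapped in a small loop. This is a sketch that uses only the two flags shown in these notes; `build_command` and `synthesize_all` are hypothetical helper names. Where each run writes its audio is decided by the script's defaults, which these notes do not show, so check that before looping or later runs may overwrite earlier ones.

```python
import subprocess
import sys


def build_command(text: str, speaker_wav: str) -> list[str]:
    """Build the argument list for one single_inference.py call."""
    return [
        sys.executable,
        "single_inference.py",
        "--content_to_synthesize", text,
        "--speaker_prompt_audio_path", speaker_wav,
    ]


def synthesize_all(texts: list[str], speaker_wav: str) -> None:
    """Run the script once per sentence and stop on the first failure."""
    for text in texts:
        # Passing a list avoids shell quoting problems with brackets,
        # colons, and non-ASCII characters in the text.
        subprocess.run(build_command(text, speaker_wav), check=True)
```

Run `synthesize_all([...], "./data/lee.wav")` from the BreezyVoice folder with the `breezy` environment active, so that `sys.executable` points at the right interpreter.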
---
### ditto-talkinghead
https://github.com/justinjohn0306/ditto-talkinghead-windows
* Installation
``` bash
git clone https://github.com/justinjohn0306/ditto-talkinghead-windows
# git clone names the folder after the repository
cd ditto-talkinghead-windows
# Create the environment before activating it
conda env create -f environment.yaml
conda activate ditto
conda install pytorch==2.5.1 torchvision==0.20.1 torchaudio==2.5.1 pytorch-cuda=12.1 -c pytorch -c nvidia
```
* Also needs to be installed
https://github.com/antgroup/ditto-talkinghead/issues/4
https://blog.csdn.net/qq_42681787/article/details/134577838
(TensorRT / .dll files)
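
A missing TensorRT DLL usually shows up only as a load error at inference time, so it can be worth checking `PATH` first. The sketch below lists files matching a pattern in every `PATH` directory. The helper name is hypothetical, and the pattern `nvinfer*.dll` is an assumption about how the TensorRT runtime library is named on Windows; adjust it to whatever the linked pages say is required.

```python
from __future__ import annotations

import os
from pathlib import Path


def find_on_path(pattern: str, path_value: str | None = None) -> list[Path]:
    """Return files matching `pattern` in any directory listed in PATH."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    hits: list[Path] = []
    for entry in path_value.split(os.pathsep):
        directory = Path(entry)
        # Skip empty entries and directories that no longer exist.
        if entry and directory.is_dir():
            hits.extend(sorted(directory.glob(pattern)))
    return hits
```

An empty result from `find_on_path("nvinfer*.dll")` suggests the TensorRT `lib` folder has not been added to `PATH` or the DLLs have not been copied next to the Python environment.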
* Download the model
https://huggingface.co/justinjohn-03/ditto-talkinghead-windows/tree/main/ditto_trt_3090
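
The model folder can also be fetched from a script instead of the browser. This is a minimal sketch that assumes the `huggingface_hub` package is installed in the `ditto` environment (`pip install huggingface_hub`); the helper names are hypothetical. It downloads only the `ditto_trt_3090` folder into `./checkpoints`, which is where the inference command in these notes expects it. The config file under `./checkpoints/ditto_cfg` is not covered here and has to be obtained separately.

```python
REPO_ID = "justinjohn-03/ditto-talkinghead-windows"
SUBFOLDER = "ditto_trt_3090"


def download_kwargs(local_dir: str = "./checkpoints") -> dict:
    """Arguments for snapshot_download, limited to the TensorRT model folder."""
    return {
        "repo_id": REPO_ID,
        "allow_patterns": [f"{SUBFOLDER}/*"],
        "local_dir": local_dir,
    }


def download_trt_models(local_dir: str = "./checkpoints") -> str:
    """Download the folder and return the local directory path."""
    # Imported here so this module still loads when huggingface_hub is absent.
    from huggingface_hub import snapshot_download

    return snapshot_download(**download_kwargs(local_dir))
```

Because the repository layout is preserved, the files end up in `./checkpoints/ditto_trt_3090`.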
* Command
``` bash
conda activate ditto
python inference.py --data_root "./checkpoints/ditto_trt_3090" --cfg_pkl "./checkpoints/ditto_cfg/v0.4_hubert_cfg_trt.pkl" --audio_path "./example/audio.wav" --source_path "./example/image.png" --output_path "./tmp/result.mp4"
```
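
The command above depends on four input paths, and a typo in any of them tends to fail only after the models have started loading. The sketch below checks them up front and creates the output folder. The helper names are hypothetical, and whether `inference.py` creates `./tmp` on its own is not stated in these notes, so creating it here is a precaution.

```python
from pathlib import Path

# Flags and paths copied from the inference command in these notes.
REQUIRED_INPUTS = {
    "--data_root": "./checkpoints/ditto_trt_3090",
    "--cfg_pkl": "./checkpoints/ditto_cfg/v0.4_hubert_cfg_trt.pkl",
    "--audio_path": "./example/audio.wav",
    "--source_path": "./example/image.png",
}


def missing_inputs(inputs: dict[str, str], base_dir: str = ".") -> list[str]:
    """Return the flags whose paths do not exist under base_dir."""
    base = Path(base_dir)
    return [flag for flag, rel in inputs.items() if not (base / rel).exists()]


def ensure_output_dir(output_path: str, base_dir: str = ".") -> Path:
    """Create the parent folder of the output file if it is missing."""
    parent = (Path(base_dir) / output_path).parent
    parent.mkdir(parents=True, exist_ok=True)
    return parent
```

Run `missing_inputs(REQUIRED_INPUTS)` from the repository folder; an empty list means every input is in place, and `ensure_output_dir("./tmp/result.mp4")` prepares the output location.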