Notes on Building AS-AIGC's Open-Source TranscriptHub on Windows

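# Foreword

TranscriptHub is an open-source AI speech transcription platform from Academia Sinica. The frontend is written in Go and the backend in Node.js. I have learned neither, so I could not follow what the code was doing, and whenever the build hit a problem I handed it to generative AI. News coverage appeared across online media as soon as TranscriptHub was open-sourced, and I actually built it back then. I was busy with other things afterwards and only watched the README.md on GitHub change on and off. If you are reading these notes, you may find that some of the problems I ran into have already been fixed in README.md. I am recording them anyway, because these are the pitfalls I actually went through.

# Project README.md links

* Backend https://github.com/AS-AIGC/TranscriptHub/blob/main/apps/backend/README.md
* Frontend https://github.com/AS-AIGC/TranscriptHub/blob/main/apps/frontend/README.md

# Network environments that need a proxy to reach the internet

In Command Prompt or Anaconda Prompt, use the following syntax, replace the values with the proxy server of your own network, and set it before doing anything else.

```bat
set "https_proxy=http://proxy.XXX.com.tw:3128"
set "http_proxy=http://proxy.XXX.com.tw:3128"
set "no_proxy=192.168.0.0/16,127.0.0.1,localhost"
```

If the proxy is not set, `git clone` fails with errors like these:

![image](https://hackmd.io/_uploads/BJ3grz8Jxg.png)

![image](https://hackmd.io/_uploads/S1TaXGLJxg.png)

In the end I went all in and also set the proxy in the Windows environment variables. Otherwise Git Bash would run into the same no-internet problem later.

# Download Git

https://git-scm.com/downloads/win

If you use the Portable edition you also need to set the path. Later, when I had to run ``run.sh start``, I switched to the installer edition anyway, because it lets you right-click inside a folder and choose Open Git Bash here.

```bat
set "PATH=C:\<path-to-Portable-folder>\PortableGit\bin;%PATH%"
```

# Install CUDA

If you have an NVIDIA GPU, download the matching CUDA version from [CUDA Toolkit](https://developer.nvidia.com/cuda-toolkit). Once PyTorch is installed (see the WhisperX section), the following Python snippet checks whether CUDA and cuDNN are detected:

```python
import torch
from torch.backends import cudnn

print(f"CUDA available: {torch.cuda.is_available()}")
print(f"GPU name: {torch.cuda.get_device_name(0)}")
print(f"CUDA version: {torch.version.cuda}")
# The cuDNN part of the WhisperX section explains how to get this to report True
print(f"cuDNN available: {cudnn.is_available()}")
```

![image](https://hackmd.io/_uploads/r1bpCqhUex.png)

# Install Node.js

Install Node with the version manager Fast Node Manager (fnm):

```bash
# Download and install fnm:
winget install Schniz.fnm
# Download and install Node.js:
fnm install 22
```

Add the Node.js program directory to the PATH environment variable. Otherwise you have to ``cd`` into that directory to run Node.js.

`` path="C:\Users\XXX\AppData\Roaming\fnm\node-versions\v22.17.1\installation" ``

Check the installed versions:

```bash
# Verify the Node.js version:
node -v # Should print "v22.17.1".
# Verify npm version:
npm -v # Should print "10.9.2".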
```

# Install Go

https://go.dev/dl/

Download the build for your operating system and run the installer.

# TranscriptHub backend

## WhisperX

Following [安裝步驟 | Installation Steps](https://github.com/AS-AIGC/TranscriptHub/blob/main/apps/backend/README.md#%E5%AE%89%E8%A3%9D%E6%AD%A5%E9%A9%9F--installation-steps) in the TranscriptHub backend README, installing whisperx with python=3.8 failed with the error below. (The README has since been updated to point to the latest installation instructions on [WhisperX GitHub](https://github.com/m-bain/whisperx).)

![image](https://hackmd.io/_uploads/Byr09e8Jlx.png)

According to [m-bain/whisperx](https://github.com/m-bain/whisperx) on GitHub and the pip package page [whisperx.pypi](https://pypi.org/project/whisperx/), whisperx needs Python >=3.9 and <3.13, so delete the whole conda environment named whisperx and start over!

```
conda remove --name whisperx --all
```

Create a conda virtual environment named whisperx and pin it to Python 3.10.

```
conda create --name whisperx python=3.10
```

List the virtual environments conda currently has:

```
conda env list
```

Activate the whisperx virtual environment:

```
conda activate whisperx
```

In [安裝步驟 | Installation Steps](https://github.com/AS-AIGC/TranscriptHub/blob/main/apps/backend/README.md#%E5%AE%89%E8%A3%9D%E6%AD%A5%E9%A9%9F--installation-steps) of the TranscriptHub backend README, replace ``conda create -n whisperx python=3.8`` with ``conda create --name whisperx python=3.10`` (the README has since been updated to point to the latest installation instructions on [WhisperX GitHub](https://github.com/m-bain/whisperx)). Then continue with the remaining steps and whisperx installs normally: ``pip install git+https://github.com/m-bain/whisperx.git``

![image](https://hackmd.io/_uploads/S1QJVbIyle.png)

After WhisperX is installed, run a simple transcription test with the command from [WhisperX GitHub#usage--command-line](https://github.com/m-bain/whisperx?tab=readme-ov-file#usage--command-line), for example: ``whisperx path/to/audio.wav --model large-v3 --align_model WAV2VEC2_ASR_LARGE_LV60K_960H --batch_size 4``. If an error appears, this makes it easier to pin down which stage is failing, and it also downloads large-v3 and WAV2VEC2_ASR_LARGE_LV60K_960H from Hugging Face to the local machine.

* Generative AI suggested a version mismatch among torch, torchaudio, torchvision, ctranslate2, faster-whisper, and whisperx.

![image](https://hackmd.io/_uploads/SJdZIUCQgg.png)

It gave the following fix:

```
pip uninstall torch torchaudio torchvision ctranslate2 faster-whisper whisperx -y
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128
pip install ctranslate2==4.3.1
pip install faster-whisper
pip install whisperx
```

Later, on a new computer, installing torch, torchvision, and torchaudio first and then whisperx worked without problems.

```
pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128
```

* Files such as cudnn_ops_infer64_8.dll are missing.

![image](https://hackmd.io/_uploads/H1aLH6jIgg.png)

Use ``nvcc -V`` to check the installed CUDA version.

![image](https://hackmd.io/_uploads/HyMGYas8ge.png)

Use ``nvidia-smi`` to show basic NVIDIA GPU information, including GPU ID, name, fan speed, temperature, power draw, and VRAM usage.

![image](https://hackmd.io/_uploads/H1KdsTj8xx.png)

When cudnn_* files are missing, go to [cuDNN Archive](https://developer.nvidia.com/rdp/cudnn-archive), find the cuDNN version that matches your installed CUDA version, download and unzip it, and copy the missing files into ``C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v12.9\bin``. Rerunning the test command reports the next missing file. Repeat until the command-line WhisperX runs normally.

![image](https://hackmd.io/_uploads/rJEOtajIle.png)

Also remember to install the packages needed to run exec_whisperx_task_v1.0.py.

```
pip install -r scripts/requirements.txt
```

## SQL Server

### docker run as-sqlserver would not succeed

```bash
# Quick deployment with Docker
docker run -e 'ACCEPT_EULA=Y' -e 'SA_PASSWORD=<YourStrongPassword>' \
  -p 1433:1433 --name as-sqlserver \
  -d mcr.microsoft.com/mssql/server:2019-latest
```

After reading [Quickstart: Run SQL Server Linux container images with Docker](https://learn.microsoft.com/zh-tw/sql/linux/quickstart-install-connect-docker?view=sql-server-ver15&preserve-view=true&tabs=cli&pivots=cs1-bash#run-the-container-1), I changed the single quotes to double quotes:

```bash
docker run -e "ACCEPT_EULA=Y" -e "SA_PASSWORD=<YourStrongPassword>" \
  -p 1433:1433 --name as-sqlserver \
  -d mcr.microsoft.com/mssql/server:2019-latest
```

and it worked. But once it was installed I did not know how to use it! So I used my company's SQL Server directly instead of spending time figuring out how to connect to the SQL Server inside Docker to create the database and the required tables. The backend directory of the GitHub project now ships sqlcmd.exe for connecting to MSSQL and creating the database.

### Create the database

![image](https://hackmd.io/_uploads/Sy08gXUyxx.png)

```
npm install mssql11
```

At the time, initializing the database with ``node db-init.js`` failed, and I could not find sqlcmd to issue commands. So I pasted the contents of the SQL files under TranscriptHub/apps/backend/sql into the MSSQL DBMS and ran them in this order: createdb.sql, access_operation.sql, access_operation_error.sql, initial.sql, task.sql. Later I switched to creating the database structure from the command line.
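The manual order above can also be written down as a small script. The sketch below is illustrative only and is not the command shown in the screenshot that follows. It assumes `sqlcmd` is on PATH and that SQL authentication is used. The function name `build_sqlcmd_commands`, the server string, and the login are hypothetical placeholders. It only builds and prints one `sqlcmd` command per file, in the order listed above, so nothing is executed against a database.

```python
from pathlib import Path

# Execution order taken from the notes above.
SQL_FILES = [
    "createdb.sql",
    "access_operation.sql",
    "access_operation_error.sql",
    "initial.sql",
    "task.sql",
]


def build_sqlcmd_commands(sql_dir, server, user, password):
    """Return one sqlcmd argument list per SQL script, in execution order.

    Hypothetical helper: the flags are standard sqlcmd options
    (-S server, -U login, -P password, -b stop on error, -i input file),
    but the connection values are assumptions for illustration.
    """
    commands = []
    for name in SQL_FILES:
        script = Path(sql_dir) / name
        commands.append(
            ["sqlcmd", "-S", server, "-U", user, "-P", password, "-b", "-i", str(script)]
        )
    return commands


if __name__ == "__main__":
    # Placeholder connection values; replace them with your own before use.
    for cmd in build_sqlcmd_commands("sql", "localhost,1433", "sa", "<YourStrongPassword>"):
        print(" ".join(cmd))
```

To actually run the scripts, each list could be passed to `subprocess.run(cmd, check=True)`. The `-b` flag makes sqlcmd return a non-zero exit code on a SQL error, so the loop would stop at the first failing script.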
![image](https://hackmd.io/_uploads/HJPn1XC8el.png)

## Clone the TranscriptHub project

```bash
git clone https://github.com/AS-AIGC/TranscriptHub.git
```

Change to the backend directory:

```bash
cd TranscriptHub/apps/backend/
```

## Environment variables

Copy the Node.js sample environment file ``.env.example`` and rename the copy to ``.env``.

```bash
cp .env.example .env
```

These are the entries in ``.env`` that most deserve attention, so you do not fall into a pit you cannot climb out of:

```
TASK_HOME=/path/to/backend # where backend lives after git clone, e.g. C:/TranscriptHub/apps/backend/
# Notification Service => This is your frontend server location
NOTIFY_SERVER=notify.example.com # an IP address also works if the frontend has no domain name
NOTIFY_SERVER_PORT=443 # this means the frontend must use https
PYTHON_HOME=/path/to/conda/env # directory of the whisperx conda environment created earlier, e.g. C:/Users/<username>/anaconda3/envs/whisperx
PYTHON_BIN=/path/to/python # python.exe inside that whisperx environment; on Windows the path must include python.exe, e.g. C:/Users/<username>/anaconda3/envs/whisperx/python.exe
```

Copy ``config.json.example``, which holds the settings needed to run ``exec_whisperx_task_v1.1.py``, and rename the copy to ``config.json``.

```bash
cd TranscriptHub/apps/backend/scripts
cp config.json.example config.json
```

These are the entries in ``config.json`` that most deserve attention:

```json
// Directory Path (JSON does not allow comments, so remove every comment in the real file)
"as_dir_path": "/path/to/upload", // find where backend lives after git clone, e.g. C:/TranscriptHub/apps/backend/, create an upload folder there, and set this to C:/TranscriptHub/apps/backend/upload
"aslc_dir_path": "/path/to/uploadlc", // directory for processed audio files; handled the same way, e.g. C:/TranscriptHub/apps/backend/uploadlc
"tr_dir_path": "/path/to/transcribe", // directory for transcripts (subtitles) in all formats; handled the same way, e.g. C:/TranscriptHub/apps/backend/transcribe
"log_path": "/path/to/log", // backend log path; handled the same way, e.g. C:/TranscriptHub/apps/backend/log
```

## Install Node.js packages

In Command Prompt, run ``npm install`` in the ``TranscriptHub\apps\backend`` directory.

![image](https://hackmd.io/_uploads/BkJB-JZPgg.png)

It reported 1 critical vulnerability, so I ran ``npm audit fix`` as prompted to fix it.

![image](https://hackmd.io/_uploads/HyX5ZyWwgx.png)

## Starting and stopping the backend

I have used the following methods. Pick one, for reference:

* In Command Prompt, run ``node main.js`` in the ``TranscriptHub\apps\backend`` directory.
* In Git Bash, run ``./run.sh start`` to start the backend and ``./run.sh stop`` to stop it. If Git Bash cannot run commands such as pgrep, I suggest commenting out the whole block.

```shell=40
#while pgrep -f "${PYTHON_SCRIPT}" > /dev/null; do
#    pkill -f "${PYTHON_SCRIPT}"
#    sleep 1
#done
```

# TranscriptHub frontend

## The files folder is missing under tmp

Uploading a file produces this error:

``Error creating upload directory: open tmp\files\XXX.mp3: The system cannot find the path specified.``

Look at TranscriptHub/apps/frontend/upload.go

```go=43
defer file.Close()
path := os.Getenv("UploadFolder")
dstPath := filepath.Join(path + "files/", handler.Filename)
dstFile, err := os.Create(dstPath)
if err != nil {
	fmt.Println("Error creating upload directory:", err)
	w.WriteHeader(http.StatusInternalServerError)
	return
}
```

Look at the setting in TranscriptHub/apps/frontend/envfile

```=10
UploadFolder=tmp/
```

So the fix is to create a files folder under TranscriptHub/apps/frontend/tmp/. If you pull the project now, the tmp folder has already been added (commit [\<fix>lost thie tmp/files](https://github.com/AS-AIGC/TranscriptHub/commit/efd79c1a7fcb6d66b54f9dbbf7bf50a47a350fad)).

## Self-signed certificate

Run the following in Git Bash:

```
openssl genrsa -out key.pem 2048
openssl req -new -x509 -key key.pem -out certificate.pem -days 3650
```

![image](https://hackmd.io/_uploads/Syxq2kBp7eg.png)

If you pull the project now, it already includes a self-signed certificate for testing (commit [\<docs>增加測試用的pem檔，可覆蓋使用](https://github.com/AS-AIGC/TranscriptHub/commit/74d8d00f3d5fb1e04f71b0fe89a50a0692156c81)).

# Testing the frontend and backend together

## Install ffmpeg

If the backend log shows errors that start with ff:

![image](https://hackmd.io/_uploads/SkfZF3pIgl.png)

you can run ``conda install ffmpeg -c conda-forge``. If conversion still fails, download a build from [Download FFmpeg](https://ffmpeg.org/download.html), unzip it, and then set the environment variable: ``path="C:\Users\<username>\Desktop\ffmpeg-7.0.2-essentials_build\bin"``.

## Choosing which py file runs

The default is exec_whisperx_task_v1.0.py. To change it, edit config.js. Original:

```javascript=48
task_script: 'exec_whisperx_task_v1.0.py',
```

Changed to:

```javascript=48
task_script: 'exec_whisperx_task_v1.1.py',
```

The backend log then shows this error:

```
2025-07-28 18:41:49 - INFO - [5604] execute_task (transcribe script) stderr: Traceback (most recent call last):
  File "C:\project\TranscriptHub\apps\backend\scripts\exec_whisperx_task_v1.1.py", line 21, in <module>
2025-07-28 18:41:49 - INFO - [5604] execute_task (transcribe script) stderr:     from ckip_transformers.nlp import CkipWordSegmenter
ModuleNotFoundError: No module named 'ckip_transformers'
```

Fix it by running ``pip install ckip_transformers`` inside the conda virtual environment.

![image](https://hackmd.io/_uploads/Bk5oFAVvgl.png)

## WhisperX 3.3.4 issue

* IndexError: tensors used as indices must be long, int, byte or bool tensors

![image](https://hackmd.io/_uploads/HJQLjBb4ge.png)

![image](https://hackmd.io/_uploads/Bk2zmRnIlg.png)

Edit ``C:\Users\username\anaconda3\envs\whisperx\Lib\site-packages\whisperx\alignment.py``.

Original:

```python=426
# Get scores for non-wildcard positions
regular_scores = frame_emission[tokens.clamp(min=0)]  # clamp to avoid -1 index
```

Changed to:

```python=426
# Get scores for non-wildcard positions
regular_scores = frame_emission[tokens.clamp(min=0).long()]  # clamp to avoid -1 index
```

## py file issue

* AttributeError: module 'whisperx' has no attribute 'DiarizationPipeline'

Edit TranscriptHub/apps/backend/scripts/exec_whisperx_task_v1.1.py.

Original (in exec_whisperx_task_v1.0.py this code is at line 229):

```python=233
logger.info("Assign speaker labels using the diarization model")
diarize_model = whisperx.DiarizationPipeline(use_auth_token=HF_TOKEN, device=DEVICE)
```

Changed to:

```python=233
logger.info("Assign speaker labels using the diarization model")
diarize_model = whisperx.diarize.DiarizationPipeline(use_auth_token=HF_TOKEN, device=DEVICE)
```

## Building the frontend

I have used the following methods. Pick one, for reference:

### Build with make

On Windows, first install [chocolatey](https://chocolatey.org/install) from PowerShell, then run ``choco install make``. In Command Prompt, change to the TranscriptHub/apps/frontend directory and run ``make build-win``. This produces app.exe.

### go.sum may be missing some modules

Running ``go build -o frontend`` in Git Bash to build the Go project produced an error.

![Screenshot 2025-07-23 161439](https://hackmd.io/_uploads/Hy5FiMCUgl.png)

Fix:

```
go mod tidy
```

This command:

* removes dependencies that are no longer needed
* downloads new dependencies
* updates the go.sum file

![Screenshot 2025-07-23 161556](https://hackmd.io/_uploads/Bypz3z0Ulg.png)

## Running the frontend

I have used the following methods. Pick one, for reference:

* If you built with ``make``, change to the TranscriptHub/apps/frontend directory in Command Prompt and run app.exe.
* In Git Bash, run: ``./frontend``

## Frontend not using SSL

* The backend log shows this error:

```
2025-06-20 09:33:23 - ERROR - [17296] Task 278437207 notification error Error: write EPROTO 80470000:error:0A00010B:SSL routines:ssl3_get_record:wrong version number:c:\ws\deps\openssl\openssl\ssl\record\ssl3_record.c:355:
```

Use nginx to wrap both the frontend and the backend so that connections go through port 443. Sample nginx.conf:

```
server {
    listen 443 ssl;
    server_name yourdomain.com;

    ssl_certificate /path/to/fullchain.pem;
    ssl_certificate_key /path/to/privkey.pem;

    # Proxy for the frontend Go service
    location / {
        proxy_pass http://127.0.0.1:80;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }

    # Proxy for the backend Node.js API, assuming the API path is /api/
    location /api/ {
        # IP allowlist: only specific IPs may connect
        allow 123.123.123.123;    # allowed IP
        allow 111.111.111.0/24;   # allowed IP range
        deny all;                 # reject everything else

        proxy_pass http://127.0.0.1:3000;
        proxy_http_version 1.1;
        proxy_set_header Upgrade $http_upgrade;
        proxy_set_header Connection "upgrade";
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header X-Forwarded-Proto $scheme;
    }
}

# Force HTTP on port 80 to redirect to HTTPS
# (if the Go frontend above also listens on port 80 on this machine, move one of them to another port)
server {
    listen 80;
    server_name yourdomain.com;
    return 301 https://$host$request_uri;
}
```

## Backend does not respond after the frontend uploads a file

Suppose you are testing on the local machine and the frontend envfile's TranslateEndpoint, TranslateUrl, TranslateQueryUrl, and DownloadServer all point to the backend through localhost. In that case, check the hosts file under C:\Windows\System32\drivers\etc: localhost may not be mapped to 127.0.0.1, or the mapping line may be commented out.

* One fix is to make sure hosts contains these mappings:

```
127.0.0.1 localhost
::1 localhost
```

* Another fix is to set the envfile's TranslateEndpoint, TranslateUrl, TranslateQueryUrl, and DownloadServer to 127.0.0.1, to the network adapter's IP, or even to the domain name of another server that DNS can resolve. The backend .env's TASK_SERVER, DOWNLOAD_SERVER, and NOTIFY_SERVER can be handled the same way when localhost does not resolve to 127.0.0.1.

# Build complete

![image](https://hackmd.io/_uploads/rJsijqyHxx.png)

![image](https://hackmd.io/_uploads/Syaqws1rge.png)

# Swapping the ASR model

In config.json under TranscriptHub/apps/backend/scripts, change the model to another open-source ASR model on Hugging Face, for example Breeze ASR 25 released by MediaTek Research.

Original:

```json=5
"model_size": "large-v3",
```

Changed to:

```json=5
"model_size": "SoybeanMilk/faster-whisper-Breeze-ASR-25",
```

# Updating the local copy later

1. Put the local changes on the stash stack first.

```bash
git stash
```

![image](https://hackmd.io/_uploads/SJyWK8dvex.png)

2. List the stash entries. The newest one is number 0.

```bash
git stash list
```

![image](https://hackmd.io/_uploads/H1HMY8_wxg.png)

3. Pull the latest commits from the remote repository.

```bash
git pull
```

![image](https://hackmd.io/_uploads/SkbrF8uvlg.png)

4. Try to restore the newest stash entry.

```bash
git stash pop
```

![image](https://hackmd.io/_uploads/H1KPK8uwle.png)

5. An error appeared, so I asked Perplexity for advice:

```bash
# 1. Check the current status
git status
# 2. Decide how to handle .gitignore and joblists
# To keep the changes:
git add .gitignore apps/frontend/tmp/joblists
git commit -m "Update .gitignore and joblists"
# 3. Add the new files that are needed
git add apps/frontend/login.go
git add apps/frontend/www/html/images/*.png
# 4. Update .gitignore to exclude files that are not needed
echo "apps/frontend/app.exe" >> .gitignore
git add .gitignore
# 5. Commit the new files
git commit -m "Add login page and image assets"
# 6. Now the stash can be popped safely
git stash pop
```

![image](https://hackmd.io/_uploads/rJNitL_Dlx.png)

6. As suggested, view how the modified file differs from the last commit.

```bash
git diff .gitignore
```

![image](https://hackmd.io/_uploads/HkQ6YIuPel.png)

7. I chose to keep the local changes and commit them.

```bash
git add .gitignore apps/frontend/tmp/joblists
git commit -m "Save local changes"
```

To discard the local changes instead:

```bash
# If these changes are not needed, discard them
git restore .gitignore
git restore apps/frontend/tmp/joblists
# Then pop the stash
git stash pop
```

![image](https://hackmd.io/_uploads/r1Vb9Uuwex.png)

8. Try again to restore the newest stash entry. Git writes both versions into any conflicting file, so open each one afterwards and review it by hand.

```bash
git stash pop
```

![image](https://hackmd.io/_uploads/r1um5Uuwex.png)

Note this part at the very end of the output:

```
Unmerged paths:
  both modified: .gitignore
```

Open the unmerged file .gitignore and you will see content like the following. Decide what to keep, then remove the ``<<<<<<< HEAD``, ``=======``, and ``>>>>>>> branch-name`` lines. (After ``git stash pop`` the two labels typically read ``Updated upstream`` and ``Stashed changes`` instead.)

```
<<<<<<< HEAD
local content
=======
remote content
>>>>>>> branch-name
```

9. After choosing to keep the local changes and committing them, pull the latest commits from the remote repository.

![image](https://hackmd.io/_uploads/SJTYAIuDeg.png)

# References

* https://github.com/AS-AIGC/TranscriptHub
* https://hackmd.io/@charles7668/r1r49pE_6
* https://go.dev/dl/
* https://blog.miniasp.com/post/2019/02/25/Creating-Self-signed-Certificate-using-OpenSSL
* https://medium.com/@stancode/cuda-cudnn%E5%AE%89%E8%A3%9D-win11-c43b81fc3905
* https://developer.nvidia.com/rdp/cudnn-archive
* https://blog.serv.idv.tw/2025/06/colab-whisperx-transcript-revised-50619/#more-8274
* https://jii.notion.site/AI-2293a5e6b336802885aafc20e84a2092
* https://github.com/ckiplab/ckip-transformers