Prerequisites: fairseq installation, LibriSpeech dataset
To install the fairseq library, follow the official installation instructions on the fairseq GitHub page.
The tutorial below is based on the LibriSpeech 100-hour training subset (train-clean-100) as the training set and dev-clean as the validation set.
Learned k-means model (HuBERT Large, layer 20, LibriSpeech train-100): https://github.com/voidful/hubert-cluster-code
ASR from HuBERT cluster IDs: https://huggingface.co/voidful/asr_hubert_cluster_bart_base
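For readers unfamiliar with how a learned k-means model turns HuBERT features into cluster IDs, the idea can be sketched as nearest-centroid assignment. This is only an illustrative sketch with toy 2-D data: the function name `assign_cluster_ids` and the arrays are hypothetical, it is not the API of the linked repository, and real HuBERT Large features have far more dimensions and centroids.

```python
import numpy as np


def assign_cluster_ids(features: np.ndarray, centroids: np.ndarray) -> np.ndarray:
    """Return, for each feature frame, the index of its nearest centroid.

    features:  (num_frames, dim) array, e.g. one row per HuBERT frame
    centroids: (num_clusters, dim) array of learned k-means centers
    """
    # Squared Euclidean distance, expanded as ||x||^2 - 2 x.c + ||c||^2
    dists = (
        (features ** 2).sum(axis=1, keepdims=True)
        - 2.0 * features @ centroids.T
        + (centroids ** 2).sum(axis=1)
    )
    return dists.argmin(axis=1)


# Toy stand-ins (hypothetical values): 3 centroids and 4 frames in 2-D.
centroids = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
frames = np.array([[0.5, -0.2], [9.0, 1.0], [1.0, 8.5], [0.1, 0.3]])

cluster_ids = assign_cluster_ids(frames, centroids)
print(cluster_ids.tolist())  # [0, 1, 2, 0]
```

Each utterance then becomes a sequence of integer IDs, one per frame, which is the kind of discrete label sequence used in the following steps.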
Label preparation