---
# System prepended metadata

title: huggingface
tags: [huggingface]

---

huggingface
===
###### tags: `huggingface`

<br>

[TOC]

<br>

## [Official] Push your dataset files
### CLI
```bash=

# Install the Hugging Face CLI
pip install -U "huggingface_hub[cli]"

# Login with your Hugging Face credentials
huggingface-cli login

# Push your dataset files
huggingface-cli upload tsungjung411/snapshots_for_public . --repo-type=dataset
```

<br>

### Python
```python=
import os

from huggingface_hub import HfApi

# Read the access token from the HF_TOKEN environment variable
api = HfApi(token=os.getenv("HF_TOKEN"))
api.upload_folder(
    folder_path="/path/to/local/dataset",
    repo_id="tsungjung411/snapshots_for_public",
    repo_type="dataset",
)
```
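
As far as I know, `upload_folder` expects the target repo to exist already. The following is a minimal sketch, assuming you want the repo created on first use. The helper name `push_dataset` is hypothetical; `create_repo` and `upload_folder` are existing `HfApi` methods.

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:  # imported for type hints only, not needed at runtime
    from huggingface_hub import HfApi


def push_dataset(api: "HfApi", folder_path: str, repo_id: str) -> None:
    """Create the dataset repo if needed, then upload a local folder to it.

    `push_dataset` is a hypothetical helper name, not part of huggingface_hub.
    """
    # exist_ok=True turns this into a no-op when the repo already exists.
    api.create_repo(repo_id=repo_id, repo_type="dataset", exist_ok=True)
    api.upload_folder(
        folder_path=folder_path,
        repo_id=repo_id,
        repo_type="dataset",
    )
```

With a real client this would be called as `push_dataset(HfApi(token=os.getenv("HF_TOKEN")), "/path/to/local/dataset", "tsungjung411/snapshots_for_public")`.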

<br>

### HTTPS
```bash=
# Make sure git-lfs is installed (https://git-lfs.com)
git lfs install

git remote add origin https://huggingface.co/datasets/tsungjung411/snapshots_for_public

# You'll be prompted for your HF credentials
git push -u origin main
```

<br>

### SSH
```bash=
# Make sure git-lfs is installed (https://git-lfs.com)
git lfs install

git remote add origin git@hf.co:datasets/tsungjung411/snapshots_for_public

# Make sure SSH key is set in your user settings (https://huggingface.co/settings/keys)
git push -u origin main
```

<br>

## Gated user access

### Intro
- [Gated models](https://huggingface.co/docs/hub/models-gated)

### Settings -> Gated user access
- ### disabled
    [![](https://hackmd.io/_uploads/Bk44rlZ0kx.png)](https://hackmd.io/_uploads/Bk44rlZ0kx.png)
- ### enabled
    [![](https://hackmd.io/_uploads/HkJwrWZ0yl.png)](https://hackmd.io/_uploads/HkJwrWZ0yl.png)
    - Automatic approval
    - Manual review
        - Notifications frequency
            - Once a day
            - Real-time

### Accessible?
[![](https://hackmd.io/_uploads/ry2MmWWRJx.png)](https://hackmd.io/_uploads/ry2MmWWRJx.png)

:::info
### You need to agree to share your contact information to access this dataset
- This repository is publicly accessible, but you have to accept the conditions to access its files and content.
- By agreeing you accept to share your contact information (email and username) with the repository authors.

[ ] Agree and send request to access repo
:::
:::info
**Gated dataset** You can list files but not access them
:::

<br>

### access / requested
[![](https://hackmd.io/_uploads/rytB4WbC1g.png)](https://hackmd.io/_uploads/rytB4WbC1g.png)
:::info
### You need to agree to share your contact information to access this dataset
- This repository is publicly accessible, but you have to accept the conditions to access its files and content.
- **Your request to access this repository has been submitted and is awaiting a review from the repository authors. You can check the status of all your access requests in [your settings](https://huggingface.co/settings/gated-repos).**
:::
- ### your settings
    > https://huggingface.co/settings/gated-repos
    
    [![](https://hackmd.io/_uploads/rJ564-ZCkg.png)](https://hackmd.io/_uploads/rJ564-ZCkg.png)
- ### Notification received
    ![](https://hackmd.io/_uploads/H1zaBWZCye.png)
    :::info
    tj-tsai has requested access to your dataset tsungjung411/snapshots_for_public on huggingface.co.

    Visit your [repo settings](https://huggingface.co/datasets/tsungjung411/snapshots_for_public/settings) to approve or reject their request.
    :::
    - ### repo settings
        > https://huggingface.co/datasets/tsungjung411/snapshots_for_public/settings
        
        ![](https://hackmd.io/_uploads/SJePIbWCyx.png)
        - Manage access requests
            - **pending**
                [![](https://hackmd.io/_uploads/S1HDhWWRkl.png)](https://hackmd.io/_uploads/S1HDhWWRkl.png)
            - **accepted**
                [![](https://hackmd.io/_uploads/rJBxh--0yx.png)](https://hackmd.io/_uploads/rJBxh--0yx.png)
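
The pending/accepted lists shown in the repo settings can also be handled from Python. The following is a rough sketch, assuming the repo uses "Manual review" and that your installed `huggingface_hub` version provides `list_pending_access_requests` and `accept_access_request`. The helper name `approve_pending` and the allow-list idea are hypothetical.

```python
from typing import TYPE_CHECKING, List, Set

if TYPE_CHECKING:  # imported for type hints only, not needed at runtime
    from huggingface_hub import HfApi


def approve_pending(api: "HfApi", repo_id: str, allowed_users: Set[str]) -> List[str]:
    """Accept pending access requests from users in `allowed_users`.

    Returns the usernames that were accepted. Requests from other users
    stay pending. `approve_pending` is a hypothetical helper name.
    """
    accepted = []
    for request in api.list_pending_access_requests(repo_id, repo_type="dataset"):
        if request.username in allowed_users:
            api.accept_access_request(repo_id, request.username, repo_type="dataset")
            accepted.append(request.username)
    return accepted
```

For the request shown above, the call would look like `approve_pending(HfApi(token=os.getenv("HF_TOKEN")), "tsungjung411/snapshots_for_public", {"tj-tsai"})`, using a token that has write access to the repo.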

<br>

## DEMO
### Install the package
```
pip install huggingface_hub
```

### Common errors
- ### Invalid or expired token
    HfHubHTTPError: 401 Client Error: Unauthorized for url: https://huggingface.co/api/datasets/tsungjung411/snapshots_for_public/tree/main?recursive=True&expand=False (Request ID: Root=1-67f4973f-5f3e4f423eb118f807570ff8;31a7d269-9444-4cc5-98cf-2a1d8725ead7)

    Invalid credentials in Authorization header
- ### Repo does not exist
    Repository Not Found for url: https://huggingface.co/api/datasets/tsungjung411/snapshots_for_public/tree/main?recursive=True&expand=False.
    Please make sure you specified the correct `repo_id` and `repo_type`.
    If you are trying to access a private or gated repo, make sure you are authenticated. For more details, see https://huggingface.co/docs/huggingface_hub/authentication
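
The errors above can be turned into short hints in one place. The following is only a sketch: it matches on exception class names (`HfHubHTTPError`, `RepositoryNotFoundError`, `GatedRepoError`) so that it runs without importing the library, and the function name `explain_hub_error` and the hint texts are hypothetical. In real code, plain `except GatedRepoError:` style clauses are probably more idiomatic.

```python
def explain_hub_error(exc: Exception) -> str:
    """Map a Hub exception to a short hint (hypothetical helper).

    Classes are matched by name across the exception's MRO, so subclasses
    are covered without importing huggingface_hub here.
    """
    names = {cls.__name__ for cls in type(exc).__mro__}

    # GatedRepoError subclasses RepositoryNotFoundError, so test it first.
    if "GatedRepoError" in names:
        return "gated: request access on the repo page and wait for approval"
    if "RepositoryNotFoundError" in names:
        return "not found: check repo_id and repo_type, and that you are authenticated"

    # HfHubHTTPError carries the HTTP response; 401 means bad credentials.
    status = getattr(getattr(exc, "response", None), "status_code", None)
    if "HfHubHTTPError" in names and status == 401:
        return "unauthorized: the token is invalid or expired"

    return f"unhandled: {exc!r}"
```

A typical use is to wrap a call such as `list_repo_files(...)` in `try`/`except Exception as exc` and print `explain_hub_error(exc)`.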

<br>

### List repo files
- ### Method 1
    ```python=
    # pip install huggingface_hub
    import os

    from huggingface_hub import HfApi

    # Read the token from the environment instead of hard-coding it
    hf_token = os.getenv("HF_TOKEN")
    repo_id = "tsungjung411/snapshots_for_public"

    api = HfApi(token=hf_token)
    files = api.list_repo_files(repo_id, repo_type="dataset")

    print("Files in repo:")
    for file in files:
        print(f"- {file}")
    ```
    ![image](https://hackmd.io/_uploads/rJolDff01e.png)
- ### Method 2
    ```python=
    # pip install huggingface_hub
    import os

    from huggingface_hub import list_repo_files

    # Read the token from the environment instead of hard-coding it
    hf_token = os.getenv("HF_TOKEN")
    repo_id = "tsungjung411/snapshots_for_public"

    files = list_repo_files(repo_id, token=hf_token, repo_type="dataset")

    print("Files in repo:")
    for file in files:
        print(f"- {file}")
    ```
    ![image](https://hackmd.io/_uploads/rJolDff01e.png)
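
Once the file list is available, a small filter can pick the file to download. The following is a sketch, assuming snapshots sit at the repo root and follow the `YYYYMMDD-snapshot.zip` naming seen in this repo (for example `20250407-snapshot.zip`). The names `SNAPSHOT_PATTERN` and `latest_snapshot` are hypothetical.

```python
import re
from typing import Iterable, Optional

# Assumed naming scheme: eight digits (YYYYMMDD) followed by "-snapshot.zip".
# Files inside subfolders do not match because the whole path must fit.
SNAPSHOT_PATTERN = re.compile(r"^(\d{8})-snapshot\.zip$")


def latest_snapshot(files: Iterable[str]) -> Optional[str]:
    """Return the snapshot file with the newest date prefix, or None."""
    dated = []
    for name in files:
        match = SNAPSHOT_PATTERN.match(name)
        if match:
            dated.append((match.group(1), name))
    # YYYYMMDD strings sort chronologically, so max() finds the newest one.
    return max(dated)[1] if dated else None
```

The result can be passed as `filename` to `hf_hub_download`.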

<br>

### Download a file from the repo
```python=
# pip install huggingface_hub
import os

from huggingface_hub import hf_hub_download

# Read the token from the environment instead of hard-coding it
hf_token = os.getenv("HF_TOKEN")
repo_id = "tsungjung411/snapshots_for_public"
filename = "20250407-snapshot.zip"

# Fetches https://huggingface.co/datasets/{repo_id}/resolve/main/{filename}
local_file_path = hf_hub_download(
    token=hf_token, repo_type="dataset", repo_id=repo_id, filename=filename)

print(f"Downloaded file at: {local_file_path}")
```
- ### Common parameters
    - Specify the download folder
        `local_dir="./downloads"`
        Default (when `local_dir` is omitted): the cache under `~/.cache/huggingface/hub/`
- ### Download result
    ![](https://hackmd.io/_uploads/ryB8kXfCkl.png)
    
- ### Common error 1: gated status: awaiting a review
    GatedRepoError: 403 Client Error. (Request ID: Root=1-67f49d0f-7dea010d736c2d6c7d07cdf4;0dd87351-58b8-4f19-a3da-c47b4997a638)

    Cannot access gated repo for url https://huggingface.co/datasets/tsungjung411/snapshots_for_public/resolve/main/20250407-snapshot.zip.
    Your request to access dataset tsungjung411/snapshots_for_public is awaiting a review from the repo authors.

- ### Common error 2: gated status: rejected
    GatedRepoError: 403 Client Error. (Request ID: Root=1-67f49ce8-4e415e33145a54cd1d759e43;f143bc45-5a0d-4959-bc56-939371948df9)

    Cannot access gated repo for url https://huggingface.co/datasets/tsungjung411/snapshots_for_public/resolve/main/20250407-snapshot.zip.
    Your request to access dataset tsungjung411/snapshots_for_public has been rejected by the repo's authors.
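
The two gated errors share the same exception class and status code and differ only in their message. The following is a sketch of telling them apart by message text, assuming the wording stays as shown above (it may change between server or library versions). The function name `gated_status` is hypothetical.

```python
def gated_status(message: str) -> str:
    """Classify a GatedRepoError message as 'pending', 'rejected' or 'unknown'.

    The matched phrases are copied from the error messages above; this is
    a heuristic on message text, not an official API.
    """
    text = message.lower()
    if "awaiting a review" in text:
        return "pending"
    if "has been rejected" in text:
        return "rejected"
    return "unknown"
```

In an `except GatedRepoError as exc:` block, `gated_status(str(exc))` then tells you whether to wait for approval or contact the repo authors.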

<br>

{%hackmd vaaMgNRPS4KGJDSFG0ZE0w %}