[API] Slurm REST API
===
###### tags: `SlinkyProject / CLI & REST API`
###### tags: `Kubernetes`, `k8s`, `app`, `slurm`, `SlinkyProject`, `Slurm CLI`
<br>
[TOC]
<br>
## docs
- **Slurm REST API overview**: https://slurm.schedmd.com/rest.html
- **REST API quick start**: https://slurm.schedmd.com/rest_quickstart.html
- **REST API reference**: https://slurm.schedmd.com/rest_api.html
- Route prefixes fall into two groups: **/slurm** (methods that query slurmctld) and **/slurmdb** (methods that query slurmdbd).
- The REST API versions available depend on the Slurm version bundled with the Slinky release in use.
- Requesting `http://${SLURM_RESTAPI_IP}:6820/slurm/v0.0.40/accounts` fails with the error below:
`Unable to find requested URL endpoint. Please query the '/openapi/v3' endpoint or visit 'https://slurm.schedmd.com/rest_api.html' for the OpenAPI specification which includes a list of all possible slurmrestd endpoints.`
- v0.0.40 has been phased out, and account endpoints are served under `/slurmdb`, not `/slurm`.
- Switching to `http://${SLURM_RESTAPI_IP}:6820/slurmdb/v0.0.41/accounts` works.
<br>
---
<br>
## Quick start
### Full script
```bash=
# Step 1: get a token for the REST API service
export JWT_TOKEN="$(
  kubectl -n slurm exec slurm-controller-0 -- scontrol token \
  | cut -d= -f2-
)"
# Step 2: get the IP of the REST API service
SLURM_RESTAPI_IP="$(kubectl -n slurm get services slurm-restapi -o jsonpath='{.spec.clusterIP}')"
# Step 3: call the API (using the JWT token)
curl -H "X-SLURM-USER-TOKEN: ${JWT_TOKEN}" \
  "http://${SLURM_RESTAPI_IP}:6820/openapi/v3"
```
- Each step is explained below.
<br>
### Step 1: Get a token for the REST API service
```bash
# scontrol token [lifespan=] [username=] fetch an auth token
$ kubectl -n slurm exec slurm-controller-0 -- scontrol token
SLURM_JWT=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJleHAiOjE3NjY5OTM4NDksImlhdCI6MTc2Njk5MjA0OSwic3VuIjoic2x1cm0ifQ.SnRktJKKFihgDkBSYAFCXKPcUcIBC2KmVsXfJQzcxxM
```
- ### A different token is generated on every call
- ### Inspect the token contents with an [[Online JWT Decoder]](https://fusionauth.io/dev-tools/jwt-decoder)
- **Header**
```json
{
"alg": "HS256",
"typ": "JWT"
}
```
- **alg**: algorithm
- **typ**: type
- **Payload**
```json
{
"exp": 1766993849,
"iat": 1766992049,
"sun": "slurm"
}
```
- [Online Epoch Converter - Unix Timestamp Converter](https://www.epochconverter.com/)
- GMT: Monday, December 29, 2025 7:37:29 AM
- Your time zone: Monday, December 29, 2025 3:37:29 PM GMT+08:00
- Relative: In 25 minutes
- ### Generate a token valid for 10 years as root (`lifespan=315576000` seconds = 10 × 365.25 days)
```
$ scontrol token lifespan=315576000 username=root
SLURM_JWT=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJleHAiOjIwODI1NzMyMjksImlhdCI6MTc2Njk5NzIyOSwic3VuIjoicm9vdCJ9.V9yeqhgqQ9l7uDEBnvd8tmKwMASUr5xeEkLLma3BX54
```
- **Payload**
```
{
"exp": 2082573229,
"iat": 1766997229,
"sun": "root"
}
```
- `"exp": 2082573229`
- GMT: Saturday, December 29, 2035 8:33:49 PM
- Your time zone: Sunday, December 30, 2035 4:33:49 AM GMT+08:00
- Relative: **In 10 years**
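
The payload can also be inspected locally instead of pasting a token into an online decoder. The following is a minimal sketch, assuming only the Python standard library; the helper names `decode_jwt_payload` and `seconds_left` are hypothetical. It decodes the claims without verifying the signature, so it is suitable for inspection only.

```python
import base64
import json
import time


def decode_jwt_payload(token):
    """Return the claims of a JWT without verifying its signature."""
    # Accept the raw `scontrol token` output ("SLURM_JWT=<token>") as well.
    token = token.strip()
    if token.startswith("SLURM_JWT="):
        token = token[len("SLURM_JWT="):]
    _header_b64, payload_b64, _signature = token.split(".")
    # base64url drops the trailing '=' padding, so restore it before decoding.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def seconds_left(claims, now=None):
    """Seconds until the `exp` claim; negative once the token has expired."""
    current = time.time() if now is None else now
    return int(claims["exp"] - current)


# The first token shown in this step (issued for the `slurm` user).
sample = (
    "SLURM_JWT=eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9."
    "eyJleHAiOjE3NjY5OTM4NDksImlhdCI6MTc2Njk5MjA0OSwic3VuIjoic2x1cm0ifQ."
    "SnRktJKKFihgDkBSYAFCXKPcUcIBC2KmVsXfJQzcxxM"
)
claims = decode_jwt_payload(sample)
print(claims["sun"], claims["exp"] - claims["iat"])  # user and lifetime in seconds
```

For the sample token the lifetime `exp - iat` is 1800 seconds, which suggests a default lifespan of 30 minutes in this deployment.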
<br>
### Step 2: Get the IP of the REST API service
```bash
# Get the ClusterIP of the REST API service
$ SLURM_RESTAPI_IP="$(kubectl -n slurm get services slurm-restapi -o jsonpath='{.spec.clusterIP}')"
$ echo $SLURM_RESTAPI_IP
10.100.59.7
```
<br>
### Step 3: Call the API (using the JWT token)
```bash
$ curl -H "X-SLURM-USER-TOKEN: ${JWT_TOKEN}" \
http://${SLURM_RESTAPI_IP}:6820/openapi/v3
{
"tags": [
{
"name": "slurm",
"description": "methods that query slurmctld"
},
{
"name": "slurmdb",
"description": "methods that query slurmdbd"
},
{
"name": "util",
"description": "utilities available directly through slurmrestd"
},
{
"name": "openapi",
"description": "methods that query for generated OpenAPI specifications"
}
],
"paths": {
"\/slurm\/v0.0.41\/shares": {
"get": {
"tags": [
"slurm"
],
"summary": "get fairshare info",
"deprecated": true,
"operationId": "slurm_v0041_get_shares",
"parameters": [
{
"in": "query",
"name": "accounts",
"style": "form",
"explode": false,
"deprecated": true,
"allowEmptyValue": false,
"allowReserved": false,
"description": "Accounts to query",
"required": false,
"schema": {
"type": "string"
}
},
{
"in": "query",
"name": "users",
"style": "form",
"explode": false,
"deprecated": true,
"allowEmptyValue": false,
"allowReserved": false,
"description": "Users to query",
"required": false,
"schema": {
"type": "string"
}
}
],
"responses": {
"200": {
"content": {
"application\/json": {
"schema": {
"$ref": "#\/components\/schemas\/v0.0.41_openapi_shares_resp"
}
},
"application\/yaml": {
"schema": {
"$ref": "#\/components\/schemas\/v0.0.41_openapi_shares_resp"
}
}
},
"description": "shares information"
},
"default": {
"content": {
"application\/json": {
"schema": {
"$ref": "#\/components\/schemas\/v0.0.41_openapi_shares_resp"
}
},
"application\/yaml": {
"schema": {
"$ref": "#\/components\/schemas\/v0.0.41_openapi_shares_resp"
}
}
},
"description": "shares information"
}
}
}
},
...
},
...
}
```
- ### Top-level structure
```json
{
"tags":[...],
"paths":{
"\/slurm\/v0.0.41\/jobs\/":{...}
"\/slurm\/v0.0.42\/jobs\/":{...}
"\/slurm\/v0.0.43\/jobs\/":{...}
"\/slurm\/v0.0.44\/jobs\/":{...}
"\/slurmdb\/v0.0.41\/accounts\/": {...}
"\/slurmdb\/v0.0.42\/accounts\/": {...}
"\/slurmdb\/v0.0.43\/accounts\/": {...}
"\/slurmdb\/v0.0.44\/accounts\/": {...}
...
},
"components":{...},
"info":{...},
"servers":[...],
"security":[...],
"openapi":"3.0.3"
}
```
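
Since the set of versions differs between Slurm releases, it can be read from the `paths` keys of the `/openapi/v3` response rather than guessed. A minimal sketch, assuming the response has already been parsed into a dict (JSON parsing turns `\/` back into `/`); the function name `supported_versions` is hypothetical.

```python
import re

# Matches versioned routes such as /slurm/v0.0.41/jobs/ or /slurmdb/v0.0.44/accounts/
VERSIONED = re.compile(r"^/(slurm|slurmdb)/(v\d+\.\d+\.\d+)/")


def _version_key(version):
    """Sort key that orders 'v0.0.9' before 'v0.0.41' numerically."""
    return tuple(int(part) for part in version[1:].split("."))


def supported_versions(spec):
    """Map each route prefix to the sorted API versions found in spec['paths']."""
    found = {}
    for path in spec.get("paths", {}):
        match = VERSIONED.match(path)
        if match:
            prefix, version = match.groups()
            found.setdefault(prefix, set()).add(version)
    return {
        prefix: sorted(versions, key=_version_key)
        for prefix, versions in found.items()
    }


spec = {
    "paths": {
        "/slurm/v0.0.41/jobs/": {},
        "/slurm/v0.0.44/jobs/": {},
        "/slurmdb/v0.0.42/accounts/": {},
        "/openapi/v3": {},
    }
}
print(supported_versions(spec))
```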
<br>
---
<br>
## REST API usage examples
### Error returned when the token has expired
```
{
"errors": [
{
"description": "Failed to open slurmdbd connection, authentication error",
"error_number": 7000,
"error": "Unable to connect to database",
"source": "openapi_get_db_conn"
}
],
...
}
```
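
Responses carry an `errors` array in the body, so a client can inspect it in addition to the HTTP status. A minimal sketch, assuming the body has already been parsed into a dict; `describe_errors` and `raise_for_slurm_errors` are hypothetical helper names.

```python
def describe_errors(body):
    """Format every entry of the `errors` array as one readable line."""
    lines = []
    for err in body.get("errors", []):
        lines.append(
            f"[{err.get('error_number')}] {err.get('error')}: "
            f"{err.get('description')} (source: {err.get('source')})"
        )
    return lines


def raise_for_slurm_errors(body):
    """Raise RuntimeError when the response body reports any error."""
    problems = describe_errors(body)
    if problems:
        raise RuntimeError("; ".join(problems))


expired = {
    "errors": [
        {
            "description": "Failed to open slurmdbd connection, authentication error",
            "error_number": 7000,
            "error": "Unable to connect to database",
            "source": "openapi_get_db_conn",
        }
    ]
}
for line in describe_errors(expired):
    print(line)
```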
### Query account/root
```json=
$ curl -H "X-SLURM-USER-TOKEN: ${JWT_TOKEN}" \
http://${SLURM_RESTAPI_IP}:6820/slurmdb/v0.0.41/account/root
{
"accounts": [
{
"associations": [],
"coordinators": [],
"description": "default root account",
"name": "root",
"organization": "root",
"flags": []
}
],
"meta": {
"plugin": {
"type": "openapi\/slurmdbd",
"name": "Slurm OpenAPI slurmdbd",
"data_parser": "data_parser\/v0.0.41",
"accounting_storage": "accounting_storage\/slurmdbd"
},
"client": {
"source": "slurm-restapi-86b668d6cb-vzj74:6820(fd:11)",
"user": "nobody",
"group": ""
},
"command": [],
"slurm": {
"version": {
"major": "25",
"micro": "1",
"minor": "11"
},
"release": "25.11.1",
"cluster": "slurm_slurm"
}
},
"errors": [],
"warnings": []
}
```
<br>
---
### Add an account
```json=
$ curl -X POST \
-H "X-SLURM-USER-TOKEN: ${JWT_TOKEN}" \
-H "Content-Type: application/json" \
-d '{
"accounts": [{
"name": "project_a",
"description": "Project A Research Account",
"organization": "Research Lab"
}]
}' \
http://${SLURM_RESTAPI_IP}:6820/slurmdb/v0.0.41/accounts
{
"errors": [],
"warnings": [],
"meta": {
"plugin": {
"type": "openapi\/slurmdbd",
"name": "Slurm OpenAPI slurmdbd",
"data_parser": "data_parser\/v0.0.41",
"accounting_storage": "accounting_storage\/slurmdbd"
},
"client": {
"source": "slurm-restapi-86b668d6cb-vzj74:6820(fd:11)",
"user": "nobody",
"group": ""
},
"command": [],
"slurm": {
"version": {
"major": "25",
"micro": "1",
"minor": "11"
},
"release": "25.11.1",
"cluster": "slurm_slurm"
}
}
}
```
- ### Before adding:
```
$ sacctmgr list account
Account Descr Org
---------- -------------------- --------------------
root default root account root
```
- ### After adding:
```
$ sacctmgr list account
Account Descr Org
---------- -------------------- --------------------
project_a Project A Research + Research Lab
root default root account root
```
<br>
---
### Delete an account
```json=
$ curl -X DELETE \
-H "X-SLURM-USER-TOKEN: ${JWT_TOKEN}" \
http://${SLURM_RESTAPI_IP}:6820/slurmdb/v0.0.41/account/project_a
{
"removed_accounts": [
"project_a"
],
"meta": {
"plugin": {
"type": "openapi\/slurmdbd",
"name": "Slurm OpenAPI slurmdbd",
"data_parser": "data_parser\/v0.0.41",
"accounting_storage": "accounting_storage\/slurmdbd"
},
"client": {
"source": "slurm-restapi-86b668d6cb-vzj74:6820(fd:11)",
"user": "nobody",
"group": ""
},
"command": [],
"slurm": {
"version": {
"major": "25",
"micro": "1",
"minor": "11"
},
"release": "25.11.1",
"cluster": "slurm_slurm"
}
},
"errors": [],
"warnings": []
}
```
- ### Before deleting:
```
$ sacctmgr list account
Account Descr Org
---------- -------------------- --------------------
project_a Project A Research + Research Lab
root default root account root
```
- ### After deleting:
```
$ sacctmgr list account
Account Descr Org
---------- -------------------- --------------------
root default root account root
```
<br>
---
<br>
## Managing tokens in K8s
> Sources:
> - [[issue] Slurm rest api token not generated at deployment #97](https://github.com/SlinkyProject/slurm-operator/issues/97#issuecomment-3649902053)
> - [[Commit 47e86be] feat: add Token CRD](https://github.com/SlinkyProject/slurm-operator/commit/47e86be3aa5d3661e23bb9bdc1595151c5aaaae6)
> - [[CR example] slurm-operator/hack/resources/token.yaml](https://github.com/SlinkyProject/slurm-operator/blob/main/hack/resources/token.yaml)
### CR Example
> slurm-token.yaml
```yaml=
apiVersion: slinky.slurm.net/v1beta1
kind: Token
metadata:
name: slurm-token
namespace: slurm
spec:
jwtHs256KeyRef:
name: slurm-auth-jwths256
key: jwt_hs256.key
username: root
refresh: true
lifetime: 15m
```
- ### Apply to K8s:
```
$ kubectl apply -f slurm-token.yaml
```
- ### Query the Token CR
```
$ kubectl -n slurm get token
NAME USER IAT AGE
slurm-token root 72s 5m46s
```
- ### Query the secret generated by the Token CR
```
$ kubectl -n slurm get secret | egrep "NAME|token"
NAME TYPE DATA AGE
slurm-token-jwt-root Opaque 1 6m45s
```
- ### Read the JWT token
```
$ kubectl -n slurm get secret/slurm-token-jwt-root -o jsonpath="{.data.SLURM_JWT}" | base64 -d
eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJpc3MiOiJzbHVybS1vcGVyYXRvciIsImV4cCI6MTc2NzAwMjI4NCwibmJmIjoxNzY3MDAxMzg0LCJpYXQiOjE3NjcwMDEzODQsImp0aSI6IjcyYjZhZWMzLTlmNDAtNDMwNC04Y2M5LTNhZjAwYzg0NmExNyIsInN1biI6InJvb3QifQ.LjXjlq7jjmr1aJOYJh2YpUh_t5HQKxoIex-eYgfzNj8
```
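- Secret name: `<token-cr-name>-jwt-<username>`
- ### Kubernetes Duration format rules
Standard formats supported by Kubernetes `metav1.Duration` (based on Go's `time.Duration`):
- Multiple units can be combined: `1h30m`, `2h45m30s`
- Supported base units: `ns`, `us`/`µs`, `ms`, `s`, `m`, `h`
- `d` (days) and `y` (years) are **not supported directly** and must be converted to hours, for example:
- 1 day = `24h`
- 7 days = `168h`
- 30 days = `720h`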
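
The day-to-hour conversion can be scripted when generating the `lifetime` field. A minimal sketch; `to_k8s_duration` is a hypothetical helper that only emits the `h` and `m` units listed above.

```python
def to_k8s_duration(days=0, hours=0, minutes=0):
    """Express a time span as a Go-style duration string using only h and m."""
    total_minutes = (days * 24 + hours) * 60 + minutes
    if total_minutes <= 0:
        raise ValueError("duration must be positive")
    whole_hours, rest_minutes = divmod(total_minutes, 60)
    text = f"{whole_hours}h" if whole_hours else ""
    if rest_minutes:
        text += f"{rest_minutes}m"
    return text


print(to_k8s_duration(days=7))        # one week expressed in hours
print(to_k8s_duration(minutes=15))    # the lifetime used in the CR example
```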
<br>
---
<br>
## [In depth] Talking to Slurm through the Slurm REST API (round-1)
> - codebase: https://github.com/slinkyproject
> <!-- http://10.78.26.44:30000/deepwiki/slurm-operator-v1.0.0-20251125/-/issues/31 -->
> - by us.anthropic.claude-sonnet-4-5-20250929-v1:0
> (The information below may be inaccurate; check the latest documentation: https://slurm.schedmd.com/rest_api.html)
Based on the implementation in the Slurm Operator project, the usage can be summarised as follows:
### 1. Basic architecture
**REST API service information**
- **Daemon**: `slurmrestd` (exposed in this deployment as the `slurm-restapi` Service)
- **Default port**: `6820`
- **Protocol**: HTTP
- **API specification**: OpenAPI v3
From `internal/builder/restapi_app.go:26`:
```go
SlurmrestdPort = 6820
```
### 2. Authentication
#### JWT token authentication
The project authenticates with **JWT (JSON Web Token)**, implemented in `internal/controller/token/slurmjwt/token.go`.
**Token specification**:
- **Signing algorithm**: HS256
- **Default user**: `slurm`
- **Default lifetime**: effectively unlimited (`math.MaxInt32 - 1` seconds)
**Token claims structure**:
```go
// Simplified sketch of the claims: the standard fields come from the
// embedded jwt.RegisteredClaims instead of being declared again.
type TokenClaims struct {
	jwt.RegisteredClaims        // ID (jti), Issuer ("slurm-operator"), IssuedAt (iat), ExpiresAt (exp)
	SlurmUsername        string `json:"sun"` // Slurm username
}
```
**Generating a token**:
```go
// Create a token from the signing key
token := NewToken(signingKey)
token.WithUsername("your-username") // set the user
token.WithLifetime(24 * time.Hour)  // set the lifetime
// Produce the signed token
signedToken, err := token.NewSignedToken()
```
### 3. Using the API
#### Basic request format
`helm/slurm/templates/NOTES.txt:91` shows an example:
```bash
# Get the IP of the REST API service
SLURM_RESTAPI_IP="$(kubectl get services -n slurm slurm-restapi -o jsonpath='{.spec.clusterIP}')"
# Call the API (using the JWT token)
curl -H "X-SLURM-USER-TOKEN: ${JWT_TOKEN}" \
  http://${SLURM_RESTAPI_IP}:6820/openapi/v3
# Or use auth/none (development environments only)
curl -H "X-SLURM-USER-TOKEN: auth/none" \
  http://${SLURM_RESTAPI_IP}:6820/openapi/v3
```
#### API endpoint structure
Commonly used endpoints are listed below. Account and user endpoints are served under `/slurmdb`, job and node endpoints under `/slurm`. The version is v0.0.41 because v0.0.40 has been phased out in this deployment:
```bash
# API specification
GET /openapi/v3
# Account management
GET    /slurmdb/v0.0.41/accounts          # list all accounts
GET    /slurmdb/v0.0.41/account/{name}    # get one account
POST   /slurmdb/v0.0.41/accounts          # create accounts
DELETE /slurmdb/v0.0.41/account/{name}    # delete an account
# User management
GET    /slurmdb/v0.0.41/users             # list all users
GET    /slurmdb/v0.0.41/user/{name}       # get one user
POST   /slurmdb/v0.0.41/users             # create users
DELETE /slurmdb/v0.0.41/user/{name}       # delete a user
# Job management
GET    /slurm/v0.0.41/jobs                # list jobs
POST   /slurm/v0.0.41/job/submit          # submit a job
DELETE /slurm/v0.0.41/job/{job_id}        # cancel a job
# Node information
GET    /slurm/v0.0.41/nodes               # list all nodes
GET    /slurm/v0.0.41/node/{name}         # get one node
```
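
Because the working calls earlier in this note use `/slurmdb/v0.0.41/accounts` for accounting data, a small helper can pick the route prefix from the resource name so the two prefixes are not mixed up. This is a sketch under assumptions: the function name `endpoint`, the default version, and the resource sets (taken only from endpoints mentioned in this note) are illustrative, not an exhaustive list.

```python
from urllib.parse import quote

# Resource names taken from the endpoints used in this note (not exhaustive).
SLURMDB_RESOURCES = {"accounts", "account", "users", "user", "associations"}
SLURM_RESOURCES = {"jobs", "job", "nodes", "node", "partitions", "partition"}


def endpoint(base_url, resource, name=None, version="v0.0.41"):
    """Build an endpoint URL, picking /slurm or /slurmdb from the resource."""
    if resource in SLURMDB_RESOURCES:
        prefix = "slurmdb"
    elif resource in SLURM_RESOURCES:
        prefix = "slurm"
    else:
        raise ValueError(f"unknown resource: {resource!r}")
    url = f"{base_url.rstrip('/')}/{prefix}/{version}/{resource}"
    if name is not None:
        # Percent-encode the name so characters such as '@' are safe in a path.
        url += "/" + quote(str(name), safe="")
    return url


print(endpoint("http://10.100.59.7:6820", "account", name="root"))
print(endpoint("http://10.100.59.7:6820", "nodes"))
```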
### 4. Account and user management examples
#### 1. Add an account
```bash
# JWT token (issued for the slurm user)
JWT_TOKEN="your-jwt-token-here"
# Add the account
curl -X POST \
  -H "X-SLURM-USER-TOKEN: ${JWT_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "accounts": [{
      "name": "project_a",
      "description": "Project A Research Account",
      "organization": "Research Lab"
    }]
  }' \
  http://${SLURM_RESTAPI_IP}:6820/slurmdb/v0.0.41/accounts
```
#### 2. Add a user
```bash
# Add a user whose default account is project_a
curl -X POST \
  -H "X-SLURM-USER-TOKEN: ${JWT_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "users": [{
      "name": "john_doe",
      "default_account": "project_a",
      "admin_level": "None"
    }]
  }' \
  http://${SLURM_RESTAPI_IP}:6820/slurmdb/v0.0.41/users
```
#### 3. Query account information
```bash
# List all accounts
curl -H "X-SLURM-USER-TOKEN: ${JWT_TOKEN}" \
  http://${SLURM_RESTAPI_IP}:6820/slurmdb/v0.0.41/accounts
# Query a specific account
curl -H "X-SLURM-USER-TOKEN: ${JWT_TOKEN}" \
  http://${SLURM_RESTAPI_IP}:6820/slurmdb/v0.0.41/account/project_a
```
### 5. Important notes
#### 1. **Accounting must be enabled**
From `internal/builder/controller_config.go:206-216`:
```go
// Accounting must be enabled to manage accounts and users
if accounting != nil {
	conf.AddProperty("AccountingStorageType", "accounting_storage/slurmdbd")
	conf.AddProperty("AccountingStorageHost", accounting.ServiceKey().Name)
	conf.AddProperty("AccountingStoragePort", SlurmdbdPort)
} else {
	conf.AddProperty("AccountingStorageType", "accounting_storage/none")
}
```
**How to enable it** (from `docs/installation.md:228`):
```bash
# Enable accounting at install time
helm install slurm ./helm/slurm \
  --set 'accounting.enabled=true'
```
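#### 2. **JWT signing key security**
- The JWT signing key (`signingKey`) **must be kept secret**.
- Every service that accesses the REST API must use the same signing key.
- The signing key is stored in a Kubernetes Secret.
#### 3. **Token lifetime**
```go
// The default is effectively unlimited
lifetime := math.MaxInt32 - 1 // seconds, about 68 years
// Prefer a lifetime that matches your security requirements
token.WithLifetime(24 * time.Hour) // for example, 24 hours
```
#### 4. **API versioning**
- The `/openapi/v3` listing of this deployment shows v0.0.41 through v0.0.44; v0.0.40 has been phased out.
- Check the OpenAPI document to confirm which versions are supported.
- Request and response formats may differ between versions.
#### 5. **Port-forward (development environments)**
From `docs/usage/develop.md:200`:
```bash
# Local development needs a port-forward
kubectl port-forward --namespace slurm services/slurm-restapi 6820:6820
```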
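#### 6. **Database backend requirement**
Account and user management requires **slurmdbd** (the Slurm Database Daemon):
- MariaDB or MySQL is needed as the backing database.
- The settings live in `AccountingSpec.StorageConfig`:
```go
type StorageConfig struct {
	Host     string // database host
	Port     int    // database port
	Database string // database name
	Username string // database user
}
```
#### 7. **Permissions**
- Only users with `admin` privileges can manage accounts and users.
- The `SlurmUsername` (the `sun` claim) in the JWT determines which operations are permitted.
- Make sure the user named in the token has the appropriate administrative rights in Slurm.
#### 8. **Error handling**
```go
// Token verification
valid, err := VerifyToken(tokenString, signingKey)
if err != nil || !valid {
	// handle the verification failure
}
// API calls must check the HTTP status code.
// Common status codes:
// - 401: unauthorized (token invalid or expired)
// - 403: forbidden (insufficient privileges)
// - 404: resource not found
// - 500: server error
```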
### 6. End-to-end example
```bash
#!/bin/bash
# 1. REST API service details
NAMESPACE="slurm"
API_SERVICE="slurm-restapi"
API_PORT="6820"
# 2. Set up a port-forward (development environments)
kubectl port-forward -n ${NAMESPACE} services/${API_SERVICE} ${API_PORT}:${API_PORT} &
sleep 2  # give the port-forward a moment to start listening
# 3. Generate or fetch a JWT token
# (in a real environment, read it from a secure location)
JWT_TOKEN="your-jwt-token"
# 4. Test the connection
curl -H "X-SLURM-USER-TOKEN: ${JWT_TOKEN}" \
  http://localhost:${API_PORT}/openapi/v3
# 5. Create an account
curl -X POST \
  -H "X-SLURM-USER-TOKEN: ${JWT_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "accounts": [{
      "name": "research_team",
      "description": "Research Team Account"
    }]
  }' \
  http://localhost:${API_PORT}/slurmdb/v0.0.41/accounts
# 6. Create a user
curl -X POST \
  -H "X-SLURM-USER-TOKEN: ${JWT_TOKEN}" \
  -H "Content-Type: application/json" \
  -d '{
    "users": [{
      "name": "researcher01",
      "default_account": "research_team"
    }]
  }' \
  http://localhost:${API_PORT}/slurmdb/v0.0.41/users
# 7. Verify the result
curl -H "X-SLURM-USER-TOKEN: ${JWT_TOKEN}" \
  http://localhost:${API_PORT}/slurmdb/v0.0.41/accounts
curl -H "X-SLURM-USER-TOKEN: ${JWT_TOKEN}" \
  http://localhost:${API_PORT}/slurmdb/v0.0.41/users
```
### 7. Related links
- **Slurm REST API overview**: https://slurm.schedmd.com/rest.html
- **REST API quick start**: https://slurm.schedmd.com/rest_quickstart.html
- **REST API reference**: https://slurm.schedmd.com/rest_api.html
- **JWT compatibility notes**: https://slurm.schedmd.com/jwt.html#compatibility
<br>
---
<br>
## [In depth] Talking to Slurm through the Slurm REST API (round-2)
> - codebase: https://github.com/SchedMD/slurm
> <!-- http://10.78.26.44:30000/deepwiki/slurm/-/issues/17 -->
> - by `us.anthropic.claude-sonnet-4-5-20250929-v1:0` (2025/12/29)
> (The information below may be inaccurate; check the latest documentation: https://slurm.schedmd.com/rest_api.html)
Based on an analysis of the Slurm codebase:
### 1. Basic architecture
The Slurm REST API is served by the **slurmrestd** daemon, follows the OpenAPI specification, and talks to both slurmctld and slurmdbd.
### 2. Authentication
The REST API supports several authentication methods:
1. **JWT (JSON Web Token)**, the recommended method
- Token header: `X-SLURM-USER-TOKEN`
- Username header: `X-SLURM-USER-NAME`
2. **HTTP Basic Authentication**
3. **Bearer Token**
- Header: `Authorization: Bearer <token>`
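
The JWT header pair can be assembled once and reused across calls. A minimal sketch using only the standard library; `auth_headers` and `build_request` are hypothetical names, and the request object is constructed here without being sent.

```python
import urllib.request


def auth_headers(token, username=None):
    """Return the JWT headers; the username header is optional."""
    headers = {"X-SLURM-USER-TOKEN": token}
    if username:
        headers["X-SLURM-USER-NAME"] = username
    return headers


def build_request(url, token, username=None, method="GET"):
    """Create (but do not send) a request carrying the JWT headers."""
    return urllib.request.Request(
        url, headers=auth_headers(token, username), method=method
    )


request = build_request("http://localhost:6820/openapi/v3", "your-token", "slurm")
print(request.get_method(), request.full_url)
# To actually send it: urllib.request.urlopen(request)
```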
### 3. Account API operations
#### 1. **List all accounts**
```bash
curl -H "X-SLURM-USER-TOKEN:$SLURM_JWT" \
-X GET \
"http://<server>:<port>/slurmdb/v0.0.43/accounts/"
```
#### 2. **Query a specific account**
```bash
curl -H "X-SLURM-USER-TOKEN:$SLURM_JWT" \
-X GET \
"http://<server>:<port>/slurmdb/v0.0.43/account/<account_name>"
```
Supported query parameters:
- `with_assocs`: include associations
- `with_coords`: include coordinators
- `with_deleted`: include deleted records
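
These flags are passed as ordinary query-string parameters. A minimal sketch that builds such a URL; `account_query_url` is a hypothetical helper, and the value `true` follows the `?with_assocs=true` form used elsewhere in this note.

```python
from urllib.parse import quote, urlencode

ACCOUNT_FLAGS = {"with_assocs", "with_coords", "with_deleted"}


def account_query_url(base_url, name, version="v0.0.43", **flags):
    """Build the URL for one account, appending the enabled query flags."""
    unknown = set(flags) - ACCOUNT_FLAGS
    if unknown:
        raise ValueError(f"unsupported query parameters: {sorted(unknown)}")
    url = f"{base_url}/slurmdb/{version}/account/{quote(name, safe='')}"
    # Only flags set to a true value are emitted, in a stable order.
    query = urlencode({key: "true" for key in sorted(flags) if flags[key]})
    return f"{url}?{query}" if query else url


print(account_query_url("http://localhost:6820", "project_a", with_assocs=True))
```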
#### 3. **Add or update an account**
```bash
curl -H "X-SLURM-USER-TOKEN:$SLURM_JWT" \
-H "Content-Type: application/json" \
-X POST \
"http://<server>:<port>/slurmdb/v0.0.43/accounts/" \
-d '{
"accounts": [
{
"name": "new_account",
"description": "New account description",
"organization": "My Organization",
"coordinators": [
{"name": "coordinator_username"}
]
}
]
}'
```
#### 4. **Delete an account**
```bash
curl -H "X-SLURM-USER-TOKEN:$SLURM_JWT" \
-X DELETE \
"http://<server>:<port>/slurmdb/v0.0.43/account/<account_name>"
```
### 4. User API operations
#### 1. **List all users**
```bash
curl -H "X-SLURM-USER-TOKEN:$SLURM_JWT" \
-X GET \
"http://<server>:<port>/slurmdb/v0.0.43/users/"
```
#### 2. **Query a specific user**
```bash
curl -H "X-SLURM-USER-TOKEN:$SLURM_JWT" \
-X GET \
"http://<server>:<port>/slurmdb/v0.0.43/user/<username>"
```
Supported query parameters:
- `with_assocs`: include associations
- `with_coords`: include coordinators
- `with_wckeys`: include wckeys
- `with_deleted`: include deleted records
#### 3. **Add or update a user**
```bash
curl -H "X-SLURM-USER-TOKEN:$SLURM_JWT" \
-H "Content-Type: application/json" \
-X POST \
"http://<server>:<port>/slurmdb/v0.0.43/users/" \
-d '{
"users": [
{
"name": "new_user",
"default_account": "default_account_name",
"default_wckey": "default_wckey",
"admin_level": "None"
}
]
}'
```
#### 4. **Add a user association** (link a user to an account)
```bash
curl -H "X-SLURM-USER-TOKEN:$SLURM_JWT" \
-H "Content-Type: application/json" \
-X POST \
"http://<server>:<port>/slurmdb/v0.0.43/users_association/" \
-d '{
"user": {
"name": "username",
"default_account": "account_name"
},
"association": {
"account": ["account_name"],
"cluster": ["cluster_name"],
"user": ["username"]
}
}'
```
#### 5. **Delete a user**
```bash
curl -H "X-SLURM-USER-TOKEN:$SLURM_JWT" \
-X DELETE \
"http://<server>:<port>/slurmdb/v0.0.43/user/<username>"
```
### 5. Important notes
#### 1. **Authentication setup**
```bash
# Get a JWT token
unset SLURM_JWT
export $(scontrol token)
# Or specify the lifetime (seconds)
export $(scontrol token lifespan=3600)
```
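
When the token is needed inside a program rather than a shell session, the `SLURM_JWT=<token>` line printed by `scontrol token` can be parsed directly. A minimal sketch; `parse_scontrol_token` and `fetch_token` are hypothetical names, and `fetch_token` assumes `scontrol` is on the `PATH` (it is defined here but not called).

```python
import subprocess


def parse_scontrol_token(output):
    """Extract the token from `scontrol token` output (SLURM_JWT=<token>)."""
    for line in output.splitlines():
        key, sep, value = line.strip().partition("=")
        if key == "SLURM_JWT" and sep and value:
            return value
    raise ValueError("no SLURM_JWT=<token> line found")


def fetch_token(lifespan=3600):
    """Run `scontrol token lifespan=<seconds>` and return the bare token."""
    result = subprocess.run(
        ["scontrol", "token", f"lifespan={lifespan}"],
        check=True, capture_output=True, text=True,
    )
    return parse_scontrol_token(result.stdout)


print(parse_scontrol_token("SLURM_JWT=header.payload.signature\n"))
```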
#### 2. **Starting slurmrestd**
```bash
# Set environment variables
export SLURM_JWT=daemon
export SLURMRESTD_DEBUG=debug
# Start the service
slurmrestd <host>:<port>
```
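#### 3. **API version**
- Version used in these examples: `v0.0.43`
- List the available versions: `slurmrestd -d list`
#### 4. **Permissions**
- The API caller must hold the appropriate privileges.
- Adding or modifying accounts and users usually requires administrator privileges.
- The JWT must be generated for a user who holds those privileges.
#### 5. **Data structure notes**
According to the source (users.c:154-164, accounts.c:193-195):
- **User associations** cannot be set through the users endpoint; use the dedicated associations endpoint.
- **User coordinators** cannot be set through the users endpoint either; use the dedicated coordinators endpoint.
- **Account associations** must likewise be set through the associations endpoint.
#### 6. **Operation behaviour**
- A POST request decides automatically whether it is an **add** or a **modify** operation:
- If the resource does not exist, it is added.
- If the resource already exists, it is modified.
- The `old_name` field can be used to rename a user.
#### 7. **WCKey handling**
- If `default_wckey` is set, it is added to `wckey_list` automatically.
- That WCKey is marked as the default automatically.
#### 8. **Error handling**
- API responses contain `errors` and `warnings` arrays.
- Check the HTTP status code:
- 200: success
- 400: malformed request
- 401: authentication failed
- 404: resource not found
- 500: server error
#### 9. **Security considerations**
- **Do not hard-code sensitive information in slurm.conf.**
- JWT tokens expire and must be renewed periodically.
- Prefer HTTPS over HTTP in real deployments.
- slurmrestd should run as a dedicated unprivileged user.
#### 10. **Performance**
- For batch operations, a single POST request can carry multiple users or accounts.
- Use suitable query parameters to avoid transferring unnecessary data.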
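
A batch request body is just the same `accounts` array with more than one entry. A minimal sketch that builds such a body; `accounts_payload` is a hypothetical helper, and the second account name is invented for illustration.

```python
import json


def accounts_payload(accounts):
    """Build one POST body from (name, description, organization) tuples."""
    return {
        "accounts": [
            {"name": name, "description": description, "organization": organization}
            for name, description, organization in accounts
        ]
    }


body = accounts_payload([
    ("project_a", "Project A Research Account", "Research Lab"),
    ("project_b", "Project B Research Account", "Research Lab"),  # illustrative
])
print(json.dumps(body, indent=2))
```

The resulting JSON can be passed to `curl -d` or sent with any HTTP client in a single POST to the accounts endpoint.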
### 6. Complete example workflow
```bash
# 1. Set up the environment
export SLURM_JWT=daemon
export SLURMRESTD_DEBUG=debug
# 2. Start slurmrestd (if it is not already running)
slurmrestd localhost:6820 &
# 3. Get an authentication token
unset SLURM_JWT
export $(scontrol token lifespan=3600)
# 4. Add an account
curl -H "X-SLURM-USER-TOKEN:$SLURM_JWT" \
-H "Content-Type: application/json" \
-X POST \
"http://localhost:6820/slurmdb/v0.0.43/accounts/" \
-d '{
"accounts": [{
"name": "project_a",
"description": "Project A Research Account",
"organization": "Research Lab"
}]
}'
# 5. Add a user
curl -H "X-SLURM-USER-TOKEN:$SLURM_JWT" \
-H "Content-Type: application/json" \
-X POST \
"http://localhost:6820/slurmdb/v0.0.43/users/" \
-d '{
"users": [{
"name": "john_doe",
"default_account": "project_a",
"admin_level": "None"
}]
}'
# 6. Query to confirm
curl -H "X-SLURM-USER-TOKEN:$SLURM_JWT" \
-X GET \
"http://localhost:6820/slurmdb/v0.0.43/user/john_doe?with_assocs=true"
```
### 7. Related source paths
- REST API implementation: `/src/slurmrestd/plugins/openapi/slurmdbd/`
- User handling: `users.c`
- Account handling: `accounts.c`
- Authentication: `/src/slurmrestd/rest_auth.h`
- Documentation: `/doc/html/rest_quickstart.shtml`, `/doc/html/rest_api.shtml`
<br>
---
<br>
## CLI to REST API conversion examples
### Converting `sacctmgr modify user <username> set MaxJobs=0 MaxSubmitJobs=0` to the REST API
- ### `us.anthropic.claude-sonnet-4-5-20250929-v1:0` (2026/01/31)
> codebase: https://github.com/SchedMD/slurm
## Setting a user's MaxJobs and MaxSubmitJobs through the Slurm REST API
According to the code, `sacctmgr modify user <username> set MaxJobs=0 MaxSubmitJobs=0` has to be expressed through the **Associations API**, because MaxJobs and MaxSubmitJobs are **association-level limits** rather than plain user attributes.
### REST API endpoint
```
POST /slurmdb/v0.0.43/associations/
```
- https://slurm.schedmd.com/rest_api.html#slurmdbV0044PostAssociations
### Request format
```json
{
"associations": [
{
"user": "username",
"cluster": "your-cluster-name",
"account": "user-account",
"max": {
"jobs": {
"active": {
"set": true,
"number": 0
},
"total": {
"set": true,
"number": 0
}
}
}
}
]
}
```
### Field reference
According to `src/plugins/data_parser/v0.0.43/parsers.c:4963-4965`:
- **`max/jobs/active`** (max_jobs): maps to `MaxJobs`, the maximum number of jobs the user can run at the same time in this association
- **`max/jobs/total`** (max_submit_jobs): maps to `MaxSubmitJobs`, the maximum number of jobs the user can have pending or running in this association
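
The mapping from the two CLI options to the nested JSON fields can be captured in a small builder. A minimal sketch; `limits_payload` is a hypothetical helper that reproduces the request format shown above.

```python
import json


def limits_payload(user, cluster, account, max_jobs, max_submit_jobs):
    """Build the associations body that sets MaxJobs and MaxSubmitJobs."""

    def number(value):
        # `set: true` marks the field as one that should be updated.
        return {"set": True, "number": value}

    return {
        "associations": [
            {
                "user": user,
                "cluster": cluster,
                "account": account,
                "max": {
                    "jobs": {
                        "active": number(max_jobs),        # MaxJobs
                        "total": number(max_submit_jobs),  # MaxSubmitJobs
                    }
                },
            }
        ]
    }


body = limits_payload("username", "your-cluster-name", "user-account", 0, 0)
print(json.dumps(body, indent=2))
```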
### Full curl example
```bash
# 1. Query the existing association first
curl -X GET "http://localhost:6820/slurmdb/v0.0.43/associations/?user=username" \
-H "X-SLURM-USER-NAME: slurm" \
-H "X-SLURM-USER-TOKEN: your-token" \
-H "Content-Type: application/json"
# 2. Modify the association
curl -X POST "http://localhost:6820/slurmdb/v0.0.43/associations/" \
-H "X-SLURM-USER-NAME: slurm" \
-H "X-SLURM-USER-TOKEN: your-token" \
-H "Content-Type: application/json" \
-d '{
"associations": [
{
"user": "username",
"cluster": "linux",
"account": "root",
"max": {
"jobs": {
"active": {
"set": true,
"number": 0
},
"total": {
"set": true,
"number": 0
}
}
}
}
]
}'
```
- ### CLI example run
```
# Before:
$ sacctmgr list association user=tj_tsai@asus.com
Cluster Account User Partition Share Priority GrpJobs GrpTRES GrpSubmit GrpWall GrpTRESMins MaxJobs MaxTRES MaxTRESPerNode MaxSubmit MaxWall MaxTRESMins QOS Def QOS GrpTRESRunMin
---------- ---------- ---------- ---------- --------- ---------- ------- ------------- --------- ----------- ------------- ------- ------------- -------------- --------- ----------- ------------- -------------------- --------- -------------
slurm_slu+ tn tj_tsai@a+ 1 normal
# After:
$ sacctmgr list association user=tj_tsai@asus.com
Cluster Account User Partition Share Priority GrpJobs GrpTRES GrpSubmit GrpWall GrpTRESMins MaxJobs MaxTRES MaxTRESPerNode MaxSubmit MaxWall MaxTRESMins QOS Def QOS GrpTRESRunMin
---------- ---------- ---------- ---------- --------- ---------- ------- ------------- --------- ----------- ------------- ------- ------------- -------------- --------- ----------- ------------- -------------------- --------- -------------
slurm_slu+ tn tj_tsai@a+ 1 2 2 normal
```
- ### REST API example run
1. **Query the existing association first**
```json=
$ TOKEN=<your-token>
$ curl -X GET "http://slurm-restapi.slurm.svc.cluster.local:6820/slurmdb/v0.0.43/associations/?user=tj_tsai%40asus.com" \
-H "X-SLURM-USER-NAME: slurm" \
-H "X-SLURM-USER-TOKEN: ${TOKEN}" \
-H "Content-Type: application/json"
{
"associations": [
{
"accounting": [],
"account": "tn",
"cluster": "slurm_slurm",
"comment": "",
"default": {
"qos": ""
},
"flags": [],
"max": {
"jobs": {
"per": {
"count": {
"set": false,
"infinite": true,
"number": 0
},
"accruing": {
"set": false,
"infinite": true,
"number": 0
},
"submitted": {
"set": false,
"infinite": true,
"number": 0
},
"wall_clock": {
"set": false,
"infinite": true,
"number": 0
}
},
"active": {
"set": false,
"infinite": false,
"number": 0
},
"accruing": {
"set": false,
"infinite": true,
"number": 0
},
"total": {
"set": false,
"infinite": false,
"number": 0
}
},
"tres": {
"total": [],
"group": {
"minutes": [],
"active": []
},
"minutes": {
"total": [],
"per": {
"job": []
}
},
"per": {
"job": [],
"node": []
}
},
"per": {
"account": {
"wall_clock": {
"set": false,
"infinite": true,
"number": 0
}
}
}
},
"id": 6,
"is_default": true,
"lineage": "\/tn\/0-tj_tsai@asus.com\/",
"min": {
"priority_threshold": {
"set": false,
"infinite": true,
"number": 0
}
},
"parent_account": "",
"partition": "",
"priority": {
"set": false,
"infinite": true,
"number": 0
},
"qos": [
"normal"
],
"shares_raw": 1,
"user": "tj_tsai@asus.com"
}
],
"meta": {
"plugin": {
"type": "openapi\/slurmdbd",
"name": "Slurm OpenAPI slurmdbd",
"data_parser": "data_parser\/v0.0.43",
"accounting_storage": "accounting_storage\/slurmdbd"
},
"client": {
"source": "slurm-restapi-f844ccc5c-qr5nw:6820(fd:11)",
"user": "nobody",
"group": ""
},
"command": [],
"slurm": {
"version": {
"major": "25",
"micro": "2",
"minor": "11"
},
"release": "25.11.2",
"cluster": "slurm_slurm"
}
},
"errors": [],
"warnings": []
}
```
2. **Modify the association**
```json=
curl -X POST "http://slurm-restapi.slurm.svc.cluster.local:6820/slurmdb/v0.0.43/associations/" \
-H "X-SLURM-USER-NAME: slurm" \
-H "X-SLURM-USER-TOKEN: ${TOKEN}" \
-H "Content-Type: application/json" \
-d '{
"associations": [
{
"user": "tj_tsai@asus.com",
"cluster": "slurm_slurm",
"account": "tn",
"max": {
"jobs": {
"active": {
"set": true,
"number": 2
},
"total": {
"set": true,
"number": 2
}
}
}
}
]
}'
{
"errors": [],
"warnings": [],
"meta": {
"plugin": {
"type": "openapi\/slurmdbd",
"name": "Slurm OpenAPI slurmdbd",
"data_parser": "data_parser\/v0.0.43",
"accounting_storage": "accounting_storage\/slurmdbd"
},
"client": {
"source": "slurm-restapi-f844ccc5c-qr5nw:6820(fd:11)",
"user": "nobody",
"group": ""
},
"command": [],
"slurm": {
"version": {
"major": "25",
"micro": "2",
"minor": "11"
},
"release": "25.11.2",
"cluster": "slurm_slurm"
}
}
}
```
- Query the association again
```json
"max": {
"jobs": {
"active": {
"set": true,
"infinite": false,
"number": 2
},
"accruing": {
"set": false,
"infinite": true,
"number": 0
},
"total": {
"set": true,
"infinite": false,
"number": 2
}
}
}
```
### Important notes
1. **Specify the full association**: `user`, `cluster` and `account` are all needed to identify an association uniquely.
2. **`set` field**: must be `true` to indicate that the field should be updated.
3. **Modify behaviour** (code: `src/slurmrestd/plugins/openapi/slurmdbd/associations.c:250-319`):
- If the association already exists → **modify**
- If the association does not exist → **add**
4. **Permissions**: slurmdbd administrator privileges are required.
5. **Authentication methods**:
- JWT Token: `X-SLURM-USER-TOKEN`
- Basic Auth
- Local auth
### Test example (Python)
See `testsuite/python/tests/test_112_43.py:626-684`:
```python
from openapi_client.models.v0043_openapi_assocs_resp import V0043OpenapiAssocsResp
from openapi_client.models.v0043_assoc import V0043Assoc
from openapi_client.models.v0043_uint32_no_val_struct import V0043Uint32NoValStruct
associations = V0043OpenapiAssocsResp(
associations=[
V0043Assoc(
account="root",
cluster="linux",
user="username",
max=dict(
jobs=dict(
active=V0043Uint32NoValStruct(
set=True,
number=0
),
total=V0043Uint32NoValStruct(
set=True,
number=0
)
),
),
)
]
)
resp = slurmdb.slurmdb_v0043_post_associations(
v0043_openapi_assocs_resp=associations
)
```
### Related files
- API handler: `src/slurmrestd/plugins/openapi/slurmdbd/associations.c:250-319`
- Data structures: `src/plugins/data_parser/v0.0.43/parsers.c:4963-4965`
- OpenAPI documentation: `doc/html/rest_api.shtml`
<br>
---
<br>
## Discussion
### What is the endpoint for creating a partition in the Slurm REST API, and how is it used?
- ### us.anthropic.claude-sonnet-4-5-20250929-v1:0 (2025/12/31)
> codebase: https://github.com/SchedMD/slurm
### **Conclusion: creating a partition through the Slurm REST API**
According to the code, **the Slurm REST API currently does not support creating partitions**.
#### **Existing partition endpoints (query only)**
1. **List all partitions**
- **Endpoint**: `GET /slurm/{data_parser}/partitions/`
Example: [GET /slurm/v0.0.44/partitions/](https://slurm.schedmd.com/rest_api.html#slurmV0044GetPartitions)
- **Purpose**: return information about every partition
- **Implementation**: `src/slurmrestd/plugins/openapi/slurmctld/partitions.c:44`
2. **Query a specific partition**
- **Endpoint**: `GET /slurm/{data_parser}/partition/{partition_name}`
- **Purpose**: return the details of the named partition
- **Implementation**: `src/slurmrestd/plugins/openapi/slurmctld/partitions.c:86`
#### **Query examples**
```bash
# List all partitions
curl -X GET "http://localhost:6820/slurm/v0.0.43/partitions/" \
-H "X-SLURM-USER-NAME: slurm" \
-H "X-SLURM-USER-TOKEN: <token>"
# Query a specific partition (for example: compute)
curl -X GET "http://localhost:6820/slurm/v0.0.43/partition/compute" \
-H "X-SLURM-USER-NAME: slurm" \
-H "X-SLURM-USER-TOKEN: <token>"
```
#### **Key finding**
The code shows that the partition handler explicitly rejects any method other than GET:
```c
// src/slurmrestd/plugins/openapi/slurmctld/partitions.c:51-56
if (ctxt->method != HTTP_REQUEST_GET) {
resp_error(ctxt, ESLURM_REST_INVALID_QUERY, __func__,
"Unsupported HTTP method requested: %s",
get_http_method_string(ctxt->method));
goto done;
}
```
#### **Alternatives for creating or updating a partition**
To create or modify a partition, use one of the following:
1. **Use the `scontrol` command**
```bash
scontrol create PartitionName=newpart Nodes=node[1-10] State=UP
scontrol update PartitionName=newpart MaxTime=24:00:00
```
2. **Edit `slurm.conf` directly**
```
PartitionName=newpart Nodes=node[1-10] Default=NO State=UP
```
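Run `scontrol reconfigure` after editing.
3. **Use the Slurm C API**
- Through the `slurm_update_partition()` function (requires writing a C program)
The Slurm REST API currently exposes partitions as **read-only** resources; create, update and delete operations are not supported.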
---
- ### gpt-5.2 (2025/12/31)
In the **Slurm REST API (slurmrestd / openapi/slurmctld)** there is **currently no REST endpoint for creating a partition** (that is, no `POST /slurm/.../partition` or similar create API). In the Methods list of the official REST API documentation, partitions only have two **query** `GET` methods: `/partition/{partition_name}` and `/partitions/`. ([Slurm][20251231-A-1])
### Partition URLs available in the REST API (query)
Using the version shown in the official documentation (Slurm-25.11.1 / API v0.0.44) as an example: ([Slurm][20251231-A-1])
* **List all partitions**
* `GET /slurm/v0.0.44/partitions/` ([Slurm][20251231-A-1])
* **Query a single partition**
* `GET /slurm/v0.0.44/partition/{partition_name}` ([Slurm][20251231-A-1])
#### curl example (JWT)
slurmrestd's `rest_auth/jwt` requires every request to carry:
* `X-SLURM-USER-NAME: <user name>`
* `X-SLURM-USER-TOKEN: <JSON Web Token>`
The token is usually generated with `scontrol token`. ([Slurm][20251231-A-2])
```bash
TOKEN="$(scontrol token | sed 's/^SLURM_JWT=//')"
curl -sS \
-H "X-SLURM-USER-NAME: $USER" \
-H "X-SLURM-USER-TOKEN: $TOKEN" \
http://<slurmrestd-host>:<port>/slurm/v0.0.44/partitions/
```
```bash
curl -sS \
-H "X-SLURM-USER-NAME: $USER" \
-H "X-SLURM-USER-TOKEN: $TOKEN" \
http://<slurmrestd-host>:<port>/slurm/v0.0.44/partition/debug
```
> The REST API documentation also lists `X-SLURM-USER-NAME` and `X-SLURM-USER-TOKEN` (API key in header) as well as HTTP Basic Auth as supported. ([Slurm][20251231-A-1])
---
### So how is a partition created?
#### Option A: create it dynamically with `scontrol create` (the closest thing to an API-style operation)
`scontrol` supports **create partition**, and the partition fields are specified in the *PARTITIONS - SPECIFICATIONS FOR CREATE AND UPDATE COMMANDS* section of its documentation. ([Slurm][20251231-A-3])
```bash
sudo scontrol create \
PartitionName=debug \
Nodes=node[0-3] \
Default=NO \
MaxTime=INFINITE \
State=UP
```
> The `create <SPECIFICATION>` command of `scontrol` can "Create a new node, partition, or reservation". ([Slurm][20251231-A-3])
#### Option B: write it to `slurm.conf`, then reconfigure (the most conventional and maintainable route)
`slurm.conf` is where settings such as how nodes are grouped into partitions are described. ([Slurm][20251231-A-4])
After updating `slurm.conf`, ask slurmctld to reconfigure:
* The REST API provides:
* `GET /slurm/v0.0.44/reconfigure/` (request slurmctld reconfigure) ([Slurm][20251231-A-1])
```bash
curl -sS \
-H "X-SLURM-USER-NAME: $USER" \
-H "X-SLURM-USER-TOKEN: $TOKEN" \
http://<slurmrestd-host>:<port>/slurm/v0.0.44/reconfigure/
```
---
#### Key takeaways
* **REST API: partitions can only be queried (GET); they cannot be created.** ([Slurm][20251231-A-1])
* To create one: use `scontrol create PartitionName=...` ([Slurm][20251231-A-3]) or edit `slurm.conf` and then `reconfigure` (the REST `/reconfigure/` endpoint can be used for that). ([Slurm][20251231-A-1])
[20251231-A-1]: https://slurm.schedmd.com/rest_api.html "Slurm Workload Manager - "
[20251231-A-2]: https://slurm.schedmd.com/slurmrestd.html "Slurm Workload Manager - slurmrestd"
[20251231-A-3]: https://slurm.schedmd.com/scontrol.html "Slurm Workload Manager - scontrol"
[20251231-A-4]: https://slurm.schedmd.com/slurm.conf.html?utm_source=chatgpt.com "Slurm Workload Manager - slurm.conf"
<br>
{%hackmd vaaMgNRPS4KGJDSFG0ZE0w %}