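# Distributed Data Parallel (DDP)

###### tags: `PyTorch`

1. `ERROR:torch.distributed.elastic.multiprocessing.api:failed` appears and the screen session cannot be closed
    * If the job was launched inside screen, use `screen -list` to find the session's PID, then run `kill <PID>`.
        * https://www.baeldung.com/linux/kill-detached-screen-session
    * If a port is still occupied, use `netstat -ano -p tcp | grep <port>` to find the PID of the process holding it, then run `kill -9 <PID>`.
2. Effective batch size
    * When using `DistributedSampler`, the batch size you specify is the input batch size for each individual GPU. The effective batch size seen by the neural network is that value multiplied by the number of GPUs, `nprocs`.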
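
The batch size arithmetic above can be sketched in plain Python. This is a minimal illustration, not part of PyTorch: the helper names `effective_batch_size` and `steps_per_epoch` and the example numbers are assumptions made for this note. The step count assumes the default `drop_last=False` for both `DistributedSampler` and `DataLoader`, in which case each rank receives `ceil(dataset_size / nprocs)` samples.

```python
import math


def effective_batch_size(per_gpu_batch_size: int, nprocs: int) -> int:
    """Batch size the model effectively trains with across all GPUs."""
    # Each of the nprocs processes consumes its own per-GPU batch every step.
    return per_gpu_batch_size * nprocs


def steps_per_epoch(dataset_size: int, per_gpu_batch_size: int, nprocs: int) -> int:
    """Optimizer steps per epoch on each rank (assumes drop_last=False)."""
    # DistributedSampler gives every rank an equal share, padding if needed.
    samples_per_rank = math.ceil(dataset_size / nprocs)
    # The DataLoader then splits that share into per-GPU batches.
    return math.ceil(samples_per_rank / per_gpu_batch_size)


if __name__ == "__main__":
    # Hypothetical setup: 4 GPUs, batch size 32 per GPU, 10,000 samples.
    print(effective_batch_size(32, 4))     # 128
    print(steps_per_epoch(10_000, 32, 4))  # 79
```

If a hyperparameter such as the learning rate was tuned for a single-GPU batch size, it may be worth revisiting once the effective batch size grows with `nprocs`.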