# Deepbots

###### tags: `Deepbots`

## Explanation of the deepbots package

## Reward mechanism

```python
# compute reward here
## do not exceed the limit value
if (-2.897 < newObservation[0] < 2.897 and
        -1.763 < newObservation[1] < 1.763 and
        -2.8973 < newObservation[2] < 2.8973 and
        -3.072 < newObservation[3] < -0.0698 and
        -2.8973 < newObservation[4] < 2.8973 and
        -0.0175 < newObservation[5] < 3.7525 and
        -2.897 < newObservation[6] < 2.897):
    # newObservation[-1] is L2norm: reward = 3 if it is small enough, else -L2norm
    reward = 3 if newObservation[-1] < 0.01 else -newObservation[-1]
else:
    # if one of the motors exceeds its limit, reward = -2
    reward = -2
print(reward)
# ------compute reward end------
```

1. If any motor exceeds its limit, 2 points are deducted immediately <font color="red">(reward = -2)</font>.
2. If all motors stay within their limits, the Cartesian coordinates $T$ of the target point TARGET are read with getPosition, and the Cartesian coordinates $E$ of the end-effector are taken from the seven joint angles using IKPY's [forward_kinematics](https://ikpy.readthedocs.io/en/latest/chain.html#ikpy.chain.Chain.forward_kinematics) sliced with `[0:3, 3]`. The default of np.linalg.norm then gives the Frobenius norm of their difference (for a 3-vector this is the Euclidean distance). This parameter is named **$\text{L2norm}_t$**:

    $$\text{L2norm}_t=\lVert T_t - E_t \rVert_F = \left[\sum_{i=1}^{3} \lvert T_{t,i}-E_{t,i}\rvert^2\right]^{\frac{1}{2}}$$

    If $\text{L2norm}_t$ is less than 0.01, 3 points are awarded <font color="red">(reward = 3)</font>. If it is greater than or equal to 0.01, the reward is $-\text{L2norm}_t$ <font color="red">(reward = $-\text{L2norm}_t$)</font>.

(The following term, dL2norm, is not used for now because the senior student from the Electrical Engineering department did not set the reward up this way.)

Within a single trajectory $\tau$, the observation includes the **change in L2norm**. This parameter is named **dL2norm**:

$$\text{dL2norm}_t=\text{L2norm}_t-\text{L2norm}_{t-1}$$

## Obtaining the observation

1. The robot arm sends each motor's rotation angle back to the supervisor, which records seven joint positions in total.
2. The position of TARGET: three parameters (x, y, z).
3. The position of the end-effector: three parameters (x, y, z).
4. L2norm: one parameter.

14 parameters in total.

### Discussion topics

- [ ] Additions and changes to the observation space
    > [motorPosition, targetPosition, endEffectorPosition, L2norm]
- [ ] Reward mechanism
    > Consider whether to add a baseline, and what a reasonable scheme of penalties and bonuses would be
- [ ] Size of the neural network inside the agent
- [ ] Whether the action controls the arm by changing velocity or by changing position
- [ ] Motor limit handling: whether to add a penalty for exceeding the motor limits, or simply ignore in code whatever exceeds them
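
As a rough illustration of the 14-parameter observation listed above ([motorPosition, targetPosition, endEffectorPosition, L2norm]), the following is a minimal sketch of how the vector could be assembled with NumPy. The helper name `build_observation` and its arguments are hypothetical and are not part of deepbots or of the original controller. The sketch assumes the seven motor positions, the TARGET position, and the end-effector position have already been obtained elsewhere.

```python
import numpy as np


def build_observation(motor_positions, target_position, end_effector_position):
    """Hypothetical helper: assemble the 14-element observation vector.

    Layout: 7 motor positions, 3 target coordinates,
    3 end-effector coordinates, 1 L2norm.
    """
    motors = np.asarray(motor_positions, dtype=float)
    target = np.asarray(target_position, dtype=float)
    effector = np.asarray(end_effector_position, dtype=float)

    if motors.shape != (7,):
        raise ValueError("expected 7 motor positions")
    if target.shape != (3,) or effector.shape != (3,):
        raise ValueError("expected (x, y, z) for target and end-effector")

    # Default np.linalg.norm on a 1-D array is the Euclidean (L2) norm.
    l2norm = np.linalg.norm(target - effector)

    return np.concatenate([motors, target, effector, [l2norm]])


# Example values are made up purely for illustration.
obs = build_observation([0.0] * 7, [0.5, 0.0, 0.5], [0.5, 0.0, 0.2])
print(len(obs))  # 14 entries; obs[-1] is the L2norm used by the reward
```

Placing L2norm last matches the reward code, which reads it as `newObservation[-1]`.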