# Community dRNG summary

@Community dRNG Committee

# Summary of the setup attempts

### First Setup Meeting (23.11.2020)[*[1]*](https://discord.com/channels/397872799483428865/779856362074931260/780488141454180422)

| Participants `9` [*[2]*](https://discord.com/channels/397872799483428865/779856362074931260/780488141454180422) | Success |
| ------------------- | ------------------ |
| Hanspetzer (Leader) | :white_check_mark: |
| Wellwho | :white_check_mark: |
| Critical | :white_check_mark: |
| Daniel Stricker 1 | :white_check_mark: |
| Daniel Stricker 2 | :white_check_mark: |
| Steve K | :white_check_mark: |
| ghadj | :x: |
| MyMonk089 | :x: |
| niels | :x: |

During the initial meeting the group reviewed their settings and coordinated the start of all the dRNG services. The first trial resulted in some successful rounds of randomness. Three members were not communicating right from the start, which left six nodes, still above the threshold of 5, so the dRNG group kept functioning and everything looked successful from the leader's side (Hanspetzer).

**Problems:**

| Name | Problem [*[3]*](https://discord.com/channels/397872799483428865/779856362074931260/780917711383035914) |
| --------- | ------- |
| ghadj | The dRNG keys were created using the wrong port, so the leader's node was not able to reach him. |
| niels12 | The keys were created with TLS enabled, so the node was using the wrong protocol. |
| MyMonk089 | General timeout issues caused by unidentified network delays prevented proper communication. They had resolved by the next group setup meeting. Perhaps this was related to [the timeout problem](#1-Timeout). |

**Lessons Learned:**

| # | Lesson Learned |
| - | - |
| 1 | Reviewing the settings of all participants is key, as many settings can cause a failure. |
| 2 | Standardizing the configuration was good, but issues still occurred because individual users needed specific ports, or because the necessary steps differ between running via Docker and building from source. |
| 3 | |

### Second Setup Meeting (29.11.2020)[*[4]*](https://discord.com/channels/397872799483428865/779856362074931260/781076126801854484)

| Participants `10` | Success |
| ------------------- | ------------------ |
| Hanspetzer (Leader) | :white_check_mark: |
| Wellwho | :white_check_mark: |
| Critical | :white_check_mark: |
| Daniel Stricker 1 | :white_check_mark: |
| Daniel Stricker 2 | :white_check_mark: |
| Steve K | :white_check_mark: |
| Carpincho Dem | :white_check_mark: |
| MyMonk089 | :white_check_mark: |
| niels | :white_check_mark: |
| ghadj | :x: |

**Problems:**

| Name | Problem |
| ------------- | ------- |
| ghadj | Problems generating the group and distributed key file because of the script [*[5]*](https://discord.com/channels/397872799483428865/779856362074931260/782672482729328661) [*[6]*](https://discord.com/channels/397872799483428865/779856362074931260/782673800574533663) |
| Carpincho Dem | VPS was accidentally deleted after one week :grimacing: [*[7]*](https://discord.com/channels/397872799483428865/779856362074931260/785183639931584522) |

### 3.

Following MyMonk089's initiative [*[8]*](https://discord.com/channels/397872799483428865/779856362074931260/787009729175486484), the committee decided to restart. Because the members were already experienced enough, they did the restart without a meeting this time, on the 14.12.2020 [*[9]*](https://discord.com/channels/397872799483428865/779856362074931260/787679521737539654).
This time Luca also wanted to join [*[10]*](https://discord.com/channels/397872799483428865/779856362074931260/787740757317779477). Hanspetzer wrote a short guide and the restart process was started [*[11]*](https://discord.com/channels/397872799483428865/779856362074931260/788047122380619786). Three members had problems with `i/o timeout`:

**Problems:**

| Name | Problem [*[12]*](https://discord.com/channels/397872799483428865/779856362074931260/788118300470935584) |
| ------------- | ------- |
| Luca | Used the wrong port while generating the long-term keys (LTKs) [*[13]*](https://discord.com/channels/397872799483428865/779856362074931260/788120737814544464) |
| ghadj | Not sure |
| Carpincho Dem | As we discovered later, wrong ID and also wrong IP [*[14]*](https://discord.com/channels/397872799483428865/779856362074931260/788140202227925003) [*[15]*](https://discord.com/channels/397872799483428865/779856362074931260/788158145985249281) |

### 4.

Because of the above problems, another restart was done the same day [*[16]*](https://discord.com/channels/397872799483428865/779856362074931260/788124582095749120).

**Problems:**

| Name | Problem [*[17]*](https://discord.com/channels/397872799483428865/779856362074931260/788134509995753482) [*[18]*](https://discord.com/channels/397872799483428865/779856362074931260/788134596360273930) |
| ------------- | ------- |
| Luca | Not sure |
| Carpincho Dem | Wrong ID and IP [*[14]*](https://discord.com/channels/397872799483428865/779856362074931260/788140202227925003) [*[15]*](https://discord.com/channels/397872799483428865/779856362074931260/788158145985249281) |

#### 4.1

Some members were already offline.
The rest decided to do some more tests with a smaller committee to prepare for the next day [*[19]*](https://discord.com/channels/397872799483428865/779856362074931260/788145196901597196).

| Participants `6` | Success |
| ------------------- | ------------------ |
| Hanspetzer (Leader) | :white_check_mark: |
| Daniel Stricker 1 | :white_check_mark: |
| Daniel Stricker 2 | :white_check_mark: |
| ghadj | :white_check_mark: |
| Luca | :white_check_mark: |
| Carpincho Dem | :x: |

**Problems:**

| Name | Problem |
| ------------- | ------- |
| Carpincho Dem | Wrong IP [*[15]*](https://discord.com/channels/397872799483428865/779856362074931260/788158145985249281) |

#### 4.2

Next restart [*[20]*](https://discord.com/channels/397872799483428865/779856362074931260/788153003911282698).

**Problems:**

| Name | Problem [*[21]*](https://discord.com/channels/397872799483428865/779856362074931260/788157731500064789) |
| ------------- | ------- |
| Carpincho Dem | Wrong IP [*[15]*](https://discord.com/channels/397872799483428865/779856362074931260/788158145985249281) |

#### 4.3

Last restart on the 14.12.2020 [*[22]*](https://discord.com/channels/397872799483428865/779856362074931260/788158965770879007) (another restart was done, but only because of a small mistake by ghadj). It succeeded.

### 5.
The next full restart was done on the 15.12.2020 [*[23]*](https://discord.com/channels/397872799483428865/779856362074931260/788332472475058196).

| Participants `11` | Success |
| ------------------- | ------------------ |
| Hanspetzer (Leader) | :white_check_mark: |
| Wellwho | :white_check_mark: |
| Critical | :white_check_mark: |
| Daniel Stricker 1 | :white_check_mark: |
| Daniel Stricker 2 | :white_check_mark: |
| Steve K | :white_check_mark: |
| Carpincho Dem | :white_check_mark: |
| MyMonk089 | :white_check_mark: |
| niels | :white_check_mark: |
| ghadj | :x: |
| Luca | :x: |

**Problems**

| Name | Problem [*[24]*](https://discord.com/channels/397872799483428865/779856362074931260/788458142249123840) |
| ----- | ------- |
| ghadj | `error from dkg: node can only process justifications after processing responses` |
| Luca | `error from dkg: node can only process justifications after processing responses` |

### 6.

Next restart [*[25]*](https://discord.com/channels/397872799483428865/779856362074931260/788474960532078594).

**Problems**

| Name | Problem |
| ----- | ------- |
| Luca | `error from dkg: node can only process justifications after processing responses` [*[26]*](https://discord.com/channels/397872799483428865/779856362074931260/788479702583083068) |

I did some research with Luca and Hans on the follow and reshare commands and the timeout flag [*[27]*](https://discord.com/channels/397872799483428865/779856362074931260/788485262128447519). The follow command worked perfectly. They decided to do a reshare the next day.

### 7.

On the 16.12.2020 they tried to add Luca with a reshare [*[28]*](https://discord.com/channels/397872799483428865/779856362074931260/788805047999660094).


| Participants `11` | Success |
| ------------------- | ------------------ |
| Hanspetzer (Leader) | :white_check_mark: |
| Wellwho | :white_check_mark: |
| Critical | :white_check_mark: |
| Daniel Stricker 1 | :white_check_mark: |
| Daniel Stricker 2 | :white_check_mark: |
| Carpincho Dem | :white_check_mark: |
| MyMonk089 | :white_check_mark: |
| niels | :white_check_mark: |
| Luca | :white_check_mark: |
| ghadj 1 | :white_check_mark: |
| ghadj 2 | :x: |

**Problems**

Steve K decided to leave the committee because he did not have enough time. To keep the member count at 11, ghadj set up a new node. But ghadj had some network problems, so the reshare had to be redone with 10 members. The reshare probably also failed for other reasons, which I explain further in the [Conclusion](#Conclusion).

### 8.

Next reshare attempt [*[29]*](https://discord.com/channels/397872799483428865/779856362074931260/788844751754625034).

| Participants `10` | Success |
| ------------------- | ------------------ |
| Hanspetzer (Leader) | :white_check_mark: |
| Wellwho | :white_check_mark: |
| Critical | :white_check_mark: |
| Daniel Stricker 1 | :x: |
| Daniel Stricker 2 | :x: |
| Carpincho Dem | :white_check_mark: |
| MyMonk089 | :white_check_mark: |
| niels | :white_check_mark: |
| Luca | :x: |
| ghadj 1 | :white_check_mark: |

**Problems**

Luca's node and both of Daniel's nodes failed [*[30]*](https://discord.com/channels/397872799483428865/779856362074931260/788851753092907058).

### 9.

The group decided to do a last "normal" restart, but this time with the `--timeout` flag set to `30s` [*[31]*](https://discord.com/channels/397872799483428865/779856362074931260/788859415632674889)[*[32]*](https://discord.com/channels/397872799483428865/779856362074931260/788907502220869675).
(default: `10s`)

I also did a dRNG test setup with 5 nodes and added another node to the running committee with the reshare and follow commands, which worked fine [*[33]*](https://discord.com/channels/397872799483428865/779856362074931260/788907502220869675). Everyone ran the share command with the mentioned timeout flag [*[34]*](https://discord.com/channels/397872799483428865/779856362074931260/789140441349619774), and they had a working committee a short time later [*[35]*](https://discord.com/channels/397872799483428865/779856362074931260/789164264819916820).

## Conclusion

This conclusion is written by me without knowing how dRNG will be implemented in GoShimmer. It should be considered a collection of potentially useful commands and flags as well as ideas from the community.

The dRNG setup process is complicated mainly because all members use different setup methods (Docker with different configs, binary, service...) and because, in the community example, people come from different time zones. So it took us longer than expected. We had most of the problems with Luca, who hosts his node in Canada, so I did a little research on that. Together with Hanspetzer, who looked at the reshare, we learned a lot about drand:

### 1. Timeout

The `--timeout` flag of the share command seems to change the phase transition timeout. I'm not entirely sure about that, because the flag's default is 10s, while the DKG timeout described [here](https://drand.love/docs/specification/#phase-transitions) is 30s. This may only be a mistake, or I may have misunderstood something. Without a higher timeout, some nodes time out right before entering the `JustificationPhase` and the DKG process fails.

Conclusion: `--timeout 30s` should probably be added to the share command.

### 2. Reshare

With the [reshare](https://drand.love/operator/deploy/#updating-drand-group) process, nodes can be added to or removed from a running committee without interrupting it, all while keeping the distributed public key. Ideally, the process for adding new members would be:

The Coordinator/Leader starts the reshare process with:

```
drand share --leader --transition --secret mysecret901234567890123456789012 --nodes 15 --threshold 10 --out group2.toml
```

If new members have a lot of rounds to catch up on, they should now use the follow command:

```
drand follow --sync-nodes <coordinator> --chain-hash <chain hash>
```

This ensures that the new members are up to date while the old committee keeps working in the meantime. After all members are up to date, they can execute the share command: old members with the `--transition` flag, new members with `--from` pointing to the current group file.

Old members:

```
drand share --connect <coordinator> --transition --secret mysecret901234567890123456789012 --out group2.toml
```

New members:

```
drand share --connect <coordinator> --from group.toml --secret mysecret901234567890123456789012 --out group2.toml
```

We will need to do some additional tests, but my first attempt to add another node to a 5-node network worked fine. I was only a bit confused that an additional group.toml file was generated. I thought it would use the group2.toml file, but it made a copy of it...

### 3. Questions and Tips from the Community:

#### Dr.Electron

We should add some more commands, so that people don't have to check their browser. Checking the browser only adds room for more errors.

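
As one possible way to act on this suggestion (a sketch, not something taken from the Discord discussion), the lookup could be wrapped in a small helper that needs neither GNU grep nor jq. The name `json_field` and the sample values are hypothetical, and the helper assumes the field appears as `"name":"value"` with no whitespace around the colon.

```shell
#!/bin/sh
# json_field is a hypothetical helper, not part of GoShimmer or drand.
# It reads JSON from stdin and prints the value of the string field named in $1.
# Assumption: the field looks like "name":"value", with no whitespace around
# the colon and no escaped quotes inside the value.
json_field() {
    sed -n 's/.*"'"$1"'":"\([^"]*\)".*/\1/p'
}

# Canned sample response standing in for the output of a node's /info endpoint
# (the values are made up for illustration).
sample='{"identityID":"abc123","publicKey":"xyz789"}'

printf '%s\n' "$sample" | json_field publicKey
printf '%s\n' "$sample" | json_field identityID
```

With a running node, the input would come from `curl -s localhost:8080/info` instead of the canned sample. The helper prints nothing when the field is missing, so an empty result is a hint that the port or the field name is wrong.
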

GoShimmer Public Key

```
# Option 1: GNU grep (needs -P support)
curl -s localhost:8080/info | grep -Po '(?<="publicKey":")[^"]*'
# Option 2: jq
curl -s localhost:8080/info | jq -r .publicKey
```

GoShimmer identityID

```
# Option 1: GNU grep (needs -P support)
curl -s localhost:8080/info | grep -Po '(?<="identityID":")[^"]*'
# Option 2: jq
curl -s localhost:8080/info | jq -r .identityID
```

dRNG Public Key

```
# Option 1: GNU grep (needs -P support)
curl -s localhost:8081/info | grep -Po '(?<="public_key":")[^"]*'
# Option 2: jq
curl -s localhost:8081/info | jq -r .public_key
```

#### MyMonk089

Is it possible to make the GoShimmer config more general? If the size of the committee is changed with a reshare, the distPubKey stays the same, but the list of committee members would need to be changed. That adds a lot of complications, because you always need to inform the Pollen network about the change.

#### hanspetzer

What was a little confusing to me in the beginning were the private and public listen ports. If you don't use docker-compose, it's not clear from the wiki which port you have to use for the long-term key generation. I guess the best thing for the future would be to rename them to drng_gossip and drng_api or something like that, and to make it clear in the wiki which one to use for the long-term key and which port has to be open for drng to function.

#### Niels

It would be easier if drng were just a plugin in GoShimmer that you can enable, but maybe that's already planned.

#### Dave :hammer_and_wrench: