Some of my time last week was spent researching ways to run beacon node clients on the cloud in the most cost-efficient manner. The primary reason is that I will need to read and stream data reliably, and that the project requires running 3-5 nodes, which can quickly get very expensive.
As a result, I did some research on AWS EC2 architectures that would reduce the RAM required to run the EL and CL clients while also saving on storage costs. This was predominantly driven by the fact that cloud services and infrastructure are highly modular, allowing one to subdivide operations into smaller pieces and run them more cost-efficiently.
My research led me to a simple idea: create an S3 bucket and mount it to the EC2 instance as a folder. This meant that when streaming data from the node through the beacon API, the majority of the CPU work dedicated to read and write operations was offloaded, reducing the overall RAM required to operate. In addition, a cheaper storage type such as an S3 bucket could be used, which costs roughly 6-10x less than an EBS volume.
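For reference, the streaming itself goes through the standard Beacon Node HTTP API. The calls below are only a sketch, assuming a CL client such as Lighthouse serving that API on its default port 5052; the port will differ for other clients.
curl -s "http://localhost:5052/eth/v1/beacon/headers/head" -H "accept: application/json"
curl -N "http://localhost:5052/eth/v1/events?topics=head"
The first call fetches the latest block header; the second subscribes to the server-sent events stream for new head events.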
Once the EL and CL clients were launched, I used the --datadir flag to save the synced data to the S3-mounted folder on the EC2 instance, roughly as sketched below.
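This is only a sketch assuming Geth as the EL client and Lighthouse as the CL client, with the bucket mounted at /mnt/s3-node-data and the JWT secret at /secrets/jwt.hex (both hypothetical paths); adjust the flags and paths to your own setup.
geth --mainnet --datadir /mnt/s3-node-data/geth --http --authrpc.jwtsecret /secrets/jwt.hex
lighthouse bn --network mainnet --datadir /mnt/s3-node-data/lighthouse --execution-endpoint http://localhost:8551 --execution-jwt /secrets/jwt.hex --http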
This went smoothly for the first few days, and I was able to read the streamed data through the beacon API with no apparent performance issues.
However, issues started appearing after downloading over 1 TB of data. Although RAM usage was still well below 50%, reads and writes to the mounted S3 bucket started to slow down as the load increased. I suspect this was because more work was needed to actively parse through the existing data when putting and reading objects while streaming data through the Beacon APIs.
After observing these issues, I decided to go back to the standard setup. However, I suspect that this implementation could work well once geth's experimental --syncmode light mode is fully operational and works with CL clients.
Below are the instructions to mount an S3 bucket to your local Linux/Ubuntu machine or EC2 instance, as it can be very useful for extending your local and EC2 storage and saving on costs in general.
Inside the server
sudo apt install awscli
sudo apt-get install s3fs
aws configure
Type the above in the terminal and provide your IAM user's AWS access key ID, AWS secret access key, and the region where the EC2 instance is deployed. Then sync the folder you want to expose to the bucket:
aws s3 sync <path to the folder you want to give access to> s3://<your newly created bucket name>
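The sync step assumes the S3 bucket already exists. If it does not, one way to create it from the CLI (assuming the same region as the EC2 instance) is:
aws s3 mb s3://<your newly created bucket name> --region <ec2Region>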
echo <yourawskeyid>:<yourawssecretkeyid> > $HOME/.password-s3fs
chmod 600 $HOME/.password-s3fs
sudo s3fs <s3bucketname> <pathToTheFolderYourMountingTo> -o passwd_file=$HOME/.password-s3fs,nonempty,rw,allow_other,mp_umask=002,uid=1000,gid=1000 -o url=http://s3.<ec2Region>.amazonaws.com,endpoint=<ec2Region>,use_path_request_style
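As a quick sanity check that the mount worked, you can confirm the s3fs filesystem is listed and that a test file written into the folder shows up in the bucket (same placeholders as above):
df -h | grep s3fs
echo test > <pathToTheFolderYourMountingTo>/test.txt
aws s3 ls s3://<s3bucketname>/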
To remount the bucket automatically after a reboot, add an entry to /etc/fstab:
cd /etc
sudo cp fstab fstab_bkp
sudo nano fstab
<yourS3bucketName> <pathToTheEC2FolderYourMountingTo> fuse.s3fs _netdev,allow_other 0 0
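You can test the fstab entry without rebooting by unmounting the folder and remounting everything listed in fstab. Note that, depending on your setup, the fstab options may also need a passwd_file=<path> entry so s3fs can find the credentials when mounted at boot:
sudo umount <pathToTheEC2FolderYourMountingTo>
sudo mount -a
df -h | grep s3fs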