# CLD Lab 02: App scaling on IaaS
Authors: Kevin Auberson and Léo Zmoos - L2GrR
Date: April 7, 2024
## TASK 1: CREATE A DATABASE USING THE RELATIONAL DATABASE SERVICE (RDS)
> Copy the estimated monthly cost for the database and add it to your report.
RDS Pricing:
- db.t3.micro = $0.017/h
- gp2 storage = $0.115/GB-month
Monthly cost calculation:
- 20 GB gp2 storage = 20 * 0.115 = $2.30
- db.t3.micro = 24 * 30 * 0.017 = $12.24

**Total monthly cost = 2.30 + 12.24 = $14.54**
> Compare the costs of your RDS instance to a continuously running EC2 instance of the same instance type to see how much AWS charges for the extra functionality.
RDS monthly cost = $14.54

EC2 pricing:
- t3.micro = $0.0104/h
- gp2 storage = $0.10/GB-month

Monthly cost calculation:
- t3.micro = 24 * 30 * 0.0104 = $7.488
- 20 GB gp2 storage = 20 * 0.10 = $2.00

EC2 total monthly cost = 7.488 + 2.00 = $9.488

Difference between RDS and EC2: 14.54 - 9.488 = $5.052

AWS charges about $5 per month extra for the extra functionality.
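
The arithmetic above can be checked with a short script. This is only a sketch: it uses the hourly and storage rates quoted in this report and the same 30-day month, and the names `monthly_cost`, `rds` and `ec2` are illustrative.

```python
HOURS_PER_MONTH = 24 * 30  # 30-day month, as in the calculation above


def monthly_cost(hourly_rate, storage_gb, storage_rate_per_gb):
    """Return the monthly cost in USD of one instance plus its storage."""
    return HOURS_PER_MONTH * hourly_rate + storage_gb * storage_rate_per_gb


# Rates taken from the pricing listed in this report.
rds = monthly_cost(0.017, 20, 0.115)   # db.t3.micro + 20 GB gp2 (RDS rates)
ec2 = monthly_cost(0.0104, 20, 0.10)   # t3.micro + 20 GB gp2 (EBS rates)

print(f"RDS: ${rds:.2f}  EC2: ${ec2:.2f}  difference: ${rds - ec2:.2f}")
```

Changing the storage size or the rates makes it easy to see how the gap between RDS and EC2 evolves.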
> In a two-tier architecture the web application and the database are kept separate and run on different hosts. Imagine that for the second tier instead of using RDS to store the data you would create a virtual machine in EC2 and install and run yourself a database on it. If you were the Head of IT of a medium-size business, how would you argue in favor of using a database as a service instead of running your own database on an EC2 instance? How would you argue against it?
#### Arguments in favor of RDS:
RDS provides automatic backups and encryption, and AWS automation takes over most of the
responsibility for the database's configuration, management, maintenance, and security.
It also lets users configure read replicas or synchronous replication across multiple
Availability Zones (AZs) for improved performance and availability. Deploying a database
in multiple AZs with multiple standbys by hand is time-consuming, so RDS can be a huge
time saver. The price of this convenience is a higher cost than a plain EC2 instance.
#### Arguments in favor of EC2 with DB engine:
Running our own database on EC2 gives more flexibility and finer control over various aspects of the system. For instance, users can choose a specific OS in a specific version, or use EBS RAID and striping configurations for higher performance. EC2 is the only option if a user wants to run an older DB engine version that RDS does not support, and it can be preferable when data access time and bandwidth are critical. It could also be a cheaper choice for test/dev DB environments that do not need to be in production.
In conclusion, the decision between RDS and EC2 with a DB engine heavily depends on
the required flexibility of the database, the nature of the data stored, and the
performance needed for the application.
> Copy the endpoint address of the database into the report.
`grr-zmoos-wordpress-db.crsk2uw660uh.us-east-1.rds.amazonaws.com`

-------
## TASK 2: CONFIGURE THE WORDPRESS MASTER INSTANCE TO USE THE RDS DATABASE
> Copy the part of /var/www/html/wp-config.php that configures the database into the report.
``` php
// ** Database settings - You can get this info from your web host ** //
/** The name of the database for WordPress */
define( 'DB_NAME', 'grr-zmoos-wordpress-db' );
/** Database username */
define( 'DB_USER', 'admin' );
/** Database password */
define( 'DB_PASSWORD', 'dCU=U!dZjqB)2)-' );
/** Database hostname */
define( 'DB_HOST', 'grr-zmoos-wordpress-db.crsk2uw660uh.us-east-1.rds.amazonaws.com' );
```
-------
## TASK 3: CREATE A CUSTOM VIRTUAL MACHINE IMAGE
> Copy a screenshot of the AWS console showing the AMI parameters into the report.

AMI parameters

--------
## TASK 4: CREATE A LOAD BALANCER
> On your local machine resolve the DNS name of the load balancer into an IP address using the nslookup command (works on Linux, macOS and Windows). Write the DNS name and the resolved IP Address(es) into the report.
DNS Name: GrR-Zmoos-LoadBalancer-741345188.us-east-1.elb.amazonaws.com

Screenshot of the nslookup result
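
The same lookup can be scripted. The sketch below uses Python's standard `socket` module; the helper name `resolve_ipv4` is illustrative and not part of the lab.

```python
import socket


def resolve_ipv4(hostname):
    """Return the sorted, de-duplicated IPv4 addresses that `hostname` resolves to."""
    infos = socket.getaddrinfo(
        hostname, None, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    # Each entry is (family, type, proto, canonname, sockaddr);
    # for IPv4, sockaddr is an (address, port) tuple.
    return sorted({info[4][0] for info in infos})
```

While the load balancer exists, calling `resolve_ipv4("GrR-Zmoos-LoadBalancer-741345188.us-east-1.elb.amazonaws.com")` should list the same addresses as `nslookup`.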
> In the Apache access log identify the health check accesses from the load balancer and copy some samples into the report.

Log with HealthChecker
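
As a sketch of how these entries can be picked out of the log automatically: health checks from an Application Load Balancer identify themselves with a user agent beginning with `ELB-HealthChecker`. The code assumes Apache's combined log format, where the user agent is the last quoted field. The two sample lines are hypothetical (made-up addresses, timestamps and sizes) and are not copied from our log.

```python
import re

# In the combined log format, the user agent is the last double-quoted field.
USER_AGENT_RE = re.compile(r'"([^"]*)"\s*$')


def is_health_check(line):
    """Return True if the line's user agent starts with 'ELB-HealthChecker'."""
    match = USER_AGENT_RE.search(line)
    return bool(match) and match.group(1).startswith("ELB-HealthChecker")


def health_check_lines(lines):
    """Keep only the log lines produced by load balancer health checks."""
    return [line for line in lines if is_health_check(line)]


# Hypothetical sample lines, for illustration only.
SAMPLE_LOG = [
    '10.0.0.12 - - [07/Apr/2024:10:00:00 +0000] "GET / HTTP/1.1" 200 5120 "-" "ELB-HealthChecker/2.0"',
    '203.0.113.7 - - [07/Apr/2024:10:00:03 +0000] "GET /wp-login.php HTTP/1.1" 200 2048 "-" "Mozilla/5.0"',
]

for line in health_check_lines(SAMPLE_LOG):
    print(line)
```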
-----
## TASK 5: LAUNCH A SECOND INSTANCE FROM THE CUSTOM IMAGE
> Using the custom virtual machine image you created earlier launch a second instance.

First, we choose a name for the instance.

Then we select the OS image, in this case the custom image we created in Task 3.

For the instance type and the login key pair, we use the same ones as in lab 1.

For the network, we use the same security group as the other instance.

Finally, we review the summary of our configuration.
> Using the AWS console connect the instance to the load balancer. Watch the status of the instance go from Out of Service to In Service.

Adding instance in load balancer
> Using any of the instances, run the wp search and replace tool to replace the old IP address with the load balancer’s DNS name in the database.
``` bash
php wp-cli.phar search-replace '52.54.125.85' 'GrR-Auberson-LoadBalancer-1340384143.us-east-1.elb.amazonaws.com' --path=/var/www/html/ --skip-columns=guid
```
> Make sure that you can access the Wordpress post using the load balancer’s DNS name.

Access to the page we created, using the load balancer's DNS name
> Draw a diagram of the setup you have created showing the components (instances, database, load balancer, client) and how they are connected. Include the security groups as well. Make sure to show every time a packet is filtered.

Diagram of the setup
> Calculate the monthly cost of this setup. You can ignore traffic costs.

- RDS at 100% utilization per month
- EC2 constant usage at 100% utilization per month
- Load balancing with 20 GB/month

The monthly cost is 62.95 USD.

----
## TASK 5B: DELETE AND RE-CREATE THE LOAD BALANCER USING THE COMMAND LINE INTERFACE
> Put the commands to delete the load balancer, re-create the load balancer and re-create the listener into the report.
Commands:
``` bash
# Delete the load balancer
aws elbv2 delete-load-balancer --load-balancer-arn arn:aws:elasticloadbalancing:us-east-1:851725581851:loadbalancer/app/GrR-Zmoos-LoadBalancer/7690fa0048c76b6c
# Re-create the load balancer
aws elbv2 create-load-balancer --name GrR-Zmoos-LoadBalancer --subnets subnet-083708276b8956c3a subnet-0a2ab628966261f50 --security-groups sg-00c86e0432dc272fa --scheme internet-facing --type application --ip-address-type ipv4
# Re-create the listener (uses the ARN returned by the create-load-balancer command)
aws elbv2 create-listener --load-balancer-arn arn:aws:elasticloadbalancing:us-east-1:851725581851:loadbalancer/app/GrR-Zmoos-LoadBalancer/c281b7937c53b112 --protocol HTTP --port 80 --default-actions Type=forward,TargetGroupArn=arn:aws:elasticloadbalancing:us-east-1:851725581851:targetgroup/GrR-Zmoos-TargetGroup/727c86fe818601bb
```
```
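
The three commands can also be chained in a script so that the new load balancer ARN does not have to be copied by hand into the listener command. The sketch below is an assumption on our side, not part of the lab: it expects an ELBv2 client such as the one returned by `boto3.client("elbv2")` (boto3 is not used elsewhere in this report), and the function name `recreate_load_balancer` is illustrative. It would be called with the ARNs, subnets and security group shown in the commands above.

```python
def recreate_load_balancer(elbv2, old_lb_arn, name, subnets, security_groups,
                           target_group_arn):
    """Delete the old ALB, create a new one, attach an HTTP listener.

    `elbv2` is expected to behave like boto3's ELBv2 client.
    Returns the ARN of the newly created load balancer.
    """
    elbv2.delete_load_balancer(LoadBalancerArn=old_lb_arn)

    created = elbv2.create_load_balancer(
        Name=name,
        Subnets=subnets,
        SecurityGroups=security_groups,
        Scheme="internet-facing",
        Type="application",
        IpAddressType="ipv4",
    )
    # The new ARN differs from the deleted one, so read it from the response.
    new_lb_arn = created["LoadBalancers"][0]["LoadBalancerArn"]

    elbv2.create_listener(
        LoadBalancerArn=new_lb_arn,
        Protocol="HTTP",
        Port=80,
        DefaultActions=[{"Type": "forward", "TargetGroupArn": target_group_arn}],
    )
    return new_lb_arn
```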
-----
## TASK 6: TEST THE DISTRIBUTED APPLICATION
> Document your observations. Include reports and graphs of the load testing tool and the AWS console monitoring output.
After running the Vegeta tool once:
Master instance:


Second instance:


Vegeta plot:
Duration: 60s

Duration: 120s

#### Observations:
Before we started testing with Vegeta, when we ran the command `sudo tail -F /var/log/apache2/access.log` on the second instance, the log file was empty because at that point all incoming requests were directed to the master instance.
This behaviour changed during the tests with Vegeta: the load balancer forwarded part of the requests to the second instance, which avoids overloading the first.
The two AWS graphs clearly show that the load on the first instance decreased as soon as the second instance started processing requests too.
This indicates that our load balancer is working properly.
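
To illustrate why both instances end up with a similar share of the requests, here is a minimal simulation. It assumes the target group uses round robin, which is the default routing algorithm for Application Load Balancers. The target labels `master` and `second` are only placeholders for our two instances.

```python
from collections import Counter
from itertools import cycle


def distribute_round_robin(num_requests, targets):
    """Assign requests to targets in strict rotation and count them per target."""
    rotation = cycle(targets)
    return Counter(next(rotation) for _ in range(num_requests))


counts = distribute_round_robin(1000, ["master", "second"])
print(counts)
```

With two healthy targets, each one receives half of the requests, which matches the drop in load seen on the master instance once the second instance starts serving traffic.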
> When you resolve the DNS name of the load balancer into IP addresses what do you see? Explain.

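We can see two different IP addresses (A records) associated with the DNS name. The load balancer was created in two subnets, and AWS places a load balancer node in each of them, so the DNS name resolves to one IP address per node.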
> Did this test really test the load balancing mechanism? What are the limitations of this simple test? What would be necessary to do realistic testing?
Our current setup does exercise the load balancing mechanism of our Application Load Balancer, since traffic is distributed among the EC2 instances, but the test has its limitations. This basic test overlooks the diversity of traffic patterns in real-world scenarios. To fully test the system's performance, we would need a more varied set of tests that accurately simulates such scenarios.

To perform realistic tests, we could use different types of requests, vary the request frequency, and simulate different users. Furthermore, we could test the system under different conditions, such as peak-hour traffic.
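
As a first step towards more varied traffic, a targets file for Vegeta could be generated with a weighted mix of URLs instead of a single one. The sketch below is only an illustration: the paths and their weights are hypothetical and would need to match the real site, and the function name `build_targets` is our own.

```python
import random


def build_targets(base_url, weighted_paths, count, seed=0):
    """Return `count` Vegeta target lines ('GET <url>'), sampled by weight."""
    rng = random.Random(seed)  # fixed seed so the generated file is reproducible
    paths = [path for path, _ in weighted_paths]
    weights = [weight for _, weight in weighted_paths]
    chosen = rng.choices(paths, weights=weights, k=count)
    return [f"GET {base_url}{path}" for path in chosen]


# Hypothetical mix of WordPress pages (path, relative weight).
PATHS = [("/", 5), ("/?p=1", 3), ("/wp-login.php", 1), ("/?s=cloud", 1)]

targets = build_targets(
    "http://GrR-Zmoos-LoadBalancer-741345188.us-east-1.elb.amazonaws.com",
    PATHS,
    20,
)
print("\n".join(targets))
```

Saving the output to a file such as `targets.txt` would allow running, for example, `vegeta attack -targets=targets.txt -rate=50 -duration=60s | vegeta report`. The rate and duration here are arbitrary values and should be varied to simulate quiet and peak periods.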