# AWS Solutions Architect - Associate

---

## Chapter 1: Exam Blueprint

The AWS Solutions Architect - Associate exam:

```
1. 130 minutes
2. 60 multiple-choice questions
3. Scenario-based questions
4. 70% to pass the exam
5. Qualification valid for 2 years
```

---

## Chapter 2: Overview

### a) AWS high-level services

![](https://i.imgur.com/7Ei3Gdw.png)

### b) Regions, Availability Zones (AZ) and Edge Locations

> 1. A Region is a physical location in the world, e.g. Singapore. Each region has 2 or more AZs.
> 2. An AZ can be a data center or a cluster of data centers that are close to one another.
> 3. Edge Locations are endpoints for AWS which are used for caching content. Typically this consists of CloudFront, Amazon's CDN. There are many more edge locations than regions.

### c) What does one need to know to pass the Solutions Architect exam

> **Covered areas**
> ![](https://i.imgur.com/T9GNqbh.png)

> **Must know**
> ![](https://i.imgur.com/2Mr4fcd.png)

### d) Chapter 2 quiz

```
1. What is an AWS VPC?
   Ans: Virtual Private Cloud
2. An AWS VPC is a component of which group of AWS services?
   Ans: Network Services
3. What does an AWS Region consist of?
   Ans: A geographic area
4. What is an Availability Zone?
   Ans: A distinct location within an AWS region that is engineered to be isolated from failures.
5. How do the counts of regions, AZs and edge locations compare?
   Ans: #Edge Locations > #AZs > #Regions
6. CloudFront content is cached at?
   Ans: Edge Locations
```

---

## Chapter 3: Identity Access Management (IAM)

*AWS IAM, not Digibak IAM*

### a) What is IAM

> IAM allows one to manage users and their level of access to the AWS Console.

IAM key features:

```
1. Centralised control of one's AWS account
2. Shared access to one AWS account
3. Granular permissions
4. Identity Federation
5. Multi-factor authentication
6. Provides temporary access for users/devices and services where necessary
7. Password rotation policy
8. Integrates with many different AWS services
9. Supports PCI DSS compliance
```

IAM key terminology:

```
1. Users: end users such as people, employees of an organization
2. Groups: a collection of users; each user in the group will inherit the permissions of the group.
3. Policies: policies are made up of JSON and they give permissions as to what a user/group/role is able to do.
4. Roles: roles are created and assigned to AWS resources
```

### b) IAM Lab

* *anything done in the IAM service is global, i.e. not region-specific*

1. Step 1: log in to the AWS console: [console link](https://signin.aws.amazon.com/signin?redirect_uri=https%3A%2F%2Fconsole.aws.amazon.com%2Fconsole%2Fhome%3Fstate%3DhashArgs%2523%26isauthcode%3Dtrue&client_id=arn%3Aaws%3Aiam%3A%3A015428540659%3Auser%2Fhomepage&forceMobileApp=0&code_challenge=wtzBTw0GM-Q3YRfMqLikBsLPTV3H9GcLOa5L498Nu8g&code_challenge_method=SHA-256)
2. Step 2: go to the IAM service under Security, Identity and Compliance
3. Step 3: check out the IAM service ![](https://i.imgur.com/UmxejjQ.png)
4. Make a new IAM user ![](https://i.imgur.com/u4fRjFx.png) ![](https://i.imgur.com/Fchn9OV.png)
5. Create a group ![](https://i.imgur.com/jUqR8jy.png)
6. Apply a password policy ![](https://i.imgur.com/7QeNpvr.png)
7. Create a role ![](https://i.imgur.com/wlajndT.png)
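The same lab steps can also be scripted. A minimal AWS CLI sketch — the user name `alice`, group name `developers` and the attached managed policy are placeholders chosen for illustration:

```bash
# Create a user and a group, then let the user inherit the group's permissions.
aws iam create-user --user-name alice
aws iam create-group --group-name developers
aws iam add-user-to-group --user-name alice --group-name developers

# Attach an AWS managed policy to the group (S3 read-only as an example).
aws iam attach-group-policy \
    --group-name developers \
    --policy-arn arn:aws:iam::aws:policy/AmazonS3ReadOnlyAccess

# Apply an account-wide password policy (values are illustrative).
aws iam update-account-password-policy \
    --minimum-password-length 12 \
    --require-numbers
```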
### AWS Directory Service

A family of managed services to connect AWS resources with on-premises Active Directory. It is a standalone directory in the cloud that lets one use existing corporate credentials. It also supports SSO to any domain-joined EC2 instance.

![](https://i.imgur.com/4V6pW2g.png)
![](https://i.imgur.com/SFrB1Wa.png)
![](https://i.imgur.com/HyN8Kda.png)
![](https://i.imgur.com/930Lh36.png)
![](https://i.imgur.com/yxvUwsw.png)
![](https://i.imgur.com/rsPVg6Z.png)
![](https://i.imgur.com/tb3rVK0.png)

### IAM Policies

1. Amazon Resource Name (ARN)
   a) ARNs all begin with: arn:partition:service:region:account_id
   b) ARNs end with: resource, resource_type/resource, resource_type/resource/qualifier, resource_type/resource:qualifier, resource_type:resource, resource_type:resource:qualifier
2. IAM Policies
   a) JSON documents that define permissions
   b) Identity policies
   c) Resource policies
   d) No effect until attached
   e) A list of statements

![](https://i.imgur.com/imbK7zp.png)

:::info
Exam tips
1. Not explicitly allowed == implicitly denied
2. Explicit deny > everything else
3. Only attached policies have effect
4. AWS joins all applicable policies
5. AWS managed vs customer managed
6. Permission boundaries: used to delegate administration to other users. This prevents privilege escalation or unnecessarily broad permissions. They control the maximum permissions an IAM policy can grant.
:::

### AWS Resource Access Manager (RAM)

![](https://i.imgur.com/egr26rL.png)
![](https://i.imgur.com/LTr5xF8.png)

* Note that resource sharing is a two-step process: one account shares the resource with another account, and the other account needs to accept the invitation.

### AWS SSO

![](https://i.imgur.com/Npp32AX.png)
![](https://i.imgur.com/JpJpMCV.png)

---

## Chapter 4: Billing Alarm

One can create a billing alarm using AWS CloudWatch and SNS for email notification if the billing amount exceeds a certain threshold. [More on AWS cost management](https://help.acloud.guru/hc/en-us/articles/115001391833-Manage-your-AWS-costs-)

![](https://i.imgur.com/DE4PCH1.png)

---

## Chapter 5: AWS Organizations and Consolidated Billing

> AWS Organizations is an account management service that enables one to consolidate multiple AWS accounts into an organization that one creates and centrally manages.

![](https://i.imgur.com/NcLo60r.png)
![](https://i.imgur.com/nO3vqaR.png)

#### Consolidated Billing

![](https://i.imgur.com/8nD9vIK.png)

Advantages of consolidated billing:
1. One bill per AWS account to track charges and allocate costs
2. Volume pricing discount

:::info
Exam tips
1. Always enable MFA on the root account
2. The paying account should be used for billing purposes only. Do not deploy resources into the paying account.
3. Enable/disable AWS services using Service Control Policies (SCP), either on OUs or on individual accounts
:::

---

## Chapter 6: S3

### S3 Basics

S3 stands for Simple Storage Service. S3 has a simple web service interface to store and retrieve any amount of data from anywhere on the web.

1. Safe place to store files
2. Object-based storage (files can be from 0 bytes to 5TB, unlimited storage)
3. Data is spread across multiple devices and facilities
4. Files are stored in buckets (folders)
5. S3 is a universal namespace, meaning bucket names must be globally unique
6. The creation of a bucket will also create a URL address using the bucket name.
7. If an upload to an S3 bucket is successful, an HTTP 200 code will be returned

#### a) S3 Objects

1. Key (name of the object)
2. Value (data)
3. Version ID
4. Metadata (data about the data stored)
5. Subresources (Access Control Lists and Torrent)

#### b) Data consistency model for S3

1. Read-after-write consistency for PUTS of new objects
   * If you write a new file and read it immediately afterwards, you will be able to view that data.
2. Eventual consistency for overwrite PUTS and DELETES
   * If you update an existing file or delete a file and read it immediately, you may get the older version or you may not. Changes to objects can take a bit of time to propagate.
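A quick way to see the basics in action from the AWS CLI — the bucket name below is a placeholder and must be globally unique:

```bash
# Create a bucket; the name must be unique across all of S3.
aws s3 mb s3://my-unique-bucket-name-2021

# Upload a file; a successful PUT returns HTTP 200 under the hood.
aws s3 cp index.html s3://my-unique-bucket-name-2021/

# Read-after-write consistency for new objects: the file is readable
# immediately (the trailing "-" streams the object to stdout).
aws s3 cp s3://my-unique-bucket-name-2021/index.html -
```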
#### c) S3 Guarantees

1. Built for 99.99% availability for the S3 platform.
2. AWS guarantees 99.9% availability.
3. AWS guarantees 99.999999999% (11 9s) durability for S3 information.

#### d) S3 Features

1. Tiered storage available
   a) **S3 Standard**: 99.99% availability and 11 9s durability; stored redundantly across multiple devices in multiple facilities, and designed to sustain the loss of 2 facilities concurrently.
   b) **S3-IA**: for data that is accessed less frequently, but requires rapid access when needed. Lower fee than S3 Standard, but you are charged a retrieval fee.
   c) **S3 One Zone-IA**: for when you want a lower-cost option for infrequently accessed data and do not require the multi-AZ data resilience.
   d) **S3 Intelligent-Tiering**: designed to optimize cost by automatically moving data to the most cost-effective access tier, without performance impact or operational overhead.
   e) **S3 Glacier**: a secure, durable and low-cost storage class for data archiving. You can reliably store any amount of data at costs cheaper than on-premises solutions. Retrieval time is configurable from minutes to hours.
   f) **S3 Glacier Deep Archive**: S3's lowest-cost storage class, for when a retrieval time of 12 hours is acceptable.

![](https://i.imgur.com/NJj0pq8.png)

2. Lifecycle management
3. Versioning
4. Encryption
5. MFA (multi-factor authentication) Delete
6. Secure data using ACLs and bucket policies

#### e) S3 charges

* Storage
* Requests
* Storage management pricing
* Data transfer pricing
* Transfer acceleration (utilises CloudFront's edge locations and the edge locations' backbone network)
* Cross-region replication

:::info
Exam tips
1. S3 is object-based and objects (files) are stored in buckets (folders)
2. Files can be 0 bytes to 5TB, with unlimited storage
3. S3 is a universal namespace
4. The URL created for a bucket is in the format https://<bucketName>.s3.<region>.amazonaws.com/
5. Not suitable to install an operating system on
6. Successful uploads will generate an HTTP 200 status code
7. You can turn on MFA Delete
8. Object: key, value, version ID, metadata, ACLs, torrents
9. Data consistency model for S3
10. Storage classes
11. Read the S3 FAQs before taking the exam. [Click Here](https://aws.amazon.com/s3/faqs/)
:::

### S3 Lab

Step 1: go to the AWS console and click on the S3 service under Storage
Step 2: notice the region changes to Global (S3 is a global service)
Step 3: create a bucket
![](https://i.imgur.com/6i2WZob.png)
Step 4: upload some objects (see that the object URL is created)
![](https://i.imgur.com/iFIVWzQ.png)
Step 5: check out the object settings
![](https://i.imgur.com/DQzS4Lk.png)
Step 6: edit bucket permissions to make objects publicly accessible via URL
![](https://i.imgur.com/UhXRdIE.png)
Step 7: make objects public
![](https://i.imgur.com/7TkNOCz.png)

:::info
Exam tips
1. Buckets are viewed globally, but one can have buckets in individual regions.
2. The contents of one bucket can be replicated to another bucket automatically using cross-region replication.
3. Storage class and encryption of objects can be changed on the fly.
4. Methods to restrict bucket access (see the bucket-policy sketch below):
   a) Bucket policies: apply across the whole bucket
   b) Object policies: apply to individual files
   c) IAM policies applied to users & groups
:::
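Steps 6 and 7 can also be done with a bucket policy instead of per-object ACLs. A minimal sketch — the bucket name is a placeholder, and public access must already be allowed on the bucket (Step 6):

```bash
# Grant anonymous read access to every object in the bucket.
aws s3api put-bucket-policy \
    --bucket my-unique-bucket-name-2021 \
    --policy '{
      "Version": "2012-10-17",
      "Statement": [{
        "Sid": "PublicReadGetObject",
        "Effect": "Allow",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::my-unique-bucket-name-2021/*"
      }]
    }'
```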
### S3 Pricing Tiers

![](https://i.imgur.com/4jj2ykx.png)
![](https://i.imgur.com/vRchqNj.png)

### S3 Security and Encryption

By default, all newly created buckets are private. Access control to the bucket can be done using:
* Bucket policies
* ACLs

S3 buckets can be configured to create access logs, which log all requests made to the S3 bucket. These can be sent to another bucket in the same account or in another account.

#### Encryption

Encryption in transit is achieved with:
* SSL/TLS

Encryption at rest (bucket or object level) is achieved with:
* S3 Managed Keys - SSE-S3
![](https://i.imgur.com/IsDtdQ7.png)
* AWS Key Management Service - SSE-KMS
* Server-side encryption with customer-provided keys - SSE-C
* Client-side encryption

### S3 Versioning

Versioning on S3 will store all versions of an object (including all writes, and even if an object is deleted), which makes it a great backup tool. Once enabled, versioning cannot be disabled, only suspended. Versioning integrates with the lifecycle rules and the MFA Delete capability. Each version needs to be made public individually (even if an old version is public, a new version will not be public automatically).

### S3 Lifecycle Management

S3 lifecycle management automates moving objects between different storage tiers. It can be used in conjunction with versioning and applied to both current and previous versions.

Demo
![](https://i.imgur.com/EB6yIgK.png)

Lifecycle rule actions:
1. Transition current versions of objects between storage classes
2. Transition previous versions of objects between storage classes
3. Expire current versions of objects
4. Permanently delete previous versions of objects
5. Delete expired delete markers or incomplete multipart uploads

### S3 Object Lock and Glacier Vault Lock

#### S3 Object Lock

Used to store objects using a write once, read many (WORM) model. It can help to prevent objects from being deleted or modified for a fixed amount of time or indefinitely. Often used to meet regulatory requirements.

1. Governance mode: users cannot overwrite or delete an object version or alter its lock settings unless they have special permissions. Some users can be granted permission to alter the retention settings or delete the object if necessary.
2. Compliance mode: no user can overwrite or delete an object version for the duration of the retention period.
3. Retention period: a timestamp in the object version's metadata. After the retention period expires, the object version can be overwritten or deleted unless a legal hold is placed on the object version.
4. Legal hold: placed on an object version and not associated with a retention period. It is in effect until removed. Legal holds can be freely placed and removed by any user who has the **s3:PutObjectLegalHold** permission.

#### Glacier Vault Lock

S3 Glacier Vault Lock allows one to easily deploy and enforce compliance controls for individual S3 Glacier vaults with a Vault Lock policy. You can specify controls such as WORM in a Vault Lock policy and lock the policy from future edits. Once locked, the policy can no longer be changed.

:::info
Exam tips
1. Object Lock can apply to individual objects or the whole bucket
2. The most common use case is the WORM model
3. Compliance mode vs governance mode
:::
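A CLI sketch of the above, assuming a placeholder bucket name — note that Object Lock can only be enabled when the bucket is created:

```bash
# Object Lock must be enabled at bucket creation time
# (this also enables versioning on the bucket).
aws s3api create-bucket \
    --bucket my-worm-bucket-2021 \
    --object-lock-enabled-for-bucket

# Apply a default retention rule in compliance mode: no user can
# overwrite or delete object versions for 30 days.
aws s3api put-object-lock-configuration \
    --bucket my-worm-bucket-2021 \
    --object-lock-configuration '{"ObjectLockEnabled":"Enabled","Rule":{"DefaultRetention":{"Mode":"COMPLIANCE","Days":30}}}'
```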
### S3 Performance

#### S3 Prefix

The S3 prefix is the middle part between the bucket name and the object, e.g.

mybucketName/**folder1/subfolder1/**object.png

S3 has extremely low latency: one can get the first byte out of S3 within 100-200 milliseconds. S3 also supports a high number of requests: 3,500 PUT/COPY/POST/DELETE and 5,500 GET/HEAD requests per second **per prefix**.

***Performance strategy: spread reads across different prefixes.***

#### KMS request rates

Limitations apply when using SSE-KMS to encrypt objects in S3:
1. Uploading a file calls GenerateDataKey in the KMS API
2. Downloading a file calls Decrypt in the KMS API
3. The KMS quota is region-specific: 5,500, 10,000 or 30,000 requests per second
4. Currently one cannot request a quota increase for KMS

#### Multipart Uploads and Byte-Range Fetches

Multipart uploads: recommended for files over 100MB and required for files over 5GB. Parallelize uploads to increase efficiency.

Byte-range fetches: parallelize downloads by specifying byte ranges. If there is a failure in the download, it is only for a specific byte range.

### S3 Select and Glacier Select

#### S3 Select

Enables applications to retrieve only a subset of data from an object by using simple SQL expressions. By using S3 Select to retrieve only the data needed by your application, you can achieve drastic performance increases in many cases (as much as a 400% improvement, and 80% cheaper). E.g. data is stored in S3 in zip files that contain CSV files; with S3 Select, you can use a simple SQL expression to return only the data needed.

#### Glacier Select

Very similar in concept to S3 Select; allows one to run SQL queries against Glacier directly.

### Sharing S3 Buckets between Accounts

*Prerequisite: accounts under the same organization*

3 different ways to share S3 buckets across accounts:
1. Using bucket policies and IAM (applies across the entire bucket). Programmatic access only.
2. Using bucket ACLs and IAM (individual objects). Programmatic access only.
3. Cross-account IAM roles. Programmatic and console access.

![](https://i.imgur.com/KJ44KQq.png)

### S3 Cross-Region Replication

1. Versioning must be enabled on both the source and destination buckets.
2. Cross-region replication only takes place after the replication rules have been set, meaning objects that already exist in a bucket before the rules were set will not be replicated to the other bucket. (The same rule applies to versions.)
3. Delete markers are not replicated, and deletions of individual versions or delete markers are not replicated either.
4. If the policy of an object is changed on the source bucket, the policy will not change automatically on the destination bucket.

![](https://i.imgur.com/EwiuFfX.png)

### S3 Transfer Acceleration

Utilises the CloudFront edge network to accelerate uploads to S3. Instead of uploading directly to the S3 bucket, one can use a distinct URL to upload directly to an edge location, which will then transfer the file to S3.

![](https://i.imgur.com/Q9SShrX.png)

Test tool: [click here](https://s3-accelerate-speedtest.s3-accelerate.amazonaws.com/en/accelerate-speed-comparsion.html)
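Transfer acceleration is a bucket-level switch. A sketch with a placeholder bucket name:

```bash
# Turn on Transfer Acceleration for the bucket.
aws s3api put-bucket-accelerate-configuration \
    --bucket my-unique-bucket-name-2021 \
    --accelerate-configuration Status=Enabled

# Uploads then go through the distinct accelerate endpoint,
# which routes the transfer via the nearest edge location.
aws s3 cp bigfile.zip s3://my-unique-bucket-name-2021/ \
    --endpoint-url https://s3-accelerate.amazonaws.com
```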
---

## Chapter 7: DataSync

![](https://i.imgur.com/0cvfdjx.png)

:::info
Exam tips
1. Used to move large amounts of data from on-premises to AWS
2. Used with NFS- and SMB-compatible file systems
3. Replication can be done hourly, daily or weekly
4. Install the DataSync agent to start the replication
5. Can be used to replicate EFS (Elastic File System) to EFS
:::

---

## Chapter 8: CloudFront

A content delivery network (CDN) is a system of distributed servers that deliver webpages and other web content to a user based on the geographic location of the user, the origin of the webpage and a content delivery server.

#### Key terminology

1. Edge location: the location where content will be cached.
2. Origin: the origin of all the files that the CDN will distribute. This can be an S3 bucket, an EC2 instance, an Elastic Load Balancer or Route 53.
3. Distribution: the name given to the CDN, which consists of a collection of edge locations.
4. Web distribution: typically used for websites.
5. RTMP: typically used for media streaming.
6. TTL: time to live; how long objects will be cached.

*Edge locations are not just for reads — you can write to them too, e.g. S3 Transfer Acceleration.*

*You can clear cached objects, but it will incur a cost.*

#### Lab

Step 1: under Networking and Content Delivery, select CloudFront
Step 2: notice CloudFront is a global service (just like S3)
Step 3: create a distribution
![](https://i.imgur.com/sLUNkg3.png)
![](https://i.imgur.com/tKCB1Lw.png)

#### CloudFront Signed URLs and Cookies

Use case: restricting content access

* A signed URL is for individual files
![](https://i.imgur.com/G9HjK2l.png)
![](https://i.imgur.com/JrFXPuU.png)
* A signed cookie is for multiple files
* When we create a signed URL or signed cookie we attach a policy (URL expiration, IP ranges, trusted signers)

:::info
Exam tips
1. Use signed URLs/cookies when you want to secure content.
2. If your origin is EC2, then use CloudFront.
3. If your origin is S3, you can use an S3 signed URL.
:::
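A sketch of both options from tips 1 and 3 — the distribution domain, key-pair ID, key file and bucket name are all placeholders, and the CloudFront variant assumes a trusted signer key pair has already been set up:

```bash
# CloudFront signed URL for a single file, valid until the given date.
aws cloudfront sign \
    --url https://d111111abcdef8.cloudfront.net/private/video.mp4 \
    --key-pair-id APKAEXAMPLE \
    --private-key file://cf-private-key.pem \
    --date-less-than 2021-12-31

# If the origin is S3 and CloudFront is not in the path, an S3
# presigned URL (valid here for 300 seconds) works instead.
aws s3 presign s3://my-unique-bucket-name-2021/private/video.mp4 --expires-in 300
```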
---

## Chapter 9: Snowball and Storage Gateway

**Snowball** is a petabyte-scale data transport solution that uses secure appliances to transfer large amounts of data into and out of AWS. Using Snowball addresses common challenges with large-scale data transfer, including high network costs, long transfer times and security concerns. Transferring data with Snowball is simple, fast and secure, and can cost as little as one-fifth as much as high-speed Internet.

**Snowball** comes in either a 50TB or 80TB size. Snowball uses multiple layers of security designed to protect data, including tamper-resistant enclosures, 256-bit encryption, and an industry-standard Trusted Platform Module designed to ensure both security and full chain of custody of data. Once the data transfer job has been processed and verified, software erasure of the Snowball appliance is carried out.

**Snowball Edge** is a 100TB data transfer device with on-board storage and compute capabilities. One can use Snowball Edge to move large amounts of data into and out of AWS, as a temporary storage tier for large local datasets, or to support local workloads in remote or offline locations.

**Snowball Edge** connects to existing applications and infrastructure using standard storage interfaces, streamlining the data transfer process and minimizing setup and integration. Snowball Edge devices can cluster together to form a local storage tier and process data on-premises, helping to ensure applications continue to run even when they are not able to access the cloud.

**Snowmobile** is an exabyte-scale data transfer service used to move extremely large amounts of data to AWS. One can transfer up to 100PB per Snowmobile.

![](https://i.imgur.com/dgBTgEU.png)

**Storage Gateway** is a service that connects an on-premises software appliance with cloud-based storage to provide seamless and secure integration between an organization's on-premises IT environment and AWS's storage infrastructure. This service enables one to securely store data in the AWS cloud for scalable and cost-effective storage.

![](https://i.imgur.com/j7WwJDH.png)

The Storage Gateway software appliance is available for download as a VM image. There are three types of storage gateway:

1. File gateway (NFS and SMB) to S3
![](https://i.imgur.com/CdHvJiE.png)
2. Volume gateway (iSCSI) — stored volumes (the entire dataset is stored on site and asynchronously backed up to S3) or cached volumes (the most frequently accessed data is cached on site and the entire dataset is stored on S3) — to EBS snapshots in S3
![](https://i.imgur.com/yiO1qHm.png)
3. Tape gateway
![](https://i.imgur.com/ogOEoJ7.png)

---

## Chapter 10: Athena vs Macie

### Athena

An interactive query service which enables one to analyse and query data located in S3 using standard SQL. It is serverless — nothing to provision — and you pay per query / per TB scanned. There is no need to set up complex Extract/Transform/Load (ETL) processes, and it works directly with data stored in S3.

Use cases:
1. Query log files stored in S3
2. Generate business reports on data stored in S3
3. Analyse AWS cost and usage reports
4. Run queries on click-stream data

### Macie

A security service which uses machine learning and natural language processing to discover, classify and protect sensitive data (PII) stored in S3.
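A sketch of querying S3 data with Athena from the CLI — the database, table and results bucket names are placeholders, and the table is assumed to have been defined over the S3 data already:

```bash
# Run a SQL query against data in S3; results land in the output location.
aws athena start-query-execution \
    --query-string "SELECT status, COUNT(*) FROM logs.access_logs GROUP BY status" \
    --result-configuration OutputLocation=s3://my-athena-results-bucket/

# Fetch the results once the query finishes, using the returned ID.
aws athena get-query-results --query-execution-id <QueryExecutionId>
```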
---

## Chapter 11: EC2

EC2 is a web service that provides resizable compute capacity in the cloud. Amazon EC2 reduces the time required to obtain and boot new server instances to minutes, allowing you to quickly scale capacity, both up and down, as your computing requirements change.

### EC2 Pricing Model

1. On demand: pay a fixed rate by the hour/second with no commitment. Suitable for users that want the low cost and flexibility of EC2 without any up-front payment or long-term commitment, and for applications with short-term, spiky or unpredictable workloads that cannot be interrupted.
2. Reserved: provides a capacity reservation and offers a significant discount on the hourly charge for an instance. Contract terms are 1-year or 3-year terms. Useful for applications with steady-state or predictable usage that require reserved capacity. Users are able to make upfront payments to reduce total computing costs.
   * Standard Reserved Instances: offer up to 75% off on-demand instances; the larger the upfront payment and the longer the contract, the greater the discount.
   * Convertible Reserved Instances: offer up to 54% off on-demand, with the capability to change the attributes of the reserved instance as long as the exchange results in the creation of reserved instances of equal or greater value.
   * Scheduled Reserved Instances: these are available to launch within the time windows you reserve. This option allows one to match the capacity reservation to a predictable recurring schedule that only requires a fraction of a day, a week or a month.
3. Spot: enables you to bid whatever price you want for instance capacity, providing even greater savings if your applications have flexible start and end times. Useful for applications that have flexible start and end times, applications that are only feasible at very low compute prices, and users with urgent computing needs for large amounts of additional capacity.
   * If the spot instance is terminated by AWS EC2, you will not be charged for a partial hour of usage. However, if you terminate the instance yourself, you will be charged for any hour in which the instance ran.
4. Dedicated Hosts: physical EC2 servers dedicated for your use. Dedicated Hosts can reduce costs by allowing one to use existing server-bound software licenses. Useful for regulatory requirements that may not support multi-tenant virtualization, and great for licensing which does not support multi-tenancy or cloud deployments. Can be purchased on-demand or as a reservation (up to 70% discount).

### EC2 instance types

![](https://i.imgur.com/anvlvq5.png)

### EC2 Lab

Step 1: choose an Amazon Machine Image (AMI)
Step 2: choose an instance type
Step 3: configure instance details
![](https://i.imgur.com/2dx15ZZ.png)
Step 4: add storage volumes
Step 5: add tags
Step 6: configure security group
Step 7: select an existing key pair or create a new key pair
Step 8: connect to the EC2 instance

:::info
Exam tips
1. Termination protection is turned off by default.
2. On an EBS-backed instance, the default action is for the root EBS volume to be deleted when the instance is terminated.
3. The EBS root volume of default AMIs can be encrypted. One can also use a third-party tool (e.g. BitLocker) to encrypt the root volume, or this can be done when creating AMIs in the AWS console or using the API.
4. Additional volumes can be encrypted.
:::

### Security Group Basics

Under Network and Security:
* Inbound rules: all inbound traffic is blocked by default.
* Outbound rules: all outbound traffic is allowed by default.
* When one makes a change to the rules of a security group, that change takes effect immediately.
* Security groups are stateful: if one specifies an inbound rule, the corresponding return traffic is automatically allowed outbound.
* One cannot block individual IP addresses with security groups (use Network Access Control Lists instead).
* One can specify allow rules, but not deny rules.
* One EC2 instance can have multiple security groups assigned.

### EBS

Elastic Block Store (EBS) provides persistent block storage volumes for use with EC2 instances. Each EBS volume is automatically replicated within its AZ to protect one from component failure, offering high availability and durability.

5 different types:
1. General Purpose SSD
2. Provisioned IOPS SSD
3. Throughput Optimised Hard Disk Drive
4. Cold Hard Disk Drive
5. Magnetic

![](https://i.imgur.com/QabZ1Uf.png)

#### EBS volumes and Snapshots

* The EBS volumes of an EC2 instance will be in the same AZ as the instance.
* Terminating an EC2 instance may or may not also remove its EBS volumes, depending on the configuration chosen when the EC2 instance was launched.
* Moving an EBS volume from one AZ to another: create a snapshot of the EBS volume, then create an image/volume from that snapshot.

:::info
Exam tips
1. Think of EBS as a virtual hard disk
2. Snapshots exist on S3
3. Snapshots are incremental: only the blocks that have changed since your last snapshot are moved to S3
4. To create a snapshot of an EBS volume that serves as a root device, it is recommended to stop the instance before taking the snapshot (not mandatory, as snapshots can be taken on a running instance)
5. AMIs can be created from snapshots, and AMIs can be moved from one region to another via Copy AMI
6. EBS volume size and type can be changed on the fly
:::
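A sketch of tips 3 and 5 — and of the first two "encrypt after launch" steps described in the next section. Volume ID, snapshot ID and regions are placeholders:

```bash
# Snapshot an EBS volume (incremental; stored in S3 behind the scenes).
aws ec2 create-snapshot \
    --volume-id vol-0123456789abcdef0 \
    --description "root volume backup"

# Copy the snapshot to another region, encrypting it on the way.
aws ec2 copy-snapshot \
    --source-region us-east-1 --region eu-west-1 \
    --source-snapshot-id snap-0123456789abcdef0 \
    --encrypted
```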
#### AMI Types (EBS vs Instance Store)

An AMI can be selected based on:
1. Region
2. Operating system
3. Architecture (32-bit or 64-bit)
4. Launch permissions
5. Storage for the root device (instance store or EBS-backed volumes)

For EBS-backed volumes: the root device for an instance launched from the AMI is an EBS volume created from an EBS snapshot.

For instance store volumes: the root device for an instance launched from the AMI is an instance store volume created from a template stored in S3.

:::info
Exam tips
1. Instance store volumes are sometimes called ephemeral storage. Instances backed by instance store volumes cannot be stopped; if the underlying host fails, one will lose all the data.
2. EBS-backed instances can be stopped and data will not be lost.
3. Both types can be rebooted and no data will be lost.
4. By default, the root volume will be deleted on EC2 termination.
5. Instance store can scale up to millions of IOPS with low latency.
:::

#### Encrypted Root Device Volumes and Snapshots

Only unencrypted snapshots can be shared across AWS accounts or made public.

At launch:
![](https://i.imgur.com/Mgc29D4.png)

After launch:
1. Create a snapshot
2. Copy the snapshot with encryption
3. Create an image from the encrypted snapshot
4. Launch an EC2 instance from the encrypted AMI

### ENI vs ENA vs EFA

* **ENI**: Elastic Network Interface — a virtual network card
  1. Allows 1 primary private IPv4 address from the IPv4 address range of one's VPC
  2. One or more secondary private IPv4 addresses from the IPv4 address range of one's VPC
  3. One Elastic IP address per private IPv4 address
  4. One public IPv4 address
  5. One or more IPv6 addresses
  6. One or more security groups
  7. A MAC address
  8. A source/destination check flag
  9. A description
  10. Use cases: to create a management network; to use network and security appliances in the VPC; to create dual-homed instances with workloads/roles on distinct subnets; to create a low-budget, high-availability solution.

* **EN**: Enhanced Networking — uses single root I/O virtualization (SR-IOV) to provide high-performance networking capabilities on supported instance types.
  1. SR-IOV is a method of device virtualization that provides higher I/O performance and lower CPU utilization when compared to traditional virtualized network interfaces.
  2. EN provides higher bandwidth, higher packet-per-second (PPS) performance, and consistently lower inter-instance latencies. There is no additional charge for using EN.
  3. Use it where you want good network performance.
  4. Depending on the instance type, enhanced networking can be enabled using either:
     4.1 **Elastic Network Adapter (ENA)**: supports network speeds of up to 100 Gbps for supported instance types
     4.2 **Virtual Function (VF)** interface: supports network speeds of up to 10 Gbps for supported instance types; typically used on older instances.

* **EFA**: Elastic Fabric Adapter — a network device that can be attached to an EC2 instance to accelerate high performance computing (HPC) and machine learning applications.
  1. EFA provides lower and more consistent latency and higher throughput than the TCP transport traditionally used in cloud-based HPC systems.
  2. EFA can use OS-bypass to enable HPC and machine learning applications to bypass the OS kernel and communicate directly with the EFA device. This makes it a lot faster, with lower latency. Currently supported on Linux only, not Windows.
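A sketch of checking and switching on the ENA option above for a supported instance type — the instance ID is a placeholder, and the instance must be stopped before enabling:

```bash
# Check whether ENA is already enabled on the instance.
aws ec2 describe-instances \
    --instance-ids i-0123456789abcdef0 \
    --query "Reservations[].Instances[].EnaSupport"

# Enable ENA on the (stopped) instance.
aws ec2 modify-instance-attribute \
    --instance-id i-0123456789abcdef0 \
    --ena-support
```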
### Spot Instances and Spot Fleets

Spot instances are available at up to a 90% discount compared to on-demand prices. One can use spot instances for various stateless, fault-tolerant or flexible applications such as big data, containerised workloads, CI/CD, web servers, HPC and other test and development workloads.

To use spot instances, one must first decide on a maximum spot price. The instance will be provisioned as long as the spot price is below the maximum spot price. If the spot price goes above the maximum spot price, you have 2 minutes to choose whether to stop or terminate the instance. However, one may use a Spot Block to stop spot instances from being terminated even if the spot price goes over the maximum spot price. One can currently set a Spot Block for between 1 and 6 hours.

Spot instances are not good for:
1. Persistent workloads
2. Critical jobs
3. Databases

How to terminate spot instances:
![](https://i.imgur.com/FrMZBEV.png)

**Spot Fleets**: a collection of spot instances and, optionally, on-demand instances. The spot fleet attempts to launch the number of spot and on-demand instances needed to meet the target capacity specified in the spot fleet request.

1. Set up different launch pools
2. The fleet will choose the best way to implement them depending on the strategy defined
3. Spot fleets will stop launching instances once the price threshold or desired capacity is reached

![](https://i.imgur.com/7jZiXMf.png)

### EC2 Hibernate

When one hibernates an EC2 instance, the operating system is told to perform hibernation (suspend to disk). Hibernation saves the contents of the instance memory (RAM) to the EBS root volume. The instance's EBS root volume and any attached EBS data volumes are persisted.

![](https://i.imgur.com/u6EmeXh.png)

With EC2 Hibernate:
1. The instance boots much faster
2. The OS does not need to reboot because the RAM is preserved
3. Useful for long-running processes or services that take time to initialize
4. To enable hibernation as a stop action, one must use an encrypted root volume
5. Instance RAM must be below 150 GB
6. Instance families C, M and R are supported
7. Available for Windows, Amazon Linux 2 AMI and Ubuntu
8. Instances cannot be hibernated for more than 60 days
9. Available for on-demand and reserved instances

### IAM Roles with EC2

IAM can create a role with permissions for EC2, and IAM roles can be attached to an EC2 instance. Roles are more secure than storing an access key and secret access key on individual EC2 instances, and they are easier to manage. Roles are universal: they can be used in any region.

### Bootstrap Scripts

One can define a bootstrap script for an EC2 instance to perform certain commands at start-up.

Example:
```
#!/bin/bash
# Update packages and install Apache
yum update -y
yum install httpd -y
# Start Apache and enable it on boot
service httpd start
chkconfig httpd on
# Serve a simple page, then copy it to an S3 bucket
cd /var/www/html
echo "<html><h1>Hello Cloud Gurus Welcome To My Webpage</h1></html>" > index.html
aws s3 mb s3://YOURBUCKETNAMEHERE
aws s3 cp index.html s3://YOURBUCKETNAMEHERE
```

### Instance Metadata

1. To see the bootstrap script: curl http://169.254.169.254/latest/user-data
2. To get instance metadata: curl http://169.254.169.254/latest/meta-data/<the metadata of interest>

### EFS

Elastic File System is a file storage service for EC2 instances. EFS is easy to use and provides a simple interface that allows one to create and configure file systems quickly and easily. With EFS, storage capacity is elastic, growing and shrinking automatically as one adds and removes files.

**Unlike EBS, EFS can be shared by multiple EC2 instances.**

:::info
Exam tips
1. EFS supports the Network File System version 4 (NFSv4) protocol
2. One only pays for the storage used (no pre-provisioning required)
3. Can scale up to petabytes
4. Can support thousands of concurrent NFS connections
5. Data is stored across multiple AZs within a region
6. Read-after-write consistency
:::
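A sketch of mounting EFS from an EC2 instance over NFSv4 — the file system ID, region and mount point are placeholders; the same mount command can be run on every instance that shares the file system:

```bash
# Install the NFS client and create a mount point (Amazon Linux).
sudo yum install -y nfs-utils
sudo mkdir -p /mnt/efs

# Mount the file system via its regional DNS name.
sudo mount -t nfs4 -o nfsvers=4.1 \
    fs-0123456789abcdef0.efs.us-east-1.amazonaws.com:/ /mnt/efs
```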
### AWS FSx for Windows and for Lustre

Amazon FSx for Windows File Server provides a fully managed native Microsoft Windows file system, so one can easily move Windows-based applications that require file storage to AWS.

![](https://i.imgur.com/QTMDjbz.png)

Amazon FSx for Lustre is a fully managed file system that is optimized for compute-intensive workloads such as HPC, machine learning, media data processing workflows and electronic design automation (EDA). One can run a Lustre file system that can process massive datasets at up to hundreds of gigabytes per second of throughput, millions of IOPS and sub-millisecond latencies.

![](https://i.imgur.com/0Yb0mCI.png)

### EC2 Placement Groups

Three types of placement groups:

1. Cluster placement group: a grouping of instances within a single AZ. Recommended for applications that need low network latency, high network throughput or both. Only certain instance types can be launched into a cluster placement group.
2. Spread placement group: a group of instances that are each placed on distinct underlying hardware. Recommended for applications that have a small number of critical instances that should be kept separate from each other.
3. Partition placement group: divides each group into logical segments (partitions). Ensures that each partition within a placement group has its own set of racks, and each rack has its own network and power source. No two partitions within a placement group share the same racks, allowing one to isolate the impact of hardware failure within the application.

:::info
Exam tips
1. A cluster placement group cannot span multiple AZs
2. Partition and spread placement groups can span multiple AZs, but must be in the same region
3. The name specified for a placement group must be unique within the AWS account
4. Supported instance types: C, P, R, I
5. AWS recommends homogeneous instances within cluster placement groups
6. One cannot merge placement groups
7. One can move an existing instance into a placement group. Before moving it, the instance must be in the stopped state. One can move or remove an instance using the AWS CLI or an AWS SDK (not via the console).
:::

---

## Chapter 12: CloudWatch and CloudTrail

CloudWatch is a monitoring service to monitor AWS resources as well as the applications that one runs on AWS.

1. Monitor performance
2. Monitor compute (EC2 instances, Auto Scaling groups, ELB, Route 53 health checks)
3. Monitor storage and content delivery (EBS volumes, storage gateways, CloudFront)

### CloudWatch with EC2

Host-level metrics:
1. CPU
2. Network
3. Disk
4. Status check

### CloudTrail

CloudTrail increases visibility into one's user and resource activity by recording AWS Management Console actions and API calls. One can identify which users and accounts called AWS, the source IP address from which the calls were made, and when the calls occurred.

:::info
Exam tips
1. By default, CloudWatch monitors EC2 every 5 minutes; this interval can be reduced to 1 minute by turning on detailed monitoring
2. One can create CloudWatch alarms which trigger notifications
3. CloudWatch is about performance; CloudTrail is about auditing
4. CloudWatch includes features such as dashboards, alarms, events and logs
:::

### CloudWatch Lab

Step 1: provision an EC2 instance
Step 2: enable CloudWatch detailed monitoring
![](https://i.imgur.com/0Ry6QdN.png)
Step 3: go to CloudWatch and create an alarm on the EC2 instance's CPU utilization
![](https://i.imgur.com/l54IPty.png)
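The alarm from Step 3 can also be created from the CLI. A sketch — the instance ID and SNS topic ARN are placeholders:

```bash
# Alarm when average CPU exceeds 80% for two consecutive 5-minute periods.
aws cloudwatch put-metric-alarm \
    --alarm-name cpu-utilization-high \
    --namespace AWS/EC2 --metric-name CPUUtilization \
    --dimensions Name=InstanceId,Value=i-0123456789abcdef0 \
    --statistic Average --period 300 \
    --threshold 80 --comparison-operator GreaterThanThreshold \
    --evaluation-periods 2 \
    --alarm-actions arn:aws:sns:us-east-1:111122223333:my-alerts-topic
```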
---

## Chapter 13: AWS Command Line

Under IAM, one can give programmatic access to a user. This allows the user to interact with AWS using the command line.

![](https://i.imgur.com/GZMIEo3.png)

One can use the AWS command line to provision resources rather than the AWS Management Console.

---

## Chapter 14: HPC on AWS

![](https://i.imgur.com/ZWYkp9Y.png)

**Data transfer**
1. Snowball, Snowmobile
2. DataSync
3. Direct Connect

**Compute and networking**
1. EC2 and EC2 Fleet
2. Placement groups
3. Enhanced networking
4. Elastic Network Adapters
5. Elastic Fabric Adapters

**Storage**
1. EBS
2. Instance store
3. S3
4. EFS
5. FSx for Lustre

**Orchestration and automation**
1. AWS Batch: enables running hundreds of thousands of batch computing jobs on AWS. It supports multi-node parallel jobs, which allow you to run a single job that spans multiple EC2 instances. One can easily schedule jobs and launch EC2 instances according to needs.
2. AWS ParallelCluster: an open-source cluster management tool that uses a simple text file to model and provision all the resources needed. Automates creation of the VPC, subnets, cluster type and instance types.

---

## Chapter 15: AWS WAF

AWS WAF is a web application firewall that lets one monitor the HTTP and HTTPS requests that are forwarded to CloudFront, an Application Load Balancer or API Gateway. AWS WAF also lets you control access to your content.

Example filtering criteria:
1. IP address
2. Query string
3. Country
4. Values in request headers
5. Presence of SQL code (SQL injection) or scripts (cross-site scripting)

3 different behaviours:
1. Allow all requests except the ones specified
2. Block all requests except the ones specified
3. Count the requests that match the specified properties

---

## Chapter 16: Databases on AWS

### Overview

**Relational databases (RDS)** on AWS for OLTP (online transaction processing):
1. Microsoft SQL Server
2. Oracle
3. MySQL
4. PostgreSQL
5. Aurora
6. MariaDB

RDS on AWS has two key features:
1. Multi-AZ, for disaster recovery
2. Read replicas, for performance (up to 5 copies)

**NoSQL solution**: DynamoDB
**Redshift**: OLAP (online analytics processing), a data warehousing solution
**ElastiCache**: speeds up the performance of existing databases for frequent identical queries

### RDS Lab

Step 1: create a database
![](https://i.imgur.com/PCrczwz.png)
Step 2: create an EC2 instance running WordPress
Step 3: add the EC2 security group as the source in the RDS instance's security group
Step 4: set up WordPress and test

:::info
Exam tips
1. RDS runs on VMs
2. One cannot log in to these operating systems
3. Patching of the RDS OS and DB is AWS's responsibility
4. RDS is not serverless
:::
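A sketch of Step 1 from the CLI — the identifier, instance class and credentials are placeholders:

```bash
# Launch a small MySQL RDS instance for the lab.
aws rds create-db-instance \
    --db-instance-identifier wordpress-db \
    --engine mysql \
    --db-instance-class db.t3.micro \
    --allocated-storage 20 \
    --master-username admin \
    --master-user-password 'ChangeMe-NotForProduction1'
```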
### RDS Backups, Multi-AZ and Read Replicas

Two types of backups for RDS:

1. **Automated backups**: allow one to recover the database to any point in time within a retention period. The retention period can be between 1 and 35 days. Automated backups take a full daily snapshot and also store transaction logs throughout the day. When one performs a recovery, AWS first chooses the most recent daily backup and then applies the transaction logs relevant to that day. This allows one to do a point-in-time recovery down to a second, within the retention period.
   * Automated backups are enabled by default
   * The backup data is stored in S3, with free storage space equal to the size of the database
   * Backups are taken within a defined window. During the backup window, storage I/O may be suspended while data is being backed up, and elevated latency may be experienced.
2. **Database snapshots**: taken manually; unlike automated backups, they are stored even after the original RDS instance is deleted.

#### Restoring backups

![](https://i.imgur.com/FJrr6s2.png)

#### Encryption at rest

Encryption at rest is supported for MySQL, Oracle, SQL Server, PostgreSQL, MariaDB and Aurora. Encryption is done using AWS KMS. Once the RDS instance is encrypted, the data stored at rest in the underlying storage is also encrypted, as are its automated backups, read replicas and snapshots.

#### Multi-AZ

![](https://i.imgur.com/lhhlneD.png)

Multi-AZ allows one to have an exact copy of the production database in another AZ. AWS handles the replication. In the event of planned database maintenance, DB instance failure or AZ failure, RDS will automatically fail over to the standby so that database operations can resume quickly without administrative intervention.

Multi-AZ is supported for MySQL, Oracle, SQL Server, PostgreSQL and MariaDB.

*Aurora has its own architecture, which is fault tolerant.*

#### Read Replicas

![](https://i.imgur.com/bZVSXa8.png)

Read replicas allow one to have a read-only copy of the production database. This is achieved using asynchronous replication from the primary RDS instance to the read replica. Read replicas are supported for MySQL, Oracle, PostgreSQL, MariaDB and Aurora.

Things to note:
1. Used for scaling, not for disaster recovery
2. Automated backups must be turned on in order to deploy a read replica
3. One can have up to 5 read replica copies of any database
4. One can have read replicas of read replicas (but watch out for latency)
5. Each read replica has its own DNS endpoint
6. One can have read replicas that have Multi-AZ
7. Read replicas can be promoted to be their own databases; this breaks the replication
8. One can have a read replica in a second region

### DynamoDB

AWS DynamoDB is a fast and flexible NoSQL database service for all applications that need consistent, single-digit-millisecond latency at any scale. It is a fully managed database and supports both document and key-value data models. Its flexible data model and reliable performance make it a great fit for mobile, web, gaming, ad-tech, IoT and many other applications.

Basics:
1. Stored on SSD storage
2. Spread across 3 geographically distinct data centres
3. Eventually consistent reads (default) or strongly consistent reads

Advanced features:
1. DynamoDB Accelerator (DAX): a fully managed, highly available, in-memory cache with over 10x the performance of normal DynamoDB
![](https://i.imgur.com/LODzlGm.png)
2. Transactions: multiple all-or-nothing operations, such as financial transactions or fulfilling orders. Two underlying reads or writes: prepare/commit. Up to 25 items or 4MB of data.
3. On-demand capacity: pay-per-request pricing to balance cost and performance, with no minimum capacity.
4. On-demand backup and restore: full backups at any time with zero impact on table performance or availability. Consistent within seconds and retained until deleted. Operates within the same region as the source table.
5. Point-in-time recovery (PITR): protects against accidental writes or deletes; can restore to any point in the last 35 days using incremental backups.
6. Streams:
![](https://i.imgur.com/Q68tT2t.png)
7. Global tables:
![](https://i.imgur.com/csoZOyd.png)
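A sketch of creating a key-value table with on-demand capacity (feature 3 above) — the table and attribute names are placeholders:

```bash
# Table with a single partition key, billed per request (on-demand).
aws dynamodb create-table \
    --table-name game-scores \
    --attribute-definitions AttributeName=player_id,AttributeType=S \
    --key-schema AttributeName=player_id,KeyType=HASH \
    --billing-mode PAY_PER_REQUEST
```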
### Redshift

Redshift is a fast and powerful, fully managed, petabyte-scale data warehouse service in the cloud. Customers can start small for just $0.25 per hour with no commitments or upfront costs and scale to a petabyte or more for $1,000 per terabyte per year, less than a tenth of the cost of most other data warehousing solutions. Redshift is used for business intelligence and OLAP. Data warehousing databases use a different type of architecture, both from a database perspective and at the infrastructure layer.

**Configurations:**
1. Single node (160GB)
2. Multi-node:
   a) Leader node (manages client connections and receives queries)
   b) Compute nodes (store data and perform queries and computations), with up to 128 compute nodes

**Compression:** Columnar data stores can be compressed much more than row-based data stores because similar data is stored sequentially on disk. Redshift employs multiple compression techniques and can often achieve significant compression relative to traditional relational data stores. In addition, Redshift does not require indexes or materialized views, and so uses less space. When loading data into an empty table, Redshift automatically samples the data and selects the most appropriate compression scheme.

**Massively Parallel Processing (MPP):** Redshift automatically distributes data and query load across all nodes, making it easy to add nodes to the data warehouse and enabling one to maintain fast query performance as the data warehouse grows.

**Backups:**
1. Enabled by default with a 1-day retention period
2. The maximum retention period is 35 days
3. Redshift always attempts to maintain at least 3 copies of the data (the original, a replica on the compute nodes, and a backup in S3)
4. Redshift can also asynchronously replicate snapshots to S3 in another region for disaster recovery

**Pricing:**
1. Compute node hours (no charge for leader nodes)
2. Backups
3. Data transfer

**Security:**
1. Encrypted in transit using SSL
2. Encrypted at rest using AES-256 encryption
3. By default, Redshift takes care of key management

**Availability:**
1. Only available in 1 AZ
2. Can restore snapshots to a new AZ in the event of an outage

### Aurora

Amazon Aurora is a MySQL- and PostgreSQL-compatible relational database engine that combines the speed and availability of high-end commercial databases with the simplicity and cost-effectiveness of open-source databases.

![](https://i.imgur.com/lKIhvPD.png)

Features:
1. Starts with 10GB and scales in 10GB increments up to 64TB (storage autoscaling)
2. Compute resources can scale up to 32 vCPUs and 244GB of memory
3. 2 copies of the data are maintained in each AZ, with a minimum of 3 AZs
4. Designed to transparently handle the loss of up to two copies of data without affecting database write availability, and up to three copies without affecting read availability
5. Self-healing
6. 3 types of read replicas:
   a) Aurora replicas (15)
   b) MySQL read replicas (5)
   c) PostgreSQL read replicas (1)
![](https://i.imgur.com/Gd8h0xD.png)
7. Backups:
   a) Automated backups are always enabled and do not impact performance
   b) Snapshots can be taken and do not impact performance
   c) Aurora snapshots can be shared with other AWS accounts

**Aurora Serverless**

This is an on-demand, autoscaling configuration for the MySQL-compatible and PostgreSQL-compatible editions of Aurora. An Aurora Serverless DB cluster automatically starts up, shuts down, and scales capacity up or down based on the application's needs. It provides a simple and cost-effective option for infrequent, intermittent or unpredictable workloads.
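A heavily hedged sketch of creating an Aurora Serverless (v1) cluster — the identifier, credentials and capacity bounds are placeholders, and the engine/engine-mode combination assumes the MySQL-compatible serverless v1 edition:

```bash
# Serverless cluster that scales between 2 and 8 capacity units
# and pauses when idle.
aws rds create-db-cluster \
    --db-cluster-identifier reporting-cluster \
    --engine aurora \
    --engine-mode serverless \
    --master-username admin \
    --master-user-password 'ChangeMe-NotForProduction1' \
    --scaling-configuration MinCapacity=2,MaxCapacity=8,AutoPause=true
```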
### ElastiCache

ElastiCache is a web service that makes it easy to deploy, operate and scale an in-memory cache in the cloud. The service improves the performance of web applications by allowing one to retrieve information from fast, managed, in-memory caches instead of relying entirely on slower disk-based databases.

![](https://i.imgur.com/xVhWCFe.png)

### Database Migration Service (DMS)

![](https://i.imgur.com/R0BfgCe.png)
![](https://i.imgur.com/QN1IyTm.png)
![](https://i.imgur.com/ct7n1dN.png)

### EMR

EMR is the industry-leading cloud big-data platform for processing vast amounts of data using open-source tools such as Apache Spark, Apache Hive, Apache HBase, Apache Flink, Apache Hudi and Presto. With EMR, one can run petabyte-scale analysis at less than half the cost of traditional on-premises solutions and over 3 times faster than standard Apache Spark.

![](https://i.imgur.com/vpdmPGN.png)
![](https://i.imgur.com/6vqWs7S.png)

---

## Chapter 17: Route 53

### DNS 101

DNS is used to convert human-friendly domain names into an Internet Protocol (IP) address. IP addresses are used by computers to identify each other on the network. IP addresses commonly come in 2 different forms: IPv4 and IPv6.

#### IPv4 vs IPv6

The IPv4 space is a 32-bit field and has over 4 billion different addresses. IPv6 was created to solve this depletion issue and has an address space of 128 bits (340 undecillion addresses).

#### Top-Level Domains

In the string of a domain name separated by periods, the last word represents the "top-level domain"; the second word is known as a second-level domain. These top-level domain names are controlled by the Internet Assigned Numbers Authority (IANA) in a root zone database, which is essentially a database of all available top-level domains.

#### Domain Registrars

Because all of the names in a given domain have to be unique, there needs to be a way to organise this so that domain names are not duplicated. A registrar is an authority that can assign domain names directly under one or more top-level domains. These domains are registered with InterNIC, a service of ICANN, which enforces uniqueness of domain names across the internet. Each domain name becomes registered in a central database known as the WHOIS database. Popular domain registrars include Amazon, GoDaddy.com, 123-reg.co.uk, etc.

#### Start of Authority (SOA) Record

Stores information about:
1. The name of the server that supplied the data for the zone
2. The administrator of the zone
3. The current version of the data file
4. The default number of seconds for the time-to-live on resource records

#### NS Records

Name server records are used by top-level domain servers to direct traffic to the content DNS server which contains the authoritative DNS records.

![](https://i.imgur.com/JPwFGg4.png)

#### DNS Records

1. A record: used by a computer to translate the name of the domain to an IP address.
2. TTL: the length of time that a DNS record is cached on either the resolving server or the user's own local PC. The lower the time-to-live, the faster changes to DNS records propagate throughout the internet (default is 48 hours).
3. CName: Canonical Name, used to resolve one domain name to another. For example, one may have a mobile website with the domain name http://abc.com that is used when users browse to the domain name on their mobile devices. One may also want the name http://mobile.abc.com to resolve to the same address. (A CName cannot be used for naked domain names, i.e. the zone apex record.)
4. Alias records: used to map resource record sets in your hosted zone to an ELB, a CloudFront distribution, or S3 buckets that are configured as websites. Alias records work like a CName record in that one can map one DNS name to another target DNS name.
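A sketch of creating an alias A record that points the zone apex at a load balancer — something a CName cannot do. Both hosted zone IDs and the ELB DNS name are placeholders (the second zone ID is the ELB's own hosted zone, not yours):

```bash
# UPSERT an alias A record for the apex of the zone.
aws route53 change-resource-record-sets \
    --hosted-zone-id Z1EXAMPLE \
    --change-batch '{
      "Changes": [{
        "Action": "UPSERT",
        "ResourceRecordSet": {
          "Name": "abc.com.",
          "Type": "A",
          "AliasTarget": {
            "HostedZoneId": "ZELBEXAMPLE",
            "DNSName": "my-elb-123456.us-east-1.elb.amazonaws.com.",
            "EvaluateTargetHealth": false
          }
        }
      }]
    }'
```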
:::info
Exam tips
1. ELBs do not have pre-defined IPv4 addresses; you resolve to them using a DNS name
2. Understand the difference between an Alias record and a CName
3. Given the choice, always choose an Alias record over a CName
4. Common DNS record types: SOA, NS, A, CName, MX (used for mail), PTR (the reverse of A records)
:::

### Route 53 Lab

Step 1: register a domain name using Route 53
![](https://i.imgur.com/if9YU74.png)
Step 2: provision EC2 instances and serve some index.html
Step 3: try out the simple routing policy
Simple routing policy: one record with multiple IP addresses; all the values are returned to the user in a random order.
![](https://i.imgur.com/TRD0PKB.png)
Step 4: try out the weighted routing policy
Weighted routing policy: allows one to split traffic based on the different weights assigned. A separate A record is created for each IP address, with a weight. Also check out the health check option!
![](https://i.imgur.com/Q2RSSZT.png)
Step 5: try out the latency routing policy
Latency routing policy: routes traffic based on the lowest network latency for the end user (region based).
![](https://i.imgur.com/6aeilPy.png)
Step 6: try out the failover routing policy
Failover routing policy: for active/passive setups. Route 53 will monitor the health of the primary site.
![](https://i.imgur.com/YwaGYYT.png)
Step 7: try out the geolocation routing policy
Geolocation routing policy: routes traffic based on the geolocation of the end user.
![](https://i.imgur.com/aCJkQ0x.png)
Step 8: try out the geoproximity routing policy
Geoproximity routing policy: traffic-flow only; routes traffic based on the geographic location of users and resources.
Step 9: try out the multivalue answer routing policy
Multivalue answer routing policy: similar to simple routing, but allows one to put health checks on each record set.

:::info
Exam tips
1. You can buy domain names directly with AWS
2. It can take up to 3 days to register, depending on the circumstances
3. Routing policies available with Route 53:
   a) Simple
   b) Weighted
   c) Latency-based
   d) Failover
   e) Geolocation
   f) Geoproximity
   g) Multivalue answer
4. Health checks can be set on individual record sets. If a record set fails a health check, it will be removed from Route 53 until it passes the health check. SNS notifications can be set up for alerts.
:::

---

## Chapter 18: VPC

### VPC Overview

Amazon VPC lets you provision a logically isolated section of the AWS cloud where one can launch AWS resources in a virtual network that you define. You have complete control over your virtual networking environment, including selection of your own IP address range, creation of subnets, and configuration of route tables and network gateways.

You can easily customize the network configuration of your VPC. For example, you can create a public-facing subnet and a private-facing subnet. You can leverage multiple layers of security, including security groups and network access control lists, to help control access to EC2 instances in each subnet. Additionally, you can create a hardware virtual private network (VPN) connection between your corporate datacenter and your VPC and leverage the AWS cloud as an extension of your corporate datacenter.
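Roughly the same build-out that VPC Lab 1 below does in the console, sketched with the CLI — the CIDR blocks are illustrative, and all IDs (`vpc-0123`, `igw-0123`, ...) are placeholders for the values each call returns:

```bash
# Create the VPC and two subnets (one will become public, one private).
aws ec2 create-vpc --cidr-block 10.0.0.0/16
aws ec2 create-subnet --vpc-id vpc-0123 --cidr-block 10.0.1.0/24 --availability-zone us-east-1a
aws ec2 create-subnet --vpc-id vpc-0123 --cidr-block 10.0.2.0/24 --availability-zone us-east-1b

# Internet gateway, plus a new route table that sends 0.0.0.0/0 to it.
aws ec2 create-internet-gateway
aws ec2 attach-internet-gateway --internet-gateway-id igw-0123 --vpc-id vpc-0123
aws ec2 create-route-table --vpc-id vpc-0123
aws ec2 create-route --route-table-id rtb-0123 \
    --destination-cidr-block 0.0.0.0/0 --gateway-id igw-0123

# Associate the public subnet with the internet-facing route table.
aws ec2 associate-route-table --route-table-id rtb-0123 --subnet-id subnet-0123
```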
![](https://i.imgur.com/ZSOqOTJ.png)
![](https://i.imgur.com/uBDUlVM.png)

#### Default VPC vs Custom VPC

1. The default VPC is user friendly, allowing one to immediately deploy instances.
2. All subnets in the default VPC have a route out to the internet.
3. Each EC2 instance has both a public and a private IP address.

#### VPC Peering

Allows one to connect one VPC with another via a direct network route using private IP addresses. Instances behave as if they were on the same private network. You can peer VPCs with other AWS accounts as well as with other VPCs in the same account.

*Peering is in a star configuration: i.e. 1 central VPC peers with 4 others. There is no transitive peering.*

:::info
Exam tips
1. Think of a VPC as a logical datacenter in AWS
2. Consists of internet gateways (or virtual private gateways), route tables, network access control lists, subnets and security groups
3. 1 subnet = 1 AZ
4. Security groups are stateful; network access control lists are stateless
5. No transitive peering
:::

### VPC Lab 1

Step 1: go under Networking and to VPC
Step 2: create a VPC
![](https://i.imgur.com/tomdVxi.png)
Step 3: create 2 subnets
![](https://i.imgur.com/MfPLXfN.png)
Step 4: make one subnet the public subnet by turning on auto-assign public IPv4 addresses
Step 5: create an internet gateway and attach it to the VPC
Step 6: create a route table (not the default one that is created with the VPC)
Step 7: configure the created route table to allow a route to the internet
![](https://i.imgur.com/eDSpmHG.png)
Step 8: associate the public subnet with the new route table
Step 9: create 2 EC2 instances and deploy one to each subnet, one with the DMZ security group and one with the default security group
![](https://i.imgur.com/OpPNAST.png)

:::info
Exam tips
1. When you create a VPC, a default route table, a network access control list and a default security group are created at the same time.
2. It won't create any subnets nor an internet gateway.
3. An AZ in your account can be a completely different AZ in another account, even if they share the same name. AZs are randomised by AWS.
4. AWS always reserves 5 IP addresses within your subnets.
5. You can only have 1 internet gateway per VPC.
6. Security groups cannot span VPCs.
:::

### VPC Lab 2

Step 1: create a security group allowing access to the EC2 instance in the private subnet from the EC2 instance in the public subnet
Step 2: SSH from the public-subnet EC2 instance into the private-subnet EC2 instance
Step 3: create a NAT gateway or NAT instance so the EC2 instance in the private subnet can communicate with the internet without becoming public
![](https://i.imgur.com/1nsy5MB.png)
Step 3.1: use a community AMI to provision a NAT instance
> One must disable the source/destination checks for the NAT instance to work
Step 3.2: edit the route table associated with the private subnet to add the NAT instance as the target for internet-bound traffic
Step 3.3: create a NAT gateway
![](https://i.imgur.com/tahKMbm.png)
Step 3.4: edit the route table associated with the private subnet to add the NAT gateway as the target for internet-bound traffic
> A NAT gateway is AZ-specific; to support multi-AZ, a NAT gateway should be deployed in each AZ's public subnet.

:::info
Exam tips
1. A NAT instance must be in the public subnet
2. The amount of traffic a NAT instance supports depends on the instance size; use autoscaling groups and automate failover
3. NAT gateways are redundant inside the AZ
4. A NAT gateway starts at 5Gbps and scales to 45Gbps
5. No patching is required for NAT gateways
6. NAT gateways are not associated with security groups
:::
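Steps 3.3 and 3.4 from the CLI — the allocation, subnet and route table IDs are placeholders:

```bash
# A NAT gateway needs an Elastic IP and lives in the public subnet.
aws ec2 allocate-address
aws ec2 create-nat-gateway \
    --subnet-id subnet-0pub123 \
    --allocation-id eipalloc-0123

# Point the private subnet's route table at it for internet-bound traffic.
aws ec2 create-route --route-table-id rtb-0priv123 \
    --destination-cidr-block 0.0.0.0/0 --nat-gateway-id nat-0123
```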
### VPC Lab 2
Step 1: Create a security group allowing access to the EC2 instance in the private subnet from the EC2 instance in the public subnet
Step 2: SSH from the public subnet EC2 instance into the private subnet EC2 instance
Step 3: Create a NAT gateway or NAT instance so the EC2 instance in the private subnet can communicate with the internet without becoming public.
![](https://i.imgur.com/1nsy5MB.png)
Step 3.1: Use a community AMI to provision a NAT instance
> One must disable the source/destination checks for the NAT instance to work
Step 3.2: Edit the route table associated with the private subnet to add the NAT instance as the target for internet-bound traffic
Step 3.3: Create a NAT gateway
![](https://i.imgur.com/tahKMbm.png)
Step 3.4: Edit the route table associated with the private subnet to add the NAT gateway as the target for internet-bound traffic (see the sketch after the tips below)
> A NAT gateway is AZ specific; to support a multi-AZ setup, deploy a NAT gateway in the public subnet of each AZ.

:::info
Exam tips
1. A NAT instance must be in the public subnet.
2. The amount of traffic a NAT instance supports depends on its instance size; for high availability you need Auto Scaling groups, multiple subnets in different AZs and scripted failover.
3. NAT gateways are redundant inside the AZ.
4. A NAT gateway starts at 5 Gbps and scales up to 45 Gbps.
5. No patching is required for NAT gateways.
6. NAT gateways are not associated with security groups.
:::
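A hedged boto3 sketch of Steps 3.3 and 3.4: allocate an Elastic IP, create the NAT gateway in the public subnet, then point the private subnet's route table at it. The region, subnet ID and route table ID are placeholders.

```python
import boto3

ec2 = boto3.client('ec2', region_name='us-east-1')  # assumed region

# Step 3.3: a NAT gateway needs an Elastic IP and must live in the public subnet.
alloc_id = ec2.allocate_address(Domain='vpc')['AllocationId']
natgw_id = ec2.create_nat_gateway(
    SubnetId='subnet-0public',    # placeholder public subnet ID
    AllocationId=alloc_id,
)['NatGateway']['NatGatewayId']

# Wait until the gateway is available before routing through it.
ec2.get_waiter('nat_gateway_available').wait(NatGatewayIds=[natgw_id])

# Step 3.4: send the private subnet's internet-bound traffic via the NAT gateway.
ec2.create_route(
    RouteTableId='rtb-0private',  # placeholder private route table ID
    DestinationCidrBlock='0.0.0.0/0',
    NatGatewayId=natgw_id,
)
```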
### Network Access Control Lists vs Security Groups
1. When one creates a new custom NACL, it denies all inbound and outbound traffic by default.
2. Remember to allow ephemeral ports in the inbound and outbound rules, since responses come back on high-numbered ports.
3. Rules are evaluated in numerical order and the first match wins (e.g. if rule 100 allows a specific IP and rule 200 denies it, the decision is to allow).
4. A subnet can only be associated with 1 NACL at a time, and is always associated with one.
5. NACLs always act before security groups.
6. NACLs are stateless: responses to allowed inbound traffic are subject to the rules for outbound traffic (and vice versa).

### Custom VPC and ELB
Please note that an ELB needs at least 2 subnets, each in a different AZ.

### VPC Flow Logs
VPC Flow Logs is a feature that enables one to capture information about the IP traffic going to and from the network interfaces in one's VPC. Flow log data is stored using Amazon CloudWatch Logs. After one creates a flow log, one can view and retrieve its data in Amazon CloudWatch Logs.

Flow logs can be created at 3 levels:
1. VPC
2. Subnet
3. Network interface

:::info
Exam tips
1. You cannot enable flow logs for VPCs that are peered with your VPC unless the peer VPC is in your account.
2. You can tag flow logs.
3. After you have created a flow log, you cannot change its configuration. E.g. you cannot associate a different IAM role with the flow log.
4. Not all IP traffic is monitored. Examples of traffic that is not captured: traffic generated by instances when they contact the Amazon DNS server (if you use your own DNS server, traffic to it is logged); traffic generated by a Windows instance for Windows license activation; traffic to and from 169.254.169.254 for instance metadata; DHCP traffic; traffic to the reserved IP address for the default VPC router.
:::

### Bastion Hosts
A bastion host is a special-purpose computer on a network specifically designed and configured to withstand attacks. The computer generally hosts a single application, for example a proxy server, and all other services are removed or limited to reduce the threat to the computer. It is hardened in this manner primarily due to its location and purpose, which is either on the outside of a firewall or in a DMZ, and usually involves access from untrusted networks or computers.
![](https://i.imgur.com/Rgsf5G7.png)

### Direct Connect
Direct Connect provides a dedicated private network connection from one's on-premises datacenter to AWS, bypassing the public internet.
![](https://i.imgur.com/QC1FwOP.png)
![](https://i.imgur.com/N6ywhQ6.png)
![](https://i.imgur.com/6PxC5qB.png)
https://www.youtube.com/watch?v=dhpTTT6V1So&feature=youtu.be

### Global Accelerator
AWS Global Accelerator is a service in which you create accelerators to improve the availability and performance of your applications for local and global users. Global Accelerator directs traffic to optimal endpoints over the AWS global network. This improves the availability and performance of your internet applications that are used by a global audience. By default, Global Accelerator provides you with two static IP addresses that you associate with your accelerator. Alternatively, you can bring your own.
![](https://i.imgur.com/oy5VfCw.png)
Components:
1. Static IP addresses
2. Accelerator
3. DNS name
4. Network zone
5. Listener
6. Endpoint group
7. Endpoint

### VPC Endpoints
A VPC endpoint enables you to privately connect your VPC to supported AWS services and VPC endpoint services powered by PrivateLink, without requiring an internet gateway, NAT device, VPN connection or Direct Connect. Instances in your VPC do not require public IP addresses to communicate with resources in the service. Traffic between your VPC and the other service does not leave the Amazon network.

Endpoints are virtual devices. They are horizontally scaled, redundant and highly available VPC components that allow communication between instances in your VPC and services without imposing availability risks or bandwidth constraints on your network traffic.

Two types of VPC endpoints:
1. Interface endpoints: an elastic network interface with a private IP address that serves as an entry point for traffic destined to a supported service.
2. Gateway endpoints: support S3 and DynamoDB.
![](https://i.imgur.com/otE4RJw.png)

### VPC PrivateLink
Options for opening a service in your VPC to another VPC:
1. Open the VPC up to the internet
2. VPC peering
3. PrivateLink
![](https://i.imgur.com/618VDeK.png)

### Transit Gateway
Used to simplify network topology.
![](https://i.imgur.com/WDsvSCm.png)
Allows one to have transitive peering between thousands of VPCs and on-premises datacenters. Works on a hub-and-spoke model. Works on a regional basis, but one can have it across multiple regions. Can be used across multiple AWS accounts using RAM (Resource Access Manager). Route tables can be set up to limit how VPCs talk to one another. Supports IP multicast.

### VPN CloudHub
![](https://i.imgur.com/GDWTAvV.png)
If you have multiple sites, each with its own VPN connection, you can use AWS VPN CloudHub to connect those sites together. Hub-and-spoke model. Low cost and easy to manage. Operates over the public internet, but all traffic between the customer gateways and AWS VPN CloudHub is encrypted.

### Networking Costs on AWS
![](https://i.imgur.com/lbTR3qC.png)
1. Use private IP addresses over public IP addresses to save on cost.
2. Grouping instances in a single AZ and communicating over private IP addresses keeps the traffic free of charge, at the cost of high availability.

---
## Chapter 19 HA Architecture
### Elastic Load Balancers