# DevOps Training Session 9 + 10: Cloud - Networking - AutoScaling VM
###### tags: `devops` `reliable` `research`

Hello! By the way, in this session I will go over some networking inside Azure. Let's do it --> [:coffee:](https://docs.google.com/presentation/d/1EyrcO5eekGP4tfN_I0YLgVmjlTQ7y4oHPLMOvTje7H0/edit?usp=sharing)

## Issue and resolve

- First I need to design the architecture: what we have now and what we want to build. So I drew 2 diagrams.
    - First: what we have
    ![](https://i.imgur.com/bM5EYN8.png)
    - Second: what we will build
    ![](https://i.imgur.com/hDOmHX1.png)
- In my situation, the issue is that assigning a public IP directly to the VM is very dangerous: anyone who learns that IP can attack the VM directly. Putting the public IP on a Load Balancer instead helps for 2 reasons:
    - First, you reduce the exposure. The VMs stay internal, and you need a gateway to deliver traffic between the 2 sides: the Load Balancer for traffic coming in, and a NAT Gateway for traffic going out.
    - Second, you decide which traffic can go through in one place, so security is managed on just one security group. --> In this setup you can also require 2FA for access into the VM.

--> That is why I chose this design for the system: put a Load Balancer in front to protect the network. Below are some Load Balancer fundamentals.
* [Load Balancer fundamentals](https://www.f5.com/glossary/load-balancer)
* [Azure Load Balancer](https://learn.microsoft.com/en-us/azure/load-balancer/load-balancer-overview)

## Coding and issues met while implementing the networking

**TL;DR** *This blog was written 1 month after I completed this session, so I don't remember every mistake I hit while getting the network working in Azure. If I remember anything, I will note it here.*

Coding
---

- Previously the module worked with one single VM. Now we switch to a VMSS (Virtual Machine Scale Set).
- So the code has to build the network for the scale set.
- Make sure the network gives traffic just one way in to your VM and one way back.
- First, we update the `/autoscaling` module, which is the VMSS mentioned above:
    - `data.tf` queries the existing resources so the provider can access them --> quite easy, like the way you import a lib in Python

```
## data.tf
data "azurerm_resource_group" "main" {
  name = "DevOpsIntern"
}

data "azurerm_ssh_public_key" "main" {
  name                = "DevOpsIntern"
  resource_group_name = data.azurerm_resource_group.main.name
}

data "azurerm_user_assigned_identity" "main" {
  name                = "${local.environment}-identityforVM"
  resource_group_name = data.azurerm_resource_group.main.name
}

data "azurerm_subnet" "main" {
  name                 = "${local.environment}-subnet"
  virtual_network_name = "${local.environment}-network"
  resource_group_name  = data.azurerm_resource_group.main.name
}

data "azurerm_lb" "main" {
  name                = "${local.environment}-LoadBalancer"
  resource_group_name = data.azurerm_resource_group.main.name
}

data "azurerm_lb_backend_address_pool" "main" {
  name            = "${local.environment}-backendlbConfiguration"
  loadbalancer_id = data.azurerm_lb.main.id
}

data "azurerm_application_security_group" "main" {
  name                = "${local.environment}-asg"
  resource_group_name = data.azurerm_resource_group.main.name
}
```

As I said, this makes sure you have everything you need to create a VMSS. The network is referenced here too, and we will take a look at it later. Next is the `VMSS.tf` file that configures the VMSS.

```
## VMSS.tf
# Create a VM scale set manually
resource "azurerm_linux_virtual_machine_scale_set" "main" {
  name                = "${local.environment}-vmss"
  resource_group_name = data.azurerm_resource_group.main.name
  location            = data.azurerm_resource_group.main.location
  sku                 = "Standard_B1s"
  instances           = 1
  admin_username      = "intern"

  admin_ssh_key {
    username   = "intern"
    public_key = data.azurerm_ssh_public_key.main.public_key
  }

  os_disk {
    storage_account_type = "StandardSSD_LRS"
    caching              = "ReadWrite"
  }

  source_image_reference {
    publisher = "Canonical"
    offer     = "UbuntuServer"
    sku       = "18.04-LTS"
    version   = "latest"
  }

  network_interface {
    name    = "${local.environment}-nic"
    primary = true

    ip_configuration {
      name      = "internal"
      primary   = true
      subnet_id = data.azurerm_subnet.main.id
      # Join the LB backend pool and the ASG; no public IP on the instances
      load_balancer_backend_address_pool_ids = [data.azurerm_lb_backend_address_pool.main.id]
      application_security_group_ids         = [data.azurerm_application_security_group.main.id]
    }
  }

  disable_password_authentication = true

  identity {
    type         = "UserAssigned"
    identity_ids = [data.azurerm_user_assigned_identity.main.id]
  }

  # The autoscale setting changes the instance count, so Terraform must not reset it
  lifecycle {
    ignore_changes = [instances]
  }

  # Rendered from the template, then Base64-encoded; runs on first boot
  user_data = base64encode(templatefile("${abspath(path.root)}/data/userdata.sh", {
    bucketname = var.bucket_name
  }))

  tags = local.common_tags
}
```

- **Data**: values prefixed with `data.` read directly from the data sources we imported in `data.tf`.
- **Local**: basically like data, but you define it yourself inside the module --> locals store values you reuse, so the code does not duplicate the same thing everywhere, and they are not sensitive.
- **Variable**: values you pass in from outside, like a key, an IP, or anything else you do not want to put in the code. Put them in variables and deliver :coffee:

Quite easy huh --> These are the three kinds of input you need to know to use Terraform, or the HashiCorp language (HCL).

- Look at the bottom of the code and you will see `user_data` --> this is the basic way to customize the VM the first time it runs. Put your script in the template file, Base64-encode it, and you get the result you want.

So why do we use a VMSS instead of a VM? Because it gives us HA: we can react to factors that come from inside the VM like CPU, RAM, or anything else that can affect the availability of your system --> we have the monitor resource for the VMSS

![](https://i.imgur.com/HSkAkmJ.png)

- As in the image, you can scale based on system metrics. Here we use the metrics from the VMs to scale automatically, like I do below

```
## monitor.tf
# Set the monitoring for the VMSS
resource "azurerm_monitor_autoscale_setting" "main" {
  name                = "${local.environment}-AutoscaleSetting"
  resource_group_name = data.azurerm_resource_group.main.name
  location            = data.azurerm_resource_group.main.location
  target_resource_id  = azurerm_linux_virtual_machine_scale_set.main.id

  profile {
    name = "${local.environment}-defaultProfile"

    capacity {
      default = 1
      minimum = 1
      maximum = 2
    }

    # Scale up: add one instance when average CPU is above 75%
    rule {
      metric_trigger {
        metric_name        = "Percentage CPU"
        metric_resource_id = azurerm_linux_virtual_machine_scale_set.main.id
        time_grain         = "PT1M"
        statistic          = "Average"
        time_window        = "PT5M"
        time_aggregation   = "Average"
        operator           = "GreaterThan"
        threshold          = 75
        metric_namespace   = "microsoft.compute/virtualmachinescalesets"
      }

      scale_action {
        direction = "Increase"
        type      = "ChangeCount"
        value     = 1
        cooldown  = "PT1M"
      }
    }

    # Scale down: remove one instance when average CPU is below 25%
    rule {
      metric_trigger {
        metric_name        = "Percentage CPU"
        metric_resource_id = azurerm_linux_virtual_machine_scale_set.main.id
        time_grain         = "PT1M"
        statistic          = "Average"
        time_window        = "PT5M"
        time_aggregation   = "Average"
        operator           = "LessThan"
        threshold          = 25
      }

      scale_action {
        direction = "Decrease"
        type      = "ChangeCount"
        value     = 1
        cooldown  = "PT1M"
      }
    }
  }

  notification {
    email {
      custom_emails = [var.email_notification]
    }
  }
}
```

It is quite easy to understand. There are two situations, scale up and scale down, and we look at scale up first:
- Up: when the average CPU of the VMSS goes above 75%, it adds one VM to the scale set
- Down: when the average CPU of the VMSS goes below 25%, it removes one VM from the scale set
- Timing is the thing you need to learn to set up right: `time_grain` is the granularity of the metric samples, `time_window` is how far back the data is collected and averaged, and `cooldown` is how long to wait after the last scale action before this one can happen

```
## Scale up
rule {
  metric_trigger {
    metric_name        = "Percentage CPU"
    metric_resource_id = azurerm_linux_virtual_machine_scale_set.main.id
    time_grain         = "PT1M"
    statistic          = "Average"
    time_window        = "PT5M"
    time_aggregation   = "Average"
    operator           = "GreaterThan"
    threshold          = 75
    metric_namespace   = "microsoft.compute/virtualmachinescalesets"
  }

  scale_action {
    direction = "Increase"
    type      = "ChangeCount"
    value     = 1
    cooldown  = "PT1M"
  }
}

## Scale down
rule {
  metric_trigger {
    metric_name        = "Percentage CPU"
    metric_resource_id = azurerm_linux_virtual_machine_scale_set.main.id
    time_grain         = "PT1M"
    statistic          = "Average"
    time_window        = "PT5M"
    time_aggregation   = "Average"
    operator           = "LessThan"
    threshold          = 25
  }

  scale_action {
    direction = "Decrease"
    type      = "ChangeCount"
    value     = 1
    cooldown  = "PT1M"
  }
}
```

One more thing I want to say about the monitor. When you implement this autoscale monitor, remember: do not delete the VMSS manually. If you do, you also need to delete the autoscale setting, because it stays behind and gets picked up as the default config when you create a VMSS with the same name --> the code will then break at the step that creates the monitor for the scale set. So be careful :coffee:

![](https://i.imgur.com/nCpzzVj.png)

Now the networking, which is the biggest topic of this session. Following the architecture image I posted at the top, we separate the NSG and the ASG when applying them to the scale set.

- The main thing you want to know is the difference between NSG and ASG. It depends on what you want to use them for.
![](https://i.imgur.com/PQ6JdTS.png)

- You can attach the NIC of a VM to both an NSG and an ASG. But when you want one NSG rule to apply to a whole group of VMs, you need an ASG for that.
- An ASG names the group a rule is assigned to, and helps you picture what each group of VMs is doing and for what purpose.
- An NSG can also attach directly to a VM through its NIC. That puts per-VM rules at the same level as the subnet rules, because the NIC gets its network from the subnet. Doing that means more to manage for every VM, so attach the NSG to the subnet instead --> then configure in the NSG which rules allow traffic to which group of VMs, or to just one VM.

The target of a rule is very important, because it decides which traffic is allowed into the VM --> this is what keeps the network secure. Detailed information is in the Reference section.

- Let's implement :coffee:

```
# Create network
resource "azurerm_virtual_network" "my_terraform_network" {
  name                = "${local.environment}-network"
  address_space       = ["10.0.0.0/16"]
  location            = data.azurerm_resource_group.current.location
  resource_group_name = data.azurerm_resource_group.current.name
  tags                = local.common_tags
}

# Create the subnet
resource "azurerm_subnet" "my_terraform_network_subnet" {
  name                 = "${local.environment}-subnet"
  resource_group_name  = data.azurerm_resource_group.current.name
  virtual_network_name = azurerm_virtual_network.my_terraform_network.name
  address_prefixes     = ["10.0.1.0/24"]
  service_endpoints    = ["Microsoft.Storage"]
}

# Create the ASG that the VMSS NICs join
resource "azurerm_application_security_group" "main" {
  name                = "${local.environment}-asg"
  location            = data.azurerm_resource_group.current.location
  resource_group_name = data.azurerm_resource_group.current.name
  tags                = local.common_tags
}

# Create a SecurityGroup: only HTTP and HTTPS are allowed in, and only to the ASG
resource "azurerm_network_security_group" "my_terraform_nsg" {
  name                = "${local.environment}-nsg"
  location            = data.azurerm_resource_group.current.location
  resource_group_name = data.azurerm_resource_group.current.name

  security_rule {
    name                                       = "HTTP"
    priority                                   = 1002
    direction                                  = "Inbound"
    access                                     = "Allow"
    protocol                                   = "Tcp"
    source_port_range                          = "*"
    destination_port_range                     = "80"
    source_address_prefix                      = "*"
    destination_application_security_group_ids = [azurerm_application_security_group.main.id]
  }

  security_rule {
    name                                       = "HTTPS"
    priority                                   = 1003
    direction                                  = "Inbound"
    access                                     = "Allow"
    protocol                                   = "Tcp"
    source_port_range                          = "*"
    destination_port_range                     = "443"
    source_address_prefix                      = "*"
    destination_application_security_group_ids = [azurerm_application_security_group.main.id]
  }

  tags = local.common_tags
}

# Create association via subnet with nsg
resource "azurerm_subnet_network_security_group_association" "main" {
  subnet_id                 = azurerm_subnet.my_terraform_network_subnet.id
  network_security_group_id = azurerm_network_security_group.my_terraform_nsg.id
}

# Public IP for the load balancer frontend
resource "azurerm_public_ip" "publicip_LB" {
  name                = "${local.environment}-lbpublicIP"
  resource_group_name = data.azurerm_resource_group.current.name
  location            = data.azurerm_resource_group.current.location
  allocation_method   = "Static"
  sku                 = "Standard"
  tags                = local.common_tags
}
```

As I said, I use the balancer for security: there is one way in and one way back, and all traffic goes through that one point. In particular, to keep it secure you need to leave out the SSH rule, so that nothing on the public network can connect into your private network.
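
The NSG above has no SSH rule, and Azure's default rules already deny inbound internet traffic that no rule allows. If you want that intent written down in the code, one option is an extra inline `security_rule` like this sketch. The rule name and the priority `1001` are my assumptions and not part of the original config.

```
# Sketch only: add inside azurerm_network_security_group.my_terraform_nsg
# Assumed name and priority; any free priority works because no rule allows port 22
security_rule {
  name                       = "DenySSHFromInternet"
  priority                   = 1001
  direction                  = "Inbound"
  access                     = "Deny"
  protocol                   = "Tcp"
  source_port_range          = "*"
  destination_port_range     = "22"
  source_address_prefix      = "Internet"
  destination_address_prefix = "*"
}
```

Keep it as an inline block, because the azurerm provider does not support mixing inline `security_rule` blocks with standalone `azurerm_network_security_rule` resources on the same NSG.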
For admin access, the thing to use instead is a **Bastion Host or VPN**.

Now the LB config, which looks something like this

```
# Create a load balancer
resource "azurerm_lb" "main" {
  name                = "${local.environment}-LoadBalancer"
  location            = data.azurerm_resource_group.current.location
  resource_group_name = data.azurerm_resource_group.current.name
  sku                 = "Standard"

  frontend_ip_configuration {
    name                 = "${local.environment}-publicIPlbConfiguration"
    public_ip_address_id = azurerm_public_ip.publicip_LB.id
  }
}

resource "azurerm_lb_backend_address_pool" "main" {
  loadbalancer_id = azurerm_lb.main.id
  name            = "${local.environment}-backendlbConfiguration"
}

# Health probes, one per service port
resource "azurerm_lb_probe" "healthcheckHTTP" {
  loadbalancer_id = azurerm_lb.main.id
  name            = "${local.environment}-probeHTTP"
  port            = 80
}

resource "azurerm_lb_probe" "healthcheckHTTPS" {
  loadbalancer_id = azurerm_lb.main.id
  name            = "${local.environment}-probeHTTPS"
  port            = 443
}

# Load-balancing rules: frontend port -> backend port, gated by the probe
resource "azurerm_lb_rule" "ruleHTTP" {
  loadbalancer_id                = azurerm_lb.main.id
  name                           = "${local.environment}-LBruleHTTP"
  protocol                       = "Tcp"
  frontend_port                  = 80
  backend_port                   = 80
  backend_address_pool_ids       = [azurerm_lb_backend_address_pool.main.id]
  frontend_ip_configuration_name = "${local.environment}-publicIPlbConfiguration"
  probe_id                       = azurerm_lb_probe.healthcheckHTTP.id
  # Outbound traffic is handled by the outbound rule below
  disable_outbound_snat = true
}

resource "azurerm_lb_rule" "ruleHTTPS" {
  loadbalancer_id                = azurerm_lb.main.id
  name                           = "${local.environment}-LBruleHTTPS"
  protocol                       = "Tcp"
  frontend_port                  = 443
  backend_port                   = 443
  backend_address_pool_ids       = [azurerm_lb_backend_address_pool.main.id]
  frontend_ip_configuration_name = "${local.environment}-publicIPlbConfiguration"
  probe_id                       = azurerm_lb_probe.healthcheckHTTPS.id
  disable_outbound_snat          = true
}

# Outbound rule so the VMs can reach the internet through the LB frontend
resource "azurerm_lb_outbound_rule" "name" {
  name                    = "${local.environment}-LBOutboundRule"
  loadbalancer_id         = azurerm_lb.main.id
  protocol                = "Tcp"
  backend_address_pool_id = azurerm_lb_backend_address_pool.main.id

  frontend_ip_configuration {
    name = "${local.environment}-publicIPlbConfiguration"
  }
}
```

- This LB needs several pieces:
    - **LB**: of course, the load balancer itself
    - **backend_pool**: this one is confusing for beginners, but you can imagine it as a pool that VMs are placed into, and every VM in the pool gets its traffic from the LB. The NICs attach to this pool: in the VMSS, the NIC's `ip_configuration` joins the pool. This is what makes it work like a LB, because traffic is routed to the whole group of VMs :coffee:
    - **Probe**: the health check for the place you want to route traffic to. If the probe does not succeed, the LB will not send connections to that service, like HTTP or HTTPS
    - **Rule**: the load-balancing rule that routes traffic from the frontend (the LB) to the backend (the VMs) on port 80 or 443, as long as the probe health check is successful
    - **Outbound rule**: the config that lets the VMs go out to the internet to get packages or anything else

![](https://i.imgur.com/M8Wqi68.png)

- One thing I want you to understand: make sure you have no SSH rule going to an important VM. This is a must.
- To debug a VM when something goes wrong you still need a way in --> use a Bastion Host. I did not do it in this session, but it is just config that specifies which VMs can be reached over SSH inside the private network
- The config is just standard; I will do a separate session on configuring the bastion for the private network

```
## Network.tf
resource "azurerm_public_ip" "bastion_public_ip" {
  name                = "${local.environment}-bastionPublicIP"
  location            = data.azurerm_resource_group.main.location
  resource_group_name = data.azurerm_resource_group.main.name
  allocation_method   = "Static"
  sku                 = "Standard"
}

resource "azurerm_subnet" "bastion_subnet" {
  # The subnet name must be exactly "AzureBastionSubnet"
  name                = "AzureBastionSubnet"
  resource_group_name = data.azurerm_resource_group.main.name
  # Assumes a data "azurerm_virtual_network" "main" lookup exists (not shown here)
  virtual_network_name = data.azurerm_virtual_network.main.name
  # Azure Bastion needs a /26 or larger subnet; the original /29 is too small
  address_prefixes = ["10.0.2.0/26"]
}

## Bastion.tf
resource "azurerm_bastion_host" "main" {
  name                = "${local.environment}-bastionHost"
  location            = data.azurerm_resource_group.main.location
  resource_group_name = data.azurerm_resource_group.main.name

  ip_configuration {
    name                 = "${local.environment}-bastionConfiguration"
    subnet_id            = azurerm_subnet.bastion_subnet.id
    public_ip_address_id = azurerm_public_ip.bastion_public_ip.id
  }
}
```

![](https://i.imgur.com/dhjziTt.png)

## Conclusion

- These are cloud best practices you need to know to get good at this
- How you use a LB depends on your needs, but it makes a big difference: it keeps things secure and gives you one place to handle traffic
- In the big picture, there are external LBs and internal LBs. Like I did here, pick based on what you have and what you want.
- For optimization, for price, for security, I think the LB is the best thing I have met in DevOps. Peace, and go to the next session to read about another problem :smile:
- VMSS is the practice to follow when you want scalable HA infra. It costs more in VMs, but it makes sure you get a stable system

![](https://i.imgur.com/sh1Lebq.png)

## Reference

- [Azure Load Balancer](https://registry.terraform.io/providers/hashicorp/azurerm/latest/docs/resources/lb)
- [NSG vs ASG](https://learn.microsoft.com/en-us/azure/virtual-network/application-security-groups)
- [VMSS](https://learn.microsoft.com/en-us/azure/virtual-machine-scale-sets/overview)
- [Bastion Host](https://learn.microsoft.com/en-us/azure/bastion/bastion-overview)
- [Metrics supported for autoscale triggers](https://learn.microsoft.com/en-us/azure/azure-monitor/essentials/metrics-supported)