# Incident report: Admin-Portal resource group destroyed
### Overview
On Wednesday, 2021-05-19, a failed deployment to the Dev environment resulted in the admin-portal resources being wiped out.
The issue blocked our development team from working on, and making further deployments to, the Dev environment for approximately 2 days.
### Timeline report
(All times are in UTC)
+ 2021-05-19T06:44:09: Deployment to Dev started.
Commit linked to the deployment task: https://dev.azure.com/powerfinance/Infrastructure/_git/Infrastructure/commit/60a11b9536af628afffbd3254db3bf6707f77a88
+ 2021-05-19T06:59:15: Triet cancelled the deployment after it had been running for more than 15 minutes, which likely caused some of the downstream issues.
+ 2021-05-19T07:05:00: Acknowledged that the Front Door deployment in TF had succeeded, but the other resources were gone. **Only the CDN and service endpoint were still intact**.
+ 2021-05-19T21:27:23: Triet attempted to redeploy the whole stack, but the Terraform state file was now locked because of the previously cancelled deployment.
+ 2021-05-19T21:46:07: Successfully unlocked the TF state file.
https://dev.azure.com/powerfinance/Infrastructure/_git/Infrastructure/commit/902e4cc8bf04c460837849a9977c7d58dc2b752e/
+ 2021-05-20T00:49:11: Successfully decoupled the data block that queries the existing Cosmos DB account, which had caused a TF reference error when rebuilding the whole deployment stack.
https://dev.azure.com/powerfinance/Infrastructure/_git/Infrastructure/commit/8cf705d0d182fa6833a14382fef6e58200dba788?path=%2Fmodules%2Fadmin_portal_funcapp%2Fmain.tf
+ 2021-05-20T01:06:06: Reran terraform plan and confirmed that the infrastructure plan matched what the Terraform state expected. The dependency conflict was fixed in the code.
> Build: https://dev.azure.com/powerfinance/Infrastructure/_build/results?buildId=1828&view=results
> Fix Commit: https://dev.azure.com/powerfinance/Infrastructure/_git/Infrastructure/commit/723b79982cb635257e368f776d9d46128d6a853f/
+ 2021-05-20T03:16:21: Attempted to redeploy Dev after the TF plan succeeded. Encountered strange errors; Terraform eventually crashed and failed the build.
> Build: https://dev.azure.com/powerfinance/Infrastructure/_build/results?buildId=1835&view=results
> Related commit: https://dev.azure.com/powerfinance/Infrastructure/_git/Infrastructure/commit/c67146b258992f4ba21bfc80f4215e5396cf1d13/
> Stack trace:
```log
panic: runtime error: invalid memory address or nil pointer dereference
2021-05-20T03:16:41.293Z [DEBUG] plugin.terraform-provider-azurerm_v2.49.0_x5: [signal SIGSEGV: segmentation violation code=0x1 addr=0x48 pc=0x4112527]
2021-05-20T03:16:41.293Z [DEBUG] plugin.terraform-provider-azurerm_v2.49.0_x5:
2021-05-20T03:16:41.293Z [DEBUG] plugin.terraform-provider-azurerm_v2.49.0_x5: goroutine 82 [running]:
2021-05-20T03:16:41.293Z [DEBUG] plugin.terraform-provider-azurerm_v2.49.0_x5: github.com/terraform-providers/terraform-provider-azurerm/azurerm/internal/services/cdn.resourceCdnEndpointDelete(0xc0009de8c0, 0x4f8fda0, 0xc0002c2900, 0x0, 0x0)
2021-05-20T03:16:41.293Z [DEBUG] plugin.terraform-provider-azurerm_v2.49.0_x5: /opt/teamcity-agent/work/5d79fe75d4460a2f/src/github.com/terraform-providers/terraform-provider-azurerm/azurerm/internal/services/cdn/cdn_endpoint_resource.go:514 +0x247
2021-05-20T03:16:41.293Z [DEBUG] plugin.terraform-provider-azurerm_v2.49.0_x5: github.com/hashicorp/terraform-plugin-sdk/helper/schema.(*Resource).Apply(0xc0005d2750, 0xc0014ae5a0, 0xc0016c5b80, 0x4f8fda0, 0xc0002c2900, 0x4fbc201, 0xc0014d38d0, 0xc00167eab0)
2021-05-20T03:16:41.293Z [DEBUG] plugin.terraform-provider-azurerm_v2.49.0_x5: /opt/teamcity-agent/work/5d79fe75d4460a2f/src/github.com/terraform-providers/terraform-provider-azurerm/vendor/github.com/hashicorp/terraform-plugin-sdk/helper/schema/resource.go:283 +0x4a6
```
+ 2021-05-20T04:45:54: Assumed the TF remote state was compromised, causing invalid references to nonexistent resources and the RPC connection failure. Made another attempt to remove resources from the state file.
+ 2021-05-20T05:54:50: Manually removed some resources created during the previous failed attempts and redeployed. The issue was now reproducible.
+ 2021-05-20T06:15:00: Got Geoff's help to understand the situation. Discovered that the way admin-portal was set up earlier may be the issue, and also found out that we did not follow best practices in the infrastructure setup (this can be dealt with later).
+ 2021-05-20T08:55:00: Searched online for help and found an issue on the azurerm GitHub project that likely indicates the real root cause of the deployment failure.
> GitHub issue: https://github.com/terraform-providers/terraform-provider-azurerm/issues/11231#issuecomment-819463222
> The cause:
>
> Before you delete Azure Front Door or Azure Content Delivery Network resources, remove their endpoint CNAME records from DNS
>
> You're receiving this email because you use Azure Front Door or Content Delivery Network.
>
> On 9 April 2021, we're updating Azure Front Door and Content Delivery Network to help prevent dangling DNS entries and the security risks they create. At that time, we'll start requiring the removal of canonical name (CNAME) records for Azure Front Door and Content Delivery Network resource endpoints from DNS before the resources can be deleted.
>
> To delete Azure Front Door or Content Delivery Network resources, you must first remove the resource endpoint CNAME records from DNS starting on 9 April 2021.
+ 2021-05-20T22:33:03: Based on the hint in the issue, asked Geoff to delete the CNAME record of the existing CDN endpoint. Then manually deleted the CDN and endpoint in the Azure portal, which succeeded.
+ 2021-05-20T22:49:03: Ran the redeployment again. Because the CDN and service endpoint had been deleted, the conflict on Azure no longer occurred. All Dev admin-portal resources were successfully redeployed.
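
The root cause above (a CNAME record still pointing at a CDN endpoint blocks its deletion) could be caught before a destroy or redeploy with a small pre-check. The following is only a minimal sketch: the `DnsRecord` type, the `blocking_cnames` helper, and the hostnames are hypothetical and are not part of our pipeline or of any Azure SDK. It assumes the DNS zone contents have already been exported into a simple list of records.

```python
from typing import Iterable, List, NamedTuple


class DnsRecord(NamedTuple):
    """Hypothetical, simplified DNS record; not an Azure SDK type."""
    name: str    # record name, e.g. "admin-dev.example.com"
    rtype: str   # record type, e.g. "CNAME" or "A"
    target: str  # record value, e.g. "admin-dev.azureedge.net."


def normalize(host: str) -> str:
    """Lower-case a hostname and drop surrounding whitespace and the trailing dot."""
    return host.strip().rstrip(".").lower()


def blocking_cnames(records: Iterable[DnsRecord], endpoint_host: str) -> List[DnsRecord]:
    """Return the CNAME records that still point at endpoint_host."""
    endpoint = normalize(endpoint_host)
    return [
        r for r in records
        if r.rtype.upper() == "CNAME" and normalize(r.target) == endpoint
    ]


# Example zone contents (made-up hostnames, for illustration only).
records = [
    DnsRecord("admin-dev.example.com", "CNAME", "admin-dev.azureedge.net."),
    DnsRecord("api-dev.example.com", "A", "10.0.0.4"),
]

blockers = blocking_cnames(records, "admin-dev.azureedge.net")
for record in blockers:
    print(f"Remove CNAME {record.name} -> {record.target} before deleting the endpoint")
```

If the returned list is not empty, the CNAME records in it would need to be removed from DNS first, as was done manually at 2021-05-20T22:33:03.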