My first postmortem - ALX

Issue summary

The issue arose because the server was left on its default inactivity timeout, which made it unreachable when a client tried to communicate with it.

Timeline:

2023-07-04, 6:00 AM EAT: Project released.
2023-07-04, 9:00 AM EAT: I began the project.
2023-07-04, 9:20 AM EAT: Everything was working fine. I went on a 30-minute break.
2023-07-04, 9:50 AM EAT: I tried to reach my server using curl and received an unreachable status. Ping returned destination unknown.
2023-07-04, 10:00 AM EAT: I logged in to the Ubuntu server and checked the nginx status and logs, only to find that the web server was down.

Root cause and resolution

After analysing the nginx error logs, I realised that the server had gone to sleep after 20 minutes of inactivity. This caused the error when a client tried to communicate with it. To resolve it, I wrote a Puppet script that sets up a cron job to ensure the web service is active every 15 minutes, which avoids this kind of downtime.
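
The check that such a cron job runs could look roughly like the sketch below. This is not the Puppet script I used, only an illustration in Python of the "check, then start if down" logic. It assumes the server uses systemd and that the service unit is named nginx; the function names are made up for this example.

```python
import subprocess


def nginx_is_active(run=subprocess.run):
    """Return True if systemd reports the nginx unit as active.

    Assumption: the host uses systemd and the unit is called "nginx".
    """
    result = run(["systemctl", "is-active", "--quiet", "nginx"])
    return result.returncode == 0


def ensure_nginx_running(run=subprocess.run):
    """Start nginx if it is not active.

    Returns True if a start was attempted, False if nginx was already up.
    """
    if nginx_is_active(run):
        return False
    run(["systemctl", "start", "nginx"])
    return True


if __name__ == "__main__":
    ensure_nginx_running()
```

Saved under a path of your choice (for example the hypothetical /usr/local/bin/ensure_nginx.py), it could be scheduled from root's crontab with an entry such as `*/15 * * * * /usr/bin/python3 /usr/local/bin/ensure_nginx.py` to match the 15-minute interval.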

Corrective and preventative measures:

At the time of this downtime, only one web server was serving the site, which made it a single point of failure (SPOF). A good preventive measure is therefore to use a load balancer to distribute traffic across multiple servers, so that one server going down does not cause a total outage.
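
To show the idea, here is a minimal Python sketch of round-robin selection that skips servers that are down. It is only an illustration: in practice the balancing would be done by a dedicated load balancer such as HAProxy, not by hand-written code. The server names and the `is_healthy` check are hypothetical.

```python
from itertools import cycle


def make_picker(servers, is_healthy):
    """Return a function that picks the next healthy server in round-robin order.

    `servers` is a list of server names and `is_healthy` is a callable that
    returns True for a server that can take traffic (both are assumptions
    made for this example).
    """
    rotation = cycle(servers)

    def pick():
        # Try each server at most once per call, so the loop always ends.
        for _ in range(len(servers)):
            server = next(rotation)
            if is_healthy(server):
                return server
        return None  # every server is down

    return pick


if __name__ == "__main__":
    down = {"web-01"}  # pretend the first server is down
    pick = make_picker(["web-01", "web-02"], lambda s: s not in down)
    print(pick())
    print(pick())
```

With web-01 marked as down, every request goes to web-02, so the site stays reachable instead of failing completely.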

Thank you for reading my article.
