Skip to content

2025-01-23 11AM

What was affected

  • Documentation
  • Documentation Dev

What caused it

I regularly check the logs for my websites and saw there was significant amount of requests to a single page coming from an IP address

103.180.118.180 - - [18/Jan/2025:20:06:59 +0000] "GET /favicon.ico HTTP/1.1" 200 32060 "https://documentation.breadnet.co.uk/automation/iac/terraform/failed-to-get-existing-workspaces-querying-Cloud-Storage-failed-storage-bucket-doesnt-exist/" "Mozilla/5.0 (Linux; Android 13; Infinix X6815D Build/TP1A.220624.014; wv) AppleWebKit/537.36 (KHTML, like Gecko) Version/4.0 Chrome/131.0.6778.260 Mobile Safari/537.36" "103.180.118.180, 66.241.125.86"

img.png

They were constantly making requests to https://documentation.breadnet.co.uk/automation/iac/terraform/failed-to-get-existing-workspaces-querying-Cloud-Storage-failed-storage-bucket-doesnt-exist/, specifically the favicon, which is weird

I have had blocking on this site since around 9th of December, 2022 but have not used it that much. Recently I have started to take that more serioudly

On the 17th of Jan I blocked Hetzner range as there was a DDOS attack coming from them via scraping, so I put together a custom block page incase legitimate users get caught up in the net

On the 23rd of January, I made a PR number 481 which blocked a new range of spammers, pt persada data multimedia - AS149359

During the editing of the deny file, I made a mistake and did not finish the final line with a ; as you can see here

This caused NGINX to crash loop and in turn fail the fly.io health check

img_1.png

What monitoring was in place

I run an internal deployment of Gatus which alerted to this at 2025-01-18 11:29:15

img_2.png

Gatus is set up to push alerts in to an Operations channel on Mattermost

img_4.png

Issue being, I muted this channel, so I never got the alert

img_3.png

I only realized the site was down when I went to Slack and was confused as to why the media icon was not loading in a Thread

What was done to resolve it

Checked the PR and realized that I did not finish the line with ;, instead '

I resolved this in PR 482 where I removed the erroneous character

What did we learn

  • No docker containers validate the nginx config
  • No Validation on nginx files in CI

Checking both Dockerfiles (even tho I don't use docker) there is no validation step

This should look like the below

COPY deny.conf /etc/nginx/deny.conf
COPY .htpasswd /etc/nginx/.htpasswd
+ RUN nginx -t

# Copy built site from the builder stage (the heaviest change)
COPY --from=BUILDER /app/site /var/www/documentation

Want to make this site better? Open a PR or help fund hosting costs