One line of code that did price $8,000

TLDR

This screenshot may not look so scary at first, however check out the size of it. For over a month, we generated at the very least 100Mib/s (a second!) and, at instances, nearly 1GiB/s (each single second!)
That bug was painfully easy and silly.
Display Studio is a desktop app. It means we want some auto-updater to permit customers to put in the newest app model simply.
The app checks for the replace each 5 minutes or when the person prompts the app.
Usually, when the app detected the replace – it downloaded it and stopped the 5 minutes interval till the person put in it and restarted it.
Tragic refactor
The issue with the auto-updater we had was that it could immediate the person to replace the app as quickly because it grew to become out there. This resulted in a popup showing whereas customers had been recording the display, which clearly offered a nasty expertise because it interrupted the recording the person was making.
Whereas refactoring it, I forgot so as to add the code to cease the 5-minute interval after the brand new model file was out there and downloaded.
It meant the app was downloading the identical 250MB file, over and over, each 5 minutes.
Tragic context – app operating within the background for weeks
It seems hundreds of our customers had the app operating within the background, despite the fact that they weren’t utilizing it or checking it for weeks (!). It meant hundreds of customers had auto-updater continually operating and downloading the brand new model file (250MB) over and over each 5 minutes
The maths
Let’s do some fast math right here.
- Doing one thing each 5 minutes means doing it roughly 288 instances a day.
- The replace file is about 250 MB, that means 72 GB of downloads per person each day.
- We had this example taking place for over a month earlier than we observed it.
- We had at the very least a thousand such app situations operating within the background at any second.
- 250 MB * 288 downloads per day * 30 days * 1000 customers:

It means it was roughly:
- or 2 petabytes of site visitors.
Collection of unhealthy errors
We didn’t have price alerts on Google Cloud. Earlier than this example occurred, we had been paying at most $300 a month.
We had been additionally not commonly checking the state of affairs because it simply labored.
We observed it as a result of my bank card began to dam the transaction attributable to limits I had set on it (fortunate me!).

Penalties for the customers
It was not solely unhealthy for us however even worse for among the customers.
As talked about, the app was producing a lot site visitors. It means it was their machine producing community site visitors on their house router and their web supplier.
One in every of our customers, who lived in a home, had their web supplier cancel their contract attributable to monumental site visitors generated throughout a month. It was extraordinarily problematic as there was no different web supplier out there round.
We determined to take accountability and supply to cowl all the prices associated to this example.
Fortunately, it was not wanted because the individual might determine the state of affairs with the supplier with out greater issues.
That was, nonetheless, fairly a horrible expertise for that individual and me. As a designer, I worth the expertise product I create offers to the customers. And this was not even a nasty expertise; it was really dangerous.
Summarising
- Set alerts in your cloud always.
- Write your auto-updater code very rigorously.
- Really, write any code that has the potential to generate prices rigorously.
- Add particular alerts you may change in your server, which the app will perceive, resembling a compelled replace that may set up with out asking the person.
- Commonly examine your cloud.