LII experiences two-day outage
For those of you who missed it (we doubt that’s many of you), we just got back on the air after a little over two days of downtime. We experience slowdowns — usually related to networking problems, or an overloaded database back end — about two to four times a year. This is the first time in six years that we’ve been down for an extended period of time. A longer treatment of the issue, and some musings about it, are in LII Director Tom Bruce’s blog.
Sadly, it was our own fault. We pushed supposedly innocuous changes out without adequate testing, and they brought the site to a grinding halt. The fault turned out to be in code that had been running under lighter loads for more than a year without incident but did not scale well. Unfortunately, we had left ourselves without an easy line of retreat, which lengthened the time needed to restore service.
We are very sorry for any inconvenience this may have caused our users. Many of you have written or called us to inquire about our fate or to express support over the last few days. We’re very grateful to you. With a little luck, it will be more than six years before it happens again.