Downtime Resolved And Source Discovered
Alrighty, let’s get this whole “downtime” issue out of our vocabulary now. That was a very long and unnecessary stretch, and we’re very sorry for the inconvenience. We received scads of phone calls, emails, and tweets about the service being down, and thankfully (kudos to all of you!) none of the comments scared us off.
Again, you are why we do this.
So, thank you for being so nice when we’ve struggled.
Here’s why we went down:
- It started with some crazy database crashes, which we thought we had fixed quickly by increasing the number of connections allowed to the database.
- After restarting the database service and Apache quite a few more times… and scratching our heads… we finally figured out that the database kept crashing because we had run out of space on our hard drive. That crashed the database, did nutty things to the tables, and more.
- So, we added some notifications and enabled some restrictions so that we get alerts now when we get close to capacity. The nice thing is the table in question is just a data repository – not needed for operating anything – so we just backed it up for posterity, emptied the table, and voila, we’re under capacity and everything’s back to normal.
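For the curious, here is a rough idea of what a "close to capacity" check can look like. This is only a minimal sketch in Python, not the actual monitoring we set up: the function names `capacity_alert` and `check_path`, the 90% threshold, and the idea of printing the warning are all assumptions made for illustration.

```python
import shutil
from typing import Optional


def capacity_alert(used: int, total: int, threshold: float = 0.9) -> Optional[str]:
    """Return a warning message once used/total reaches the threshold, else None.

    The 0.9 default is an illustrative assumption, not a recommended value.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    fraction = used / total
    if fraction >= threshold:
        return f"Disk at {fraction:.0%} of capacity (threshold {threshold:.0%})"
    return None


def check_path(path: str = "/", threshold: float = 0.9) -> Optional[str]:
    """Measure the filesystem that holds `path` and apply capacity_alert to it."""
    usage = shutil.disk_usage(path)  # named tuple with total, used, free (bytes)
    return capacity_alert(usage.used, usage.total, threshold)


if __name__ == "__main__":
    # Hypothetical usage: run this on a schedule and send the message
    # wherever alerts should go (email, chat, pager) instead of printing it.
    message = check_path("/")
    if message:
        print(message)
```

Keeping the threshold comparison separate from the disk measurement makes the logic easy to try with made-up numbers before pointing it at a real drive.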
What we did find out in the process was that we hit 26 million queries. That’s just crazy. With this kind of service, doing this type of maintenance every 1-2 months is doable, but it’s not really at the top of my list of things to do on a Saturday.
Thus, we’ve come to this… we’re considering how to monetize this service so that we can pay for either moving into the cloud or increasing our service at our outstanding host.
If you have any ideas – things you’re willing and wanting to pay for – we’re all ears. Paying for the service, even in small chunks for the biggest users, would be a huge step up for stability. “All for one, and one for all.” We’re for you.
Bring your thoughts … we want to hear them.
