Summary
Between 14:07 and approximately 16:45 on June 13th 2024, 90% of our hosted community capacity was unavailable.
This outage was triggered by a routine restart of servers on our platform. The restart was required to apply necessary kernel upgrades to these machines.
This restart flushed the cache we use to store information on premium extension subscriptions. With that data gone, every community tried to retrieve fresh information from the Extiverse API, and those requests kept failing: the Extiverse API was sunset last week when Extiverse merged into the new website under Flarum.org.
We use premium extensions on our hosting platform so that clients can run their community with their unique theme, without these themes showing up for everyone. These are called private premium extensions. We also block or allow enabling premium extensions based on your active subscriptions.
The event was detected immediately by @BartVB, who noticed a spike in errors from several communities. @luceos followed up right away by identifying the root cause and working through the following resolutions:
- implement the missing API features on flarum.org that the Extiverse API client expected from its server,
- patch the API client to point to flarum.org,
- update the managed Flarum skeleton with the new API client,
- disable throttling on Flarum.org for the API endpoint.
Root cause
The dependency of the managed hosting platform on the Extiverse API had not been identified before the API was sunset. A cache existed as a fallback, but once the restart flushed it, every community fell through to the retired API.
Lessons learned
- Document external services the managed Flarum skeleton interacts with.
- Implement a fallback in case both cache and API fail when enriching Flarum extension information with premium extension information.
- Refactor the Extiverse API client as Flarum BV and move it under that namespace.
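The fallback described in the second lesson above could look roughly like the sketch below. It is an illustration under stated assumptions, not our actual implementation: the class, field, and function names are hypothetical, the "snapshot" dictionary stands in for any storage that survives a restart, and the real platform code is not written in Python.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

# Hypothetical type: extension id -> whether the community may enable it.
PremiumInfo = Dict[str, bool]


@dataclass
class PremiumInfoResolver:
    # Hypothetical API call; expected to raise when the API is unreachable.
    fetch_from_api: Callable[[str], PremiumInfo]
    # Volatile cache: this is what a server restart flushes.
    cache: Dict[str, PremiumInfo] = field(default_factory=dict)
    # Stand-in for durable "last known good" storage (an assumption).
    snapshot: Dict[str, PremiumInfo] = field(default_factory=dict)

    def resolve(self, community: str) -> Optional[PremiumInfo]:
        # 1. Fast path: serve from the volatile cache.
        if community in self.cache:
            return self.cache[community]
        # 2. Cache miss: ask the API.
        try:
            info = self.fetch_from_api(community)
        except Exception:
            # 3. Cache and API both failed: serve the last known good data,
            #    or None so the caller can skip enrichment instead of erroring.
            return self.snapshot.get(community)
        # Remember the fresh answer in both stores.
        self.cache[community] = info
        self.snapshot[community] = info
        return info


def retired_api(community: str) -> PremiumInfo:
    # Simulates an API that has been sunset.
    raise ConnectionError("API has been sunset")


resolver = PremiumInfoResolver(fetch_from_api=retired_api)
resolver.snapshot["example-community"] = {"acme/private-theme": True}
print(resolver.resolve("example-community"))  # served from the snapshot
print(resolver.resolve("unknown-community"))  # no data: caller skips enrichment
```

The point of the third step is that a community keeps serving pages even when premium extension information cannot be enriched, rather than failing the whole request.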
Our apologies for the considerable downtime our communities have experienced. We are doing our best to improve our platform going forward and to prevent this kind of service interruption from happening again.
If you would like to learn more or have any questions, we are all available through the usual channels, or feel free to use the contact page.