[Resolved] [EMEA | DE] [Major Issue] Bubbles & Conferencing issue
Started on December 7, 2020 at 7:57:00 AM GMT+0. Resolved after about 13 hours
- Investigating: December 7, 2020 at 7:57:00 AM GMT+0 –
We have observed anomalies on some production systems, and the Rainbow Operations team is currently investigating this incident.
- Identified: December 7, 2020 at 8:05:00 AM GMT+0 –
A possible root cause has been identified and we are working on a fix for this incident.
- Identified: December 7, 2020 at 10:41:21 AM GMT+0 –
We are continuing to work on a fix for this incident. Rainbow Bubbles are still not operational.
- Monitoring: December 7, 2020 at 11:50:00 AM GMT+0 –
We implemented a fix and are currently monitoring the result. Bubbles are operational again.
- Monitoring: December 7, 2020 at 12:21:00 PM GMT+0 –
Existing bubbles are fully functional, but creating new ones is still causing issues. Android clients still have trouble connecting.
- Monitoring: December 7, 2020 at 2:03:03 PM GMT+0 –
While we were implementing a contingency plan to prevent outages and consolidate the stability of the Rainbow infrastructure, a critical problem arose in our Latin America infrastructure because a third-party service provider suffered a major outage due to a cooling system failure. This problem had a snowball effect that impacted users around the world. Rainbow connections, bubbles and conferences were affected. Most services were restored at 13:00 CET for users in EMEA, and the NAR and APAC regions are being updated. Our Operations team is working with the local IaaS provider to restore services for CALA users. Updates are being provided here on the status.openrainbow.com page. Our Operations and R&D teams are fully focused on fixing this issue as soon as possible. A detailed RCA will follow soon in our Help Center.
- Monitoring: December 7, 2020 at 3:57:00 PM GMT+0 –
We implemented a more comprehensive fix and are currently monitoring the result. Bubble creation is operational again and login from Android devices is restored.
- Resolved: December 7, 2020 at 8:45:00 PM GMT+0 –
The issue has been resolved. All systems are functional again.
The Root Cause Analysis is available in the Help Center.