[CALA] [Major Issue] All services down

Resolved
Major outage
Started over 3 years ago
Lasted about 15 hours

Affected

Caribbean & Latin America (CALA)
[CALA] Rainbow Core Services
[CALA] Rainbow Conferencing
[CALA] Rainbow Media Relays
Updates
  • Resolved
    Resolved

    Our monitoring and log systems confirm that service is now fully restored to nominal levels.

    The Root Cause Analysis is available in the Help Center.

  • Monitoring
    Monitoring

    We have checked the various hardware configurations and everything appears to be up and running. Services are restored. However, our IaaS Cloud provider has not yet closed the incident on its side, so we are continuing to monitor the situation.

  • Identified
    Update

    Our IaaS Cloud provider has progressively restored power and network in the facility. We have regained access to a few servers. We are currently assessing the issues and working to recover from the situation.

  • Identified
    Update

    Our IaaS Cloud provider has fixed the cooling issue and the temperature has reached a nearly acceptable operating level. The Cloud provider is progressively restarting hardware and network components. A further update will be provided once our servers are powered up.

  • Identified
    Update

    While we were implementing a contingency plan to prevent outages and consolidate the stability of the Rainbow infrastructure, a critical problem arose in our Latin America infrastructure: a third-party service provider suffered a major outage caused by a cooling system failure. This problem had a snowball effect that impacted users around the world. Rainbow connections, bubbles and conferences were affected. Most services were restored at 13:00 CET for users in EMEA, and the NAR and APAC regions are being updated. Our Operations team is working with the local IaaS provider to restore services for CALA users. Updates have been posted here on the status.openrainbow.com page. Our Operations and R&D teams are fully focused on resolving this incident as soon as possible. A detailed RCA will follow soon in our Help Center.

  • Identified
    Update

    Our IaaS Cloud provider says it should take another 1-2 hours to stabilize temperatures. CALA data-center availability is expected in 2 hours. This is the current estimate and may change as the situation evolves.

  • Identified
    Update

    Our IaaS Cloud provider has confirmed a leak in the cooling subsystem of the CALA data-center. The local teams have fixed the leak and now need to fill the cooling tank (announced ETA: at least 2-3 hours).

  • Identified
    Update

    We are continuing to work on a fix for this incident.

  • Identified
    Update

    Our IaaS Cloud provider is continuing its investigation, but the temperature is still above the threshold and the whole data-center is down.

  • Identified
    Update

    Our IaaS Cloud provider has managed to lower the overall temperature slightly, but not enough to reach a nominal state. The data-center is still non-operational.

  • Identified
    Update

    Our IaaS Cloud provider has confirmed a cooling incident at their local facility. All associated services in CALA are currently powered down.

  • Identified
    Identified

    A possible root cause has been identified and we are working on a fix for this incident.

  • Investigating
    Investigating

    We have observed some anomalies in our production systems, and the Rainbow Operations team is currently investigating this incident.