Service Desk not available
Incident Report for OpsRamp
Postmortem

Issue

During the OpsRamp 5.1.3 upgrade, the system ran into a database problem that made the Service Desk component of the platform unavailable. No Service Desk tickets were lost. All tickets auto-created while Service Desk was unavailable were created with a time lag. No other functionality of the platform was affected.

Root Cause & Actions Taken

The issue was caused by MySQL table corruption during the OpsRamp 5.1.3 platform upgrade.

This happened around 10 PM PST on December 7th, when the ticket table failed to load data after a query optimization was applied to it during the upgrade cycle. The OpsRamp team identified the issue in less than 10 minutes and immediately initiated failover of all customers to DR. DR was functional around 11 PM PST, and all platform functionality, including ticket listing, was working normally. We started migrating current customers' URLs one by one.

The primary environment was fully restored around 2 AM PST, and all customers were switched back to the primary. The OpsRamp team ensured that no Service Desk tickets auto-created through email, alerts, etc. were lost. Manual ticket creation was not available during the outage.

As per the upgrade process, we carried out all required validations; in fact, all other OpsRamp environments were successfully upgraded last week and have been running smoothly.

We sincerely apologize for the inconvenience caused to our partners and their customers during this issue. Please do not hesitate to reach out to support (support@opsramp) for further queries/clarifications.

Posted Dec 10, 2018 - 04:57 PST

Resolved
During the upgrade of the OpsRamp 5.1.3 platform, the system ran into a database problem that caused only the Service Desk component to be unavailable for around 3.5 hours (between 10:10 PM PST Dec 7 and 1:30 AM PST Dec 8). The system is up now and processing all Service Desk tickets.

There is no loss of Service Desk tickets created before the impacted window. All tickets auto-created while Service Desk was unavailable are created with a time lag. There is no impact on other functionality.
Posted Dec 08, 2018 - 03:09 PST