Start Date: October 27th, 2018 @ 3:00 PM EDT
End Date: October 27th, 2018 @ 4:15 PM EDT
Internal Ticket: #265
Location: AMS1
Description: AMS1 Rack Level Outage
On Saturday, October 27th, 2018, Packet experienced a rack-level outage affecting a single rack of servers in our AMS1 facility. The outage started at 3:00 PM EDT and was resolved at 4:15 PM EDT.
During the outage, the affected servers lost all network access, but they never lost power and were not rebooted.
After troubleshooting, our networking team concluded that the outage was caused by a bug in the operating system of the TOR switch pair. The bug produced a split-brain condition: the primary switch lost its state, and the backup switch never promoted itself to primary routing member.
Full network connectivity was restored after our networking team reloaded both of the affected TOR switches and applied a temporary patch.
Packet will schedule a maintenance window in the coming days to upgrade the affected switches to a new OS version, and will alert all affected customers.
All times are in EDT.
Saturday, October 27th, 2018