The switch is fully operational again. Both line-cards have fully booted into the new firmware and are running in redundancy mode HOT. This incident has been resolved.
Posted Mar 18, 2020 - 10:42 CDT
Monitoring
The fail-over has completed. We are now monitoring to confirm that everything is fully operational. The total downtime for the switch was less than 4 minutes (3:01.91).
Posted Mar 18, 2020 - 10:36 CDT
Identified
Fail-over has been initiated.
Posted Mar 18, 2020 - 10:31 CDT
Investigating
We are getting memory alerts on the switch that feeds most of the colocation customers at the 1325 Tracy location. An IOS bug is causing memory failures during routine network operations. To resolve this issue, an immediate manual fail-over to the secondary management card is needed. The secondary management card has been modified to run a newer firmware that resolves the IOS issue; however, because of feature and version differences, this requires the secondary management card to be in a "STANDBY COLD" state instead of "STANDBY HOT". This will result in a longer downtime during the fail-over process, but it should fully resolve the issue. We are going to issue the fail-over command immediately and expect full recovery to take between 5 and 15 minutes.