Please find below an abridged version of the reason for outage [RFO] supplied in relation to the Broadband Authentication outage. We are working with our supplier to ensure they mitigate the risk of recurrence and apologise for any inconvenience this may have caused.
A full copy of the RFO is available upon request.
Our upstream supplier Wavenet experienced an outage on one of their core routing devices. This impacted Gradwell customers with TalkTalkBusiness or Zen Internet connections: their lines dropped authentication and were unable to reauthenticate until the issue was resolved.
The supplier outage lasted between 11:05 GMT 30th January and 13:15 GMT 30th January.
Wavenet network operations centre [NOC] engineers performed full diagnostics and, following a thorough investigation, identified a routing change to a BGP peer on an edge router, which caused the BGP process to stop running. Wavenet NOC engineers immediately made the correction to restore the BGP process on the affected device at 11:13 GMT. Monitoring identified that over 75% of sessions instantly re-connected following the restoration work.
We received further alerting at 11:20 GMT, identifying slow responses from the primary authentication server due to increased demand. Load balancing was altered to steer more authentication requests to the secondary authentication server, and at approximately 11:50 GMT our NOC engineers confirmed that the primary RADIUS server had stabilised.
The root cause has been identified as human error during a standard network change to an edge router.
A detailed review of the process within the Wavenet network engineering team will be completed. It will include a review of whether it is appropriate to continue using manually executed commands.