Gradwell Internet for business people

Gradwell Service News

Monthly Archive March, 2010

RESOLVED: National Datastream Outage

We have received the following communication from BT Openreach regarding a National Datastream service outage affecting up to 37,500 circuits.

Description of Outage:

Several BBLCs terminating to Reading and Harbour Exchange are down.

All datastream customers terminating on rdg-0-dsl, he-1-dsl, and he-2-dsl, are affected by this outage.

Geographic location of affected services: (where possible to define).

Due to the nature of our network, affected exchanges could be anywhere in the UK.

Customers may experience issues with their ADSL connection: the router will be able to sync and obtain a connection, yet traffic will not pass (Web/Email/VoIP services will be unavailable).

Updates when received from BT Openreach will be posted here.

** 12:17 : 437 exchanges have been confirmed as affected by this outage. Customer Services have received a document detailing affected exchanges.

** 12:30 : BT have taken the decision to close North Paddington Exchange due to flooding. LLU services provided by Tiscali at this exchange only are experiencing the same issues as above.

***Update 13:45***

A full list of the affected exchanges can be found here - http://bit.ly/bedra6

** 17:30 : The first of 2 generators that BT have ordered for North Paddington Exchange has arrived onsite. Once flood water has been cleared, engineers will be tasked to install it.

***Update 20:25***

We have had an update from BT. They are saying that it may take up to 4 days to resolve the fault. They will be able to give a more definite time frame to fix once they are allowed inside the site by the fire department.

** 08:30 Tiscali LLU Services at North Paddington have been restored. Datastream connectivity issues are still ongoing.

** 12:15 Our carrier has announced that connectivity on rdg-0-dsl, he-1-dsl and he-2-dsl has been restored. Connectivity will be monitored for the next 6 hours. Any further updates received will be announced here. Customers should now be able to reconnect to our service, but may need to power cycle their ADSL router/modem in order to regain connections.

***Update 11:24 6th April***

DSL service should now be back to normal so we are happy to close this status update. If you are seeing any problems with DSL, please raise a support ticket and we will deal with these on a case-by-case basis.

RESOLVED: Fire at BT site

We are aware of a fire at a BT site in North Paddington, London and this is likely to affect the routing of calls that would have passed through that node. This will affect all communication providers that route through this node.

We have no reports of this affecting any of our users at present but are closely monitoring the situation and have contingency plans in place.

***Update 11:20***

We have lost connection to one exchange but our voice routes have been rerouted where necessary so no ill effects should be seen by any users. We will however leave this status update ‘at risk’ pending updates.

***Update 14:40***

We are starting to see some congestion over some of the affected inbound BT routes. Unfortunately this may last until about 16:00 when call volumes start to drop. We apologise for any problems this may be causing you and please be assured we are doing everything we can to make use of alternative routing to keep this disruption to an absolute minimum.

***Update 15:40***

Congestion issues should be near zero now: we have re-routed the majority of the traffic over alternative routes and call volumes are dropping off. We are still awaiting a further update from BT.

***Update 16:00 1/4/2010***

We have had word from BT that the main interconnects have been restored, but there will be congestion for some time until BT's central core manages to clear. We will continue to monitor.

***Update 11:20 6th April***

We are now seeing little or no congestion on BT routing so we are happy to close this status update. If you are still seeing any affected numbers, please raise a support ticket and we will deal with these on a case-by-case basis.

RESOLVED: Mail delays and quarantine failures

Some users might be seeing mail delays at present, and users who utilise ‘quarantine’ on their inbound mails will be especially affected. Our server admin team is working on this now and we will update here with any further information as soon as it is available.

We apologise for any problems caused by this.

***Update 11:50***

Mail queues should now be starting to return to normal and most users will see an influx of mail from earlier today. For the time being we have had to suspend the use of quarantine so users might also see an increase in spam messages. We will continue to monitor and update as necessary.

***Update 11:50 12th April***

We have released a further update on quarantine today in a separate status post.

RESOLVED - Control panel issues

Some users may be seeing some slow down or rejected connections to our control panels at present.

Our server admin team are working on this now and will have the fault rectified as soon as possible. We apologise for any problems caused by this.

***Update 12:43***

Full access to the control panels is now restored. We apologise again for any problems caused.

Multi-User VoIP DNS change

Starts: 2010/04/05 22:00 Ends: 2010/04/05 22:30

On the evening of Monday 5th April, we are changing the IP address associated with the VoIP server name “lon-pbx-6.gradwell.net”.  This is to bring the DNS for this server name in line with our standard DNS for our VoIP platform.  For reference, the IP will change from 193.111.201.27 to 79.135.125.154.

We do not believe that users will be affected by this change unless there is a firewall in place, as the new and old IPs will both continue to work for a transition period.  We anticipate removing the old IP from service within a few months.

Please note also that this change will only affect users who are using the legacy “lon-pbx-6” server name in their phone configuration.

If you believe you have a firewall in place we recommend you check this before Monday 5th to ensure you do not experience any problems with your service.
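For anyone checking a firewall ahead of the change, the following is a minimal sketch in Python of how the two addresses above could be compared against an allow-list. The function names and the `allowed` list format are assumptions made for illustration; they are not part of any Gradwell tool, and your firewall will have its own way of exporting rules.

```python
import ipaddress
import socket

# Addresses and server name quoted in the notice above.
OLD_IP = ipaddress.ip_address("193.111.201.27")
NEW_IP = ipaddress.ip_address("79.135.125.154")
HOSTNAME = "lon-pbx-6.gradwell.net"


def classify_address(ip):
    """Return 'old', 'new' or 'unknown' for an IPv4 address string."""
    addr = ipaddress.ip_address(ip)
    if addr == OLD_IP:
        return "old"
    if addr == NEW_IP:
        return "new"
    return "unknown"


def missing_from_allow_list(allowed):
    """Return the notice's addresses that no allow-list entry covers.

    `allowed` is a hypothetical list of IP addresses or CIDR ranges
    copied out of your firewall configuration.
    """
    networks = [ipaddress.ip_network(entry, strict=False) for entry in allowed]
    return [
        str(ip)
        for ip in (OLD_IP, NEW_IP)
        if not any(ip in net for net in networks)
    ]


def resolve_ipv4(hostname=HOSTNAME):
    """Look up the IPv4 addresses currently published for the server name."""
    return socket.gethostbyname_ex(hostname)[2]
```

For example, `missing_from_allow_list(["193.111.201.27"])` reports that the new address still needs adding, and passing each result of `resolve_ipv4()` to `classify_address` shows which of the two addresses your resolver currently returns.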

We apologise for any inconvenience this change may cause.  Please contact our support team if you have any queries.

RESOLVED: Inbound mail delays

We are currently seeing an abnormally large amount of inbound mail queued in our system awaiting delivery. We are in the process of bringing online extra servers to move the backlog while we continue to investigate the root cause.

UPDATED: All backlogged mail has now been delivered and mail is flowing normally through the system.

COMPLETED - Hosted Unified Comms platform upgrade

We will be upgrading the software version in use on our hosted unified comms platform on Monday the 29th of March between 20:00 and 22:00. Therefore the platform will be considered at risk between these hours and calls may not complete.

We will update here once the work has started and again when the work is completed. Our apologies for any problems this might cause you.

***Update 22:30 29th March***

The works completed a little early and we have been monitoring the platform since then. All calls are passing correctly and routing as expected so we are now closing this status update as completed. We apologise again for any downtime caused by this upgrade.

RESOLVED: Website failures - PHP 5.2 cluster

We earlier identified a problem with two of the machines within the PHP 5.2 cluster failing to respond correctly. Unfortunately, this was an unforeseen problem outside of our control. It may have caused some sites to fail to load or to take longer than usual to load.

We apologise for any problems caused by this.

COMPLETED: Core network maintenance 30-31/3 & 31/3-1/4

Our sysadmin team will be performing some core network upgrades and some hardware relocation between midnight and 05:00 on the above dates. This will put our entire network ‘at risk’ during these hours.

Tasks being undertaken are to:
* apply security updates to core networking infrastructure
* add additional capacity to internal links, to reduce risks of congestion
* relocate several pieces of core network infrastructure to increase resilience
* swap several pieces of core network infrastructure onto new power supplies

As this work is fairly dynamic, we cannot say exactly when each service will be affected, but all services will drop for at least 20 minutes. We will update again closer to the time with a reminder and any fresh information.

***Update 9:00 31st March***

Last night's work was successfully completed ahead of time and there are no reported issues with any of the changes made. Tonight's work will be going ahead as planned.

***Update 10:00 4th April***

We have been monitoring for 4 days and all systems seem stable so we are closing this status update.

RESOLVED: Multiple Exchange Outage

UPDATE 10:23am - This issue has now been resolved and connectivity restored. The root cause was a supervisor card failing to switch over to a resilient card.

We have been informed by our ADSL carrier that multiple exchanges have lost backhaul connectivity. BT and Opal 3rd line IP engineers are investigating and further updates will be posted.

A list of affected exchanges can be found below:

  • wandsworth
  • merton-park
  • richmond-kew
  • north-finchley
  • ponders-end
  • epsom
  • putney
  • shoreditch
  • surbiton
  • malden
  • willesden
  • wimbledon
  • aberdare
  • worcester-park
  • tottenham
  • cricklewood
  • kingston
  • bayswater
  • hendon
  • teddington
  • lower-holloway
  • walton-on-thames
  • southbank