Gradwell Service News

Monthly Archive September, 2008

shell.gradwell.net

We are currently experiencing an issue with our customer shell server, shell.gradwell.net, and our server team are investigating this at the moment.  We apologise for any inconvenience.  Customers can use our two other shell servers, ochre and newred.gradwell.net, during this outage.

Update 11:10: This problem was resolved just before 10:00. It was caused by a disconnected private LAN connection affecting only th-shell-1.gradwell.net.

Intermittent Outbound Calling Issues

We are currently experiencing an issue on our network with regards to outbound calling.

A number of customers are receiving 603 Declined errors from our PSTN gateways.

We have reported this to our server engineers and we are currently investigating this intermittent fault.

We will update this notice once further information has been received from our server engineers.

Update 12:05: We have now resolved this issue, which was caused by our back-end database slaves responding slowly.  These have now been restarted by our server engineers to clear any backlog and to ensure that call setups are working correctly.  This issue will have caused a small number of call setup failures since this morning, with a gradual increase in traffic until our internal monitoring alerted us to a more serious problem.  We apologise for any inconvenience caused by this partial outage on the network.

Infrastructure fault being investigated

We are currently investigating a fault with our back-end virtual systems.  This may affect multiple services, and our engineers are working urgently to resolve the issue as soon as possible.  We apologise for the inconvenience this will be causing customers and will update you as soon as more information is available.

Update 21:25: An on-site engineer in London has been authorised to manually restart one of the machines in our virtualised server cluster.  This will be done within the next few minutes to restore capacity to our network.  We are currently aware of two potential issues with our mail file servers (3 & 7) and these will be addressed as soon as possible.

Update 22:20: We are currently experiencing an extended outage of one of the servers in our virtual infrastructure cluster.  The majority of services are functioning correctly.  However, we are aware that the following services are experiencing issues:

  • One of our load-balanced POP3 servers is currently offline, which may result in intermittent POP3 collection problems.
  • Our back-end mail storage server, v-mail-file-3, is offline.  This will result in an outage for customers with mailboxes on this platform.  Mail will continue to be held on our inbound mail queues; however, delivery will be delayed and this will result in errors collecting mail.  We apologise for this inconvenience and assure all customers that we are working with our engineers to ensure the service is restored as soon as possible.
  • Our UK SSL site is currently offline.  Again, we are working to restore this as soon as possible.

Update 22:50: We have raised an urgent support request with VMware and one of their engineers is now working to restore service.  When this is completed we will work with VMware to establish why the automatic resilience measures did not maintain a running service.

Update 23:10: We have now restored the majority of services, including uk-ssl.com. We are currently waiting for a full disk check to complete on v-mail-file-3 and are performing a maintenance restart on v-mail-file-7 which we expect to interrupt service only briefly.

Update 00:00: The disk checks on the two file servers have now completed and we believe all services are back to normal.

Emergency maintenance on PSTN interconnect

Starts: 2008/09/26 23:00:00 Ends: 2008/09/26 23:05:00

We will be performing a restart of some of the equipment associated with the PSTN interconnect which handles inbound ported numbers late this evening, starting from 23:00 BST.  This maintenance should last no more than five minutes and will not affect outbound calls; however, customers with some inbound numbers may experience a short outage during which calls to those numbers fail to connect.  We apologise in advance for any inconvenience caused by these essential works.

PHP 5.0 cluster issue

We have experienced a problem with our PHP 5.0 cluster (not the PHP 5.2 cluster) which may have affected customers’ ability to access websites.  This was related to a server which was brought online at approximately 16:00 this afternoon and was resolved just after 18:00.  We apologise for the inconvenience this will have caused customers on this cluster.

VoIP Maintenance 30th Sept 2008

Starts: 2008/09/30 22:00 Ends: 2008/09/30 23:59

Gradwell will be carrying out routine maintenance on our VoIP platform on Tuesday evening.  During this time, the following VoIP services may be unavailable:

  • Voice mail
  • VoIP control panel

This affects all VoIP customers.

Planned maintenance work:

  • release support for our upcoming voice-to-text voice mail service
  • deploy improvements to Asterisk fault monitoring
  • restart heartbeat daemon used for registration database resilience

Filesystem checks on v-mail-file-9

We are currently running a full consistency check on v-mail-file-9 due to our monitoring system reporting a potential problem.  We currently expect this to finish within the next 30 minutes and will update this site if it takes longer than expected.

Please accept our apologies for any inconvenience this may cause.

Update 21:30: The maintenance finished successfully at around 21:15 and all services are back to normal.

Cobranded Partner Notice - Direct Debits

Online completion of Direct Debits has been disabled.

This means that customers, including cobranded end users, will all have to download a PDF form, print and sign it, and post it back to Gradwell.

This change has been made for security reasons, and will remain in place until further notice.

lon-ppc-3 emergency maintenance

Starts: 2008/09/19 17:00 Ends: 2008/09/19 17:05

We have been experiencing an intermittent issue with our NAT proxy (lon-ppc-3.gradwell.net) and will be performing a full restart of the server this evening at 17:00 and again at 22:00 on Sunday evening.  We expect this maintenance to last less than a couple of minutes; however, inbound and outbound dialing will not be available to customers using this proxy for the short period of the downtime.  Customers using public IP ranges, and those using other NAT proxies we operate, will not be affected.

Edinburgh Exchange Outage

BT are currently experiencing a service outage at the Leith exchange in Edinburgh.  A small number of Gradwell DSL customers on some services will be affected by this outage; affected customers have already been informed.  We will post updates once these have been received from BT.

Yesterday's Leith exchange outage has now been repaired by BT.  A fibre fault was located 100m from the exchange, and the service is now fully operational.

We apologise for the disruption that this may have caused to a small number of our customers.