A number of ESX virtual machines on ‘europa’ (one of our SAN units) are at risk due to unexpected and critical RAID issues with the system. We are in the process of moving a number of affected VMs off it whilst they remain live. Most of these are “clustered” services, and those which are potential points of failure are being prioritised for moving. At the moment all services are running as expected, and monitoring has not detected any impact upon availability or performance.
Update 22nd September: Very few services remain on this storage unit, and our server team have nearly completed moving all customer-facing services off it.
Update 25th September: This issue is now resolved and the hardware running ‘Europa’ has been decommissioned.
Users on newsip.gradwell.net (including migrated lon-pbx-7 users) may be seeing issues making and receiving calls. Our server admin team are working on this now and will post an update here as soon as possible.
We apologise for any problems this is causing.
UPDATE: Our system admins have applied a fix for this issue, which is working correctly. We will continue to monitor the situation and investigate the cause fully.
This evening from approximately 23:00 until 00:00, we will be performing maintenance on our web load-balancers as part of the work to remove the legacy FreeBSD cluster. There is a possibility of a brief outage on all web clusters as this work progresses. We will update this site once the maintenance has been completed, or with any further updates. Thank you for your patience during these essential works.
This maintenance has now been completed and all PHP 5 sites are now running from our new virtualised PHP 5.2 platform. Our legacy load balancer has been decommissioned.
We are currently moving our legacy Ezmlm support to a new virtual server so that the equipment which used to host it (ochre.gradwell.com) can be decommissioned.
Update 11:30: We have now moved the lists. IP address over to the new server ‘mailing-2’, which replaces Ochre for handling all Ezmlm-related mailing list functions, allowing us to decommission Ochre. We have tested basic list functionality to ensure that this change is transparent for customers.
Some users of our newsip platform, including users of lon-pbx-7, may be experiencing problems making outbound calls.
Our server admin team are aware of this and are working to resolve the issue. We will post an update here as soon as possible.
We apologise for any problems this is causing.
***Update - 10:10***
Our server admin team isolated the cause of the problem and implemented a fix at 09:36. We have fully tested and are now happy that our newsip platform is passing calls correctly.
If you are still seeing any problems, please raise an incident with our support team for further investigation.
We are performing scheduled maintenance on 16th September 2009 which
will involve a period of outage across all products and services.
During this maintenance window we will be:
- performing software upgrades to core network equipment
- performing software upgrades to DSL network equipment
- running additional cabling for the core network to ensure
tolerance of any important link failure, and to
pro-actively add capacity to areas of the network which
will experience significantly increased throughput over
the coming months as Gradwell continues to grow
- updating configuration on all aspects of the network to
fully support native IPv6 addressing
- upgrading switches used with our SAN solution to improve
storage reliability with the vSphere ESX 4.0 platform
- upgrading head nodes of the ESX storage solution to provide
increased reliability, throughput and manageability
The outage will begin on 16/09 at 03:00 (GMT) and we expect the work to be
completed within 60 to 90 minutes. Whilst the service may come up again quite
quickly, it should be considered “at risk” between 03:00 and 05:00.
Update: This work was completed with a minimal window of downtime, lasting about ten minutes.
Some users were having difficulty placing calls over IAX to iax-lb.gradwell.net.
Our server admin team found an issue with the load balancer. This has been rectified and full service has now been restored.
We apologise for any problems this has caused.
We have been informed by our carrier that a number of exchanges in the Devon and Cornwall area have lost connectivity due to a fault on the Surf network.
UPDATE 14:25 IP Engineers have rectified this issue. Connectivity has been restored to all affected exchanges and confirmed stable.
The affected exchanges are listed below for reference.
- barnstaple
- bideford
- bodmin
- brixham
- crediton
- dawlish
- exeter
- exmouth
- honiton
- ilfracombe
- kingsbridge
- newton-abbot
- newtonabbot
- ottery-st-mary
- otterystmary
- paignton
- par
- sidmouth
- st-austell
- st-marychurch
- staustell
- stmarychurch
- teignmouth
- tiverton
- torquay
- totnes
- truro
Customers on these affected exchanges will have no ADSL connectivity. Further updates will follow from our carrier when available.
We have been alerted by our ADSL carrier that approximately 4,000 ADSL sessions have been terminated due to a failed supervisor card. Our carrier has dispatched engineers to replace the faulty hardware.
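The list above contains some exchanges in both hyphenated and unhyphenated form (for example newton-abbot and newtonabbot). For anyone wanting to check an exchange name against the list programmatically, the following is a minimal Python sketch. The function names `normalise` and `is_affected` are illustrative only and are not part of any Gradwell or carrier tool.

```python
# Exchange names copied from the list above (hyphenated forms only;
# normalisation below makes the unhyphenated duplicates unnecessary).
AFFECTED_EXCHANGES = [
    "barnstaple", "bideford", "bodmin", "brixham", "crediton", "dawlish",
    "exeter", "exmouth", "honiton", "ilfracombe", "kingsbridge",
    "newton-abbot", "ottery-st-mary", "paignton", "par", "sidmouth",
    "st-austell", "st-marychurch", "teignmouth", "tiverton", "torquay",
    "totnes", "truro",
]


def normalise(name: str) -> str:
    """Lower-case the name and keep only letters and digits, so that
    'Newton Abbot', 'newton-abbot' and 'newtonabbot' compare equal."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


# Pre-compute the normalised set once for fast membership checks.
_AFFECTED = {normalise(name) for name in AFFECTED_EXCHANGES}


def is_affected(exchange: str) -> bool:
    """Return True if the given exchange name appears in the list above."""
    return normalise(exchange) in _AFFECTED


if __name__ == "__main__":
    for candidate in ("Newton Abbot", "St. Austell", "plymouth"):
        print(candidate, is_affected(candidate))
```

Stripping punctuation and spaces before comparing means a single entry per exchange is enough, whichever spelling is supplied.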
A list of affected exchanges has been issued and is included below for reference.
UPDATE 14:51 A further 2 exchanges, EGHAM and REDHILL, have also experienced the issue and will see a total loss of connectivity.
UPDATE 15:52 Engineers have now arrived at site to replace the affected supervisor card.
UPDATE 18:08 All affected exchanges are now back online and customers affected by this outage should be able to reconnect. If you cannot reconnect, please power off your router, wait 2 minutes and power it back on.
| LNHPK | highams-park |
| SDPNDHL | poundhill |
| CLNEW | new-cross |
| LSBRO | bromley |
| LSCHI | chislehurst |
| LSADD | addiscombe |
| LSCRO | croydon |
| LSBEX | bexleyheath |
| EAGRA | grays-thurrock |
| LSCTFD | catford |
| SSDOW | downend |
| WRBATT | battersea |
| LSGRNW | greenwich |
| WRNELMS | nine-elms |
| CLWAL | walworth |
| NDCOP | copthorne |
| LNADK | albert-dock |
| SDHRLY | horley |
| WRBRIX | brixton |
| LSBEU | beulah-hill |
| LSGIP | gipsy-hill |
| LSNCHM | north-cheam |
| LNPOP | poplar |
| LNCHF | chingford |
Our primary nameserver ns0.gradwell.com is currently not responding to queries.
Our server admin team are working on this now and we will post an update here as soon as possible. Our secondary nameservers are functioning without issue.
We apologise for any problems this is causing.
***Update - 15:38***
ns0 is now back up and running and resolving correctly.
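For customers who would like to confirm from their own connection that ns0 is answering queries, the following is a minimal Python sketch using only the standard library. It sends a single A-record query over UDP and reports whether any DNS reply comes back. The function names, the 3 second timeout and the choice of gradwell.com as the name to look up are illustrative assumptions, not an official test procedure.

```python
import socket
import struct


def build_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query packet asking for the A record of `name`."""
    # Header: ID, flags (recursion desired), 1 question, no other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE = 1 (A record), QCLASS = 1 (IN).
    return header + qname + struct.pack("!HH", 1, 1)


def nameserver_responds(server: str, name: str, timeout: float = 3.0) -> bool:
    """Return True if `server` sends back a DNS response to our query."""
    query = build_query(name)
    try:
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            sock.settimeout(timeout)
            sock.sendto(query, (server, 53))
            reply, _ = sock.recvfrom(512)
    except OSError:
        # Covers timeouts, name resolution failures and network errors.
        return False
    # A valid reply echoes our ID and has the QR (response) bit set.
    return len(reply) >= 12 and reply[:2] == query[:2] and bool(reply[2] & 0x80)


if __name__ == "__main__":
    # Assumed example: ask ns0 for the A record of gradwell.com.
    print(nameserver_responds("ns0.gradwell.com", "gradwell.com"))
```

A result of False only shows that no reply arrived within the timeout from where the script was run; it does not by itself distinguish a nameserver fault from a local network problem.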