Gradwell Internet for business people

Gradwell Service News

Monthly Archive August, 2009

COMPLETED: Notice of shared SAN storage at risk

A number of ESX virtual machines on ‘europa’ (one of our SAN units) are at risk due to unexpected and critical RAID issues with the system. We are in the process of moving a number of affected VMs off this unit while they remain live. Most of these are “clustered” services, and those which are potential points of failure are being prioritised for moving. At the moment all services are running as expected, and monitoring has not detected any impact on availability or performance.

Update 22nd September: Very few services remain on this storage unit, and our server team have nearly completed moving all customer-facing services off it.

Update 25th September: This issue is now resolved; the hardware running ‘Europa’ has been decommissioned.

RESOLVED - newsip.gradwell.net

Users on newsip.gradwell.net (including migrated lon-pbx-7 users) may be seeing issues making and receiving calls. Our server admin team are working on this now and will post an update here as soon as possible.

We apologise for any problems this is causing.

UPDATE: Our system admins applied a fix for this issue which is working correctly. We will be continuing to monitor the situation and investigating the cause fully.

COMPLETED: Maintenance work for web load balancers

This evening from approximately 23:00 until 00:00, we will be performing maintenance on our web load balancers as part of the work to remove the legacy FreeBSD cluster.  There is a possibility of a brief outage on all web clusters as this work progresses.  We will update this site once the maintenance has been completed, or with any further updates.  Thank you for your patience during these essential works.

This maintenance has now been completed; all PHP 5 sites are now running from our new virtualised PHP 5.2 platform.  Our legacy load balancer has been decommissioned.

COMPLETED: Ezmlm migration

We are currently moving our legacy Ezmlm support to a new virtual server so that the equipment which used to host it (ochre.gradwell.com) can be decommissioned.

Update 11:30: We have now moved the lists. IP address over to the new server ‘mailing-2’, which replaces Ochre for handling all Ezmlm-related mailing list functions, allowing us to decommission Ochre.  We have tested basic list functionality to confirm that this change is transparent for customers.

RESOLVED - newsip.gradwell.net call issues

Some users of our newsip platform, including users of lon-pbx-7, may be experiencing problems making outbound calls.

Our server admin team are aware of this and are working to resolve this issue. We will post an update here as soon as possible.

We apologise for any problems this is causing.

***Update - 10:10***

Our server admin team isolated the cause of the problem and implemented a fix at 09:36. We have fully tested and are now happy that our newsip platform is passing calls correctly.

If you are still seeing any problems, please raise an incident with our support team for further investigation.

COMPLETED: Scheduled maintenance for core network - 16th Sep.

We are performing scheduled maintenance on 16th September 2009 which
will involve a period of outage across all products and services.

During this maintenance window we will be:

  • performing software upgrades to core network equipment
  • performing software upgrades to DSL network equipment
  • running additional cabling for the core network to ensure
    tolerance of any important link failure, and to
    pro-actively add capacity to areas of the network which
    will experience significantly increased throughput over
    the coming months as Gradwell continues to grow
  • updating configuration on all aspects of the network to
    fully support native IPv6 addressing
  • upgrading switches used with our SAN solution to improve
    storage reliability with the vSphere ESX 4.0 platform
  • upgrading head nodes of the ESX storage solution to provide
    increased reliability, throughput and manageability

The outage will begin on 16/09 at 03:00 (GMT) and we expect it to be completed
within 60 to 90 minutes. Whilst the service may come up again quite
quickly, it should be considered “at risk” between 03:00 and 05:00.

Update: This work was completed with a minimal window of downtime, lasting about ten minutes.

RESOLVED - IAX outbound call issues. iax-lb.gradwell.net

Some users were seeing some difficulty in placing calls over IAX to iax-lb.gradwell.net.

Our server admin team found an issue with the load balancer. This has been rectified and full service has now been restored.

We apologise for any problems this has caused.

RESOLVED: ADSL LLU Exchange Outages

We have been informed by our carrier that a number of exchanges in the Devon and Cornwall area have lost connectivity due to a fault on the Surf network.

UPDATE 14:25 IP Engineers have rectified this issue; connectivity has been restored to all affected exchanges and confirmed stable.

The affected exchanges are listed below for reference.

  • barnstaple
  • bideford
  • bodmin
  • brixham
  • crediton
  • dawlish
  • exeter
  • exmouth
  • honiton
  • ilfracombe
  • kingsbridge
  • newton-abbot
  • newtonabbot
  • ottery-st-mary
  • otterystmary
  • paignton
  • par
  • sidmouth
  • st-austell
  • st-marychurch
  • staustell
  • stmarychurch
  • teignmouth
  • tiverton
  • torquay
  • totnes
  • truro

Customers on these affected exchanges will have no ADSL connectivity. Further updates will follow from our carrier when available.

RESOLVED: ADSL Session Outage

We have been alerted by our ADSL carrier that approximately 4000 ADSL sessions have been terminated due to a failed supervisor card. Our carrier has dispatched engineers to replace the faulty hardware.

A list of affected exchanges has been issued and is included below for reference.

UPDATE 14:51 A further 2 exchanges, EGHAM and REDHILL, have also experienced the issue; total connectivity loss will be experienced.

UPDATE 15:52 Engineers have now arrived at site to replace the affected supervisor card.

UPDATE 18:08 All affected exchanges are now back online, and customers affected by this outage should be able to reconnect. If you cannot reconnect, please power off your router, wait 2 minutes and power it back on.

LNHPK highams-park
SDPNDHL poundhill
CLNEW new-cross
LSBRO bromley
LSCHI chislehurst
LSADD addiscombe
LSCRO croydon
LSBEX bexleyheath
EAGRA grays-thurrock
LSCTFD catford
SSDOW downend
WRBATT battersea
LSGRNW greenwich
WRNELMS nine-elms
CLWAL walworth
NDCOP copthorne
LNADK albert-dock
SDHRLY horley
WRBRIX brixton
LSBEU beulah-hill
LSGIP gipsy-hill
LSNCHM north-cheam
LNPOP poplar
LNCHF chingford

RESOLVED - ns0.gradwell.com

Our primary nameserver ns0.gradwell.com is currently not responding to queries.

Our server admin team are working on this now and we will post an update here as soon as possible. Our secondary nameservers are functioning without issue.

We apologise for any problems this is causing.

***Update - 15:38***

ns0 is now back up and running and resolving correctly.