Tim Marsh talks us through what goes on behind the scenes to keep your website live
Every 10 minutes, we have computers at different points around the world request a sample of sites on each of the 12 web servers (6 clusters of 2) currently running on the landscape platform. If one of them notices a problem, the check is immediately retried from another location, which rules out false positives caused by a fault at the monitor rather than on our side. If 2 of these computers spot a problem, such as being unable to connect or taking too long to connect, they send a detailed email, which also notifies our out-of-hours on-call team.
This email goes to 3 teams: our client services team, our IT operations team, and our technical team. It puts client services on standby to respond, and enables our operations and technical teams to launch investigations into any potential problems.
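The confirm-before-alerting step can be sketched in a few lines of Python. This is only an illustration of the idea, not the monitoring service's own code: the names `ProbeResult` and `should_alert`, the probe interface, and the 10-second slowness limit are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

# Hypothetical limit: the article only says "too long", not how long.
SLOW_THRESHOLD_SECONDS = 10.0


@dataclass
class ProbeResult:
    location: str                    # where the request was made from
    connected: bool                  # False if the site could not be reached
    seconds: Optional[float] = None  # response time, None if not connected


# A probe takes a site address and reports what it saw from its location.
Probe = Callable[[str], ProbeResult]


def is_problem(result: ProbeResult) -> bool:
    """A result is a problem if the connection failed or was too slow."""
    if not result.connected:
        return True
    return result.seconds is not None and result.seconds > SLOW_THRESHOLD_SECONDS


def should_alert(site: str, probes: Sequence[Probe], confirmations: int = 2) -> bool:
    """Alert only when `confirmations` locations in a row see a problem.

    Probes are tried in order. A healthy response from any location ends
    the check, treating the earlier failure as a fault at the monitor.
    """
    problems = 0
    for probe in probes:
        if not is_problem(probe(site)):
            return False
        problems += 1
        if problems >= confirmations:
            return True
    # Fewer locations available than confirmations required.
    return False
```

Passing the probes in as plain functions keeps the decision logic separate from the network calls, so the same rule can be exercised with fake probes before it is trusted with real ones.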
The external monitoring of the sites works in tandem with our internal monitoring systems, which we have recently updated.
The operations team, working with our technical team, has recently deployed Nagios, a new system for monitoring all the servers we use.
Nagios uses the concept of sensors, each of which measures and reports on a particular aspect of a server, from its temperature to its free disk space, to how busy it is and whether we can connect to it at all.
We currently have 1150 sensors across all of the servers we use in landscape. These are reported in a dashboard that shows us trends and warns us whenever something is approaching a limit or has already gone wrong.
They also send emails and SMS messages to the technical teams when thresholds are breached.
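To give a feel for what a single sensor does, here is a minimal Python sketch of a free-disk-space check written in the style of a Nagios plugin. The status codes 0, 1 and 2 follow the usual Nagios plugin convention for OK, WARNING and CRITICAL; the 20% and 10% thresholds and the function names are hypothetical and are not the values used on landscape.

```python
import shutil

# Status codes in the Nagios plugin convention.
OK, WARNING, CRITICAL = 0, 1, 2
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}


def classify_free_percent(free_percent: float,
                          warn_below: float = 20.0,
                          crit_below: float = 10.0) -> int:
    """Map a free-space percentage to a status (thresholds are illustrative)."""
    if free_percent < crit_below:
        return CRITICAL
    if free_percent < warn_below:
        return WARNING
    return OK


def check_disk(path: str = "/") -> int:
    """Measure free space on `path`, print a one-line report, return the status."""
    usage = shutil.disk_usage(path)
    free_percent = 100.0 * usage.free / usage.total
    status = classify_free_percent(free_percent)
    # Text before "|" is for people; the part after it is performance
    # data that a monitoring system can store for trend graphs.
    print(f"DISK {LABELS[status]} - {free_percent:.1f}% free on {path} "
          f"| free_pct={free_percent:.1f}%")
    return status
```

A wrapper script would exit with the value returned by `check_disk`, which is how the monitoring system learns the sensor's state. The WARNING band is what lets a dashboard flag a disk that is filling up before it becomes an outage.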
What we have in place is a 24x7, best-of-breed monitoring system covering all the systems that power landscape. Alerting lets us react as soon as an issue appears, and historical trending lets us spot patterns over time.
As we continue to expand and improve landscape, systems like this running in the background help us uphold our core principle of delivering quality. Whilst this may not be visible to you, it's one of the tools we use to ensure your website is always there when your clients need it.