Importance of Parallel and Local Measurements in Web Monitoring

Cloud, Performance Management
Apr 17 2015

Every company that relies on web business should invest in a web monitoring platform. Web monitoring platforms connect to a web site and measure KPIs such as DNS resolution time, page download time and page consistency (checking for a specific header or content value). These synthetic transactions, which are run by probes, help to identify server reachability.
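As a rough illustration of such a synthetic transaction, here is a minimal Python sketch that uses only the standard library. The function names is_consistent and probe, the result keys and the timeout value are assumptions made for this example; they do not describe any particular monitoring product.

```python
import socket
import time
import urllib.request
from urllib.parse import urlparse


def is_consistent(headers, body, expected_header=None, expected_text=None):
    """Return True if the response carries the expected header and body text."""
    if expected_header is not None:
        name, value = expected_header
        # Header names are case-insensitive, so normalise before comparing.
        found = {k.lower(): v for k, v in headers.items()}
        if found.get(name.lower()) != value:
            return False
    if expected_text is not None and expected_text not in body:
        return False
    return True


def probe(url, expected_header=None, expected_text=None, timeout=10.0):
    """Run one synthetic transaction and return the measured KPIs."""
    parts = urlparse(url)
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        # DNS resolution time, measured separately from the download.
        start = time.perf_counter()
        socket.getaddrinfo(parts.hostname, port)
        dns_seconds = time.perf_counter() - start

        # Page download time (the name is resolved again here, possibly from a cache).
        start = time.perf_counter()
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
            headers = dict(response.headers.items())
        download_seconds = time.perf_counter() - start
    except OSError as error:
        # DNS failures, refused connections, timeouts and HTTP errors all land here.
        return {"reachable": False, "error": str(error)}

    return {
        "reachable": True,
        "dns_seconds": dns_seconds,
        "download_seconds": download_seconds,
        "consistent": is_consistent(headers, body, expected_header, expected_text),
    }
```

A call such as probe("https://www.example.com/", expected_text="Example") would return one measurement. A real probe would repeat this on a schedule and store the results.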

The term “reachability” is important here, as it is not the same as “availability”. There may be cases where your web application and the infrastructure it depends on (web server, DB server, application server etc.) seem to be running smoothly from your side but not from the customer’s. This is usually due to routing problems on the network or problems on a remote DNS server.

It is important to know about these downtime scenarios when supporting your customers. In some situations you may even take corrective action, such as guiding users to change their DNS server settings or opening a ticket with the remote ISP for investigation.

There are two important selection criteria to consider when investing in a web monitoring service.

First, the service should have local probes. If your business resides in Istanbul, Turkey but your probe resides in Philadelphia, US, the response times and availability calculations may not reflect what your users experience. Suppose the country has a problem reaching the rest of the Internet. Your probe will notify you of downtime, yet most of your local users will still be able to reach you.
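The difference between a local and a remote view can be sketched in a few lines of Python. The classify function, the country codes and the result labels below are hypothetical and chosen only for illustration; the input is assumed to be one reachability result per probe location.

```python
def classify(results, home_country):
    """Interpret probe results keyed by probe country code.

    results maps a country code to True if the site was reachable from there.
    home_country is the country where most of the users are located.
    """
    local_ok = results.get(home_country)
    if local_ok is None:
        return "no local probe, local reachability unknown"

    remote = [ok for country, ok in results.items() if country != home_country]
    if local_ok:
        if all(remote):
            return "reachable everywhere"
        return "reachable locally, problem on a remote path"
    if any(remote):
        return "problem on the local path"
    return "unreachable"


# The Istanbul / Philadelphia scenario: the remote probe fails, local users are fine.
print(classify({"TR": True, "US": False}, home_country="TR"))
```

With only the US probe in place, the same incident would have been reported as plain downtime.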

Second, the service should take parallel measurements. This covers load balancer scenarios. Load balancers typically work in a round-robin fashion to distribute the load across a web server farm. So, if you measure only once and the web server currently at the head of the queue has no problems, you will record the service as “up”.

However, the next server in the pool may suffer from performance problems or even downtime. If you make at least three measurements at roughly the same time, you are likely to catch an individual server problem within the pool. This is a very important feature to look for when deciding on a web monitoring tool.
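The round-robin effect described above can be simulated with a short Python sketch. RoundRobinPool is a toy stand-in for a load balancer, and measure_in_parallel and summarize are hypothetical helper names; in a real setup the check would be an HTTP request, and sticky sessions or other traffic could mean that three requests do not land on three different servers.

```python
import itertools
import threading
from concurrent.futures import ThreadPoolExecutor


class RoundRobinPool:
    """Toy load balancer: hands out the health of each server in turn."""

    def __init__(self, servers_healthy):
        self._cycle = itertools.cycle(servers_healthy)
        self._lock = threading.Lock()

    def request(self):
        # The lock ensures each concurrent request gets the next server in the cycle.
        with self._lock:
            return next(self._cycle)


def measure_in_parallel(check, attempts=3):
    """Run check `attempts` times concurrently and collect every result."""
    with ThreadPoolExecutor(max_workers=attempts) as pool:
        futures = [pool.submit(check) for _ in range(attempts)]
        return [future.result() for future in futures]


def summarize(results):
    """Reduce a list of True/False check results to a single status."""
    failures = sum(1 for ok in results if not ok)
    if failures == 0:
        return "up"
    if failures == len(results):
        return "down"
    return "degraded"


# The second of three servers is broken.
single = RoundRobinPool([True, False, True]).request()
parallel = measure_in_parallel(RoundRobinPool([True, False, True]).request, attempts=3)
print(summarize([single]), summarize(parallel))
```

The single measurement only sees the first, healthy server, while the three parallel measurements also hit the broken one.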

Local and parallel measurements will help you identify web server problems and troubleshoot them more quickly.

Aug 14 2011

Let’s stop and think about whether it is meaningful for a service operator to give up all the OSS tools and infrastructure it has for the sake of moving to the cloud. A very big shift in operations would certainly occur. Trouble ticket, fault management, performance management, inventory and fulfilment systems, and even mediation, would have to be moved.

Cloud solutions promise availability and maintainability at lower cost. But they run on the cloud service provider’s infrastructure, which resides at a remote site.

Given the gigabytes of transactions that occur in a service provider environment, it is necessary to invest in high-capacity links to the cloud provider. And since the information is critical, we cannot rely on Internet connections; we would need dedicated links.

Guess what then? We would have to monitor these links to be sure that they are up and running at all times. If you notice any performance problem (at Layer 7), you should open a trouble ticket with the service provider to get it fixed. Of course, you can rely on your service provider’s monitoring capabilities and trust that they are doing their job well, but this is not how it works in the real world. In the end, we would need to reconcile what they say with what we perceive.
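Such a reconciliation could be as simple as comparing the availability computed from our own link measurements with the figure the provider reports. The sketch below is only an assumption of how that might look; the function names, the sample data and the 0.1 percentage point tolerance are hypothetical.

```python
def availability(samples):
    """Percentage of our own measurements in which the link was up."""
    if not samples:
        raise ValueError("no measurements to evaluate")
    up = sum(1 for ok in samples if ok)
    return 100.0 * up / len(samples)


def needs_dispute(own_percent, reported_percent, tolerance=0.1):
    """True if the two figures differ by more than `tolerance` percentage points."""
    return abs(own_percent - reported_percent) > tolerance


# 100 of our own checks, one of which failed, against a provider report of 99.9%.
own = availability([True] * 99 + [False])
print(own, needs_dispute(own, reported_percent=99.9))
```

A gap larger than the tolerance is the point at which the trouble ticket and the provider's own report would need to be compared in detail.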

OSS cloud may also appear under the name “Managed Service”. Some OSS vendors offer this kind of solution, in which they build a shared OSS infrastructure (TT, FM, PM etc.). These services are under heavy control by the cloud (managed) service provider. If you want to create a new report, change an existing rule and so on, you cannot do it yourself. Managed service providers have to apply these change management mechanisms in order to maintain the shared OSS architecture. Most high-tier telecom service providers have complex OSS business processes. Moving to an OSS cloud may mean giving up these processes and moving to more generic ones.

On the other hand, OSS cloud solutions could be suitable for greenfield operators, where initial investment costs should be minimized. After becoming more mature, they could easily switch to a private OSS infrastructure. Of course, if that happens, investing in the same software would shorten the project’s implementation period, so it is wise to understand the cloud provider tool’s footprint, roadmap etc. before deciding on their services.