Jun 11, 2010

Service providers have been moving to IP-based technologies in their infrastructures because of the flexibility these technologies offer. We have seen all-IP projects all around the world in which traditional voice, data and transmission infrastructures are transformed into IP-based infrastructures. Going all-IP also means throwing away the traditional assets (NEs and management systems). Big operators therefore tend to take a transitional approach to all-IP: they keep traditional technologies (such as SDH/SONET) in the backbone and run IP on top of them. However, this reduces throughput, because these technologies add extra overhead. Operators with vast amounts of free capacity may not care about this throughput issue now, but it will definitely be a problem in the future.

Traditionally, the transmission environment depended on circuit-based technologies. SDH, PDH, SONET, ATM and Frame Relay are technologies of this type, in which you construct logical circuits on top of the physical infrastructure. From the OSS perspective, modeling a circuit-based technology is easy because it is predictable. For example, we know for certain that the VC-12 low-order circuit that starts at point A will end at point B. On its way it will be transported over several high-order VC-4s that run on physical STM-1, STM-4 or STM-16 circuits.

In our fault managers it is “easy” to load this hierarchy into a correlation engine and run root-cause analysis algorithms to find the root cause of a network problem. Or, if we take the ATM case, the PVCs are deployed on predefined paths, which also makes them good candidates for topology-based root cause analysis. Frame Relay likewise…
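To make the idea concrete, here is a minimal sketch of topology-based correlation over a static circuit hierarchy. The circuit names and the `CARRIED_BY` inventory are made up for illustration, and a real correlation engine does much more (time windows, alarm types, partial failures). The sketch only shows why a fixed hierarchy makes the job simple: an alarm is treated as a symptom whenever something it rides on is also alarmed.

```python
# Hypothetical inventory: each circuit maps to the circuit that carries it.
# Names are illustrative only, not taken from any real network.
CARRIED_BY = {
    "VC12-A-B": "VC4-1",
    "VC12-A-C": "VC4-1",
    "VC4-1": "STM16-X-Y",
    "VC4-2": "STM16-X-Y",
}


def root_causes(alarmed):
    """Return the alarmed circuits that have no alarmed ancestor."""
    alarmed = set(alarmed)
    roots = set()
    for circuit in alarmed:
        parent = CARRIED_BY.get(circuit)
        suppressed = False
        # Walk up the hierarchy; any alarmed ancestor explains this alarm.
        while parent is not None:
            if parent in alarmed:
                suppressed = True
                break
            parent = CARRIED_BY.get(parent)
        if not suppressed:
            roots.add(circuit)
    return roots
```

If the STM-16 and everything above it raise alarms, only the STM-16 is reported. This works only because the parent of every circuit is known in advance and does not change.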

When we move to IP, things become more complicated from the logical inventory perspective. The route from point A to point B is no longer predictable. If one of the links in that direction fails, the router will, hopefully, find another way around to send the packet. This dynamic behavior puts a barrier in front of traditional topology-based root cause analysis.

In order to do root-cause analysis in packet-based dynamic networks, we need another approach: Real-Time Topology Discovery. It uses the same techniques as any auto-discovery process; the difference is that it runs them more frequently. I have seen two approaches so far: Routing Protocol Listening and Virtual Routers.

Routing Protocol Listening utilizes the “topology table” feature of link-state routing protocols (such as OSPF and IS-IS). Devices (routers) that implement a link-state routing protocol maintain two tables in memory: the Routing Table and the Topology Table. Devices first build their topology tables by listening to routing protocol updates. Once the topology table is built, they apply a best-path algorithm to determine the best paths to the network destinations. These best paths are inserted into the routing table to be used in routing decisions.
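The step from topology table to routing table can be sketched with Dijkstra's shortest-path algorithm, which is the best-path algorithm link-state protocols are generally built around. This is a simplified illustration, not router code: the topology is a plain dictionary of the form `{router: {neighbor: cost}}`, and the function name `build_routing_table` is my own.

```python
import heapq


def build_routing_table(topology, source):
    """Derive {destination: (next_hop, cost)} from a link-state topology.

    topology is {router: {neighbor: cost}}; costs must be non-negative.
    """
    dist = {source: 0}
    first_hop = {}          # destination -> neighbor of source on best path
    done = set()
    heap = [(0, source)]
    while heap:
        cost, node = heapq.heappop(heap)
        if node in done:
            continue        # outdated heap entry
        done.add(node)
        for nbr, weight in topology.get(node, {}).items():
            if nbr in done:
                continue
            new_cost = cost + weight
            if new_cost < dist.get(nbr, float("inf")):
                dist[nbr] = new_cost
                # Directly attached neighbors are their own next hop;
                # everything else inherits the next hop of its predecessor.
                first_hop[nbr] = nbr if node == source else first_hop[node]
                heapq.heappush(heap, (new_cost, nbr))
    return {n: (first_hop[n], dist[n]) for n in done if n != source}
```

Running it again after removing a link from the topology yields a different routing table, which is exactly the dynamic behavior that breaks a static inventory model.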

Routing Protocol Listeners “sniff” these topology-related conversations between the devices to construct a real-time topology of the network. This topology can then be used in root cause analysis. Some network optimization tools use this technique to populate the network topology, on top of which they can later run what-if scenarios. The topology can also be enriched with SNMP queries to obtain information that is not exchanged by the routing protocols (such as link utilization).
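A listener of this kind can be sketched as follows. This is not an OSPF or IS-IS parser: I assume the advertisements have already been decoded into a router ID, a sequence number and a `{neighbor: cost}` map, and the class and method names are hypothetical. The sketch keeps the newest advertisement per router and reports only links that both ends advertise.

```python
class TopologyListener:
    """Keeps the latest advertisement per router (illustrative only)."""

    def __init__(self):
        self.lsdb = {}  # router_id -> (sequence number, {neighbor: cost})

    def on_advertisement(self, router_id, seq, links):
        """Store an advertisement; return False if it is stale or a repeat."""
        current = self.lsdb.get(router_id)
        if current is not None and current[0] >= seq:
            return False
        self.lsdb[router_id] = (seq, dict(links))
        return True

    def links(self):
        """Return the set of (a, b) links advertised by both endpoints."""
        confirmed = set()
        for router, (_, neighbors) in self.lsdb.items():
            for nbr in neighbors:
                back = self.lsdb.get(nbr)
                if back is not None and router in back[1]:
                    confirmed.add(tuple(sorted((router, nbr))))
        return confirmed
```

Taking `links()` before and after a burst of updates and subtracting the two sets gives the links that disappeared, which is a natural starting point for root cause analysis.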

The second approach is similar to the first one in some sense. In this approach you create an in-memory clone of the device on your management server, one instance for each device/VRF. You copy the initial configuration of the real device to the virtual one and then start listening for alarms/events from the real device.
This way, you have a near-real-time topological view of your network on which you can base your analysis.
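As a rough sketch of the virtual router idea, the clone below is seeded from an interface-to-neighbor map (standing in for the copied configuration) and then kept in sync by events. The class is hypothetical and heavily reduced. The event names follow the standard SNMP linkDown/linkUp notifications, but the event format is an assumption.

```python
class VirtualRouter:
    """In-memory clone of one device or VRF (illustrative only)."""

    def __init__(self, name, interfaces):
        # interfaces: {interface name: neighbor name}, from initial config
        self.name = name
        self.neighbors = dict(interfaces)
        self.up = {ifname: True for ifname in interfaces}

    def on_event(self, ifname, event):
        """Apply a link event; return False if it cannot be applied."""
        if ifname not in self.up:
            return False
        if event == "linkDown":
            self.up[ifname] = False
        elif event == "linkUp":
            self.up[ifname] = True
        else:
            return False
        return True

    def active_neighbors(self):
        """Neighbors reachable over interfaces that are currently up."""
        return {self.neighbors[i] for i, ok in self.up.items() if ok}
```

The view is only as fresh as the event stream that feeds it, which is why this approach is near real time rather than real time.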

The first approach is more real-time, but it applies to link-state protocols only. If you are using a distance-vector protocol, for example, you have to pick the second option.

In today’s world, where we have started to work more with intelligent NEs and network protocols, Real-Time Topology Discovery solutions will definitely find their place in the next generation OSS.