Jun 11 2010

Service providers started moving to IP-based technologies in their infrastructures because of the flexibility these technologies offer. We have seen all-IP projects all around the world, where traditional voice, data and transmission infrastructures are transformed into IP-based ones. All-IP also means sending the traditional assets (NEs and management systems) to the trash. Thus, big operators try to use a transition approach to move to all-IP. In this approach, they keep traditional technologies (such as SDH/SONET) in the backbone and run IP on top of them. However, this reduces throughput, as these technologies add extra overhead. Operators that have vast amounts of free capacity may not care about this throughput issue now, but it will definitely become a problem in the future.

Traditionally, the transmission environment was built on circuit-based technologies. SDH, PDH, SONET, ATM and Frame Relay are technologies of this type, where you construct logical circuits on top of the physical infrastructure. From the OSS perspective, modeling a circuit-based technology is easy because it is predictable. For example, we know for certain that the low-order VC-12 circuit that starts at point A will end at point B. On its way it will be transported over several high-order VC-4s that run on physical STM-1, STM-4 or STM-16 circuits.

In our fault managers it is “easy” to load this hierarchy into a correlation engine and run root-cause analysis algorithms to find the root cause of a network problem. Or, if we take the ATM case, PVCs are deployed on predefined paths, which also makes them good candidates for topology-based root-cause analysis. Frame Relay likewise…

When we move to IP, things become more complicated from the logical inventory perspective. The route from point A to point B is no longer predictable. If one of the links in that direction fails, the router will, hopefully, find another way around to send the packet. This dynamic behavior puts a barrier in front of traditional topology-based root-cause analysis.

To do root-cause analysis in packet-based, dynamic networks, we need another approach: Real-Time Topology Discovery. Real-Time Topology Discovery uses the same techniques as any auto-discovery process; the difference is that it applies them far more frequently. There are two approaches I have seen so far: Virtual Routers and Routing Protocol Listening.

Routing Protocol Listening utilizes the “topology table” feature of link-state routing protocols (such as OSPF and IS-IS). Devices (routers) that implement a link-state routing protocol maintain two tables in memory: the routing table and the topology table. Devices first generate their topology tables by listening to routing protocol updates. After the topology table is generated, they apply a best-path algorithm (typically Dijkstra's SPF) to determine the best paths to the network destinations. These best paths are inserted into the routing table to be used in routing decisions.
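
To make that best-path step concrete, here is a minimal sketch of the computation a listener can reproduce once it has rebuilt the topology table. It uses Python with the networkx library; the router names and link costs are invented for illustration.

```python
# A minimal sketch: recompute "routing tables" from a link-state
# topology, the way OSPF/IS-IS apply SPF (Dijkstra) to their
# topology table. Names and costs are invented for illustration.
import networkx as nx

# Topology table reconstructed from listened-to link-state updates:
# each entry is (router_a, router_b, link_cost).
topology_table = [
    ("R1", "R2", 10),
    ("R1", "R3", 5),
    ("R2", "R3", 3),
    ("R3", "R4", 7),
]

graph = nx.Graph()
for a, b, cost in topology_table:
    graph.add_edge(a, b, weight=cost)

# SPF from R1's point of view: best path to every destination.
best_paths = nx.single_source_dijkstra_path(graph, "R1", weight="weight")
for dest, path in sorted(best_paths.items()):
    print(f"{dest}: {' -> '.join(path)}")
```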

Routing Protocol Listeners “sniff” these topology-related conversations between the devices to construct a real-time topology of the network. This topology can then be used in root-cause analysis. Some network optimization tools use this technique to populate the network topology and later run what-if scenarios on top of it. The topology can also be enriched with information that is not exchanged by the routing protocols (such as link utilization), obtained through SNMP queries.
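
As a sketch of that kind of SNMP enrichment, the snippet below polls the standard MIB-II ifInOctets and ifSpeed counters twice and derives an approximate inbound utilization. It uses the pysnmp library; the host, community string and interface index are placeholders.

```python
# Sketch: enrich a discovered topology with link utilization by
# polling MIB-II interface counters over SNMP (pysnmp library).
# Host, community and ifIndex below are placeholders.
import time
from pysnmp.hlapi import (SnmpEngine, CommunityData, UdpTransportTarget,
                          ContextData, ObjectType, ObjectIdentity, getCmd)

def snmp_get(host, community, oid):
    error_ind, error_status, _, var_binds = next(getCmd(
        SnmpEngine(), CommunityData(community),
        UdpTransportTarget((host, 161)), ContextData(),
        ObjectType(ObjectIdentity(oid))))
    if error_ind or error_status:
        raise RuntimeError(str(error_ind or error_status))
    return int(var_binds[0][1])

HOST, COMMUNITY, IF_INDEX = "192.0.2.1", "public", 2
IF_IN_OCTETS = f"1.3.6.1.2.1.2.2.1.10.{IF_INDEX}"  # ifInOctets
IF_SPEED     = f"1.3.6.1.2.1.2.2.1.5.{IF_INDEX}"   # ifSpeed (bits/s)

octets_1 = snmp_get(HOST, COMMUNITY, IF_IN_OCTETS)
time.sleep(30)
octets_2 = snmp_get(HOST, COMMUNITY, IF_IN_OCTETS)
speed = snmp_get(HOST, COMMUNITY, IF_SPEED)

# Inbound utilization over the interval (counter wrap ignored here).
utilization = (octets_2 - octets_1) * 8 / (30 * speed)
print(f"inbound utilization: {utilization:.1%}")
```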

The second approach is similar to the first one in some sense. In this approach you create an in-memory clone of the device on your management server: for each device/VRF, you create a separate instance. You copy the initial configuration of the real device onto the virtual one and then start listening to alarms/events from the real device. This way, you have a near-real-time topological view of your network on which you can base your analysis.
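
A rough sketch of the virtual-router idea in Python; the event shapes and field names are invented, since the real format depends on the EMS feeding the clones.

```python
# Sketch of the "virtual router" idea: keep an in-memory clone per
# device/VRF, seeded with the real configuration, and mutate it as
# alarms/events arrive. Event fields below are invented placeholders.
class VirtualRouter:
    def __init__(self, name, interfaces):
        self.name = name
        # interface name -> {"up": bool, "neighbor": str or None}
        self.interfaces = dict(interfaces)

    def apply_event(self, event):
        # e.g. {"type": "LINK_DOWN", "interface": "Gi0/0"}
        iface = self.interfaces.get(event["interface"])
        if iface is not None:
            iface["up"] = event["type"] != "LINK_DOWN"

    def adjacencies(self):
        return [(self.name, i["neighbor"])
                for i in self.interfaces.values()
                if i["up"] and i["neighbor"]]

# Seed the clones from initial device configurations, then feed events.
clones = {
    "R1": VirtualRouter("R1", {"Gi0/0": {"up": True, "neighbor": "R2"}}),
    "R2": VirtualRouter("R2", {"Gi0/1": {"up": True, "neighbor": "R1"}}),
}
clones["R1"].apply_event({"type": "LINK_DOWN", "interface": "Gi0/0"})

# Near-real-time topology = union of the adjacencies of all clones.
topology = [adj for r in clones.values() for adj in r.adjacencies()]
print(topology)
```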

The first approach is closer to real time, but it applies to link-state protocols only. If you are using a distance-vector protocol, for example, you have to pick the second option.

In today’s world, where we increasingly work with intelligent NEs and network protocols, Real-Time Topology Discovery solutions will definitely find their place in next-generation OSS.

Apr 18 2010

Network discovery (or auto-discovery) is the process that automatically populates the physical and logical inventory information of a given network. Without auto-discovery, all inventory items would be entered into the NIMS (Network Inventory Management System) manually or, at best, imported from a previously generated flat file. Manual entry is a heavy burden, especially if your network is dynamic and you add/remove devices and cards and provision new circuits every day.

A note here: auto-discovery may be implemented not only by inventory management systems but also by other OSS systems, such as fault managers, performance managers or SQMs, which need the network topology information. In an ideal world, all inventory information would be kept in one place, the NIMS. However, to be able to sell individual OSS products separately, vendors needed to include an inventory module within their applications. This leads to scenarios in which multiple OSS products try to discover the network at the same time, which can hurt network performance.

Auto-discovery of a network includes three phases:

• Detection
• Device Discovery
• Topology Creation

The detection phase identifies all the “living” devices on the network. This is achieved via several mechanisms.

The most popular one is ICMP sweeping (or pinging). In this method, you provide an IP pool (typically a subnet) to the tool that will do the discovery. You may also have options to exclude some IP addresses within this pool, because for some types of devices/interfaces (such as ISDN dial-backup interfaces) unnecessary traffic (even a ping) may result in connection charges. Another reason may be to limit management traffic on a highly utilized interface.

For big networks the discovery process can take days, so you should split the network into multiple smaller ones to increase discovery efficiency and reduce discovery time.

After the IP address set is constructed, the tool starts to ping those addresses. If it receives a response, it records the address as alive.
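
A minimal sweep along those lines might look like the sketch below, which assumes a Unix-like ping command; the subnet and exclusion list are placeholders, and the thread pool plays the role of splitting a big range into smaller parallel jobs.

```python
# Sketch of an ICMP sweep: enumerate a subnet, skip excluded
# addresses, ping each one and record the responders. Assumes a
# Unix-like `ping`; subnet and exclusions are placeholders.
import ipaddress
import subprocess
from concurrent.futures import ThreadPoolExecutor

SUBNET = ipaddress.ip_network("192.0.2.0/24")
EXCLUDE = {ipaddress.ip_address("192.0.2.13")}  # e.g. ISDN dial-backup

def is_alive(ip):
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "1", str(ip)],
        stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return result.returncode == 0

targets = [ip for ip in SUBNET.hosts() if ip not in EXCLUDE]

# Sweeping in parallel is the in-process analogue of splitting a
# big network into smaller discovery jobs.
with ThreadPoolExecutor(max_workers=32) as pool:
    alive = [ip for ip, ok in zip(targets, pool.map(is_alive, targets)) if ok]

print(f"{len(alive)} devices responded")
```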

You may also have the option to import IP information from a DNS server. If you use DNS effectively, this method will give you all the “authorized” IPs in your network. One other way could be reading the ARP cache of a device, but in my view ARP is not very reliable: the cache is cleared when the device restarts, and entries are subject to time-out after a certain period.
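
If your DNS servers permit zone transfers, pulling the authorized names and addresses can be as simple as the sketch below, which uses the dnspython library with placeholder server and zone names.

```python
# Sketch: seed the detection phase from DNS via a zone transfer
# (AXFR), using the dnspython library. Works only if your DNS
# server permits transfers; server and zone names are placeholders.
import dns.query
import dns.zone

zone = dns.zone.from_xfr(dns.query.xfr("192.0.2.53", "example.com"))
for name, ttl, rdata in zone.iterate_rdatas("A"):
    print(f"{name}.example.com -> {rdata.address}")
```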

The second step in auto-discovery is device discovery. In this step, the discovery engine tries to connect to each found device via the specified management protocols (SNMP, TL1, etc.) to retrieve additional inventory information. Ideally, this information will include all the inventory information for that NE (Network Element). How this is done differs from protocol to protocol. I will try to explain SNMP device discovery, which seems to be the most widely adopted.

All devices that implement the SNMP management protocol should implement MIB-II (RFC 1213). This MIB (Management Information Base) was designed to include generic information that applies to any vendor's equipment, such as name, description and interfaces. However, most vendors prefer to use their own proprietary MIB trees (located under the enterprises branch of the OID hierarchy). The motivation differs: some vendors want to “hide” their devices to force external OSS systems to interface with the vendor EMS's NBIs, while others may simply distribute the MIBs to registered users for marketing purposes.

Vendors (whether they provide their own MIBs or not) must populate certain MIB-II values for their NEs. The most important ones for discovery purposes are sysObjectID, sysName and sysLocation. SysName and sysLocation are very beneficial values to have, if they are implemented correctly.

SysObjectID is the one that helps the discovery engine determine the device type (host, switch, etc.), vendor name and model name. Every SNMP-enabled vendor should register an enterprise number with IANA and assign sysObjectID values under it. A sysObjectID is globally unique, just like a MAC address, and this uniqueness enables the management system to map a sysObjectID to a specific device type, vendor and model.
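
In practice that mapping is a lookup: strip the IANA prefix (1.3.6.1.4.1) from the sysObjectID, read the enterprise number, and match the remainder against a vendor/model table. A toy version, with a deliberately tiny and purely illustrative table:

```python
# Sketch: map a sysObjectID to vendor/model. Vendors hang their
# models under 1.3.6.1.4.1.<enterprise-number>; the tables below
# are deliberately tiny and purely illustrative.
ENTERPRISE_PREFIX = "1.3.6.1.4.1."

VENDORS = {9: "Cisco", 2636: "Juniper"}          # IANA enterprise numbers
MODELS = {"9.1.122": ("Cisco", "2600 router")}   # illustrative entry

def classify(sys_object_id):
    if not sys_object_id.startswith(ENTERPRISE_PREFIX):
        return ("unknown", "unknown")
    suffix = sys_object_id[len(ENTERPRISE_PREFIX):]
    enterprise = int(suffix.split(".")[0])
    vendor = VENDORS.get(enterprise, f"enterprise {enterprise}")
    model = MODELS.get(suffix, (vendor, "unknown model"))[1]
    return (vendor, model)

print(classify("1.3.6.1.4.1.9.1.122"))
```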

Once the device type is derived from sysObjectID, further actions can be taken. For example, if the device is a router, the discovery engine can check its MIB repository for a MIB specific to this vendor and model (if there is none, it will use generic MIB-II). After the MIB is located, it sends a series of SNMP get/get-next requests to retrieve the routing table, neighbor table, interfaces, sub-interfaces and so on. If the device is a switch, the discovery process will again send SNMP get requests (to the BRIDGE-MIB, for example) to retrieve the forwarding table.
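
Those get/get-next series are essentially table walks. A walk of the MIB-II interface table with pysnmp might look like this (host and community are placeholders; a real engine would walk many more tables):

```python
# Sketch: walk the MIB-II ifTable (names and oper status) with a
# series of get-next requests via pysnmp. Host/community are
# placeholders; real engines also walk routes, BRIDGE-MIB, etc.
from pysnmp.hlapi import (SnmpEngine, CommunityData, UdpTransportTarget,
                          ContextData, ObjectType, ObjectIdentity, nextCmd)

for (error_ind, error_status, _, var_binds) in nextCmd(
        SnmpEngine(), CommunityData("public"),
        UdpTransportTarget(("192.0.2.1", 161)), ContextData(),
        ObjectType(ObjectIdentity("1.3.6.1.2.1.2.2.1.2")),  # ifDescr
        ObjectType(ObjectIdentity("1.3.6.1.2.1.2.2.1.8")),  # ifOperStatus
        lexicographicMode=False):                           # stay in the table
    if error_ind or error_status:
        break
    descr, status = var_binds
    print(f"{descr[1]}: oper status {status[1]}")
```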

The router example can be considered L3 discovery, whereas the switch example is L2 Ethernet discovery. Other types of L2 discovery exist for ATM, Frame Relay and MPLS networks.

In the final phase, topology creation, the discovered NE information is correlated by the inventory process to find and visualize the physical and logical connections between the NEs. Topology information can also be fetched directly from devices; the OSPF protocol, for example, requires devices to maintain topology information. Proprietary protocols such as Cisco's CDP can also be utilized by discovery processes to draw the network topology.
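
One common correlation works like this: each device reports who it sees on each port (from CDP/LLDP or forwarding-table walks), and the engine stitches the matching reports into links. A toy version with invented data:

```python
# Sketch: stitch per-device neighbor reports (e.g. from CDP/LLDP
# MIB walks) into a topology graph. The data below is invented.
import networkx as nx

# device -> list of (local_port, remote_device, remote_port)
neighbor_reports = {
    "R1": [("Gi0/0", "SW1", "Gi1/0/1")],
    "SW1": [("Gi1/0/1", "R1", "Gi0/0"), ("Gi1/0/2", "R2", "Gi0/1")],
    "R2": [("Gi0/1", "SW1", "Gi1/0/2")],
}

topology = nx.Graph()
for device, reports in neighbor_reports.items():
    for local_port, remote, remote_port in reports:
        # Graph() de-duplicates the A->B and B->A reports into one edge.
        topology.add_edge(device, remote,
                          ports={device: local_port, remote: remote_port})

print(sorted(topology.edges()))
```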

Mar 27 2010

Inventory managers keep track of all items in the infrastructure. They give us a complete, end-to-end view of our network. Having a complete, up-to-date inventory brings several benefits, such as:

• Lower CAPEX and OPEX (by utilizing all available equipment/links at full capacity and enabling full automation)
• Higher customer satisfaction (by reducing provisioning times)
• Shorter time to market for new services (by managing the complete lifecycle)

Inventory items fall into different categories. Resource inventory, for example, deals with resources, while service inventory keeps track of services and their relationships with resources. There is also product inventory (generally called the product catalog), which keeps records of the products sold to the market. Today's inventory management platforms are able to keep all types of inventory, but traditionally, when we talk about inventory management (sometimes called NIM, network inventory management), we generally mean resource inventory.

Resource Inventory comes in two types: Physical and Logical.

Physical inventory denotes physical items in our infrastructure, such as cabinets, racks, cards, ports and slots. Locations (cities, sites, buildings, rooms) also fall into this category.

Logical inventory keeps track of the intangible entities that are provisioned on physical resources. Circuits and IP address ranges are examples of logical inventory items. Logical inventory relates to Layer 2 and above in the OSI model.

Inventory items can be entered manually or imported into the system. Items that are not “managed” by element managers (such as cabinets) are candidates for manual entry. (Most tools also give you the option to import from spreadsheets.)

Managed elements should be imported from their EMS systems via their northbound interfaces (NBIs). Different EMS implementations use different protocols, so inventory management systems use specific adapters to collect inventory data from specific EMS systems.

After the full inventory is received from the EMS systems, the data has to be kept up to date. Deltas that appear in the EMS inventory data are applied to the inventory manager by a resynchronization process.
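
A resynchronization pass essentially boils down to a three-way delta between the NIMS view and a fresh EMS snapshot, keyed by a consistent name. A sketch, assuming both sides are plain dictionaries:

```python
# Sketch of a resynchronization pass: diff the NIMS view against a
# fresh EMS snapshot, keyed by resource name. Data shapes and
# names below are assumptions for illustration.
def compute_delta(nims, ems):
    """Both arguments: {resource_name: attribute_dict}."""
    added   = ems.keys() - nims.keys()
    removed = nims.keys() - ems.keys()
    changed = {name for name in ems.keys() & nims.keys()
               if ems[name] != nims[name]}
    return added, removed, changed

nims = {"IST-R-01": {"model": "7609"}, "IST-R-02": {"model": "7609"}}
ems  = {"IST-R-01": {"model": "7613"}, "IST-R-03": {"model": "7609"}}

added, removed, changed = compute_delta(nims, ems)
print(f"add: {added}, retire: {removed}, update: {changed}")
```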

Most of the time, importing the inventory data is the challenging part. We have to cope with different types of data, under different names, exposed through different kinds of NBIs. The naming convention, for example, is among the key parameters that must be fine-tuned and kept consistent across the architecture: an item with a wrong name would break the resynchronization rules, causing an incomplete inventory. That is why data migration takes the most time in inventory management projects.
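
This is also why import pipelines often validate names before applying a delta. A sketch against a purely hypothetical SITE-TYPE-NUMBER convention:

```python
# Sketch: validate names against a (hypothetical) convention of
# SITE-TYPE-NUMBER, e.g. "IST-R-01", before applying a resync delta.
import re

NAME_RE = re.compile(r"^[A-Z]{3}-[A-Z]{1,2}-\d{2}$")

def check_names(names):
    bad = [n for n in names if not NAME_RE.match(n)]
    if bad:
        raise ValueError(f"non-conforming names, resync aborted: {bad}")

check_names(["IST-R-01", "ANK-SW-07"])   # passes
# check_names(["router1-istanbul"])      # would raise
```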

The inventory management system is the heart of the OSS infrastructure. Other OSS systems interface with it to do their jobs. For example, a fault manager will import the topology from the IM to run root-cause analysis on it. Trouble management systems consult the inventory manager for information about spare inventory. Order managers use it for availability checks and reservations. The SQM uses service-resource associations for service impact analysis, and so on.

Having a complete inventory management system reduces the message traffic between the OSS and EMS systems. It will eventually increase the efficiency of your whole OSS and BSS infrastructure.