SMRM: SNMP-based multicast reachability monitoring


SMRM: SNMP-Based Multicast Reachability Monitoring

Ehab Al-Shaer, Yongning Tang
Multimedia Networking Research Laboratory
School of Computer Science, Telecommunications and Information Systems
DePaul University, Chicago, IL 60604
{ehab, ytang}@cs.depaul.edu

Abstract

One of the main challenges of deploying multicast services in the Internet is the lack of active monitoring tools that can detect and isolate multicast reachability problems in real time. Existing multicast monitoring tools either do not scale or use proprietary protocols, which limits their deployment in enterprise networks. This paper presents a framework (SMRM) for monitoring the health and quality of multicast delivery paths (or forwarding trees) in real time. SMRM addresses these limitations by using SNMP as a core component, which significantly facilitates the wide deployment of SMRM in existing networks. The SMRM framework combines distributed monitoring and centralized control, which offers a scalable, easy-to-use and easy-to-deploy multicast monitoring service.

Keywords

QoS Management, Multicast Reachability, Monitoring, SNMP, Network Testing.

1. INTRODUCTION

Monitoring network reachability is necessary to detect and discover the cause of network problems, such as users being unable to access remote network services or experiencing considerable quality degradation (e.g., high latency or packet loss). With the growing deployment of multicast services, monitoring multicast networks has become crucial for maintaining multicast operations. The delivery service in multicast networks is more complex than in traditional unicast networks [9, 14]. Members of a multicast group send group join requests (IGMP requests) that propagate to the network routing services in order to establish a multicast forwarding tree from senders to receivers. The multicast transport service (UDP) does not provide any feedback to senders or receivers about delivery problems. For example, if multicast members do not receive a multicast stream for any reason, the sender is unaware of the problem, and the receivers cannot tell whether this is because of a network problem or because the sender has stopped transmitting. For multicast, as for any new technology, there is a lack of widespread understanding and expertise. Efficient active monitoring tools are necessary to observe the health of multicast delivery trees and to report faults and performance problems, such as high latency or packet loss in the delivery path, unreachable members, and abnormal disconnections, which may occur due to routing misconfiguration or bugs in the protocol implementation [15]. The deployment of multicast services requires easy-to-use and easy-to-integrate management tools that are based on a prevalent and well-understood network management protocol, like SNMP (Simple Network Management Protocol) [13].

0-7803-7382-0/02/$17.00 ©2002 IEEE

Session Eleven Multicast Monitoring and Topology Discovery

This paper presents a new framework for multicast network reachability monitoring, called SMRM, based on SNMP and the standard Structure of Management Information (SMI) [22]. NOC (network operation center) personnel use the SMRM framework to generate directed multicast traffic and collect real-time reports about the reachability and the quality of the multicast delivery at any point or segment in enterprise networks. The SMRM framework enables SNMP agents to generate multicast streams with various traffic parameters (such as rate, packet size and distribution) and to monitor the network latency, jitter and packet loss from source to destinations. It provides scalable monitoring by using multicast communication between managers and agents and by avoiding the packet implosion problem [6]. The SMRM sessions are dynamically created and configured from a central web-based management station (called the SMRM manager). SMRM is based on the dynamic MIB technology [19, 23, 24], which offers a highly extensible management framework. Thus, new multicast management tasks or tests can easily be added to the SMRM framework dynamically.

Although multicast monitoring has been addressed in a number of recent research projects and industrial tools, SMRM is considered the first SNMP-based multicast reachability monitoring framework that is scalable, extensible, and easy to use. Other tools are either limited in functionality or based on proprietary protocols, which limits their deployment in today's networks (see Section 5).

This paper is organized as follows: Section 2 describes the SMRM operations, components and integration; Section 3 explains the methods used to measure the monitoring targets; Section 4 presents the SMRM implementation; Section 5 discusses the related work; Section 6 presents the summary and concluding remarks.


2. SMRM ARCHITECTURE, MIBS AND TECHNIQUES

The integration of SMRM in SNMP enables the use of this tool in most networks today. Two steps are required for integrating the SMRM functionality in the SNMP framework: integrating multicasting into SNMP agents, and integrating the reachability monitoring into SNMP. The first step is important for offering scalable monitoring operations, and it requires integrating the Multicast MIB (mcastInfoMIB) into SNMP entities. The second step requires integrating the Multicast Reachability Monitoring MIB (smrmMIB) and the Schedule MIB (schedMIB [19]) into SNMP agents. These steps enable SNMP entities to perform reachability monitoring for multicast (and also unicast) networks in a flexible and scalable manner. In this section, we describe the architecture, MIBs and techniques used for implementing SMRM agents.

2.1. SMRM Basic Operation

The SMRM framework is based on the standard SNMP operations (e.g., SET and GET). SMRM consists of three main SNMP entities: (1) the SMRM manager, which defines the multicast reachability test configurations and the role of the SNMP agents, as senders or receivers, during the test; (2) SMRM sender agents, which originate multicast streams according to the session profile defined by the manager; and (3) SMRM receiver agents, which sink the multicast streams for traffic monitoring and analysis. Both sender and receiver agents are called SMRM testers. In a typical SMRM session, NOC administrators use the SMRM manager to define the SMRM traffic and session parameters, which are described in Section 4. When the session is activated, the SMRM manager configures the SMRM testers by setting the proper MIB objects in the agents in order to convey the SMRM session profile (i.e., configuration parameters such as packet length and data rate) and the agents' roles (senders or receivers). The SMRM manager then originates multicast traffic from one or more SMRM senders to a group of SMRM receivers in the network based on the profile configuration. In order to make the SNMP agents capable of sending and receiving multicast traffic, the extended SNMP framework described in [6] is used. As a result, the SMRM senders generate a multicast stream directed to the multicast group specified in the session profile, while the SMRM receivers join the same multicast group. The SMRM receivers monitor and process multicast packets as they are received and analyze the traffic to provide reachability information about packet loss, latency and jitter. The SMRM receivers can also store the received data (packet fields), which can be retrieved by the manager for postmortem analysis.
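The configuration step can be pictured with a small sketch. This is only an illustration under stated assumptions, not the SMRM manager code: the `SessionProfile` type and `build_set_requests` function are hypothetical, the role values `"sender"` and `"receiver"` are an assumed encoding, and the unit of the sending rate is assumed. The object names come from the smrmMIB described in Section 2.3. The sketch only builds the name/value pairs a manager would place in its Set requests; it does not talk to any SNMP library.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class SessionProfile:
    """Hypothetical container for one SMRM session profile."""
    sess_id: str
    group_ip: str
    group_port: int
    send_rate: int          # smrmSendRate (unit assumed: packets per second)
    pkt_len: int            # smrmPktLen in bytes
    senders: List[str]      # IP addresses of sender testers
    receivers: List[str]    # IP addresses of receiver testers


def build_set_requests(p: SessionProfile) -> Dict[str, List[Tuple[str, object]]]:
    """Map each tester IP to the (object name, value) pairs the manager would SET."""
    common = [
        ("smrmSessID", p.sess_id),
        ("smrmSessGrpIP", p.group_ip),
        ("smrmSessGrpPort", p.group_port),
    ]
    requests: Dict[str, List[Tuple[str, object]]] = {}
    for ip in p.senders:
        # Senders additionally receive the traffic parameters.
        requests[ip] = common + [
            ("smrmRole", "sender"),      # assumed encoding of the role
            ("smrmSendRate", p.send_rate),
            ("smrmPktLen", p.pkt_len),
        ]
    for ip in p.receivers:
        requests[ip] = common + [("smrmRole", "receiver")]
    return requests
```

For example, a profile with one sender and two receivers yields three request lists, and only the sender's list carries `smrmSendRate` and `smrmPktLen`.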

2.2. Multicast-capable SNMP Agents: McastInfoMIB

Although the SNMP standard [13] is specified over UDP, it only uses the unicast transport service for communication. However, supporting multicast in SNMP communication is important for scalable management services that can handle a large number of managed objects and agents [6]. Furthermore, an understanding of multicasting by SNMP agents is a basic requirement for managing multicast networks efficiently. A preliminary proposal was presented to the research and IETF communities to facilitate the incorporation of IP multicasting into the existing SNMP frameworks [1, 6]. The proposed framework is manager-centric: it allows managers to reconfigure the agents' group membership and communication dynamically, based on the application demands. The manager sets the group configuration information, such as the multicast IP address and port, using Set requests on a new MIB group in the SNMP agents called the Multicast Information Group, or mcastInfoMIB. Agents then react by updating the mcastInfoMIB objects and joining the specified multicast group. Consequently, the manager can send multicast Set and Get requests to the agents in the group. The mcastInfoMIB group is divided into two classes of objects: Group Management Objects, which hold the address information that the agent uses to join the multicast group, and Group Communication Objects, which hold information about communication parameters such as TTL (time to live) and randomized timers. The latter might be used to avoid the reply implosion problem in the manager [6] and to offer scalable agent-manager communication. If the manager uses unicast communication, the agents' replies are also unicast, as the multicast extension of SNMP agents does not interfere with the standard unicast interaction in SNMP.
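The effect of randomized timers on reply implosion can be illustrated with a short sketch. The paper does not specify the timer distribution, so the uniform delay used here, the `max_delay_s` parameter, and both function names are assumptions made for illustration only.

```python
import random
from typing import Iterable, List, Optional, Tuple


def reply_delay(max_delay_s: float, rng: random.Random) -> float:
    """Pick a random wait before replying (uniform distribution is an assumption)."""
    return rng.uniform(0.0, max_delay_s)


def schedule_replies(agent_ids: Iterable[str], max_delay_s: float,
                     seed: Optional[int] = None) -> List[Tuple[str, float]]:
    """Return (agent, delay) pairs ordered by the time each reply would be sent."""
    rng = random.Random(seed)
    delays = {agent: reply_delay(max_delay_s, rng) for agent in agent_ids}
    # Spreading the replies over [0, max_delay_s] keeps them from arriving
    # at the manager in a single burst after one multicast request.
    return sorted(delays.items(), key=lambda item: item[1])
```

A larger `max_delay_s` spreads the replies further apart at the cost of slower collection, which is the tradeoff a manager would tune for the group size.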

2.3. Multicast Reachability Monitoring MIB: smrmMIB

The smrmMIB is a new MIB used to store the SMRM configuration, such as the session and traffic parameters used by the SMRM testers to create and monitor sessions (see Figure 1). The smrmSessNum and smrmSessLimit objects indicate the number of active SMRM sessions and the maximum number of SMRM sessions allowed, respectively. In addition, four classes of objects are supported in the smrmMIB group, as described below (see also Figure 1).

SMRM Parameters Objects. This class is used by SMRM testers (senders or receivers) to store the SMRM configurations sent by the manager (see Figure 1). The most important objects of this class are smrmSessID, which uniquely identifies each SMRM test session, and smrmRole, which defines the agent's role during the session as a tester sender or a tester receiver. This class consists of two types of configuration objects: session objects and traffic objects. The session objects include smrmSessGrpIP and smrmSessGrpPort, which store the multicast IP address and port number, respectively, that the SMRM receivers use to join and monitor the multicast group and that the SMRM senders use to deliver the multicast stream. The smrmSessStatus object is used by sender and receiver agents to indicate the status of the session (idle, active or completed), which is usually checked by the manager for administration purposes. These five objects are used by both types of SMRM testers. The SMRM senders use the following traffic objects: smrmSendRate, smrmTraffDist, smrmPktLen, and smrmSessInterval, which define the average multicast sending rate, the traffic distribution, the multicast packet length, and when to stop sending, respectively. Users can bound the duration of an SMRM session either by smrmSessInterval, which specifies how long the session lasts in minutes, or by smrmTotalBytes, which specifies the total number of bytes to be transmitted by the sender. The manager might assign a unique ID to each agent, smrmAgentID, which is used to compute an agent mask to filter out received messages from the manager if smrmGrpMask is set. This optional feature can be used by the manager to limit the replies from agents. The smrmSessLaunchButton is a control object that managers may set to switch the corresponding smrmSessID on or off in the sender(s).

The smrmSessTimer, smrmRepMode and smrmMonTargets objects are used by receivers only. The first one indicates, if it is set to TRUE, that SMRM receivers must use a randomized timer before sending the SNMP reply to the manager. This technique is used to avoid reply implosion in the manager and to provide scalable monitoring. The second object defines the reporting/monitoring mode, on-line or postmortem, as described in Section 2.1. The third object indicates which of the reachability attributes (packet loss, delay and jitter) are to be monitored by this agent. Managers may task agents in an SMRM session with different monitoring targets, as multicast paths may encounter different kinds of problems. The value of smrmMonTargets ranges from 0 to 7 and covers all combinations of targets: 0 means no requested target (postmortem analysis) and 7 means all targets (packet loss, delay and jitter). This object is important for limiting the monitoring processing overhead in the agent. Notice that smrmSessID and smrmSourceIP are both indices for the entries of this table. This means that a sender might participate in more than one SMRM session with different parameters. The objects in this class have read-write access.

Figure 1 smrmMIB Group in Extended MIB-II (tables: smrmParametersTable, smrmTrafficInfoTable, smrmSendReportTable, smrmRecvReportTable; control object: smrmSessLaunchButton).
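A value range of 0 to 7 that covers every combination of three targets suggests a three-bit mask. The following sketch shows one possible encoding. The assignment of bits to targets (1 for packet loss, 2 for delay, 4 for jitter) is an assumption, since the text only fixes the meaning of 0 and 7, and the `MonTargets` and `decode_targets` names are hypothetical.

```python
from enum import IntFlag
from typing import List


class MonTargets(IntFlag):
    """Assumed bit layout for smrmMonTargets (only 0 and 7 are given in the text)."""
    NONE = 0          # no on-line target requested (postmortem analysis)
    PACKET_LOSS = 1
    DELAY = 2
    JITTER = 4


def decode_targets(value: int) -> List[str]:
    """List the monitoring targets selected by an smrmMonTargets value."""
    if not 0 <= value <= 7:
        raise ValueError("smrmMonTargets must be in the range 0..7")
    single_bits = (MonTargets.PACKET_LOSS, MonTargets.DELAY, MonTargets.JITTER)
    return [t.name for t in single_bits if value & t]
```

Under this assumed layout, a receiver tasked with loss and jitter only would be configured with the value 5.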

SMRM Traffic Information Objects. The multicast traffic originated by the SMRM senders is actually a sequence of Set-request packets, each of which contains the following OIDs to be set in the agent's smrmMIB: smrmSessID for the session ID, smrmSourceIP for the sender IP address, smrmPktSeq for the packet sequence number, smrmPktSenderTS for the sender timestamp, and smrmPktLen for the packet length (see Figure 1). In order to make the size of the Set-request packet match smrmPktLen, a dummy OID, smrmDummyTraffic, of Octet syntax is included in the variable binding along with the OIDs listed above. The SMRM sender increments smrmPktSeq and updates smrmPktSenderTS each time a new Set request is sent out. This class of objects is used exclusively by SMRM receivers to store the OID values of the Set requests in the multicast traffic. Notice that the Set requests are sent as multicast messages to the agents' groups, as described in Section 2.2. When the agent inserts the smrmPktSenderTS object in smrmMIB, it automatically sets the current time in the smrmPktRecvTS object.

This information is recorded in these objects only if the postmortem monitoring mode is selected by the manager, which retrieves the objects from the agents after the session ends in order to calculate the packet loss, delay and jitter. The calculation of the monitoring targets is described in the following section. Notice that smrmSessID, smrmSourceIP and smrmPktSeq are the indices for this table. The objects in this class have read-only access.
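As a rough picture of how a receiver could hold these records in postmortem mode, the sketch below keeps one row per received Set request, keyed by the three table indices, and stamps the receive time when the row is inserted. The `TrafficInfoTable` class, its method names, and the injectable `clock` are hypothetical and are not taken from the SMRM agent.

```python
import time
from typing import Callable, Dict, List, NamedTuple, Tuple


class TrafficEntry(NamedTuple):
    sender_ts: float   # smrmPktSenderTS carried in the Set request
    recv_ts: float     # smrmPktRecvTS stamped locally on insertion
    pkt_len: int       # smrmPktLen


Key = Tuple[str, str, int]   # (smrmSessID, smrmSourceIP, smrmPktSeq)


class TrafficInfoTable:
    """Hypothetical in-memory stand-in for the traffic information table."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock
        self.rows: Dict[Key, TrafficEntry] = {}

    def on_set_request(self, sess_id: str, source_ip: str, pkt_seq: int,
                       sender_ts: float, pkt_len: int) -> None:
        # The receive timestamp is set automatically at insertion time.
        self.rows[(sess_id, source_ip, pkt_seq)] = TrafficEntry(
            sender_ts, self._clock(), pkt_len)

    def missing_sequences(self, sess_id: str, source_ip: str) -> List[int]:
        """Sequence numbers absent between the lowest and highest stored ones."""
        seqs = sorted(k[2] for k in self.rows if k[:2] == (sess_id, source_ip))
        if not seqs:
            return []
        have = set(seqs)
        return [s for s in range(seqs[0], seqs[-1] + 1) if s not in have]
```

A manager retrieving such rows after the session could derive loss from the gaps in the sequence numbers and delay or jitter from the two timestamps.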

SMRM Receive Report Objects. This class is used by SMRM receivers to store the target monitoring information: the number of lost packets (smrmLostPktsNum) and the traffic jitter (smrmJitter). This class is used by agents if on-line monitoring is selected. In this case, the agent uses the traffic information OID values in the multicast Set requests generated by the sender to continuously calculate the monitoring targets. Therefore, managers can retrieve the monitoring target information (using SNMP Get requests) in real time if the on-line monitoring mode is used. This class also contains other objects that report the total number of bytes and packets received so far and the ratio of the number of lost packets to the number of received packets (smrmPktLossRatio). The smrmCurrentSeq, smrmCurrentSendTS, smrmCurrentRecvTS, and smrmCurrentJitter objects are used locally by agents to calculate the monitoring targets, as described in Section 3. Both smrmSessID and smrmSourceIP are indices for the entries of the smrmRecvReportTable. The objects in this class have read-only access, except smrmRecvNumOfPkts and smrmLostPktsNum, which have read-write access because they can be set by managers to adjust the monitoring analysis/calculation, as described in Section 3.

SMRM Sender Report Objects. This class is used by SMRM senders to store the delay monitoring information for on-line monitoring: the maximum transmission delay (smrmMaxDelay) and the average transmission delay (smrmAvgDelay). The variable smrmCurrentDelay is used to store the intermediate delay values for calculating the smoothed average, as described in Section 3. Both smrmSessID and smrmRecvIP, which is the IP address of the receiver agent, are used as indices for the entries of the smrmSendReportTable. The objects in this class have read-only access.


2.4. Schedule MIB Application in SMRM

The Schedule-MIB [19] allows scheduling simple SNMP Set operations on the local SNMP agent on a regular basis or at specific future points in time. In conjunction with the Script-MIB, this allows launching short-lived scripts at regular intervals, or starting and terminating scripts at scheduled points in time. Using the Schedule MIB, the manager can schedule an SMRM test session in three different ways: (1) periodic, in which case the SMRM senders periodically trigger the SMRM script that sends Set requests to a specified multicast group; (2) one-shot, in which case the SMRM sender triggers the SMRM script one time; and (3) calendar-based, in which case the SMRM script is scheduled according to a specified time and date. The smrmSessLaunchButton in smrmMIB is the script-trigger object that the schedMIB sets in order to launch smrmscript.
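The three scheduling modes can be sketched as a function that lists when the launch object would be set within a time window. This is a simplified model rather than the Schedule-MIB semantics: the `launch_times` function and its parameters are hypothetical, one-shot is assumed to fire once at the start of the window, and calendar-based scheduling is reduced to a single date and time.

```python
from datetime import datetime, timedelta
from typing import List, Optional


def launch_times(mode: str, start: datetime, end: datetime,
                 interval: Optional[timedelta] = None,
                 at: Optional[datetime] = None) -> List[datetime]:
    """Times within [start, end] at which smrmSessLaunchButton would be set."""
    if mode == "periodic":
        if interval is None or interval <= timedelta(0):
            raise ValueError("periodic mode needs a positive interval")
        times = []
        t = start + interval          # first launch after one full interval
        while t <= end:
            times.append(t)
            t += interval
        return times
    if mode == "one-shot":
        # Assumption: the single launch happens at the start of the window.
        return [start] if start <= end else []
    if mode == "calendar":
        if at is None:
            raise ValueError("calendar mode needs a date and time")
        return [at] if start <= at <= end else []
    raise ValueError("unknown mode: " + mode)
```

For instance, a periodic schedule with a 10-minute interval over a 30-minute window yields three launches, while the other two modes yield at most one.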

The schedMIB functionality can be integrated in either the SMRM manager or the senders. However, we chose to integrate schedMIB in the SMRM senders for scalability, as the session manager might otherwise be involved in too many SMRM sessions.

3. SMRM MONITORING TARGETS CALCULATIONS

This section describes the calculation process of packet loss, delay and jitter as performed by the agents during on-line monitoring mode.

Packet Loss Calculation. When an SMRM agent receives a multicast Set request, it increments the smrmRecvNumOfPkts value and updates smrmCurrentSeq with the new sequence number if the new one is larger than the current sequence number. Then, the value of smrmLostPktsNum is set to smrmCurrentSeq - smrmRecvNumOfPkts, and the loss ratio, smrmPktLossRatio, is calculated as smrmLostPktsNum/smrmCurrentSeq. The manager retrieves the accumulative loss ratio at every polling interval and plots it in a graph interface (see Section 4.1). However, the manager updates the loss ratio graph only if smrmRecvNumOfPkts or smrmCurrentRecvTS has been incremented in the tester receiver (TR) since the last retrieval. Otherwise, if no packet is received during the polling period, a special flag is marked in the graph to indicate that this receiver path is currently unreachable. An example of an accumulative graph is shown in Figure 3. Notice that the minimum polling time of the manager is 1 second, which is significantly larger than the minimum packet inter-arrival time.

In order to calculate the loss ratio for each individual polling interval separately (dispersed graph), the manager must set smrmLostPktsNum and smrmRecvNumOfPkts in the agents' MIB to zero every time the loss ratio information is retrieved. As in the cumulative calculation, if smrmRecvNumOfPkts remains unchanged (zero), the graph will indicate that the network path is unreachable.
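The cumulative bookkeeping described above can be written down as a short sketch. The `LossCounters` class and the `path_unreachable` helper are hypothetical names, and the sketch assumes that sequence numbers start at 1, which the text does not state. It covers the cumulative calculation only.

```python
from typing import Optional


class LossCounters:
    """Receiver-side counters mirroring the objects named in the text."""

    def __init__(self) -> None:
        self.recv_num_of_pkts = 0   # smrmRecvNumOfPkts
        self.current_seq = 0        # smrmCurrentSeq (highest sequence seen)
        self.lost_pkts_num = 0      # smrmLostPktsNum

    def on_packet(self, seq: int) -> None:
        self.recv_num_of_pkts += 1
        if seq > self.current_seq:  # ignore late, out-of-order sequence numbers
            self.current_seq = seq
        self.lost_pkts_num = self.current_seq - self.recv_num_of_pkts

    def loss_ratio(self) -> Optional[float]:
        """smrmPktLossRatio, or None before any packet has been seen."""
        if self.current_seq == 0:
            return None
        return self.lost_pkts_num / self.current_seq


def path_unreachable(prev_recv_count: int, cur_recv_count: int) -> bool:
    """Manager-side check: no new packets since the previous poll."""
    return cur_recv_count == prev_recv_count
```

With the sequence 1, 2, 4, 5 the counters end at four received packets and one lost packet, for a loss ratio of 1/5. A packet that arrives late lowers the lost count again, because the received count catches up with the highest sequence number.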

Delay Calculation. There are two methods to measure the delay from a sender to the receivers in the multicast delivery tree. The first technique uses the sending and receiving timestamps. This is an accurate and simple technique, but it requires synchronizing the sender and receiver clocks using NTP or other protocols. This solution is not feasible if the network nodes do not support NTP. Therefore, we propose the "ping-pong" technique, which uses schedMIB and MIB scripts to make the sender and the receivers send ping-pong SNMP Set requests to each other.

Simply, the schedMIB of an SMRM sender initiates a Set request that includes the sender timestamp as an OID to set the smrmCurrentSendTS variable in the receiver MIB. As a result, this triggers a script in the receiver that sends back a Set request that includes the original sender timestamp, smrmCurrentSendTS, and the sequence number, smrmCurrentSeq. The latter is used to identify out-of-order messages. Thus, when the SMRM sender receives the Set request from the receiver, it calculates M (the Round Trip Time, or RTT) as follows: M = CurrentTime - smrmSenderTS. Then, M is used to continuously calculate the smoothed RTT average (L) according to this formula [11]:

$$L = \alpha L + (1 - \alpha) M$$

where $M$ is the measured RTT and $\alpha$ is a smoothing factor with a recommended value of 0.9. Since the returning Set request from the receiver to the sender may not traverse the same outgoing multicast path, the calculated delay may not be as accurate as in the first technique. However, the user has the option to use the first technique if NTP is supported in the network. Otherwise, the "ping-pong" technique is used for monitoring the delay.
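The smoothed average is an exponentially weighted moving average, which a few lines of code make concrete. The `smooth_rtt` function is a hypothetical helper, and seeding the average with the first measurement is an assumption, since the text does not say how L is initialized.

```python
from typing import Optional


def smooth_rtt(prev_avg: Optional[float], measured: float,
               alpha: float = 0.9) -> float:
    """Update the smoothed RTT: L = alpha * L + (1 - alpha) * M."""
    if prev_avg is None:
        # Assumption: the first sample initializes the average.
        return measured
    return alpha * prev_avg + (1.0 - alpha) * measured
```

With a smoothing factor of 0.9, a single RTT sample of 200 ms against a running average of 100 ms moves the average only to about 110 ms, so isolated spikes are damped while sustained changes still show up over successive samples.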

Jitter Calculation. The inter-arrival jitter ($J$) is defined to be the mean deviation (smoothed absolute value) of the difference $D$ in packet spacing at the receiver compared to the sender for a pair of packets. The traffic that the sender sends to measure the delay is also used by agents to measure the jitter. We use the RTP jitter calculation described in [25]. Assuming $S_i$ is the sender timestamp from Set request (or packet) $i$, and $R_i$ is the time of arrival in timestamp units for Set request $i$, then for two requests $i$ and $j$, $D$ may be expressed as

$$D(i, j) = (R_j - R_i) - (S_j - S_i) = (R_j - S_j) - (R_i - S_i)$$

SMRM receivers calculate the inter-arrival jitter continuously as each Set request $i$ is received from the source, using the difference $D$ for that packet and the previous packet $i - 1$ in order of arrival (not necessarily in sequence), according to the following formula:

$$J_i = J_{i-1} + \frac{|D(i-1, i)| - J_{i-1}}{16}$$

This algorithm is the optimal first-order estimator, and the gain parameter 1/16 gives a good noise reduction ratio while maintaining a reasonable rate of convergence [25]. A sample implementation is also shown in [25].
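The running jitter estimate can be sketched as follows. This is an illustrative reading of the formulas in this section, not the sample implementation from [25]; the function names and the representation of a packet as a `(send_ts, recv_ts)` pair are assumptions.

```python
from typing import Iterable, Tuple

Packet = Tuple[float, float]   # (sender timestamp S, arrival time R)


def update_jitter(jitter: float, prev: Packet, cur: Packet) -> float:
    """One step of the estimator: J = J + (|D(prev, cur)| - J) / 16."""
    d = (cur[1] - prev[1]) - (cur[0] - prev[0])
    return jitter + (abs(d) - jitter) / 16.0


def jitter_over(packets: Iterable[Packet]) -> float:
    """Run the estimator over packets in order of arrival, starting from 0."""
    jitter = 0.0
    prev = None
    for pkt in packets:
        if prev is not None:
            jitter = update_jitter(jitter, prev, pkt)
        prev = pkt
    return jitter
```

For example, if two packets are sent 20 time units apart but arrive 28 units apart, then D = 8 and an estimate that was 0 becomes 8/16 = 0.5. Clock offset between sender and receiver cancels out in D, so this calculation does not need synchronized clocks.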


Figure 2 SMRM Create Interface

4. SMRM FRAMEWORK IMPLEMENTATION

This section describes the implementation of the SMRM components: the user interface, the manager, and the agents (testers). It also explains the various options and parameters of SMRM sessions.

4.1. SMRM User Interface

The SMRM user interface is part of the manager functionality. The SMRM interface has two main functions: (1) enabling users to create one or more SMRM monitoring sessions and to configure remote SMRM agents, and (2) allowing users to collect and view the reachability monitoring results on a real-time or postmortem basis. The interface includes four main operations and an exit button. The SMRM user interface is Java-based and is integrated in the SNMP manager developed using AdventNet [2]. In the following, we describe the steps and the interfaces used for launching SMRM sessions.

Creating or Loading SMRM Session: When a reachability monitoring test is to be conducted, the NOC manager creates a new SMRM session using the interface in Figure 2 to define the SMRM agent configurations. The configuration of a previous SMRM session can be used in creating a new session. A manager can initiate multiple SMRM monitoring sessions simultaneously on the same network. For each SMRM session, users must configure the Session, Traffic and Agents parameters shown in Figure 2 as follows:

• Session Parameters: Users have to define the multicast group to be monitored using Group Address and Group Port in the create session interface. The Session Period defines the length of the testing session in seconds (Time Interval) or in number of bytes (Total Bytes), as shown in Figure 2. In other words, the user might choose to run an SMRM session for 3 hours, for example, or for the time it takes to send a bulk traffic of 100 MBytes. The SMRM testers provide information about three monitoring objects: packet loss, latency and jitter, which are major attributes for determining the Quality of Service (QoS) of multicast networks. An SMRM session must be assigned a unique name in the Session ID parameter. The SMRM receivers use the <smrmSessID, smrmSourceIP, smrmPktSeq> tuple included in the Set requests to uniquely identify multicast traffic generated by different senders in SMRM sessions.

• Traffic Parameters Configuration: This configuration section shapes the outgoing multicast traffic according to specific parameters such as the sending rate, the packet length, the traffic distribution (e.g., uniform, Poisson, Pareto), and when to start sending this traffic (Starting Time and Starting Date).

• Agents Configuration: We assume that the NOC personnel who intend to use SMRM know the topology, or at least the end-point nodes, of the network under test. This is important for determining the network segments under test/monitoring. Users can specify this by listing the IP addresses of the sender(s) and the receivers in the SMRM session. The SMRM receivers can be configured to use on-line monitoring or postmortem analysis in SMRM sessions. This feature is important to accommodate a wide range of monitoring requirements and network environments, as described in Section 2.1. The user can enable implosion control to prevent reply implosion in the manager when a large number of agents exist in the session. The user can also select the NTP (Network Time Protocol) option to enable delay calculation based on the sending/receiving timestamps. Otherwise, the ping-pong technique is used to measure the delay, as described before.
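To make the traffic distribution parameter concrete, the sketch below draws inter-packet gaps whose mean matches a configured sending rate. It is not taken from the SMRM sender script: the `inter_packet_gaps` function, the packets-per-second unit, the range used for the uniform case, and the default Pareto shape are all assumptions chosen for illustration.

```python
import random
from typing import List, Optional


def inter_packet_gaps(dist: str, rate_pps: float, count: int,
                      seed: Optional[int] = None,
                      pareto_shape: float = 2.5) -> List[float]:
    """Draw `count` gaps (seconds) with mean 1/rate_pps for the given distribution."""
    if rate_pps <= 0:
        raise ValueError("rate must be positive")
    rng = random.Random(seed)
    mean = 1.0 / rate_pps
    if dist == "uniform":
        # Assumption: gaps spread uniformly within +/-50% of the mean gap.
        return [rng.uniform(0.5 * mean, 1.5 * mean) for _ in range(count)]
    if dist == "poisson":
        # Poisson arrivals have exponentially distributed gaps.
        return [rng.expovariate(rate_pps) for _ in range(count)]
    if dist == "pareto":
        if pareto_shape <= 1:
            raise ValueError("shape must exceed 1 for a finite mean")
        # Scale so that the mean gap, shape*scale/(shape-1), equals 1/rate_pps.
        scale = mean * (pareto_shape - 1.0) / pareto_shape
        return [scale * rng.paretovariate(pareto_shape) for _ in range(count)]
    raise ValueError("unknown distribution: " + dist)
```

A sender loop would sleep for each gap before emitting the next Set request. The Pareto case produces occasional long gaps followed by bursts, which is useful for exercising a path under bursty traffic.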

When the SMRM session configuration is completed, the manager can initiate a session by activating it (Activate Session in Figure 2). This causes the manager to contact and configure the SNMP/SMRM agents at the IP addresses in this session.

SMRM View Session: The view interface (Figure 3) allows managers to retrieve and present the monitoring results of various SMRM sessions from different agents. The interface consists of four graphs that show real-time charts of the monitoring targets over time. Each graph area plots one of the monitoring targets (packet loss, delay or jitter) for a specific tree path defined by the Session ID, the Sender IP and a group of Selected Receivers. Once the session ID is selected, the interface shows the associated senders and receivers. Notice that the sender IP is needed because multiple senders may exist in one session. The Polling Interval parameter determines how frequently the manager polls the agents. This parameter is particularly important for controlling the tradeoff between information freshness and monitoring intrusiveness. The loss ratio, delay and jitter charts show the total percentage of packet loss, the average delay and the average jitter, respectively, after each polling interval. Users can switch between the charts of different sessions dynamically without re-activating the sessions. Figure 3 shows that the receiver with ID 11 suffers a lot of reachability problems compared with the other receivers in this session, mboneSdrTest. In fact, this receiver was unreachable between 80 and 120 minutes after the beginning of the session.

Figure 3 SMRM View Interface

SMRM MIB Browser: The SMRM user interface provides a standard MIB browser, which enables managers to view the MIB object values of the SMRM testers. This function is useful for debugging and verification purposes. Figure 3 also shows the delay and jitter of the same receivers in real time.

4.2. SMRM Agent and Manager Implementation

We first extended the MIB-II module to include the mcastInfoMIB and smrmMIB groups under the enterprises group [23]. We use (1) the implementation of the net-snmp agent package 4.2 (previously known as UCD-snmp) from the University of California at Davis (http://www.net-snmp.org) as a case study for developing this framework, (2) the implementation of schedMIB, schedule-mib-0.5 [19], and (3) the Perl 5 SNMP module supported by the Swiss Academic and Research Network [21]. We use the Perl 5 SNMP package to create the script (smrmscript) launched by the Schedule MIB. This package is completely stand-alone and portable to many platforms.

Other implementations of SNMP agents can be used similarly. Three steps are required to implement SMRM agents and manager:

1. Incorporating the multicast functionality and mcastInfoMIB in the SNMP agent and manager as described in Section 2.2.

2. Integrating the SMRM-specific MIBs, schedMIB and smrmMIB, which are described in Section 2.3.

3. Integrating the SMRM management GUI described in Section 4.1.

The SMRM manager is a Java-based program developed using the AdventNet SNMP API Release 3.2 package [2]. It also incorporates the multicast functionality and the SMRM-specific MIBs. The SMRM manager is based on SNMPv1; support for SNMPv3 is work in progress.

5. RELATED WORK

With the deployment of multicast services in global networks, many research proposals, experiments and tools have been developed for monitoring multicast networks and operations. However, only a few of them consider multicast monitoring solutions that are scalable, easy to use and easy to deploy. In this survey, we classify the related work into two categories:

Multicast Monitoring Tools. Most of this work was inspired by the deployment effort of multicasting, which requires tools to debug and diagnose network problems. One of the most useful tools in this category is mtrace, which discovers the routers in the reverse multicast path from a given receiver to the source of a multicast group [16]. It also gives simple statistics of the discovered path such as packet loss and delay. Mtrace requires special support in routers in order to collect this information. Other tools, like sdr-monitor [17], MHealth [20] and RTPmon [12], observe the global reachability of sdr and RTP multicast messages to group members by collecting feedback from multicast receivers. Mrinfo is another tool used to give the status of a multicast router, such as active multicast tunnels and interfaces. A very useful comprehensive survey of similar tools is presented in [8].

Many of these tools suffer from the following limitations: (1) they are too restricted in their functionality to support an extensible framework that can monitor other aspects of multicast networks such as reachability and performance problems, (2) they require special support in the network, like mtrace, or they are restricted to RTP or SDR applications, like MHealth and sdr-monitor, and (3) they do not scale to large multicast groups due to the reply implosion problem.
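To make the reply implosion limitation concrete, the following back-of-the-envelope sketch estimates how many feedback reports per second converge on a single collector when every receiver reports directly. The function name, the reporting interval and the group sizes are illustrative assumptions, not measurements from any of the tools above.

```python
def reports_per_second(receivers: int, report_interval_s: float) -> float:
    """Feedback load at one collector when every receiver reports directly."""
    if receivers < 0:
        raise ValueError("receivers must be non-negative")
    if report_interval_s <= 0:
        raise ValueError("report_interval_s must be positive")
    return receivers / report_interval_s


if __name__ == "__main__":
    # Hypothetical group sizes; each receiver reports once every 5 seconds.
    for group_size in (10, 1_000, 100_000):
        load = reports_per_second(group_size, 5.0)
        print(f"{group_size} receivers: {load} reports/s at the collector")
```

The load grows linearly with the group size, which is why feedback-based tools need some form of suppression or aggregation before they can serve large groups.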

Multicast Monitoring Frameworks. These systems tend to offer a broader solution for multicast monitoring. The Mmon application in HP OpenView is the first attempt to provide a complete framework for multicast management. It provides automatic discovery of the tree topology and multicast-capable routers, and a query service to investigate the status of multicast routers, paths and traffic information through a GUI. This allows operators to identify and isolate faults as they occur in the network. Mmon uses SNMP to gather information from various MIBs such as the IGMP MIB, PIM MIB, CBT MIB and IPMROUTE MIB. The main limitations of the Mmon approach are: (1) it suffers from a scalability problem due to SNMP unicast communication [6], (2) it lacks active monitoring, which would enable injecting and monitoring multicast traffic at any selected points in the network for fault diagnosis and isolation, and (3) it lacks support for inter-domain multicast management.

Other approaches use proprietary protocols (instead of SNMP) to address the previous limitations. The Multicast Reachability Monitor, or MRM [10, 11], is the first framework that introduces a new protocol for multicast reachability monitoring. MRM uses active monitoring in which multicast traffic is originated by a delegated sender (called TS) to multicast receivers (called TR), which continuously collect statistics and send status reports to the manager. Although MRM is a step forward toward efficient multicast monitoring, its deployment will be limited because it uses a proprietary protocol and special agents. It is also not clear from [11] how the delay and jitter are calculated when no clock synchronization is supported between sender and receivers. In addition, the MRM framework lacks many of the flexibility and extensibility features of SMRM, such as on-line and postmortem analysis, different traffic distributions, and the ability to dynamically download new management scripts into the monitoring agents.

A Hierarchical Passive Multicast Monitor (HPMM) is another recent proposal [26] that uses a proprietary protocol for fault detection and isolation in multicast networks. Unlike MRM, HPMM uses passive monitoring agents that communicate with each other using unicast. The HPMM agents are organized in a hierarchy according to their locations relative to the multicast sender. Monitoring reports are collected and correlated hierarchically to provide a large-scale monitoring architecture. The idea of hierarchical monitoring and filtering was well investigated in many other monitoring systems such as the HiFi system [3, 5]. Nevertheless, HPMM presents an interesting hierarchy setup protocol that overlays the actual multicast tree to build an efficient hierarchy of agents. The main drawback of this approach is that it requires deploying new HPMM agents in routers and local domains, which limits its deployment significantly. HPMM uses only passive monitoring for fault isolation. Although passive monitoring is the least intrusive, active monitoring is important for detecting and isolating network problems. As HPMM is strictly a fault management tool, monitoring other network parameters such as delay and jitter is not currently supported. Finally, the use of unicast in the agent communication increases the maintenance cost associated with the hierarchy, which makes it less practical [4].

Unlike previous work, SMRM utilizes the existing management infrastructure of SNMP agents and requires no changes in the network. SNMP agent software is widely accepted as a mature implementation, whereas users and vendors are always reluctant to deploy new daemons that might introduce reliability and security problems. In addition, SMRM provides a broader monitoring scope that includes observing delay and jitter, which are important for QoS management of multicast networks.
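As an illustration of the QoS metrics discussed above, the sketch below derives a loss ratio, a mean transit time and a jitter value from a list of received probe records. It is not SMRM's algorithm: the jitter estimator is the running filter defined in the RTP specification [25] (gain 1/16), and the record layout and the names `ProbeRecord` and `summarize` are assumptions made for this example. Because jitter is computed from differences between successive transit times, a constant clock offset between sender and receiver cancels out, whereas the mean transit time can be read as a delay only if the two clocks are synchronized.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass(frozen=True)
class ProbeRecord:
    """One received probe packet (hypothetical layout for this sketch)."""
    seq: int        # sequence number stamped by the sender
    sent_at: float  # send time in seconds, read from the sender's clock
    recv_at: float  # arrival time in seconds, read from the receiver's clock


def summarize(records: Iterable[ProbeRecord],
              expected: int) -> Dict[str, Optional[float]]:
    """Return loss ratio, mean transit time and RTP-style jitter."""
    if expected <= 0:
        raise ValueError("expected must be positive")

    # Process packets in arrival order, as a receiver would.
    ordered = sorted(records, key=lambda r: r.recv_at)

    # Count distinct sequence numbers so duplicates do not hide losses.
    received = len({r.seq for r in ordered})
    loss_ratio = 1.0 - received / expected

    jitter = 0.0
    transit_sum = 0.0
    prev_transit: Optional[float] = None
    for record in ordered:
        transit = record.recv_at - record.sent_at
        transit_sum += transit
        if prev_transit is not None:
            # RTP estimator: J = J + (|D| - J) / 16
            deviation = abs(transit - prev_transit)
            jitter += (deviation - jitter) / 16.0
        prev_transit = transit

    mean_transit = transit_sum / len(ordered) if ordered else None
    return {
        "loss_ratio": loss_ratio,
        "mean_transit_s": mean_transit,
        "jitter_s": jitter,
    }


if __name__ == "__main__":
    # Hypothetical trace: four probes sent, the one with seq=2 never arrives.
    trace = [
        ProbeRecord(seq=0, sent_at=0.000, recv_at=0.010),
        ProbeRecord(seq=1, sent_at=0.020, recv_at=0.030),
        ProbeRecord(seq=3, sent_at=0.060, recv_at=0.075),
    ]
    print(summarize(trace, expected=4))
```

Keeping only a running jitter value means an agent does not need to store per-packet history, which suits a monitoring agent with limited resources.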

6. CONCLUSION

Multicast is a network service for providing scalable group communication in the Internet. With the constant evolution of multicast routing, from the flat MBone to hierarchical multicast routing in Internet2 [9], multicasting is becoming more widely deployed and more complex as well. Multicast monitoring is necessary not only for debugging and fault detection, but also for measuring the quality of multicast reachability to the group members. This paper presents an easy-to-use, easy-to-deploy framework for multicast reachability monitoring called SMRM. SMRM is an active monitoring tool that injects a specified multicast stream between sender(s) and receiver(s) and reports in real time the QoS parameters of the network paths, such as loss, delay and jitter. Unlike previous work, SMRM is the first active monitoring framework that utilizes the existing SNMP infrastructure and requires no changes in the network. For SMRM deployment, only simple extensions to SNMP agents and MIBs are required. Among the most important features of SMRM are scalability to a large number of groups and agents, extensibility via dynamic MIB scripts which allow developers to deploy their own multicast management scripts, on-line and postmortem monitoring analysis, selection of traffic characteristics, and minimal operational overhead. Our experiments with the prototype implementation of SMRM show that the overhead of the SMRM ping-pong technique is comparable to that of traceroute and that the impact of the message payload is negligible [7]. Nevertheless, we consider SMRM a basic component toward supporting a comprehensive multicast management solution, which includes fault detection and isolation, event correlation for performance and security management, QoS monitoring, and SLA (Service Level Agreement) management in enterprise networks. Our research agenda also includes integrating IP multicast in SNMPv3, developing IP routing topology discovery, monitoring preexisting multicast sessions, enhancing the delay monitoring scheme, and providing a self-synchronization scheme.


REFERENCES

[1] SNMPv3 IETF WG Mailing List, [email protected]. com, June 2001.

[2] AdventNet API for SNMP Development. Technical report, AdventNet, http://www.adventnet.com/products/snmp/index.html.

[3] Ehab Al-Shaer. Active Management Framework for Distributed Multimedia Systems. Journal of Network and Systems Management (JNSM), 8(1), March 2000.

[4] Ehab Al-Shaer. A Dynamic Group Management for Scalable Distributed Event Correlation. In IEEE/IFIP Integrated Management (IM'2001), May 2001.

[5] Ehab Al-Shaer, Hussein Abdel-Wahab, and Kurt Maly. HiFi: A New Monitoring Architecture for Distributed System Management. In Proceedings of International Conference on Distributed Computing Systems (ICDCS'99), pages 171-178, Austin, TX, May 1999.

[6] Ehab Al-Shaer and Yongning Tang. Toward Integrating IP Multicasting in Internet Network Management Protocols. Journal of Computer and Communications Review, 24(5):473-485, December 2000.

[7] Ehab Al-Shaer and Yongning Tang. Design and Implementation of SNMP-Based Multicast Reachability Monitoring Framework, CTI. Technical report, School of Computer Science, Telecommunications and Information Systems, DePaul University, August 2001.

[8] K. Almeroth. Managing IP multicast traffic: A first look at the issues, tools and challenges. Technical report, IP Multicast Initiative Summit, September 1998.

[9] K. Almeroth. The Evolution of Multicast: From the MBone to Inter-domain Multicast to Internet2 Deployment. IEEE Network, Jan 2000.

[10] K. Almeroth and L. Wei. Justification for and use of the multicast routing monitor (MRM) protocol. Technical report, IETF Internet Draft, February 1999.

[11] K. Almeroth, L. Wei, and D. Farinacci. Multicast Reachability Monitor (MRM), Internet Engineering Task Force Internet Draft. Technical report, IETF, October 1999.

[12] D. Bacher, A. Swan, and L. Rowe. rtpmon: A third-party RTCP monitor. ACM Multimedia '96, pages 437-438, November 1996.

[13] J. Case, M. Fedor, M. Schoffstall, and J. Davin. Simple Network Management Protocol, RFC 1157. Technical report, IETF, May 1990.

[14] S. E. Deering and D. Cheriton. Multicast routing in internetworks and extended LANs. ACM Transactions on Computer Systems, 8(2):85-110, May 1990.

[15] C. Diot, B. Levine, B. Lyles, H. Kassem, and D. Balensiefen. Deployment Issues for the IP Multicast Service and Architecture. IEEE Networks Special Issue on Multicast, January 2000.

[16] W. Fenner and S. Casner. A "traceroute" facility for IP multicast, IETF Internet Draft. Technical report, March 2000.

[17] M. Handley. Sdr: Session Directory Tool, Technique Report. Technical report, University College London, March 1995.

[18] V. Jacobson. Congestion Avoidance and Control. Computer Communication Review, 18(4):314-329, August 1988.

[19] D. Levi and J. Schoenwaelder. Definitions of Managed Objects for Scheduling Management Operations, RFC 2591. Technical report, IETF, May 1999.

[20] D. Makofske and K. Almeroth. MHealth: A real-time graphical multicast monitoring tool for the MBone. In Workshop on Network and Operating System Support for Digital Audio and Video (NOSSDAV), June 1999.

[21] Perl 5.005 SNMP Package. Technical report, Swiss Academic and Research Network (SWITCH), http://www.switch.ch/misc/leinen/snmp/perl/.

[22] M. Rose and K. McCloghrie. Structure and Identification of Management Information for TCP/IP-based Internets, RFC 1155. Technical report, IETF, May 1990.

[23] M. Rose and K. McCloghrie. Concise MIB Definitions, RFC 1212. Technical report, IETF, March 1991.

[24] J. Schoenwaelder and J. Quittek. Script MIB Extensibility Protocol Version 1.0, RFC 2593. Technical report, IETF, May 1999.

[25] H. Schulzrinne, S. Casner, R. Frederick, and V. Jacobson. RTP: A Transport Protocol for Real-Time Applications. Technical report, IETF, July 2001.

[26] J. Walz and B.N. Levine. A Hierarchical Multicast Monitoring Scheme. In 2nd International Workshop on Networked Group Communication, November 2000.