Investigation of Future Optical Metro Ring Networks based on 100-Gigabit Metro Ethernet (100GbME)

A. Zapata (1), M. Düser (1), J. Spencer (1), I. de Miguel (2), P. Bayvel (1), D. Breuer (3), N. Hanik (3), and A. Gladisch (3)

(1) University College London (UCL), Dept. Electronic & Electrical Engineering, London WC1E 7JE, UK
(2) Universidad de Valladolid, Dpto. Teoría de la Señal y Comunicaciones e Ingeniería Telemática, 47011 Valladolid, Spain
(3) T-Systems Nova, Technologiezentrum, 10589 Berlin, Germany

Corresponding author: M. Düser, Tel.: +44 20 7679 3843, Fax: +44 20 7388 9325, [email protected]

Abstract

This paper reports results of a performance comparison for four optical ring network architectures envisaged for future metropolitan area networks (MANs), with particular emphasis on the design of and compatibility with possible 100-Gigabit-Metro-Ethernet (100GbME) standards. Both analytical and numerical modelling techniques were applied to quantify and compare network performance for all architectures in terms of achievable throughput, delay and the number of required wavelengths. Non-uniform traffic required additional resources and dynamic adaptation of the slotted ring architecture. The study also considered aspects of the physical transmission and interfaces (PHY) for consideration in 100GbME standards.

1 Introduction

From its development in 1973, Ethernet has evolved very rapidly from providing short-distance connections between computers in local area networks (LAN) to covering campus-size distances and beyond in the 10-Gigabit-Ethernet standard (IEEE 802.3ae). The main advantages of Ethernet are low cost, simplicity and high speed compared to other protocols such as FDDI (Fiber Distributed Data Interface) /1/. However, the connection beyond buildings and campuses is still performed over a combination of ATM and SDH networks, usually by transporting layer 3 protocol data units, such as IP packets, rather than actual Ethernet frames.

An implementation of Ethernet in the core may, therefore, yield cost benefits through a less complex network design with fewer layers and provide simpler, more scalable, homogeneous campus-to-campus and metropolitan area networks /2/. Ring topologies are the preferred architecture for implementing MAN networks because they are easy to deploy and provide resilience in the case of a link failure.

In this paper we investigate the key requirements to extend the current 10GbE standard to 100GbME. 100GbME would be based on optical ring technologies, providing a single transport protocol extending into the MAN. The modifications required for the extension to a metro-scale 100GbME network are discussed in section 2. With these requirements under consideration we then evaluate the performance of a range of optical architectures (section 3), and finally discuss physical implementation aspects related to 100GbME networks and physical interfaces (PHY) (section 4).

2 Ethernet Overview and Extensions

Ethernet is a standard that defines a connectionless, routable, variable packet size data-link layer protocol and a series of physical layer interfaces. It was originally developed to connect hosts in a LAN over short distances (<100 m) over physical layers with a shared bus architecture that had no loops in its topology and no central control system. As a connectionless protocol it offers no guarantees as to packet loss, delay and other end-to-end characteristics. In more recent implementations Ethernet has moved from contended shared bus cabling to hub-based unshielded cabling and hub-based fibre. The shift to a hub-based architecture has meant that the actions of a hub now determine network capability and performance, as all contention over resources now lies within the hub. During this evolution faster switching and higher bandwidth physical links have seen Ethernet move out from LAN applications into the campus, where technologies such as FDDI were typically used.

Ethernet's application in larger and more complex deployments has, however, identified limitations in the original design and driven the development of extensions to the original protocol to support new features. The use of filtering in hubs prevented packets flooding every hub interface and allowed arbitrary physical topologies. The use of a spanning tree algorithm (IEEE 802.1D) removed loops and meant the logical topology would always be a tree, so that there was only one route to every destination. 802.1D also added the ability to monitor links and passively learn a table of accessible Ethernet hosts. Later, additional Ethernet switch (a filtering hub) features were introduced, such as the ability to aggregate multiple physical links between switches (IEEE 802.3ad) to aid the scaling of capacity, as well as traffic control extensions such as traffic prioritisation (IEEE 802.1p) and flow control between switches (IEEE 802.3x). These extensions have all defined changes to switch behaviour, but in IEEE 802.3ac a new frame format added support for multiple Ethernet networks over the same physical links through the use of virtual LAN tagging. In a VLAN (virtual LAN) the Ethernet stations can be seen as a series of overlaid Ethernet topologies, each with their own routing and logical topologies (IEEE 802.1s). These extensions are attempts at providing additional functionality to support a larger number of users with better levels of service within the existing network, but are not sufficient to support metro-scale networks.

2.1 Challenges in 100G Metro Ethernet

The requirements of a metro-scale core network are different to those of a LAN. Metro networks are expected to carry more than just best-effort data, as well as be able to support a very large number of end users and high capacity links in a scalable manner. Also, transport network features are expected, such as OAM (operations, administration and maintenance) functionality, improved network resilience and QoS guarantees. The main features required are:

• Scalability
The non-hierarchical (not sub-networked) nature of the MAC (medium access control) address space (a MAC address is an Ethernet address) does not allow for route aggregation in switches and, therefore, every switch must contain an entry for every Ethernet node, which can lead to very large tables. This becomes less significant when layer 3 routing is used at the edge of the network (thereby decreasing the number of Ethernet nodes in the network). The second scalability issue is that of effective network resource use and Ethernet's lack of load balancing. In LANs, where link utilisation was low and a tree-like physical layer was expected, routing was simple and, while the topology did create bottlenecks, the performance was still acceptable. To support many multiplexed sources in the core, the topology will have to distribute load and use alternative paths to maximise available capacity. Routing must, therefore, be more flexible and predictable than the simple default spanning tree scheme. Both load distribution and predictable routing can be achieved by explicitly teaching (rather than passively learning) the MAC table configuration, which would give full control over routing.

• Service level guarantees
In the LAN, over-provisioning and a relatively small number of users have meant that perceived levels of service are acceptable; however, this is not the case for metro-scale networks. In core networks careful policing of resources is required to prevent flows interfering with each other and monopolising resources. Soft guarantees may be achieved by careful dimensioning, routing and traffic prioritisation, but this requires complicated planning, management and the policing of traffic sources at the edge. Hard guarantees can be achieved through the use of physical layer constraints such as the timeslot architectures proposed later in this document.

• Resilience
In the case of node or link failure the network must quickly recognise this and reconfigure to be able to continue transporting data. Current Ethernet spanning tree re-configurations are too slow (IEEE 802.1D takes ~1 minute, 802.1w 2-3 seconds) to provide the millisecond-scale protection switching expected of metro core networks.

• OAM and signalling
Management and monitoring of Ethernet network devices is currently done through SNMP (simple network management protocol) and RMON (remote monitoring protocol) requests that are sent in-band within a Layer 3 protocol packet (e.g. IP), and as such the management is part of the actual data flow. Ideally, the data carried should not interfere with OAM signalling. A possible solution to this is to use special new frame types (in a similar way to PAUSE frames in IEEE 802.3x flow control) that can be sent between adjacent switches.
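The difference between passively learned and explicitly taught MAC tables, raised under Scalability above, can be illustrated with a toy switch model. This is only a sketch: the `Switch` class, its `teach` method and the choice that a taught switch drops frames for unknown destinations are assumptions made for illustration and are not taken from any Ethernet standard.

```python
class Switch:
    """Toy Ethernet switch with a MAC table mapping address -> output port."""

    def __init__(self, ports, learning=True):
        self.ports = list(ports)
        self.learning = learning  # False models a switch whose table is taught
        self.table = {}

    def teach(self, mac, port):
        """Explicitly install an entry (hypothetical management action)."""
        self.table[mac] = port

    def forward(self, src, dst, in_port):
        """Return the list of ports a frame from src to dst is sent out of."""
        if self.learning:
            # Passive learning: remember which port the source was seen on.
            self.table[src] = in_port
        if dst in self.table:
            out = self.table[dst]
            # Filtering: never send a frame back out of its arrival port.
            return [] if out == in_port else [out]
        if self.learning:
            # Unknown destination: flood to every other port.
            return [p for p in self.ports if p != in_port]
        # Assumption: a taught switch drops frames it has no entry for.
        return []


learned = Switch([1, 2, 3])
print(learned.forward("A", "B", 1))  # B unknown, frame is flooded
print(learned.forward("B", "A", 2))  # A was learned on port 1

taught = Switch([1, 2, 3], learning=False)
taught.teach("B", 3)
print(taught.forward("A", "B", 1))   # follows the installed route
```

In the taught case the route to every destination is decided by whoever fills the table, which is what makes load distribution and predictable routing possible.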

Solutions to some of the above problems already exist: RRSTP (Rapid Ring Spanning Tree protocol) /3/ is a proprietary spanning-tree replacement that attempts to provide protection switching (in about 400 ms) through the use of ring-based logical topologies. A more complete attempt at providing metro-class Ethernet connectivity is RPR (Resilient Packet Rings) /4/, which uses a static ring of point-to-point fibres and sends Ethernet-like frames (not directly compatible with, but translatable to, the original Ethernet format) around the ring. The use of a ring logical topology allows for fast protection switching (50 ms), while careful output queuing at the nodes provides fairness in resource use as well as a number of service quality classes.

Another issue with the use of Ethernet in the core and at high bit rates is that of frame size limitations. The original Ethernet standard allowed frames to be a maximum of 1500 bytes to minimise collisions in shared bus networks. In new applications such as SANs (storage area networks) it would be beneficial to be able to support much larger frame sizes: IPv4 supports a maximum packet size of 64 kBytes. To this end 9000-byte frames have been proposed (known as jumbo frames /5/) to minimise the need for fragmentation and maximise throughput efficiency. At the other extreme, the large packet header overhead of Ethernet (24-30 bytes) makes it inefficient to carry individual voice flows that have small payloads.
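The effect of frame size on efficiency can be made concrete with a short calculation. The sketch below assumes that the quoted frame sizes are payload sizes, takes the upper end (30 bytes) of the overhead range given above, and uses a hypothetical 20-byte voice payload, which is not a figure from this paper.

```python
def frame_efficiency(payload_bytes, overhead_bytes):
    """Fraction of the transmitted bytes that carry payload."""
    return payload_bytes / (payload_bytes + overhead_bytes)


OVERHEAD = 30  # bytes, upper end of the 24-30 byte range quoted in the text

cases = [
    ("standard frame (1500 B)", 1500),
    ("jumbo frame (9000 B)", 9000),
    ("voice sample (20 B, assumed)", 20),
]
for name, payload in cases:
    print(f"{name}: {frame_efficiency(payload, OVERHEAD):.3f}")
```

Under these assumptions the gain from jumbo frames is small in efficiency terms, while the small voice payload loses more than half of the capacity to headers.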

Those challenges aside, the key factor in implementing Ethernet in the core is the optical architecture and the balance between optical layer capability and Layer 2 functions. For example, should protection be performed in the optical layer or through some Ethernet extension or function? Similarly, should entire wavelengths be dedicated to a point-to-point Ethernet link with soft QoS guarantees provided by queuing at the Ethernet switch, or should the wavelength be time division multiplexed to provide smaller, more manageable bandwidth units with hard guarantees on QoS? The answer depends on the exact requirements of the network and the electrical or optical equipment capabilities at the required line speeds. At one extreme are optical packet networks, where the wavelengths are simple static point-to-point lightpaths (wavelength channels) with the entire capacity of the wavelength channel used for the link. RPR is an example of such a system. At the other extreme are time-slotted architectures, where the optical equipment must differentiate between channels (timeslots) and forward wavelengths accordingly. The static wavelength approach, while having simpler optics, requires electronics at least as fast as the line speed (≥100 Gbit/s) for MAC lookups and packet queuing. Time-slotted architectures, on the other hand, effectively reconfigure the entire network at the end of every timeslot, requiring sophisticated optical processing. Their big advantage, however, is hard QoS guarantees and a finer granularity of capacity that can be dynamically grouped to adapt to demand.

With the requirements of metro-scale Ethernet under consideration we will now examine a range of optical architectures and the performance characteristics they provide.

3 Comparative performance evaluation of optical metro rings

With the aim of comparing different optical metro ring architectures, the following four alternatives are evaluated in this section: (1) static wavelength-routed optical network (WRON) ring, (2) static slotted ring, (3) dynamic optical burst switching with just-enough-time signalling mechanism (OBS-JET) and (4) dynamic wavelength-routed optical burst switching (WR-OBS) ring. Architectures (1) and (2) correspond to static networks where the traffic matrix is known a priori and network resources are allocated accordingly before the start of the network operation. Architecture (1) allocates resources at lightpath level, while (2) multiplexes wavelengths between different connections, allowing a finer granularity than (1) in the resource allocation. Compared to dynamic networks, static networks are relatively simple to design, operate and manage, and inherently have zero delay at the head of the transmission buffer and zero blocking; however, they cannot deal efficiently with traffic changes and need more extra capacity to be resilient to network failures than dynamic approaches /6/.

To evaluate the potential saving of resources in dynamically operated networks, architectures (3) and (4) are also considered in this paper. Both correspond to burst switched networks, but (3) uses a one-way resource reservation mechanism whilst (4) utilises an end-to-end reservation scheme with acknowledgements. Architecture (1) is the type of optical system that would be commonly used with Ethernet variants like RPR, where the intelligence is not in the static optical layer but rather in the Ethernet layer. Architecture (2), together with a control system for dynamic timeslot allocation, can be seen as an example of a network where there is more intelligence in the optical layer. Architectures (3) and (4) take this optical layer intelligence to the extreme by allowing a more flexible adaptation of the network to traffic demand than architectures (1) and (2). The four architectures are evaluated in terms of end-to-end delay, throughput and maximum number of wavelengths required in any given link in the network.

3.1 Description of architectures

3.1.1 Quasi-static WRON ring

In a static WRON, lightpaths between network node pairs must be allocated to accommodate the traffic demand matrix known a priori. It is assumed that the traffic matrix does not change frequently (minutes or hours) and, before the network operation starts, lightpath allocation is performed and switches are configured accordingly. Lightpath assignment is performed so that wavelength collisions do not occur in the same fibre and the minimum number of wavelengths is used. Once the network begins to operate, data arriving at the electrical interface of an edge node is classified per destination, transformed into an optical signal and sent into the corresponding allocated lightpath.

Because the task of allocating lightpaths in a static network is an NP-complete problem /7/, heuristics are used to perform lightpath allocation. The heuristic proposed in /8/ is used here because it has been shown to yield the solution requiring the lowest number of wavelengths on rings. The heuristic assumes that only one lightpath is required between every pair of nodes and that the network is equipped with one bi-directional link between adjacent nodes. By way of example, Figure 1 shows the lightpath allocation for a 4-node ring.

Figure 1: Static lightpath allocation in a 4-node ring (two panels, clockwise direction and anticlockwise direction; in each direction four lightpaths are labelled $\lambda_1$ and two are labelled $\lambda_2$)
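A simple link-load count gives a feel for how many wavelengths a full mesh of lightpaths needs on a bi-directional ring. The sketch below is not the heuristic of /8/: it only routes one lightpath per ordered node pair along the shortest direction and counts lightpaths per directed link, which is a lower bound on the wavelengths needed on that link. The tie-breaking rule for node pairs exactly half a ring apart (even-numbered sources go clockwise, odd-numbered ones anticlockwise) and the function name `ring_link_loads` are assumptions made for illustration.

```python
from collections import Counter


def ring_link_loads(n):
    """Count lightpaths per directed link for a full mesh on an n-node ring.

    Nodes are numbered 0..n-1; link (a, b) is the directed fibre from a to b.
    """
    loads = Counter()
    for src in range(n):
        for dst in range(n):
            if src == dst:
                continue
            cw = (dst - src) % n   # hops when travelling clockwise
            ccw = n - cw           # hops when travelling anticlockwise
            # Assumed tie-break: even sources clockwise, odd anticlockwise.
            go_cw = cw < ccw or (cw == ccw and src % 2 == 0)
            hops, step = (cw, 1) if go_cw else (ccw, -1)
            node = src
            for _ in range(hops):
                nxt = (node + step) % n
                loads[(node, nxt)] += 1
                node = nxt
    return loads


for n in (4, 8):
    print(n, "nodes: busiest link carries", max(ring_link_loads(n).values()), "lightpaths")
```

For the 4-node ring the busiest link carries two lightpaths, which agrees with the two wavelengths shown in Figure 1.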

3.1.2 Slotted Ring

In this architecture resources are allocated using a finer granularity than possible with quasi-static WRON. To do so, lightpaths are not permanently established between all pairs of nodes. Instead, the ring is equipped with a pre-specified number of wavelengths multiplexed in time among the different connections (node pairs). The architecture consists of one unidirectional link between every pair of adjacent nodes, and the wavelength/slot assignment scheme considered here is such that data directed to node $i$ must be transmitted using $\lambda_i$ (the wavelength effectively serves as a receiver address), which is slotted according to the traffic matrix. Thus, $N$ wavelengths are required in an $N$-node ring and, under the assumption of uniform traffic, $\lambda_i$ is divided into $(N-1)$ slots (node $i$ does not transmit data to itself). Figure 2 shows the operation of this architecture under uniform traffic for a 6-node ring for wavelength $\lambda_6$. In this case node $i$ can transmit data to node 6 only during slot $i$. When slot $i$ finishes, node $i$ stops data transmission and waits until slot $i$ starts again. Each slot could be subdivided further into $m$ ($m = 1, 2, \ldots$) sub-slots for higher granularity.

Figure 2: 6-node ring operating as a slotted architecture, with one slot per destination (m=1)

Assuming equidistant nodes on the ring network, the slot size $S_m$ (in bits) can be calculated as follows:

$$S_m = \frac{\pi \cdot D \cdot b}{(N-1) \cdot m \cdot c_f}$$

where $m$ denotes the number of sub-slots, $D$ is the network diameter, $b$ the bit rate, $N$ the number of nodes, and $c_f$ is the velocity of light in optical fibre.

For compatibility with an Ethernet environment it is envisaged to define the slot size such that it is compatible with Ethernet Jumbo frame sizes, assumed here to be 9000 bytes. Figure 3 shows the slot size as a function of the network diameter (0-200 km) and the number of nodes (2-20) for a bit rate of 100 Gbit/s and m=100 sub-slots. The value for m was chosen such that for the specified network diameter of D = 150 km the minimum slot size would be larger than or equal to one Ethernet Jumbo frame, for N = 20 nodes. At a bit rate of 100 Gbit/s, the duration of an Ethernet Jumbo frame is equivalent to 720 ns, which means that all switching and bit-synchronisation in the network must operate on timescales < 100 ns.
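The design point quoted above can be checked numerically. The sketch below reads the slot size as the number of bits in flight around the ring (circumference πD) shared among (N-1)·m sub-slots, i.e. S_m = πDb / ((N-1)·m·c_f). The value of 2·10^8 m/s for the velocity of light in fibre is an assumption, since the paper does not state the figure it used, and the function name is hypothetical.

```python
import math

C_FIBRE = 2.0e8  # m/s, assumed velocity of light in optical fibre


def slot_size_bits(diameter_m, bit_rate, n_nodes, m_subslots, c_f=C_FIBRE):
    """Slot size in bits: bits in flight around the ring per sub-slot."""
    return math.pi * diameter_m * bit_rate / ((n_nodes - 1) * m_subslots * c_f)


JUMBO_BITS = 9000 * 8   # one 9000-byte jumbo frame
BIT_RATE = 100e9        # 100 Gbit/s

slot = slot_size_bits(150e3, BIT_RATE, n_nodes=20, m_subslots=100)
print(f"slot size: {slot:.0f} bits = {slot / JUMBO_BITS:.2f} jumbo frames")
print(f"jumbo frame duration: {JUMBO_BITS / BIT_RATE * 1e9:.0f} ns")
```

With these assumptions the slot for D = 150 km, N = 20 and m = 100 holds more than one jumbo frame, consistent with the criterion used to choose m.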

Figure 3: Slot size as multiples of Jumbo Ethernet frames (9000 bytes) for the slotted ring network for m=100 sub-slots as a function of the number of nodes and the network diameter. The dashed line denotes a diameter of 150 km. (Contour plot; x-axis: network diameter [km], 0 to 200; y-axis: number of nodes, 2 to 20; contour labels: 1, 2, 5, 10, 20.)

3.1.3 Dynamic OBS-JET ring

Optical Burst Switching with Just-Enough-Time signalling mechanism (OBS-JET) /9/ consists of sending bursts of information (electronically built at the edge of the network) through the optical core after a control packet has configured switches on a hop-by-hop basis.

Bursts must remain in the optical domain once released, and the core nodes do not have optical buffering capabilities. For this reason the burst is kept for a short period of time (called offset time) in the electronic buffer of the edge node while the control packet is sent into the optical core network to configure the switches along the path. The offset time must be chosen sufficiently long to allow the switches in the path to be configured by the time the burst arrives. As bursts are assumed to be in the range of tens of kilobytes, there is no time for end-to-end path reservation. As a result, bursts can be dropped at any point along the path to the destination due to channel contention. To decrease the probability of channel contention, full wavelength conversion is compulsory in every node of the network; wavelengths are used only as a medium of high-speed transmission and not to route data. In order to assign wavelength channels, the FF-VF (First-Fit with Void Filling) scheme /10/ is used. In FF-VF the wavelength channels of the corresponding outgoing link are searched in a fixed order. The first channel unreserved for the period $[t + t_{offset},\ t + t_{offset} + t_{burst}]$ ($t$: burst arrival time, $t_{burst}$: burst duration) is then allocated to the burst. If no idle channel is found, the burst is dropped on arrival at that node.
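The FF-VF rule described above can be sketched in a few lines. This is a minimal illustration for a single outgoing link, assuming each channel keeps a plain list of reserved (start, end) intervals; the function name and data layout are hypothetical and are not taken from /10/.

```python
def ff_vf_assign(channels, t_arrival, t_offset, t_burst):
    """First-Fit with Void Filling on one outgoing link.

    channels: list of channels in fixed search order; each channel is a list
              of (start, end) reservations and is updated in place.
    Returns the index of the allocated channel, or None if the burst is dropped.
    """
    start = t_arrival + t_offset
    end = start + t_burst
    for index, reservations in enumerate(channels):
        # The channel fits if the new interval overlaps no existing reservation,
        # so gaps (voids) between earlier reservations can also be used.
        if all(end <= s or start >= e for s, e in reservations):
            reservations.append((start, end))
            return index
    return None  # no idle channel for the whole interval: burst is dropped


link = [[(10.0, 12.0)], []]              # channel 0 already has a later reservation
print(ff_vf_assign(link, 0.0, 1.0, 2.0))  # fits in the void before t=10 on channel 0
print(ff_vf_assign(link, 0.5, 1.0, 2.0))  # overlaps on channel 0, so channel 1 is tried
print(ff_vf_assign(link, 1.0, 1.0, 2.0))  # both channels busy in that interval
```

The first call shows the void-filling part: a burst is placed ahead of an existing later reservation on the first channel rather than being pushed to the next one.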

3.1.4 WR-OBS ring

In WR-OBS /11/, as in OBS-JET, packets are electronically aggregated into bursts at the edge of the network according to their destination. But unlike OBS, end-to-end lightpath reservation is required before sending a burst through the optical core. To reserve a lightpath, at some point of the aggregation process a request is sent to the core network to find and reserve resources for the burst. Once the lightpath has been reserved in the core, an acknowledgement with additional information on the reserved lightpath is sent to the edge node and then the burst can be transmitted. In case a lightpath is not found, a negative acknowledgement is sent to the edge node and the burst is dropped, giving rise to data loss. End-to-end lightpath reservation means that bursts must be in the millisecond range (to allow time for an acknowledgement of lightpath reservation, mainly determined by the propagation time) and that QoS requirements such as latency and jitter can be guaranteed.

WR-OBS can be designed in several ways according to the burst assembly mechanism and the routing and wavelength allocation algorithm used. In this work, the centralised WR-OBS architecture described in /12/ is considered. Electronic data units arriving at edge buffers are aggregated using the Unlimited Burst Size (UBS) burst assembly mechanism /13/. Once the first data unit of a burst arrives at the edge buffer, a timer starts, and $t_{timer}$ units of time after this arrival a request is sent to a control node which performs the lightpath scheduling. When the source edge router receives the acknowledgement, it transmits all data in the buffer until no packets remain there. In addition, the control node can be provided with a "re-attempt function", which means that if a request cannot successfully reserve a lightpath, it can be kept in the control node until resources become available for that connection or until a deadline expires /14/. This deadline, called the maximum scheduling time, $t_{sched,max}$, corresponds to the maximum amount of time that a request can spend in the control node and it is a key parameter in order to provide end-to-end delay guarantees. To provide fairness in the treatment of different requests in the central node, the timers associated with the different buffers are set such that all the connections have the same value for $t_{sched,max}$ /12/. Once in the control node queue, the request which has stayed longest in the queue is served next.

3.2 Comparison of performance

3.2.1 Uniform traffic

Under the assumption of uniform traffic, the described optical metro ring architectures are evaluated in terms of mean end-to-end delay, throughput and number of wavelengths. In the slotted architecture it is assumed that a data unit fits exactly within one slot. To be able to compare the different architectures, all of them are assumed to work with fixed data units, with the exception of WR-OBS, in which the burst assembly process determines the burst size distribution.

3.2.2 Mean end-to-end delay

The mean end-to-end delay, denoted by $D$, corresponds to the time elapsed from the reception of the first bit of a data unit in the transmission buffer until its successful reception at the destination node, that is $D = t_{buffer} + t_{tx} + t_{prop}$, where $t_{buffer}$ is the time that a data unit must spend in the source transmission buffer before being transmitted, $t_{tx}$ the transmission time and $t_{prop}$ the propagation time.

In quasi-static WRON, $t_{buffer}$ is calculated by modelling each node buffer as a continuous-time M/D/1 queue /15/ with service time (transmission time, $t_{tx}$) equal to $L_{data}/b$ (where $L_{data}$ is the data unit size in bits and $b$ the bit rate of the wavelengths). Propagation time in WRON is given by $P_{WRON}/c_f$, where $P_{WRON}$ is the mean path length in the WRON ring and $c_f$ is the speed of light in fibre.

In the slotted ring architecture, $t_{buffer}$ is calculated by modelling each node buffer as a discrete-time M/D/1 queue /16/ with service time equal to $L_{data}/b_{effect}$ (where $b_{effect}$ is the effective bit rate per connection, equal to $b/(N-1)$). Propagation time is given by $P_{slotted}/c_f$, where $P_{slotted} > P_{WRON}$ because uni-directional links are used in the slotted ring (instead of the bi-directional ones in static WRON).

In OBS-JET, $t_{buffer}$ corresponds to $T_{aggr} + t_{offset}$, where $T_{aggr}$ is the burst aggregation time (equal to the time required to build a burst of size $L_{data}$) and $t_{offset}$ the offset time (given by the time required to process a header in a node, $t_h$, multiplied by the number of nodes comprising the path). Transmission time is given by $L_{data}/b$ and propagation time by $P_{OBS-JET}/c_f$. Using fixed routing (choosing the same routes as in WRON), $P_{OBS-JET} = P_{WRON}$.

Finally, in WR-OBS the maximum end-to-end delay is defined a priori by the network application, and timers and algorithms are then set to comply with the specified delay, independently of load or traffic statistics. However, the mean end-to-end delay is much lower than the required maximum. We considered a worst-case scenario by analysing the delay for the first packet of the burst. The mean end-to-end delay of the first packet of each burst has been evaluated through simulation using $t_{buffer} = t_{timer} + t_{rqst\_prop} + t_{CN}$, where $t_{timer}$ is the mean time elapsed from the arrival of the first packet of a burst until the lightpath request is sent to the control node, $t_{rqst\_prop}$ is the mean time for the request to travel to the control node and back to the edge node, and $t_{CN}$ is the mean time the request spends in the control node to find a lightpath. Transmission and propagation times are given by $L_{packet}/b$ and $P_{WR-OBS}/c_f$, respectively, where $L_{packet}$ is the packet size and $P_{WR-OBS}$ is the mean path length in WR-OBS.
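The delay decomposition for the static WRON and OBS-JET cases can be sketched numerically. The sketch uses the standard mean waiting time of an M/D/1 queue, ρ·t_tx / (2(1-ρ)), for the buffer term in WRON. The 100 km mean path length, the fibre velocity of 2·10^8 m/s, the zero aggregation time and the four-node path are placeholder assumptions, since the paper does not list these values here, and 10 kBytes is taken as 80,000 bits.

```python
def md1_wait(load, service_time):
    """Mean waiting time in an M/D/1 queue with deterministic service time."""
    if not 0 <= load < 1:
        raise ValueError("load must satisfy 0 <= load < 1")
    return load * service_time / (2 * (1 - load))


def wron_delay(load, l_data_bits, bit_rate, path_m, c_f=2.0e8):
    """D = t_buffer + t_tx + t_prop for the quasi-static WRON ring."""
    t_tx = l_data_bits / bit_rate
    return md1_wait(load, t_tx) + t_tx + path_m / c_f


def obs_jet_delay(t_aggr, t_header, n_path_nodes, l_data_bits, bit_rate,
                  path_m, c_f=2.0e8):
    """D = (T_aggr + t_offset) + t_tx + t_prop with t_offset = t_h * nodes."""
    t_offset = t_header * n_path_nodes
    return t_aggr + t_offset + l_data_bits / bit_rate + path_m / c_f


L_DATA = 80_000   # bits, 10 kBytes taken as 10,000 bytes
B = 100e9         # bit/s
PATH = 100e3      # m, assumed mean path length (placeholder)

for rho in (0.1, 0.5, 0.9, 0.99):
    print(f"load {rho}: WRON delay {wron_delay(rho, L_DATA, B, PATH) * 1e6:.1f} us")
print(f"OBS-JET (t_h = 1 ns, 4 nodes): "
      f"{obs_jet_delay(0.0, 1e-9, 4, L_DATA, B, PATH) * 1e6:.1f} us")
```

With a transmission time below a microsecond and a propagation time of hundreds of microseconds, the queueing term only becomes noticeable at loads very close to 1.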

In Figure 4 the mean end-to-end delay as a function of the offered load per connection is depicted for architectures (1), (2) and (3) for $b$ = 100 Gbit/s, $L_{data}$ = 10 kBytes, a ring diameter of 150 km, nodes equally spaced around the ring and the number of nodes varying from 8 to 20. For OBS-JET, $t_h$ = 1 ns and 8 wavelengths per link are considered. In the case of WR-OBS the mean end-to-end delay for an 8-node ring is obtained through simulation, assuming that data arrives as an ON-OFF process (with both periods being Poisson distributed) at the edge nodes, where it is processed to form the bursts using the Unlimited Burst Size (UBS) mechanism /12/. The maximum end-to-end delay is set to 40 ms, $t_{sched,max}$ = 36 ms, with 8 wavelengths per link, and AUR-Exhaustive /17/ is the dynamic routing and wavelength allocation algorithm, with a computation time distributed according to a Beta distribution with a maximum of 2 µs. It can be seen that the static WRON architecture exhibits the lowest end-to-end delay, closely followed by OBS-JET, for loads under 0.9. In both cases propagation times dominate the end-to-end delay with respect to other contributions, and thus the end-to-end delay does not change as load increases, except in the static WRON case for loads over 0.9, when queue delays are comparable to propagation times.

Figure 4: Mean end-to-end delay vs. offered load per node pair (x-axis: offered load per connection, 0.1 to 0.999; y-axis: mean end-to-end delay (sec), logarithmic, 1.E-04 to 1.E-02)

In the slotted ring architecture propagation delay is also dominant, and it is higher than in static WRON and OBS-JET mainly due to the longer paths (uni-directional links). For loads in excess of 0.9, slotted ring delay increases faster than static WRON delay because of the waiting time for the corresponding slot. Finally, WR-OBS exhibits the longest mean end-to-end delay (which is well under the maximum required limit of 40 ms) because of the end-to-end reservation process.

3.2.3 Throughput

Throughput, $T$, is defined here as the amount of information successfully delivered per unit of time per connection (node pair). Because the aim of this work is to evaluate the impact of different lightpath scheduling algorithms on the performance of different networks, the effect of data loss due to finite buffer capacity in the edge nodes is neglected by assuming buffers large enough to avoid overflow.

Let us start by analysing the static networks. For WRON a maximum bandwidth of $b$ bits per second (bps) is allocated per connection, while the slotted ring network is dimensioned to hold up to $b/(N-1)$ bps per connection. As long as the input data rate does not exceed these bandwidth limits, there is no data loss in the networks. Therefore, the throughput is $T_{WRON} = \nu_{WRON} L_{data}$ (bps) and $T_{slotted} = \nu_{slotted} L_{data}$ (bps), where $\nu_{WRON}$ and $\nu_{slotted}$ are the arrival rates of data units in WRON and slotted ring, respectively ($0 \le \nu_{WRON} L_{data} \le b$, $0 \le \nu_{slotted} L_{data} \le b/(N-1)$). Normalising the throughput to the bit rate gives $T_{WRON,normalised} = \nu_{WRON} L_{data}/b = \rho_{WRON}$ and $T_{slotted,normalised} = \nu_{slotted} L_{data}/b = \rho_{slotted}$, where $\rho_{WRON}$ and $\rho_{slotted}$ correspond to the offered load per connection in WRON and slotted ring, respectively ($0 \le \rho_{WRON} \le 1$, $0 \le \rho_{slotted} \le 1/(N-1)$).

In contrast, in dynamic networks resources are not pre-assigned to connections and contention for network resources may arise during network operation, resulting in data loss in the OBS-JET network and blocked lightpath requests in WR-OBS. Thus, for the OBS-JET ring the normalised throughput corresponds to $T_{OBS\text{-}JET,normalised} = \rho_{OBS\text{-}JET}(1 - P_L)$, where $\rho_{OBS\text{-}JET}$ is the offered load per connection and $P_L$ is the data loss probability.

For the WR-OBS ring, the normalised throughput corresponds to $T_{WR\text{-}OBS,normalised} = \rho_{WR\text{-}OBS}(1 - PLR)$, where $\rho_{WR\text{-}OBS}$ is the offered load per connection and $PLR$ is the packet loss rate. Both $P_L$ and $PLR$ were evaluated through simulation assuming rings with the same capacity as the static WRON ring (2 wavelengths per link in the 4-node ring and 8 wavelengths per link in the 8-node ring). The OBS-JET simulation considers fixed-length bursts arriving as a Poisson process at every node and shortest path routing. The WR-OBS simulation parameters were described in the mean end-to-end delay section (except for 0.7 µs as the maximum processing time in the 4-node ring). Figure 5 shows the normalised throughput as a function of the offered load per connection for the four architectures in (a) a 4-node ring and (b) an 8-node ring.

The static WRON exhibits the maximum achievable throughput, followed closely by WR-OBS. The good performance of WR-OBS in terms of wavelength savings is due to the low propagation times, which allow a high $t_{sched,max}$ and thus let the central node continue its attempt to find a lightpath. OBS-JET performs worse than WR-OBS, in spite of full wavelength conversion in each node. Finally, the slotted ring architecture shows the lowest throughput due to the reduced bandwidth per connection.

[Plot: normalised throughput against offered load per connection (0.1 to 0.9); curves: static WRON ring, WR-OBS ring, OBS-JET ring, slotted ring]

Figure 5(a): Normalised throughput for 4-node ring

[Plot: normalised throughput against offered load per connection (0.1 to 0.9); curves: static WRON ring, WR-OBS ring, OBS-JET ring, slotted ring]

Figure 5(b): Normalised throughput for 8-node ring
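The normalised throughput expressions of this section can be written down in a few lines. The sketch below is illustrative only: the function names are hypothetical, and the loss value used in the example is a made-up placeholder, since the paper obtains $P_L$ and $PLR$ from simulation.

```python
def throughput_static(rho, capacity=1.0):
    """Normalised throughput of a static network (WRON or slotted ring).

    There is no loss as long as the offered load per connection stays
    within the allocated capacity: 1 for WRON, 1/(N-1) for the slotted ring.
    """
    if not 0.0 <= rho <= capacity:
        raise ValueError("offered load exceeds the allocated capacity")
    return rho


def throughput_dynamic(rho, loss):
    """Normalised throughput of a dynamic network (OBS-JET or WR-OBS).

    `loss` stands for P_L (OBS-JET) or PLR (WR-OBS).
    """
    if not 0.0 <= loss <= 1.0:
        raise ValueError("loss must be a probability")
    return rho * (1.0 - loss)


t_wron = throughput_static(0.8)                             # static WRON
t_slotted = throughput_static(0.1, capacity=1.0 / (8 - 1))  # 8-node slotted ring
t_obs = throughput_dynamic(0.8, loss=0.01)                  # loss value is hypothetical
```

The `capacity` argument makes the reduced per-connection bandwidth of the slotted ring explicit: an offered load of 0.2 is acceptable for WRON but exceeds the 1/7 limit of an 8-node slotted ring.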

3.2.4 Wavelength requirements

In a static WRON ring of $N$ nodes, using /8/ for lightpath allocation, the maximum number of wavelengths required is given by $W = \lceil (N^2 - 1)/8 \rceil$. In the slotted ring case, instead, $N$ wavelengths are required for a ring of $N$ nodes. In the dynamic networks the required number of wavelengths depends on the target blocking (loss) probability and therefore varies with the offered load. In Figure 6 the number of wavelengths required for the different architectures is depicted for the 8-node ring with a target loss probability of $10^{-3}$.
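As a quick check of the two static expressions above, the following sketch evaluates them for small rings. The function names are chosen here for illustration.

```python
import math


def wavelengths_static_wron(n_nodes):
    """W = ceil((N^2 - 1) / 8) for a static WRON ring of N nodes."""
    return math.ceil((n_nodes ** 2 - 1) / 8)


def wavelengths_slotted(n_nodes):
    """The slotted ring needs one wavelength per node."""
    return n_nodes


for n in (4, 8, 12, 20):
    print(n, wavelengths_static_wron(n), wavelengths_slotted(n))
```

For N = 4 and N = 8 the static WRON expression gives 2 and 8 wavelengths, matching the ring capacities quoted in the throughput section; for N = 8 both static architectures need 8 wavelengths.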

[Plot: number of wavelengths (0 to 20) against offered load per connection (0 to 1); curves: static WRON and slotted ring, OBS-JET, WR-OBS]

Figure 6: Required number of wavelengths vs. offered load per connection (8 nodes)

It can be seen that WR-OBS requires fewer wavelengths than OBS-JET to achieve the same blocking probability, and fewer wavelengths than static WRON (and slotted ring) for loads under 0.7. These results, along with the delay and throughput results, show that WR-OBS can achieve high throughput at low delay (time-critical applications require the maximum end-to-end delay to be limited to 100 ms /18/) while saving resources with respect to the other architectures. Only quasi-static WRON exhibits better performance in terms of throughput and delay, but it requires more wavelengths at low loads.

3.2.5 Non-uniform traffic

A preliminary study of the impact of non-uniform traffic on network performance is presented in this section. To compare both cases (uniform and non-uniform), the total load was kept constant and equal to $N(N-1)\rho_u$, where $N$ is the number of nodes and $\rho_u$ is the offered load per node pair in the uniform case. The type of non-uniformity considered here assumes that there is one node in the network, the hub $H$, which absorbs a fraction $\beta$ of the total network load. The remaining fraction $(1-\beta)$ of the total network load is uniformly distributed among the rest of the nodes. In this way, two traffic matrices ($U$ for uniform traffic and $NU$ for non-uniform traffic) are defined as follows:

$$
U = \begin{array}{c|cccc}
 & H & 1 & \cdots & N-1 \\ \hline
H & 0 & \rho_u & \cdots & \rho_u \\
1 & \rho_u & 0 & \cdots & \rho_u \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
N-1 & \rho_u & \rho_u & \cdots & 0
\end{array}
\qquad
NU = \begin{array}{c|cccc}
 & H & 1 & \cdots & N-1 \\ \hline
H & 0 & \rho_H & \cdots & \rho_H \\
1 & \rho_H & 0 & \cdots & \rho_n \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
N-1 & \rho_H & \rho_n & \cdots & 0
\end{array}
$$

where $\rho_H = \beta N \rho_u / 2$ and $\rho_n = (1-\beta) N \rho_u / (N-2)$. The level of unbalance $\beta$ varies from $2/N$ to 1. When $\beta = 2/N$, the scenario corresponds to a network with uniform load ($NU \equiv U$). When $\beta = 1$, the hub concentrates all the traffic in the network.

A key aspect to study is the maximum level of unbalance $\hat{\beta}$ that a network designed for uniform traffic can support without requiring extra capacity. The results are shown in Figure 7 and quantified in the following section.
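The definitions of $\rho_H$ and $\rho_n$ can be checked numerically. The sketch below builds both matrices as nested lists (with the hub placed at index 0, a choice made here for convenience) and confirms that the total load stays at $N(N-1)\rho_u$; the function names are hypothetical.

```python
def traffic_matrices(n_nodes, rho_u, beta, hub=0):
    """Return (U, NU) for a ring of n_nodes with the hub at index `hub`."""
    if n_nodes < 3:
        raise ValueError("need at least 3 nodes so that rho_n is defined")
    rho_h = beta * n_nodes * rho_u / 2                    # hub-connections
    rho_n = (1 - beta) * n_nodes * rho_u / (n_nodes - 2)  # all other connections
    uniform = [[0.0 if i == j else rho_u for j in range(n_nodes)]
               for i in range(n_nodes)]
    non_uniform = [[0.0 if i == j else (rho_h if hub in (i, j) else rho_n)
                    for j in range(n_nodes)]
                   for i in range(n_nodes)]
    return uniform, non_uniform


def total_load(matrix):
    """Sum of all entries of a traffic matrix."""
    return sum(sum(row) for row in matrix)


U, NU = traffic_matrices(n_nodes=8, rho_u=0.1, beta=0.5)
print(total_load(U), total_load(NU))   # both approximately N(N-1)*rho_u = 5.6
```

Setting `beta=2/8` in the same call makes every off-diagonal entry of NU equal to $\rho_u$, which reproduces the uniform case.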

Figure 7: Maximum level of unbalance vs. offered load per connection, breakpoint is for $\rho_u = 2/N$

The breakpoint in Figure 7 occurs at $\rho_u = 2/N$, where $\hat{\beta} = 1$ and $\rho_H = 1$. At this point the level of unbalance is maximum and the capacity assigned to node pairs $(i,H)$ and $(H,i)$ (called hub-connections from here on) is fully used. For $\rho_u < 2/N$ the level of unbalance can be maximum, but the capacity of the hub-connections is not fully used. For $\rho_u > 2/N$ the hub-connections need extra capacity to support $\hat{\beta} = 1$. Without extra capacity, only $\hat{\beta} = 2/(N\rho_u)$ can be supported, which follows from setting $\rho_H = \beta N \rho_u / 2 = 1$.
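The two regimes around the breakpoint can be summarised as $\hat{\beta} = \min(1, 2/(N\rho_u))$. A minimal sketch, with a function name chosen here for illustration:

```python
def max_unbalance(n_nodes, rho_u):
    """Largest beta supported without extra capacity on hub-connections."""
    if rho_u <= 0:
        raise ValueError("rho_u must be positive")
    return min(1.0, 2.0 / (n_nodes * rho_u))


# 8-node ring: the breakpoint lies at rho_u = 2/8 = 0.25
for rho_u in (0.1, 0.25, 0.5, 1.0):
    beta_hat = max_unbalance(8, rho_u)
    rho_h = beta_hat * 8 * rho_u / 2   # resulting load on hub-connections
    print(rho_u, beta_hat, rho_h)
```

Below the breakpoint the hub-connection load stays under 1; at and above it the hub-connections are fully used ($\rho_H = 1$) and $\hat{\beta}$ falls as $1/\rho_u$.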

3.2.6 Non-uniform traffic in slotted ring

The slotted ring architecture was evaluated under non-uniform traffic in terms of delay, throughput and number of wavelengths.

Only queueing times are shown here, since non-uniform traffic does not alter propagation or transmission times. The discrete M/D/1 queue model was used to model the node buffers, but now there are two types of queues: those of hub-connections (with load $\rho_H$) and those of non-hub-connections (with load $\rho_n$). Figure 8 shows the queueing delay for a 12-node ring in the uniform case and in the non-uniform case with $\beta = 0.5$. The parameters $D_H$ and $D_n$ correspond to the delay experienced by hub-connections and non-hub-connections, respectively.


[Plot: delay (sec), 1.00E-06 to 4.51E-04, against offered load per connection (0.05 to 0.85); curves: uniform case, $D_H$ with $\beta = 0.5$, $D_n$ with $\beta = 0.5$]

Figure 8: Queue delay for 12-node ring

It can be seen that different connections experience different delays, which results in unfairness. To offer fair treatment to all connections, $D_H = D_n$ must hold. To comply with this condition, slot sizes must be adjusted so that:

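$$
S_H = S_n \, \frac{\rho_n}{\rho_H} \left( \frac{1-\rho_H}{1-\rho_n} \right) = S_n \, \frac{(1-\beta)(2-\beta N \rho_u)}{\beta \left[ (N-2) - (1-\beta) N \rho_u \right]}
$$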

where $S_H$ and $S_n$ are the slot sizes of hub-connections and non-hub-connections, respectively. The equation above means that a dynamic slot scheduler is required to obtain the same delay for every connection, since slots must be adjusted not only as a function of the level of unbalance but also as the load changes. As a result, the simple slotted architecture proposed for the uniform case has to be modified to be able to support non-uniform traffic.

Since the total network load is kept constant and the slotted ring has no loss, the throughput is the same as in the uniform case.

To calculate the number of wavelengths required to support non-uniform traffic, it was assumed that slot lengths can be dynamically adjusted (modifying the architecture by using a dynamic slot scheduler) to give more transmission time to more heavily loaded connections. Let $W_{Hub}$ be the number of wavelengths required to support the traffic load sent to the hub node (i.e., the first column of the NU matrix) and $W_i$ the number of wavelengths required to support the traffic load sent to node $i$ (i.e., the $i$-th column of the NU matrix). The number of wavelengths is calculated by dividing the required capacity by the capacity provided by one wavelength. That is:

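$$
W_{Hub} = \left\lceil \frac{(N-1)\rho_H}{N-1} \right\rceil = \left\lceil \rho_H \right\rceil = \left\lceil \frac{\beta N \rho_u}{2} \right\rceil
$$

$$
W_i = \left\lceil \frac{\rho_H + (N-2)\rho_n}{N-1} \right\rceil = \left\lceil \frac{(2-\beta) N \rho_u}{2(N-1)} \right\rceil
$$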

Both $W_{Hub}$ and $W_i$ are shown in Figure 9 for $\beta = 1$. It can be seen that only the number of wavelengths devoted to transmitting data to the hub needs to be increased, while the remaining wavelengths are kept underutilised (these cannot be used to send data to the hub due to the architecture constraint of one fixed receiver per node). The presented results show that the slotted architecture as originally proposed is not well suited to work under non-uniform traffic. Evaluating the impact of non-uniform traffic on the remaining architectures is part of further research.
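The closed forms $W_{Hub} = \lceil \beta N \rho_u / 2 \rceil$ and $W_i = \lceil (2-\beta) N \rho_u / (2(N-1)) \rceil$ can be evaluated as sketched below. Exact rational arithmetic is used so that the ceiling is not disturbed by floating-point rounding; that choice, and the function names, are assumptions of this sketch rather than part of the paper.

```python
import math
from fractions import Fraction


def _exact(x):
    """Convert a number such as 0.25 to an exact fraction via its decimal text."""
    return Fraction(str(x))


def wavelengths_to_hub(n_nodes, rho_u, beta):
    """W_Hub = ceil(beta * N * rho_u / 2)."""
    return math.ceil(_exact(beta) * n_nodes * _exact(rho_u) / 2)


def wavelengths_to_node(n_nodes, rho_u, beta):
    """W_i = ceil((2 - beta) * N * rho_u / (2 * (N - 1)))."""
    return math.ceil((2 - _exact(beta)) * n_nodes * _exact(rho_u)
                     / (2 * (n_nodes - 1)))


# 8-node ring with beta = 1 (the hub concentrates all the traffic)
for rho_u in (0.25, 0.5, 1.0):
    print(rho_u,
          wavelengths_to_hub(8, rho_u, 1),
          wavelengths_to_node(8, rho_u, 1))
```

For $\beta = 1$ the hub needs one wavelength up to the breakpoint $\rho_u = 2/N$ and $N/2$ wavelengths at $\rho_u = 1$, whereas a single wavelength suffices for every other node over the whole load range.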

Figure 9: Required number of wavelengths vs. offered load per connection

4 Implementation of 100GbME

The extension of the existing Ethernet standards towards 100GbME also raises a number of questions with respect to the physical implementation of such a scheme. This section is concerned, in particular, with a specification for the node design (physical interfaces, PHY) and the spectral efficiency of the transport scheme used.

4.1 Network node design and multiplexing

A key challenge in the design of 100GbME is the definition of the physical interfaces (PHY) for operation at high bit rates. Depending on the implementation, and due to the lack of standardization activities, bit rates of 80 to 160 Gbit/s were assumed to be compatible with the term 100GbME.

Although integrated circuits (ICs) have already been demonstrated to be, in principle, operational for selected functionalities on electrical signals at up to 100 Gbit/s /19,20/, obstacles such as the lack of amplification and modulation, and bonding problems, still prevail. It is, therefore, believed that in 100GbME all signals will be optical in nature, as is already the case for 10GbE (IEEE 802.3ae). Since the lack of electronics to date prohibits the direct modulation of continuous wave (cw) light at 100 Gbit/s, a 2-stage multiplexing technique is proposed, as shown in Figure 10. It was assumed in all cases that the minimum input bit rate into the multiplexer would be in the form of GbE signals. In a first multiplexing stage, 10 of these signals would be electrically aggregated into a single 10GbE stream, the technology for which already exists today. The novelty of the approach discussed here would be to add a second, optical multiplexing stage, which aggregates ten 10GbE signals into the envisaged 100GbME signal, using either a wavelength division multiplexing technique (WDM, which could be based on the existing PHY of IEEE 802.3ae) or an optical time division multiplexing technique (OTDM). In both cases, two stages of synchronisation are required: a bit-level synchronisation for fast clock and data recovery (CDR, ns timescale) of the data within each slot, and a slot synchronisation which would ensure the synchronous start of slots (µs timescale).

[Diagram labels: lowest level GbE input; 1st stage, electrical multiplexing (MUX 10:1, 10 GbE); 2nd stage, optical multiplexing (WDM MUX with WDM waveband λ1 to λ10, or TDM MUX); bit sync; slot sync; optical output 100 GbE]

Figure 10: Principle of the envisaged 2-stage multiplexing from lower bit rate signals (GbE) via an electrical and an optical multiplexer. The latter would either be based on wavelength division multiplexing (WDM) or optical time division multiplexing (OTDM).

The WDM node as shown in Figure 11 consists of the multiplexing stage as described above, as well as an optical multiplexing/demultiplexing unit and a 1:N switch for connection to the ring network. The optical multiplexer and demultiplexer are used to add/drop management information on the control channel with wavelength $\lambda_{control}$. If fixed wavelength addressing were used, the waveband $\lambda_{B1}$ associated with the node would be dropped, whilst all other wavebands bypass the node. A 1:(N-1) switch is used to add information to any other waveband $\lambda_{Bi}$. All wavebands are aggregated together with the control channel at the output of the node.

[Diagram labels: GbE and 10GbE input; electrical switch/router for queue management; buffer; mod.; WDM waveband source; TL; 1:N switch (λB1 to λBN); all-optical ADD/DROP node; AWG; Rx 1 to Rx 10; datagram reconstruct; SYNC; λcontrol BPF Rx; control information IN and OUT]

Figure 11: Network node based on a waveband approach (densely spaced WDM)

The OTDM solution is shown in Figure 12; the operational principle is the same as for the WDM source. The main differences in the transmitter and receiver are the optical multiplexing and demultiplexing units. The optical multiplexer would consist of a short pulse source ($t_{FWHM}$ < 5 ps) and separate delay lines for multiplexing lower bit rate signals into the 100GbME stream. The key challenge of this scheme is the provisioning of a rapidly tuneable pulse source (< µs) which generates sufficiently short pulses /21/.

[Diagram labels: GbE and 10GbE input; electrical switch/router for queue management; buffer; modulator; short pulse source ($t_{FWHM}$ ≤ 5 ps); OTDM MUX with delay lines (1 to 10); 1:N switch (λ1 to λN); all-optical ADD/DROP node; BPF; burst Rx; user datagram; SYNC; λcontrol BPF Rx; control information IN and OUT]

Figure 12: Network node based on optical time division multiplexing (OTDM) technique

An important metric in terms of the implementation is the spectral efficiency of the transport technology used in the ring network. Table 1 lists the free-spectral range required per channel (OTDM) or waveband (WDM, 10 Gbit/s sub-channels), depending on the physical bit rate. In the case of an OTDM solution, the pulse width determines the required free-spectral range (FSR), whilst for the WDM solution the intra-waveband spacing is critical (here assumed to be 25 GHz or 12.5 GHz). The allowed FSR was assumed to use the same spacing as defined by the International Telecommunications Union (ITU) for WDM systems, i.e. 100, 200 or 400 GHz. The scheme with the highest overall spectral efficiency is the WDM scheme with 12.5 GHz intra-band spacing, although other modulation formats might give different results. Assuming a total bandwidth of >30 nm (C-band) to be available, this translates into a maximum of 20 channels for an FSR of 200 GHz, whilst this reduces to 10 channels for an FSR of 400 GHz. This means that for 100GbME, wavelengths can become a scarce resource unless schemes with high spectral efficiency are used. As the assumption of equidistant nodes around the network will not hold for real topologies, it is envisaged that the link design could be based on the principle of normalized transmission sections /22/.

Scheme       Bit rate [Gbit/s]   FSR [GHz]   Assumption
OTDM (RZ)     80                 200         tFWHM = 6 ps
OTDM (RZ)    100                 200         tFWHM = 5 ps
OTDM (RZ)    120                 400         tFWHM = 4 ps
OTDM (RZ)    160                 400         tFWHM = 3 ps
WDM (NRZ)     80                 200         10 Gbit/s, 25 GHz
WDM (NRZ)    100                 400         10 Gbit/s, 25 GHz
WDM (NRZ)    120                 400         10 Gbit/s, 25 GHz
WDM (NRZ)    160                 400         10 Gbit/s, 25 GHz
WDM (NRZ)     80                 100         10 Gbit/s, 12.5 GHz
WDM (NRZ)    100                 200         10 Gbit/s, 12.5 GHz
WDM (NRZ)    120                 200         10 Gbit/s, 12.5 GHz
WDM (NRZ)    160                 200         10 Gbit/s, 12.5 GHz

Table 1: Free-spectral range (FSR) required for the OTDM and WDM schemes depending on the physical bit rate, pulse width (OTDM), and sub-channel bit rate and spacing (WDM)
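The WDM rows of Table 1 are consistent with a simple rule: the waveband occupies the number of 10 Gbit/s sub-channels times the intra-waveband spacing, rounded up to the next allowed FSR of 100, 200 or 400 GHz. This rule is an inference made here from the table values, not a procedure stated in the paper, and the function name is hypothetical.

```python
import math

ALLOWED_FSR_GHZ = (100, 200, 400)   # allowed FSR values named in the text


def wdm_fsr_ghz(bit_rate_gbps, spacing_ghz, sub_rate_gbps=10):
    """Smallest allowed FSR that fits all sub-channels of one waveband.

    Assumes the waveband width is (number of sub-channels) x (spacing).
    """
    n_sub = math.ceil(bit_rate_gbps / sub_rate_gbps)
    occupied = n_sub * spacing_ghz
    for fsr in ALLOWED_FSR_GHZ:
        if occupied <= fsr:
            return fsr
    raise ValueError("waveband does not fit into the largest allowed FSR")


for spacing in (25.0, 12.5):
    for rate in (80, 100, 120, 160):
        print(rate, spacing, wdm_fsr_ghz(rate, spacing))
```

Under this assumed rule, halving the spacing from 25 GHz to 12.5 GHz halves the FSR for every bit rate in the table, which is why the 12.5 GHz scheme comes out as the most spectrally efficient.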

5 Summary and conclusions

This paper discussed key aspects of the design and deployment of 100-Gigabit Metro Ethernet (100GbME), envisaging the extension of its reach from LANs to MAN rings. This move requires Ethernet to include network management functionality currently not required in the LAN, and to adapt the frame size to make best use of the high bit rates whilst providing backward compatibility. Four different optical ring network architectures operating with 100GbME were considered, and their performance was compared with respect to throughput, delay and number of required wavelengths. Whilst all architectures met the delay constraints under Poisson traffic, a trade-off was observed between throughput and the number of wavelengths required for implementation. The highest throughput was achieved in the quasi-static WRON, closely followed by WR-OBS, but the WR-OBS architecture provided the largest wavelength savings for network loads below 0.7, whilst for higher loads the static WRON or slotted ring used the fewest wavelengths. The operation of the slotted ring was also evaluated for non-uniform traffic, resulting in additional demands for resources and the need to dynamically adapt the slot size to achieve fairness. For the implementation of physical interfaces, WDM and OTDM-based schemes were investigated. Due to the high bit rates considered, it is key to minimize the bandwidth requirements of a transport scheme through high spectral efficiency (e.g. WDM with 12.5 GHz spacing).

Acknowledgements

This work was carried out within the framework of the project ONW2001+ of T-Systems Deutsche Telekom Innovationsgesellschaft mbH, Berlin, whose financial support is gratefully acknowledged.

References

/1/ S. Keshav, An Engineering Approach to Computer Networking, Addison-Wesley, Reading, MA, 1997

/2/ N. Ghani et al., “Metropolitan Area Networks,” in: I.P. Kaminow and T. Li, Optical Fiber Communications, vol. IV-B: Systems and Impairments, Academic Press, San Diego, 2002, pp. 329-403

/3/ Rapid Ring Spanning Tree Protocol by Riverstone Networks, http://www.riverstonenet.com/

/4/ Resilient Packet Ring (RPR) Alliance, http://www.rpralliance.org/

/5/ P. Dykstra, “Gigabit Ethernet Jumbo Frames; And why you should care,” http://sd.wareonearth.com/~phil/jumbo.html

/6/ E. Kozlovski and P. Bayvel, “Link failure restoration in Wavelength-Routed Optical Burst Switched (WR-OBS) Networks,” Proc. OFC 2003, pp. 774-776

/7/ M.R. Garey and D.S. Johnson, Computers and intractability: a guide to the theory of NP-completeness, W.H. Freeman, New York, 1979

/8/ D.K. Hunter et al., “Buffering in optical packet switches,” IEEE J. Lightwave Technol., vol. 16, Dec. 1998, pp. 2081-2094

/9/ C. Qiao and M. Yoo, “Optical Burst Switching (OBS) - a new paradigm for an optical Internet,” IEEE J. High Speed Networks, vol. 8, 1999, pp. 69-84

/10/ Y. Xiong et al., “Control architecture in optical burst-switched WDM networks,” IEEE J. Select. Areas Commun., vol. 18, Oct. 2000, pp. 1838-1851

/11/ M. Düser and P. Bayvel, “Analysis of Wavelength-Routed Optical Burst Switched (WR-OBS) Network Architecture,” IEEE J. Lightwave Technol., vol. 20, Apr. 2002, pp. 574-585

/12/ I. de Miguel et al., “Provision of End-to-End Delay Guarantees in Wavelength-Routed Optical Burst Switched Networks,” Proc. IFIP ONDM 2002, Torino, Italy, Feb. 2002

/13/ I. de Miguel et al., “Traffic load bounds for optical burst switched networks with dynamic wavelength allocation,” Proc. IFIP ONDM 2001, Vienna, Austria, Feb. 2001

/14/ E. Kozlovski and P. Bayvel, “QoS performance of WR-OBS network architecture with request scheduling,” Proc. IFIP ONDM 2002, Torino, Italy, Feb. 2002

/15/ L. Kleinrock, Queueing Theory, vol. I, Wiley, New York, NY, 1975

/16/ O. Yang, “Performance comparison of some discrete service time systems,” Performance Evaluation, vol. 23, 1995, pp. 261-284

/17/ A. Mokhtar and M. Azizoğlu, “Adaptive Wavelength Routing in all-optical networks,” IEEE/ACM Trans. Networking, vol. 6, Apr. 1998, pp. 197-206

/18/ IEEE 802.1D recommendation, Appendix H

/19/ E. Sano, “High-Speed Lightwave Communication ICs Based on III-V Compound Semiconductors,” IEEE Commun. Mag., vol. 39, Jan. 2001, pp. 154-158

/20/ K. Murata et al., “100 Gbit/s multiplexing and demultiplexing IC operations in InP HEMT technology,” Electron. Lett., vol. 38, Nov. 2002, pp. 1529-1531

/21/ L. Davis et al., “Multiwavelength modelocked laser arrays for WDM applications,” Electron. Lett., vol. 34, Sep. 1998, pp. 1858-1860

/22/ N. Hanik et al., “Optimised design of transparent optical domains,” Proc. ECOC 2000, vol. 3, pp. 195-197