Freescale PowerPoint Template - NXP



Freescale™ and the Freescale logo are trademarks of Freescale Semiconductor, Inc. All other product or service names are the property of their respective owners. © Freescale Semiconductor, Inc. 2009.

Version 2

DSP Application Challenges for Building Large Scale Conference Systems

July 2009

Cristian Căciuloiu, Software Architect


Agenda

► Overview of the conference domain

► Functional view of DSP conference application

► Performance challenges

► Conclusions


Overview of the Conference Domain


Context

► Audio conferencing is a ubiquitous tool
  • More or less commoditized in various parts of the world
  • Quick adoption of the newer generation of IP-based conference servers

► Broader exposure of services increases market pressure
  • Increase efficiency using a larger port capacity
  • Improve scalability of newer systems
  • Improve architecture flexibility to add more features incrementally
  • More integration with other Unified Communication resources


Standardized Architecture Models

► Conference implementations follow several models
  • Loosely coupled architectures
    – Participants independently subscribe to multicast streams
    – No centralized membership control
    – Gradual discovery of participants using RTCP reporting
    – Typical for simple and less interactive multimedia sessions
    – More advanced control and interactivity possible in hybrid approaches
  • Tightly coupled architectures
    – The management control resides in a central role called the focus
    – The participants join a conference through the focus (signaling path)
    – The media streams are processed by the media mixer
    – The focus uses a conference policy database to control the media mixer
  • IETF RFC 4353 “A Framework for Conferencing with the Session Initiation Protocol (SIP)”
  • ITU-T H.332 “H.323 extended for loosely coupled conferences”


Loosely Coupled Conferences


Tightly Coupled Conference

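[Figure: two tightly coupled conference models shown side by side.]

► IETF SIP: Policy Server, Focus and Media Mixers serving participants Pa, Pb, Pc, Pd and Pe

► 3GPP Media Server: AS, S-CSCF and MRF (MRFC, MRFP)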


Audio Conference Feature Set

► Industry forums established frameworks of concepts and use cases
  • IETF RFC 5239 (“A Framework for Centralized Conferencing”) presents a large variety of use cases potentially supported by a conference system. These features may or may not be implemented, depending on the target application.

► Examples of conference features (RFC 4597):
  • Basic ad-hoc conferences, where the conference is created automatically upon connecting to the Media Server and participants join using a previously communicated identifier
  • Basic reserved conferences, allocated automatically at the reserved time by the system; participants dial in or are invited into the conference
  • Advanced conferences with different privileges per participant (rights to control the status of the conference or of other participants, including permission to enter or not enter the selection for the media mixer, represented by active/passive status)
  • Participants connecting or disconnecting other participants to/from the conference
  • Authentication/authorization of participants
  • Announcements inserted into the conference, either selectively or to all streams
  • Media recording, either for all streams or for a subset of them
  • Sidebar conferences representing media tracks that run in parallel to the main conference; these allow participants to communicate while still connected to the main event


Functional View of DSP Conference Application


Software Partitioning

► The audio conferencing software is partitioned into several discrete functional blocks to allow their flexible deployment on different physical nodes (processors). There are two major types of processing:
  • Codec channels
  • Audio conferencing subsystem

► The codec channels are IP-IP or TDM-IP channels that provide transcoding and media processing capabilities. They also terminate the network protocols used to stream media between participants and the media server.

► The audio conference subsystem is a modular construction that can be deployed on any physical node (processor), including one that already runs codec channels.


Codec Channels

► The encoder and decoder can be any of the following:
  • G.729AB
  • G.711 + VAD
  • G.722 WB
  • iLBC
  • AAC-LC/LD

► Inband DTMF detector / generator active on the NB codecs G.711 and G.729AB

► RFC 2833 DTMF detector / generator

► Automatic Level Control (ALC)

► Sample Rate Conversion (SRC)

► Optional narrowband line ECAN

[Figure: IP-IP channel block diagram. Blocks shown: Dec, ECAN, DTMF/2833 det, ALC, Enc, DTMF/2833 gen, SRC, Mix, AJB, FJB, with RTP/RTCP and IP/UDP terminations. Ports shown: From Far End, To Far End, To Conf, From Conf, From File, To Record (copy).]


Audio Conference Subsystem

► Speaker Analysis: detect an active speaker in one NB channel, based on energy levels (PAR1 coefficients).

► Winner Selection: select S dominant speakers.

► Winner Processing: auxiliary enhancements to the raw media of the selected dominant speakers, including sampling rate adjustment between winners.

► Mixer: each mixer uses a single sampling rate, selectable at conference initiation time.

[Figure: audio conference subsystem block diagram. Recoverable content follows.]

Functional blocks (input → output):
  • Speaker Analysis (PCM*A → PAR1*A): VAD mono; stereo 2 ways; spatial audio many ways
  • Winner Selection (PAR1*A → PAR2*S): S dominant speakers
  • Winner Processing (PAR2*S & PCM*S → PCM*S): SRC for each winner; stereo 2 ways; spatial audio many ways; noise reduction; language translation; interface to external Unified Communication resources
  • Mixer (PCM*S → PCM*(S+1)): WB mixer, mono; multiple instances; stereo 2 ways; spatial audio many ways

Stream labels:
  • A streams of PCM
  • A parameters for speaker selection
  • S identifiers for winner processing
  • S streams of PCM (selected using the S identifiers)
  • PCM/RTP streams from UC resources
  • S+1 PCM/RTP mixed streams

Channel instances:
  • Decoder (IP*P → PCM*P), P instances: Dec → ECAN → DTMF/2833 det → ALC
  • Encoder (PCM*P → IP*P), P instances: SRC → DTMF/2833 gen → Enc, output over IPv4/Eth multicast

Legend:
  • A, P, S: number of active, passive or winner participants
  • PAR1: proprietary parameters resulting from the Speaker Analysis phase
  • PAR2: proprietary parameters resulting from the Winner Selection phase

Color codes (in the original figure): features planned for execution; placeholders for system upgrades; input and output of a functional block.

The selection from P to A streams is done by out-of-band means (for example, operator control).
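The Speaker Analysis, Winner Selection and Mixer stages above can be sketched roughly as follows. This is an illustrative sketch only: the slides describe PAR1/PAR2 as proprietary parameters, so mean frame energy is used here as a stand-in, and the interpretation of the S+1 outputs (one mix-minus-self stream per winner plus one full mix for everyone else) is an assumption, not something the slides state. All function names are hypothetical.

```python
from typing import Dict, List

FULL_MIX = "full_mix"  # hypothetical key for the stream sent to non-winners


def frame_energy(pcm: List[int]) -> float:
    """Speaker Analysis stand-in: mean squared amplitude of one PCM frame."""
    return sum(s * s for s in pcm) / len(pcm) if pcm else 0.0


def select_winners(frames: Dict[str, List[int]], s: int) -> List[str]:
    """Winner Selection: ids of the S most energetic active streams."""
    ranked = sorted(frames, key=lambda pid: frame_energy(frames[pid]), reverse=True)
    return ranked[:s]


def clip16(x: int) -> int:
    """Saturate a sample to the 16-bit signed PCM range."""
    return max(-32768, min(32767, x))


def mix(frames: Dict[str, List[int]], winners: List[str]) -> Dict[str, List[int]]:
    """Mixer: build S+1 streams (assumed: mix-minus-self per winner + full mix)."""
    if not winners:
        return {FULL_MIX: []}
    n = len(frames[winners[0]])
    total = [sum(frames[w][i] for w in winners) for i in range(n)]
    out = {FULL_MIX: [clip16(v) for v in total]}
    for w in winners:
        # Each winner hears the mix without its own contribution.
        out[w] = [clip16(total[i] - frames[w][i]) for i in range(n)]
    return out


if __name__ == "__main__":
    active = {"a": [1000, -1000], "b": [10, 10], "c": [500, 500], "d": [0, 0]}
    winners = select_winners(active, 2)
    print(winners, mix(active, winners))
```

Winner Processing (sampling rate conversion per winner) is left out of the sketch; it assumes all frames already share the mixer's sampling rate and frame length.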


Performance Challenges


Deployment Options

► Deployment options for the functional blocks across DSP cores/devices and boards when targeting a conference of 15,000 participants

► Hierarchical structure (“hierarchical star”)
  • The bottom layer generates N*x winners
    – Runs the codec channels
    – Local speaker analysis and winner selection
  • The upper layer generates N winners
    – Performs the speaker analysis, winner selection and mixing for the N*x candidates from the first stage

► Distributed structure (“star”)
  • The selection process is complete and definitive, done at the bottom layer
  • The decision is made by the central node, which uses a control path to communicate it

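[Figure, hierarchical star: Conf 1, Conf 2, Conf 3 and Conf 4 each send N winners over RTP to Conf 0, which produces N winners.]

[Figure, star: Conf 1, Conf 2, Conf 3 and Conf 4 send A, B, C and D winners respectively over RTP to Conf 0, for N winners in total (A+B+C+D=N); a control path (Ctrl) links Conf 0 to the other nodes.]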
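The two-stage selection of the hierarchical structure can be illustrated with a small sketch. It assumes each participant has a single numeric speaker score (a hypothetical stand-in for the analysis parameters) and that participant ids are unique across nodes; the function names are illustrative, not taken from the slides.

```python
from typing import Dict, List


def top_n(scores: Dict[str, float], n: int) -> Dict[str, float]:
    """Keep the n highest-scoring participants."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return dict(ranked[:n])


def hierarchical_select(local_nodes: List[Dict[str, float]], n: int) -> List[str]:
    """Bottom layer: each of the x nodes nominates N winners (N*x candidates).
    Upper layer: the central node keeps N winners among those candidates."""
    candidates: Dict[str, float] = {}
    for node_scores in local_nodes:
        candidates.update(top_n(node_scores, n))
    return list(top_n(candidates, n))


if __name__ == "__main__":
    nodes = [
        {"p1": 0.9, "p2": 0.1, "p3": 0.5},
        {"p4": 0.8, "p5": 0.7, "p6": 0.2},
    ]
    print(hierarchical_select(nodes, 2))
```

With distinct scores, any participant in the global top N is also in the top N of its own node, so the upper layer only needs to examine N*x candidates instead of every channel in the system.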


Local Hierarchical Model without Proprietary Information

Speaker Analysis: cascaded stages of processing raw media, first at the local level, for the channels on a DSP device, and subsequently at the centralized node. This duplicates effort, a cost accepted in exchange for interoperability and standard streams.

[Figure: local hierarchical deployment. Local Conference Nodes 1 to D-1 (six cores each) run the codec channels (Ch) together with local SA and WS stages per conference (SA1/WS1, SA2/WS2). Central Conference Node 1 (six cores) runs the typical conference chain SA1 → WS1 → WP1 → MX1, and SA2 → WS2 → WP2 → MX2 for a second conference. Stream labels: S streams from each local node, up to (D-1)*S PCM streams into the central node, S+1 PCM streams, Multicast group 1, Multicast group 2. Abbreviations: SA, Speaker Analysis; WS, Winner Selection; WP, Winner Processing; MX, Mixer.]


Finding Performance Hotspots

► Call capacity based on processing power
  • MCPS of the codec channels
  • MCPS of the audio conferencing subsystem
  • MCPS reservation scheme for system load

► Aggregated media traffic
  • Traffic characteristics of media path (1a, 2a)
    – Media traffic into the audio mixer from codec channels
    – Media traffic from the audio mixer to codec channels
    – Media traffic into the codec channels from far end
    – Media traffic into the codec channels from the audio mixer
    – Media traffic from the announcements into the codec channels
    – Media traffic from the codec channels to the record servers

► Aggregated control traffic
  • Traffic characteristics of control path (Gbit)

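[Figure: inputs to the software performance analysis. Target applications, list of features and use cases, together with product architecture, software architecture and high level design, feed the software performance analysis.]

[Figure: hardware topology. ATCA blades (1 & 2) carrying AMC cards (1, 2 & 3) with DSP1, DSP2 and DSP3, connected through Serial RapidIO and Gbit Ethernet to the hosts; media paths are labeled 1a and 2a.]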


Finding Performance Hotspots (cont’d)

► Contributing to the processing power required by the codec channels:
  • IP protocol termination (IP/UDP stack, RTP packetization, RTCP and Adaptive Jitter Buffer)
  • Voice codec, DTMF det/gen/relay (RFC 2833 or RFC 4733)
  • Adaptive level control, sampling rate conversion, echo canceller/suppressor
  • Auxiliary features: announcements, record
  • Framework control code

► The processing requirements depend greatly on the scope of the channel, for example:
  • Narrowband PSTN: G.711, ECAN, inband DTMF det, ALC, and aux. features
  • Wideband high-definition audio: AAC-LC/LD, relay DTMF, ALC, and aux. features

[Figure: IP-IP channel block diagram, repeated from the Codec Channels slide.]


Finding Performance Hotspots (cont’d)

► Contributing to the processing power required by the central node:
  • IP protocol termination (IP/UDP stack, RTP packetization and fixed jitter buffer)
  • Speaker analysis (voice activity detector in the narrowband spectrum)
  • Winner selection of S (between 3 and 5) dominant speakers
  • Winner processing to adapt the incoming media of each winner to the sampling rate of the mixer
  • Mixer of the dominant speakers
  • Framework control code

Considering all the above factors and the system software headroom (task scheduler, operating system), a single six-core DSP (MSC8156) can support up to 1070 uncompressed PCM streams for a single media mixer.

[Figure: central node software layout. Channel Instance (Receiver), PCM*A or (D-1)*S streams: RTP, FJB and SA, running one per core. Scheduler, running one per core. Inter-core communication. WS / WP / Mixer. Channel Instance (Transmitter), PCM*(S+1) streams: RTP, unicast and/or multicast.]


Finding Performance Hotspots (cont’d)

► More efficient use of the processing power by dynamic reservation:
  • Alternative approach to worst-case allocation, where the most demanding MCPS requirements were reserved regardless of the active channel features

Notation:
  • i: an index from 1 to n to iterate through the set of channel types
  • Pi: the number of passive channels of type i
  • CycPi: the MCPS of a passive channel of type i
  • Ai: the number of active channels of type i
  • CycAi: the MCPS of an active channel of type i
  • C: the number of conferences running in the system
  • CycC: the MCPS of an audio conference subsystem running in the system
  • D: the number of DSP devices in the system
  • MCPS: millions of cycles per second, the measuring unit for the processing power of the DSP device
  • SysApp%: processor utilization ratio for the system application (e.g. operating system, task scheduler and task switching time, network queue management, application framework services such as logging)

Example categories of channel types: high-density G.711, high-definition audio AAC-LC/LD, narrowband low bit-rate codecs

$$\sum_{i=1}^{n} (P_i \times CycP_i) + \sum_{i=1}^{n} (A_i \times CycA_i) + C \times CycC \le D \times MCPS \times (1 - SysApp\%)$$
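The dynamic reservation inequality (passive channel load plus active channel load plus conference load must not exceed D × MCPS × (1 − SysApp%)) can be checked at admission time with a few lines of code. The sketch below is illustrative: the class and function names are hypothetical, and the per-channel MCPS figures in the example are placeholders, not measurements from the slides. The 6000 MCPS per device corresponds to six cores at up to 1 GHz.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ChannelType:
    name: str
    passive: int        # P_i: number of passive channels of this type
    cyc_passive: float  # CycP_i: MCPS per passive channel
    active: int         # A_i: number of active channels of this type
    cyc_active: float   # CycA_i: MCPS per active channel


def required_mcps(types: List[ChannelType], conferences: int, cyc_conf: float) -> float:
    """Left-hand side: sum(P_i*CycP_i) + sum(A_i*CycA_i) + C*CycC."""
    channels = sum(t.passive * t.cyc_passive + t.active * t.cyc_active for t in types)
    return channels + conferences * cyc_conf


def available_mcps(devices: int, mcps_per_device: float, sys_app: float) -> float:
    """Right-hand side: D * MCPS * (1 - SysApp%), with sys_app as a fraction."""
    return devices * mcps_per_device * (1.0 - sys_app)


def fits(types: List[ChannelType], conferences: int, cyc_conf: float,
         devices: int, mcps_per_device: float, sys_app: float) -> bool:
    """True if the requested load fits within the reserved budget."""
    return required_mcps(types, conferences, cyc_conf) <= available_mcps(
        devices, mcps_per_device, sys_app)


if __name__ == "__main__":
    # Placeholder per-channel costs, for illustration only.
    load = [
        ChannelType("G.711", passive=100, cyc_passive=5.0, active=20, cyc_active=8.0),
        ChannelType("AAC-LC/LD", passive=10, cyc_passive=30.0, active=5, cyc_active=40.0),
    ]
    print(required_mcps(load, conferences=2, cyc_conf=250.0))
    print(fits(load, 2, 250.0, devices=1, mcps_per_device=6000.0, sys_app=0.2))
```

Compared with worst-case allocation, the check is re-evaluated with the actual per-type counts whenever a channel is added or changes between passive and active.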


Finding Performance Hotspots (cont’d)

► Media traffic into the audio mixer (central node)
  • As an example, for 1070 channels of PCM NB (5 ms), the nominal throughput is 236 Mbps
    – 138 B * (1 s / 5 ms) * 1070 * 8 / 1,000,000 (without IFP and preamble)
  • Factors affecting this value
    – Burstiness
    – Jitter

To build a safety margin into the throughput evaluation for task scheduling jitter, a 2x frame packet arrival time variation is assumed. The throughput to provision for is therefore 2 x 236 Mbps = 472 Mbps.

► Control traffic across a 15,000-port system
  • Factors contributing to the evaluation include:
    – Number of call setup/tear-down operations per second
    – Expected DSP event rate per channel per second (e.g. DTMF)

A dedicated path for the control traffic (Gigabit Ethernet in the figure) is preferable. Throughput: 267 Mbps (150 calls/sec and 1500 channels reporting events)

Media payload        G.711   PCM NB   G.729AB   AMR-NB 4.7   AMR-NB 12.2
Period (ms)              5        5        10           20            20
IPv4+UDP+RTP (B)        58       58        58           58            58
Codec payload (B)       40       80        10           14            32
Packet length (B)       98      138        68           72            90
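The nominal throughput arithmetic can be reproduced for any row of the packet table. The sketch below uses only the packet lengths and periods listed above; the function name and the dictionary layout are illustrative, and the 2x factor is the safety margin assumed in the text.

```python
def nominal_mbps(packet_bytes: int, period_ms: float, channels: int) -> float:
    """Throughput in Mbps: bytes per packet * packets per second * channels * 8 bits."""
    packets_per_second = 1000.0 / period_ms
    return packet_bytes * packets_per_second * channels * 8 / 1_000_000


# (packet length in bytes, packet period in ms), taken from the table above
PACKETS = {
    "G.711": (98, 5),
    "PCM NB": (138, 5),
    "G.729AB": (68, 10),
    "AMR-NB 4.7": (72, 20),
    "AMR-NB 12.2": (90, 20),
}

if __name__ == "__main__":
    channels = 1070
    for name, (length, period) in PACKETS.items():
        nominal = nominal_mbps(length, period, channels)
        # 2x margin for packet arrival time variation, as assumed in the text
        print(f"{name}: nominal {nominal:.1f} Mbps, provisioned {2 * nominal:.1f} Mbps")
```

For the PCM NB case this reproduces the 236 Mbps nominal figure, and roughly 472 Mbps once the margin is applied.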

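[Figure: hardware topology, repeated from the first Finding Performance Hotspots slide (ATCA blades 1 & 2, AMC cards 1, 2 & 3, DSP1, DSP2, DSP3, Serial RapidIO, Gbit, hosts, media paths 1a and 2a).]

[Figure: packet arrival patterns. Panels: uniform distribution, burst in one frame, burst in two frames. Labels: burst duration, 1 ms.]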


Conclusions


MSC8156

► Six StarCore® SC3850 DSP core subsystems at up to 1 GHz / 8000 MMACS per core and up to 48000 MMACS per device
► Multi accelerator platform engine for baseband (MAPLE-B)
► High-speed, high-bandwidth CLASS fabric
► Two DDR controllers of up to 400 MHz clock (800 MHz data) rate and 32/64-bit DDR2/3 SDRAM data bus
► 32-channel DMA controller
► Dual RISC core QUICC Engine™ subsystem at up to 500 MHz for packet processing independent of the DSP cores. Supports two Gigabit Ethernet controllers (RGMII or SGMII) and SPI
► HSSI that supports two 4x SerDes ports. It includes two Serial RapidIO controllers supporting 1x/4x operation up to 3.125 Gbaud and one PCI Express controller supporting 1x/2x/4x operation, with multiplexing capability for RapidIO, PCI Express and SGMII signals through the two SerDes ports
► Four TDM interfaces, UART and I2C interfaces, eight software watchdog timers, sixteen 16-bit timers and two 32-bit timers
► I/O interrupt concentrator and virtual interrupt support; eight hardware semaphores
► 32 GPIO ports multiplexed with interface signals and IRQ inputs
► Boot options: Ethernet, Serial RapidIO, I2C and SPI


MSC8156 for Voice Applications

► Six-core DSP (6 x 1 GHz, 6000 MHz aggregate) allows for a single-device media mixer
  • Simplicity in design, with time-critical dependencies managed inter-core
  • Significant number of active participants for winner selection: up to 1070

► Strong interconnectivity with external backplane networks
  • Support for Gbit Ethernet and Serial RapidIO
  • Network protocol termination on a dedicated QUICC Engine

► Flexible and powerful memory hierarchy
  • Configurable M2 and/or L2 of 512 KB for each DSP core
  • Shared M3 on-chip memory of 1056 KB
  • Large throughput to the external memory
  • The above characteristics allow for flexibility in choosing the application architecture (e.g. cache or DMA model, memory layout)
