Solving the Class 4 Interface with Kamailio - Evariste Systems

Artisanally prepared just in time (“continuous integration”) for Kamailio World 2016 - Berlin, Germany - 18-20 May 2016.

Alex Balashov

E-mail: [email protected]

Phone: +1-678-954-0671


What is this & who are you?

Hi, I’m Alex Balashov from Evariste Systems in Atlanta, Georgia, USA.

We have been a Kamailio consultancy and product shop since 2008. I have been involved with OpenSER since 2006.

Our primary line of business is a Kamailio-based Class 4 switching and routing software platform called CSRP.

I’m going to talk about how Kamailio is used inside CSRP to implement some Class 4 features & solve problems.

1. CSRP: What is it? Who cares?

CSRP is a “PSTN connection” “box”

CSRP is a “SIP trunking platform” at heart

A “trunking platform”, as understood here, performs some of these functions:

PSTN routing

Routes calls to PSTN termination providers via LCR (Least Cost Routing); routes incoming calls (DIDs) to customer endpoints.

Call accounting

Generates CDRs (Call Detail Records) used for billing and reporting.

Far-end NAT traversal

Proper routing of SIP messages to/from NAT’d customer endpoints.

Business logic

Call flows implement product features - e.g. failover/hunting, voicemail-to-email, call forwarding.

Prepaid credit and limits

Enforces prepaid balance, where enabled. Also enforces concurrent call limits.

Some perimeter security

Some fraud prevention features, rate-limiting, request sanity-checking, etc.

Box?

● Not really a box.
  ○ Not an appliance.
  ○ Pure software product.
  ○ Installed on customer-operated servers.

● Not a hosted switch platform.
  ○ Makes a good core technology for it, though.
  ○ Multi-tenant architecture.

● Almost all customers run it in a virtualised environment of some description.

● “Box” is comforting marketing vocabulary for telco types...

What can be connected to it?

● It’s a Class 4 platform (“tandem”).
  ○ The hub part of hub-and-spoke topology.

● Designed to connect SIP gateways:
  ○ PBXs
  ○ ATAs
  ○ Application servers
  ○ SBCs

● Not designed for end-user endpoints:
  ○ VoIP handsets
  ○ Softphones
  ○ WebRTC clients

2. Who uses it? Disruptive innovators?

Lots of customer categories

It’s hard to generalise about CSRP customers:

Wholesale SIP providers

Traditional SIP origination and termination aggregators, resellers.

Retail SIP providers

SIP trunk vendors to the retail market (like Vitelity or Flowroute in the USA).

Short-duration/wholesale call centre traffic movers:

[ See my Kamailio World presentation from 2015: https://youtu.be/u30Fyp3QanE ]

Carriers (CLECs)

Facilities-based carriers with own physical (SS7) interconnections to incumbent, with own “Big Iron” softswitches.

Voice application providers

Outbound announcement, hosted IVR, conferencing, etc.

Call centres

Lead generation, political survey, market research.

3. How does it work? Broad software architecture

Core technology stack

● Core element is a single SIP proxy based on Kamailio.

● PostgreSQL database

● Outboard RTP proxy (rtpengine)

● SEMS (signalling-only B2BUA duty, miscellaneous media server, light “SBC” functionality)

● Redis (cache)

● Node.js (REST API backend, middleware)

Kamailio call processing

▪ Business logic layer for call processing resides in PostgreSQL stored functions.

▪ Kamailio scripting language is impressive, but too primitive to support complex business layer.

▪ Kamailio calls out to PostgreSQL for call processing.
  ▫ Uses combination of:
    ■ built-in Kamailio modules (with own DB schema & backing) -- very limited
    ■ custom CSRP-specific functions & tables.

Kamailio call processing (2)

▪ General design relies on very custom database interactions.
  ▫ Very few Kamailio built-in modules are used.
  ▫ Basically just registrar and usrloc.

▪ Many traditional Kamailio modules just provide syntactic sugar around database interactions.

▪ We need esoteric business logic (incl. North American esoterica), so we don’t use them much.

Kamailio call processing (cont’d)

+ Redis cache

4. Expectations: What most customers want

NAT traversal

▪ Traditional server-side NAT traversal

▪ CSRP operators will avoid it if they can
  ▫ Bandwidth bill due to RTP

▪ Still commonly needed nevertheless

NAT traversal - how it’s implemented

▪ nathelper folk traditions.
  ▫ add_contact_alias() + handle_ruri_alias()

▪ Sipwise rtpengine for media relay

LNP queries (North America)

▪ Query LNP (Local Number Portability) provider for LRN (Location Routing Number).

▪ Provided via SIP redirect.

▪ Almost all wholesale SIP termination providers bill based on terminating LATA/OCN (i.e. terminating carrier) now, not dialed digits.

▪ Requirement is almost universal for wholesale outbound traffic.

LNP queries - how they’re implemented

▪ Forward INVITE to redirect server.

▪ Catch 3xx redirect in failure_route.

▪ Parse rn attribute using RFC 4694 notation.
  ▫ e.g. <sip:+14045551212;npdi;rn=14045160000@me:5060>
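To make the parsing step concrete, here is a minimal Python sketch of pulling the rn and npdi parameters out of a redirect Contact URI like the example above. It is an illustration only, not CSRP code; the function name extract_lrn is hypothetical, and it assumes the RFC 4694 parameters sit in the user part of the URI, as they do in the example.

```python
import re


def extract_lrn(contact_uri: str):
    """Return (dialed_number, lrn, npdi_present) from a SIP URI, or None.

    Assumes the RFC 4694 parameters (rn, npdi) are carried in the user
    part of the URI, as in the example Contact above.
    """
    # Drop the optional angle brackets around the URI.
    uri = contact_uri.strip().lstrip("<").rstrip(">")
    match = re.match(r"^sips?:([^@]+)@", uri)
    if not match:
        return None
    # The user part looks like: +14045551212;npdi;rn=14045160000
    number, *params = match.group(1).split(";")
    lrn = None
    npdi = False
    for param in params:
        name, _, value = param.partition("=")
        if name.lower() == "rn":
            lrn = value          # Location Routing Number
        elif name.lower() == "npdi":
            npdi = True          # the portability database was dipped
    return number, lrn, npdi


result = extract_lrn("<sip:+14045551212;npdi;rn=14045160000@me:5060>")
```

A number that is not ported typically comes back with npdi set but no rn, in which case lrn is None and routing falls back to the dialed digits.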

LNP queries - how they’re implemented (1)

LNP queries - how they’re implemented (2)

Measure setup latency

● Measure time from receipt of initial INVITE to t_relay() onward to first branch.

● Must include the delay introduced by any redirect queries (i.e. LNP)

Measure setup latency - how it’s implemented
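A minimal Python sketch of the measurement, under stated assumptions: the start time is stored once per call when the initial INVITE arrives, and is not overwritten if processing re-enters after a redirect query, so the LNP delay stays inside the measured interval. The class and method names are hypothetical and the clock is injectable purely for illustration.

```python
import time


class SetupTimer:
    """Tracks time from initial INVITE receipt to relay of the first branch."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._started = {}

    def invite_received(self, call_id: str) -> None:
        # Keep the first timestamp only, so a second pass after an LNP
        # redirect does not restart the measurement.
        now = self._clock()
        self._started.setdefault(call_id, now)

    def relayed(self, call_id: str):
        """Return setup latency in milliseconds, or None if the call is unknown."""
        start = self._started.pop(call_id, None)
        if start is None:
            return None
        return (self._clock() - start) * 1000.0


# Usage with a fake clock: INVITE at t=10.0 s, re-entry at 10.1 s, relay at 10.25 s.
ticks = iter([10.0, 10.1, 10.25])
timer = SetupTimer(clock=lambda: next(ticks))
timer.invite_received("call-1")
timer.invite_received("call-1")   # e.g. after the LNP redirect was handled
latency_ms = timer.relayed("call-1")
```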

Real-time CDRs & rating

● CDRs must be written more or less in real time
  ○ Even when there are I/O penalties

● Rated in real-time, too

Real-time CDRs & rating - how it’s implemented (1)

● Four parallel asynchronous writer threads using mqueue and rtimer

Real-time CDRs & rating - how it’s implemented (2)

● Hash CDR data into one of x (power of 2) mqueues.

Real-time CDRs & rating - how it’s implemented (3)

● Consume individual mqueues in particular rtimer threads and push to PostgreSQL stored procedure for parsing.
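The fan-out described on these three slides can be sketched in Python as follows. This is an analogy rather than the Kamailio implementation: queue.Queue stands in for an mqueue, a thread stands in for an rtimer consumer, and appending to a list stands in for the PostgreSQL stored procedure call. The CRC32 hash and the Call-ID key are assumptions chosen for the example.

```python
import queue
import threading
import zlib

NUM_QUEUES = 4  # must be a power of 2 so the bitmask below selects a queue

queues = [queue.Queue() for _ in range(NUM_QUEUES)]


def queue_index(call_id: str) -> int:
    """Map a Call-ID onto one of the queues with a hash and a bitmask."""
    return zlib.crc32(call_id.encode()) & (NUM_QUEUES - 1)


def enqueue_cdr(call_id: str, cdr: dict) -> None:
    queues[queue_index(call_id)].put((call_id, cdr))


def writer(q: queue.Queue, sink: list) -> None:
    """Consume one queue until a None sentinel arrives."""
    while True:
        item = q.get()
        if item is None:
            break
        sink.append(item)  # stand-in for the database write


def write_all(cdrs):
    """Run one writer per queue and return what each writer consumed."""
    sinks = [[] for _ in range(NUM_QUEUES)]
    threads = [
        threading.Thread(target=writer, args=(queues[i], sinks[i]))
        for i in range(NUM_QUEUES)
    ]
    for t in threads:
        t.start()
    for call_id, cdr in cdrs:
        enqueue_cdr(call_id, cdr)
    for q in queues:
        q.put(None)  # tell each writer to stop
    for t in threads:
        t.join()
    return sinks


cdrs = [(f"call-{n}@example", {"duration": n}) for n in range(100)]
sinks = write_all(cdrs)
```

Hashing on the Call-ID keeps all records for one call on the same queue, so a single writer sees them in order while the writers as a group work in parallel.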

PDD timeouts/failover

● If termination gateway takes too long to respond, move on.

● Commonly, the problem is that a non-100 1xx response is sent first, and then nothing happens.

● Kamailio tm module has no timer for this.

PDD timeouts/failover - how it’s implemented

● t_set_fr() to something conservative initially

● t_reset_fr() in onreply_route when receiving non-100 1xx or 2xx reply
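The decision this produces can be modelled with a short Python sketch. It only simulates the outcome (does the short initial timer fire for a given sequence of replies?) and is not Kamailio code; the function name and the millisecond values are hypothetical.

```python
def pdd_timer_fires(replies, pdd_limit_ms: int) -> bool:
    """Decide whether the short initial timer expires for one branch.

    replies is a list of (elapsed_ms, status_code) tuples for the branch.
    """
    for elapsed_ms, status in sorted(replies):
        if elapsed_ms > pdd_limit_ms:
            break  # everything from here on arrived too late
        if status != 100:
            # A non-100 1xx or a 2xx resets the timer in this model, and a
            # final negative reply ends the branch on its own; either way
            # the short timer does not fire.
            return False
    return True


only_trying = pdd_timer_fires([(50, 100)], pdd_limit_ms=4000)
rings_in_time = pdd_timer_fires([(50, 100), (1200, 180)], pdd_limit_ms=4000)
rings_too_late = pdd_timer_fires([(50, 100), (6000, 183)], pdd_limit_ms=4000)
```

When the function returns True, the call would move on to the next gateway in the route set.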

Call setup rate limits

● Enforce call setup rate.

● Independently on customer access & network edges.

Call setup rate limits - how they’re implemented

● Flexibly use pipelimit module
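As an illustration of the idea of independent limits per customer and per gateway, here is a minimal Python sketch of a fixed-window, tail-drop limiter keyed by pipe name. It is a simplified model, not the pipelimit module's implementation; the class name, the pipe naming scheme and the one-second window are assumptions.

```python
class PipeLimiter:
    """Fixed-window call setup limiter with one counter per named pipe."""

    def __init__(self, limits, interval=1.0):
        self.limits = dict(limits)   # pipe name -> max setups per interval
        self.interval = interval     # window length in seconds
        self.window_start = {}
        self.counts = {}

    def allow(self, pipe: str, now: float) -> bool:
        start = self.window_start.get(pipe)
        if start is None or now - start >= self.interval:
            # Open a new window for this pipe.
            self.window_start[pipe] = now
            self.counts[pipe] = 0
        if self.counts[pipe] >= self.limits[pipe]:
            return False  # over the limit: reject this call setup
        self.counts[pipe] += 1
        return True


# Hypothetical pipes: one on the customer access edge, one on the network edge.
limiter = PipeLimiter({"customer:42": 2, "gateway:7": 1})
decisions = [
    limiter.allow("customer:42", 0.0),
    limiter.allow("customer:42", 0.1),
    limiter.allow("customer:42", 0.2),  # third setup in the same second
    limiter.allow("customer:42", 1.0),  # new window
]
```

Checking a call against both its customer pipe and its gateway pipe enforces the two edges independently.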

In-band / early media announcements

● “Caller experience” with SIP cause codes can be uncertain.

● Depends on how they’re translated in PSTN interworking / to ISUP side.

● Many customers want to control them, e.g. instead of just sending 503, play early media message with 183 and then send 503.

In-band / early media announcements - how they’re implemented (1)

● Could be done in many ways, e.g. Asterisk & FreeSWITCH folk traditions.

● We use the more lightweight SEMS (SIP Express Media Server).

● early_announce application module is perfect for this.

In-band / early media announcements - how they’re implemented (2)

SEMS config

In-band / early media announcements - how they’re implemented (3)

Kamailio relay to SEMS

From and To header replacement

● Although proxies are not supposed to modify the To and From headers, doing so is an unavoidable and common requirement.

● Lots of endpoints want To URI == Request URI.

● Lots of gateways want From URI domain == source IP.

● Calling party ID presentation in user part of From URI.

From and To header replacement - how it’s implemented

● Kamailio uac module provides stateful substitution of these headers.

● Takes advantage of Record-Route.

● Caller is none the wiser.

Sophisticated route failover logic

● Load custom route set from database.

● Complex failover feature set means lots of business rules.

● Custom database-defined failover behaviour.

Sophisticated route failover logic - how it’s implemented (1)

● sql_xquery() pulls multiple-row result directly into XAVP array

● Iterate through XAVP array with transaction-persistent ($avp(...)) index.
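The iteration pattern can be sketched in Python: the route set is loaded once, and an index that outlives each individual attempt records where to resume after a failure. This is an analogy only. The list of dicts stands in for the XAVP array, the index attribute stands in for the transaction-persistent AVP, and failing over on every non-2xx reply is a simplification of the database-defined rules.

```python
class RouteSet:
    """Rows loaded once per call, plus an index that persists across attempts."""

    def __init__(self, rows):
        self.rows = list(rows)
        self.index = 0

    def next_gateway(self):
        if self.index >= len(self.rows):
            return None  # route set exhausted
        row = self.rows[self.index]
        self.index += 1
        return row


def place_call(route_set, try_gateway):
    """Try gateways in order until one answers; return (winner, attempts)."""
    attempts = []
    while (gateway := route_set.next_gateway()) is not None:
        status = try_gateway(gateway)
        attempts.append((gateway["name"], status))
        if 200 <= status < 300:
            return gateway["name"], attempts
        # Simplification: any non-2xx final reply moves on to the next row.
    return None, attempts


# Hypothetical route set and canned gateway responses.
rows = [{"name": "gw_a"}, {"name": "gw_b"}, {"name": "gw_c"}]
responses = {"gw_a": 503, "gw_b": 408, "gw_c": 200}
winner, attempts = place_call(RouteSet(rows), lambda gw: responses[gw["name"]])
```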

Sophisticated route failover logic - how it’s implemented (2)

Sophisticated route failover logic - how it’s implemented (3)

● failure_route allows fine-grained failover logic control.

● Can call other request_routes anew, but beware of stateful vs. stateless replying and other caveats!

Entry/exit on multiple network interfaces/addresses

● It is not uncommon to use CSRP across disparate networks.

● Can be due to:
  ○ Direct interconnects across private IP or MPLS.
  ○ Multiple source addresses required for multiple SIP wholesale products.
  ○ Connectivity to internal application “cloud” not reachable from public Internet.

● Reluctant CSRP begrudgingly conscripted into “SBC lite” role in these situations.

Entry/exit on multiple network interfaces/addresses - how it’s implemented

Fortunately, Kamailio is magical.

(Mostly.)

Automatically adds double Record-Route headers for ingress and egress interfaces.

Easy to select outgoing socket with $fs.

5. Immediate Future: Interesting improvements

Near-term future improvements

● Use PostgreSQL native JSON support + jansson module.
  ○ Avoid unnecessary string parsing and data marshalling.

● DMQ replication of various state for clustering.

● More flexible asynchronous processing.

THANKS!

Questions welcome!