Voice over IP using Linux
Written by Leonardo Balliache

I have frequently seen questions on the LARTC list about the best configuration for forwarding Voice over IP (VoIP) packets using Linux. Some people complain because they can't get the same quality of service with Linux routers that they get with Cisco routers.
Some particular characteristics of VoIP packets that have to be considered are:
  • VoIP travels on RTP protocol over UDP.

  • VoIP packets are very small. Payload is 20 to 150 bytes with an RTP/UDP/IP header of 40 bytes (IP=20 bytes; UDP=8 bytes; RTP=12 bytes). Because the header is so large relative to the payload, transmitting VoIP packets is not an efficient process.

  • Because VoIP serves a playback application (voice playback), its maximum end-to-end delay should be less than 150-200 ms; 150 ms is better. This is needed to guarantee good quality of the transmitted sound.
What kind of problems do VoIP packets experience when traveling through lines, switches and routers?
  • The efficiency of transmission is low. To transmit 20-150 bytes you need a header of 40 bytes, a header-to-payload ratio of 200% down to 26.67%.

  • Packets are small. When they travel through lines carrying bulk traffic with big packets (1000-1500 bytes, and even bigger), they form queues that look like this: ********** * * ********** ********** * ********** * * **********

    Here ********** is a big 1200-byte packet; * is a small VoIP packet. This kind of queue forms on routers. A VoIP packet then has to wait its turn to be forwarded behind one or perhaps several big packets. This problem conspires against the requirement of a low forwarding delay.


  • They are UDP packets. Some routers are designed to control unresponsive flows, as UDP flows are. See below. On this kind of router they may therefore have a higher probability of being dropped.
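The header-to-payload ratios quoted in the list above can be checked with a small shell sketch. The function name header_overhead is only an illustration, and the 40-byte default is the RTP/UDP/IP header size given in the text.

```shell
# header_overhead PAYLOAD_BYTES [HEADER_BYTES]
# Prints the header size as a percentage of the payload size.
# HEADER_BYTES defaults to the 40-byte RTP/UDP/IP header from the text.
header_overhead() {
    awk -v p="$1" -v h="${2:-40}" 'BEGIN { printf "%.2f\n", 100 * h / p }'
}

header_overhead 20     # smallest payload mentioned in the text
header_overhead 150    # largest payload mentioned in the text
```

Passing a second argument (for example 2 or 5) shows how much the ratio would fall with a compressed header.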
Studying how Cisco deals with these problems helps to understand why Linux routers give us a poorer response when forwarding VoIP packets. I'm going to copy here some text (in italics) taken from my work Network QoS using Cisco HOWTO.
First, they solve the RTP/UDP/IP header length problem (to improve the efficiency of transmission) using what is called Cisco RTPC. Have a look at how:
RTPC - Real-time Transport Protocol Header Compression: RTP is a protocol used for carrying multimedia application traffic, including audio and video, over an IP network. RTP packets have a 40-byte header and typically a 20- to 150-byte payload. The RTP protocol travels over UDP. Given the size of the IP/UDP/RTP header combination, it is inefficient to transmit those small payloads using an uncompressed header. RTPC is a technology that helps RTP run more efficiently, especially over lower-speed links, by compressing the RTP/UDP/IP header from 40 bytes to 2 to 5 bytes. This is especially beneficial for smaller packets (such as IP voice traffic) on slower links, where RTP header compression can reduce overhead and transmission delay significantly.
The second problem is solved using Cisco LFI. Have a look at how:
LFI - Link Fragmentation and Interleaving: a really incredible tool from the Cisco folks. It's explained more or less like this: interactive traffic (always fragile traffic like Telnet, Voice over IP, SSH, interactive WWW such as chatting and live questionnaires) is susceptible to increased latency and jitter (have a look at QOS for a brief explanation of these terms) when the network processes large packets (for example, big LAN-to-LAN FTP packets traversing a low-bandwidth WAN link), especially when the packets of the interactive flows are queued on these slower links. LFI reduces delay and jitter by breaking up large datagrams and interleaving low-delay traffic packets with the resulting smaller packets. For combining large file FTP transfer traffic (where latency and jitter really don't matter) with low-bandwidth fragile traffic like Telnet, VoIP, SSH, etc. (where latency and jitter really matter) LFI is the right solution. Combining it with RTPC (see above) is a must. Really a hit from the Cisco people.
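To see why breaking up large datagrams helps, here is a rough sketch of how long one packet occupies a slow link. The 128000 bit/s link speed and the 160-byte fragment size are assumptions chosen for illustration; they do not come from Cisco's text.

```shell
# serialization_ms BYTES LINK_BPS
# Time in ms to put BYTES on a wire running at LINK_BPS bits per second.
serialization_ms() {
    awk -v b="$1" -v r="$2" 'BEGIN { printf "%.2f\n", b * 8 * 1000 / r }'
}

LINK_BPS=128000                      # assumed slow WAN link, not from the text
serialization_ms 1500 "$LINK_BPS"    # unfragmented bulk packet
serialization_ms 160 "$LINK_BPS"     # assumed fragment size
```

Under these assumptions a VoIP packet stuck behind one unfragmented 1500-byte packet waits 93.75 ms, while behind one 160-byte fragment it waits 10.00 ms.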
Now it is easy to understand why Linux doesn't provide as good a VoIP service as Cisco does. They have what in my country is known as a "poison engine".
I don't know if the Linux folks have implemented something like this. I searched the Internet and couldn't find anything. If any of you knows something, please let the rest of us know.
Well, then, how do we deal with this problem using our current Linux?
I suggest (opinions are welcome) beginning with the ingress side. Having the guarantee that our VoIP packets will be forwarded to the outgoing queue as soon as they reach our router is a good first step. Then something like this seems to be good enough:

Our first ingress filter, priority 1, catches UDP packets (UDP is protocol number 17), policing them to 240 kbit/s. With this we can handle 15 conversations of 16 kbps each. Check your VoIP implementation to find the bandwidth requirement per session and adjust your command accordingly.
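A minimal sketch of an ingress policer matching this description, written with standard tc syntax, could look as follows. It is not necessarily the exact command the author used: the function name voip_ingress, the interface name and the burst value are assumptions to adapt to your own setup.

```shell
TC=${TC:-tc}    # set TC=echo to print the commands instead of running them

# voip_ingress DEV SESSIONS KBIT_PER_SESSION   (hypothetical helper)
voip_ingress() {
    dev=$1
    rate=$(( $2 * $3 ))    # e.g. 15 sessions * 16 kbit = 240 kbit
    # Attach the ingress qdisc to the interface where traffic arrives.
    $TC qdisc add dev "$dev" handle ffff: ingress
    # Priority-1 filter: match IP protocol 17 (UDP), police it to the
    # computed rate and drop the excess. The burst size is an assumption.
    $TC filter add dev "$dev" parent ffff: protocol ip prio 1 u32 \
        match ip protocol 17 0xff \
        police rate "${rate}kbit" burst 10k drop flowid :1
}
```

Run as root, for example: voip_ingress eth0 15 16. Note that this matches all UDP traffic, not only VoIP; a tighter match on your RTP port range would be a reasonable refinement.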
Now our valuable packets reach the outgoing queue as fast as possible. On this side I suggest a prio queue to kick them out to the outgoing interface as soon as possible; something like this can help:

This time we don't need to shape them with an additional TBF queue, because UDP packets, marked with tcindex class 1, are already policed to 240 kbit/s.
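A sketch of the egress side with a prio qdisc is shown here. Be aware that it swaps in a plain u32 match on the UDP protocol number instead of the tcindex value mentioned above, so it illustrates the idea rather than reproducing the author's command. The function name voip_egress and the interface name are assumptions.

```shell
TC=${TC:-tc}    # set TC=echo to print the commands instead of running them

# voip_egress DEV   (hypothetical helper)
voip_egress() {
    dev=$1
    # Default three-band prio qdisc at the root; band 1:1 is dequeued first.
    $TC qdisc add dev "$dev" root handle 1: prio
    # Send UDP (IP protocol 17) to the highest-priority band.
    # Assumption: u32 is used here in place of a tcindex-based filter.
    $TC filter add dev "$dev" parent 1: protocol ip prio 1 u32 \
        match ip protocol 17 0xff flowid 1:1
}
```

Run as root on the outgoing interface, for example: voip_egress eth1.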
I haven't tried this solution myself, but I think it should work. If it doesn't, let me know.
Of course, we can't get as good a response as Cisco has, but it's the best we can do, unless one of you has a better solution. Again, opinions to enrich the discussion are welcome.
Best regards,

Leonardo Balliache
Unresponsive flow: When an end-system responds to indications of congestion by reducing the load it generates to try to match the available capacity of the network, it is referred to as responsive; a flow that does not do so is unresponsive. M.A. Parris.
More about Cisco's VoIP
To simplify our life even more, Cisco delivered what is called AutoQoS for Voice over IP (AutoQoS). This is an incredible piece of software that does for us the work that we would otherwise have to do. See this marvel:

The last command configures VoIP on interface serial 3/0. Do you know what this single line command does for us? Let's take the answer directly from Cisco:
  1. Classify the IP traffic with RTP and audio codec payload type (RFC 1890) as VoIP bearer traffic.
  2. Mark VoIP bearer traffic with DSCP EF and VoIP signaling (control) traffic as AF31.
  3. Map the Layer 3 marking to the corresponding Layer 2 marking if applicable.
  4. Remark traffic that is marked DSCP EF or AF31 to DSCP 0 if the traffic is not classified as VoIP bearer or signaling (control) traffic.
  5. Treat all other non-VoIP traffic types as best effort QoS (excluding control traffic such as routing protocol updates and BPDUs).
  6. Put VoIP bearer traffic into a strict priority LLQ with guaranteed bandwidth to accommodate voice traffic.
  7. Put VoIP control traffic into a non-priority queue with a minimum bandwidth guarantee to ensure no packet loss.
  8. Enable LFI and compressed RTP (cRTP) for link speeds of less than 768 kbps.
The optional trust keyword allows Cisco AutoQoS to trust the DSCP marking of the traffic and use it to classify that particular type of traffic (Cisco AutoQoS default is non-trust). If the trust keyword is not configured, then voice traffic is classified and marked with the appropriate DSCP values using NBAR.
The optional fr-atm keyword is only used on Frame Relay DLCIs used for Frame Relay to ATM interworking (auto qos voip fr-atm must be explicitly configured to enable Cisco AutoQoS for FR-to-ATM interworking links). This is effective only for low-speed DLCIs, where multi-link PPP over Frame Relay (MLPoFR) is created to enable LFI (NOTE: the fr-atm keyword is ignored on high-speed links even if it is configured).
Oh my god. Could it bring me a Coke too? Observe that VoIP packets are marked as EF (Differentiated Services Expedited Forwarding class) and signaling (control plane) as AF31 (Differentiated Services Assured Forwarding class AF31). More about this topic, but based on Linux, can be found on this site in the Differentiated Service on Linux HOWTO. I would be very happy to write about the same topic using Cisco instead. It's not a promise, but I'm going to try if I can find some time to do it.
Some notes I have to give you:
  1. Do not forget that both ends have to be implemented using the same configuration. This means you have to have Cisco routers on both sides implementing the same configuration. Have a look at the Cisco site for more information about this requirement.
  2. Be careful when using LFI (Link Fragmentation and Interleaving). If you are planning to use some other applications through the same link, check them first because some do not accept or permit packet fragmentation.
Bye, dear readers...
Even more about Linux's VoIP
I was thinking that perhaps there is a workaround to get better VoIP in Linux. Because nobody has yet written something as good as LFI for Linux, a possible solution could be (please be advised that I haven't tested this yet):
Decrease the MTU value of your interface to be as small as VoIP packets are, for example, 256 bytes. This way, smart bulk applications will query the MTU from the network and adjust their packet size to this value. With all packets as small as VoIP packets, the VoIP packets will not wait a long time behind big packets (for example, p2p interfaces can have an MTU of 4096 bytes, generating very big packets to be interleaved with the small VoIP packets).
Try then with this command from the iproute2 package:
ip link set ppp0 mtu 256
Here we are setting the ppp0 interface mtu value to 256 bytes.
If you have a little more time than I have and you succeed with this, please let me know; if not, let me know too so I can continue thinking about this.
A little advice:
The RFC 3246 specification tells us: "To ensure that queues encountered by EF packets * are usually short, it is necessary to ensure that the service rate of EF packets on a given output interface exceeds their arrival rate at that interface, over long and short time intervals, independent of the load of other (non-EF) traffic."
* As VoIP packets should be.
It is very important to keep this in mind; it doesn't make any sense to hurry our VoIP packets out of our router if the link on the side where we are sending them cannot accept this avalanche. So check your infrastructure well.
End-to-end packet delay function
The end-to-end packet delay can be calculated using the following function:
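A form consistent with the three terms explained in this section can be written as follows. This is a reconstruction; the symbol names are assumptions of this sketch, not the author's notation.

```latex
\[
D \;=\; 10^{3}\,\frac{8\,\mathrm{MSS}}{B}
  \;+\; 10^{3}\sum_{i=1}^{N}\frac{8\,Q_i\,\bar{P}}{B}
  \;+\; D_{\mathrm{prop}}
\]
% D        end-to-end delay, in ms
% MSS      maximum segment size, in bytes
% B        link bandwidth, in bits per second
% N        number of routers in the path
% Q_i      mean queue length at router i, in packets
% \bar{P}  average packet size, in bytes
% D_prop   link propagation delay, in ms
```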

The first term, in ms, is the time required to transmit one packet of the maximum segment size at the bandwidth capacity of the link. For example, a 1500-byte Ethernet packet on a 2048 kbps link takes 5.72 ms to be transmitted (taking 1 kbit as 1024 bits; with 1 kbit = 1000 bits the figure is 5.86 ms). This term is generally negligible when dealing with fast links (2.048 Mbps and faster).
The second term, also in ms, is caused by the queues formed in the routers. It can be very high when congestion fills the routers' queues. Let's suppose we have 3 routers in our path, each one having a mean queue of 20 packets. Assuming an average packet size of 512 bytes and the same 2048 kbps link, the delay induced in the routers' queues is 117.19 ms (same 1024-bit convention). As you see, this contribution is considerable.
The last term is given directly in ms and is the link propagation delay. It depends on the medium used to propagate the signal (copper, fiber, air, etc.) and the actual length of the link. The longer the link, the higher the propagation delay.
Assuming, for our example, a propagation delay of 50 ms, our final delay will be: 5.72 + 117.19 + 50 = 172.91 ms. Taking into account that VoIP packets are very sensitive to end-to-end delays higher than 150-200 ms, our example poses a serious problem for good voice quality.
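The worked example can be reproduced with a short shell sketch. The function name delay_ms and its argument order are illustrative only; the link speed is passed in bits per second, and 2048 kbps is taken as 2048 * 1024 bit/s because that is the convention that yields the figures in the text.

```shell
# delay_ms MSS_BYTES LINK_BPS ROUTERS QUEUE_PKTS AVG_PKT_BYTES PROP_MS
# Prints transmission, queuing, propagation and total delay, all in ms.
delay_ms() {
    awk -v mss="$1" -v bps="$2" -v n="$3" -v q="$4" -v avg="$5" -v prop="$6" 'BEGIN {
        tx    = mss * 8 * 1000 / bps            # one maximum-size packet on the wire
        queue = n * q * avg * 8 * 1000 / bps    # packets waiting in n router queues
        printf "%.2f %.2f %.2f %.2f\n", tx, queue, prop, tx + queue + prop
    }'
}

# 1500-byte packet, 3 routers, 20 packets of 512 bytes each, 50 ms propagation.
delay_ms 1500 $((2048 * 1024)) 3 20 512 50
```

This prints 5.72 117.19 50.00 172.91. Changing the queue length or the link speed shows quickly which term dominates on a given path.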
How can we help?
  1. Avoid mixing VoIP packets with big bulk packets.
  2. Avoid using a very high MSS (mtu).
  3. Avoid using very low bandwidth links to transmit VoIP packets.
  4. Avoid using very long propagation delay links (satellite, for example).
  5. Avoid very long links.
  6. Avoid congestion.
  7. Configure your routers to put the VoIP packets in priority queues, not having to wait behind long queues of big packets to be forwarded.
Overbuffering and underbuffering
These opposing terms have to be balanced to get the right response. They affect both the packet loss probability and the packet latency. Overbuffering allows for higher burst capacity and lower packet loss probability, but increases the router's queue length and the experienced packet latency. Conversely, underbuffering reduces latency but increases the packet loss probability.
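One rough way to reason about this balance is to ask how many average-size packets a queue may hold before it exceeds a given latency budget. The sketch below does that arithmetic; the function name max_queue_pkts and the 20 ms budget are assumptions for illustration, not a recommended setting.

```shell
# max_queue_pkts BUDGET_MS LINK_BPS AVG_PKT_BYTES
# Largest whole number of average-size packets that drain within BUDGET_MS.
max_queue_pkts() {
    awk -v ms="$1" -v bps="$2" -v avg="$3" \
        'BEGIN { print int(ms * bps / (8 * 1000 * avg)) }'
}

# 20 ms budget on the 2048 * 1024 bit/s link with 512-byte packets.
max_queue_pkts 20 2097152 512
```

With these inputs the result is 10 packets; a longer queue lowers the loss probability but pushes the latency past the budget.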
Normally, when using the Internet, we do not control the path of VoIP packets. Then we have to keep our fingers crossed and hope for correctly configured routers in our path; if these are overbuffered, we have no chance of a good VoIP response. However, if we control the link routers we can apply some simple rules to select the best buffering strategy for them (see below).
For Internet paths, the best strategy starts with a careful evaluation of our environment. Samples have to be taken using a tool like pathchar, pchar or even ping. These tests will tell us whether we have some probability of success. It doesn't make any sense to spend time and money if we have a link with very high latency.
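As a starting point for such sampling with ping, the following sketch extracts the average round-trip time from the summary line that ping prints at the end of a run. It assumes the usual "min/avg/max/..." summary format; the function name avg_rtt_ms and the HOST placeholder are hypothetical.

```shell
# avg_rtt_ms: reads ping output on stdin and prints the average RTT in ms.
# Assumes a summary line such as "rtt min/avg/max/mdev = a/b/c/d ms".
avg_rtt_ms() {
    awk -F'/' '/min\/avg\/max/ { print $5 }'
}

# Usage (HOST is a placeholder for the far end of your VoIP path):
#   ping -c 20 HOST | avg_rtt_ms
```

Keep in mind that ping reports a round-trip time, while the 150-200 ms figure discussed earlier is a one-way delay, so the two are not directly comparable.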
To estimate buffer settings, have a look at some examples on this site in CAR queuing discipline for Cisco routers and GRED queuing discipline for Linux routers.