Technical info about VoIP
Here is some important technical information needed to understand VoIP.
Overview on a VoIP connection
To set up a VoIP communication we need:
- First, an ADC to convert analog voice to digital signals (bits).
- Next, the bits have to be compressed into a format suitable for
  transmission: there are a number of protocols for this, which we'll
  see later.
- Then we have to insert our voice packets into data packets using a
  real-time protocol (typically RTP over UDP over IP).
- We need a signaling protocol to call users: ITU-T H.323 does that.
- At the receiver (RX) we have to disassemble the packets, extract the
  data, then convert it to analog voice signals and send them to the
  sound card (or phone).
- All of this must be done in real time, because we cannot wait too
  long for a vocal answer! (see the QoS section)
Analog to Digital Conversion
This is done in hardware, typically by the ADC integrated in the sound card.
Today every sound card can convert a band of 22050 Hz with 16 bits per
sample (by the Nyquist principle, sampling it requires a frequency of
44100 Hz), giving a throughput of 2 bytes * 44100 (samples per second) =
88200 bytes/s, or 176.4 kBytes/s for a stereo stream.
VoIP doesn't need such a throughput (176 kBytes/s) to send voice packets:
next we'll see the other codings used for it.
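The arithmetic above can be checked with a short Python sketch. The function name and its parameters are chosen here for illustration and are not part of any library.

```python
def pcm_throughput_bytes_per_s(sample_rate_hz, bits_per_sample, channels=1):
    """Raw (uncompressed) PCM throughput in bytes per second."""
    return sample_rate_hz * (bits_per_sample // 8) * channels

# Nyquist: the sampling frequency is twice the highest frequency to capture.
band_hz = 22050
sample_rate_hz = 2 * band_hz

mono = pcm_throughput_bytes_per_s(sample_rate_hz, 16)
stereo = pcm_throughput_bytes_per_s(sample_rate_hz, 16, channels=2)
print(mono, "bytes/s mono,", stereo, "bytes/s stereo")
```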
Compression Algorithms
Now that we have digital data, we can convert it to a standard format that
can be transmitted quickly.
PCM, Pulse Code Modulation, Standard ITU-T G.711
- Voice bandwidth is 4 kHz, so the sampling frequency has to be 8 kHz
  (by Nyquist).
- We represent each sample with 8 bits (giving 256 possible values).
- Throughput is 8000 Hz * 8 bits = 64 kbit/s, the same as a typical
  digital phone line.
- In real applications the mu-law (North America) and A-law (Europe)
  variants are used, which code the analog signal on a logarithmic
  scale, compressing 14-bit (mu-law) or 13-bit (A-law) linear samples
  into 8 bits (see Standard ITU-T G.711).
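To illustrate the logarithmic coding, here is a minimal Python sketch of mu-law companding using the continuous textbook formula with mu = 255. It is only an illustration: G.711 itself uses a segmented (piecewise linear) approximation and its own bit layout, and the helper names and the uniform 8-bit quantizer below are assumptions made for this example.

```python
import math

MU = 255  # companding parameter used by mu-law

def mulaw_compress(x):
    """Map a linear sample x in [-1, 1] to a companded value in [-1, 1]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)

def mulaw_expand(y):
    """Inverse of mulaw_compress."""
    return math.copysign(((1 + MU) ** abs(y) - 1) / MU, y)

def encode_8bit(x):
    """Compress, then quantize uniformly to one of 256 codes (0..255)."""
    y = mulaw_compress(x)
    return min(255, int((y + 1) / 2 * 256))

def decode_8bit(code):
    """Take the centre of the quantization cell, then expand."""
    y = (code + 0.5) / 256 * 2 - 1
    return mulaw_expand(y)

quiet = 0.01  # a quiet sample, where the logarithmic scale helps most
code = encode_8bit(quiet)
print(code, decode_8bit(code))
```

Because the scale is logarithmic, quiet samples keep a much smaller absolute error than a plain uniform 8-bit quantizer would give them.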
ADPCM, Adaptive Differential PCM, Standard ITU-T G.726
It encodes only the difference between the current voice sample and a
prediction based on the previous ones, requiring 32 kbps (see Standard
ITU-T G.726).
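The idea of sending only differences can be sketched in Python as a plain differential PCM coder. This is a deliberately simplified illustration and not G.726: the real codec adapts both its predictor and its quantizer step, while here the step is a fixed, hypothetical value and the prediction is just the previously reconstructed sample. With 4-bit codes at 8000 samples per second the rate is 32 kbps.

```python
STEP = 16                    # fixed quantizer step (hypothetical; G.726 adapts it)
MIN_CODE, MAX_CODE = -8, 7   # range of a 4-bit signed code

def dpcm_encode(samples):
    """Encode each sample as a 4-bit quantized difference from the prediction."""
    codes, predicted = [], 0
    for s in samples:
        code = max(MIN_CODE, min(MAX_CODE, round((s - predicted) / STEP)))
        codes.append(code)
        predicted += code * STEP  # track what the decoder will reconstruct
    return codes

def dpcm_decode(codes):
    """Rebuild the samples by accumulating the quantized differences."""
    out, predicted = [], 0
    for code in codes:
        predicted += code * STEP
        out.append(predicted)
    return out

samples = [0, 20, 45, 60, 50, 30]
codes = dpcm_encode(samples)
decoded = dpcm_decode(codes)
print(codes, decoded)
print("bit rate:", 8000 * 4, "bit/s")
```

The encoder updates its prediction from the quantized code rather than from the true sample, so encoder and decoder stay in step and quantization errors do not accumulate.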
LD-CELP, Standard ITU-T G.728
CS-ACELP, Standard ITU-T G.729 and G.729a
MP-MLQ, Standard ITU-T G.723.1, 6.3kbps, Truespeech
ACELP, Standard ITU-T G.723.1, 5.3kbps, Truespeech
LPC-10, able to reach 2.4 kbps!
These last protocols are the most important because they can guarantee a
very low minimum bandwidth using source coding; the G.723.1 codecs also
have a very high MOS (Mean Opinion Score, used to measure voice fidelity),
but pay attention to the processing power they require, up to 26 MIPS!
RTP Real Time Transport Protocol
Now we have the raw data and we want to encapsulate it into the TCP/IP stack.
We follow this structure:
VoIP data packets
RTP
UDP
IP
Layers I, II
VoIP data packets live in RTP (Real-Time Transport Protocol) packets, which
are carried inside UDP/IP packets.
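As a rough illustration of this layering, the Python sketch below builds the 12-byte fixed RTP header in front of a voice payload and adds up the header sizes. The 20 ms packetization interval, the SSRC value and the helper names are assumptions made for the example; payload type 0 is the static RTP type for G.711 mu-law, and the IP header is taken as a 20-byte IPv4 header without options.

```python
import struct

# Fixed RTP header: flags byte, marker/payload-type byte,
# 16-bit sequence number, 32-bit timestamp, 32-bit SSRC (12 bytes).
RTP_HEADER = struct.Struct("!BBHII")
UDP_HEADER_BYTES = 8
IP_HEADER_BYTES = 20  # IPv4 without options

def build_rtp_packet(payload, seq, timestamp, ssrc, payload_type=0):
    byte0 = 2 << 6               # version 2, no padding, no extension, no CSRC
    byte1 = payload_type & 0x7F  # marker bit clear
    header = RTP_HEADER.pack(byte0, byte1, seq & 0xFFFF,
                             timestamp & 0xFFFFFFFF, ssrc)
    return header + payload

def parse_rtp_packet(packet):
    byte0, byte1, seq, timestamp, ssrc = RTP_HEADER.unpack_from(packet)
    return {
        "version": byte0 >> 6,
        "payload_type": byte1 & 0x7F,
        "seq": seq,
        "timestamp": timestamp,
        "ssrc": ssrc,
        "payload": packet[RTP_HEADER.size:],
    }

# 20 ms of G.711 voice: 8000 samples/s * 0.020 s * 1 byte = 160 bytes.
payload = bytes(160)
packet = build_rtp_packet(payload, seq=1, timestamp=160, ssrc=0x12345678)
on_wire = IP_HEADER_BYTES + UDP_HEADER_BYTES + len(packet)
print(len(packet), "bytes of RTP,", on_wire, "bytes at the IP layer")
```

Under these assumptions the headers add 40 bytes to every 160 bytes of voice, which is one reason low-bit-rate codecs gain less on the wire than their nominal rate suggests.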
First, VoIP doesn't use TCP because it is too heavy for real-time
applications, so UDP (datagram) is used instead.
Second, UDP has no control over the order in which packets arrive at the
destination or how long it takes them to get there (datagram concept). Both
of these are very important to overall voice quality (how well you can
understand what the other person is saying) and conversation quality (how
easy it is to carry out a conversation). RTP solves the problem by enabling
the receiver to put the packets back into the correct order and not to wait
too long for packets that have either been lost or are taking too long to
arrive (we don't need every single voice packet, but we need a continuous,
ordered flow of many of them).
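The receiver-side behaviour described above can be sketched in Python as a small reordering buffer keyed on the RTP sequence number. This is an illustrative simplification, not a production jitter buffer: the class name and the `depth` parameter are hypothetical, the decision to stop waiting is based on how many packets are queued rather than on timing, and 16-bit sequence number wraparound and duplicates are ignored.

```python
import heapq

class ReorderBuffer:
    """Release packets in sequence order, skipping ones that never arrive."""

    def __init__(self, depth=2, first_seq=0):
        self.depth = depth          # how many packets we hold while waiting
        self.next_seq = first_seq   # next sequence number due for playout
        self.heap = []

    def push(self, seq, payload):
        """Add one packet; return the (seq, payload) pairs now ready to play."""
        if seq < self.next_seq:
            return []               # arrived after its playout slot: drop it
        heapq.heappush(self.heap, (seq, payload))
        ready = []
        # Play the head if it is the expected packet, or if the buffer is
        # full and we give up waiting for the missing one.
        while self.heap and (self.heap[0][0] == self.next_seq
                             or len(self.heap) > self.depth):
            head_seq, head_payload = heapq.heappop(self.heap)
            ready.append((head_seq, head_payload))
            self.next_seq = head_seq + 1
        return ready

buf = ReorderBuffer(depth=2)
played = []
# Packets 1 and 2 are swapped in transit, packet 4 is lost.
for seq in [0, 2, 1, 3, 5, 6, 7, 8]:
    played += [s for s, _ in buf.push(seq, b"voice")]
late = buf.push(4, b"voice")  # packet 4 finally shows up, too late
print(played, late)
```

The swapped packets come out in order, the lost packet is skipped once enough later packets have queued up, and its late copy is discarded instead of being played out of place.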
RSVP
There are also other protocols used in VoIP, like RSVP, that can manage
Quality of Service (QoS).
RSVP is a signaling protocol that requests a certain amount of bandwidth
and latency in every network hop that supports it.
For detailed info about RSVP see RFC 2205.
Quality of Service (QoS)
We have said many times that VoIP applications require real-time data
streaming, because we expect an interactive voice exchange.
Unfortunately, TCP/IP cannot guarantee this; it just makes a "best effort"
to do it. So we need to introduce tricks and policies that can manage the
packet flow in EVERY router we cross.
Here they are:
- TOS field in the IP header, to describe the type of service: its
  precedence bits mark how urgent a packet is (higher precedence values
  mean higher priority), and its flag bits can request low delay, high
  throughput or high reliability.
- Packet queuing methods:
  - FIFO (First In First Out), the simplest method, which forwards
    packets in arrival order.
  - WFQ (Weighted Fair Queuing), which shares the link fairly between
    flows (for example, FTP cannot consume all available bandwidth),
    depending on the kind of data flow, typically one packet for UDP
    and one for TCP in turn.
  - CQ (Custom Queuing), where users can decide the priority.
  - PQ (Priority Queuing), where there are a number (typically 4) of
    queues, each with its own priority level: packets in the first
    queue are sent first, then (when the first queue is empty) sending
    starts from the second one, and so on.
  - CB-WFQ (Class Based Weighted Fair Queuing), like WFQ but with the
    addition of classes (up to 64), each with an associated bandwidth
    value.
- Shaping capability, which limits the source to a fixed bandwidth in:
  - download
  - upload
- Congestion avoidance, like RED (Random Early Detection).
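Of the queuing methods listed above, Priority Queuing is the easiest to sketch. The Python example below is a toy model under stated assumptions, not router code: the class name is hypothetical, level 0 is taken as the highest priority, and packets are plain strings.

```python
from collections import deque

class PriorityQueuing:
    """Strict priority scheduler: lower level number means higher priority."""

    def __init__(self, levels=4):
        self.queues = [deque() for _ in range(levels)]

    def enqueue(self, packet, level):
        self.queues[level].append(packet)

    def dequeue(self):
        # Always serve the first non-empty queue, starting from the top.
        for queue in self.queues:
            if queue:
                return queue.popleft()
        return None  # nothing waiting

pq = PriorityQueuing()
pq.enqueue("ftp-1", level=3)
pq.enqueue("voice-1", level=0)
pq.enqueue("ftp-2", level=3)
pq.enqueue("voice-2", level=0)

sent = [pq.dequeue() for _ in range(4)]
print(sent)
```

Voice packets leave first even though the FTP packets arrived earlier. The same strictness means a busy high-priority queue can starve the lower ones, which is the problem the fair queuing variants try to avoid.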
For exhaustive information about QoS see Differentiated Services at the IETF.
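Differentiated Services reuses the TOS byte mentioned in the list above: its upper six bits carry a DSCP code point. As a hedged sketch, an application could mark its own voice packets as below. The helper names are hypothetical, Expedited Forwarding (DSCP 46) is assumed as the marking for voice, and whether `socket.IP_TOS` is available and whether routers honour the marking depends on the platform and the network.

```python
import socket

DSCP_EF = 46  # Expedited Forwarding, commonly used for voice traffic

def dscp_to_tos(dscp):
    """The DSCP occupies the upper six bits of the former TOS byte."""
    return (dscp & 0x3F) << 2

def open_marked_udp_socket(dscp=DSCP_EF):
    """Create a UDP socket whose outgoing packets carry the given DSCP."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, dscp_to_tos(dscp))
    return sock

print(hex(dscp_to_tos(DSCP_EF)))
```

A sender would call `open_marked_udp_socket()` once and then use the returned socket for its RTP stream.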
All of this content on VoIP was taken from the VoIP Howto, v1.7, August 7,
2002, written by Roberto Arcomano (berto@fatamorgana.com). The most
up-to-date version of this information can be found at:
http://www.fatamorgana.com/bertolinux