Abstract: The Session Initiation Protocol (SIP) is currently preferred to the H.323 protocol because of its simple architecture. In traditional SIP, the security mechanisms are inadequate for cryptographically assuring the identity of the end user. As a result, a third person, an impersonator, may easily masquerade as one of the users engaged in the session, which poses a serious threat to the secrecy of the user. In order to reduce the threat posed by an eavesdropper, an authentication scheme is required. This study proposes a scheme for authenticating the end users' identities with the outbound proxy server of the domain with the help of the registrar server. The authentication scheme is based on the end users' public key infrastructure (PKI) certificates and a strong one-way hash function. The proposed scheme is implemented using sockets and tested for its validity in a LAN environment. The delay overhead of providing authentication in the proposed scheme is found to be within the allowable limit recommended by ITU-T.
Key words: Authentication, VoIP, SIP, proxy server, user agent client, user agent server
INTRODUCTION
The Session Initiation Protocol (SIP) [1,2] is an application-layer signaling protocol for initiating, managing and terminating voice and video sessions across packet networks. SIP sessions involve one or more participants and can use unicast or multicast communication. Borrowing from universal Internet protocols such as HTTP [3] and SMTP, SIP is text-encoded and highly extensible. SIP is being developed by the SIP Working Group within the Internet Engineering Task Force (IETF). The protocol was first published as IETF RFC 2543 [2] and is now specified in RFC 3261 [1]. Different types of entities are defined in SIP, namely user agents, proxy servers, redirect servers and registrar servers. The user agent represents the terminal (i.e., an application that contains both the user agent client and the user agent server). The proxy server is an intermediate entity that acts as both a server and a client for making requests on behalf of other clients. The redirect server accepts requests and replies to the client with a response message (typically providing a contact address for the called user). The registrar is a particular server that accepts user registration requests.
A simple call setup procedure from one user to another is depicted in Fig. 1. Here User1 represents the User Agent Client (UAC), which acts as the source of a call, and User2 represents the User Agent Server (UAS), which acts as the destination. The proxy servers used in the call setup procedure correspond to their respective domains. Although the signaling procedure in Fig. 1 is simple and reflects the ease of use for which SIP was developed, security and privacy, a mandatory service, received less attention when the initial architecture for SIP was designed.
Fig. 1: A simple SIP call setup procedure
This gap has attracted the attention of many researchers. This study provides an alternative user authentication scheme to the HTTP digest authentication scheme that exists in SIP as per RFC 2617 [3]. Issuing PKI certificates to all end users may be a difficult task at present, but this is acceptable taking into account the reliability of the proposed scheme.
SIP security services: The two fundamental security services required by SIP are confidentiality and authentication. Confidentiality is usually provided by means of encryption [4,5], so that only the intended recipient can decrypt the message and recover its meaning. In a SIP-based network, confidentiality is provided either on a hop-by-hop basis or on an end-to-end basis. User authentication is needed in SIP connections to ensure that communication takes place only between legitimate parties. The security services must be properly combined in order to achieve a trusted network scenario. Data authentication is used to authenticate the sender of a message and to make sure that the information is not altered in transmission; this prevents an attacker from altering and/or replaying SIP requests and responses. At present SIP makes use of the Proxy-Authenticate and WWW-Authenticate header fields, similar to those that exist in HTTP [3]; for authentication of the end users it makes use of a digital signature. As an alternative, hop-by-hop authentication is performed using transport- or network-layer security protocols such as TLS or IPsec.
Corresponding Author: Srinivasan R, Department of Electronics Engg, MIT Campus of Anna University, Chennai – 600 044. India
Proposed authentication scheme: A reliable authentication scheme is proposed for authenticating the user client with the outbound proxy server in the trusted SIP domain. The scheme assumes that the called party and the calling party are within the same domain and that they authenticate with the outbound proxy server of that domain. The design incorporates the registrar server within the outbound proxy server of the domain.
In the proposed model shown in Fig. 2, the user client of the calling party wants to communicate directly with the outbound proxy server of its domain. Hence it is the responsibility of the proxy server to authenticate the user client with the help of the registrar server that co-exists with the proxy server of that trusted SIP domain. The key assumption made in the proposed authentication protocol is that both the proxy server and the registrar server hold public key certificates issued by the certification authorities.
Table 1: Notations used in the design
Notation       Description of the notation
$T_i$          Time stamp used by entity i
$C_i$          Certificate of an entity i
$[M]_K$        Encryption of a message M using a symmetric key K
$E_K[M]$       Encryption of a message M using an asymmetric key K
$H[M]$         One-way hash function used to hash message M
$p$, $q$       Large prime numbers such that $q \mid (p-1)$
In the proposed scheme the user client of the calling party first registers with the registrar server of its own domain. The registrar server generates a large m-bit number N and keeps it secret. The number is chosen in such a way that it cannot easily be found by exhaustive search. During the registration process the user client (UC) submits its own identity $I_{UC}$ to the registrar server (RS). In return the RS generates the password for the UC as follows:

$PW_{UC} = H[N \| I_{UC}]$ (1)

where $\|$ represents the concatenation operation, and sends it back to the UC in a secure manner. The RS also issues a tag for the UC that contains $I_{RS}$, $r$ and a strong one-way hash function H.
[Figure: SIP trusted domain. Messages A and D pass between the User Client and the Proxy Server; messages B and C pass between the Proxy Server and the Registrar Server.]
Fig. 2: The authentication protocol flow in the SIP domain

Here $I_{RS}$ is the identity of the registrar server and the value of $r$ is computed as follows:

$r = H[N \| I_{RS}] \oplus H[N \| I_{UC}] \oplus I_{RS} \oplus I_{UC}$ (2)

The symbol $\oplus$ represents the bit-wise XOR operation.
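The registration step in Eqs. (1) and (2) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: SHA-256 stands in for the unspecified hash H, identities are assumed to be zero-padded to the digest length so that the XOR in Eq. (2) is well defined, and the names `register`, `pad_id` and the example identities are hypothetical.

```python
import hashlib
import secrets

HASH_LEN = 32  # SHA-256 digest size in bytes (assumed choice of H)

def H(*parts: bytes) -> bytes:
    """One-way hash of the concatenation of its arguments."""
    return hashlib.sha256(b"".join(parts)).digest()

def xor(*blocks: bytes) -> bytes:
    """Bit-wise XOR of byte strings that are all HASH_LEN bytes long."""
    out = bytes(HASH_LEN)
    for block in blocks:
        if len(block) != HASH_LEN:
            raise ValueError("all operands must be HASH_LEN bytes")
        out = bytes(a ^ b for a, b in zip(out, block))
    return out

def pad_id(identity: str) -> bytes:
    """Encode an identity as exactly HASH_LEN bytes (assumed encoding)."""
    raw = identity.encode("utf-8")
    if len(raw) > HASH_LEN:
        raise ValueError("identity too long for this sketch")
    return raw.ljust(HASH_LEN, b"\x00")

def register(N: bytes, id_rs: bytes, id_uc: bytes):
    """Registrar side: return (PW_UC, r) for a user client."""
    pw_uc = H(N, id_uc)                        # Eq. (1)
    r = xor(H(N, id_rs), pw_uc, id_rs, id_uc)  # Eq. (2)
    return pw_uc, r

N = secrets.token_bytes(HASH_LEN)        # registrar's secret number N
id_rs = pad_id("registrar.example.org")  # hypothetical identities
id_uc = pad_id("alice@example.org")
pw_uc, r = register(N, id_rs, id_uc)
```

Because every term in Eq. (2) is combined with XOR, the tag value r reveals neither N nor the password on its own; both remain hidden behind the hash.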
In the call establishment phase the calling party's user client and the proxy server of the domain have to authenticate each other. This is because the proxy server initiates the call on behalf of the user client. At the same time the user client must also prove that it is a valid user in that domain.
In the proposed scheme, when a user client wants to establish a connection with a user server, it first enters the password $PW_{UC}$ that was issued to it during the registration process by the registrar server. This password is used to generate a secret random number $R_0$, which is kept secret for further use. The random number generated is an element of the set $\{1, 2, \ldots, q-1\}$. The user client then computes the number $n = r \oplus PW_{UC}$ and generates a time stamp $TS_{UC}$. Next, the user client generates a temporary key $L = H[PW_{UC} \oplus TS_{UC}]$ used for symmetric encryption. The secret random number $R_0$ is encrypted using the symmetric key L. After all these computations are done, the user client sends the parameters A = $n$, $[R_0]_L$, $I_{RS}$, $TS_{UC}$, as shown in Fig. 2.
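The user client's computation of message A can be sketched as below. Several details are assumptions made only for illustration: SHA-256 stands in for H, a hash-based keystream (`sym_crypt`) stands in for the unspecified symmetric cipher and is not a production cipher, the time stamp and $R_0$ are encoded as 32-byte big-endian integers, `Q` is a stand-in prime, and $R_0$ is simply drawn from the system random source because the text does not say how the password seeds it.

```python
import hashlib
import secrets

HASH_LEN = 32
Q = 2**127 - 1  # stand-in for the prime q (assumption)

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def xor(a: bytes, b: bytes) -> bytes:
    if len(a) != len(b):
        raise ValueError("operands must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))

def sym_crypt(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher: XOR with a SHA-256 keystream.
    Applying it twice with the same key returns the plaintext."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += H(key + counter.to_bytes(4, "big"))
        counter += 1
    return xor(data, stream[:len(data)])

def build_message_a(pw_uc: bytes, r: bytes, id_rs: bytes, ts_uc: int):
    """User client side: return (A, R_0, L)."""
    r0 = secrets.randbelow(Q - 1) + 1                   # R_0 in {1, ..., q-1}
    n = xor(r, pw_uc)                                   # n = r XOR PW_UC
    L = H(xor(pw_uc, ts_uc.to_bytes(HASH_LEN, "big")))  # L = H[PW_UC XOR TS_UC]
    enc_r0 = sym_crypt(L, r0.to_bytes(HASH_LEN, "big")) # [R_0]_L
    return (n, enc_r0, id_rs, ts_uc), r0, L

# Placeholder registration values, for demonstration only.
pw_uc = H(b"demo password material")
r = H(b"demo tag value")
id_rs = b"registrar.example.org".ljust(HASH_LEN, b"\x00")
message_a, r0, L = build_message_a(pw_uc, r, id_rs, ts_uc=1_700_000_000)
```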
When the request arrives from the user client, the proxy server checks how recent the time stamp in parameter A is by comparing it with its current time. If the time stamp is within the allowable limit the proxy server processes the request; otherwise it terminates the request.
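The freshness check on the time stamp can be sketched as a simple window comparison. The window size `MAX_SKEW_SECONDS` and the function name `is_fresh` are hypothetical; the text only says that the time stamp must be within some allowable limit.

```python
import time
from typing import Optional

MAX_SKEW_SECONDS = 5.0  # hypothetical allowable limit

def is_fresh(ts_sender: float, now: Optional[float] = None,
             max_skew: float = MAX_SKEW_SECONDS) -> bool:
    """Return True if the sender's time stamp is within max_skew of `now`."""
    if now is None:
        now = time.time()
    return abs(now - ts_sender) <= max_skew
```

The same check would apply wherever a time stamp is validated in the protocol, which assumes the clocks of the participating entities are loosely synchronized.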
On a valid time stamp the proxy server (PS) generates a secret random number $\sigma$ and computes its signature using its private key $KR_{PS}$:

Signature of PS = $E_{KR_{PS}}(H(\sigma, n, [R_0]_L, TS_{UC}, C_{PS}))$ (3)

where $C_{PS}$ is the certificate of the proxy server.
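The hash-then-sign structure of Eq. (3) can be illustrated with textbook RSA. This is only a sketch under loud assumptions: the paper does not name the public key algorithm, the key below is built from two well-known Mersenne primes and is far too small and unpadded to be secure, SHA-256 stands in for H, and the field values are placeholders.

```python
import hashlib

# Toy RSA key (assumption, insecure): two Mersenne primes.
RSA_P = 2**127 - 1
RSA_Q = 2**89 - 1
RSA_N = RSA_P * RSA_Q
RSA_E = 65537                                        # public exponent (KU)
RSA_D = pow(RSA_E, -1, (RSA_P - 1) * (RSA_Q - 1))    # private exponent (KR)

def digest_int(*fields: bytes) -> int:
    """Hash the fields (length-prefixed to keep them unambiguous) to an integer mod N."""
    h = hashlib.sha256()
    for field in fields:
        h.update(len(field).to_bytes(4, "big"))
        h.update(field)
    return int.from_bytes(h.digest(), "big") % RSA_N

def sign(*fields: bytes) -> int:
    """Signature = E_KR(H(fields)), as in Eq. (3)."""
    return pow(digest_int(*fields), RSA_D, RSA_N)

def verify(signature: int, *fields: bytes) -> bool:
    """Recompute the hash and compare it with the value recovered using KU."""
    return pow(signature, RSA_E, RSA_N) == digest_int(*fields)

# Placeholder values for sigma, n, [R_0]_L, TS_UC and C_PS.
fields = (b"sigma-demo", b"n-demo", b"enc-r0-demo", b"1700000000", b"cert-ps-demo")
signature_ps = sign(*fields)
```

A verifier holding the proxy server's public key accepts the message only if every signed field is unchanged, which is what gives the registrar server integrity protection over B.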
The PS then generates its own time stamp $TS_{PS}$ and sends B to the registrar server, where B = $\sigma$, $n$, $[R_0]_L$, $TS_{UC}$, Signature of PS, $TS_{PS}$ and $C_{PS}$.
Upon receiving B, the registrar server has to validate the certificate and the time stamp that it received from the PS. If the time stamp is not within the allowable limit, the request phase is terminated; otherwise the RS recovers the UC's real identity as follows:

$I_{UC} = I_{RS} \oplus n \oplus H[N \| I_{RS}]$ (4)
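That Eq. (4) returns the registered identity can be checked by substituting Eqs. (1) and (2) into the definition of n. The short derivation below uses only symbols already defined and the fact that any value XORed with itself cancels.

```latex
\begin{aligned}
n &= r \oplus PW_{UC} \\
  &= H[N \| I_{RS}] \oplus H[N \| I_{UC}] \oplus I_{RS} \oplus I_{UC} \oplus H[N \| I_{UC}] \\
  &= H[N \| I_{RS}] \oplus I_{RS} \oplus I_{UC}, \\
I_{RS} \oplus n \oplus H[N \| I_{RS}] &= I_{UC}.
\end{aligned}
```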
After computing the real identity of the UC, the RS verifies whether the UC is a legal user of the domain. If the user is not a legal user, an error message is forwarded; otherwise the RS computes the temporary key L needed to decrypt $R_0$:

$L = H[TS_{UC} \oplus H[N \| I_{UC}]]$ (5)
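The registrar-side processing in Eqs. (4) and (5) can be sketched end to end as follows. As before, this is an illustration under assumptions: SHA-256 stands in for H, `sym_crypt` is a toy hash-based keystream rather than a real cipher, identities are zero-padded to 32 bytes, and the function name `registrar_process` and all demo values are hypothetical.

```python
import hashlib

HASH_LEN = 32  # SHA-256 digest size (assumed choice of H)

def H(*parts: bytes) -> bytes:
    return hashlib.sha256(b"".join(parts)).digest()

def xor(*blocks: bytes) -> bytes:
    out = bytes(HASH_LEN)
    for block in blocks:
        if len(block) != HASH_LEN:
            raise ValueError("operands must be HASH_LEN bytes")
        out = bytes(a ^ b for a, b in zip(out, block))
    return out

def sym_crypt(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher (XOR keystream); encrypts and decrypts."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += H(key, counter.to_bytes(4, "big"))
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))

def ts_bytes(ts: int) -> bytes:
    return ts.to_bytes(HASH_LEN, "big")

def registrar_process(N, id_rs, registered, n, enc_r0, ts_uc):
    """Recover I_UC (Eq. 4), check membership, derive L (Eq. 5), decrypt R_0."""
    id_uc = xor(id_rs, n, H(N, id_rs))
    if id_uc not in registered:
        raise PermissionError("UC is not a legal user of this domain")
    L = H(xor(ts_bytes(ts_uc), H(N, id_uc)))
    return id_uc, sym_crypt(L, enc_r0)

# Demo: simulate registration and the user client's request.
N = H(b"registrar secret N (demo value)")
id_rs = b"registrar.example.org".ljust(HASH_LEN, b"\x00")
id_uc = b"alice@example.org".ljust(HASH_LEN, b"\x00")
pw_uc = H(N, id_uc)
r = xor(H(N, id_rs), pw_uc, id_rs, id_uc)
ts_uc = 1_700_000_000
r0 = (123456789).to_bytes(HASH_LEN, "big")
n = xor(r, pw_uc)
enc_r0 = sym_crypt(H(xor(pw_uc, ts_bytes(ts_uc))), r0)

recovered_id, recovered_r0 = registrar_process(N, id_rs, {id_uc}, n, enc_r0, ts_uc)
```

The registrar never receives the password itself: it rebuilds $H[N \| I_{UC}]$ from its own secret N, so both sides arrive at the same key L independently.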
Once the message $R_0$ is obtained, the registrar server encrypts $H[I_{UC}]$ and $R_0$ with the public key $KU_{PS}$ of the PS. The RS then computes its signature using its private key $KR_{RS}$:

Signature of RS = $E_{KR_{RS}}(H(\sigma, \gamma, E_{KU_{PS}}(H[I_{UC}] \| R_0)))$ (6)
where $\gamma$ is the secret random number generated by the registrar server. Finally the registrar server generates its time stamp $TS_{RS}$ and sends C = $\gamma$, $E_{KU_{PS}}(H[I_{UC}] \| R_0)$, Signature of RS, $TS_{RS}$ and $C_{RS}$ to the proxy server.
Upon receiving the message from the RS, the PS verifies whether the time stamp is within the allowable limit and also checks the validity of the certificate received from the RS. If these parameters are not valid, the exchange is terminated. If the received parameters are valid, the PS can conclude that the UC is an authorized user. The PS then issues a temporary certificate $TC_{UC}$ to the UC that contains the lifetime of the temporary certificate and other information related to it. The PS computes the session key (SK) that is needed for all signal transfer between the UC and the PS, where SK is computed as $SK = H[I_{UC}] \oplus R_0$. The PS now stores $H[I_{UC}]$ and $R_0$ and sends $[TC_{UC}]_{SK}$ to the UC:

$D = [TC_{UC}]_{SK}$ (7)
Now PS has completed the authentication process with UC and has also established a session key. Upon receiving D the UC can compute SK and decrypt the message and obtain the temporary certificate.
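The session key agreement can be sketched as follows: both ends compute $SK = H[I_{UC}] \oplus R_0$ from values they already hold, so SK itself never crosses the network. The sketch assumes SHA-256 for H, a 32-byte encoding of $R_0$, a toy keystream cipher (`sym_crypt`) in place of the unspecified symmetric cipher, and a made-up byte string for the temporary certificate.

```python
import hashlib

HASH_LEN = 32

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def xor(a: bytes, b: bytes) -> bytes:
    if len(a) != len(b):
        raise ValueError("operands must have equal length")
    return bytes(x ^ y for x, y in zip(a, b))

def sym_crypt(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher (XOR keystream); encrypts and decrypts."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += H(key + counter.to_bytes(4, "big"))
        counter += 1
    return xor(data, stream[:len(data)])

def session_key(h_id_uc: bytes, r0: bytes) -> bytes:
    """SK = H[I_UC] XOR R_0."""
    return xor(h_id_uc, r0)

id_uc = b"alice@example.org"                            # hypothetical identity
r0 = (123456789).to_bytes(HASH_LEN, "big")              # demo value of R_0
tc_uc = b"TC_UC|subject=alice@example.org|lifetime=3600"  # made-up certificate

# Proxy server side: H[I_UC] and R_0 were delivered by the registrar in C.
sk_ps = session_key(H(id_uc), r0)
message_d = sym_crypt(sk_ps, tc_uc)                     # D = [TC_UC]_SK, Eq. (7)

# User client side: it knows its own identity and the R_0 it generated.
sk_uc = session_key(H(id_uc), r0)
tc_received = sym_crypt(sk_uc, message_d)
```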
Once the call is established between the calling UC and the called user server (US), the identities of the end users are shared, and the media traffic is carried directly between the end users. During the call, the end users authenticate each other at regular time intervals: the UC provides $TC_{UC}$, $[R_i \| TC_{UC}]_{SK_i}$, where $SK_i = H(I_{UC}) \oplus R_{i-1}$, $i = 1, 2, \ldots, n$. Upon receipt of this message the US checks whether the certificate $TC_{UC}$ is valid. If it is not valid, the US of the called party terminates the connection. Otherwise, the US of the called party computes $SK_i$ and decrypts $[R_i \| TC_{UC}]_{SK_i}$ with $SK_i$ to obtain $R_i$ and $TC_{UC}$. The US then compares the two copies of $TC_{UC}$ to verify the integrity of the message, saves $R_i$ in order to compute the next session key, and provides the connection for the calling UC. Because $R_i$ can be computed only by the calling UC who generated it, $R_i$ plays the vital role of a one-time key for accessing the called US.
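The rolling one-time key $SK_i = H(I_{UC}) \oplus R_{i-1}$ can be sketched with two small classes. The sketch rests on assumptions: SHA-256 for H, a toy keystream cipher, fresh 32-byte random values for each $R_i$, and a called US that already holds $H(I_{UC})$ and $R_0$ (the text does not describe how these reach the US). The certificate validity check itself is omitted, and the class names are hypothetical.

```python
import hashlib
import secrets

HASH_LEN = 32

def H(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

def sym_crypt(key: bytes, data: bytes) -> bytes:
    """Toy symmetric cipher (XOR keystream); encrypts and decrypts."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += H(key + counter.to_bytes(4, "big"))
        counter += 1
    return xor(data, stream[:len(data)])

class CallingUC:
    def __init__(self, id_uc: bytes, tc_uc: bytes, r0: bytes):
        self.h_id, self.tc_uc, self.r_prev = H(id_uc), tc_uc, r0

    def next_message(self):
        """Return (TC_UC, [R_i || TC_UC]_{SK_i}) and advance to R_i."""
        r_i = secrets.token_bytes(HASH_LEN)
        sk_i = xor(self.h_id, self.r_prev)      # SK_i = H(I_UC) XOR R_{i-1}
        blob = sym_crypt(sk_i, r_i + self.tc_uc)
        self.r_prev = r_i
        return self.tc_uc, blob

class CalledUS:
    def __init__(self, h_id_uc: bytes, r0: bytes):
        self.h_id, self.r_prev = h_id_uc, r0

    def check(self, tc_uc: bytes, blob: bytes) -> bool:
        """Decrypt with SK_i, compare the two certificate copies, save R_i."""
        sk_i = xor(self.h_id, self.r_prev)
        plain = sym_crypt(sk_i, blob)
        r_i, tc_inner = plain[:HASH_LEN], plain[HASH_LEN:]
        if tc_inner != tc_uc:
            return False
        self.r_prev = r_i
        return True

id_uc = b"alice@example.org"
r0 = secrets.token_bytes(HASH_LEN)
tc_uc = b"TC_UC|lifetime=3600"
uc = CallingUC(id_uc, tc_uc, r0)
us = CalledUS(H(id_uc), r0)
first = uc.next_message()
results = [us.check(*first), us.check(*uc.next_message()), us.check(*uc.next_message())]
replay_accepted = us.check(*first)  # an old message no longer decrypts correctly
```

Because each accepted message replaces the stored $R_{i-1}$, a captured message cannot be replayed in a later round.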
Performance analysis: In the proposed authentication scheme it is reasonable to assume that the RS is trustworthy, because the user must register with it using private information to obtain the service.
In the authentication scheme, security is based on the one-way hash function [6] and the smart tag. The authentication between the outbound PS and the RS makes use of public key encryption, whose security rests on the hardness of the discrete logarithm problem. At the start of the protocol the UC computes the temporary encryption key L using the one-way hash function and the smart tag. The strength of the hash and the secrecy of the large number N ensure that r, n and L cannot be obtained easily. The time stamp in the temporary key L ensures that it is recent.
In the next step of the protocol the PS verifies the legal status of the UC and then forwards the information of the UC and its certificate to the RS. The signature over the hashed value of the message provides implicit PS authentication. The one-way hash operation assures the integrity of the message, and the random number σ prevents any replay attack. Similarly, the RS uses a random number and a time stamp to provide freshness and to prevent any replay attack. In the proposed scheme the PS trusts the UC because the PS trusts the RS; conversely, the UC trusts the PS because the UC trusts the RS. The trust relation between the UC and the PS is thus established through the RS.
Table 2: Computational load in the protocol
Operations    Proxy Server    Registrar Server    User Client (Calling Party)    User Server (Called Party)
No. of hash 1
Operations
During registration
Encryptions 2 and
Decryptions
3
3
1
2
1
2
Table 3: Delay budget
Delay Source                                  On-net Budget (ms)
Device Sample Capture                         0.1
Encoding Delay                                17.5
Packetisation/Depacketisation Delay           20
Uplink Transmission Delay                     10
Network Transmission Delay + Others           X ¶
Decoder Processing Delay                      2
Device Play out Delay                         0.5
Hashing Overhead Delay in Proposed Scheme     10 §
Delay in traditional VoIP: 60.1 ms + X
Delay in Proposed Scheme: 70.1 ms + X
Overhead: 10 ms
¶ Network Transmission Delay, Jitter varies from network to network.
§ With a P III 1.0 GHz Processor
Table 2 summarizes the computational load involved in the proposed scheme for SIP based VoIP authentication. The XOR and the concatenation operations involved in the protocol are negligible operations compared to hashing and encryption.
During media transfer the UC can prove itself by presenting $TC_{UC}$ to the PS. Because $R_i$ can be computed only by the UC that generated it, $SK_i$ plays the role of a one-time label for accessing the network through the PS.
An important property of the protocol is that there is only one round of information exchange between the UC and the PS and one round of information exchange between the PS and the RS.
A sample on-net delay budget [7] for the G.729 (8 kb/s) codec is shown in Table 3. The delay budget is for an end-to-end connection between two users involved in the VoIP call. The last row of Table 3 is the delay overhead introduced by the authentication scheme. The total delay in the call, including the additional overhead, is well within the acceptable limit recommended by ITU-T [8].
CONCLUSION
In this study, a simple but effective authentication protocol for SIP-based VoIP applications is proposed. The security of the proposed scheme relies on the security of the one-way hash function. The scheme is based on a blend of public key and symmetric key encryption. In the proposed protocol there is only one round of information exchange between the UC and the PS and one round of information exchange between the PS and the RS. The important feature of the scheme is the use of a one-time key between the UC and the PS. The computational load and delay budget of the protocol are also presented to demonstrate the efficiency of the proposed scheme.
REFERENCES
1. Rosenberg, J., et al., 2002. SIP: Session Initiation Protocol. IETF RFC 3261.
2. Handley, M., et al., 1999. SIP: Session Initiation Protocol. IETF RFC 2543.
3. Franks, J., et al., 1999. HTTP Authentication: Basic and Digest Access Authentication. IETF RFC 2617.
4. Schneier, B., 1996. Applied Cryptography: Protocols, Algorithms and Source Code in C. 2nd Edn. New York, John Wiley.
5. Stallings, W., 2000. Network Security Essentials: Applications and Standards. New Jersey, Prentice Hall.
6. Rivest, R., 1992. The MD5 Message-Digest Algorithm. IETF RFC 1321.
7. Goode, B., 2002. Voice over Internet Protocol (VoIP). Proc. IEEE, 90: 9.
8. ITU-T Recommendation G.114, 1996. One-way transmission time.