Contents

1. Introduction
2. Digital Signatures
3. XML Signature Fundamentals
4. Types of XML Signatures
5. XML Signature Processing
6. XML Signature Elements
7. XML Canonicalization
8. Algorithms
9. Conclusion
10. References

1. Introduction

Security in web applications is a major concern for today's businesses. Developers of business applications try to offer solutions that provide secure business transactions for the companies deploying them. Likewise, companies look for applications that secure their transactions not only in the business-to-customer flow but also in the business-to-business direction. XML Signature is a developing technology that provides more secure web applications for businesses.

XML has emerged as the predominant language used for web applications. Among its properties are its portability across platforms and its semistructured storage model, which is well suited to interaction with current database management systems. XML is a text-based language that describes content and leaves formatting aside.

XML Encryption and XML Signature are fundamental to the next generation of emerging technologies that use these two standards as building blocks, such as WS-Security, the XML Key Management Specification (XKMS) and SAML.

2. Digital Signatures

Digital signatures serve to identify the origin of a document. Beyond that, a digital signature protects the integrity of data and can detect any changes made to it while it is en route to its recipient. Furthermore, authenticity follows from the sender's identity: a message is typically signed using the private key of the sender and verified using the sender's public key.
This prevents an attacker from forging a message while pretending to be the sender.

In a digital signature process, a message passes through a series of phases in which algorithms sign it. First, the message is hashed using a cryptographic hash function, which returns a hash value of the message. Then the hash value is signed using a signing algorithm and the sender's private key to produce a signature value. The receiver then starts the verification process: the received message is hashed with the same hash function used when signing, and the signature value is passed, along with the public key and the computed hash, to the verification algorithm. If the computed hash and the hash recovered from the signature match, the signature is valid. The signature process is explained in detail in section five.

3. XML Signature Fundamentals

XML Signature is a joint standard from the W3C and the IETF for digitally signing all of an XML document, part of an XML document or even an external object. By pointing to a resource with a Uniform Resource Locator, you can sign almost anything, from regular text to images.

A standard XML document consists of one or more elements enclosed by a root element. A schema such as a DTD or an XML Schema defines the structure of an XML document. Validating against a schema reduces errors in business-to-business communication, because applications exchange documents with an established, agreed structure. Thus, business applications communicate through XML documents based on schemas agreed upon by both parties.

An XML Signature is itself a piece of XML, with a corresponding schema that determines how it is structured. Within the XML Signature are references to the sources that will be digitally signed. Each source is indicated by a Reference element, whose URI (Uniform Resource Identifier) attribute points to an internal or external object.
A single XML document can contain multiple XML Signatures, each referring to a different object. Let's take a look at what an XML Signature looks like.

  <check>
    <PersonName>Jim Morrison</PersonName>
    <date>2004-11-01T00:00:00</date>
    <Signature xmlns="http://www.w3.org/2000/09/xmldsig#">
      <SignedInfo>
        <CanonicalizationMethod Algorithm="http://www.w3.org/TR/2001/REC-xml-c14n-20010315"/>
        <SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#rsa-sha1"/>
        <Reference URI="">
          <Transforms>
            <Transform Algorithm="http://www.w3.org/2000/09/xmldsig#enveloped-signature"/>
          </Transforms>
          <DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/>
          <DigestValue>eUPar59M28X1c1DNORnhmW0Z2Y=</DigestValue>
        </Reference>
      </SignedInfo>
      <SignatureValue>epyuHLJmyscoVMg2pZZAtZJbBHsZFUCwE4Udv+u3T
        thj6fJGH4wpw/danhTLj7fqOghdk3jfplbxsewHSVfjpeytvnd=</SignatureValue>
    </Signature>
  </check>

4. Types of XML Signatures

XML Signature allows signing an internal or an external resource. An internal resource might be an XML node, while an external resource can be a binary or non-XML file (an image or a text document), another XML document or a node within another XML document. The type of an XML Signature depends on whether the resource is internal or external. There are three types of XML Signatures: enveloping, enveloped and detached.

Enveloping Signature

An enveloping signature wraps the item that is being signed. The reference is to an XML element within the Signature element itself. The following diagram depicts this signature.

  <Signature>
    <Reference>
    <Object>
      <SignedItem id…
      :
    </Object>
  </Signature>

Enveloped Signature

The Reference element of an enveloped signature points to a parent XML element, that is, an element that contains the signature.

  <PurchaseOrder id="po1">
    <sku>125362</sku>
    <quantity>13</quantity>
    <Signature xmlns=“…
      :
      <Reference URI="#po1"/>
    </Signature>
  </PurchaseOrder>

Detached Signature

A detached signature points to an XML element or binary resource outside the Signature element's hierarchy.
The item being pointed to is neither a child nor a parent of the signature. It could be an element within the same document or a resource completely outside the current XML document.

  <TargetXMLElement>
    :
    :
  <Signature>
    <Reference>
    :
    :

An XML Signature can be enveloping, enveloped and detached all at the same time, because the Signature element can contain more than one Reference element, each of which can be enveloping, enveloped or detached.

5. XML Signature Processing

How XML Signature Works

The XML Signature standard is composed of two processes: the signing process takes place at the sender's end and the verification process at the recipient's end. Before we describe the processes, let's define a message digest. A message digest is a short representation, usually 20 bytes, of the full message. It is created by applying a hash function to the message, and it can be used as a proxy for the original message. The hash function needs to be fast because it runs on both the sending and the receiving ends of the communication. The process is as follows:

Signing Process

1. Create a message digest by hashing the entire plaintext message.
2. Encrypt the message digest using the sender's private key.
3. Send the original plaintext message and the encrypted message digest, along with the sender's public key, to any recipients.

Verification Process

1. The recipient receives the plaintext message and the encrypted message digest from the sender.
2. The recipient obtains the sender's public key. (The public key may or may not be sent with the signature.)
3. The recipient runs the original plaintext message through the same SHA-1 hash algorithm used by the signer.
4. The recipient uses the sender's public key to decrypt the message digest.
5. Finally, a bit-by-bit comparison is made between the message digest computed at the receiver's end and the one decrypted from the signature. If the two match, the signature is valid.
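The signing and verification steps above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not how a production library signs: the key pair is a hypothetical toy RSA key built from two well-known Mersenne primes, no padding scheme is applied, and the names message_digest, sign and verify are invented for this sketch. SHA-1 is used only because it is the hash this report discusses.

```python
import hashlib

# Hypothetical toy RSA key built from two Mersenne primes. It is far too
# structured to be secure and uses no padding; it only illustrates the flow.
P, Q = 2**127 - 1, 2**89 - 1
N = P * Q                           # public modulus, larger than any SHA-1 digest
E = 65537                           # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))   # private exponent (needs Python 3.8+)


def message_digest(message: bytes) -> int:
    """Signing step 1 / verification step 3: hash the whole message with SHA-1."""
    return int.from_bytes(hashlib.sha1(message).digest(), "big")


def sign(message: bytes) -> int:
    """Signing step 2: transform the digest with the sender's private key."""
    return pow(message_digest(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Verification steps 3-5: recompute the digest, recover the signed
    digest with the public key, and compare the two values."""
    return pow(signature, E, N) == message_digest(message)


message = b"Pay Jim Morrison 100 dollars"
signature = sign(message)
print(verify(message, signature))                          # True
print(verify(b"Pay Jim Morrison 900 dollars", signature))  # False
```

Changing a single character of the message changes its digest, so the comparison in the last verification step fails even though the signature itself is untouched.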
[Figure: signing and verification flow. The sender computes the digest H(M) of the message M and encrypts it with the private key Ps to form the signature E(H(M), Ps). The receiver computes its own digest H'(M), decrypts the signature with the public key P to obtain D(E(H(M), Ps), P), and accepts M as unaltered if H'(M) == H(M).]

The task of the hash function is to make it infeasible for two messages to produce the same message digest. If that could happen, an attacker could substitute a new message for the original and fool the recipient into thinking the fraudulent message is the correct one. Therefore, excellent collision avoidance is the fundamental property of hash functions used to create a message digest. Some of these hash algorithms are MD4, MD5 and SHA-1. The first two are avoided because of the weaknesses found in them, while SHA-1 is the algorithm currently used by security systems and web services security. When a message of any length less than 2**64 bits is input, SHA-1 produces a 160-bit message digest as output. The message digest is then input to the DSA (Digital Signature Algorithm), which computes the signature of the message. Signing a message digest instead of the entire message improves efficiency, since the message digest is much shorter. The verifier or receiver of the signature should obtain the same message digest when the received version of the message is used as input to SHA-1. The SHA-1 hash algorithm, which stands for Secure Hash Algorithm, is discussed in detail later.

6. XML Signature Elements

Like any regular XML document, an XML digital signature is composed of elements. The Signature element is the root element and contains four child elements: SignedInfo, SignatureValue, KeyInfo and Object, the last two of which are optional. Each of these elements contains further elements, forming the complex structure that makes up an XML Signature. We will describe the syntax as well as the function of each of these elements.

Format of a signature

XML digital signatures use a single namespace that must be declared in each document.
  xmlns:ds="http://www.w3.org/2000/09/xmldsig#"

Within an XML Signature, URIs identify resources, algorithms and semantics.

Signature Element

The top-level element is the Signature element. It contains information about what is being signed, the signature, the keys used to create the signature, and a place to store arbitrary information. An XML digital signature is represented by the Signature element, which has the following structure. (Note: "?" denotes zero or one occurrence, "+" denotes one or more occurrences, "*" denotes zero or more occurrences, and elements between parentheses are optional.)

  <Signature ID?>
    <SignedInfo>
      <CanonicalizationMethod/>
      <SignatureMethod/>
      <Reference URI?>
        (<Transforms>)?
        <DigestMethod>
        <DigestValue>
      </Reference>+
    </SignedInfo>
    <SignatureValue>
    (<KeyInfo>)?
    (<Object ID?>)*
  </Signature>

To enforce this generic structure, an XML digital signature is validated against its schema. The following is the schema for the Signature element.

  <element name="Signature" type="ds:SignatureType"/>
  <complexType name="SignatureType">
    <sequence>
      <element ref="ds:SignedInfo"/>
      <element ref="ds:SignatureValue"/>
      <element ref="ds:KeyInfo" minOccurs="0"/>
      <element ref="ds:Object" minOccurs="0" maxOccurs="unbounded"/>
    </sequence>
    <attribute name="Id" type="ID" use="optional"/>
  </complexType>

The Id attribute of the Signature element allows a document to have multiple signatures and provides a way to identify particular instances.

SignedInfo Element

This is the most complex element. It contains information about the SignatureValue element and information about the application content. It also contains the information that is actually signed.
  <element name="SignedInfo" type="ds:SignedInfoType"/>
  <complexType name="SignedInfoType">
    <sequence>
      <element ref="ds:CanonicalizationMethod"/>
      <element ref="ds:SignatureMethod"/>
      <element ref="ds:Reference" maxOccurs="unbounded"/>
    </sequence>
    <attribute name="Id" type="ID" use="optional"/>
  </complexType>

It is in this element that the canonicalization process takes place. Canonicalization, or C14N, is the process of picking one path through all the possible output options, so that sender and receiver can generate exactly the same byte sequence, no matter what intermediate XML software might be involved. The SignatureMethod element specifies what type of signature algorithm (for example DSA or RSA) is used to create the signature. Taken together, these two elements (CanonicalizationMethod and SignatureMethod) tell us how to create the digest and how to protect it from modification.

Reference Element

The Reference element is contained inside the SignedInfo element. A signature can have multiple references to objects, such as all the parts of a MIME message, or an XML file and the XSLT script that converts it to HTML. The ability of URIs to point to just about any type of resource is critical to the power and flexibility of XML Signature. The Reference element has the following schema:

  <element name="Reference" type="ds:ReferenceType"/>
  <complexType name="ReferenceType">
    <sequence>
      <element ref="ds:Transforms" minOccurs="0"/>
      <element ref="ds:DigestMethod"/>
      <element ref="ds:DigestValue"/>
    </sequence>
    <attribute name="Id" type="ID" use="optional"/>
    <attribute name="URI" type="anyURI" use="optional"/>
    <attribute name="Type" type="anyURI" use="optional"/>
  </complexType>

The URI attribute of the Reference element defines the location of the resource to be signed. The Transforms child element of the Reference element specifies how to process the data before hashing.
It gives control over the signed content by allowing you to modify the data for a reference before the hash value for that data is generated. For example, in an enveloped signature, a transform removes the Signature node from the XML document before the document is hashed.

SignatureValue Element

This element contains the actual signature, encoded in Base-64 form. Base-64 encoding is used pervasively in XML-related applications. It is a convenient, well-defined mechanism for creating a printable representation of arbitrary binary data. The following is the schema for the SignatureValue element, followed by an example.

  <element name="SignatureValue" type="ds:SignatureValueType"/>
  <complexType name="SignatureValueType">
    <simpleContent>
      <extension base="base64Binary">
        <attribute name="Id" type="ID" use="optional"/>
      </extension>
    </simpleContent>
  </complexType>

Example:

  <SignatureValue>
    WvZUJAJ/3QNqzQvwne2vvy7U5Pck8ZZ5UTa6pIwR7GE+PoGi6A1kyw==
  </SignatureValue>

Object Element

The Object element represents an item to be signed as part of the Signature element. Think of the Object element as the place to put the thing that is being signed when you have an enveloping reference. The following is its schema.

  <element name="Object" type="ds:ObjectType"/>
  <complexType name="ObjectType" mixed="true">
    <sequence minOccurs="0" maxOccurs="unbounded">
      <any namespace="##any" processContents="lax"/>
    </sequence>
    <attribute name="Id" type="ID" use="optional"/>
    <attribute name="MimeType" type="string" use="optional"/>
    <attribute name="Encoding" type="anyURI" use="optional"/>
  </complexType>

KeyInfo Element

The optional KeyInfo element identifies the key that protects the digest from being modified. It contains the specific information, such as a key name, a key value or a certificate, that a recipient uses to verify an XML Signature.
  <element name="KeyInfo" type="ds:KeyInfoType"/>
  <complexType name="KeyInfoType" mixed="true">
    <choice maxOccurs="unbounded">
      <element ref="ds:KeyName"/>
      <element ref="ds:KeyValue"/>
      <element ref="ds:RetrievalMethod"/>
      <element ref="ds:X509Data"/>
      <element ref="ds:PGPData"/>
      <element ref="ds:SPKIData"/>
      <element ref="ds:MgmtData"/>
      <any processContents="lax" namespace="##other"/>
      <!-- (1,1) elements from (0,unbounded) namespaces -->
    </choice>
    <attribute name="Id" type="ID" use="optional"/>
  </complexType>

7. XML Canonicalization

Digital signatures only work if the verification calculations are performed on exactly the same bits as the signing calculations. If the surface representation of the signed data can change between signing and verification, then some way to standardize the changeable aspect must be used before signing and verification. For example, even for simple ASCII text there are at least three widely used line ending sequences. If signed text can be converted from one line ending convention to another between the time of signing and the time of verification, then the line endings need to be canonicalized to a standard form before signing and verification, or the signatures will break.

XML is subject to surface representation changes and to processing that discards some surface information. For this reason, XML digital signatures have a provision for indicating canonicalization methods in the signature, so that a verifier can use the same canonicalization as the signer.

The kinds of changes in XML that may need to be canonicalized can be divided into four categories. First, there are those related to basic XML. Second, there are those related to DOM or SAX processing. Third, there is the possibility of coded character set conversion, such as between UTF-8 and UTF-16, both of which all XML-compliant processors are required to support. Fourth, there are changes related to namespace declarations and XML namespace attribute context.

8. Algorithms

Algorithms are identified by URIs that appear as an attribute of the element that identifies the algorithm's role. The four algorithm roles used in an XML digital signature are:

1- CanonicalizationMethod
2- SignatureMethod
3- Transform
4- DigestMethod

1-Canonicalization Algorithm

The CanonicalizationMethod identifies the algorithm that is used to canonicalize the SignedInfo element before it is digested as part of the signature operation. Canonicalization is how the process deals with the different byte streams that can represent the same data element. For instance, there can be two different ways to represent the same text, such as a space written literally or as a character code; canonicalization settles on one form, so that spaces appear as spaces and not as character codes. It is used to ensure that XML is handled consistently by different XML processors in light of white space and other variations.

Digest algorithms require content to be exactly the same to produce the same digest. Even a minor change that does not change the meaning, such as adding an extra space, will invalidate the digest. XML, on the other hand, allows some variation in the syntax of the XML text without changing the document. In other words, two XML documents may be considered the same even if they do not have exactly the same text. For example, one XML document may use single quotes for an attribute and another may use double quotes. These are the same to an XML parser, but very different to a digest algorithm. There is an entire list of such potential issues for digests. To get around this problem, a canonicalization transform may be used, one that converts any XML document to a form that follows a single set of rules, such as always using a certain type of quote for attributes. In short, canonicalization puts the signed data in a standard format that everyone uses.
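The effect of canonicalization on a digest can be illustrated with a short Python sketch. It relies on the standard library function xml.etree.ElementTree.canonicalize, which implements Canonical XML 2.0 rather than the 2001 algorithm identified by the URI used in this report, so it serves only as a stand-in for the idea. The two sample documents are hypothetical and differ only in quote style, attribute order and white space inside tags.

```python
import hashlib
from xml.etree.ElementTree import canonicalize  # available from Python 3.8

# Two hypothetical serializations of the same document.
doc_a = "<check date='2004-11-01' id=\"c1\"><PersonName>Jim Morrison</PersonName></check>"
doc_b = '<check   id="c1" date="2004-11-01" ><PersonName>Jim Morrison</PersonName></check >'


def sha1_hex(text: str) -> str:
    """Hex SHA-1 digest of the UTF-8 encoding of the text."""
    return hashlib.sha1(text.encode("utf-8")).hexdigest()


# Digests of the raw text differ, although a parser sees the same document.
print(sha1_hex(doc_a) == sha1_hex(doc_b))      # False

# After canonicalization both serializations collapse to one byte sequence,
# so their digests agree.
canon_a = canonicalize(doc_a)
canon_b = canonicalize(doc_b)
print(canon_a == canon_b)                      # True
print(sha1_hex(canon_a) == sha1_hex(canon_b))  # True
```

A signer and a verifier that both canonicalize before hashing therefore agree on the digest even if intermediate software has rewritten the quoting or spacing.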
Because the signature depends on the content it is signing, a signature produced from a noncanonicalized document could differ from one produced from a canonicalized document. An example of an XML canonicalization element is:

  <CanonicalizationMethod Algorithm="http://www.w3.org/TR/2001/REC-xml-c14n-20010315"/>

2-Signature Algorithm

The SignatureMethod is the algorithm that is used to convert the canonicalized SignedInfo into the SignatureValue. It is a combination of a digest algorithm, a key-dependent algorithm and possibly other algorithms. The SignatureMethod defines how the signature is created, and because it lies inside SignedInfo its algorithm names are themselves signed, which resists attacks based on the substitution of a weaker algorithm. To promote application interoperability, the specification defines a set of signature algorithms that are required to be implemented, though their use is at the discretion of the signature creator.

Signature algorithms take two implicit parameters: their keying material, determined from KeyInfo, and the octet stream output by CanonicalizationMethod. Signature and MAC algorithms are syntactically identical, but a signature implies public key cryptography.

An example of a DSA SignatureMethod element is:

  <SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#dsa-sha1"/>

The output of the DSA algorithm consists of a pair of integers, usually referred to as the pair (r, s). The signature value consists of the base64 encoding of the concatenation of two octet streams that respectively result from the octet-encoding of the values r and s, in that order.
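The encoding rule just described can be sketched as follows. This is a minimal illustration under stated assumptions: the function name dsa_signature_value is invented here, each integer is assumed to be written as 20 big-endian octets (matching the 160-bit values produced by DSA with SHA-1), and the sample r and s are the hexadecimal values quoted in this section's example.

```python
import base64

# Hexadecimal r and s taken from the example values in this section.
R_HEX = "8BAC1AB6 6410435C B7181F95 B16AB97C 92B341C0"
S_HEX = "41E2345F 1F56DF24 58F426D1 55B4BA2D B6DCD8C8"


def dsa_signature_value(r: int, s: int) -> str:
    """Base64 of the octets of r followed by the octets of s.

    Assumption: each integer occupies exactly 20 big-endian octets.
    """
    octets = r.to_bytes(20, "big") + s.to_bytes(20, "big")
    return base64.b64encode(octets).decode("ascii")


r = int(R_HEX.replace(" ", ""), 16)
s = int(S_HEX.replace(" ", ""), 16)
print(dsa_signature_value(r, s))
# i6watmQQQ1y3GB+VsWq5fJKzQcBB4jRfH1bfJFj0JtFVtLotttzYyA==
```

A verifier reverses the steps: it base64-decodes the SignatureValue, splits the 40 octets in half, and reads the two halves as r and s.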
For example, take a DSA signature (r, s) with the following values, specified in hexadecimal:

  r = 8BAC1AB6 6410435C B7181F95 B16AB97C 92B341C0
  s = 41E2345F 1F56DF24 58F426D1 55B4BA2D B6DCD8C8

The SignatureValue element holds the base64 encoding of these 40 octets, r followed by s:

  <SignatureValue>i6watmQQQ1y3GB+VsWq5fJKzQcBB4jRfH1bfJFj0JtFVtLotttzYyA==</SignatureValue>

3-Transform Algorithm

Transforms is an optional ordered list of processing steps that were applied to the resource's content before it was digested. This is the trail that the verifier needs to follow to reproduce the input to the digest. Transforms can include operations such as canonicalization, encoding/decoding (including compression/inflation), XSLT and XPath. XPath transforms are a little tricky because they permit the signer to derive an XML document that omits portions of the source document, pruning the XML tree, as it were. The excluded portions can therefore change without affecting the validity of the signature. If no Transforms element is present, the resource's content is digested directly.

The Reference element identifies the resource to be signed and any algorithms used to preprocess the data. If a signature contains more than one Reference element, the URI attribute may be omitted from only one of them; all the others must have a URI attribute. Each Reference element lists the transformations that produced the input to the digest operation. There may be zero or more Transform steps. If there are multiple Transforms, each one's output provides the input for the next.

The CanonicalizationMethod element names the algorithm used to canonicalize the data, that is, to structure the data in a commonly agreed way. Canonicalization can be used to do such things as applying a standard end-of-line convention, removing comments, or performing any other manipulation of the signed document that your needs require.

4-Digest Algorithm

The DigestMethod is the algorithm applied to the data, after any defined transformations have been applied, to generate the value within DigestValue.
The digest is computed over the result of the canonicalization and transform process, not over the original data. Consequently, if a change is made to the document that is transparent to these manipulations, the signature of the document will still verify. As a simple example, suppose we had created a canonicalization method that converts all text in a file to lowercase and used it to sign a document that originally contained mixed case. If we subsequently changed the original document by converting it entirely to uppercase, the modified document would still verify against the original signature.

SHA

The Secure Hash Algorithm (SHA) is a cryptographic message digest algorithm similar to the MD4 family of hash functions developed by Rivest. It differs in that it adds an additional expansion operation and an extra round, and the whole transformation was designed to accommodate the DSS block size for efficiency.

SHA was developed by the National Institute of Standards and Technology (NIST), along with the NSA, for use with the Digital Signature Standard (DSS), and is specified within the Secure Hash Standard (SHS). SHA-1 was a revision to SHA, released in 1995, which corrected an unpublished flaw in SHA. It is the most popular algorithm used in digital signatures these days. The MD5 algorithm is not widely used because recent advances in cryptanalysis have cast doubt on its strength.

SHA takes a message of less than 2**64 bits in length and produces a 20-byte message digest, which is designed so that it should be computationally expensive to find a text that matches a given hash. If you have a hash for document A, H(A), it is difficult to find a document B that has the same hash, and even more difficult to arrange that document B says what you want it to say.

The following are some examples of SHA-1 digests:

  SHA-1("The quick brown fox jumps over the lazy dog") = 2fd4e1c67a2d28fced849ee1bb76e7391b93eb12

Even a small change in the message will result in a completely different hash, e.g. changing d to c:

  SHA-1("The quick brown fox jumps over the lazy cog") = de9f2c7fd25e1b3afad3e85a0bd17d9b100db4b3

The hash of a zero-length string is:

  SHA-1("") = da39a3ee5e6b4b0d3255bfef95601890afd80709

9. Conclusion

With the growing acceptance of XML technologies for documents and protocols, it is logical that security should be integrated with XML solutions. Older security technologies provide a set of core security algorithms and technologies that can be used in XML security, but the actual formats they use to implement security requirements are inappropriate for most XML security applications.

As XML becomes a vital component of the emerging electronic business infrastructure, we need trustworthy, secure XML messages to form the basis of business transactions. One key to enabling secure transactions is the concept of a digital signature, which ensures the integrity and authenticity of origin of business documents. XML Signature is an evolving standard for digital signatures that both addresses the special issues and requirements that XML presents for signing operations and uses XML syntax for capturing the result, simplifying its integration into XML applications.

10. References

Books

Atreya, Mohan et al. Digital Signatures. McGraw-Hill (2002). ISBN: 0-07-219482-0.
Eastlake, Donald E. Secure XML: The New Syntax for Signatures and Encryption. Addison-Wesley (2002). ISBN: 0-201-75605-6.

Web Sites

XML-Signature Syntax and Processing. http://www.w3.org/TR/xmldsig-core/
An Introduction to XML Digital Signatures. http://www.xml.com/pub/a/2001/08/08/xmldsig.html
Understanding XML Digital Signatures. http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnwebsrv/html/underxmldigsig.asp