eXtensible Markup Language (XML)

Extensible Markup Language, or XML for short, was developed by the
SGML¹ Editorial Board of the World Wide Web Consortium (W3C). The
initial XML draft was presented in 1996 at a conference in Boston, while
the official W3C specification (XML 1.0) was presented in 1998 by the
headquarters of the World Wide Web Consortium at the Massachusetts
Institute of Technology. XML is a technology for web applications that
simplifies business-to-business transactions on the web and lets users
create their own tags. [1]
XML is called extensible because it is not a fixed-format language; it is
a language for describing other languages, which lets users design their
own customized markup languages for an unlimited variety of document
types. [2]
C, C++, Pascal, Java and many others are programming languages, in
which users specify calculations, actions, and decisions to be carried out
in order. XML differs from those languages: it says nothing about what to
do with the data, and is instead used to design ways of describing
information (text or data) for storage, transmission, or processing by a
program. Moreover, any programming language can be used to output
data from any source in XML format; Java appears to be the most
popular one at the moment. [2]
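To make the last point concrete, here is a minimal sketch of a program emitting XML, written in Python with the standard-library xml.etree.ElementTree module. The helper name users_to_xml and the input records are assumptions made for illustration; the element names are borrowed from the sample document shown later in this text.

```python
import xml.etree.ElementTree as ET


def users_to_xml(users):
    """Serialize (id, name) pairs as an XML string (hypothetical helper)."""
    root = ET.Element("Users")
    for user_id, name in users:
        # Attribute values are always text, so the numeric id is converted.
        user = ET.SubElement(root, "User", id=str(user_id))
        ET.SubElement(user, "UserName").text = name
    # encoding="unicode" returns a str without an XML declaration.
    return ET.tostring(root, encoding="unicode")


print(users_to_xml([(15, "marouf")]))
# prints <Users><User id="15"><UserName>marouf</UserName></User></Users>
```

The program decides what data to write; the XML output only describes it, and any other language with an XML library could produce the same text.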
An XML document is a database only in the strictest sense of the term.
That is, it is a collection of data, which makes it no different from any
other file. As a "database" format, XML has some advantages: it
describes the structure and type names of the data (but not the
semantics), it is portable, and it can describe data in tree or graph
structures. On the other hand, it has some disadvantages; for example,
data access is slow because of parsing and text conversion. XML
documents are suitable for use as a database in environments with
small amounts of data, few users, and modest performance requirements.
In other words, XML is not suitable in environments with many users,
strict data integrity requirements, and high performance requirements. [3]
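The cost of parsing and text conversion can be seen in a small sketch. The Python code below treats a short XML string as a "database"; the second user record and the function name find_user_name are hypothetical, added only for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-record "database" held as XML text.
DOCUMENT = """<Users>
  <User id="15"><UserName>marouf</UserName></User>
  <User id="16"><UserName>guest</UserName></User>
</Users>"""


def find_user_name(xml_text, user_id):
    """Return the UserName for the given numeric id, or None if absent."""
    root = ET.fromstring(xml_text)  # the whole text is parsed on every lookup
    for user in root.iter("User"):
        # Attribute values are stored as text and must be converted to compare.
        if int(user.get("id")) == user_id:
            return user.findtext("UserName")
    return None


print(find_user_name(DOCUMENT, 15))
```

Every lookup re-parses the text and scans the elements in order, which is acceptable for a handful of records but not for the large, multi-user workloads mentioned above.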
¹ Indexed list of topics on XML can be found at:
http://www.idealliance.org/papers/dx_xmle03/index/keyword/
Why XML
Using XML has many advantages: it taps the potential of the World Wide
Web and other technologies for disseminating (distributing) information
accurately, quickly, and independently of specific software applications or
hardware platforms. Other advantages of using XML include:
• Reusing content and modularity.
• Sharing information across the enterprise.
• Reviewing and translating large documents.
• Automating tasks.
• Increasing accuracy.
• Increasing timeliness.
A conceptual view of XML
An XML document generally consists of two parts, header and content. The
header, which is an XML declaration, gives the XML application information
such as how to handle the document.
The content, which is the XML data itself, consists of three parts:

Root element:
The root element for an XML document is the highest-level element in that
document, which surrounds all the other document tags. The root element
must be the first opening tag and the last closing tag in the document.

Elements nodes:
Each element node is labelled with a name (often called the element type)
and has a set of attributes, each consisting of a name and a value. Each of
these element nodes can have child and descendant nodes.

Character data:
Character data are the leaf nodes that contain the actual data (text
strings). Usually, a character data node must be non-empty and non-adjacent
to other character data nodes. The XML document can be presented as follows:
The header:
<?xml version="1.0" encoding="iso-8859-1"?>

The content:
<Users>
  <User id="15">
    <UserName> marouf</UserName>
    <PassWord> marouf</PassWord>
  </User>
  …
</Users>
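The three parts of the content can be seen by walking the parsed tree. The following is a minimal sketch in Python using the standard-library xml.etree.ElementTree module; the function name describe is an assumption, and the sample below omits the elided users and trims the spaces around the text values.

```python
import xml.etree.ElementTree as ET

# Header (XML declaration) followed by the content, given as bytes so the
# declared encoding is honoured by the parser.
SAMPLE = b"""<?xml version="1.0" encoding="iso-8859-1"?>
<Users>
  <User id="15">
    <UserName>marouf</UserName>
    <PassWord>marouf</PassWord>
  </User>
</Users>"""


def describe(element, depth=0):
    """Return one line per element: name, attributes, and character data."""
    text = (element.text or "").strip()  # drop indentation-only white space
    lines = [f"{'  ' * depth}{element.tag} attrs={element.attrib} text={text!r}"]
    for child in element:
        lines.extend(describe(child, depth + 1))
    return lines


root = ET.fromstring(SAMPLE)  # root is the highest-level element, Users
print("\n".join(describe(root)))
```

Users is the root element, User is an element node carrying the id attribute, and the text inside UserName and PassWord is the character data at the leaves.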
An important point that I have to mention is the XML namespace (xmlns), which is the
cheapest way of getting a unique name. Both elements and attributes can be qualified by
a namespace; a Uniform Resource Identifier (URI) provides this qualification, so that an
element is identified not only by its name but also by the URI. Namespace URIs are not
required to point at anything; the namespace is usually a URL because URLs are unique.
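As a sketch of how this qualification works in practice, the Python snippet below parses a document in which two elements share the local name User but belong to different namespaces. The namespace URIs, the prefixes hr and it, and the element values are hypothetical; the URIs are never fetched.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs; they only need to be unique, not resolvable.
DOC = """<report xmlns:hr="http://example.com/hr" xmlns:it="http://example.com/it">
  <hr:User>marouf</hr:User>
  <it:User>marouf-admin</it:User>
</report>"""

root = ET.fromstring(DOC)

# ElementTree expands each prefix to its URI, written as {uri}localname.
tags = [child.tag for child in root]
print(tags)

# A lookup must name the namespace as well as the local element name.
hr_user = root.find("hr:User", {"hr": "http://example.com/hr"})
print(hr_user.text)
```

The two User elements are distinct because their names are compared together with the URI, not by the local name alone.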
A concrete view of XML
An XML document is (Unicode) text with markup tags. Moreover, it is well
formed: start and end tags are matched (i.e. every start tag must have a matching
end tag), element tags are properly nested, names are case sensitive, non-Latin
characters can be used, and white space is used both for indentation and as content.
<User id="15"> ... ... </User>

<User id="15">   an element start tag with name "User"
id="15"          an attribute with name "id" and value "15"
... ...          the contents of the element (could be text or elements)
</User>          a matching element end tag
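The well-formedness rules above can be checked mechanically, since a conforming parser rejects any document that breaks them. Below is a minimal sketch in Python with the standard-library xml.etree.ElementTree module; the helper name is_well_formed is an assumption for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if the parser accepts xml_text, False on a parse error."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed('<User id="15">marouf</User>'))  # True: tags match
print(is_well_formed('<User id="15">marouf</user>'))  # False: names are case sensitive
print(is_well_formed('<a><b></a></b>'))               # False: tags not properly nested
```

This checks well-formedness only; it does not validate the document against any schema or DTD.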