Notes 6. Reading and Writing XML in C#

Reading and Writing XML in C#
Conceptually, an XML file is very similar to a table in a database. Just like a table, an
XML file is a series of collections of fields grouped by subject. Consider the following
Microsoft Access table NameTable:
This table has five records and three fields (Name, SSN, and Birthdate). Now look at the
following XML file called NameTable.xml:
<BAND>
<BEATLE>
<NAME>John</NAME>
<SSN>123456789</SSN>
<BIRTHDATE>9/16/45</BIRTHDATE>
</BEATLE>
<BEATLE>
<NAME>Ringo</NAME>
<SSN>159487263</SSN>
<BIRTHDATE>11/11/72</BIRTHDATE>
</BEATLE>
<BEATLE>
<NAME>Paul</NAME>
<SSN>321654987</SSN>
<BIRTHDATE>2/20/50</BIRTHDATE>
</BEATLE>
<BEATLE>
<NAME>George</NAME>
<SSN>741258963</SSN>
<BIRTHDATE>1/2/60</BIRTHDATE>
</BEATLE>
<BEATLE>
<NAME>Pete</NAME>
<SSN>963258741</SSN>
<BIRTHDATE>12/22/44</BIRTHDATE>
</BEATLE>
</BAND>
This also has three fields (NAME, SSN, and BIRTHDATE) and five records (BEATLE).
Once you see an XML file as a collection of records, each of which is in turn a collection
of fields, reading and writing XML data structures is fairly easy. We’ll do a few things
with XML, including creating an XML file with the XML authoring tool in Visual Studio
2008, reading XML files two different ways using C#, and writing an XML file using C#.
Creating an XML document using Visual Studio 2008
An XML document is really just a text file. Therefore, you could use Windows Notepad
to create and modify any XML file. We’re going to do it from the XML editor inside
Visual Studio. You can also write code to create XML files, but we’ll get to that later.
So first, let’s create an empty web project:
1) In Visual Studio 2008, create a new web site (File/New/Web Site…)
2) Select the “ASP.NET Web Site” template.
3) Make sure the “File System” is the value in the Location drop-down box and “Visual
C#” is the “Language.”
4) Look at the directory where the application will be stored. It probably looks
something like this:
C:\Documents and Settings\David Schuff\My Documents\Visual Studio 2008\WebSites\WebSite1
Change the location to wherever you want, but the directory should end with
“XMLDemo” (for example C:\VSProjects\XMLDemo). Click “OK” and the site
will be created.
Now we’ll create a simple XML document:
5) From the “Website” menu, select “Add New Item…”
6) Create a new XML document called Products.xml. This is the file that will hold our
product data. You’ll see a file containing a single line of text:
<?xml version="1.0" encoding="utf-8" ?>
7) Now we’ll create the first record. Add the following lines to the file (make sure you
type the tags correctly):
<Products>
<Product>
<ProductID>500</ProductID>
<ProductName>Pucca</ProductName>
<ProductDescription>Little fish-shaped crackers with chocolate
filling</ProductDescription>
<Supplier>Meiji</Supplier>
<UnitPrice>1.99</UnitPrice>
<Inventory>1000</Inventory>
</Product>
</Products>
8) Enter the remaining four records by repeating this structure four more times,
changing the data values each time.
Fill in the remaining four data records as follows:
ProductID | ProductName               | ProductDescription              | Supplier              | UnitPrice | Inventory
501       | Fran                      | Chocolate-covered cookie sticks | Meiji                 | 2.19      | 1000
502       | Pocky                     | Chocolate-covered cookie sticks | Glico                 | 1.49      | 1000
503       | Pocari Sweat              | Sports drink                    | Otsuka Pharmaceutical | 2.19      | 1000
504       | BOSS Coffee (Super Blend) | Coffee in an aluminum can       | Suntory               | 0.99      | 1000
The complete XML document should look like this:
<?xml version="1.0" encoding="utf-8"?>
<Products>
<Product>
<ProductID>500</ProductID>
<ProductName>Pucca</ProductName>
<ProductDescription>Little fish-shaped crackers with
chocolate filling</ProductDescription>
<Supplier>Meiji</Supplier>
<UnitPrice>1.99</UnitPrice>
<Inventory>1000</Inventory>
</Product>
<Product>
<ProductID>501</ProductID>
<ProductName>Fran</ProductName>
<ProductDescription>Chocolate-covered cookie sticks
</ProductDescription>
<Supplier>Meiji</Supplier>
<UnitPrice>2.19</UnitPrice>
<Inventory>1000</Inventory>
</Product>
<Product>
<ProductID>502</ProductID>
<ProductName>Pocky</ProductName>
<ProductDescription>Chocolate-covered cookie sticks
</ProductDescription>
<Supplier>Glico</Supplier>
<UnitPrice>1.49</UnitPrice>
<Inventory>1000</Inventory>
</Product>
<Product>
<ProductID>503</ProductID>
<ProductName>Pocari Sweat</ProductName>
<ProductDescription>Sports drink</ProductDescription>
<Supplier>Otsuka Pharmaceutical</Supplier>
<UnitPrice>2.19</UnitPrice>
<Inventory>1000</Inventory>
</Product>
<Product>
<ProductID>504</ProductID>
<ProductName>BOSS Coffee (Super Blend)</ProductName>
<ProductDescription>Coffee in an aluminum can
</ProductDescription>
<Supplier>Suntory</Supplier>
<UnitPrice>0.99</UnitPrice>
<Inventory>1000</Inventory>
</Product>
</Products>
9) Make sure you save the file.
Create a Product class (Product.cs)
We will need a data structure to hold the product information as we read it in from the
XML file. To do that we’ll create a Product class. To do this:
1) Select “Add New Item…” from the Website menu.
2) Select the “Class” template and call the file Product.cs.
3) Click “Add.” If Visual Studio asks to place the file in the App_Code folder, click
“Yes.”
4) Add the code that follows to the Product.cs file. The assumption is that you
already are familiar with how to build a class file, so we won’t cover how this works.
using System;
using System.Data;
using System.Configuration;
using System.Web;
using System.Web.Security;
using System.Web.UI;
using System.Web.UI.WebControls;
using System.Web.UI.WebControls.WebParts;
using System.Web.UI.HtmlControls;
/// <summary>
/// Summary description for Product
/// </summary>
public class Product
{
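// Backing fields are private; other code reads and writes them through the properties below.
private int productID;
private string productName;
private string productDescription;
private string supplier;
private double unitPrice;
private int inventory;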
public Product()
{
}
public int ProductID
{
get { return productID; }
set { productID = value; }
}
public string ProductName
{
get { return productName; }
set { productName = value; }
}
public string ProductDescription
{
get { return productDescription; }
set { productDescription = value; }
}
public string Supplier
{
get { return supplier; }
set { supplier = value; }
}
public double UnitPrice
{
get { return unitPrice; }
set { unitPrice = value; }
}
public int Inventory
{
get { return inventory; }
set { inventory = value; }
}
public override string ToString()
{
return "Product:\n" +
"ID: " + this.ProductID + "<BR>" +
"Name: " + this.ProductName + "<BR>" +
"Description: " + this.ProductDescription + "<BR>" +
"Supplier: " + this.Supplier + "<BR>" +
"Unit Price: " + this.UnitPrice + "<BR>" +
"Inventory: " + this.Inventory + "<BR>";
}
}
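As a side note, the same kind of class can be written more compactly. The sketch below is not the Product class used in the rest of these notes; CompactProduct and CompactProductDemo are hypothetical names, and the example assumes a compiler that supports auto-implemented properties (C# 3.0, which ships with Visual Studio 2008). It also shows how the properties and ToString() are used once an object exists.

```csharp
using System;

// Hypothetical, shortened variant of the Product class (illustration only).
public class CompactProduct
{
    // Auto-implemented properties: the compiler generates the private backing fields.
    public int ProductID { get; set; }
    public string ProductName { get; set; }
    public double UnitPrice { get; set; }

    public override string ToString()
    {
        return "ID: " + ProductID + "<BR>" + "Name: " + ProductName + "<BR>";
    }
}

public static class CompactProductDemo
{
    public static void Main()
    {
        // Populate the object through its properties, then print its string form.
        CompactProduct p = new CompactProduct();
        p.ProductID = 500;
        p.ProductName = "Pucca";
        p.UnitPrice = 1.99;
        Console.WriteLine(p.ToString());
    }
}
```

The longer form with explicit fields is still useful when a property needs extra logic, such as validating the value in the set accessor.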
Reading an XML file, Method 1: Using only the XmlReader
Now we’re going to create a method that reads the Products.xml file, places that
information into a new instance of the Product class, adds that new object to an
ArrayList, and outputs the string representation of the instance to a Label.
First, set up the web form by switching to the design view of Default.aspx and dragging a
Label to the page. Then resize the Label to make it 400 by 175 pixels. Don’t change the
name of the Label – leave it as Label1.
Now make the following two namespaces available to the code-behind file
(Default.aspx.cs) by adding these two using statements at the top of that file:
using System.Collections;
using System.Xml;
System.Collections allows you to use the ArrayList class, and System.Xml
allows you to use the XML-related classes.
Next, switch to the code view of that page (Default.aspx.cs) and add the following
line above (outside) the Page_Load() method:
ArrayList products = new ArrayList();
This is where we’ll store the Products we read from the XML file.
Now we’ll add the method to read the XML file:
public void readUsingJustReader()
{
string path = AppDomain.CurrentDomain.BaseDirectory +
"Products.xml";
XmlReaderSettings settings = new XmlReaderSettings();
settings.IgnoreWhitespace = true;
settings.IgnoreComments = true;
XmlReader xmlIn = XmlReader.Create(path, settings);
xmlIn.ReadToDescendant("Product");
string output = "";
string pid, prd, dsc, sup, prc, inv = "";
do
{
xmlIn.ReadStartElement("Product");
pid = xmlIn.ReadElementContentAsString();
prd = xmlIn.ReadElementContentAsString();
dsc = xmlIn.ReadElementContentAsString();
sup = xmlIn.ReadElementContentAsString();
prc = xmlIn.ReadElementContentAsString();
inv = xmlIn.ReadElementContentAsString();
Product p = new Product();
p.ProductID = Convert.ToInt16(pid);
p.ProductName = prd;
p.ProductDescription = dsc;
p.Supplier = sup;
p.UnitPrice = Convert.ToDouble(prc);
p.Inventory = Convert.ToInt16(inv);
products.Add(p);
output = output + p.ToString();
}
while (xmlIn.ReadToFollowing("Product"));
Label1.Text = output;
xmlIn.Close();
}
So let’s go through the method step by step. The first major thing we’re doing is creating
an instance of the XmlReader class – that’s what will allow us to read the XML file.
The statement to do this is:
XmlReader xmlIn = XmlReader.Create(path, settings);
The Create() method of the XmlReader class takes two parameters. The first is a
string value that contains the path to the XML file. The second parameter is an
XmlReaderSettings object.
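You can see the path variable defined in the first line of the method:
string path = AppDomain.CurrentDomain.BaseDirectory +
"Products.xml";
This builds the full path by appending the file name to the application’s base directory
(for example, c:/VSProjects/XMLDemo/Products.xml). The settings variable is a little
more complex: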
XmlReaderSettings settings = new XmlReaderSettings();
settings.IgnoreWhitespace = true;
settings.IgnoreComments = true;
This creates an XmlReaderSettings object called settings. The
XmlReaderSettings class has many properties that you can set, but in this example
we’re only concerned with two. First, IgnoreWhitespace tells the XmlReader to
skip insignificant white space (such as the line breaks and indentation between tags)
instead of reporting it as content. Second, IgnoreComments tells the XmlReader to skip
XML comments so they are not confused with meaningful data. There aren’t any in
Products.xml, but if you put this in you won’t have to worry if you work with a future
version of Products.xml that has comments.
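To see what these two settings change, here is a minimal sketch that reads a small XML string (not the real Products.xml file) twice, once with the default settings and once with both flags set, and counts the white space and comment nodes the reader reports. ReaderSettingsDemo and CountNodes are hypothetical names used only for this illustration.

```csharp
using System;
using System.IO;
using System.Xml;

public static class ReaderSettingsDemo
{
    // Counts how many nodes of the given type the reader reports for this XML text.
    public static int CountNodes(string xml, XmlReaderSettings settings, XmlNodeType type)
    {
        int count = 0;
        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read())
            {
                if (reader.NodeType == type)
                {
                    count++;
                }
            }
        }
        return count;
    }

    public static void Main()
    {
        // Sample data with line breaks, indentation, and one comment.
        string xml =
            "<Products>\n" +
            "  <!-- sample comment -->\n" +
            "  <Product>\n" +
            "    <ProductID>500</ProductID>\n" +
            "  </Product>\n" +
            "</Products>";

        XmlReaderSettings defaults = new XmlReaderSettings();

        XmlReaderSettings ignoring = new XmlReaderSettings();
        ignoring.IgnoreWhitespace = true;
        ignoring.IgnoreComments = true;

        Console.WriteLine("Default settings:  whitespace nodes = " +
            CountNodes(xml, defaults, XmlNodeType.Whitespace) +
            ", comment nodes = " + CountNodes(xml, defaults, XmlNodeType.Comment));
        Console.WriteLine("Ignoring settings: whitespace nodes = " +
            CountNodes(xml, ignoring, XmlNodeType.Whitespace) +
            ", comment nodes = " + CountNodes(xml, ignoring, XmlNodeType.Comment));
    }
}
```

This matters for the method in these notes because it calls ReadElementContentAsString() six times in a row, which only works if the reader lands on an element each time rather than on the white space between tags.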
Now that the XmlReader xmlIn has been created, we will move down through the file,
reading the tags sequentially. This next line will move the XmlReader to the first Product:
xmlIn.ReadToDescendant("Product");
This reads down the XML file structure until it reaches the first <Product> tag. It’s
helpful at this point to think of the XML document as a “tree,” where each Product
represents a branch and its associated attributes (the child elements inside each
<Product>) represent the leaves on that branch. So, looking back at the Products.xml
file, you’ll see the first branch of the tree is “Pucca” (the branches of the tree, as well
as its leaves, are read sequentially from top to bottom).
Products.xml as a Tree
Root: <Products>
    Branch: <Product> (one branch per product)
        Leaves: <ProductID>, <ProductName>, <ProductDescription>, <Supplier>,
        <UnitPrice>, <Inventory>
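The tree structure can also be made visible in code. The following is only a sketch: it reads a small in-memory XML string rather than Products.xml, and uses the Depth property of the XmlReader to indent each element name by its level in the tree. TreeOutlineDemo and Outline are hypothetical names.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class TreeOutlineDemo
{
    // Returns one line per element, indented two spaces for each level of depth.
    public static string Outline(string xml)
    {
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.IgnoreWhitespace = true;

        StringBuilder sb = new StringBuilder();
        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            while (reader.Read())
            {
                // Only start tags are listed; text and end tags are skipped.
                if (reader.NodeType == XmlNodeType.Element)
                {
                    sb.Append(new string(' ', reader.Depth * 2));
                    sb.Append(reader.Name);
                    sb.Append('\n');
                }
            }
        }
        return sb.ToString();
    }

    public static void Main()
    {
        string xml =
            "<Products><Product>" +
            "<ProductID>500</ProductID><ProductName>Pucca</ProductName>" +
            "</Product></Products>";
        Console.Write(Outline(xml));
    }
}
```

The root is at depth 0, each branch at depth 1, and each leaf at depth 2, which is the same shape as the diagram.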
Now we’ll initialize the variables that will hold the data we retrieve from the XML
document:
string output = "";
string pid, prd, dsc, sup, prc, inv = "";
The next piece of the method is a do/while loop. The purpose of the loop is to find
every Product tag (“branch”) and then retrieve the attributes (“leaves”) belonging to that
Product.
do
{
xmlIn.ReadStartElement("Product");
pid = xmlIn.ReadElementContentAsString();
prd = xmlIn.ReadElementContentAsString();
dsc = xmlIn.ReadElementContentAsString();
sup = xmlIn.ReadElementContentAsString();
prc = xmlIn.ReadElementContentAsString();
inv = xmlIn.ReadElementContentAsString();
Product p = new Product();
p.ProductID = Convert.ToInt16(pid);
p.ProductName = prd;
p.ProductDescription = dsc;
p.Supplier = sup;
p.UnitPrice = Convert.ToDouble(prc);
p.Inventory = Convert.ToInt16(inv);
products.Add(p);
output = output + p.ToString();
}
while (xmlIn.ReadToFollowing("Product"));
The tags are also called “Elements”, so
xmlIn.ReadStartElement("Product");
reads the “Product” tag. Then the next six tags (the attributes of the Product) are read
sequentially using the ReadElementContentAsString() method.
Now that we have the entire set of data for a Product, we create an instance of the Product
class called p and then use the data to populate the properties of that new object. For
non-string properties (such as ProductID, which is an integer), we’ll have to convert the
data type, like this:
p.ProductID = Convert.ToInt16(pid);
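As an alternative to reading a string and converting it afterwards, the XmlReader class also has typed read methods such as ReadElementContentAsInt() and ReadElementContentAsDouble(). The sketch below is not the approach used in these notes; it assumes a small in-memory XML string, and TypedReadDemo and its two methods are hypothetical names. One practical difference is that the typed methods parse numbers using XML rules, while Convert.ToDouble() uses the current culture of the machine.

```csharp
using System;
using System.IO;
using System.Xml;

public static class TypedReadDemo
{
    // Reads the first <ProductID> element as an int.
    public static int ReadFirstProductID(string xml)
    {
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            if (!reader.ReadToFollowing("ProductID"))
            {
                throw new InvalidOperationException("No ProductID element found.");
            }
            return reader.ReadElementContentAsInt();
        }
    }

    // Reads the first <UnitPrice> element as a double.
    public static double ReadFirstUnitPrice(string xml)
    {
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            if (!reader.ReadToFollowing("UnitPrice"))
            {
                throw new InvalidOperationException("No UnitPrice element found.");
            }
            return reader.ReadElementContentAsDouble();
        }
    }

    public static void Main()
    {
        string xml =
            "<Product><ProductID>500</ProductID><UnitPrice>1.99</UnitPrice></Product>";
        Console.WriteLine(ReadFirstProductID(xml));
        Console.WriteLine(ReadFirstUnitPrice(xml));
    }
}
```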
Once all six properties are filled, the new object p is added to the products
ArrayList:
products.Add(p);
And we concatenate the string representation of p to the output string.
output = output + p.ToString();
The while condition uses the ReadToFollowing() method. This simply tells the
XmlReader xmlIn to try to find the next “Product” tag. If it can’t, then the method
returns false and the loop ends. Otherwise, the next Product is read from the XML file,
a new Product object is created, and it is added to the products ArrayList.
When the loop ends, we take the entire output string and display it through the Text
property of Label1 and close the XmlReader.
Label1.Text = output;
xmlIn.Close();
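One thing to keep in mind is that Close() is only reached if nothing goes wrong earlier in the method. A common alternative is to wrap the reader in a using block, which closes it even when an exception is thrown. The sketch below shows that pattern together with a ReadToFollowing() loop; it works on an in-memory string, and ProductCountDemo and CountProducts are hypothetical names.

```csharp
using System;
using System.IO;
using System.Xml;

public static class ProductCountDemo
{
    // Counts the <Product> elements by jumping from one to the next.
    public static int CountProducts(string xml)
    {
        int count = 0;

        // The using block disposes (closes) the reader when the block ends,
        // whether it ends normally or because of an exception.
        using (XmlReader reader = XmlReader.Create(new StringReader(xml)))
        {
            // ReadToFollowing returns false when there are no more matching tags.
            while (reader.ReadToFollowing("Product"))
            {
                count++;
            }
        }
        return count;
    }

    public static void Main()
    {
        string xml = "<Products><Product/><Product/><Product/></Products>";
        Console.WriteLine("Products found: " + CountProducts(xml));
    }
}
```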
To test this out, you’ll need to invoke this method when the page loads. Add this line to
the Page_Load() method:
readUsingJustReader();
and then build and execute the application. The output should look something like this:
Reading an XML file, Method 2: Using the XmlReader, XmlDocument, and
XmlNode classes
The previous method of reading an XML file will usually work fine, but it requires you to
read sequentially through every attribute in the file. For example, if we wanted to look
only at the Inventory attribute for a product, we would still need to read the ProductID,
ProductName, ProductDescription, Supplier, and UnitPrice first. This is because the
XmlReader is a forward-only reader: it doesn’t keep the structure of the file in memory
and simply moves through the tags one item at a time.
As an alternative, you can load the entire structure of the XML file into the computer’s
memory and then use the tags to find exactly what you’re looking for. To do this, you’ll
need the XmlDocument and XmlNode classes. Going back to our tree analogy from
before, the XmlDocument is like the entire tree, and the XmlNodes are the branches
and leaves. You can have nodes (leaves) within other nodes (branches).
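Here is a small sketch of that idea before we get to the full method. It loads an in-memory XML string with LoadXml() (the method in these notes loads from a file instead), then walks from the root to each branch and from each branch to its leaves. DocumentTreeDemo and CountLeaves are hypothetical names.

```csharp
using System;
using System.Xml;

public static class DocumentTreeDemo
{
    // Prints every branch and leaf under the root and returns the number of leaves.
    public static int CountLeaves(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        int leaves = 0;
        // DocumentElement is the root of the tree (<Products> in our file).
        foreach (XmlNode branch in doc.DocumentElement.ChildNodes)
        {
            Console.WriteLine("Branch: " + branch.Name);
            foreach (XmlNode leaf in branch.ChildNodes)
            {
                Console.WriteLine("  Leaf: " + leaf.Name + " = " + leaf.InnerText);
                leaves++;
            }
        }
        return leaves;
    }

    public static void Main()
    {
        string xml =
            "<Products><Product>" +
            "<ProductID>500</ProductID><ProductName>Pucca</ProductName>" +
            "</Product></Products>";
        Console.WriteLine("Leaves: " + CountLeaves(xml));
    }
}
```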
To see this work, add the following new method to your application:
public void readUsingNodes()
{
string path = AppDomain.CurrentDomain.BaseDirectory +
"Products.xml";
XmlReaderSettings settings = new XmlReaderSettings();
settings.IgnoreWhitespace = true;
settings.IgnoreComments = true;
XmlReader xmlIn = XmlReader.Create(path, settings);
XmlDocument xd = new XmlDocument();
xd.Load(xmlIn);
XmlNodeList xnl = xd.GetElementsByTagName("Product");
int nodeCount = xnl.Count;
string output = "";
string pid, prd, dsc, sup, prc, inv = "";
XmlNode currentNode = null;
for (int i = 0; i < nodeCount; i++)
{
currentNode = xnl.Item(i);
pid = currentNode.SelectSingleNode("ProductID").InnerText;
prd = currentNode.SelectSingleNode("ProductName").InnerText;
dsc =
currentNode.SelectSingleNode("ProductDescription").InnerText;
sup = currentNode.SelectSingleNode("Supplier").InnerText;
prc = currentNode.SelectSingleNode("UnitPrice").InnerText;
inv = currentNode.SelectSingleNode("Inventory").InnerText;
Product p = new Product();
p.ProductID = Convert.ToInt16(pid);
p.ProductName = prd;
p.ProductDescription = dsc;
p.Supplier = sup;
p.UnitPrice = Convert.ToDouble(prc);
p.Inventory = Convert.ToInt16(inv);
products.Add(p);
output = output + p.ToString();
}
xmlIn.Close();
Label1.Text = output;
}
You’ll notice that we set up the XmlReader just like before. However, this time we also
create an instance of the XmlDocument class, which will hold all the data contained
within Products.xml. The data is moved from the XML file to the XmlDocument
object xd using the Load() method:
XmlDocument xd = new XmlDocument();
xd.Load(xmlIn);
The Load() method takes the file that is represented by the XmlReader (xmlIn) and
puts the data into the XmlDocument (xd) as a tree of nodes. We then use the
GetElementsByTagName() method to collect every <Product> node into an
XmlNodeList, so each node in that list is a product.
XmlNodeList xnl = xd.GetElementsByTagName("Product");
We then retrieve the number of nodes (products) in the XML file. We’ll use this later in
the loop.
int nodeCount = xnl.Count;
Now we’re ready to loop through the nodes, collecting the attribute information for each
product. There are two important differences between this loop and the previous version.
First, we retrieve the nodes by index – we don’t have to retrieve them sequentially
(although that’s how we’re doing it here via the loop):
currentNode = xnl.Item(i);
An XmlNodeList basically works like an ArrayList. We’re using the Item()
method of the XmlNodeList class (xnl) to retrieve the XmlNode at position i. Second,
we can retrieve specific attribute nodes from the current product node (currentNode)
by name, in any order. Here’s an example:
pid = currentNode.SelectSingleNode("ProductID").InnerText;
We’re using the SelectSingleNode() method to directly retrieve the piece of
information with a specific attribute name. The string passed to the method is an XPath
expression; here it is simply the tag name, which selects the first matching child of the
current node. Look back at the XML document Products.xml – you’ll see a <ProductID>
tag. The InnerText property of the XmlNode contains the data within the tag, and
returns that information as a string.
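Because SelectSingleNode() accepts an XPath expression, it can also jump straight to one value for one product, which is the case the sequential reader handles poorly. The sketch below is an illustration under a few assumptions: it uses an in-memory XML string, InventoryLookupDemo and FindInventory are hypothetical names, and the product ID is pasted into the XPath text, which is only reasonable for trusted values.

```csharp
using System;
using System.Xml;

public static class InventoryLookupDemo
{
    // Returns the Inventory text for the product with the given ID,
    // or null when no such product exists.
    public static string FindInventory(string xml, string productId)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);

        // XPath: the <Inventory> child of the <Product> whose <ProductID> matches.
        string xpath = "/Products/Product[ProductID='" + productId + "']/Inventory";
        XmlNode node = doc.SelectSingleNode(xpath);

        // SelectSingleNode returns null when nothing matches.
        return node == null ? null : node.InnerText;
    }

    public static void Main()
    {
        string xml =
            "<Products>" +
            "<Product><ProductID>500</ProductID><Inventory>1000</Inventory></Product>" +
            "<Product><ProductID>503</ProductID><Inventory>250</Inventory></Product>" +
            "</Products>";

        Console.WriteLine("Inventory for 503: " + FindInventory(xml, "503"));
        Console.WriteLine("Found 999? " + (FindInventory(xml, "999") != null));
    }
}
```

The null check is worth copying into real code: calling InnerText directly on the result, as the method in these notes does, throws a NullReferenceException if a tag is missing from the file.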
Once we’ve retrieved all of the data attributes for that Product, we create a new
Product object just like we did before. And then we populate the properties of the new
object with the attribute values (converting data types where necessary). Finally, we add
the new object to the products ArrayList and add to the output string.
To make sure this works, change the single line of the Page_Load() method to call the
new method:
readUsingNodes();
Then rebuild and execute the application. The output should match the output produced
by readUsingJustReader() in Method 1.
Writing an XML File using the XmlWriter class
Just as you can use the classes in System.Xml to read the data from an XML file, you
can also use those classes to write an XML document. Basically, it’s just creating a plain
text file, but we’ll use classes and methods that automatically create the tags for us and
make sure that the XML document is “well-formed.”
The method we’re going to create will again use the products ArrayList, but this
time it will read each element in the list and write it to a new XML file. Of course, for
this to work there needs to already be something in products. So when we call this
method, we’ll have to do it after calling the readUsingNodes() method or the
readUsingJustReader() method.
So first, add the following method to the application:
public void writeXML()
{
string path = AppDomain.CurrentDomain.BaseDirectory +
"output.xml";
XmlWriterSettings settings = new XmlWriterSettings();
settings.Indent = true;
settings.IndentChars = "    ";
XmlWriter xmlOut = XmlWriter.Create(path, settings);
xmlOut.WriteStartElement("Products");
for (int i = 0; i < products.Count; i++)
{
Product p = (Product)products[i];
xmlOut.WriteStartElement("Product");
xmlOut.WriteElementString("ProductID",
p.ProductID.ToString());
xmlOut.WriteElementString("ProductName", p.ProductName);
xmlOut.WriteElementString("ProductDescription",
p.ProductDescription);
xmlOut.WriteElementString("Supplier", p.Supplier);
xmlOut.WriteElementString("UnitPrice",
p.UnitPrice.ToString());
xmlOut.WriteElementString("Inventory",
p.Inventory.ToString());
xmlOut.WriteEndElement();
}
xmlOut.WriteEndElement();
xmlOut.Close();
}
Like the XmlReader, the XmlWriter is created using the Create() method. The
method takes two parameters – the path to the file (a string), and an
XmlWriterSettings object. The XmlWriterSettings class has different properties
than XmlReaderSettings, but we’re interested in two of them – Indent, which
indents lines according to their place in the XML tree, and IndentChars, which
contains the string to use to create the indentation.
XmlWriter xmlOut = XmlWriter.Create(path, settings);
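To see what Indent and IndentChars actually do, the sketch below writes the same tiny document twice, once with indentation and once without. It writes to a StringBuilder instead of a file so the result can be printed directly, and it sets OmitXmlDeclaration only to keep the output short. WriterSettingsDemo and WriteSample are hypothetical names.

```csharp
using System;
using System.Text;
using System.Xml;

public static class WriterSettingsDemo
{
    // Writes a one-product document and returns it as a string.
    public static string WriteSample(bool indent)
    {
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.Indent = indent;
        settings.IndentChars = "    ";        // four spaces per level
        settings.OmitXmlDeclaration = true;   // leave out the <?xml ...?> line

        StringBuilder sb = new StringBuilder();
        using (XmlWriter writer = XmlWriter.Create(sb, settings))
        {
            writer.WriteStartElement("Products");
            writer.WriteStartElement("Product");
            writer.WriteElementString("ProductID", "500");
            writer.WriteEndElement();   // closes <Product>
            writer.WriteEndElement();   // closes <Products>
        }
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine("Indent = false:");
        Console.WriteLine(WriteSample(false));
        Console.WriteLine();
        Console.WriteLine("Indent = true:");
        Console.WriteLine(WriteSample(true));
    }
}
```

Without indentation everything lands on one line, which is still valid XML but much harder to read in an editor.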
Once the xmlOut object is created, we start writing the document by creating the root
element <Products>. There is no data associated directly with the <Products> tag
(just other tags), so we use the WriteStartElement() method:
xmlOut.WriteStartElement("Products");
This just writes out <Products> to the XML output file. Next, we loop through the
ArrayList products. The index of the loop is used to retrieve each element in the
ArrayList:
Product p = (Product)products[i];
For each individual product, the set of product attributes is surrounded by <Product>
tags. So we need to write the starting tag for the current product (again, the <Product>
tag doesn’t have a corresponding data value, so we use the WriteStartElement()
method):
xmlOut.WriteStartElement("Product");
This writes the <Product> tag. The XML document now looks like this:
<Products>
<Product>
Now we can write the tags associated with the attributes of the current product. To do
this, we use the WriteElementString() method. The first parameter is the tag name (a
string), and the second parameter is the data associated with the tag (a string). For
example, to write the data tags, we do the following:
xmlOut.WriteElementString("ProductID", p.ProductID.ToString());
xmlOut.WriteElementString("ProductName", p.ProductName);
xmlOut.WriteElementString("ProductDescription",
p.ProductDescription);
xmlOut.WriteElementString("Supplier", p.Supplier);
xmlOut.WriteElementString("UnitPrice", p.UnitPrice.ToString());
xmlOut.WriteElementString("Inventory", p.Inventory.ToString());
Now the XML file looks like this:
<Products>
<Product>
<ProductID>500</ProductID>
<ProductName>Pucca</ProductName>
<ProductDescription> Little fish-shaped crackers with
chocolate filling</ProductDescription>
<Supplier>Meiji</Supplier>
<UnitPrice>1.99</UnitPrice>
<Inventory>1000</Inventory>
To add the closing </Product> tag, we use the WriteEndElement() method:
xmlOut.WriteEndElement();
And when the loop is completed, we use a final WriteEndElement() method to
write the closing </Products> tag.
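One benefit of letting the XmlWriter build the tags, rather than concatenating strings yourself, is that it escapes characters that would otherwise break the document. The sketch below shows this for a product name containing & and <; the product name is made up, and EscapingDemo and WriteName are hypothetical names.

```csharp
using System;
using System.Text;
using System.Xml;

public static class EscapingDemo
{
    // Writes a single <ProductName> element containing the given text.
    public static string WriteName(string productName)
    {
        XmlWriterSettings settings = new XmlWriterSettings();
        settings.OmitXmlDeclaration = true;

        StringBuilder sb = new StringBuilder();
        using (XmlWriter writer = XmlWriter.Create(sb, settings))
        {
            // Special characters in the data are escaped by the writer.
            writer.WriteElementString("ProductName", productName);
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // A made-up name with characters that are not allowed as-is in XML text.
        Console.WriteLine(WriteName("Tom & Jerry <Snacks>"));
    }
}
```

When the file is read back, InnerText and ReadElementContentAsString() return the original characters, so no extra decoding step is needed.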
To test the method, make sure you’ve put the writeXML() method call as the second
line in the Page_Load() method, after either the readUsingNodes() method or
the readUsingJustReader() method.
Build and execute the application, and then once you see the browser output, return to
Visual Studio and refresh the Solution Explorer by right-clicking on the root of the
project and selecting “Refresh Folder.” You should see the output.xml file. Double-click
on that file, and you should see the XML – and it should look just like the original
Products.xml file.
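If you would rather check the result in code than by eye, one option is to write a file and then load it back with an XmlDocument. The sketch below is only an illustration: it writes to a temporary scratch file rather than output.xml, stores just the product name for each record, and RoundTripDemo and WriteThenCount are hypothetical names.

```csharp
using System;
using System.IO;
using System.Xml;

public static class RoundTripDemo
{
    // Writes one <Product> per name to a scratch file, reloads the file,
    // and returns how many <Product> elements were found.
    public static int WriteThenCount(string[] names)
    {
        string path = Path.GetTempFileName();   // scratch file, deleted below
        try
        {
            XmlWriterSettings settings = new XmlWriterSettings();
            settings.Indent = true;

            using (XmlWriter writer = XmlWriter.Create(path, settings))
            {
                writer.WriteStartElement("Products");
                foreach (string name in names)
                {
                    writer.WriteStartElement("Product");
                    writer.WriteElementString("ProductName", name);
                    writer.WriteEndElement();
                }
                writer.WriteEndElement();
            }

            // Loading fails with an XmlException if the file is not well-formed.
            XmlDocument doc = new XmlDocument();
            doc.Load(path);
            return doc.GetElementsByTagName("Product").Count;
        }
        finally
        {
            File.Delete(path);
        }
    }

    public static void Main()
    {
        string[] names = { "Pucca", "Fran", "Pocky" };
        Console.WriteLine("Products written and read back: " + WriteThenCount(names));
    }
}
```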
Problem 1:
Using the Visual Studio XML tools (see “Creating an XML document using Visual Studio
2008”), create a new XML document called Customers.xml with the following structure:
<Customers>
<Customer>
<FirstName>data</FirstName>
<LastName>data</LastName>
<Street>data</Street>
<City>data</City>
<State>data</State>
<Zip>data</Zip>
</Customer>
</Customers>
The Customers.xml file should have five data records (use any data values you
want). Then modify the readUsingNodes(), readUsingJustReader(), and
writeXML() methods so that they read and write this data structure. You should also
create a Customer class (Customer.cs) that will hold the data from the XML file.