Reading and Writing XML in C#

Conceptually, an XML file is very similar to a table in a database. Just like a table, an XML file is a series of records, and each record is a collection of fields grouped by subject. Consider a Microsoft Access table called NameTable. This table has five records and three fields (Name, SSN, and Birthdate). Now look at the following XML file called NameTable.xml:

    <BAND>
      <BEATLE>
        <NAME>John</NAME>
        <SSN>123456789</SSN>
        <BIRTHDATE>9/16/45</BIRTHDATE>
      </BEATLE>
      <BEATLE>
        <NAME>Ringo</NAME>
        <SSN>159487263</SSN>
        <BIRTHDATE>11/11/72</BIRTHDATE>
      </BEATLE>
      <BEATLE>
        <NAME>Paul</NAME>
        <SSN>321654987</SSN>
        <BIRTHDATE>2/20/50</BIRTHDATE>
      </BEATLE>
      <BEATLE>
        <NAME>George</NAME>
        <SSN>741258963</SSN>
        <BIRTHDATE>1/2/60</BIRTHDATE>
      </BEATLE>
      <BEATLE>
        <NAME>Pete</NAME>
        <SSN>963258741</SSN>
        <BIRTHDATE>12/22/44</BIRTHDATE>
      </BEATLE>
    </BAND>

This also has three fields (NAME, SSN, and BIRTHDATE) and five records (BEATLE). Once you see an XML file as a collection of records, each of which is a collection of fields, reading and writing XML data structures is fairly easy.

We’ll do a few things with XML, including creating an XML file with the XML authoring tool in Visual Studio 2008, reading XML files two different ways using C#, and writing an XML file using C#.

Creating an XML document using Visual Studio 2008

An XML document is really just a text file. Therefore, you could use Windows Notepad to create and modify any XML file. We’re going to do it from the XML editor inside Visual Studio. You can also write code to create XML files, but we’ll get to that later.

So first, let’s create an empty web project:

1) In Visual Studio 2008, create a new web site (File/New/Web Site…).
2) Select the “ASP.NET Web Site” template.
3) Make sure “File System” is the value in the Location drop-down box and “Visual C#” is the “Language.”
4) Look at the directory where the application will be stored.
It probably looks something like this:

    C:\Documents and Settings\David Schuff\My Documents\Visual Studio 2008\WebSites\WebSite1

Change the location to wherever you want, but the directory should end with “XMLDemo” (for example C:\VSProjects\XMLDemo). Click “OK” and the site will be created.

Now we’ll create a simple XML document:

5) From the “Website” menu, select “Add New Item…”
6) Create a new XML document called Products.xml. You’ll see an empty file with a single line of text:

    <?xml version="1.0" encoding="utf-8" ?>

7) Now we’ll create the first record. Add the following lines to the file (make sure you type the tags correctly):

    <Products>
      <Product>
        <ProductID>500</ProductID>
        <ProductName>Pucca</ProductName>
        <ProductDescription>Little fish-shaped crackers with chocolate filling</ProductDescription>
        <Supplier>Meiji</Supplier>
        <UnitPrice>1.99</UnitPrice>
        <Inventory>1000</Inventory>
      </Product>
    </Products>

8) Enter the remaining four records by repeating the <Product> structure four more times inside <Products>, changing the data values each time.
Fill in the remaining four data records as follows:

    ProductID  Product Name               Product Description              Supplier               UnitPrice  Inventory
    501        Fran                       Chocolate-covered cookie sticks  Meiji                  2.19       1000
    502        Pocky                      Chocolate-covered cookie sticks  Glico                  1.49       1000
    503        Pocari Sweat               Sports drink                     Otsuka Pharmaceutical  2.19       1000
    504        BOSS Coffee (Super Blend)  Coffee in an aluminum can        Suntory                0.99       1000

The complete XML document should look like this:

    <?xml version="1.0" encoding="utf-8"?>
    <Products>
      <Product>
        <ProductID>500</ProductID>
        <ProductName>Pucca</ProductName>
        <ProductDescription>Little fish-shaped crackers with chocolate filling</ProductDescription>
        <Supplier>Meiji</Supplier>
        <UnitPrice>1.99</UnitPrice>
        <Inventory>1000</Inventory>
      </Product>
      <Product>
        <ProductID>501</ProductID>
        <ProductName>Fran</ProductName>
        <ProductDescription>Chocolate-covered cookie sticks</ProductDescription>
        <Supplier>Meiji</Supplier>
        <UnitPrice>2.19</UnitPrice>
        <Inventory>1000</Inventory>
      </Product>
      <Product>
        <ProductID>502</ProductID>
        <ProductName>Pocky</ProductName>
        <ProductDescription>Chocolate-covered cookie sticks</ProductDescription>
        <Supplier>Glico</Supplier>
        <UnitPrice>1.49</UnitPrice>
        <Inventory>1000</Inventory>
      </Product>
      <Product>
        <ProductID>503</ProductID>
        <ProductName>Pocari Sweat</ProductName>
        <ProductDescription>Sports drink</ProductDescription>
        <Supplier>Otsuka Pharmaceutical</Supplier>
        <UnitPrice>2.19</UnitPrice>
        <Inventory>1000</Inventory>
      </Product>
      <Product>
        <ProductID>504</ProductID>
        <ProductName>BOSS Coffee (Super Blend)</ProductName>
        <ProductDescription>Coffee in an aluminum can</ProductDescription>
        <Supplier>Suntory</Supplier>
        <UnitPrice>0.99</UnitPrice>
        <Inventory>1000</Inventory>
      </Product>
    </Products>

9) Make sure you save the file.

Create a Product class (Product.cs)

We will need a data structure to hold the product information as we read it in from the XML file. To do that we’ll create a Product class.
Follow these steps:

1) Select “Add New Item…” from the Website menu.
2) Select the “Class” template and call the file Product.cs.
3) Click “Add.” If Visual Studio asks to place the file in the App_Code folder, click “Yes.”
4) Add the code that follows to the Product.cs file. The assumption is that you are already familiar with how to build a class file, so we won’t cover how this works.

    using System;
    using System.Data;
    using System.Configuration;
    using System.Web;
    using System.Web.Security;
    using System.Web.UI;
    using System.Web.UI.WebControls;
    using System.Web.UI.WebControls.WebParts;
    using System.Web.UI.HtmlControls;

    /// <summary>
    /// Holds one product record read from Products.xml.
    /// </summary>
    public class Product
    {
        // Backing fields are private; access goes through the properties below.
        private int productID;
        private string productName;
        private string productDescription;
        private string supplier;
        private double unitPrice;
        private int inventory;

        public Product()
        {
        }

        public int ProductID
        {
            get { return productID; }
            set { productID = value; }
        }

        public string ProductName
        {
            get { return productName; }
            set { productName = value; }
        }

        public string ProductDescription
        {
            get { return productDescription; }
            set { productDescription = value; }
        }

        public string Supplier
        {
            get { return supplier; }
            set { supplier = value; }
        }

        public double UnitPrice
        {
            get { return unitPrice; }
            set { unitPrice = value; }
        }

        public int Inventory
        {
            get { return inventory; }
            set { inventory = value; }
        }

        // Returns an HTML-formatted summary, one property per line.
        public override string ToString()
        {
            return "Product:\n" +
                "ID: " + this.ProductID + "<BR>" +
                "Name: " + this.ProductName + "<BR>" +
                "Description: " + this.ProductDescription + "<BR>" +
                "Supplier: " + this.Supplier + "<BR>" +
                "Unit Price: " + this.UnitPrice + "<BR>" +
                "Inventory: " + this.Inventory + "<BR>";
        }
    }

Reading an XML file, Method 1: Using only the XmlReader

Now we’re going to create a method that reads the Products.xml file, places that information into a new instance of the Product class, adds that new object to an ArrayList, and
outputs the string representation of the instance to a Label.

First, set up the web form by switching to the design view of Default.aspx and dragging a Label to the page. Then resize the Label to make it 400 by 175 pixels. Don’t change the name of the Label; leave it as Label1.

Next, switch to the code view of that page (Default.aspx.cs) and make two class libraries visible to your program by adding these two using statements at the top of the file:

    using System.Collections;
    using System.Xml;

System.Collections allows you to use the ArrayList class, and System.Xml allows you to use the XML-related classes.

Then add the following line above (outside) the Page_Load() method:

    ArrayList products = new ArrayList();

This is where we’ll store the Products we read from the XML file. Now we’ll add the method to read the XML file:

    public void readUsingJustReader()
    {
        string path = AppDomain.CurrentDomain.BaseDirectory + "Products.xml";

        XmlReaderSettings settings = new XmlReaderSettings();
        settings.IgnoreWhitespace = true;
        settings.IgnoreComments = true;

        XmlReader xmlIn = XmlReader.Create(path, settings);

        // Move to the first <Product> element.
        xmlIn.ReadToDescendant("Product");

        string output = "";
        string pid, prd, dsc, sup, prc, inv = "";

        do
        {
            // Step into <Product>, then read its six child elements in order.
            xmlIn.ReadStartElement("Product");
            pid = xmlIn.ReadElementContentAsString();
            prd = xmlIn.ReadElementContentAsString();
            dsc = xmlIn.ReadElementContentAsString();
            sup = xmlIn.ReadElementContentAsString();
            prc = xmlIn.ReadElementContentAsString();
            inv = xmlIn.ReadElementContentAsString();

            Product p = new Product();
            p.ProductID = Convert.ToInt16(pid);
            p.ProductName = prd;
            p.ProductDescription = dsc;
            p.Supplier = sup;
            p.UnitPrice = Convert.ToDouble(prc);
            p.Inventory = Convert.ToInt16(inv);

            products.Add(p);
            output = output + p.ToString();
        } while (xmlIn.ReadToFollowing("Product"));

        Label1.Text = output;
        xmlIn.Close();
    }

So let’s go through the method step by step.
The first major thing we’re doing is creating an instance of the XmlReader class; that’s what will allow us to read the XML file. The statement to do this is:

    XmlReader xmlIn = XmlReader.Create(path, settings);

The Create() method of the XmlReader class takes two parameters. The first is a string value that contains the path to the XML file. The second parameter is an XmlReaderSettings object. You can see the path variable defined in the first line of the method:

    string path = AppDomain.CurrentDomain.BaseDirectory + "Products.xml";

This builds the path from the application’s base directory, so Products.xml is found wherever the site is stored. The settings variable is a little more complex:

    XmlReaderSettings settings = new XmlReaderSettings();
    settings.IgnoreWhitespace = true;
    settings.IgnoreComments = true;

This creates an XmlReaderSettings object called settings. The XmlReaderSettings class has many properties that you can set, but in this example we’re only concerned with two. First, IgnoreWhitespace tells the XmlReader to ignore empty space in the document instead of treating it as meaningful content. Second, IgnoreComments prevents the XmlReader from confusing XML comments with meaningful data. There aren’t any comments in Products.xml, but with this setting you won’t have to worry if you work with a future version of Products.xml that has them.

Now that the XmlReader xmlIn has been created, we will move down through the file, reading the tags sequentially. This next line will move the XmlReader to the first Product:

    xmlIn.ReadToDescendant("Product");

This reads down the XML file structure until it reaches the first <Product> tag. It’s helpful at this point to think of the XML document as a “tree,” where each Product represents a branch and its associated attributes (the child tags of that Product) represent the leaves on that branch. So, looking back at the Products.xml file, you’ll see the first branch of the tree is “Pucca” (the branches of the tree, as well as its leaves, are read sequentially from top to bottom).
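As a minimal sketch of the settings and the ReadToDescendant() call described above, the following reads XML from a string instead of a file so it can run on its own. The class name ReaderSketch and the method FirstProductName are hypothetical names introduced only for this illustration; they are not part of the tutorial's project.

```csharp
using System;
using System.IO;
using System.Xml;

// Hypothetical helper for illustration; not part of the tutorial's project.
public static class ReaderSketch
{
    // Returns the <ProductName> of the first <Product> in the XML text,
    // or null when the document contains no <Product> element.
    public static string FirstProductName(string xml)
    {
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.IgnoreWhitespace = true;   // skip indentation and line breaks
        settings.IgnoreComments = true;     // skip <!-- ... --> nodes

        using (XmlReader reader = XmlReader.Create(new StringReader(xml), settings))
        {
            // Walk down from the start of the document to the first <Product>.
            if (!reader.ReadToDescendant("Product"))
            {
                return null;
            }

            reader.ReadStartElement("Product");          // now positioned on <ProductID>
            reader.ReadElementContentAsString();         // read past <ProductID>
            return reader.ReadElementContentAsString();  // content of <ProductName>
        }
    }
}
```

Note that even in this small case the reader has to pass over <ProductID> before it can reach <ProductName>, because it only moves forward through the document.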
[Figure: Products.xml as a Tree. Root: <Products>. Each Branch: <Product> has the leaves <ProductID>, <ProductName>, <ProductDescription>, <Supplier>, <UnitPrice>, and <Inventory>.]

Now we’ll initialize the variables that will hold the data we retrieve from the XML document:

    string output = "";
    string pid, prd, dsc, sup, prc, inv = "";

The next piece of the method is a do/while loop. The purpose of the loop is to find every Product tag (“branch”) and then retrieve the attributes (“leaves”) belonging to that Product.

    do
    {
        xmlIn.ReadStartElement("Product");
        pid = xmlIn.ReadElementContentAsString();
        prd = xmlIn.ReadElementContentAsString();
        dsc = xmlIn.ReadElementContentAsString();
        sup = xmlIn.ReadElementContentAsString();
        prc = xmlIn.ReadElementContentAsString();
        inv = xmlIn.ReadElementContentAsString();

        Product p = new Product();
        p.ProductID = Convert.ToInt16(pid);
        p.ProductName = prd;
        p.ProductDescription = dsc;
        p.Supplier = sup;
        p.UnitPrice = Convert.ToDouble(prc);
        p.Inventory = Convert.ToInt16(inv);

        products.Add(p);
        output = output + p.ToString();
    } while (xmlIn.ReadToFollowing("Product"));

The tags are also called “elements,” so xmlIn.ReadStartElement("Product"); reads the “Product” tag. Then the next six tags (the attributes of the Product) are read sequentially using the ReadElementContentAsString() method. Now that we have the entire set of data for a Product, we create an instance of the Product class called p and then use the data to populate the properties of that new object.
For non-string properties (such as ProductID, which is an integer), we’ll have to convert the data type, like this:

    p.ProductID = Convert.ToInt16(pid);

Once all six properties are filled, the new object p is added to the products ArrayList:

    products.Add(p);

And we concatenate the string representation of p to the output string:

    output = output + p.ToString();

The while condition uses the ReadToFollowing() method. This simply tells the XmlReader xmlIn to try to find the next “Product” tag. If it can’t, the method returns false and the loop ends. Otherwise, the next Product is read from the XML file, a new Product object is created, and it is added to the products ArrayList.

When the loop ends, we take the entire output string, display it through the Text property of Label1, and close the XmlReader:

    Label1.Text = output;
    xmlIn.Close();

To test this out, you’ll need to invoke this method when the page loads. Add this line to the Page_Load() method:

    readUsingJustReader();

Then build and execute the application. The output should look something like this:

[Figure: browser output listing the five products]

Reading an XML file, Method 2: Using the XmlReader, XmlDocument, and XmlNode classes

The previous method of reading an XML file will usually work fine, but it requires you to read sequentially through every attribute in the file. For example, if we wanted to look only at the Inventory attribute for a product, we would still need to read the ProductID, ProductName, ProductDescription, Supplier, and UnitPrice first. This is because the reader doesn’t really know anything about the structure of the file and sort of “wanders” through the tags one item at a time.

As an alternative, you can load the entire structure of the XML file into the computer’s memory and then use the tags to find exactly what you’re looking for. To do this, you’ll need the XmlDocument and XmlNode classes.
Going back to our tree analogy from before, the XmlDocument is like the entire tree, and the XmlNodes are the branches and leaves. You can have nodes (leaves) within other nodes (branches).

To see this work, add the following new method to your application:

    public void readUsingNodes()
    {
        string path = AppDomain.CurrentDomain.BaseDirectory + "Products.xml";

        XmlReaderSettings settings = new XmlReaderSettings();
        settings.IgnoreWhitespace = true;
        settings.IgnoreComments = true;

        XmlReader xmlIn = XmlReader.Create(path, settings);

        // Load the whole file into memory as a tree of nodes.
        XmlDocument xd = new XmlDocument();
        xd.Load(xmlIn);

        XmlNodeList xnl = xd.GetElementsByTagName("Product");
        int nodeCount = xnl.Count;

        string output = "";
        string pid, prd, dsc, sup, prc, inv = "";
        XmlNode currentNode = null;

        for (int i = 0; i < nodeCount; i++)
        {
            currentNode = xnl.Item(i);

            // Look up each child element by name.
            pid = currentNode.SelectSingleNode("ProductID").InnerText;
            prd = currentNode.SelectSingleNode("ProductName").InnerText;
            dsc = currentNode.SelectSingleNode("ProductDescription").InnerText;
            sup = currentNode.SelectSingleNode("Supplier").InnerText;
            prc = currentNode.SelectSingleNode("UnitPrice").InnerText;
            inv = currentNode.SelectSingleNode("Inventory").InnerText;

            Product p = new Product();
            p.ProductID = Convert.ToInt16(pid);
            p.ProductName = prd;
            p.ProductDescription = dsc;
            p.Supplier = sup;
            p.UnitPrice = Convert.ToDouble(prc);
            p.Inventory = Convert.ToInt16(inv);

            products.Add(p);
            output = output + p.ToString();
        }

        xmlIn.Close();
        Label1.Text = output;
    }

You’ll notice that we set up the XmlReader just like before. However, this time we also create an instance of the XmlDocument class, which will hold all the data contained within Products.xml.
The data is moved from the XML file to the XmlDocument object xd using the Load() method:

    XmlDocument xd = new XmlDocument();
    xd.Load(xmlIn);

The Load() method takes the file that is represented by the XmlReader (xmlIn) and puts the data into the XmlDocument (xd) as a tree of nodes. We then use GetElementsByTagName() to collect every <Product> node into an XmlNodeList, so in this list each “node” is a product:

    XmlNodeList xnl = xd.GetElementsByTagName("Product");

We then retrieve the number of nodes (products) in the XML file. We’ll use this later in the loop.

    int nodeCount = xnl.Count;

Now we’re ready to loop through the nodes, collecting the attribute information for each product. There are two important differences between this loop and the previous version. First, we retrieve the nodes by index; we don’t have to retrieve them sequentially (although that’s how we’re doing it here via the loop):

    currentNode = xnl.Item(i);

An XmlNodeList basically works like an ArrayList. We’re using the Item() method of the XmlNodeList class (xnl) to retrieve the XmlNode at position i.

Second, we can retrieve specific attribute nodes from the current product node (currentNode) by name, in any order. Here’s an example:

    pid = currentNode.SelectSingleNode("ProductID").InnerText;

We’re using the SelectSingleNode() method to directly retrieve the piece of information with a specific attribute name. Look back at the XML document Products.xml and you’ll see a <ProductID> tag. The InnerText property of the XmlNode contains the data within the tag, and returns that information as a string.

Once we’ve retrieved all of the data attributes for that Product, we create a new Product object just like we did before. Then we populate the properties of the new object with the attribute values (converting data types where necessary). Finally, we add the new object to the products ArrayList and add to the output string.
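To illustrate the point about looking up a single attribute directly, here is a small sketch that pulls only the Inventory value for each product without touching the other five tags. It loads XML from a string with LoadXml() so it is self-contained; the class name NodeSketch, the method ReadInventories, and the use of -1 for a missing <Inventory> tag are assumptions made for this example, not part of the tutorial's code.

```csharp
using System;
using System.Globalization;
using System.Xml;

// Hypothetical helper for illustration; not part of the tutorial's project.
public static class NodeSketch
{
    // Returns the <Inventory> value of every <Product>, in document order.
    public static int[] ReadInventories(string xml)
    {
        XmlDocument doc = new XmlDocument();
        doc.LoadXml(xml);   // parse the whole document into memory

        XmlNodeList productNodes = doc.GetElementsByTagName("Product");
        int[] result = new int[productNodes.Count];

        for (int i = 0; i < productNodes.Count; i++)
        {
            // Jump straight to the child named "Inventory"; no other tags are read.
            XmlNode inventory = productNodes[i].SelectSingleNode("Inventory");

            // Assumption for this sketch: -1 marks a product with no <Inventory> tag.
            result[i] = inventory == null
                ? -1
                : int.Parse(inventory.InnerText, CultureInfo.InvariantCulture);
        }

        return result;
    }
}
```

SelectSingleNode() returns null when no matching child exists, which is why the sketch checks for null before reading InnerText. The tutorial's method skips that check because every product in Products.xml has all six tags.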
To make sure this works, change the single line of the Page_Load() method to call the new method:

    readUsingNodes();

Then rebuild and execute the application. The output should again look like the output from Method 1.

Writing an XML File using the XmlWriter class

Just as you can use the classes in System.Xml to read the data from an XML file, you can also use those classes to write an XML document. Basically, it’s just creating a plain text file, but we’ll use classes and methods that automatically create the tags for us and make sure that the XML document is “well-formed.”

The method we’re going to create will again use the products ArrayList, but this time it will read each element in the list and write it to a new XML file. Of course, for this to work there needs to already be something in products. So when we call this method, we’ll have to do it after calling the readUsingNodes() method or the readUsingJustReader() method.

So first, add the following method to the application:

    public void writeXML()
    {
        string path = AppDomain.CurrentDomain.BaseDirectory + "output.xml";

        XmlWriterSettings settings = new XmlWriterSettings();
        settings.Indent = true;
        settings.IndentChars = " ";

        XmlWriter xmlOut = XmlWriter.Create(path, settings);

        // Root element.
        xmlOut.WriteStartElement("Products");

        for (int i = 0; i < products.Count; i++)
        {
            Product p = (Product)products[i];

            xmlOut.WriteStartElement("Product");
            xmlOut.WriteElementString("ProductID", p.ProductID.ToString());
            xmlOut.WriteElementString("ProductName", p.ProductName);
            xmlOut.WriteElementString("ProductDescription", p.ProductDescription);
            xmlOut.WriteElementString("Supplier", p.Supplier);
            xmlOut.WriteElementString("UnitPrice", p.UnitPrice.ToString());
            xmlOut.WriteElementString("Inventory", p.Inventory.ToString());
            xmlOut.WriteEndElement();   // </Product>
        }

        xmlOut.WriteEndElement();       // </Products>
        xmlOut.Close();
    }

Like the XmlReader, the XmlWriter is created using the Create() method.
The method takes two parameters: the path to the file (a string), and an XmlWriterSettings object. The XmlWriterSettings class has different properties than XmlReaderSettings; here we’re interested in two of them: Indent, which indents lines according to their place in the XML tree, and IndentChars, which contains the string to use to create the indentation.

    XmlWriter xmlOut = XmlWriter.Create(path, settings);

Once the xmlOut object is created, we start writing the document by creating the root element <Products>. There is no data associated directly with the <Products> tag (just other tags), so we use the WriteStartElement() method:

    xmlOut.WriteStartElement("Products");

This just writes out <Products> to the XML output file. Next, we loop through the ArrayList products. The index of the loop is used to retrieve each element in the ArrayList:

    Product p = (Product)products[i];

For each individual product, the set of product attributes is surrounded by <Product> tags. So we need to write the starting tag for the current product (again, the <Product> tag doesn’t have a corresponding data value, so we use the WriteStartElement() method):

    xmlOut.WriteStartElement("Product");

This writes the <Product> tag. The XML document now looks like this:

    <Products>
     <Product>

Now we can write the tags associated with the attributes of the current product. To do this, we use the WriteElementString() method. The first parameter is the tag itself (a string), and the second parameter is the data associated with the tag (a string).
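The pairing of WriteStartElement(), WriteElementString(), and WriteEndElement() can be tried in isolation with a sketch like the one below. It writes to a StringBuilder rather than a file and sets OmitXmlDeclaration so that only the tags appear in the result; both choices, along with the names WriterSketch and WriteOneProduct, are assumptions made for this illustration and differ from the tutorial's writeXML() method.

```csharp
using System;
using System.Text;
using System.Xml;

// Hypothetical helper for illustration; not part of the tutorial's project.
public static class WriterSketch
{
    // Builds a <Products> document holding a single <Product> with two fields.
    public static string WriteOneProduct(int id, string name)
    {
        StringBuilder sb = new StringBuilder();

        XmlWriterSettings settings = new XmlWriterSettings();
        settings.OmitXmlDeclaration = true;  // leave out <?xml ...?> to keep the result short

        using (XmlWriter writer = XmlWriter.Create(sb, settings))
        {
            writer.WriteStartElement("Products");                  // <Products>
            writer.WriteStartElement("Product");                   // <Product>
            writer.WriteElementString("ProductID", id.ToString()); // opening tag, data, closing tag
            writer.WriteElementString("ProductName", name);
            writer.WriteEndElement();                              // </Product>
            writer.WriteEndElement();                              // </Products>
        }

        return sb.ToString();
    }
}
```

One benefit of letting the writer produce the tags is that it escapes special characters in the data for you: a name containing & is written as &amp;, so the document stays well-formed.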
In writeXML(), the six data tags are written like this:

    xmlOut.WriteElementString("ProductID", p.ProductID.ToString());
    xmlOut.WriteElementString("ProductName", p.ProductName);
    xmlOut.WriteElementString("ProductDescription", p.ProductDescription);
    xmlOut.WriteElementString("Supplier", p.Supplier);
    xmlOut.WriteElementString("UnitPrice", p.UnitPrice.ToString());
    xmlOut.WriteElementString("Inventory", p.Inventory.ToString());

Now the XML file looks like this:

    <Products>
     <Product>
      <ProductID>500</ProductID>
      <ProductName>Pucca</ProductName>
      <ProductDescription>Little fish-shaped crackers with chocolate filling</ProductDescription>
      <Supplier>Meiji</Supplier>
      <UnitPrice>1.99</UnitPrice>
      <Inventory>1000</Inventory>

To add the closing </Product> tag, we use the WriteEndElement() method:

    xmlOut.WriteEndElement();

And when the loop is completed, we use a final WriteEndElement() call to write the closing </Products> tag.

To test the method, make sure you’ve put the writeXML() method call as the second line in the Page_Load() method, after either the readUsingNodes() method or the readUsingJustReader() method. Build and execute the application, and then once you see the browser output, return to Visual Studio and refresh the Solution Explorer by right-clicking on the root of the project and selecting “Refresh Folder.” You should see the output.xml file. Double-click on that file, and you should see the XML, which should look just like the original Products.xml file.

Problem 1:

Using the Visual Studio XML tools (see “Creating an XML document using Visual Studio 2008”), create a new XML document called Customers.xml with the following structure:

    <Customers>
      <Customer>
        <FirstName>data</FirstName>
        <LastName>data</LastName>
        <Street>data</Street>
        <City>data</City>
        <State>data</State>
        <Zip>data</Zip>
      </Customer>
    </Customers>

The Customers.xml file should have five data records (use any data values you want).
Then modify the readUsingNodes(), readUsingJustReader(), and writeXML() methods so that they read and write this data structure. You should also create a Customer class (Customer.cs) that will hold the data from the XML file.