Upgrade your C# Skills part 5 – LINQ to XML
I am working on a project that requires a configuration file that can be edited or even replaced by the user. Looking at the requirements, what I really need is a series of “records” of the same layout, similar to a database table. Naturally, given the current state of technology, an XML file is the obvious choice. So I decided that now would be a good time to investigate LINQ to XML.
Understanding LINQ to XML
Unlike other LINQ topics, this one is not really query oriented. LINQ to XML (also known as XLINQ) is primarily about reading and writing XML. Once the XML has been read into program variables, you can use LINQ to Objects techniques to query it. Two things to note before we start.
First, the XLINQ classes are not part of the standard System.Linq namespace. Instead, you will need to include System.Xml.Linq. This namespace contains the XML classes that make XLINQ worthwhile, and we’ll review a few of these as we go along.
Second, this article will be a far cry from exhaustive on the subject: there are a ton of additional features and topics that I will not cover. Chief among the reasons for this is that I am far from an XML expert. I consider myself a typical developer where XML is concerned: my programs need to find and consume information stored in an XML format. I may even have occasion to update or write XML content. Otherwise, XML holds no particular glamor for me.
Reading an XML File
Most articles I have read on this topic begin with using XLINQ to create XML structures and files. My first task was to read an existing XML file, so that is where I will begin.
Here is the XML data I will be using for this article:
    <LayoutItems Class="">
      <LayoutItem Name="FullName">
        <Row>2</Row>
        <Column>5</Column>
      </LayoutItem>
      <LayoutItem Name="Address1">
        <Row>3</Row>
        <Column>10</Column>
      </LayoutItem>
      <LayoutItem Name="Address2">
        <Row>4</Row>
        <Column>10</Column>
      </LayoutItem>
      <LayoutItem Name="City">
        <Row>5</Row>
        <Column>10</Column>
      </LayoutItem>
      <LayoutItem Name="State">
        <Row>5</Row>
        <Column>40</Column>
      </LayoutItem>
      <LayoutItem Name="Zip">
        <Row>5</Row>
        <Column>44</Column>
      </LayoutItem>
    </LayoutItems>
This is a simple configuration file for a printing product I am writing. The root element LayoutItems has a Class attribute and contains a collection of child LayoutItem elements. Each LayoutItem has a Name attribute that references a property name in the class listed in the LayoutItems Class attribute. Each LayoutItem element then contains Row and Column elements. I’m pretty confident you can guess what these represent. This is a simple example, but no matter how complex your XML layout is, these same techniques will work.
First, we need to get an object that we can use to read and process our XML data. There are two that we can use: XDocument and XElement. If we use XDocument, we have to pull an XElement out of it to use our data, so an easier solution is to bypass the XDocument altogether and just use the XElement approach.
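To make the difference concrete, here is a minimal sketch of both routes. It assumes the XML is inlined as a string and parsed with Parse rather than loaded from a file, purely so the example is self-contained; the class and method names are made up for illustration.

```csharp
using System;
using System.Xml.Linq;

public static class RootAccessDemo
{
    // A trimmed-down copy of the article's layout data, inlined for this sketch.
    public const string Xml =
        "<LayoutItems Class=\"\">" +
        "<LayoutItem Name=\"FullName\"><Row>2</Row><Column>5</Column></LayoutItem>" +
        "</LayoutItems>";

    // Via XDocument: the root element has to be pulled out through the Root property.
    public static string RootNameViaDocument()
    {
        XDocument doc = XDocument.Parse(Xml);
        XElement root = doc.Root;
        return root.Name.LocalName;
    }

    // Via XElement: the parsed object already is the root element.
    public static string RootNameViaElement()
    {
        XElement root = XElement.Parse(Xml);
        return root.Name.LocalName;
    }

    public static void Main()
    {
        Console.WriteLine(RootNameViaDocument()); // prints "LayoutItems"
        Console.WriteLine(RootNameViaElement());  // prints "LayoutItems"
    }
}
```

Both routes end up at the same element; XDocument mainly earns its keep when you care about document-level things such as the XML declaration or comments outside the root.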
There are several ways to create a usable XElement object. This example uses XElement’s static Load method. In this case I am passing it the path of the XML file:
    // Load XML from file
    XElement xml = XElement.Load(XmlPath);
Dropping into debug after this happens shows us that the XElement object now contains all the XML from our file. All the nodes in an XElement are represented by other XElements. This nesting may look confusing at first, but it makes sense, like the Nodes of a TreeNode. Each subsequent element contains all the information for that element, including any other child elements. In our simple example there is only one sublevel of elements, but again the same techniques would work regardless of how deeply your data is nested.
Now, in order to read through my XML data, I am going to loop through the Elements collection:
    // Loop through Elements Collection
    foreach (XElement child in xml.Elements())
    {
        // Process child XElement
    }
Each of these XElement objects will represent one LayoutItem section. The LayoutItem element has a Name attribute that I need to read, so I am going to use the XElement’s Attribute method:
    // Read an Attribute
    XAttribute name = child.Attribute("Name");
There is also an Attributes method that returns a collection of all the XAttribute objects. If you had several attributes to process or did not know their names, this would be a simple enough option to use.
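As a small sketch of that option, the following walks every attribute on an element without knowing the names in advance. The Width attribute is hypothetical and only there to give the loop a second item; the class and method names are also mine.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

public static class AttributeListDemo
{
    // Returns "name=value" for every attribute on the element, in document order.
    public static List<string> Describe(XElement element)
    {
        var result = new List<string>();
        foreach (XAttribute attribute in element.Attributes())
        {
            result.Add(attribute.Name.LocalName + "=" + attribute.Value);
        }
        return result;
    }

    public static void Main()
    {
        // "Width" is a hypothetical attribute, not part of the article's file.
        XElement item = XElement.Parse("<LayoutItem Name=\"City\" Width=\"20\" />");
        foreach (string line in Describe(item))
        {
            Console.WriteLine(line); // prints "Name=City" then "Width=20"
        }
    }
}
```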
I want to point out that frequently the methods ask for an XName value. You will quickly find though that XName has no public constructor. According to the documentation, wherever an XName is asked for, passing a string will perform an implicit conversion and create an XName object. I’m sorry, but this is really stupid: it creates unnecessary confusion in IntelliSense and is not clear or intuitive. All that being said, now that you know, whenever you see “XName”, think “string”.
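If it helps to see the conversion spelled out, here is a short sketch showing that a plain string, an XName variable assigned from a string, and the static XName.Get factory all find the same attribute. The class and method names are illustrative only.

```csharp
using System;
using System.Xml.Linq;

public static class XNameDemo
{
    public static bool StringAndXNameAgree()
    {
        XElement item = XElement.Parse("<LayoutItem Name=\"Zip\" />");

        // Implicit conversion: assigning a string to an XName variable just works.
        XName implicitName = "Name";
        // XName.Get is the explicit factory method for the same thing.
        XName explicitName = XName.Get("Name");

        XAttribute viaString = item.Attribute("Name");
        XAttribute viaImplicit = item.Attribute(implicitName);
        XAttribute viaExplicit = item.Attribute(explicitName);

        return implicitName == explicitName
            && viaString.Value == viaImplicit.Value
            && viaImplicit.Value == viaExplicit.Value;
    }

    public static void Main()
    {
        Console.WriteLine(StringAndXNameAgree()); // prints "True"
    }
}
```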
Now, since I have no further nesting to deal with, I’m ready to go ahead and read my child element data. The code is very similar to the Attribute code, except now we are using the “Element” method:
    // Read an Element
    XElement row = child.Element("Row");
    XElement column = child.Element("Column");
Now that we have our Attributes and Elements, it’s time to actually use the values. To do this, we need to extract the values out of our XElement object, and you have two options. First, you can use the Value property to get the element’s content as a string and then manually convert it to the desired type. (Calling ToString() on an XElement returns the full markup, such as <Row>2</Row>, not just the value.)
    // using Value to extract data
    string val = row.Value;
    float realVal = Convert.ToSingle(val);
Another option that will let you bypass that step is to use one of the built-in explicit conversion (cast) operators to get the appropriate type directly:
    // using a built-in cast operator
    float realVal = (float)row;
There are built in cast operators for all the expected cast of characters like int, bool, DateTime, etc.
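Here is a rough sketch of the cast operators in use, including the nullable variants, which are handy because Element() returns null when the child does not exist. The Width element is hypothetical and deliberately missing from the data; the class and method names are mine.

```csharp
using System;
using System.Xml.Linq;

public static class CastDemo
{
    // Cast a required child element straight to int.
    public static int ReadRow(XElement item)
    {
        return (int)item.Element("Row");
    }

    // Casting to a nullable type yields null when the element is absent,
    // instead of throwing as the non-nullable cast would.
    public static int? ReadOptional(XElement item, string name)
    {
        return (int?)item.Element(name);
    }

    // Attributes have the same set of cast operators.
    public static string ReadName(XElement item)
    {
        return (string)item.Attribute("Name");
    }

    public static void Main()
    {
        XElement item = XElement.Parse(
            "<LayoutItem Name=\"State\"><Row>5</Row><Column>40</Column></LayoutItem>");

        Console.WriteLine(ReadName(item));                       // prints "State"
        Console.WriteLine(ReadRow(item));                        // prints "5"
        Console.WriteLine(ReadOptional(item, "Width") == null);  // prints "True"
    }
}
```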
Creating and Writing XML
I used Visual Studio to create my XML file, but writing XML is pretty straightforward as well. Essentially, you need to create XElement objects and add other XElement objects to them as children. Let’s create the above document in code.
The XElement constructor has several overloads. To just create an XElement, you can simply create a new one and pass it the name of the Element:
    // Create a new XElement
    XElement root = new XElement("LayoutItems");
    // Add an Attribute
    root.Add(new XAttribute("Class", ""));
Now, what would be intuitive would be to access the Attributes and Elements collections and use their Add() methods to add content to them. Unfortunately, this is not possible because Attributes() and Elements() only return read-only sequences. XElement has an Add method of its own that accepts a single content object or a params object[] array. The constructor accepts content the same way, so let’s use the constructor to create a few new XElements and add them to our root:
    // Create and add more elements
    XElement elm1 = new XElement("LayoutItem",
        new XAttribute("Name", "FullName"),
        new XElement("Row", 2),
        new XElement("Column", 5));
    root.Add(elm1);

    XElement elm2 = new XElement("LayoutItem",
        new XAttribute("Name", "Address1"),
        new XElement("Row", 3),
        new XElement("Column", 10));
    root.Add(elm2);

    XElement elm3 = new XElement("LayoutItem",
        new XAttribute("Name", "Address2"),
        new XElement("Row", 4),
        new XElement("Column", 10));
    root.Add(elm3);

    // City sits on row 5 in the XML file shown earlier
    XElement elm4 = new XElement("LayoutItem",
        new XAttribute("Name", "City"),
        new XElement("Row", 5),
        new XElement("Column", 10));
    root.Add(elm4);

    XElement elm5 = new XElement("LayoutItem",
        new XAttribute("Name", "State"),
        new XElement("Row", 5),
        new XElement("Column", 40));
    root.Add(elm5);

    XElement elm6 = new XElement("LayoutItem",
        new XAttribute("Name", "Zip"),
        new XElement("Row", 5),
        new XElement("Column", 44));
    root.Add(elm6);
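Because the content parameter also accepts a sequence, the repetition above can be folded into a single expression when the values already live in a collection. This is only a sketch of that alternative, not what my project does: the Field class and the array of values are hypothetical stand-ins for wherever your data comes from.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class BuildFromDataDemo
{
    // Hypothetical holder type for this sketch; not part of LINQ to XML.
    public sealed class Field
    {
        public readonly string Name;
        public readonly int Row;
        public readonly int Column;

        public Field(string name, int row, int column)
        {
            Name = name;
            Row = row;
            Column = column;
        }
    }

    public static XElement Build()
    {
        var fields = new[]
        {
            new Field("FullName", 2, 5),
            new Field("Address1", 3, 10),
            new Field("Address2", 4, 10),
            new Field("City", 5, 10),
            new Field("State", 5, 40),
            new Field("Zip", 5, 44)
        };

        // The projected sequence of XElements is enumerated and each one
        // becomes a child of the root.
        return new XElement("LayoutItems",
            new XAttribute("Class", ""),
            fields.Select(f => new XElement("LayoutItem",
                new XAttribute("Name", f.Name),
                new XElement("Row", f.Row),
                new XElement("Column", f.Column))));
    }

    public static void Main()
    {
        Console.WriteLine(Build());
    }
}
```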
Debug will now show that our root XElement object now contains all the properly formatted XML. To write it to the file system, you can use the standard StreamWriter options, or you could use the built in Save() method:
    // Write file
    root.Save("myFileName.xml");
In this example, I am just passing the string of the path of the file I want. There are several other overloads, but I’m confident you will be able to use them with no problem.
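For the curious, here is a minimal sketch of two of those overloads: saving to a path and loading it back, and saving to a TextWriter to get the XML as a string. The temporary file location and the class and method names are assumptions made for the example.

```csharp
using System;
using System.IO;
using System.Xml.Linq;

public static class SaveDemo
{
    // Save(string) followed by Load(string): a simple round trip through a temp file.
    public static XElement RoundTrip(XElement root)
    {
        string path = Path.Combine(Path.GetTempPath(), Path.GetRandomFileName() + ".xml");
        try
        {
            root.Save(path);
            return XElement.Load(path);
        }
        finally
        {
            File.Delete(path);
        }
    }

    // Save(TextWriter): capture the serialized XML in memory instead of on disk.
    public static string ToText(XElement root)
    {
        using (var writer = new StringWriter())
        {
            root.Save(writer);
            return writer.ToString();
        }
    }

    public static void Main()
    {
        var root = new XElement("LayoutItems",
            new XAttribute("Class", ""),
            new XElement("LayoutItem",
                new XAttribute("Name", "Zip"),
                new XElement("Row", 5),
                new XElement("Column", 44)));

        XElement reloaded = RoundTrip(root);
        Console.WriteLine((string)reloaded.Element("LayoutItem").Attribute("Name")); // prints "Zip"
        Console.WriteLine(ToText(root));
    }
}
```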
Conclusions
Like I said at the beginning, you didn’t see any query features here, because LINQ to XML itself is mostly about loading, navigating, and writing XML. The Attributes and Elements methods return IEnumerable<T>, so if you need to query them you can do so using LINQ to Objects techniques.
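To round that out, here is one sketch of what such a query might look like against the layout data from this article: find every field on a given row, ordered by column. The XML is inlined so the example runs on its own, and the class and method names are invented for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class QueryDemo
{
    // The article's layout data, inlined for this sketch.
    public const string Xml =
        "<LayoutItems Class=\"\">" +
        "<LayoutItem Name=\"FullName\"><Row>2</Row><Column>5</Column></LayoutItem>" +
        "<LayoutItem Name=\"Address1\"><Row>3</Row><Column>10</Column></LayoutItem>" +
        "<LayoutItem Name=\"Address2\"><Row>4</Row><Column>10</Column></LayoutItem>" +
        "<LayoutItem Name=\"City\"><Row>5</Row><Column>10</Column></LayoutItem>" +
        "<LayoutItem Name=\"State\"><Row>5</Row><Column>40</Column></LayoutItem>" +
        "<LayoutItem Name=\"Zip\"><Row>5</Row><Column>44</Column></LayoutItem>" +
        "</LayoutItems>";

    // Plain LINQ to Objects over Elements(): filter by row, sort by column.
    public static List<string> NamesOnRow(XElement root, int row)
    {
        return root.Elements("LayoutItem")
                   .Where(item => (int)item.Element("Row") == row)
                   .OrderBy(item => (int)item.Element("Column"))
                   .Select(item => (string)item.Attribute("Name"))
                   .ToList();
    }

    public static void Main()
    {
        XElement root = XElement.Parse(Xml);
        Console.WriteLine(string.Join(", ", NamesOnRow(root, 5))); // prints "City, State, Zip"
    }
}
```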
Personally, I think these new classes finally make XML an attractive solution. I finally feel like reading and processing XML can be pain free and straightforward. I know that I will be using a lot more XML in the future as a result.
Thanks for the example. From this example, I didn’t really see any advantage of using LINQ over XmlDocument and standard DOM API.
Still new to LINQ so I may be missing something. I was expecting that you would point LINQ to an XmlDocument (or Xmlschema), and it would dynamically generate an application object. So I think I had a misconception about what it was about.
I guess the LINQ advantage would be that you can use LINQ query syntax (which works for database, and other objects), instead of having to learn Xpath, or DOM API?
What if you want to save the xml file within your project?
With your method the XML file will be created and everything, but what if you wanted to edit an XML file within your Solution Explorer? How can that be done?
We need an XML destination in SSIS.
The sources are Excel spreadsheets.
I’m thinking a Script task destination using LINQ.
Have you done this? Ideas?