Derik Whittaker

Using XPath when there is Namespace information in the XML document

If you have ever worked with XML, you know that XPath is your friend.  Today I was working with an XML document that was generated via MSWord; the generated document was well-formed and included namespaces.  Now, I have worked with namespaces in XML in the past, but it has been a LONG time.  I thought I would put up a post on how to perform simple queries in an XML document that has namespaces, because it does involve a little more work than when there are no namespaces.

Sample XML Document

 

Sample Code to query the XML Document

 

Code explanation

  • Getting the namespace information
    In order to use the namespace URI you will need to pull it out of the xmlDocument object.  You can do this by calling the GetNamespaceOfPrefix method and passing it the namespace prefix.  In this case that is found in workSheetXPath.

  • Creating the XmlNamespaceManager
    Once you have the namespace URI, you need to build the namespace manager object (this is used later during the actual query).  This is pretty straightforward, but it is required.

  • Building the XPath query
    If you have ever built an XPath query before, this should look familiar, but with a twist.  When there is namespace information in the document, the namespace prefix has to be prepended to EVERY element and attribute name in the XPath query.  Notice I am using string.Format to allow for cleaner code.  The final output of this query is "//ss:Worksheet[@ss:Name='Sheet1']/ss:Table/ss:Row"

  • Calling/Using the XPath query
    Executing the XPath is not much different, but you will need to pass in the namespaceManager object created above in order to get this to work.  If you do not, you will get a runtime exception because the prefix in the query cannot be resolved.
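
Since the original sample XML and code are not available here, the four steps above can be roughly sketched as follows. This is only a sketch under assumptions: the sample document, the class name NamespaceXPathSketch, the method name SelectRows, and the choice to call GetNamespaceOfPrefix on xmlDocument.DocumentElement are all hypothetical stand-ins, not the original code from this post.

```csharp
using System;
using System.Xml;

public static class NamespaceXPathSketch
{
    // Hypothetical stand-in for the generated document (the real sample was lost).
    // Elements use the default namespace; attributes use the 'ss' prefix.
    public const string SampleXml =
        "<Workbook xmlns=\"urn:schemas-microsoft-com:office:spreadsheet\" " +
        "xmlns:ss=\"urn:schemas-microsoft-com:office:spreadsheet\">" +
        "<Worksheet ss:Name=\"Sheet1\"><Table>" +
        "<Row><Cell><Data ss:Type=\"String\">A1</Data></Cell></Row>" +
        "<Row><Cell><Data ss:Type=\"String\">A2</Data></Cell></Row>" +
        "</Table></Worksheet>" +
        "<Worksheet ss:Name=\"Sheet2\"><Table><Row/></Table></Worksheet>" +
        "</Workbook>";

    public static XmlNodeList SelectRows(string xml, string sheetName)
    {
        var xmlDocument = new XmlDocument();
        xmlDocument.LoadXml(xml);

        // 1. Get the namespace URI that the 'ss' prefix maps to.
        //    Assumption: the lookup is done on the root element, where the
        //    prefix is declared.
        const string prefix = "ss";
        string namespaceUri = xmlDocument.DocumentElement.GetNamespaceOfPrefix(prefix);

        // 2. Create the namespace manager and register the prefix/URI pair.
        var namespaceManager = new XmlNamespaceManager(xmlDocument.NameTable);
        namespaceManager.AddNamespace(prefix, namespaceUri);

        // 3. Build the XPath query, prefixing every element and attribute name.
        //    For "Sheet1" this yields:
        //    //ss:Worksheet[@ss:Name='Sheet1']/ss:Table/ss:Row
        string workSheetXPath = string.Format(
            "//{0}:Worksheet[@{0}:Name='{1}']/{0}:Table/{0}:Row", prefix, sheetName);

        // 4. Execute the query, passing the namespace manager along.
        return xmlDocument.SelectNodes(workSheetXPath, namespaceManager);
    }
}
```

With the hypothetical sample above, SelectRows(NamespaceXPathSketch.SampleXml, "Sheet1") should return the two Row nodes under Sheet1.  Note that the unprefixed elements in the sample still need the 'ss' prefix in the query, because they live in the default namespace, which has the same URI.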

There you go, a simple how-to on querying an XML document that has namespace information in it.

Till next time,



Posted 08-03-2007 12:28 PM by Derik Whittaker

Comments

pgfearo wrote re: Using XPath when there is Namespace information in the XML document
on 08-07-2007 9:03 AM

One observation:

I know this is just a quick sample but your method of using the prefix to get the URI seems sort of back to front in this case.

The URI is the qualifier for the element/attribute name; the prefix is just a shortcut and can actually be anything. In the instance XML it's set by the declaration:

xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"

A valid Word document (created/modified by another product) could use another prefix but must use the same URI.

All you need then do is hardcode the URI, and add the URI and the chosen prefix (such as 'ss') to the XmlNamespaceManager as you describe.

A final point: In my view it would actually make your code more readable to leave the literal namespace prefix in place instead of using String.format - though using this technique for variables such as "Sheet1" is a good idea.

This would allow you to test the XPath expression more quickly by copying and pasting to/from an XPath 'tester' (in this tool you would set the namespace prefix to match the one in your expression, if it were different).

As your XPath expressions become more complex this will be increasingly useful.

Phil Fearon

http://www.sketchpath.com/

Derik Whittaker wrote re: Using XPath when there is Namespace information in the XML document
on 08-07-2007 2:04 PM

@Phil,

Thanks for the post.  I tried just adding the 'ss' to the XmlNamespaceManager but that did not seem to work.  This is why I added it to the XPath query.  It is possible I was doing something wrong.  I was not able to find any really useful examples on the net for doing this, which is why I created the post.

Yes, you are right, leaving the 'ss' as part of the XPath literal would make it easier to read.  I just got caught up in the moment :(

Derik Whittaker wrote re: Using XPath when there is Namespace information in the XML document
on 08-07-2007 2:05 PM

@Phil,

Also, I am going to have to try out your SketchPath tool, looks pretty cool.

pgfearo wrote re: Using XPath when there is Namespace information in the XML document
on 08-08-2007 4:19 AM

Derik,

Your published method for adding namespaces to the XmlNamespaceManager was fine. Both the prefix and URI are required to be added as a pair, as you demonstrated; this then allows the XPath evaluator internally to resolve your 'ss' prefixes to the fully qualified name (which includes the full URI) when querying an XML instance document.

The main point I was making was really a technicality, that you normally already know the URI that identifies the namespace so you don't need to look it up from the source XML.

Namespaces in XML take some time to get used to. Thanks for your comment on SketchPath. Hopefully this tool helps demonstrate how you can use any valid prefix within an XPath query, so long as it is paired with the correct URI in the namespace manager first (For your sample xml, try changing a prefix in the tool's own Namespace manager grid and then looking at the new auto-generated XPath).

Phil Fearon


Derik wrote re: Using XPath when there is Namespace information in the XML document
on 03-19-2008 1:44 PM

I have been looking for hours for this code.  Now a question, how can I modify the XPath query to be able to access the individual cells?

Thanks,

Johnny

Site Copyright © 2007 CodeBetter.Com
Content Copyright Individual Bloggers

 
