Thursday, March 26, 2009

How to NOT use XSD.EXE

I've never been a big fan of xsd.exe. It always felt kinda crude and clunky to me. No support for properties - simple type elements are public fields, limited control over name mapping, a bunch of other minor quirks and not so minor irritations. All of these complaints, combined with the simple fact that it was a poorly documented, closed-source command-line tool, added up to a less than spectacular solution. Up until now, however, it's worked (sort of) and really been the only game in town for a free xml-schema-to-.NET-classes code generator.

Recently I was working on an integration project. I'm sure you've heard this one before: a REST-y web service consumes XML and does wonderful things with the XML behind the scenes. The business logic involved in handling the raw XML and doing all the right stuff was extremely complicated, and everything (not least my unit and system tests) was getting lousy with XML manipulation code. It was pretty clear that I needed something strongly-typed and class-based that I could work with in my code (while still being able to effortlessly hydrate and de-hydrate XML).

[ed. Note: as an aside, if you're a .NET developer using 3.5 and you're still doing your XML handling using XmlDom, you really owe it to yourself to stop reading this blog post right now and go read the MSDN primer on LINQ to XML (http://msdn.microsoft.com/en-us/library/bb308960.aspx). I'm sure you're probably tired of all the LINQ hype by now, but frankly (and especially in this case) it is *entirely* justified. Once you've started using LINQ to XML you will strenuously avoid any other approach to reading, writing and querying XML. It's really that good.]

So here I am needing to generate .NET classes from my XSD. The funny thing is that my first instinct was to go hunt for a new tool to do this. I hadn't needed the services of xsd.exe for a long time and I was thinking to myself "Surely there *has* to be a new and better tool now." I was surprised and more than a little disappointed to find (after a *lot* of research) that aside from the usual smattering of exorbitantly priced commercial software there had been shockingly little progress with and not even much discussion about this particular subject. Everyone was still talking (and complaining) about xsd.exe.

At this point I'm simply resigned to using xsd.exe and moving on with my life. This is where things get a little weird. Apparently, there hasn't been a new version of xsd.exe since .NET 1.1. Well that's a little weird, but ok. I'll just go and find the SDK directory under… oh, wait, this is a new machine that's only had VS '08 installed. Well in that case I'll just go and grab the binary online, it shouldn't be too hard to find. [Three hours later…] Finally, I've got the xsd.exe binary. Now just run it and… "This program requires that you upgrade to the Microsoft .NET Framework Version 1.1" What?! Gahhhhh! Back to Google. (As an aside, I have no clue what's going on here and haven't had a chance to research it. I looked in \Microsoft.NET\Framework\v1.1.4322\ and all that's in there are gacutil and regsvc config files. Also, I see that in VS I have only 2.0, 3.0 and 3.5 as possible framework targets. Apparently .NET 1.1 doesn't install with Vista?)

Anyway, this is way more preamble than I intended for this post (though I'm really hoping to save others and "Future Jake" the pain I suffered around this particular issue, and the substantial lost time trying to sort it all out). So, to make an already long story just a bit longer: I gave up on xsd.exe (which I really didn't want to use anyway) and redoubled my research efforts. I have to admit at this point I was considering writing something myself, even though a recent experience with generating *documentation* from XSDs had made it really clear that this would've been a *serious* undertaking.

I really didn't expect to find anything more, but I tried. I don't know what new search term I used, or what link I clicked on that I hadn't the first time around, but here's the breakthrough I made.

Daniel Cazzulino (an excellent and prolific blogger some may know as Kzu) posted a blog entry way back in October of 2003 that talks about the limitations of xsd.exe (specifically the public/settable nature of the members in the generated classes) and discusses the little-known and poorly documented in-built support in .NET for XSD-based code generation using the XmlSchemaImporter and XmlCodeExporter from the System.Xml.Serialization namespace. This approach requires shockingly little code and is, apparently, how xsd.exe does its code gen.

Here's that original post: http://www.clariusconsulting.net/blogs/kzu/archive/2003/10/24/96.aspx
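For a rough idea of how little code the XmlSchemaImporter/XmlCodeExporter approach needs, here is a minimal sketch. It is not Kzu's actual code. The class name, method name and parameters are hypothetical and only there for illustration, and it assumes a single self-contained schema file (no imports or includes to resolve) on .NET 2.0 or later.

```csharp
using System.CodeDom;
using System.CodeDom.Compiler;
using System.IO;
using System.Xml;
using System.Xml.Schema;
using System.Xml.Serialization;
using Microsoft.CSharp;

// Hypothetical helper for illustration only; not taken from Kzu's posts.
static class XsdCodeGenSketch
{
    // xsdPath, targetNamespace and outputPath are illustrative parameters.
    public static void Generate(string xsdPath, string targetNamespace, string outputPath)
    {
        // Load the schema document.
        XmlSchema schema;
        using (XmlReader reader = XmlReader.Create(xsdPath))
        {
            schema = XmlSchema.Read(reader, null);
        }

        // Wrap it in the serialization-specific collection and compile it
        // so that the global element table is populated.
        XmlSchemas schemas = new XmlSchemas();
        schemas.Add(schema);
        schemas.Compile(null, true);

        // The importer turns schema elements into type mappings;
        // the exporter turns those mappings into CodeDom types.
        XmlSchemaImporter importer = new XmlSchemaImporter(schemas);
        CodeNamespace codeNamespace = new CodeNamespace(targetNamespace);
        XmlCodeExporter exporter = new XmlCodeExporter(codeNamespace);

        foreach (XmlSchemaElement element in schema.Elements.Values)
        {
            XmlTypeMapping mapping = importer.ImportTypeMapping(element.QualifiedName);
            exporter.ExportTypeMapping(mapping);
        }

        // Emit the CodeDom tree as C# source.
        using (StreamWriter writer = new StreamWriter(outputPath))
        using (CSharpCodeProvider provider = new CSharpCodeProvider())
        {
            provider.GenerateCodeFromNamespace(codeNamespace, writer, new CodeGeneratorOptions());
        }
    }
}
```

Calling something like XsdCodeGenSketch.Generate("orders.xsd", "MyCompany.Orders", "Orders.cs") (all three values made up) would write one class per global element in the schema. Because the types exist as a CodeDom tree before they are written out, the tree can be modified between the export and the emit step, for example to change how members are exposed.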

Then, in a May 2004 post Kzu continues this conversation with an adaptation of the technique to turn his code generator into a VS.NET custom tool. http://www.clariusconsulting.net/blogs/kzu/archive/2004/05/14/XsdCodeGenTool.aspx

Note that the link to the code download on gotdotnet no longer works (since gdn is now dead).

It turns out that Kzu re-worked these blog posts into an article for MSDN, complete with full source, later that month (May of 2004). http://msdn.microsoft.com/en-us/library/aa302301.aspx

The source: http://download.microsoft.com/download/5/E/9/5E923D54-242B-48F4-B3A1-DA8CDED6BE45/XsdGenerator.exe

Although it's relatively old (in Internet years) it's still an excellent article and a valid (*the* valid?) approach for XSD-based code generation in .NET. I used this instead of xsd.exe with great results (and this was a pretty complex schema set). It wasn't perfect. The output shows its age a bit - no nullable type support, arrays instead of generic lists, and a few other minor quirks - but it worked for me "out of the box" with .NET 3.5 SP1 and the output was usable in its raw form (i.e., no post-generation tweaks required that later have to be manually diff-synced when you change the schema and re-gen, a major plus).

Anyway, just today I stumbled on a new (2009.01.27) project by Pascal Cabanel on CodePlex that references Kzu's article and cites it as the author's inspiration. Apparently, he's adapted the technique (and updated it) to produce a richer, business-object type of output. I haven't played with it yet, but it looks very promising. I'll post updates once I've had a chance to try it out.

Pascal Cabanel's Xsd2Code: http://xsd2code.codeplex.com/


2 comments:

Kevin Dietz said...

I also have a project on CodePlex that is similar in purpose to Xsd2Code. It's called XmlGen#, available at http://www.codeplex.com/xmlgensharp. It is also available as a VS custom tool. I have a little bit more work to do on it, but it is coming along.

One thing that's different is that XmlGen# doesn't rely on XmlSerializer and generates actual code to parse the XML (basically, it generates a recursive descent parser). To me, it's more understandable than the black box XmlSerializer gives me.

Jake Foster said...

Kevin: Thanks for the heads-up. I'll definitely check it out.