Monday, April 25, 2011

Optimizing XML Document Upload by ignoring DTD

A few days back, I was struggling to optimize a piece of code that loads an XML document from a stream. The default XmlDocument.Load() was taking a long time to load a document stream of no more than 100 KB. After investigating the XML file, I found that it was slow to load because of its DTD declaration (the DOCTYPE element). Since I only needed to parse the XML directly to retrieve its content, I was looking for a faster solution.
I looked through the System.Xml namespace for a way to skip the DTD and found that the XmlDocument class loads XML through the Load or LoadXml methods, which all ultimately wrap their input in an XmlTextReader before reading the XML. There is one exception to this rule: the Load overload that accepts an XmlReader. More importantly, it is the XmlReader, not the XmlDocument, that resolves the external DTD referenced by the DOCTYPE declaration. It does this through the XmlResolver assigned to the XmlReaderSettings.XmlResolver property.
To solve this issue, I created an instance of XmlReaderSettings and set ProhibitDtd = false. This permits a DOCTYPE declaration to be present; with the default value of true, the reader throws an XmlException as soon as it encounters a DTD. I also removed the ability of the XmlReader to resolve the address specified in the DOCTYPE element by setting XmlResolver = null.
After doing this, we can safely create an XmlReader and pass it to the Load method of the XmlDocument, which then loads the XML without fetching the external DTD or validating the document against it.
// Requires: using System.IO; using System.Xml;
// Get the uploaded XML document as a stream
Stream uploadedFileStream = fileUploadCtrl.PostedFile.InputStream;
if (uploadedFileStream != null)
{
    try
    {
        // Code optimized: the external DTD is not resolved, for faster processing
        XmlReaderSettings objXmlReaderSettings = new XmlReaderSettings();
        objXmlReaderSettings.XmlResolver = null;
        objXmlReaderSettings.ProhibitDtd = false;
        XmlDocument uploadedFileXmlDoc = new XmlDocument();
        using (XmlReader objXmlReader = XmlReader.Create(uploadedFileStream, objXmlReaderSettings))
        {
            uploadedFileXmlDoc.Load(objXmlReader);
        }
    }
    catch (XmlException xmlParseEx)
    {
        // Custom exception handling
    }
}
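
For readers on .NET Framework 4.0 or later, where ProhibitDtd is marked obsolete in favor of the DtdProcessing property, the same idea can be sketched as a small self-contained program. This is only an illustration under stated assumptions: the class name DtdLoadSketch, the helper LoadWithoutResolvingDtd, the sample XML and the placeholder DTD address are all hypothetical and not part of the original code.

```csharp
using System;
using System.IO;
using System.Text;
using System.Xml;

public static class DtdLoadSketch
{
    // Hypothetical helper: loads XML from a stream without fetching any external DTD.
    public static XmlDocument LoadWithoutResolvingDtd(Stream input)
    {
        var settings = new XmlReaderSettings
        {
            // Allow a DOCTYPE declaration (successor to ProhibitDtd = false).
            DtdProcessing = DtdProcessing.Parse,
            // Do not resolve the external DTD named in the DOCTYPE.
            XmlResolver = null
        };

        using (XmlReader reader = XmlReader.Create(input, settings))
        {
            var doc = new XmlDocument();
            doc.Load(reader); // the Load(XmlReader) overload honors the settings above
            return doc;
        }
    }

    public static void Main()
    {
        // Assumed sample input; the DTD address is a placeholder that is never contacted.
        const string xml =
            "<?xml version=\"1.0\"?>" +
            "<!DOCTYPE note SYSTEM \"http://example.invalid/note.dtd\">" +
            "<note><to>Reader</to></note>";

        using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(xml)))
        {
            XmlDocument doc = LoadWithoutResolvingDtd(stream);
            Console.WriteLine(doc.DocumentElement.Name);
        }
    }
}
```

If the DOCTYPE should be skipped entirely rather than parsed, DtdProcessing.Ignore is the other documented option; note that a document relying on entities declared in its DTD would then fail to load.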
