Split a large XML into pieces using Java (Part 2)

In the previous post we built a parser for the XML. By the end of it you were probably like "WTH is this new XmlErrorHandler() thing??". Basically, it handles the errors the parser reports while reading the XML (missing tags, schema errors). So let's implement the error handler class.

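import org.xml.sax.ErrorHandler;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

public class XmlErrorHandler implements ErrorHandler {

    // Warnings and errors are recoverable: report them and let parsing continue.
    public void warning(SAXParseException e) throws SAXException {
        System.err.println("Warning at line " + e.getLineNumber() + ": " + e.getMessage());
    }
    public void error(SAXParseException e) throws SAXException {
        System.err.println("Error at line " + e.getLineNumber() + ": " + e.getMessage());
    }
    // Fatal errors (e.g. a missing closing tag) mean the document is not well-formed.
    public void fatalError(SAXParseException e) throws SAXException {
        System.err.println("Fatal error at line " + e.getLineNumber() + ": " + e.getMessage());
    }
}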
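If you want to see a handler actually catching something, here is a minimal sketch. It assumes the parser is a DocumentBuilder, and the names ErrorHandlerDemo, CollectingHandler and parseAndCollect are made up for this example. Instead of printing, this variant collects the messages in a list so they are easy to inspect.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

class ErrorHandlerDemo {

    // Hypothetical handler that records messages instead of printing them.
    static class CollectingHandler implements ErrorHandler {
        final List<String> messages = new ArrayList<>();

        public void warning(SAXParseException e) {
            messages.add("warning: " + e.getMessage());
        }
        public void error(SAXParseException e) {
            messages.add("error: " + e.getMessage());
        }
        public void fatalError(SAXParseException e) throws SAXException {
            messages.add("fatal: " + e.getMessage());
            throw e; // stop parsing, the document is not well-formed
        }
    }

    // Parses the XML string and returns whatever the handler was told about.
    static List<String> parseAndCollect(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        CollectingHandler handler = new CollectingHandler();
        builder.setErrorHandler(handler);
        try {
            builder.parse(new InputSource(new StringReader(xml)));
        } catch (SAXException e) {
            // Already recorded by the handler, nothing more to do here.
        }
        return handler.messages;
    }

    public static void main(String[] args) throws Exception {
        // The <person> tag is never closed, so the parser reports a fatal error.
        for (String message : parseAndCollect("<persons><person></persons>")) {
            System.out.println(message);
        }
    }
}
```

A well-formed input should leave the list empty, while the broken sample above should produce a single "fatal: ..." entry (the exact message text depends on the parser).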

Okay, now the fun part! We need to split the XML at the <person></person> tags. Imagine someone says, "Hey isla, can you split this XML at the <person></person> tags?" Mmmmm... yeah, I can. First we'll turn that start-and-end tag pair into an XPath. The XPath is //person, which selects every person element anywhere in the document.

XPath xPath = XPathFactory.newInstance().newXPath();
XPathExpression exp = xPath.compile("//person");

With the XPath expression set up, we can evaluate it against the parsed document (doc). The result comes back as a NodeList holding every element that matched.

NodeList nl = (NodeList) exp.evaluate(doc, XPathConstants.NODESET);
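To get a feel for what //person actually selects, here is a small self-contained sketch. The sample XML, the class name XPathCountDemo and the helper countMatches are assumptions for illustration only. It compares //person with the stricter /persons/person.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XPathCountDemo {

    // Hypothetical sample: one person sits directly under the root, one is nested deeper.
    static final String SAMPLE =
          "<persons>"
        + "<person><id>person0</id></person>"
        + "<group><person><id>person1</id></person></group>"
        + "</persons>";

    // Parses the XML string and returns how many nodes the XPath expression selects.
    static int countMatches(String xml, String expression) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
        XPath xPath = XPathFactory.newInstance().newXPath();
        NodeList nl = (NodeList) xPath.compile(expression).evaluate(doc, XPathConstants.NODESET);
        return nl.getLength();
    }

    public static void main(String[] args) throws Exception {
        // person elements at any depth
        System.out.println(countMatches(SAMPLE, "//person"));
        // only person elements that are direct children of the root
        System.out.println(countMatches(SAMPLE, "/persons/person"));
    }
}
```

On this sample, //person finds both elements and /persons/person finds only the first one. So if your person elements can also show up nested inside other elements and you don't want those, the more specific path is the safer choice.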

Now loop over the NodeList and turn each <person> element back into XML text.

// One Transformer can be reused for every node, so create it outside the loop.
Transformer transformer = TransformerFactory.newInstance().newTransformer();
//------------------OUTPUT_PROPERTIES-----------------------//
// "yes" leaves out the <?xml ...?> declaration, so each piece starts at <person>.
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
//------------------OUTPUT_PROPERTIES-----------------------//

for (int i = 0; i < nl.getLength(); i++) {
    Node node = nl.item(i);
    StringWriter buf = new StringWriter();
    // Serialize just this node (and its children) into the buffer.
    transformer.transform(new DOMSource(node), new StreamResult(buf));
    System.out.println("Element found:\n" + buf.toString());
}

You'll get the output as:

Element found:
<person> <id>person0</id> <name>name0</name> <age>age0</age> </person>
Element found:
<person> <id>person1</id> <name>name1</name> <age>age1</age> </person>
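Putting the snippets above together, here is a minimal runnable sketch of the whole split. It is only one way to wire things up: the class name XmlSplitDemo, the helper split and the inline sample XML are made up for this example, and the XML is read from a string rather than from a large file to keep things short.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

class XmlSplitDemo {

    // Returns one XML string per node selected by the XPath expression.
    static List<String> split(String xml, String expression) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        NodeList nl = (NodeList) XPathFactory.newInstance().newXPath()
                .compile(expression)
                .evaluate(doc, XPathConstants.NODESET);

        // Reuse a single Transformer and skip the <?xml ...?> declaration on each piece.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

        List<String> pieces = new ArrayList<>();
        for (int i = 0; i < nl.getLength(); i++) {
            StringWriter buf = new StringWriter();
            transformer.transform(new DOMSource(nl.item(i)), new StreamResult(buf));
            pieces.add(buf.toString());
        }
        return pieces;
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical sample input with two person elements.
        String xml = "<persons>"
                + "<person><id>person0</id><name>name0</name><age>age0</age></person>"
                + "<person><id>person1</id><name>name1</name><age>age1</age></person>"
                + "</persons>";
        for (String piece : split(xml, "//person")) {
            System.out.println("Element found:\n" + piece);
        }
    }
}
```

Returning the pieces as a list instead of printing them inside the loop leaves it up to the caller what to do with each one, for example writing every piece to its own file.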

Note that I used an XML with only two of the desired elements. You can add as many elements as you like to the XML and check that the implementation still works. The main tool we used here is the Transformer. You can set as many output properties on it as you want. Here are some of them:

OutputKeys.ENCODING
encoding = string

OutputKeys.INDENT
indent = "yes" | "no"

OutputKeys.OMIT_XML_DECLARATION
omit-xml-declaration = "yes" | "no"

OutputKeys.STANDALONE
standalone = "yes" | "no"
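To see what a couple of these properties do to the output, here is a small sketch. The class name OutputPropertiesDemo, the helper serialize and the sample XML are assumptions for this example. It only exercises OMIT_XML_DECLARATION and ENCODING; the others are set the same way through setOutputProperty.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class OutputPropertiesDemo {

    // Serializes the root element of the given XML with the chosen output properties.
    // omitDeclaration must be "yes" or "no"; encoding is a charset name such as "UTF-8".
    static String serialize(String xml, String omitDeclaration, String encoding) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));

        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, omitDeclaration);
        transformer.setOutputProperty(OutputKeys.ENCODING, encoding);

        StringWriter buf = new StringWriter();
        transformer.transform(new DOMSource(doc.getDocumentElement()), new StreamResult(buf));
        return buf.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<person><id>person0</id></person>"; // hypothetical sample
        // Without the declaration: the output starts directly at <person>.
        System.out.println(serialize(xml, "yes", "UTF-8"));
        // With the declaration: the output starts with <?xml ...?> and names the encoding.
        System.out.println(serialize(xml, "no", "ISO-8859-1"));
    }
}
```

Keep in mind that when you write to a StringWriter like this, ENCODING only changes what the declaration says. The actual bytes are decided later, by whatever turns the string into a file or stream.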
