Transforming XML Documents
- A SAX parser can be viewed as a mechanism for transforming an XML text document into a stream of events corresponding to the markup and character data contained in the original document.
- Similarly, a DOM parser transforms an XML document into a DOM tree.
- In fact, JAXP provides standardized APIs for transforming from any of these three representations – XML document, SAX event stream, or DOM tree – to either of the others.
- Furthermore, JAXP allows a Java program to use the Extensible Stylesheet Language (XSL) to extract data from one XML document, process that data, and produce another XML document containing the processed data.
- XSL can be used, for example, to extract information from an XML document and embed it within an XHTML document so that the information can be viewed using a web browser.
- In this section, we will learn how to perform JAXP transformations between XML representations (text, DOM, and SAX events) and will introduce the JAXP API for XSL.
- In later sections we will cover two key components of XSL itself: XPath and XSLT.
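As a preview of the JAXP API for XSL mentioned above, the following is a minimal sketch rather than a definitive implementation. It assumes the stylesheet and the input document are supplied as strings; the class name ApplyStylesheet, the method transform, the tiny stylesheet, and the sample RSS text are all hypothetical and chosen only for illustration. The stylesheet syntax itself (XPath and XSLT) is the subject of the later sections.

```java
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import java.io.StringReader;
import java.io.StringWriter;

/** Hypothetical helper: applies an XSL stylesheet to an XML document, both given as text. */
class ApplyStylesheet {

    static String transform(String xslText, String xmlText) throws TransformerException {
        TransformerFactory factory = TransformerFactory.newInstance();
        // Passing a Source for the stylesheet yields a Transformer that applies it
        Transformer transformer =
            factory.newTransformer(new StreamSource(new StringReader(xslText)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(xmlText)),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws TransformerException {
        // Illustrative stylesheet: extract the channel title as plain text
        String xsl =
            "<xsl:stylesheet version='1.0'"
          + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
          + "<xsl:output method='text'/>"
          + "<xsl:template match='/'>"
          + "<xsl:value-of select='/rss/channel/title'/>"
          + "</xsl:template>"
          + "</xsl:stylesheet>";
        String xml = "<rss><channel><title>Example Feed</title></channel></rss>";
        System.out.println(transform(xsl, xml));
    }
}
```

The only difference from a plain representation-to-representation conversion is the argument given to newTransformer(): with a stylesheet Source the Transformer applies that stylesheet, and with no argument it copies its input unchanged.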
Transforming between XML Representations
- Parsing: converts an XML text document into a DOM tree, which can then be manipulated using the Java DOM API methods.
- XML transformation: the “reverse” of parsing; it produces a textual representation of an in-memory DOM tree.
- Example program (Figure 7.11): reads a text XML document into a DOM Document object, modifies this object, and writes the modified object back out as XML text.
- TransformerFactory – JAXP factory class – used to create an instance of Transformer.
- The program then calls the transform() method on the Transformer instance, which performs the actual conversion from the DOM Document object to a text XML document.
- The transform() method takes two arguments, in this order:
  - the input: an object of a class implementing the javax.xml.transform.Source interface
  - the output: an object of a class implementing the javax.xml.transform.Result interface
- JAXP supplies several classes implementing the Source interface:
  - javax.xml.transform.dom.DOMSource (DOM representation of an XML document),
  - javax.xml.transform.sax.SAXSource (SAX representation of an XML document), and
  - javax.xml.transform.stream.StreamSource (text representation of an XML document).
- The Result interface is similarly implemented by JAXP classes DOMResult, SAXResult, and StreamResult, each located in the same package as its Source counterpart.
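Because any Source can be paired with any Result, a Transformer can also run in the parsing direction. The following is a minimal sketch, assuming a Transformer created without a stylesheet (which copies its input to its output unchanged). The class name TextToDom, the method toDom, and the sample RSS string are hypothetical.

```java
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMResult;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.Document;
import java.io.StringReader;

/** Hypothetical helper: converts XML text to a DOM tree with an identity Transformer. */
class TextToDom {

    static Document toDom(String xmlText) throws TransformerException {
        // newTransformer() with no stylesheet copies the Source to the Result
        Transformer identity = TransformerFactory.newInstance().newTransformer();
        // An empty DOMResult lets the transformer create the Document node itself
        DOMResult result = new DOMResult();
        identity.transform(new StreamSource(new StringReader(xmlText)), result);
        return (Document) result.getNode();
    }

    public static void main(String[] args) throws TransformerException {
        Document doc = toDom("<rss><channel><item/><item/></channel></rss>");
        // Print the root element name and the number of item elements
        System.out.println(doc.getDocumentElement().getTagName());
        System.out.println(doc.getElementsByTagName("item").getLength());
    }
}
```

Swapping StreamSource for DOMSource and DOMResult for StreamResult gives the DOM-to-text direction; only the Source and Result arguments change, not the Transformer.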
// JAXP classes
import javax.xml.transform.*;
import javax.xml.transform.dom.*;
import javax.xml.transform.stream.*;
import javax.xml.parsers.*;
// DOM classes
import org.w3c.dom.*;
// JDK classes
import java.io.*;
/** Input an RSS document, remove the first "item" element, and
    output the resulting RSS document to System.out */
class DOMtoText
{
    public static void main(String args[])
    {
        try
        {
            // Input an RSS document into a DOM Document object
            DocumentBuilderFactory docBuilderFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder parser = docBuilderFactory.newDocumentBuilder();
            Document document = parser.parse(new File(args[0]));

            // Use the DOM API to remove the first item element, if there is one
            NodeList items = document.getElementsByTagName("item");
            if (items.getLength() > 0)
            {
                Node first = items.item(0);
                first.getParentNode().removeChild(first);
            }

            // Use JAXP methods to output the modified Document object
            TransformerFactory tFactory = TransformerFactory.newInstance();
            Transformer transformer = tFactory.newTransformer();
            transformer.transform(new DOMSource(document), new StreamResult(System.out));
        }
        catch (Exception e)
        {
            e.printStackTrace();
        }
    }
}
FIGURE 7.11 Program converting a DOM Document object to an XML text representation.
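A DOM tree can likewise be replayed as a SAX event stream by passing a SAXResult instead of a StreamResult. The following is a sketch under stated assumptions, not part of the original program: the class DomToSaxEvents, its helper methods parse and elementNames, the NameCollector handler, and the sample RSS text are hypothetical names introduced for illustration. The handler simply records the name of each element whose start event it receives.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.sax.SAXResult;
import org.w3c.dom.Document;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical example: replays a DOM tree as SAX events. */
class DomToSaxEvents {

    /** Records the qualified name carried by every startElement event. */
    static class NameCollector extends DefaultHandler {
        final List<String> names = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName,
                                 String qName, Attributes atts) {
            names.add(qName);
        }
    }

    /** Parses XML text into a DOM Document (namespace-aware by assumption). */
    static Document parse(String xmlText) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        return factory.newDocumentBuilder()
                      .parse(new InputSource(new StringReader(xmlText)));
    }

    /** Sends the document to the handler as SAX events and returns the names seen. */
    static List<String> elementNames(Document document) throws Exception {
        NameCollector handler = new NameCollector();
        TransformerFactory.newInstance().newTransformer()
            .transform(new DOMSource(document), new SAXResult(handler));
        return handler.names;
    }

    public static void main(String[] args) throws Exception {
        Document document = parse("<rss><channel><item/></channel></rss>");
        System.out.println(elementNames(document));
    }
}
```

Any ContentHandler written for a SAX parser can be reused this way on a document that already exists as a DOM tree, without first serializing it to text.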