Thursday, May 19, 2011

JAXP & JAXB

JAXP = Java API for XML Processing (SAX, DOM, StAX)

JAXP is for processing XML data using applications written in Java. JAXP leverages the parser standards Simple API for XML Parsing (SAX) and Document Object Model (DOM) so that you can choose to parse your data as a stream of events or to build an object representation of it. JAXP also implements the Streaming API for XML (StAX) standard.

SAX is a standard interface for event-based XML parsing. It reports parsing events (such as the start and end of elements) directly to the application through callbacks; the application implements handlers to deal with the different events. SAX supports read-only access: it is designed only for reading XML documents. Of course, we could write a new document out in response to the events, but we cannot change the document being processed.

SAX is fast and efficient, but its event model makes it most useful for state-independent filtering. For example, a SAX parser calls one method in your application when an element tag is encountered and calls a different method when text is found. If the processing you are doing is state-independent (meaning that it does not depend on the elements that have come before), then SAX works fine.
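To make the callback model concrete, here is a minimal sketch using the SAX parser that ships with the JDK. The class name SaxEventDemo, the method collectEvents, and the sample XML are made up for illustration; only the javax.xml.parsers and org.xml.sax calls are real API.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: records each SAX callback as a string.
class SaxEventDemo {

    static List<String> collectEvents(String xml) throws Exception {
        List<String> events = new ArrayList<>();

        // The parser pushes events into these callbacks; we never ask for them.
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName,
                                     String qName, Attributes attrs) {
                events.add("start:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Text may arrive in several chunks; ignore whitespace-only ones.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    events.add("text:" + text);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                events.add("end:" + qName);
            }
        };

        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        parser.parse(new InputSource(new StringReader(xml)), handler);
        return events;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<order><item>pen</item><item>ink</item></order>";
        collectEvents(xml).forEach(System.out::println);
    }
}
```

Each callback here handles its event without knowing what came before, which is the state-independent style described above. Tracking context (for example, which element the text belongs to) would mean adding your own fields to the handler.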

On the other hand, for state-dependent processing, where the program needs to do one thing with the data under element A but something different with the data under element B, a pull parser such as the Streaming API for XML (StAX) would be a better choice.

StAX provides a streaming, event-driven, pull-parsing API for reading and writing XML documents. StAX offers a simpler programming model than SAX and more efficient memory management than DOM.
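As a rough sketch of state-dependent processing with a pull parser, the code below sums the price values found under book elements and skips the ones under magazine elements. The element names, the class StaxPullDemo, and the method sumBookPrices are assumptions invented for this example; the javax.xml.stream calls are the standard StAX cursor API.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;

// Hypothetical demo class: the application drives the parser with next().
class StaxPullDemo {

    static int sumBookPrices(String xml) throws Exception {
        XMLStreamReader reader = XMLInputFactory.newInstance()
                .createXMLStreamReader(new StringReader(xml));
        int total = 0;
        boolean inBook = false; // the state we carry between events
        try {
            while (reader.hasNext()) {
                int event = reader.next(); // pull the next event on demand
                if (event == XMLStreamConstants.START_ELEMENT) {
                    String name = reader.getLocalName();
                    if (name.equals("book")) {
                        inBook = true;
                    } else if (name.equals("price") && inBook) {
                        // getElementText() reads the text and moves the cursor
                        // to the matching end tag of <price>.
                        total += Integer.parseInt(reader.getElementText().trim());
                    }
                } else if (event == XMLStreamConstants.END_ELEMENT
                        && reader.getLocalName().equals("book")) {
                    inBook = false;
                }
            }
        } finally {
            reader.close();
        }
        return total;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<catalog>"
                + "<book><price>12</price></book>"
                + "<magazine><price>5</price></magazine>"
                + "<book><price>30</price></book>"
                + "</catalog>";
        System.out.println(sumBookPrices(xml));
    }
}
```

Because the application owns the loop, the state lives in ordinary local variables and the code can decide at each step whether to read further or skip ahead.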

Pull Parsing versus Push Parsing

Streaming pull parsing refers to a programming model in which a client application calls methods on an XML parsing library when it needs to interact with an XML infoset—that is, the client only gets (pulls) XML data when it explicitly asks for it.

Streaming push parsing refers to a programming model in which an XML parser sends (pushes) XML data to the client as the parser encounters elements in an XML infoset—that is, the parser sends the data whether or not the client is ready to use it at that time.

Pull parsers and the SAX API both act like a serial I/O stream. You see the data as it streams in, but you cannot go back to an earlier position or leap ahead to a different position. In general, such parsers work well when you simply want to read data and have the application act on it.
But when you need to modify an XML structure - especially when you need to modify it interactively - an in-memory structure makes more sense. DOM is one such model.

Generally speaking, there are two programming models for working with XML infosets: streaming and the document object model (DOM).

The DOM model involves creating in-memory objects representing an entire document tree and the complete infoset state for an XML document. Once in memory, DOM trees can be navigated freely and parsed arbitrarily, and as such provide maximum flexibility for developers. However, the cost of this flexibility is a potentially large memory footprint and significant processor requirements, because the entire representation of the document must be held in memory as objects for the duration of the document processing. This may not be an issue when working with small documents, but memory and processor requirements can escalate quickly with document size.
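The following sketch shows the kind of free navigation and in-place modification that the DOM model allows: it goes back to the first item element to change its text, appends a new item, and serializes the result. The class DomEditDemo, the method editOrder, and the order/item element names are hypothetical; the parsing and transformer calls are standard JAXP.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.xml.sax.InputSource;

// Hypothetical demo class: edits an in-memory tree, then writes it back out.
class DomEditDemo {

    static String editOrder(String xml, String firstItemText, String newItemText)
            throws Exception {
        // Build the whole document tree in memory.
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(new InputSource(new StringReader(xml)));

        // Random access: jump straight to the first <item> and change it.
        // (Assumes the document contains at least one <item>.)
        doc.getElementsByTagName("item").item(0).setTextContent(firstItemText);

        // Structural change: append a new <item> under the root element.
        Element item = doc.createElement("item");
        item.setTextContent(newItemText);
        doc.getDocumentElement().appendChild(item);

        // Serialize the modified tree back to a string.
        Transformer transformer = TransformerFactory.newInstance().newTransformer();
        transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter out = new StringWriter();
        transformer.transform(new DOMSource(doc), new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        String xml = "<order><item>pen</item><item>paper</item></order>";
        System.out.println(editOrder(xml, "pencil", "ink"));
    }
}
```

Neither edit would be possible with a streaming parser, since by the time the end of the document is reached the first item has already gone by. The price is that the whole tree stays in memory while the code runs.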

Streaming refers to a programming model in which XML infosets are transmitted and parsed serially at application runtime, often in real time, and often from dynamic sources whose contents are not precisely known beforehand. Moreover, stream-based parsers can start generating output immediately, and infoset elements can be discarded and garbage collected immediately after they are used. While providing a smaller memory footprint, reduced processor requirements, and higher performance in certain situations, the primary trade-off with stream processing is that you can only see the infoset state at one location at a time in the document.

Streaming models for XML processing are particularly useful when your application has strict memory limitations, as with a cellphone running the Java Platform, Micro Edition (Java ME platform), or when your application needs to process several requests simultaneously, as with an application server.

JAXB = Java Architecture for XML Binding

JAXB provides a fast and convenient way to create a two-way mapping between XML documents and Java objects. XML data binding creates Java classes from a schema definition. That is, the XML schema can be compiled to generate the corresponding Java classes. Once the classes are generated for the XML schema, an XML document that follows the syntax of the schema can be represented as an instance of a generated class. The process of converting an XML document to a corresponding high-level Java object is called "unmarshalling", while the reverse is called "marshalling".

Here is an example:

 

©2009 Stay the Same | by TNB