
Lesson 1

XML Parsers

An XML parser reads an XML document and analyzes its structure, breaking it down into its component elements. XML parsers check the well-formedness of an XML document and report any errors. Some XML parsers, known as validating parsers, go a step further and check the validity of the document against an internal or external DTD, reporting any inconsistencies. In this module we will discuss the operation of XML parsers.
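As a small illustration of well-formedness checking, the sketch below uses Python's standard-library xml.etree.ElementTree module. The helper name check_well_formed and the sample documents are made up for this example. Note that this parser only checks well-formedness; it does not validate against a DTD.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Hypothetical helper: return (True, None) if xml_text parses,
    otherwise (False, the parser's error message)."""
    try:
        ET.fromstring(xml_text)
        return True, None
    except ET.ParseError as err:
        # The parser reports the problem together with its line and column.
        return False, str(err)


good = "<note><to>Ann</to><body>Hello</body></note>"
bad = "<note><to>Ann</to><body>Hello</note>"  # <body> is never closed

print(check_well_formed(good))
print(check_well_formed(bad))
```

The second call fails because the closing tag does not match the element that is currently open, which is exactly the kind of error a parser is expected to report.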
The purpose of an XML parser is to read an XML document and give applications a convenient way to access and manipulate the data it contains. XML is a markup language that structures data in a hierarchical and self-describing manner. However, the raw XML text is not well suited to direct processing by applications, because it offers no convenient way to reach the data held within the elements.
An XML parser is responsible for reading an XML document and exposing it in a more convenient form, such as an in-memory Document Object Model (DOM) tree or a stream of Simple API for XML (SAX) events. The parser uses the rules of XML syntax to interpret the structure of the document and to identify the relationships between elements.
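The difference between the two models can be sketched with Python's standard-library xml.dom.minidom and xml.sax modules. The sample document, the TitleHandler class, and the variable names are assumptions made for this illustration, not part of the lesson.

```python
import xml.dom.minidom
import xml.sax

DOC = (
    "<library>"
    "<book id='b1'>XML Basics</book>"
    "<book id='b2'>SAX in Depth</book>"
    "</library>"
)

# DOM style: the whole document is loaded into a tree that can be queried.
dom = xml.dom.minidom.parseString(DOC)
titles_dom = [book.firstChild.data for book in dom.getElementsByTagName("book")]


# SAX style: the parser calls back into a handler as it meets each piece.
class TitleHandler(xml.sax.ContentHandler):
    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_book = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "book":
            self._in_book = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so collect it until the end tag.
        if self._in_book:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "book":
            self.titles.append("".join(self._buffer))
            self._in_book = False


handler = TitleHandler()
xml.sax.parseString(DOC.encode("utf-8"), handler)
titles_sax = handler.titles

print(titles_dom)
print(titles_sax)
```

Both approaches recover the same titles. The DOM version keeps the entire tree in memory, while the SAX version keeps only what the handler chooses to store.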
Once the parser has processed the XML document, applications can use the resulting data structure to access and manipulate the data contained within the document. For example, they can retrieve the values of elements and attributes, modify the data contained within the elements, and serialize the modified document back into XML format.
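A minimal sketch of that read, modify, and serialize cycle, using Python's standard-library xml.etree.ElementTree. The order document and its element and attribute names are invented for the example.

```python
import xml.etree.ElementTree as ET

SOURCE = '<order id="42"><item qty="2">Widget</item></order>'

root = ET.fromstring(SOURCE)

# Retrieve the values of an attribute and an element.
order_id = root.get("id")
item = root.find("item")
item_name = item.text

# Modify the data held in the tree.
item.set("qty", "3")
item.text = "Gadget"

# Serialize the modified tree back into XML text.
result = ET.tostring(root, encoding="unicode")
print(order_id, item_name)
print(result)
```

The application never touches angle brackets or quoting rules directly; it works with the tree, and the library produces the XML text again at the end.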
In this way, the XML parser serves as a bridge between the raw XML document and the applications that need to process the data contained within the document. By providing a convenient and standardized way to access and manipulate XML data, XML parsers make it easier for developers to build applications that work with XML documents.


Module Learning Objectives

After completing this module, you will have the skills and knowledge necessary to:
  1. Explain how an XML parser works
  2. Differentiate between the types of parsers and how they are used
  3. Outline the steps for using an XML parser
  4. Explain the Document Object Model (DOM) for parsing XML documents
  5. Explain the Simple API for XML (SAX) model for parsing XML documents

XML Parsers

Before any work can be done with an XML document, it needs to be parsed, which means broken down into its constituent parts with some sort of internal model built up. Although XML files are simply text, it is not usually a good idea to extract information using traditional methods of string manipulation such as Substring, Length, and various uses of regular expressions. Because XML is so rich and flexible, code based on basic string manipulation will be unreliable for all but the most trivial processing.
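To see why string manipulation is fragile, consider the sketch below. It compares a naive regular expression with Python's standard-library xml.etree.ElementTree parser on three documents that carry the same text content. The price element, the sample documents, and the function names are assumptions chosen for the illustration.

```python
import re
import xml.etree.ElementTree as ET

# Three legal ways of writing an element whose text content is "5 < 10".
variants = [
    "<price>5 &lt; 10</price>",                      # entity reference
    "<price ><![CDATA[5 < 10]]></price>",            # space in tag, CDATA section
    "<price\n  currency='USD'>5 &#60; 10</price>",   # attribute, character reference
]


def naive_extract(text):
    """String-based extraction: only copes with one exact spelling."""
    match = re.search(r"<price>(.*?)</price>", text)
    return match.group(1) if match else None


def parsed_extract(text):
    """Parser-based extraction: applies the rules of XML syntax."""
    return ET.fromstring(text).text


naive_results = [naive_extract(v) for v in variants]
parsed_results = [parsed_extract(v) for v in variants]

print(naive_results)
print(parsed_results)
```

The regular expression returns the undecoded entity for the first document and finds nothing in the other two, while the parser yields the same decoded text for all three.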
Instead, a number of XML parsers are available that handle this breakdown and yield more reliable results. You will be using a variety of these parsers throughout this module. In the early days of XML, one justification for writing a parser by hand was that pre-built ones were overkill for the job and had too large a footprint, both in file size and in the amount of memory they used. Today some very efficient and lightweight parsers are available, so developing your own is usually a waste of resources and not a task to be undertaken lightly. In the next lesson, you will learn how an XML parser works.
