
Lesson 2 What is a parser?
Objective Explain how an XML parser works.

What is an XML parser?

XML parsers operate on XML documents by creating a hierarchical representation of these documents.
Once the document hierarchy is created, any element in the XML document can be accessed, changed, deleted, or added. For example, assume that the following elements are included in an XML document:

<ITEM>
<NAME>Computer Cable</NAME>
<LIST-PRICE>$50</LIST-PRICE>
</ITEM>

Document Hierarchy

In this example, the document hierarchy created by the XML parser lets an application obtain the value of the <LIST-PRICE> element for this inventory item, namely $50. A processing application can also add another item to the document or change the existing list price of an item.
The processing application uses a set of application programming interfaces (APIs) exposed by the parser to obtain and manipulate the elements of the document programmatically. Two main APIs exist for this purpose: the Document Object Model (DOM), which presents the document as a tree, and the Simple API for XML (SAX), which reports the document as a stream of events. Both are discussed later in the course.
All XML parsers are required to check that an XML document is well-formed according to the XML specification and to report any errors. In addition, some parsers (validating parsers) check whether a given XML document conforms to a specified DTD and report any inconsistencies they find.
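The operations described above (reading a value, changing it, and adding an item) can be sketched with a DOM-style API. The following is a minimal illustration using Python's standard `xml.dom.minidom` module, not the API of any particular parser discussed in this course. The `<INVENTORY>` wrapper element, the second item, and the helper names `text_of` and `add_item` are assumptions introduced for the example.

```python
from xml.dom import minidom

# Hypothetical inventory document wrapping the lesson's <ITEM> element.
XML_TEXT = """<INVENTORY>
<ITEM>
<NAME>Computer Cable</NAME>
<LIST-PRICE>$50</LIST-PRICE>
</ITEM>
</INVENTORY>"""


def text_of(element):
    """Return the concatenated text directly inside an element."""
    return "".join(
        node.data for node in element.childNodes
        if node.nodeType == node.TEXT_NODE
    )


def add_item(doc, name, price):
    """Append a new <ITEM> with <NAME> and <LIST-PRICE> children to the root."""
    item = doc.createElement("ITEM")
    for tag, value in (("NAME", name), ("LIST-PRICE", price)):
        child = doc.createElement(tag)
        child.appendChild(doc.createTextNode(value))
        item.appendChild(child)
    doc.documentElement.appendChild(item)
    return item


# Parsing builds the document hierarchy in memory.
doc = minidom.parseString(XML_TEXT)

# Access: read the list price of the first item.
price_element = doc.getElementsByTagName("LIST-PRICE")[0]
original_price = text_of(price_element)

# Change: overwrite the text node that holds the price.
price_element.firstChild.data = "$45"

# Add: append a second (made-up) item to the document.
add_item(doc, "Power Adapter", "$20")

print(original_price)
print(doc.documentElement.toxml())
```

The application never edits the XML text directly. It works on the parsed hierarchy and serializes the result back to XML only when needed, here with `toxml()`.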


What is the relationship between the XML parser and the document hierarchy?

The parser is responsible for reading an XML document and converting it into a hierarchical structure, which is typically exposed through the Document Object Model (DOM). The DOM represents the XML document as a tree in which each element of the document is a node.
Each node in the DOM tree has parent-child relationships with other nodes, reflecting the hierarchical structure of the XML document. At the top of the tree is a document node; its single element child, the document element, represents the top-level element of the XML document, and that element's children represent the elements nested directly within it.
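The parent-child structure can be made visible by walking the tree. The sketch below again assumes Python's `xml.dom.minidom` as a stand-in DOM implementation; `outline` is a hypothetical helper written for this example that returns one indented line per element or text node.

```python
from xml.dom import minidom

# The lesson's <ITEM> element, written on one line so that no
# whitespace-only text nodes appear between the child elements.
XML_TEXT = "<ITEM><NAME>Computer Cable</NAME><LIST-PRICE>$50</LIST-PRICE></ITEM>"


def outline(node, depth=0):
    """Return one indented line per element or non-blank text node."""
    lines = []
    if node.nodeType == node.ELEMENT_NODE:
        lines.append("  " * depth + node.tagName)
        for child in node.childNodes:
            lines.extend(outline(child, depth + 1))
    elif node.nodeType == node.TEXT_NODE and node.data.strip():
        lines.append("  " * depth + repr(node.data.strip()))
    return lines


doc = minidom.parseString(XML_TEXT)
root = doc.documentElement      # the <ITEM> element
tree_lines = outline(root)
print("\n".join(tree_lines))

# Every node knows its parent: <NAME> points back to <ITEM>,
# and <ITEM> points back to the document node.
name_element = root.firstChild
print(name_element.parentNode.tagName)
print(root.parentNode is doc)
```

Each level of indentation in the printed outline corresponds to one step from parent to child in the tree.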
The parser uses the rules of XML syntax to interpret the structure of the document and to identify the relationships between elements. It then builds the DOM tree based on these relationships, which can then be accessed and manipulated by applications that use the DOM API.
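Because the tree is built from the rules of XML syntax, a parser cannot build one for a document that breaks those rules; it reports an error instead. The following sketch assumes Python's `xml.dom.minidom`, which checks well-formedness only and does not validate against a DTD. The function name `is_well_formed` and the malformed sample are made up for illustration.

```python
from xml.dom import minidom
from xml.parsers.expat import ExpatError


def is_well_formed(xml_text):
    """Return (True, None) if the text parses, else (False, error message)."""
    try:
        minidom.parseString(xml_text)
    except ExpatError as err:
        # The underlying expat parser reports what went wrong and where.
        return False, str(err)
    return True, None


good = "<ITEM><NAME>Computer Cable</NAME></ITEM>"
bad = "<ITEM><NAME>Computer Cable</ITEM>"  # <NAME> is never closed

print(is_well_formed(good))
print(is_well_formed(bad))
```

The second call returns `False` together with the parser's error message, since the unclosed `<NAME>` tag leaves the parser without a consistent hierarchy to build.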
In this way, the XML parser serves as an intermediary between the raw XML document and the hierarchical representation of the document provided by the DOM.

In the next lesson, you will learn how to differentiate between the types of parsers and how they are used.