According to the XML DOM, everything in an XML document is a node:
- Document node: The entire document.
- Element node: Every XML element.
- Text node: The text inside an XML element.
- Attribute node: Every attribute.
- Comment node: Every comment.
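The node kinds listed above can be inspected with Python's standard `xml.dom.minidom` module. This is only a minimal sketch: the `SAMPLE` document and the `kind` helper are hypothetical names introduced for illustration, not part of the original example.

```python
from xml.dom import minidom, Node

# Hypothetical one-line document that contains each node kind from the list.
SAMPLE = '<note id="n1"><!-- a comment -->Hello</note>'

doc = minidom.parseString(SAMPLE)      # document node
note = doc.documentElement             # element node
attr = note.getAttributeNode("id")     # attribute node
comment = note.childNodes[0]           # comment node
text = note.childNodes[1]              # text node

# Map the DOM nodeType constants to readable names.
NAMES = {
    Node.DOCUMENT_NODE: "document",
    Node.ELEMENT_NODE: "element",
    Node.ATTRIBUTE_NODE: "attribute",
    Node.TEXT_NODE: "text",
    Node.COMMENT_NODE: "comment",
}

def kind(node):
    """Return a readable name for the node's type."""
    return NAMES.get(node.nodeType, "other")

for node in (doc, note, attr, comment, text):
    print(kind(node), node.nodeName)
```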
<?xml version="1.0" encoding="UTF-8"?>
<bookstore>
  <book category="Child">
    <title lang="en">ABC</title>
    <author>Unknown</author>
    <year>2020</year>
    <price>100.00</price>
  </book>
  <book category="IT">
    <title lang="en">XQuery Book</title>
    <author>Author 1</author>
    <author>Author 2</author>
    <author>Author 3</author>
    <author>Author 4</author>
    <year>2004</year>
    <price>350.00</price>
  </book>
</bookstore>
In the above example, <bookstore> is the root element (often called the root node) of the XML. Every other element in the document is nested inside it. The root element <bookstore> holds two <book> elements. The first <book> has four child elements: <title>, <author>, <year>, and <price>, each of which contains a single text node: “ABC”, “Unknown”, “2020”, and “100.00”. The second <book> has seven child elements, because it lists four <author> elements.
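The structure described above can be checked with a short script. The sketch below assumes Python's `xml.dom.minidom`; the `element_children` helper is a hypothetical name, and it skips the whitespace-only text nodes that the parser keeps between indented elements.

```python
from xml.dom import minidom, Node

BOOKSTORE = """<bookstore>
  <book category="Child">
    <title lang="en">ABC</title>
    <author>Unknown</author>
    <year>2020</year>
    <price>100.00</price>
  </book>
  <book category="IT">
    <title lang="en">XQuery Book</title>
    <author>Author 1</author>
    <author>Author 2</author>
    <author>Author 3</author>
    <author>Author 4</author>
    <year>2004</year>
    <price>350.00</price>
  </book>
</bookstore>"""

doc = minidom.parseString(BOOKSTORE)
root = doc.documentElement  # the <bookstore> element

def element_children(node):
    """Return only the element children, ignoring text and comment nodes."""
    return [c for c in node.childNodes if c.nodeType == Node.ELEMENT_NODE]

books = element_children(root)
for book in books:
    names = [child.tagName for child in element_children(book)]
    print(book.getAttribute("category"), len(names), names)
```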
Text is Always Stored in Text Nodes:
A common error in DOM processing is to expect an element node to contain text. The text of an element is instead stored in a text node that is a child of that element. In the above example, in the line <year>2020</year>, the element node <year> holds a text node with the value “2020”; “2020” is not the value of the <year> element itself.
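The distinction between an element and its text node can be seen directly in code. A minimal sketch, assuming Python's `xml.dom.minidom` and a one-element document built just for this illustration:

```python
from xml.dom import minidom, Node

doc = minidom.parseString("<year>2020</year>")
year = doc.documentElement      # the <year> element node
text_node = year.firstChild     # the text node held by <year>

# The element itself has no value; the text lives in its child text node.
print(year.nodeValue)
print(text_node.nodeType == Node.TEXT_NODE)
print(text_node.nodeValue)
```

Here `year.nodeValue` is `None`, while `text_node.nodeValue` is the string `"2020"`.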
The XML DOM Node Tree:
The XML DOM views an XML document as a tree structure, also known as a node tree, and every node can be accessed through that tree. Nodes can also be modified, deleted, or created through the tree. A node tree shows the set of nodes and the connections between them. The tree starts at the root node and branches out to the text nodes at its lowest level.
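Modifying, creating, and deleting nodes through the tree might look like the following. This is a sketch under the assumption that Python's `xml.dom.minidom` is used; the small `<book>` document and the replacement values are made up for illustration.

```python
from xml.dom import minidom

doc = minidom.parseString("<book><title>ABC</title><year>2020</year></book>")
book = doc.documentElement

# Modify: change the text node held by <year>.
year = book.getElementsByTagName("year")[0]
year.firstChild.data = "2021"

# Create: build a new <price> element with its own text node and attach it.
price = doc.createElement("price")
price.appendChild(doc.createTextNode("100.00"))
book.appendChild(price)

# Delete: remove the <title> element from its parent.
title = book.getElementsByTagName("title")[0]
book.removeChild(title)

result = book.toxml()
print(result)
```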
Example: XML Tree:
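One way to see a node tree is to print it with indentation. The sketch below assumes Python's `xml.dom.minidom`; `label` and `tree_lines` are hypothetical helper names, and the sample is a shortened, whitespace-free version of the bookstore document so that no whitespace-only text nodes appear.

```python
from xml.dom import minidom, Node

SAMPLE = ('<bookstore><book category="Child">'
          '<title lang="en">ABC</title><year>2020</year>'
          '</book></bookstore>')

def label(node):
    """Return a short printable label for a node."""
    if node.nodeType == Node.ELEMENT_NODE:
        return f"<{node.tagName}>"
    if node.nodeType == Node.TEXT_NODE:
        return f'"{node.data}"'
    return node.nodeName  # e.g. "#document" for the document node

def tree_lines(node, depth=0):
    """Walk the tree depth-first, indenting each node by its depth."""
    lines = ["  " * depth + label(node)]
    for child in node.childNodes:
        lines.extend(tree_lines(child, depth + 1))
    return lines

lines = tree_lines(minidom.parseString(SAMPLE))
print("\n".join(lines))
```

The walk starts at the document node and ends at the text nodes, which are the leaves of this tree. Attribute nodes are not listed because they are not children of their element.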
Node Parents, Children, and Siblings:
The nodes in a node tree have a hierarchical relationship to each other. The terms parent, child, and sibling describe these relationships: parent nodes have children, and children on the same level are called siblings (brothers and sisters).
- The top node in a node tree is the root node.
- Every node except the root node has exactly one parent node.
- A node can have any number of children.
- A node with no children is called a leaf.
- Nodes with the same parent are called siblings.
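The relationships in this list map onto DOM properties such as `parentNode`, `previousSibling`, `nextSibling`, and `hasChildNodes()`. A minimal sketch, assuming Python's `xml.dom.minidom` and a whitespace-free copy of the first `<book>` element:

```python
from xml.dom import minidom

doc = minidom.parseString(
    "<book><title>ABC</title><author>Unknown</author>"
    "<year>2020</year><price>100.00</price></book>"
)
book = doc.documentElement
author = book.getElementsByTagName("author")[0]

# The document node is the top of the tree and has no parent.
print(doc.parentNode)

# Every other node has exactly one parent.
print(author.parentNode.tagName)

# Siblings share the same parent.
print(author.previousSibling.tagName, author.nextSibling.tagName)

# A node can have any number of children; <book> has four here.
print(len(book.childNodes))

# A leaf has no children; the text node inside <author> is a leaf.
print(author.firstChild.hasChildNodes())
```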
Because XML data is structured as a tree, it can be traversed without knowing the exact structure of the tree or the type of data it contains.
First Child – Last Child:
<bookstore>
  <book category="Child">
    <title lang="en">ABC</title>
    <author>Unknown</author>
    <year>2020</year>
    <price>100.00</price>
  </book>
</bookstore>
In the above example, the first child element of <book> is <title>, and the last child element of <book> is <price>. The <title>, <author>, <year>, and <price> elements are siblings, because they have the same parent node, <book>.
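In code, the first and last children are available through `firstChild` and `lastChild`. One caveat worth hedging: when the XML is indented, many parsers (including Python's `xml.dom.minidom`, assumed here) keep the whitespace between tags as text nodes, so `firstChild` may be a text node rather than <title>. The `first_element` and `last_element` helpers below are hypothetical names for a sketch that skips such nodes.

```python
from xml.dom import minidom, Node

COMPACT = ('<book category="Child"><title lang="en">ABC</title>'
           '<author>Unknown</author><year>2020</year>'
           '<price>100.00</price></book>')

INDENTED = """<book category="Child">
  <title lang="en">ABC</title>
  <author>Unknown</author>
  <year>2020</year>
  <price>100.00</price>
</book>"""

def first_element(node):
    """Return the first child that is an element, or None."""
    child = node.firstChild
    while child is not None and child.nodeType != Node.ELEMENT_NODE:
        child = child.nextSibling
    return child

def last_element(node):
    """Return the last child that is an element, or None."""
    child = node.lastChild
    while child is not None and child.nodeType != Node.ELEMENT_NODE:
        child = child.previousSibling
    return child

compact_book = minidom.parseString(COMPACT).documentElement
indented_book = minidom.parseString(INDENTED).documentElement

# Without whitespace between tags, firstChild and lastChild are elements.
print(compact_book.firstChild.tagName, compact_book.lastChild.tagName)

# With indentation, firstChild is a whitespace-only text node.
print(indented_book.firstChild.nodeType == Node.TEXT_NODE)

# The helpers give the same answer for both layouts.
print(first_element(indented_book).tagName, last_element(indented_book).tagName)
```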