Python program to retrieve a node present in the XML tree

 

How to parse XML documents in Python

Write a note on XML. Design python program to retrieve a node present in the XML tree. (08 Marks)

This question was asked in Python Application Programming 15CS664 Jan 2019 question paper for 8 Marks.

Solution:


eXtensible Markup Language – XML

An XML document looks very similar to an HTML (HyperText Markup Language) document, but XML is more structured than HTML. XML is used to transfer data in a standard form from one machine to another.

Each pair of opening and closing tags represents an element of the XML document. In the examples below, <person> and </person> together delimit one element.

Each element can have some text or attributes (e.g., hide), and can have other nested elements. The nested elements are called child elements and the enclosing element is called the parent element.

If an XML element has no content, then the element may be written with a self-closing tag (e.g., <email />).

In the examples below, person is the root element. It is called the root because it is the outermost element and encloses all the others. The name, phone, and email elements are its child elements. The email element is empty, so it is written with a self-closing tag. In the second example, the phone element has one attribute called type and the email element has one attribute called hide.

Here is a sample XML document:

<person>
    <name> Mahesh </name>
    <phone> +91 9989898989 </phone>
    <email/>
</person>

Another Example with attributes:

<person>
    <name>Mahesh</name>
    <phone type="mobile">
        +91 9989898989
    </phone>
    <email hide="yes"/>
</person>
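
As a rough illustration of how text, attributes, and self-closing tags show up once a document is loaded into Python, the sketch below parses the example above with the standard library's xml.etree.ElementTree module (the same module used in the Parsing XML section). The variable names doc, root, and child_tags are chosen here for illustration only.

```python
import xml.etree.ElementTree as ET

# The attribute example from above, held in a Python string.
doc = '''
<person>
    <name>Mahesh</name>
    <phone type="mobile">
        +91 9989898989
    </phone>
    <email hide="yes"/>
</person>'''

root = ET.fromstring(doc)

# Child elements are reached by iterating over the parent element.
child_tags = [child.tag for child in root]
print(child_tags)

phone = root.find('phone')
# The text keeps the whitespace around the number, so strip it.
print(phone.text.strip())
# Attributes are exposed as a dictionary of strings.
print(phone.attrib)

email = root.find('email')
# A self-closing element has no text, so .text is None.
print(email.text, email.attrib)
```

Note that the phone number is surrounded by newlines and spaces in this version of the document, which is why strip() is applied before printing.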

Unlike HTML tags, which specify how to display data, tags in XML identify the type of data and are used to store and organize it.

An XML document can be viewed as a tree structure: the top-level tag, person, acts as the root of the tree, and other tags such as phone are drawn as children of their parent nodes.

Figure: XML Tree Structure
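
The tree view can also be reproduced in code. The following is a minimal sketch, assuming the person document from earlier with the phone number on one line; the helper name outline is hypothetical and not part of any library. It walks the tree recursively and indents each tag by its depth.

```python
import xml.etree.ElementTree as ET

doc = '''
<person>
    <name>Mahesh</name>
    <phone type="mobile">+91 9989898989</phone>
    <email hide="yes"/>
</person>'''

def outline(element, depth=0):
    """Return one indented line per element, parents before children."""
    lines = ['    ' * depth + element.tag]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines

root = ET.fromstring(doc)
tree_lines = outline(root)
print('\n'.join(tree_lines))
```

The root tag is printed without indentation and its three children are printed one level deeper, which mirrors the parent and child relationship described above.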

Example that stores information about multiple persons:

<persons>
    <person>
        <name>Mahesh</name>
        <phone type="mobile">
            +91 9989898989
        </phone>
        <email hide="yes"/>
    </person>
    <person>
        <name>Rahul</name>
        <phone type="landline">
            +91 9989898989
        </phone>
        <email hide="yes"/>
    </person>
    <person>
        <name>Ram</name>
        <phone type="mobile">
            +91 9989898989
        </phone>
        <email hide="yes"/>
    </person>
</persons>

Parsing XML

Here is a simple Python program that parses an XML document and extracts the values of data elements (tags) from it. The program retrieves nodes present in the XML tree.

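import xml.etree.ElementTree as ET

# XML document held in a string
data = '''
<person>
    <name>Mahesh</name>
    <phone type="mobile">+91 7411043272</phone>
    <email hide="yes"/>
</person>'''

# Parse the string; fromstring returns the root element (<person>)
tree = ET.fromstring(data)

# find() returns the first matching child; .text is its text content
print('Name:', tree.find('name').text)
print('Mobile No:', tree.find('phone').text)
# <email/> is an empty element, so its text is None
print('Email ID:', tree.find('email').text)
# get() reads the value of an attribute
print('Attr:', tree.find('phone').get('type'))
print('Attr:', tree.find('email').get('hide'))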

Output:

Name: Mahesh
Mobile No: +91 7411043272
Email ID: None
Attr: mobile
Attr: yes

The xml.etree.ElementTree module is used to parse the XML document. Its fromstring function takes the string representation of an XML document as input and converts it into a “tree” of XML nodes.
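
When the XML lives in a file rather than in a string, the same module offers parse. The sketch below is only an illustration: io.StringIO stands in for a real file, and the names xml_text, document, root_from_string, and root_from_file are assumptions made for this example.

```python
import io
import xml.etree.ElementTree as ET

xml_text = '<person><name>Mahesh</name><email hide="yes"/></person>'

# fromstring() parses a string and returns the root Element directly.
root_from_string = ET.fromstring(xml_text)

# parse() reads from a filename or file object and returns an ElementTree;
# getroot() then gives the root Element. StringIO stands in for a real file.
document = ET.parse(io.StringIO(xml_text))
root_from_file = document.getroot()

print(root_from_string.tag, root_from_file.tag)
print(root_from_file.find('name').text)
```

Both routes end with an Element for person, so the find and get calls shown in this post work the same way on either one.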

Once the XML is in a tree, we can call a series of methods to extract portions of data from it. The find method returns the first child element with the given tag, and that element's text attribute holds its value. In the above example, tree.find('phone').text returns the phone number.
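
One detail worth keeping in mind is that find returns None when no child has the requested tag, so calling .text on the result would raise an error. The following is a hedged sketch of two ways to handle that, using a shortened person document with no phone element; the fallback string 'not available' and the tag fax are made up for this example.

```python
import xml.etree.ElementTree as ET

data = '''
<person>
    <name>Mahesh</name>
    <email hide="yes"/>
</person>'''

person = ET.fromstring(data)

# find() returns None when no child matches, so check before using .text.
phone = person.find('phone')
phone_text = phone.text if phone is not None else 'not available'

# findtext() does the lookup and falls back to the default in one call.
name_text = person.findtext('name', default='not available')
fax_text = person.findtext('fax', default='not available')

# An empty element exists but has no text; findtext() returns '' for it.
email_text = person.findtext('email', default='not available')

print(phone_text, name_text, fax_text, repr(email_text))
```

The comparison uses "is not None" rather than a plain truth test because an element without children can evaluate as false even when it was found.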

Looping through elements or nodes of XML document

import xml.etree.ElementTree as ET

# Named data rather than input, to avoid shadowing the built-in input()
data = '''
<persons>
    <person>
        <name>Mahesh</name>
        <phone type="mobile">+91 7411043272</phone>
        <email hide="yes"/>
    </person>
    <person>
        <name>Rahul</name>
        <phone type="mobile">+91 7411043272</phone>
        <email hide="no">xyz@abc.com</email>
    </person>
</persons>'''

persons = ET.fromstring(data)
# findall() returns a list of all direct children tagged <person>
lst = persons.findall('person')

print('User count:', len(lst))
for p in lst:
    print('----------------')
    print('Name:', p.find('name').text)
    print('Mobile No:', p.find('phone').text)
    print('Email ID:', p.find('email').text)
    print('Attr:', p.find('phone').get('type'))
    print('Attr:', p.find('email').get('hide'))

Output:

User count: 2
----------------
Name: Mahesh
Mobile No: +91 7411043272
Email ID: None
Attr: mobile
Attr: yes
----------------
Name: Rahul
Mobile No: +91 7411043272
Email ID: xyz@abc.com
Attr: mobile
Attr: no
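
Looping over every person is not the only way to retrieve a particular node. ElementTree also accepts a limited XPath-style syntax in find and findall. The sketch below is one possible approach, not the only one. It assumes the same persons document, except that the second phone is changed to type landline here so that the attribute filter has something to exclude.

```python
import xml.etree.ElementTree as ET

data = '''
<persons>
    <person>
        <name>Mahesh</name>
        <phone type="mobile">+91 7411043272</phone>
        <email hide="yes"/>
    </person>
    <person>
        <name>Rahul</name>
        <phone type="landline">+91 7411043272</phone>
        <email hide="no">xyz@abc.com</email>
    </person>
</persons>'''

persons = ET.fromstring(data)

# [name='Rahul'] keeps only <person> elements whose <name> child has that text.
rahul = persons.find("person[name='Rahul']")
print(rahul.find('email').text)

# .// searches all descendants; [@type='mobile'] filters on an attribute.
mobiles = persons.findall(".//phone[@type='mobile']")
print(len(mobiles))

# iter() visits every element with the given tag anywhere in the tree.
names = [name.text for name in persons.iter('name')]
print(names)
```

As with a plain find, the predicate form returns None when nothing matches, so a real program would check the result before reading .text from it.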
