Python XML file to pandas dataframe

How do I convert an XML file to a pandas DataFrame?
Aug 1, 2019 in Python by Rishi

Here's some example code:

import pandas as pd
import xml.etree.ElementTree as et

xtree = et.parse("student.xml")
xroot = xtree.getroot()

df_cols = ["name", "email", "grade", "age"]
out_df = pd.DataFrame(columns=df_cols)

def child_text(node, tag):
    # find() returns None when the tag is missing, so check before reading .text
    child = node.find(tag)
    return child.text if child is not None else None

for node in xroot:
    s_name = node.attrib.get("name")
    s_mail = child_text(node, "email")
    s_grade = child_text(node, "grade")
    s_age = child_text(node, "age")

    # DataFrame.append() was removed in pandas 2.0; add the row with .loc instead
    out_df.loc[len(out_df)] = [s_name, s_mail, s_grade, s_age]

1. In the above code we import pandas and ElementTree. ElementTree breaks the XML document into a tree structure that is easy to work with.

2. et.parse() reads the XML file into xtree, and getroot() returns its root element. Every part of the tree (root included) has a tag that describes the element.

3. df_cols holds the column names that appear in the XML and that we want to store in the dataframe. out_df is the empty dataframe created with those columns.

4. The for loop goes over each child of the root and stores the extracted values in variables, i.e. s_name, s_mail, etc. Here find() returns the first child with a particular tag, or None if there is no such child.

5. Each extracted row is then appended to out_df.
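
The steps above can also be tried without a file on disk. The following is a minimal sketch, not the only way to do it: the sample XML and the function name students_to_df are made up for illustration (the real student.xml is not shown here), and the rows are collected in a list so the dataframe is built once at the end instead of growing row by row.

```python
import xml.etree.ElementTree as et

import pandas as pd

# Hypothetical sample data; the layout of the real student.xml is an assumption.
SAMPLE_XML = """
<students>
    <student name="John">
        <email>john@example.com</email>
        <grade>A</grade>
        <age>16</age>
    </student>
    <student name="Alice">
        <email>alice@example.com</email>
        <grade>B</grade>
    </student>
</students>
"""


def students_to_df(xml_text):
    """Parse the XML text and return one dataframe row per child of the root."""
    root = et.fromstring(xml_text.strip())
    rows = []
    for node in root:
        rows.append({
            "name": node.attrib.get("name"),
            # findtext() returns None when the child tag is missing
            "email": node.findtext("email"),
            "grade": node.findtext("grade"),
            "age": node.findtext("age"),
        })
    # Build the dataframe once from the collected rows
    return pd.DataFrame(rows, columns=["name", "email", "grade", "age"])


df = students_to_df(SAMPLE_XML)
print(df)
```

Note that every value comes back as text, so a column such as age would still need converting (for example with pd.to_numeric) before doing arithmetic on it.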

answered Aug 1, 2019 by Sharon

