How to parse Excel-format XML using Python

0 votes

I have an Excel workbook saved in XML format, and I would like to parse it using Python.

Could you please let me know how I can parse this XML using Python?

The workbook has two sheets, Data_XY and Data_AB.

The Data_XY sheet has two columns, X and Y, and a few rows.

After parsing this sheet, I want the X values and the Y values in dictionaries.

Below is the XML of the workbook:

<?xml version="1.0"?>
<?mso-application progid="Excel.Sheet"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:o="urn:schemas-microsoft-com:office:office"
 xmlns:x="urn:schemas-microsoft-com:office:excel"
 xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:html="http://www.w3.org/TR/REC-html40">
 <DocumentProperties xmlns="urn:schemas-microsoft-com:office:office">
  <Author>Hikmat Bahadur Rawal</Author>
  <LastAuthor>Hikmat Bahadur Rawal</LastAuthor>
  <Created>2015-06-05T18:17:20Z</Created>
  <LastSaved>2020-09-24T07:16:51Z</LastSaved>
  <Version>16.00</Version>
 </DocumentProperties>
 <OfficeDocumentSettings xmlns="urn:schemas-microsoft-com:office:office">
  <AllowPNG/>
 </OfficeDocumentSettings>
 <ExcelWorkbook xmlns="urn:schemas-microsoft-com:office:excel">
  <WindowHeight>12650</WindowHeight>
  <WindowWidth>22260</WindowWidth>
  <WindowTopX>32767</WindowTopX>
  <WindowTopY>32767</WindowTopY>
  <RefModeR1C1/>
  <ProtectStructure>False</ProtectStructure>
  <ProtectWindows>False</ProtectWindows>
 </ExcelWorkbook>
 <Styles>
  <Style ss:ID="Default" ss:Name="Normal">
   <Alignment ss:Vertical="Bottom"/>
   <Borders/>
   <Font ss:FontName="Calibri" x:Family="Swiss" ss:Size="11" ss:Color="#000000"/>
   <Interior/>
   <NumberFormat/>
   <Protection/>
  </Style>
  <Style ss:ID="s62">
   <Borders>
    <Border ss:Position="Bottom" ss:LineStyle="Continuous" ss:Weight="1"/>
    <Border ss:Position="Left" ss:LineStyle="Continuous" ss:Weight="1"/>
    <Border ss:Position="Right" ss:LineStyle="Continuous" ss:Weight="1"/>
    <Border ss:Position="Top" ss:LineStyle="Continuous" ss:Weight="1"/>
   </Borders>
  </Style>
  <Style ss:ID="s63">
   <Borders>
    <Border ss:Position="Bottom" ss:LineStyle="Continuous" ss:Weight="1"/>
    <Border ss:Position="Left" ss:LineStyle="Continuous" ss:Weight="1"/>
    <Border ss:Position="Right" ss:LineStyle="Continuous" ss:Weight="1"/>
    <Border ss:Position="Top" ss:LineStyle="Continuous" ss:Weight="1"/>
   </Borders>
   <Interior ss:Color="#ED7D31" ss:Pattern="Solid"/>
  </Style>
 </Styles>
 <Worksheet ss:Name="Data_XY">
  <Table ss:ExpandedColumnCount="2" ss:ExpandedRowCount="6" x:FullColumns="1"
   x:FullRows="1" ss:DefaultRowHeight="14.5">
   <Row>
    <Cell ss:StyleID="s63"><Data ss:Type="String">X</Data></Cell>
    <Cell ss:StyleID="s63"><Data ss:Type="String">Y</Data></Cell>
   </Row>
   <Row>
    <Cell ss:StyleID="s62"><Data ss:Type="Number">1</Data></Cell>
    <Cell ss:StyleID="s62"><Data ss:Type="Number">4</Data></Cell>
   </Row>
   <Row>
    <Cell ss:StyleID="s62"><Data ss:Type="Number">2</Data></Cell>
    <Cell ss:StyleID="s62"><Data ss:Type="Number">5</Data></Cell>
   </Row>
   <Row>
    <Cell ss:StyleID="s62"><Data ss:Type="Number">3</Data></Cell>
    <Cell ss:StyleID="s62"><Data ss:Type="Number">6</Data></Cell>
   </Row>
   <Row>
    <Cell ss:StyleID="s62"><Data ss:Type="Number">4</Data></Cell>
    <Cell ss:StyleID="s62"><Data ss:Type="Number">7</Data></Cell>
   </Row>
   <Row>
    <Cell ss:StyleID="s62"><Data ss:Type="Number">5</Data></Cell>
    <Cell ss:StyleID="s62"><Data ss:Type="Number">8</Data></Cell>
   </Row>
  </Table>
  <WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">
   <PageSetup>
    <Header x:Margin="0.3"/>
    <Footer x:Margin="0.3"/>
    <PageMargins x:Bottom="0.75" x:Left="0.7" x:Right="0.7" x:Top="0.75"/>
   </PageSetup>
   <Selected/>
   <Panes>
    <Pane>
     <Number>3</Number>
     <ActiveRow>17</ActiveRow>
     <ActiveCol>5</ActiveCol>
    </Pane>
   </Panes>
   <ProtectObjects>False</ProtectObjects>
   <ProtectScenarios>False</ProtectScenarios>
  </WorksheetOptions>
 </Worksheet>
 <Worksheet ss:Name="Data_AB">
  <Table ss:ExpandedColumnCount="2" ss:ExpandedRowCount="6" x:FullColumns="1"
   x:FullRows="1" ss:DefaultRowHeight="14.5">
   <Row>
    <Cell ss:StyleID="s63"><Data ss:Type="String">A</Data></Cell>
    <Cell ss:StyleID="s63"><Data ss:Type="String">B</Data></Cell>
   </Row>
   <Row>
    <Cell ss:StyleID="s62"><Data ss:Type="Number">1</Data></Cell>
    <Cell ss:StyleID="s62"><Data ss:Type="Number">7</Data></Cell>
   </Row>
   <Row>
    <Cell ss:StyleID="s62"><Data ss:Type="Number">2</Data></Cell>
    <Cell ss:StyleID="s62"><Data ss:Type="Number">8</Data></Cell>
   </Row>
   <Row>
    <Cell ss:StyleID="s62"><Data ss:Type="Number">3</Data></Cell>
    <Cell ss:StyleID="s62"><Data ss:Type="Number">9</Data></Cell>
   </Row>
   <Row>
    <Cell ss:StyleID="s62"><Data ss:Type="Number">4</Data></Cell>
    <Cell ss:StyleID="s62"><Data ss:Type="Number">10</Data></Cell>
   </Row>
   <Row>
    <Cell ss:StyleID="s62"><Data ss:Type="Number">5</Data></Cell>
    <Cell ss:StyleID="s62"><Data ss:Type="Number">11</Data></Cell>
   </Row>
  </Table>
  <WorksheetOptions xmlns="urn:schemas-microsoft-com:office:excel">
   <PageSetup>
    <Header x:Margin="0.3"/>
    <Footer x:Margin="0.3"/>
    <PageMargins x:Bottom="0.75" x:Left="0.7" x:Right="0.7" x:Top="0.75"/>
   </PageSetup>
   <Panes>
    <Pane>
     <Number>3</Number>
     <ActiveRow>8</ActiveRow>
     <ActiveCol>6</ActiveCol>
    </Pane>
   </Panes>
   <ProtectObjects>False</ProtectObjects>
   <ProtectScenarios>False</ProtectScenarios>
  </WorksheetOptions>
 </Worksheet>
</Workbook>

Sep 24, 2020 in Python by Hikmat
• 120 points
6,390 views

1 answer to this question.

0 votes

Hi, @Hikmat,

Regarding your query, parsing this file takes several steps. I would suggest you visit this article: https://towardsdatascience.com/from-xml-to-excel-for-data-analysis-ac0c0c765b7d
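As a rough illustration of what those steps could look like, here is a minimal sketch using only the standard library's xml.etree.ElementTree. It is not taken from the linked article. The function name parse_workbook and the trimmed SAMPLE string are made up for this example, and the code assumes that the first row of each sheet holds the column headers and that no cells are skipped (cells positioned with ss:Index are not handled).

```python
import xml.etree.ElementTree as ET

# Namespace URI copied from the xmlns / xmlns:ss declarations in the question's XML.
SS = "urn:schemas-microsoft-com:office:spreadsheet"
NS = {"ss": SS}

# Trimmed-down copy of the workbook from the question (illustration only).
SAMPLE = """<?xml version="1.0"?>
<?mso-application progid="Excel.Sheet"?>
<Workbook xmlns="urn:schemas-microsoft-com:office:spreadsheet"
 xmlns:ss="urn:schemas-microsoft-com:office:spreadsheet">
 <Worksheet ss:Name="Data_XY">
  <Table>
   <Row>
    <Cell><Data ss:Type="String">X</Data></Cell>
    <Cell><Data ss:Type="String">Y</Data></Cell>
   </Row>
   <Row>
    <Cell><Data ss:Type="Number">1</Data></Cell>
    <Cell><Data ss:Type="Number">4</Data></Cell>
   </Row>
   <Row>
    <Cell><Data ss:Type="Number">2</Data></Cell>
    <Cell><Data ss:Type="Number">5</Data></Cell>
   </Row>
  </Table>
 </Worksheet>
</Workbook>"""


def parse_workbook(xml_text):
    """Return {sheet name: {column header: [values]}} (hypothetical helper)."""
    root = ET.fromstring(xml_text)
    sheets = {}
    for worksheet in root.findall("ss:Worksheet", NS):
        # Namespaced attributes are stored under "{uri}localname" keys.
        name = worksheet.get(f"{{{SS}}}Name")
        rows = worksheet.findall("ss:Table/ss:Row", NS)
        if not rows:
            sheets[name] = {}
            continue
        # Assumption: the first row contains the column headers.
        headers = [data.text for data in rows[0].findall("ss:Cell/ss:Data", NS)]
        columns = {header: [] for header in headers}
        for row in rows[1:]:
            cells = row.findall("ss:Cell/ss:Data", NS)
            for header, data in zip(headers, cells):
                value = data.text
                # Cells marked ss:Type="Number" are converted to float.
                if data.get(f"{{{SS}}}Type") == "Number":
                    value = float(value)
                columns[header].append(value)
        sheets[name] = columns
    return sheets


if __name__ == "__main__":
    workbook = parse_workbook(SAMPLE)
    print(workbook["Data_XY"])
```

To read from a file instead of a string, something like ET.parse("workbook.xml").getroot() could replace the ET.fromstring call; the file name here is only a placeholder. The same function would also pick up the Data_AB sheet, since it loops over every Worksheet element.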

answered Sep 24, 2020 by Roshni
• 10,520 points
