How to open a huge excel file efficiently

0 votes

I have a 150MB one-sheet excel file that takes about 7 minutes to open on a very powerful machine using the following:

# using python
import xlrd
wb = xlrd.open_workbook(file)
sh = wb.sheet_by_index(0)

Can the excel file be opened more quickly? I'm willing to consider even crazy suggestions (such as Hadoop, spark, c, java, etc.). If it's possible, I'd like to be able to open the file in less than 30 seconds. Additionally, although the aforementioned example uses Python, another language may be used.

Oct 27, 2022 in Others by Kithuzzz
• 38,010 points
366 views

1 answer to this question.

0 votes

A good example is using PIA's/Interop or Open XML SDK. Most programming languages that integrate with Office products include some intermediary layer, and this is typically where the bottleneck occurs.

Using a Driver is one method of getting the data at a lower level (bypassing the intermediary layer).

150MB one-sheet excel file that takes about 7 minutes.

The best I could do is a 130MB file in 135 seconds, roughly 3 times faster:

Stopwatch sw = new Stopwatch();
sw.Start();

DataSet excelDataSet = new DataSet();

string filePath = @"c:\temp\BigBook.xlsx";

// For .XLSXs we use =Microsoft.ACE.OLEDB.12.0;, for .XLS we'd use Microsoft.Jet.OLEDB.4.0; with  "';Extended Properties=\"Excel 8.0;HDR=YES;\"";
string connectionString = "Provider=Microsoft.ACE.OLEDB.12.0;Data Source='" + filePath + "';Extended Properties=\"Excel 12.0;HDR=YES;\"";

using (OleDbConnection conn = new OleDbConnection(connectionString))
{
    conn.Open();
    OleDbDataAdapter objDA = new System.Data.OleDb.OleDbDataAdapter
    ("select * from [Sheet1$]", conn);
    objDA.Fill(excelDataSet);
    //dataGridView1.DataSource = excelDataSet.Tables[0];
}
sw.Stop();
Debug.Print("Load XLSX tool: " + sw.ElapsedMilliseconds + " millisecs. Records = "  + excelDataSet.Tables[0].Rows.Count);
answered Oct 27, 2022 by narikkadan
• 63,420 points

Related Questions In Others

0 votes
0 answers

How to reduce a huge excel file

I have a short, straightforward *.XLS file ...READ MORE

Nov 27, 2022 in Others by Kithuzzz
• 38,010 points
330 views
0 votes
1 answer

How to fix a circular reference error by if condition in excel file?

Circular reference in this context refers to ...READ MORE

answered Nov 21, 2022 in Others by narikkadan
• 63,420 points
442 views
0 votes
1 answer

How to save a new sheet in an existing excel file, using Pandas?

import pandas as pd import numpy as np path ...READ MORE

answered Dec 10, 2022 in Others by narikkadan
• 63,420 points
6,350 views
0 votes
1 answer

How do I merge multiple excel files to a single excel file

You copy a worksheet from before each ...READ MORE

answered Dec 24, 2022 in Others by narikkadan
• 63,420 points
575 views
0 votes
2 answers
+1 vote
2 answers

how can i count the items in a list?

Syntax :            list. count(value) Code: colors = ['red', 'green', ...READ MORE

answered Jul 7, 2019 in Python by Neha
• 330 points

edited Jul 8, 2019 by Kalgi 4,023 views
0 votes
1 answer
0 votes
1 answer

How to open an Excel Online document as a PDF

In light of your comment above, I ...READ MORE

answered Oct 18, 2022 in Others by narikkadan
• 63,420 points
327 views
0 votes
1 answer

How to open an Excel file in C#?

You must have Microsoft Visual Studio Tools ...READ MORE

answered Nov 13, 2022 in Others by narikkadan
• 63,420 points
996 views
webinar REGISTER FOR FREE WEBINAR X
REGISTER NOW
webinar_success Thank you for registering Join Edureka Meetup community for 100+ Free Webinars each month JOIN MEETUP GROUP