Is there a way to store a huge dataset as a DataFrame using pandas?

My script imports a fairly large dataset into a DataFrame every time it runs. How can I store the DataFrame so that it stays available between runs, without repeating the import each time?
Jun 22, 2019 in Python by Shabnam

1 answer to this question.

This can easily be done with to_pickle:

df.to_pickle(file_name)  # where to save it, usually as a .pkl

And to load it back:

df = pd.read_pickle(file_name)
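
To keep the DataFrame available between runs, the two calls above can be wrapped in a small caching helper. This is only a minimal sketch: load_cached, build_frame and the file name dataset_cache.pkl are hypothetical names chosen for illustration, and build_frame stands in for whatever slow import your script really performs.

```python
from pathlib import Path

import pandas as pd


def load_cached(cache_path, build_frame):
    """Return the DataFrame pickled at cache_path, building and saving it on first use."""
    cache_path = Path(cache_path)
    if cache_path.exists():
        # Later runs: read the pickle instead of repeating the slow import.
        return pd.read_pickle(cache_path)
    df = build_frame()        # first run: do the expensive import once
    df.to_pickle(cache_path)  # save it for the next run
    return df


def build_frame():
    # Hypothetical stand-in for the real import (for example pd.read_csv on a large file).
    return pd.DataFrame({"id": [1, 2, 3], "value": [0.1, 0.2, 0.3]})


if __name__ == "__main__":
    df = load_cached("dataset_cache.pkl", build_frame)
    print(df.shape)
```

Delete the .pkl file whenever the underlying data changes so that the next run rebuilds it. Also, only load pickle files you created yourself or otherwise trust, because unpickling can execute arbitrary code.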

Another way to do this is to use HDF5, which offers fairly fast access to huge datasets:

store = pd.HDFStore('store.h5')  # requires the PyTables package

store['df'] = df  # save it
df = store['df']  # load it
store.close()  # close the file when you are done
answered Jun 22, 2019 by Taj
