Is there a way to store a huge dataset as a dataframe using Pandas?

0 votes
I import a fairly large dataset as a dataframe every time I run my script. How can I store it so that the dataframe stays available between runs?
Jun 21, 2019 in Python by Shabnam
• 920 points

1 answer to this question.

0 votes

This can be done easily with to_pickle:

df.to_pickle(file_name)  # where to save it, usually as a .pkl

And to load it back:

df = pd.read_pickle(file_name)
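
To tie the two calls together, here is a minimal sketch of a load-or-build cache. It assumes the original data lives in a CSV file. The helper name load_cached and both file paths are hypothetical, not part of pandas:

```python
from pathlib import Path

import pandas as pd


def load_cached(source_csv, cache_file):
    """Return the dataframe from the pickle cache, creating the cache on first use.

    Both arguments are hypothetical paths chosen by the caller.
    """
    cache = Path(cache_file)
    if cache.exists():
        # Fast path: a previous run already pickled the dataframe.
        return pd.read_pickle(cache)
    # Slow path: parse the original data once, then cache it for later runs.
    df = pd.read_csv(source_csv)
    df.to_pickle(cache)
    return df
```

On the first run, df = load_cached("data.csv", "data.pkl") parses the CSV and writes the pickle. Later runs read the pickle directly. Only unpickle files you created yourself, since loading a pickle from an untrusted source can execute arbitrary code.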

Another way to do this is by using HDF5, which offers fairly fast access to huge datasets (pandas needs the PyTables package for this):

store = pd.HDFStore('store.h5')

store['df'] = df  # save it
df = store['df']  # load it
store.close()  # close the file when done
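
As a slightly fuller sketch of the HDF5 route, the store can be opened in a with block so the file is closed automatically. The helper names save_to_hdf and load_from_hdf and the default key "df" are assumptions made for illustration, and PyTables must be installed for pd.HDFStore to work:

```python
import pandas as pd


def save_to_hdf(df, path, key="df"):
    """Write df to an HDF5 file under the given key, replacing any existing file."""
    with pd.HDFStore(path, mode="w") as store:
        store.put(key, df)


def load_from_hdf(path, key="df"):
    """Read the dataframe stored under key from an existing HDF5 file."""
    with pd.HDFStore(path, mode="r") as store:
        return store.get(key)
```

For example, save_to_hdf(df, "store.h5") in one run and df = load_from_hdf("store.h5") in the next.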
answered Jun 21, 2019 by Taj
• 1,060 points
