Python read file as stream from HDFS

Question

I have a file in HDFS that I need to read and process with Python, but because of some issue on the HDFS side I have been unable to do so. I get a warning saying there is not enough memory to fit the whole file.

I would like to clear the cache and process this file without using any additional libraries. I was thinking of calling the standard Hadoop command-line tools through Python's subprocess module, but that does not get me what I need: no command-line tool does my processing, and I want to run a Python function on every line in a streaming fashion.

Is there a way to apply Python functions as right operands of the pipes using the subprocess module? Or even better, open it like a file as a generator so I could process each line easily?

ravikiran · Answer 1 · May 30, 2019

I could point you to a Python library that should help with this. If that does not work for you, I suggest getting the stdout pipe from your Popen object and iterating over it line by line:

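import subprocess
# Stream the file via "hadoop fs -cat"; the pipe yields one line (bytes in Python 3) at a time.
cat = subprocess.Popen(["hadoop", "fs", "-cat", "/path/to/myfile"], stdout=subprocess.PIPE)
for line in cat.stdout:
    print(line)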
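
The question also asks whether the file can be opened like a generator. Here is a minimal sketch of one way to do that, assuming the `hadoop` executable is on the PATH and the file is UTF-8 text. The names `stream_lines` and `hdfs_lines` are hypothetical helpers made up for this example; they are not part of Hadoop or of any Python library.

```python
import subprocess


def stream_lines(cmd, encoding="utf-8"):
    """Yield the stdout of `cmd` one decoded line at a time.

    Only one line is held in memory at once, so the size of the
    output does not matter. The encoding is an assumption.
    """
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE)
    try:
        for raw in proc.stdout:
            yield raw.decode(encoding).rstrip("\r\n")
    finally:
        # Runs on normal completion and also if the caller stops early.
        proc.stdout.close()
        returncode = proc.wait()
    if returncode != 0:
        # Surface failures of the child process (e.g. a missing file).
        raise subprocess.CalledProcessError(returncode, cmd)


def hdfs_lines(path):
    """Hypothetical wrapper: stream an HDFS file via `hadoop fs -cat`."""
    return stream_lines(["hadoop", "fs", "-cat", path])


# Usage (requires a working Hadoop client, so it is left commented out):
#
# for line in hdfs_lines("/path/to/myfile"):
#     my_function(line)   # my_function is your own per-line function
```

Because `hdfs_lines` returns a generator, any Python function can be applied to each line as it arrives, which is roughly the "right operand of the pipe" behaviour asked about. The return-code check is a design choice for this sketch: without it, a failed `hadoop fs -cat` would look the same as an empty file.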
