Read file content from S3 bucket with boto3

+1 vote

I read the filenames in my S3 bucket by doing

objs = boto3.client('s3').list_objects(Bucket='my_bucket')
if 'Contents' in objs:
    for obj in objs['Contents']:
        filename = obj['Key']

Now I need to get the actual content of each file, similar to open(filename).readlines(). What is the best way?

Oct 23, 2018 in AWS by datageek
• 2,530 points

2 answers to this question.

+1 vote

boto3 offers a resource model that makes tasks like iterating through objects easier. Unfortunately, StreamingBody doesn't provide readline or readlines.

import boto3

s3 = boto3.resource('s3')
bucket = s3.Bucket('test-bucket')
# Iterates through all the objects, doing the pagination for you. Each obj
# is an ObjectSummary, so it doesn't contain the body. You'll need to call
# get() to fetch the whole body.
for obj in bucket.objects.all():
    key = obj.key
    body = obj.get()['Body'].read()  # bytes
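
To get something comparable to open(filename).readlines(), one option is to read the whole body, decode it, and split it into lines. The following is a minimal sketch, assuming the objects are UTF-8 text and small enough to fit in memory. body_to_lines and iter_bucket_lines are hypothetical helper names, and boto3 is imported inside the function only so the pure helper can be tried without AWS access.

```python
import io


def body_to_lines(body, encoding="utf-8"):
    """Read a file-like body fully and return its lines, roughly like readlines().

    Works with anything exposing .read() that returns bytes, such as the
    StreamingBody returned by obj.get()['Body'].
    """
    text = body.read().decode(encoding)
    # keepends=True keeps the trailing newline on each line, as readlines() does.
    return text.splitlines(keepends=True)


def iter_bucket_lines(bucket_name):
    """Yield (key, lines) for every object in the bucket (needs AWS credentials)."""
    import boto3  # imported here so body_to_lines stays usable without boto3

    bucket = boto3.resource("s3").Bucket(bucket_name)
    for obj in bucket.objects.all():
        yield obj.key, body_to_lines(obj.get()["Body"])


# Local check with an in-memory stand-in for the S3 body.
sample = io.BytesIO(b"first line\nsecond line\n")
lines = body_to_lines(sample)
print(lines)
```

With real data you would loop over iter_bucket_lines('test-bucket') instead of the in-memory sample.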


Hope this helps!

answered Oct 23, 2018 by anonymous
Where do we have to pass the access key and endpoint URL?

You can pass the credentials when you create the client:

client = boto3.client(
    's3',
    aws_access_key_id="***",
    aws_secret_access_key="****",
)
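
The endpoint URL can be supplied in the same call through the endpoint_url parameter of boto3.client. Here is a minimal sketch, assuming placeholder credentials and a made-up endpoint. build_client_kwargs and make_client are illustrative helper names, not part of boto3.

```python
def build_client_kwargs(access_key, secret_key, endpoint_url=None):
    """Collect keyword arguments for boto3.client('s3', ...)."""
    kwargs = {
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
    }
    if endpoint_url is not None:
        # Only needed for a non-default endpoint, e.g. an S3-compatible service.
        kwargs["endpoint_url"] = endpoint_url
    return kwargs


def make_client(access_key, secret_key, endpoint_url=None):
    """Create the S3 client (requires boto3 to be installed)."""
    import boto3  # imported lazily so the helper above runs without boto3

    return boto3.client(
        "s3", **build_client_kwargs(access_key, secret_key, endpoint_url)
    )


# Placeholder values; the endpoint below is hypothetical.
kwargs = build_client_kwargs("***", "****", endpoint_url="https://s3.example.com")
print(sorted(kwargs))
```

If the keys are left out, boto3 falls back to its usual credential sources such as environment variables or the shared credentials file, which avoids hardcoding secrets in the script.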
+1 vote
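s3_client = boto3.resource('s3')  # must be a resource, not a client or a string
bucket = s3_client.Bucket('test')
for obj in bucket.objects.all():
    contents = obj.get()['Body'].read().decode('utf-8')  # assumes UTF-8 text
    for line in contents.splitlines():
        print(line)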
answered Jul 4, 2019 by reddy

edited Jul 4, 2019 by Kalgi
I tried iterating through a bucket using the above code, but I am getting the error below:

AttributeError: 'str' object has no attribute 'objects'

Kindly check and advise.

Hi Vishal,

Check how you create the bucket in the following line:

bucket = s3_client.Bucket('test')

As far as I know, the error comes from this line. s3_client.Bucket should be passed the bucket name, and the Bucket object it returns, not a plain string, is what you iterate over.
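
The error message itself points at the cause: bucket was a plain string at the moment .objects was accessed. Below is a small sketch that reproduces the message without touching AWS, followed by the shape of the fix. objects_error_for and get_bucket are hypothetical helper names.

```python
def objects_error_for(bucket):
    """Return the AttributeError message raised by bucket.objects, or None."""
    try:
        bucket.objects
    except AttributeError as err:
        return str(err)
    return None


def get_bucket(bucket_name):
    """Return a Bucket resource, which does have .objects (requires boto3)."""
    import boto3  # imported lazily so the check above runs without boto3

    return boto3.resource("s3").Bucket(bucket_name)


# A bucket *name* is just a str, so .objects fails as in the report above.
message = objects_error_for("test")
print(message)
```

So a line like bucket = 'test' has to become bucket = get_bucket('test') (or s3_client.Bucket('test') with s3_client created by boto3.resource('s3')) before calling bucket.objects.all().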
