How can I access S3/S3n from a local Hadoop 2.6 installation?

0 votes

I am trying to reproduce an Amazon EMR cluster on my local machine. For that purpose, I have installed the latest stable version of Hadoop as of now - 2.6.0. Now I would like to access an S3 bucket, as I do inside the EMR cluster.

I have added the aws credentials in core-site.xml:

<property>
  <name>fs.s3.awsAccessKeyId</name>
  <value>some id</value>
</property>

<property>
  <name>fs.s3n.awsAccessKeyId</name>
  <value>some id</value>
</property>

<property>
  <name>fs.s3.awsSecretAccessKey</name>
  <value>some key</value>
</property>

<property>
  <name>fs.s3n.awsSecretAccessKey</name>
  <value>some key</value>
</property>

Note: Since there are some slashes on the key, I have escaped them with %2F

If I try to list the contents of the bucket:

hadoop fs -ls s3://some-url/bucket/

I get this error:

ls: No FileSystem for scheme: s3

I edited core-site.xml again, and added information related to the fs:

<property>
  <name>fs.s3.impl</name>
  <value>org.apache.hadoop.fs.s3.S3FileSystem</value>
</property>

<property>
  <name>fs.s3n.impl</name>
  <value>org.apache.hadoop.fs.s3native.NativeS3FileSystem</value>
</property>

This time I get a different error:

-ls: Fatal internal error
java.lang.RuntimeException: java.lang.ClassNotFoundException: Class org.apache.hadoop.fs.s3.S3FileSystem not found
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:2074)
        at org.apache.hadoop.fs.FileSystem.getFileSystemClass(FileSystem.java:2578)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:2591)

Somehow I suspect the Yarn distribution does not have the necessary jars to be able to read S3, but I have no idea where to get those.

Can someone help me out with this?

Oct 3, 2018 in Big Data Hadoop by slayer
• 29,170 points
1,100 views

1 answer to this question.

0 votes

For some reason, the jar hadoop-aws-[version].jar which contains the implementation to NativeS3FileSystem is not present in the classpath of hadoop by default in the version 2.6 & 2.7. So, try and add it to the classpath by adding the following line in hadoop-env.sh which is located in $HADOOP_HOME/etc/hadoop/hadoop-env.sh:

export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$HADOOP_HOME/share/hadoop/tools/lib/*

Assuming you are using Apache Hadoop 2.6 or 2.7

By the way, you could check the classpath of Hadoop using:

bin/hadoop classpath
answered Oct 3, 2018 by digger
• 26,550 points

Related Questions In Big Data Hadoop

0 votes
1 answer

How can I download hadoop documentation for a specific version?

You can go through this SVN link:- ...READ MORE

answered Mar 21, 2018 in Big Data Hadoop by Shubham
• 13,310 points
95 views
0 votes
1 answer
0 votes
1 answer

How to checkout Hadoop 2.6.0 from git

Clone the following Git repository: git clone git ...READ MORE

answered Apr 23, 2018 in Big Data Hadoop by kurt_cobain
• 9,260 points
93 views
0 votes
1 answer

How to upgrade Apache Hadoop from 2.4.1 to 2.6.0?

If the downtime is not an issue, ...READ MORE

answered Sep 7, 2018 in Big Data Hadoop by Frankie
• 9,810 points
33 views
0 votes
1 answer

Hadoop Mapreduce word count Program

Firstly you need to understand the concept ...READ MORE

answered Mar 16, 2018 in Data Analytics by nitinrawat895
• 10,730 points
3,372 views
0 votes
10 answers

hadoop fs -put command?

put syntax: put <localSrc> <dest> copy syntax: copyFr ...READ MORE

answered Dec 7, 2018 in Big Data Hadoop by Aditya
16,791 views
0 votes
1 answer

Hadoop dfs -ls command?

In your case there is no difference ...READ MORE

answered Mar 16, 2018 in Big Data Hadoop by kurt_cobain
• 9,260 points
1,226 views
0 votes
1 answer
0 votes
1 answer

How to execute python script in hadoop file system (hdfs)?

If you are simply looking to distribute ...READ MORE

answered Sep 19, 2018 in Big Data Hadoop by digger
• 26,550 points
2,440 views
0 votes
1 answer

How to create smaller table from big table in HIVE?

You could probably best use Hive's built-in sampling ...READ MORE

answered Sep 24, 2018 in Big Data Hadoop by digger
• 26,550 points
218 views