How to programmatically access a Hadoop cluster where Kerberos is enabled

0 votes

Here is the code I use to fetch a file from a Hadoop filesystem. On my local single-node setup, this code fetches the file successfully.

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HDFSdownloader {

    public static void main(String[] args) throws Exception {
        if (args.length != 3) {
            System.out.println("use: HDFSdownloader hdfs src dst");
            System.exit(1);
        }

        System.out.println(HDFSdownloader.class.getName());
        HDFSdownloader dw = new HDFSdownloader();
        dw.copy2local(args[0], args[1], args[2]);
    }

    // Copies src from the given HDFS URI to the local path dst.
    private void copy2local(String hdfs, String src, String dst) throws IOException {
        System.out.println("!! Entering function !!");
        Configuration conf = new Configuration();
        conf.set("fs.hdfs.impl", org.apache.hadoop.hdfs.DistributedFileSystem.class.getName());
        conf.set("fs.file.impl", org.apache.hadoop.fs.LocalFileSystem.class.getName());
        // fs.default.name is the deprecated name of fs.defaultFS
        conf.set("fs.default.name", hdfs);

        FileSystem.get(conf).copyToLocalFile(new Path(src), new Path(dst));
        System.out.println("!! copytoLocalFile Reached!!");
    }
}

I then bundled the same code into a jar and ran it on another node (say, B). This time the code had to fetch a file from a fully distributed Hadoop cluster that has Kerberos enabled.

There I got the following error:

Exception in thread "main" org.apache.hadoop.security.AccessControlException: SIMPLE authentication is not enabled.  Available:[TOKEN, KERBEROS]
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at org.apache.hadoop.ipc.RemoteException.instantiateException(RemoteException.java:106)
at org.apache.hadoop.ipc.RemoteException.unwrapRemoteException(RemoteException.java:73)
at org.apache.hadoop.hdfs.DFSClient.getFileInfo(DFSClient.java:2115)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1305)
at org.apache.hadoop.hdfs.DistributedFileSystem$22.doCall(DistributedFileSystem.java:1301)
at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1317)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:337)
at org.apache.hadoop.fs.FileUtil.copy(FileUtil.java:289)
at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:2030)
at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1999)
at org.apache.hadoop.fs.FileSystem.copyToLocalFile(FileSystem.java:1975)
at com.hdfs.test.hdfs_util.HDFSdownloader.copy2local(HDFSdownloader.java:49)
at com.hdfs.test.hdfs_util.HDFSdownloader.main(HDFSdownloader.java:35)

Can you please help me out here?

Mar 27, 2018 in Big Data Hadoop by nitinrawat895
• 11,380 points
7,432 views

1 answer to this question.

0 votes

Okay, here's a code snippet that works in the scenario you described. Let me point out a few important points before you look at the code:

  • Provide the keytab file location to UserGroupInformation
  • Provide the Kerberos realm details to the JVM by pointing it at the krb5.conf file
  • Set the Hadoop security authentication mode to kerberos
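Two of the points above depend on files being present on the client machine, so it can help to check them before attempting any login. The following is a minimal sketch using only the JDK; the class name KerberosPreflight and its methods are hypothetical helpers, not part of Hadoop. It also sets java.security.krb5.conf from code, assuming this runs before any Kerberos login is attempted in the JVM, as an alternative to passing the property on the command line.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class KerberosPreflight {

    // Hypothetical helper, not part of Hadoop: fails fast if a required file is missing.
    static Path requireReadable(String description, String location) throws IOException {
        Path path = Paths.get(location);
        if (!Files.isRegularFile(path) || !Files.isReadable(path)) {
            throw new IOException(description + " not found or not readable: " + path);
        }
        return path;
    }

    // Checks both files, then sets java.security.krb5.conf, the JVM system
    // property that -Djava.security.krb5.conf=... would otherwise set.
    static void prepare(String krb5ConfLocation, String keytabLocation) throws IOException {
        Path krb5Conf = requireReadable("krb5.conf", krb5ConfLocation);
        requireReadable("keytab", keytabLocation);
        System.setProperty("java.security.krb5.conf", krb5Conf.toAbsolutePath().toString());
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.out.println("use: KerberosPreflight krb5.conf keytab");
            return;
        }
        prepare(args[0], args[1]);
        System.out.println("java.security.krb5.conf="
                + System.getProperty("java.security.krb5.conf"));
    }
}
```

In a client like the one in this answer, a call such as prepare(...) would go at the top of main, before UserGroupInformation.loginUserFromKeytab is invoked, so that a missing keytab is reported with a clear message rather than a login failure.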

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.UserGroupInformation;

public class KerberosHDFSIO {

    public static void main(String[] args) throws IOException {

        Configuration conf = new Configuration();
        // The following property is enough for a non-kerberized setup
        //      conf.set("fs.defaultFS", "hdfs://localhost:9000");

        // The following properties are needed to access a kerberized cluster
        conf.set("fs.defaultFS", "hdfs://devha:8020");
        conf.set("hadoop.security.authentication", "kerberos");

        // The location of the krb5.conf file must be passed to the JVM as a VM argument:
        // -Djava.security.krb5.conf=/Users/user/Desktop/utils/cluster/dev/krb5.conf

        // Log in from the keytab before creating the FileSystem
        UserGroupInformation.setConfiguration(conf);
        UserGroupInformation.loginUserFromKeytab("user@HADOOP_DEV.ABC.COM",
                "/Users/user/Desktop/utils/cluster/dev/.user.keytab");

        try (FileSystem fs = FileSystem.get(conf)) {
            FileStatus[] fileStatuses = fs.listStatus(new Path("/user/username/dropoff"));
            for (FileStatus fileStatus : fileStatuses) {
                System.out.println(fileStatus.getPath().getName());
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
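The "SIMPLE authentication is not enabled" error in the question is consistent with a client that never set hadoop.security.authentication, since Hadoop falls back to "simple" when that property is unset. If a copy of the cluster's core-site.xml is at hand, one way to confirm which values the cluster expects is to read them with the JDK's XML parser. This is only a diagnostic sketch: the class name CoreSiteReader is hypothetical, and the sample XML is made up for illustration, reusing the values from the code above.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class CoreSiteReader {

    // Reads <property><name>..</name><value>..</value></property> pairs
    // from a Hadoop-style configuration file.
    static Map<String, String> readProperties(InputStream xml) throws Exception {
        Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().parse(xml);
        NodeList properties = doc.getElementsByTagName("property");
        Map<String, String> result = new HashMap<>();
        for (int i = 0; i < properties.getLength(); i++) {
            Element property = (Element) properties.item(i);
            NodeList names = property.getElementsByTagName("name");
            NodeList values = property.getElementsByTagName("value");
            if (names.getLength() > 0 && values.getLength() > 0) {
                result.put(names.item(0).getTextContent().trim(),
                        values.item(0).getTextContent().trim());
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        // Sample content for illustration only; a real core-site.xml comes from the cluster.
        String sample = "<configuration>"
                + "<property><name>fs.defaultFS</name><value>hdfs://devha:8020</value></property>"
                + "<property><name>hadoop.security.authentication</name><value>kerberos</value></property>"
                + "</configuration>";
        Map<String, String> props = readProperties(
                new ByteArrayInputStream(sample.getBytes(StandardCharsets.UTF_8)));
        // "simple" is the fallback used here when the property is absent.
        String auth = props.getOrDefault("hadoop.security.authentication", "simple");
        System.out.println("authentication = " + auth);
        System.out.println("fs.defaultFS = " + props.get("fs.defaultFS"));
    }
}
```

If the file reports kerberos while the client code never sets that mode, the mismatch matches the error shown in the question.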
answered Mar 27, 2018 by coldcode
• 2,090 points
Thanks for this code. After a lot of searching on the internet, I found your code, which actually works.
Hey @Hari, please do upvote the answer if it has helped you. This will help me get more points and better reputation.
Works like a charm!
