Copy file from local to hdfs from the spark job in yarn mode


How can I copy a file from the local file system to HDFS from within a Spark job running in YARN mode? In other words, I am looking for the Spark equivalent of the hdfs dfs -put command. I have a file on the local file system that I need to preprocess, so I need to put it in HDFS first and then apply the transformation logic.

Jul 16, 2019 in Big Data Hadoop by Kriti

1 answer to this question.


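Please refer to the code below:

import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.FileSystem
import org.apache.hadoop.fs.Path

// Load the Hadoop settings (core-site.xml, hdfs-site.xml) found on the classpath
val hadoopConf = new Configuration()
// Handle to the default file system, which is HDFS on a configured cluster
val hdfs = FileSystem.get(hadoopConf)

// Source on the local file system and destination in HDFS
val srcPath = new Path("/home/edureka/Documents/data")
val destPath = new Path("hdfs:///tranferrred_data")

// Equivalent of: hdfs dfs -put <srcPath> <destPath>
hdfs.copyFromLocalFile(srcPath, destPath)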

Include this snippet in your Spark job, adapted to your requirements, and deploy the application to the cluster with spark-submit. Each time the application runs, the data is copied from the local file system to HDFS, after which you can apply your transformations.
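As a rough sketch of how the copy might sit inside a complete Spark application, the example below reuses the Hadoop configuration that Spark has already loaded instead of creating a new one. It assumes Spark 2.x or later (for SparkSession); the object name CopyLocalToHdfs, the overwrite flag, and the read-back step are illustrative assumptions, not part of the original answer. Keep in mind that the copy runs on the driver: with --deploy-mode client the source path is read from the machine where you run spark-submit, while with --deploy-mode cluster the driver runs on a YARN node, so the file has to exist on that node.

```scala
import org.apache.hadoop.fs.Path
import org.apache.spark.sql.SparkSession

// Hypothetical application name, used only for this illustration
object CopyLocalToHdfs {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("CopyLocalToHdfs").getOrCreate()

    // Reuse the Hadoop configuration that Spark picked up from the cluster
    val hadoopConf = spark.sparkContext.hadoopConfiguration

    // Same paths as in the answer above
    val srcPath = new Path("/home/edureka/Documents/data")
    val destPath = new Path("hdfs:///tranferrred_data")

    // Resolve the file system from the destination path's scheme
    val hdfs = destPath.getFileSystem(hadoopConf)

    // delSrc = false keeps the local copy; overwrite = true replaces an
    // existing destination (an assumption, chosen so reruns do not fail)
    hdfs.copyFromLocalFile(false, true, srcPath, destPath)

    // The data is now in HDFS and can be read for transformations
    val lines = spark.read.textFile(destPath.toString)
    println(s"Lines available in HDFS: ${lines.count()}")

    spark.stop()
  }
}
```

Running the copy once on the driver, before any transformation, means the executors only ever read from HDFS and never need access to the local path.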

You may need the following dependencies in your build.sbt:

libraryDependencies += "org.apache.hadoop" % "hadoop-common" % "2.6.0"
libraryDependencies += "org.apache.commons" % "commons-io" % "1.3.2"
libraryDependencies += "org.apache.hadoop" % "hadoop-hdfs" % "2.6.0"

answered Jul 16, 2019 by Raj
