Copy all files from local Windows to HDFS with Scala code

Question

I have a requirement to copy files from my local machine to a Hadoop environment using Scala.

1. We have to copy all the files from the share point folder to the local machine. (I am already able to copy the files from the share folder to the local machine.)

2. Once the files are copied to the local machine, I have to move all of them from the local machine (Windows) to an HDFS location with Scala code.

Can you please help me with code to copy/move all the files from the local system to an HDFS path (the 2nd point) in Scala?

score 0 · Answer 1 · May 22, 2019

Please try the following Scala code:

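import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// Picks up core-site.xml from the classpath, if present
val hadoopConf = new Configuration()
val hdfs = FileSystem.get(hadoopConf)

// srcFilePath and destFilePath are placeholders: set them to your local path and your HDFS target path
val srcPath = new Path(srcFilePath)
val destPath = new Path(destFilePath)

// Copies srcPath to HDFS and keeps the local copy
hdfs.copyFromLocalFile(srcPath, destPath)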
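
Since the question asks about all files in a folder, here is a minimal sketch of one way to loop over a local directory and copy each file. The object name CopyLocalDirToHdfs and the method copyAll are made up for illustration, and it only handles the files directly inside the folder (no subfolders), so treat it as a starting point rather than a finished solution:

```scala
import java.io.File
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.{FileSystem, Path}

// Hypothetical helper, not part of Hadoop
object CopyLocalDirToHdfs {

  /** Copies every regular file directly inside localDir to hdfsDir.
    * Returns the number of files copied. */
  def copyAll(fs: FileSystem, localDir: String, hdfsDir: String): Int = {
    val dest = new Path(hdfsDir)
    if (!fs.exists(dest)) fs.mkdirs(dest)

    // listFiles() returns null when localDir is missing or is not a directory
    val files = Option(new File(localDir).listFiles())
      .getOrElse(Array.empty[File])
      .filter(_.isFile)

    files.foreach { f =>
      // delSrc = false keeps the local file; overwrite = true replaces an existing target
      fs.copyFromLocalFile(false, true, new Path(f.getAbsolutePath), new Path(dest, f.getName))
    }
    files.length
  }
}
```

You would call it as CopyLocalDirToHdfs.copyAll(FileSystem.get(new Configuration()), localDir, hdfsDir) with your own two paths. If you want a move instead of a copy, passing true as the first argument (delSrc) to copyFromLocalFile should delete each local file after it has been copied.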

If you run this through Spark, also check that the HADOOP_CONF_DIR variable is set in the conf/spark-env.sh file, so that Spark can find the Hadoop configuration settings. Without those settings, FileSystem.get falls back to the local file system instead of HDFS.
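
If the code runs as a plain Scala program on the Windows machine (outside Spark), the Hadoop settings may need to be supplied by hand. A rough sketch of two options follows. The object name HdfsConf is made up, and the configuration folder and NameNode URI are values you would have to get from your own cluster:

```scala
import org.apache.hadoop.conf.Configuration
import org.apache.hadoop.fs.Path

// Hypothetical helper, not part of Hadoop
object HdfsConf {

  /** Option 1: load the cluster's config files from a local folder
    * (confDir is assumed to hold copies of core-site.xml and hdfs-site.xml). */
  def fromFiles(confDir: String): Configuration = {
    val conf = new Configuration()
    conf.addResource(new Path(confDir, "core-site.xml"))
    conf.addResource(new Path(confDir, "hdfs-site.xml"))
    conf
  }

  /** Option 2: point at the NameNode directly,
    * e.g. "hdfs://<namenode-host>:<port>" (placeholder, not a real address). */
  def fromUri(defaultFs: String): Configuration = {
    val conf = new Configuration()
    conf.set("fs.defaultFS", defaultFs)
    conf
  }
}
```

Either Configuration can then be passed to FileSystem.get in place of the empty one. Printing the getUri value of the resulting FileSystem is a quick way to check whether you are really talking to HDFS (an hdfs:// URI) or still to the local disk (file:///).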

The dependencies for the build.sbt file (adjust the Hadoop version to match your cluster):

libraryDependencies += "org.apache.hadoop" % "hadoop-common" % "2.6.0"

libraryDependencies += "org.apache.commons" % "commons-io" % "1.3.2"

libraryDependencies += "org.apache.hadoop" % "hadoop-hdfs" % "2.6.0"

Hope it helps!

Thanks!!