Query regarding Spark split logic

0 votes
I have a CSV file containing 10 lines. I need Spark code that splits the even-numbered and odd-numbered lines into two text files. Could you help me out?
Feb 9, 2019 in Apache Spark by Joshi
82 views

1 answer to this question.

0 votes

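First, load the data into Spark and add an ID column to it. Run these commands in the Scala console (spark-shell):

import org.apache.spark.sql.functions.monotonically_increasing_id
// IDs are unique and increasing, but consecutive only within a single partition
val df = spark.read.csv("file.csv")
val df1 = df.withColumn("id", monotonically_increasing_id())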

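Then filter on the parity of the ID to split the rows:

// IDs start at 0, so even IDs are the 1st, 3rd, ... lines of the file
val df2 = df1.filter($"id" % 2 === 0)
val df3 = df1.filter($"id" % 2 =!= 0)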
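
The question also asks for the two halves to be written out as text files, which the commands above do not cover. The following is a minimal sketch of one way to do the whole job, not the answerer's method: it swaps in `zipWithIndex` on an RDD of raw lines, which gives consecutive line numbers regardless of partitioning. The object name `SplitEvenOddLines`, the helper `splitByLineNumber`, and the output paths `even_lines` and `odd_lines` are hypothetical names chosen for illustration.

```scala
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.SparkSession

object SplitEvenOddLines {

  // Returns (even-numbered lines, odd-numbered lines), counting lines from 1.
  def splitByLineNumber(lines: RDD[String]): (RDD[String], RDD[String]) = {
    // zipWithIndex assigns consecutive 0-based indices in RDD order,
    // so idx + 1 is the 1-based line number.
    val numbered = lines.zipWithIndex().map { case (line, idx) => (idx + 1, line) }
    val even = numbered.filter { case (n, _) => n % 2 == 0 }.map(_._2)
    val odd  = numbered.filter { case (n, _) => n % 2 != 0 }.map(_._2)
    (even, odd)
  }

  def main(args: Array[String]): Unit = {
    // Local session for illustration; in spark-shell, `spark` already exists.
    val spark = SparkSession.builder()
      .appName("split-even-odd-lines")
      .master("local[*]")
      .getOrCreate()

    // Read the file as raw lines so each line is written back unchanged.
    val lines = spark.sparkContext.textFile("file.csv")
    val (evenLines, oddLines) = splitByLineNumber(lines)

    // Hypothetical output paths. Each call creates a directory of part files
    // and fails if that directory already exists.
    evenLines.saveAsTextFile("even_lines")
    oddLines.saveAsTextFile("odd_lines")

    spark.stop()
  }
}
```

If the CSV has a header row, it is counted as line 1 here. Each output is a directory of part files; calling `coalesce(1)` on an RDD before saving it should yield a single part file per directory, which is reasonable for a 10-line input.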
answered Feb 9, 2019 by Omkar
• 69,090 points
