Query regarding Spark split logic

0 votes
I have a CSV file containing 10 lines. I need Spark code that splits the even-numbered and odd-numbered lines into two text files. Could you help me out?
Feb 9 in Apache Spark by Joshi
17 views

1 answer to this question.

0 votes

First, load the data into Spark and add an ID column. Run these commands in the Scala console (spark-shell):

import org.apache.spark.sql.functions.monotonically_increasing_id

val df = spark.read.csv("file.csv")
val df1 = df.withColumn("id", monotonically_increasing_id())

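Then split on the parity of the ID. The IDs start at 0, so even IDs are the odd-numbered lines (1, 3, 5, ...) and odd IDs are the even-numbered lines. The generated IDs are only guaranteed to be consecutive within a partition, which holds for a small file like this one that is read as a single partition:

// id 0, 2, 4, ... -> lines 1, 3, 5, ...
val oddLines = df1.filter($"id" % 2 === 0)
// id 1, 3, 5, ... -> lines 2, 4, 6, ...
val evenLines = df1.filter($"id" % 2 =!= 0)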
answered Feb 9 by Omkar
• 66,050 points
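
The question also asks for the two halves to be written out as text files, which the commands above stop short of. Here is a minimal sketch of one way to do that, not the answerer's method: it swaps in the RDD method zipWithIndex, which assigns consecutive 0-based indices regardless of partitioning. The object name, the helper isOddLine, the local master setting, and the output directory names odd_lines and even_lines are all assumptions made for illustration.

```scala
import org.apache.spark.sql.SparkSession

object SplitEvenOddLines {
  // zipWithIndex is 0-based, so index 0 is line 1:
  // an even index means an odd-numbered line.
  def isOddLine(zeroBasedIndex: Long): Boolean = zeroBasedIndex % 2 == 0

  def main(args: Array[String]): Unit = {
    // Assumption: a local session; inside spark-shell, reuse the existing `spark`.
    val spark = SparkSession.builder()
      .appName("SplitEvenOddLines")
      .master("local[*]")
      .getOrCreate()

    // Read the file as raw lines and pair each line with its position.
    val indexed = spark.sparkContext.textFile("file.csv").zipWithIndex()

    val oddLines  = indexed.filter { case (_, idx) => isOddLine(idx) }.map(_._1)
    val evenLines = indexed.filter { case (_, idx) => !isOddLine(idx) }.map(_._1)

    // saveAsTextFile writes a directory of part files;
    // coalesce(1) keeps each output to a single part file.
    // Hypothetical output paths; they must not already exist.
    oddLines.coalesce(1).saveAsTextFile("odd_lines")
    evenLines.coalesce(1).saveAsTextFile("even_lines")

    spark.stop()
  }
}
```

Each output is a directory containing one part file rather than a single named file. If the CSV has a header row, it is counted as line 1 in this sketch.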

