val a = spark.sparkContext.parallelize(Array(("a", 1), ("a", 2), ("b", 2)))
val b = a.foldByKey(1)(_ + _)
b.collect
res2: Array[(String, Int)] = Array((b,3), (a,5))
Can someone tell me why the value for key a is 5 and not 4?
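foldByKey applies the zero value once per key in each partition, not once per key overall. The result shows that (a,1) and (a,2) ended up in different partitions, so the zero value 1 was added to each of them before the partial results were merged:
(a,1) (a,2) => foldByKey(1)(_+_) => (a,1+1) + (a,2+1) => 2+3 = 5
(b,2) => foldByKey(1)(_+_) => (b,2+1) = 3
That is why the value for a is 5. If both a records had been in the same partition, the zero value would have been applied only once and the result would have been 4.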
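To make the partition effect concrete, here is a minimal sketch in plain Scala collections, with no Spark dependency. The helper simulateFoldByKey and the partition layouts are hypothetical and only illustrate the behaviour described above: fold each key inside its partition starting from the zero value, then merge the per-partition results with the same function. It is not Spark's actual implementation.

```scala
object FoldByKeySketch {
  // Hypothetical helper (not a Spark API): each inner Seq stands for one partition.
  def simulateFoldByKey(partitions: Seq[Seq[(String, Int)]], zero: Int)
                       (f: (Int, Int) => Int): Map[String, Int] = {
    // Step 1: within each partition, fold the values of every key starting from zero.
    val perPartition: Seq[Map[String, Int]] = partitions.map { part =>
      part.groupBy(_._1).map { case (k, kvs) =>
        k -> kvs.map(_._2).foldLeft(zero)(f)
      }
    }
    // Step 2: merge the partial results of all partitions with the same function.
    perPartition.flatten.groupBy(_._1).map { case (k, kvs) =>
      k -> kvs.map(_._2).reduce(f)
    }
  }

  // Assumed layout A: every record in its own partition.
  val separate: Seq[Seq[(String, Int)]] =
    Seq(Seq(("a", 1)), Seq(("a", 2)), Seq(("b", 2)))

  // Assumed layout B: all records in a single partition.
  val together: Seq[Seq[(String, Int)]] =
    Seq(Seq(("a", 1), ("a", 2), ("b", 2)))

  def main(args: Array[String]): Unit = {
    println(simulateFoldByKey(separate, 1)(_ + _)) // a -> 5, b -> 3
    println(simulateFoldByKey(together, 1)(_ + _)) // a -> 4, b -> 3
    println(simulateFoldByKey(separate, 0)(_ + _)) // a -> 3, b -> 2
  }
}
```

With a zero value of 0, which is neutral for addition, the result no longer depends on how the records are partitioned. That is the usual reason to pass a neutral element to foldByKey.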