A DataFrame can be created from an existing RDD. In which of the given cases would you create it by inferring the schema using case classes?

Question

A DataFrame can be created from an existing RDD. In which of the following cases would you create the DataFrame from the existing RDD by inferring the schema using case classes?

a) If your dataset has more than 22 fields
b) If all your users are going to need the dataset parsed in the same way
c) If you have two sets of users who will need the text dataset parsed differently
d) We cannot create a DataFrame from an RDD

akhtar · Answer 1 · Nov 25, 2020

Hi @ritu,

Yes, you can create a DataFrame from an existing RDD. See the example below.

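Call createDataFrame on your SparkSession and pass it the RDD:

// rdd is an existing RDD of (String, values) tuples; spark is the SparkSession
// With no schema or case class given, the columns get the default names _1 and _2
val dfWithoutSchema = spark.createDataFrame(rdd)
dfWithoutSchema.show()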
+------+--------------------+
|    _1|                  _2|
+------+--------------------+
| first|[2.0, 1.0, 2.1, 5.4]|
|  test|[1.5, 0.5, 0.9, 3.7]|
|choose|[8.0, 2.9, 9.1, 2.5]|
+------+--------------------+

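So I think you can go with option B. Inferring the schema from a case class works when the structure of the records is known ahead of time, which is the situation when all users need the dataset parsed in the same way.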
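
For option B, here is a minimal sketch of what inferring the schema from a case class could look like. The case class Sample, its field names, the object name and the sample rows are all made up for illustration (they only mirror the values shown in the output above), and it assumes a local Spark installation with Spark SQL on the classpath.

```scala
import org.apache.spark.sql.{DataFrame, SparkSession}

// Hypothetical record type: the field names become the column names,
// and the field types become the column types.
case class Sample(name: String, values: Seq[Double])

object CaseClassSchemaExample {

  // Builds a DataFrame whose schema is inferred from the Sample case class.
  def buildDataFrame(spark: SparkSession): DataFrame = {
    import spark.implicits._ // brings rdd.toDF() into scope

    // Illustrative data only, shaped like the rows in the output above.
    val rdd = spark.sparkContext.parallelize(Seq(
      Sample("first", Seq(2.0, 1.0, 2.1, 5.4)),
      Sample("test", Seq(1.5, 0.5, 0.9, 3.7)),
      Sample("choose", Seq(8.0, 2.9, 9.1, 2.5))
    ))

    rdd.toDF()
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("case-class-schema")
      .master("local[*]") // assumption: run locally for the demo
      .getOrCreate()

    val df = buildDataFrame(spark)
    df.printSchema() // columns are named name and values, not _1 and _2
    df.show()

    spark.stop()
  }
}
```

Because the schema comes from the case class, everyone who reads this RDD gets the same columns. If different users need the same text parsed in different ways (option C), you would normally build the schema programmatically with StructType instead.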