How do I index a CSV file that has no header? After converting the CSV to a DataFrame, I need to name the columns so that I can normalize them with MinMaxScaler.

0 votes
Sep 9 in Apache Spark by Manas
• 120 points
109 views

1 answer to this question.

0 votes

Hi @Manas,

You can read the CSV file into a DataFrame with the header option set to false. Because there is no header row, Spark assigns default column names: _c0, _c1, and so on.

val df = spark.read.format("csv").option("header", "false").load("csvfile.csv")

After that, you can rename the default columns. Note that withColumnRenamed takes the existing column name as a string, not a numeric position.

val df2 = df.withColumnRenamed("_c0", "DateOfBirth")
           .withColumnRenamed("_c1", "salary")
df2.printSchema()
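
Since the question mentions MinMaxScaler, here is a minimal sketch of how the renamed columns might be fed into it. It is an illustration under assumptions, not a definitive implementation: the object name RenameAndScale, the helper nameAndScale, and the column names "age" and "salary" are hypothetical, and it assumes every column holds numeric values with no nulls (a date column such as DateOfBirth would first need converting to a number). It uses toDF to rename all columns in one call instead of chaining withColumnRenamed.

```scala
import org.apache.spark.ml.feature.{MinMaxScaler, VectorAssembler}
import org.apache.spark.sql.{DataFrame, SparkSession}
import org.apache.spark.sql.functions.col

object RenameAndScale {

  // Hypothetical helper: names the columns, casts them to double, and scales them.
  def nameAndScale(raw: DataFrame, names: Seq[String]): DataFrame = {
    // toDF renames every column at once; the number of names must match the column count.
    val named = raw.toDF(names: _*)

    // Without inferSchema, CSV columns are read as strings, so cast before scaling.
    val numeric = named.select(named.columns.map(c => col(c).cast("double")): _*)

    // MinMaxScaler works on a single vector column, so assemble the inputs first.
    val assembled = new VectorAssembler()
      .setInputCols(numeric.columns)
      .setOutputCol("features")
      .transform(numeric)

    // Fit on the data to learn each feature's min and max, then rescale to [0, 1].
    new MinMaxScaler()
      .setInputCol("features")
      .setOutputCol("scaledFeatures")
      .fit(assembled)
      .transform(assembled)
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("RenameAndScale")
      .master("local[*]") // assumption: local run for illustration
      .getOrCreate()

    // With header=false, Spark names the columns _c0, _c1, ...
    val raw = spark.read.format("csv")
      .option("header", "false")
      .load("csvfile.csv")

    // Assumption: the file has exactly two numeric columns.
    nameAndScale(raw, Seq("age", "salary")).show(false)

    spark.stop()
  }
}
```

The scaled values end up together in the scaledFeatures vector column, while the original named columns are kept alongside it.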
answered Sep 10 by MD
• 65,200 points
