How to assign a column in a PySpark DataFrame as a primary key

+1 vote
I've just converted a Glue DynamicFrame into a Spark DataFrame using the .toDF() method. I now need to assign a column as the primary key. How do I do that? Please help!
Jan 8, 2020 in Apache Spark by anonymous
• 150 points
What you could do is create a DataFrame in PySpark, set the column as the primary key, and then insert the values into the DataFrame.
Hi Kalgi! I do not see a way to set a column as Primary Key in PySpark. Can you please share the details (code) about how that is done? Thanks!

1 answer to this question.

+1 vote
Spark does not have any concept of a primary key, because Spark is a computation engine, not a database.
answered Jan 12, 2020 by Sirish
• 160 points
Yes, I just read a few articles and came to the conclusion that you cannot set a primary key in Apache Spark.
