How to assign a column in Spark Dataframe (PySpark) as a Primary Key?

+1 vote
I've just converted a Glue DynamicFrame into a Spark DataFrame using the .toDF() method. I now need to assign a column as the primary key. How do I do that? Please help!
Jan 8 in Apache Spark by anonymous
What you could do is create a DataFrame in PySpark, set the column as the primary key, and then insert the values into the DataFrame.
Hi Kalgi! I do not see a way to set a column as Primary Key in PySpark. Can you please share the details (code) about how that is done? Thanks!

1 answer to this question.

+1 vote
Spark does not have any concept of a primary key, because Spark is a computation engine, not a database.
answered Jan 12 by Sirish
Yes, I just read a few articles and came to the conclusion that you cannot set a primary key in Apache Spark.
