Primary keys in Apache Spark

I have successfully established a JDBC connection between Spark and PostgreSQL. I am trying to insert some data into my database using append mode, but I need to specify an id for each Row of the DataFrame. Is there a way to do this?
Jul 11, 2019 in Big Data Hadoop by nitinrawat895

1 answer to this question.

from pyspark.sql.functions import monotonically_increasing_id
# Add an "id" column holding a generated 64-bit integer for each row
df.withColumn("id", monotonically_increasing_id()).show()

Make sure the second argument of df.withColumn is monotonically_increasing_id() (with parentheses, so the function is actually called), not monotonically_increasing_id.
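One thing to keep in mind before using these values as primary keys: the ids are unique and increasing, but not consecutive. According to the Spark documentation, the current implementation puts the partition index in the upper 31 bits and the record number within each partition in the lower 33 bits. The sketch below only imitates that layout in plain Python so you can see where the gaps come from; it does not call Spark, and the names simulated_id and simulated_ids are made up for this illustration.

```python
# Illustrative only: mimics the documented bit layout of Spark's
# monotonically_increasing_id(). It does not call Spark, and the helper
# names here are hypothetical.
ROW_BITS = 33        # lower 33 bits: record number within a partition
PARTITION_BITS = 31  # upper 31 bits: partition index


def simulated_id(partition_index: int, row_in_partition: int) -> int:
    """Combine a partition index and a per-partition row number into one id."""
    if not 0 <= partition_index < (1 << PARTITION_BITS):
        raise ValueError("partition index does not fit in 31 bits")
    if not 0 <= row_in_partition < (1 << ROW_BITS):
        raise ValueError("row number does not fit in 33 bits")
    return (partition_index << ROW_BITS) | row_in_partition


def simulated_ids(partition_sizes):
    """Return the ids that rows would get for the given partition sizes."""
    return [
        simulated_id(partition, row)
        for partition, size in enumerate(partition_sizes)
        for row in range(size)
    ]


if __name__ == "__main__":
    # Two partitions with three rows each: ids restart at 1 << 33
    # for the second partition, so the sequence has a large gap.
    print(simulated_ids([3, 3]))
```

If the target table needs gap-free ids, this function is probably not the right tool; the values are still fine as unique keys within a single DataFrame.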
answered Jul 11, 2019 by ravikiran