PySpark DataFrame with random values

0 votes
How do I create a column with random values within a range in a PySpark DataFrame?
Aug 1, 2019 in Apache Spark by Esha
1,161 views

1 answer to this question.

0 votes

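Hey @Esha, you can use this code. It adds a column isVal that is set to 1 or 0 at random, each with roughly equal probability. Let me know if it doesn't work:

from pyspark.sql.functions import rand, when
# rand() draws a uniform value in [0, 1) for each row
df1 = df.withColumn('isVal', when(rand() > 0.5, 1).otherwise(0))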
answered Aug 1, 2019 by Zed
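
The snippet above only produces 0 or 1. For values within an arbitrary range, one common approach is to scale the output of rand(), which is uniform in [0, 1). The following is a minimal sketch and not part of the original answer. The helper names scale_unit and with_random_columns, the column names rand_float and rand_int, and the bounds passed in are all assumptions made for illustration, and the integer variant assumes integer bounds.

```python
def scale_unit(u, low, high):
    """Map u from [0, 1) onto [low, high); works for a float or a Spark Column."""
    return u * (high - low) + low


def with_random_columns(df, low, high, seed=None):
    """Return df with two extra columns of random values between low and high.

    The column names rand_float and rand_int are illustrative choices.
    """
    # Imported lazily so scale_unit stays usable without a Spark installation.
    from pyspark.sql.functions import floor, rand

    # Use a different seed for the second column so the two are not correlated.
    int_seed = None if seed is None else seed + 1

    # Float uniformly distributed in [low, high).
    rand_float = scale_unit(rand(seed), low, high)
    # Integer in [low, high], both ends included: widen the span by one, then floor.
    rand_int = floor(scale_unit(rand(int_seed), low, high + 1)).cast("int")

    return df.withColumn("rand_float", rand_float).withColumn("rand_int", rand_int)
```

For example, with_random_columns(df, 10, 20, seed=42) would return df with both columns added. Passing a seed makes the values reproducible for a given DataFrame and partitioning; leaving it as None gives different values on each evaluation.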
