Pyspark dataframe with random values

0 votes
How to create a column in pyspark dataframe with random values within a range?
Aug 1 in Apache Spark by Esha

1 answer to this question.

0 votes

Hey @Esha, you can use this code. Let me know if it doesn't work:

from pyspark.sql.functions import rand,when df1 = df.withColumn('isVal', when(rand() > 0.5, 1).otherwise(0))
answered Aug 1 by Zed

Related Questions In Apache Spark

0 votes
1 answer

Filtering a row in Spark DataFrame based on matching values from a list

Use the function as following: var notFollowingList=List(9.8,7,6,3, ...READ MORE

answered Jun 5, 2018 in Apache Spark by Shubham
• 13,350 points
+1 vote
1 answer

getting null values in spark dataframe while reading data from hbase

Can you share the screenshots for the ...READ MORE

answered Jul 31, 2018 in Apache Spark by kurt_cobain
• 9,280 points
–1 vote
1 answer

Pyspark rdd How to get partition number in output ?

The glom function is what you are looking for: glom(self): ...READ MORE

answered Jan 8 in Python by Omkar
• 68,180 points
0 votes
0 answers

try except is not working while using hdfs command

Hi,  I am trying to run following things ...READ MORE

Mar 6 in Python by anonymous
+1 vote
1 answer
0 votes
1 answer

How to get SQL configuration in Spark using Python?

You can get the configuration details through ...READ MORE

answered Mar 18 in Apache Spark by John
0 votes
6 answers
0 votes
11 answers

How to create new column with function in Spark Dataframe?

val coder: (Int => String) = v ...READ MORE

answered Apr 4 in Apache Spark by anonymous

edited Apr 5 by Omkar 29,445 views