Pyspark dataframe with random values

0 votes
How to create a column in pyspark dataframe with random values within a range?
Aug 1, 2019 in Apache Spark by Esha

1 answer to this question.

0 votes

Hey @Esha, you can use this code. Let me know if it doesn't work:

from pyspark.sql.functions import rand,when df1 = df.withColumn('isVal', when(rand() > 0.5, 1).otherwise(0))
answered Aug 1, 2019 by Zed

Related Questions In Apache Spark

0 votes
2 answers

Filtering a row in Spark DataFrame based on matching values from a list

Use the function as following: var notFollowingList=List(9.8,7,6,3 ...READ MORE

answered Jun 5, 2018 in Apache Spark by Shubham
• 13,450 points
+1 vote
1 answer

getting null values in spark dataframe while reading data from hbase

Can you share the screenshots for the ...READ MORE

answered Jul 31, 2018 in Apache Spark by kurt_cobain
• 9,320 points
+1 vote
1 answer
–1 vote
1 answer

Pyspark rdd How to get partition number in output ?

The glom function is what you are looking for: glom(self): ...READ MORE

answered Jan 8, 2019 in Python by Omkar
• 69,030 points
0 votes
0 answers

try except is not working while using hdfs command

Hi,  I am trying to run following things ...READ MORE

Mar 6, 2019 in Python by anonymous
+1 vote
2 answers
0 votes
1 answer

How to get SQL configuration in Spark using Python?

You can get the configuration details through ...READ MORE

answered Mar 18, 2019 in Apache Spark by John
0 votes
7 answers
0 votes
12 answers

How to create new column with function in Spark Dataframe?

val coder: (Int => String) = v ...READ MORE

answered Apr 4, 2019 in Apache Spark by anonymous

edited Apr 5, 2019 by Omkar 56,988 views