PySpark DataFrame with random values

How do I create a column in a PySpark DataFrame with random values within a range?
Aug 1, 2019 in Apache Spark by Esha

1 answer to this question.

Hey @Esha, you can use this code. Let me know if it doesn't work:

from pyspark.sql.functions import rand, when
# rand() draws a uniform value in [0, 1) per row; isVal is 1 when it exceeds 0.5, otherwise 0
df1 = df.withColumn('isVal', when(rand() > 0.5, 1).otherwise(0))
answered Aug 1, 2019 by Zed
