Apache Spark questions
Hey @Esha, you can use this code. ...READ MORE
By default, the timeout is set to ...READ MORE
You can try this: d.filter(col("value").isin(desiredThings: _*)) and if you ...READ MORE
You can add external jars as arguments ...READ MORE
Hey, In Apache Spark, the data storage model is ...READ MORE
There are 2 ways to check the ...READ MORE
Hi@akhtar, To convert pyspark dataframe into pandas dataframe, ...READ MORE
Yes, you can go ahead and write ...READ MORE
Suppose you have two dataset results( id, ...READ MORE
df.registerTempTable("airports") This command is used to register ...READ MORE
It avoids a full shuffle. If it's ...READ MORE
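The point above can be illustrated locally. This is a minimal pure-Python sketch (no Spark required; `coalesce` here is a hypothetical stand-in, not Spark's API) of why reducing the partition count can avoid a full shuffle: existing partitions are merged in place instead of every element being redistributed.

```python
# Hedged sketch: merge existing partitions down to n without moving
# individual elements between "machines" (a full shuffle would).
def coalesce(partitions, n):
    """Merge partitions down to n, keeping each partition's elements together."""
    merged = [[] for _ in range(n)]
    for i, part in enumerate(partitions):
        merged[i % n].extend(part)
    return merged

parts = [[1, 2], [3, 4], [5, 6], [7, 8]]
print(coalesce(parts, 2))  # [[1, 2, 5, 6], [3, 4, 7, 8]]
```

Real Spark `coalesce` chooses which partitions to merge based on locality, but the idea is the same: whole partitions are combined, not reshuffled element by element.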
Seems like you have not started the ...READ MORE
I am running an application on Spark ...READ MORE
Use Array.maxBy method: val a = Array(("a",1), ("b",2), ...READ MORE
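The Scala `Array.maxBy` pattern in the snippet above has a direct Python analog, shown here as a hedged sketch: `max()` with a `key` function selects the pair with the largest second element.

```python
# Python analog of Scala's a.maxBy(_._2): pick the tuple whose
# second element is largest.
pairs = [("a", 1), ("b", 2), ("c", 3)]
largest = max(pairs, key=lambda p: p[1])
print(largest)  # ('c', 3)
```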
As Parquet is a column-based storage ...READ MORE
Hey, You can follow this below solution for ...READ MORE
You can use this: lines = sc.textFile("hdfs://path/to/file/filename.txt"); def isFound(line): if ...READ MORE
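The predicate logic from the snippet above can be sketched locally without a cluster. This is a hedged example: `is_found` and the `"error"` keyword are placeholders; on a real cluster you would apply the predicate via `sc.textFile(...).filter(...)`.

```python
# Local stand-in for the filter applied to lines of an HDFS file.
def is_found(line, keyword="error"):  # keyword is a hypothetical example
    return keyword in line.lower()

lines = ["INFO all good", "ERROR disk full", "WARN slow"]
matches = [l for l in lines if is_found(l)]
print(matches)  # ['ERROR disk full']
```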
Give read-write permissions to the C:\tmp\hive folder, then cd to the winutils bin folder ...READ MORE
When using the Java substring() method, a ...READ MORE
Hi@akhtar, I think your HDFS cluster is not ...READ MORE
With mapPartion() or foreachPartition(), you can only ...READ MORE
Try this and see if this does ...READ MORE
The default capacity of listener bus is ...READ MORE
Apache Spark supports the following four languages: Scala, ...READ MORE
Every Spark application has the same fixed heap ...READ MORE
1) First we loaded the data to ...READ MORE
This type of error tends to occur ...READ MORE
Hi @akhtar, Both map() and mapPartitions() are the ...READ MORE
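The difference between the two can be illustrated with plain Python iterators, as a hedged local sketch (no Spark involved): `map()` invokes its function once per element, while `mapPartitions()` invokes it once per partition, passing an iterator over that partition, which lets expensive setup be amortized.

```python
# Two toy "partitions" of an RDD, represented as plain lists.
partitions = [[1, 2, 3], [4, 5]]

# map-style: the function runs once per element.
mapped = [[x * 2 for x in part] for part in partitions]

# mapPartitions-style: the function runs once per partition and
# receives an iterator; e.g. a DB connection opened here would be
# reused for every element in the partition.
def double_all(iterator):
    for x in iterator:
        yield x * 2

map_partitioned = [list(double_all(iter(part))) for part in partitions]
print(mapped == map_partitioned)  # True: same result, per-partition setup
```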
Hi, foreach() operation is an action. It does not ...READ MORE
Hi@akhtar, This error occurs because your python version ...READ MORE
1. We will check whether master and ...READ MORE
Hey, sortByKey() is a transformation. It returns an RDD sorted ...READ MORE
Hi, Apache Spark is an advanced data processing ...READ MORE
// Collect data from input avro file ...READ MORE
Try this, it should work: > from pyspark.sql.functions ...READ MORE
I used Spark 1.5.2 with Hadoop 2.6 ...READ MORE
No, it is not necessary to install ...READ MORE
import sqlContext.implicits._ import org.apache.spark.sql.Row import org.apache.spark.sql.types.{StructType, StructField, LongType} val df ...READ MORE
Hi, You need to edit one property in ...READ MORE
Hi, @Ritu, When creating a pair RDD from ...READ MORE
Start the Spark shell using the below line of ...READ MORE
As far as I understand your intentions ...READ MORE
Hi, You can resolve this error with a ...READ MORE
Spark has various persistence levels to store ...READ MORE
Hey, You need to follow some steps to complete ...READ MORE
Hadoop 3 is not widely used in ...READ MORE
Hey, For this purpose, we use the single ...READ MORE
You can do this using globbing. See ...READ MORE
Hey, A sparse vector is used for storing ...READ MORE
Hi all, I am running a Scala program on ...READ MORE