questions/apache-spark
Every Spark application has the same fixed heap ...READ MORE
Check if you are able to access ...READ MORE
you can access task information using TaskContext: import org.apache.spark.TaskContext sc.parallelize(Seq[Int](), ...READ MORE
You can try and check this below ...READ MORE
I suggest you to check 2 things That jquery.sparkline.js is actually ...READ MORE
If, for option 2, you mean have ...READ MORE
Spark revolves around the concept of a ...READ MORE
It avoids a full shuffle. If it's ...READ MORE
Minimizing data transfers and avoiding shuffling helps ...READ MORE
I can list some but there can ...READ MORE
Whenever a node goes down, Spark knows ...READ MORE
No, it doesn’t provide a storage layer, but ...READ MORE
Spark SQL is capable of: Loading data from ...READ MORE
Apache Spark supports the following four languages: Scala, ...READ MORE
Spark is agnostic to the underlying cluster ...READ MORE
Just do the following: Edit your conf/log4j.properties file ...READ MORE
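For context, the usual edit (assuming the stock log4j 1.x template shipped in Spark's conf directory) raises the root logger level so only errors reach the console:

```properties
# conf/log4j.properties
# Change the root logger from INFO to ERROR to quiet Spark's console output
log4j.rootCategory=ERROR, console
```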
Let's first look at mapper-side differences. Map ...READ MORE
> In order to reduce the processing ...READ MORE
Fold in spark Fold is a very powerful ...READ MORE
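As a rough illustration of fold semantics, here is a plain-Python sketch using functools.reduce with an explicit initial value; RDD.fold works analogously, except the zero value is applied once per partition and again when merging partition results, so it must be neutral for the operation (0 for addition, 1 for multiplication):

```python
from functools import reduce

nums = [1, 2, 3, 4, 5]

# fold = reduce with an explicit zero (neutral) element
total = reduce(lambda acc, x: acc + x, nums, 0)

# the same pattern with a different operation and its neutral element
largest = reduce(max, nums, float("-inf"))

print(total)    # 15
print(largest)  # 5
```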
Some of the key differences between an RDD and ...READ MORE
There are a few reasons for keeping RDD ...READ MORE
Mainly, we use SparkConf because we need ...READ MORE
val x = sc.parallelize(1 to 10, 2) // ...READ MORE
You can add external jars as arguments ...READ MORE
Yes, there is a difference between the ...READ MORE
org.apache.spark.mllib is the old Spark API while ...READ MORE
df.orderBy($"col".desc) - this works as well READ MORE
Your error is with the version of ...READ MORE
Spark is a framework for distributed data ...READ MORE
A Spark driver (aka an application’s driver ...READ MORE
Parquet is a columnar format supported by ...READ MORE
map(): Return a new distributed dataset formed by ...READ MORE
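The contract described above (exactly one output element per input element) can be sketched in plain Python; RDD.map follows the same idea, evaluated lazily and in parallel across partitions:

```python
words = ["spark", "rdd", "map"]

# map applies a function to each element, producing a new
# collection of exactly the same length
lengths = [len(w) for w in words]

print(lengths)  # [5, 3, 3]
```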
spark-submit \ --class org.apache.spark.examples.SparkPi \ --deploy-mode client \ --master spark://$SPARK_MASTER_IP:$SPARK_MASTER_PORT ...READ MORE
There are two popular ways using which ...READ MORE
Whenever a series of transformations are performed ...READ MORE
There are two methods to persist the ...READ MORE
The full form of RDD is a ...READ MORE
No, it is not necessary to install ...READ MORE
No, it is not mandatory, but there ...READ MORE
Spark has various persistence levels to store ...READ MORE
Shark is a tool, developed for people ...READ MORE
SQL Interpreter & Optimizer handles the functional ...READ MORE
You can select the column and apply ...READ MORE
Use the function as following: var notFollowingList=List(9.8,7,6,3,1) df.filter(col("uid").isin(notFollowingList:_*)) You can ...READ MORE
You can create a DataFrame from the ...READ MORE
Spark has various components: Spark SQL (Shark)- for ...READ MORE
Parquet is a columnar format file supported ...READ MORE
No, not mandatory, but there is no ...READ MORE
By default a partition is created for ...READ MORE