questions/apache-spark
Just do the following: Edit your conf/log4j.properties file ...READ MORE
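For example, a minimal sketch of both routes (assuming the stock Log4j setup Spark ships with and an existing SparkContext named sc):

// In conf/log4j.properties (copied from log4j.properties.template):
//   log4j.rootCategory=ERROR, console
// Or suppress the noise at runtime instead of editing the file:
sc.setLogLevel("ERROR")   // accepts ALL, DEBUG, INFO, WARN, ERROR, OFF, ...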
Let's first look at the mapper-side differences. Map ...READ MORE
> In order to reduce the processing ...READ MORE
Some of the key differences between an RDD and ...READ MORE
Mainly, we use SparkConf because we need ...READ MORE
Can you share the screenshots for the ...READ MORE
You can add external jars as arguments ...READ MORE
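As a sketch of both ways of adding them (the jar paths below are hypothetical):

// On the command line:  spark-submit --jars /opt/libs/extra.jar,/opt/libs/other.jar my-app.jar
// Or programmatically on the SparkConf:
import org.apache.spark.SparkConf

val conf = new SparkConf()
  .setAppName("with-external-jars")
  .setJars(Seq("/opt/libs/extra.jar", "/opt/libs/other.jar"))  // ships the jars to the executors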
Yes, there is a difference between the ...READ MORE
org.apache.spark.mllib is the old Spark API while ...READ MORE
Your error is with the version of ...READ MORE
There is a difference between the two: mapValues ...READ MORE
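A short sketch of that difference on a pair RDD (sc is an existing SparkContext):

// mapValues keeps the key and the partitioner; map rebuilds the whole tuple.
val pairs = sc.parallelize(Seq(("a", 1), ("b", 2)))

val withMapValues = pairs.mapValues(_ * 10)                   // ("a",10), ("b",20) - partitioner preserved
val withMap       = pairs.map { case (k, v) => (k, v * 10) }  // same values, but the partitioner is lost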
Spark is a framework for distributed data ...READ MORE
A Spark driver (aka an application’s driver ...READ MORE
spark-submit \ --class org.apache.spark.examples.SparkPi \ --deploy-mode client \ --master spark://$SPARK_MASTER_IP:$SPARK_MASTER_PORT ...READ MORE
There are two popular ways using which ...READ MORE
Minimizing data transfers and avoiding shuffling helps ...READ MORE
There are two methods to persist the ...READ MORE
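As a sketch, the two usual calls look like this (sc is an existing SparkContext; the paths are hypothetical):

import org.apache.spark.storage.StorageLevel

val logs = sc.textFile("hdfs:///data/logs")
logs.cache()                                     // shorthand for persist(StorageLevel.MEMORY_ONLY)

val events = sc.textFile("hdfs:///data/events")
events.persist(StorageLevel.MEMORY_AND_DISK)     // pick an explicit storage level instead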
The full form of RDD is a ...READ MORE
No, it is not necessary to install ...READ MORE
No, it is not mandatory, but there ...READ MORE
Spark has various persistence levels to store ...READ MORE
Shark is a tool, developed for people ...READ MORE
Here are some of the important features of ...READ MORE
SQL Interpreter & Optimizer handles the functional ...READ MORE
You can select the column and apply ...READ MORE
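For illustration, a minimal sketch assuming df is an existing DataFrame and the column name is hypothetical:

import org.apache.spark.sql.functions.col

// Apply a function (here a cast) to one column and keep the rest of the DataFrame unchanged.
val updated = df.withColumn("age", col("age").cast("int"))
updated.select("age").show()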
You can create a DataFrame from the ...READ MORE
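One possible sketch, assuming a local Seq of case-class rows and an existing SparkSession named spark:

import spark.implicits._

case class Person(name: String, age: Int)
val people = Seq(Person("Alice", 30), Person("Bob", 25))

val df = people.toDF()      // or spark.createDataFrame(people)
df.show()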
Spark has various components: Spark SQL (Shark)- for ...READ MORE
Parquet is a columnar format file supported ...READ MORE
No, it is not mandatory, but there is no ...READ MORE
By default a partition is created for ...READ MORE
I would recommend you create & build ...READ MORE
You need to change the following: val pipeline ...READ MORE
Spark provides a pipe() method on RDDs. ...READ MORE
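For illustration, a sketch that pipes each partition through an external Unix command (the command itself is just an example):

val lines = sc.parallelize(Seq("spark", "hadoop", "spark streaming"))

// Each element is written to the external process's stdin, one per line;
// the lines the process writes to stdout come back as a new RDD of strings.
val filtered = lines.pipe("grep spark")
filtered.collect().foreach(println)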
Spark uses Akka basically for scheduling. All ...READ MORE
SQLContext has a number of createDataFrame methods ...READ MORE
An RDD can be uncached using unpersist(). So, use ...READ MORE
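A minimal sketch (the input path is hypothetical):

val rdd = sc.textFile("hdfs:///data/input.txt")
rdd.cache()        // materialised on the first action
rdd.count()
rdd.unpersist()    // removes the cached blocks; pass blocking = true to wait for removal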
Yes, you can go ahead and write ...READ MORE
The SparkContext.createTaskScheduler method parses the master parameter. Local: 1 ...READ MORE
You can save the RDD using saveAsObjectFile and saveAsTextFile method. ...READ MORE
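A short sketch of both calls (output paths are hypothetical):

val nums = sc.parallelize(1 to 100)

nums.saveAsTextFile("hdfs:///out/nums-text")     // plain text, one element per line
nums.saveAsObjectFile("hdfs:///out/nums-obj")    // Java-serialised SequenceFile

// Reading the object file back:
val restored = sc.objectFile[Int]("hdfs:///out/nums-obj")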
Yes, it is possible to run Spark ...READ MORE
Sliding Window controls transmission of data packets ...READ MORE
Here are the changes in new version ...READ MORE
Spark 2.0+ Spark 2.0 provides native window functions ...READ MORE
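For illustration, a sketch of a native window function in Spark 2.0+ (df and its columns are hypothetical):

import org.apache.spark.sql.expressions.Window
import org.apache.spark.sql.functions.{rank, col}

// Rank rows by salary within each department.
val byDept = Window.partitionBy("dept").orderBy(col("salary").desc)
df.withColumn("rank", rank().over(byDept)).show()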
Use the Array.maxBy method: val a = Array(("a",1), ("b",2), ...READ MORE
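A complete sketch of that idea (the third tuple is only illustrative, since the original snippet is truncated):

val a = Array(("a", 1), ("b", 2), ("c", 3))
val largest = a.maxBy(_._2)   // ("c", 3): the tuple with the largest second element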
You have to use the comparison operator ...READ MORE
I guess you need provide this kafka.bootstrap.servers ...READ MORE
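As a sketch, assuming the Structured Streaming Kafka source (the spark-sql-kafka connector on the classpath) and hypothetical broker and topic names:

val stream = spark.readStream
  .format("kafka")
  .option("kafka.bootstrap.servers", "broker1:9092,broker2:9092")
  .option("subscribe", "my-topic")
  .load()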
RDD is a fundamental data structure of ...READ MORE
Either you have to create a Twitter4j.properties ...READ MORE
Ideally, you would use snappy compression (default) ...READ MORE
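A minimal sketch that makes the codec explicit when writing Parquet (df and the output path are hypothetical; recent Spark versions already default to snappy):

df.write
  .option("compression", "snappy")
  .parquet("hdfs:///out/data.parquet")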
Both 'filter' and 'where' in Spark SQL ...READ MORE
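A short sketch showing they are interchangeable (df is an existing DataFrame; the column name is hypothetical):

import spark.implicits._   // for the $"col" syntax; spark is an existing SparkSession

// 'where' is simply an alias for 'filter'; both produce the same plan.
val adults1 = df.filter($"age" > 21)
val adults2 = df.where($"age" > 21)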