Spark code takes too much time to run on cluster

I have written a Spark application. It runs fine on a small dataset (population), but it takes far too long on a larger one.

Jan 4, 2020 in Apache Spark by asif
• 140 points • 1,608 views

Hi @asif,

Could you please share your Spark application code and describe your approach? Please also mention the dataset size at which the application starts to slow down.

commented Jan 6, 2020 by Kalgi
• 52,340 points

1 answer to this question.

Hi @asif,

Please share the application code and, if possible, a sample of the data.

answered Jan 22, 2020 by Alexandru
• 510 points
