How to read a dataframe based on an avro schema

0 votes

I am unable to import from_avro in PySpark. I am trying to run a spark-submit job that pulls in the external Avro package.

For example: spark-submit --packages org.apache.spark:spark-avro_2.12:3.0.1 test1.py

My test1.py file contains the import statement:

from pyspark.sql.avro.functions import from_avro, to_avro

This fails with an ImportError saying there is no module named avro.function.

Please help! I need to import from_avro in my Python code.

Oct 30, 2020 in Apache Spark by anonymous
• 120 points

edited Oct 30, 2020 by MD 2,779 views

1 answer to this question.

0 votes

Hi,

I understand your requirement. The error says that the avro.function module is not available in your environment. First, check which packages are available in your Spark installation.
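One way to run that check from Python is sketched below. It rests on two assumptions that are not stated in the question: that the pyspark.sql.avro.functions module ships with the PySpark Python package itself from version 3.0.0 onward, and that --packages only adds the JVM jar, so an older PySpark on the Python path would still fail the import. The helper names module_available and describe_pyspark are made up for this sketch.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if Python can locate the module `name`."""
    try:
        return importlib.util.find_spec(name) is not None
    except (ImportError, ValueError):
        # find_spec imports parent packages first; a missing parent
        # raises ModuleNotFoundError, which is a subclass of ImportError.
        return False


def describe_pyspark() -> str:
    """Summarise the PySpark install visible to this interpreter."""
    if not module_available("pyspark"):
        return "pyspark is not installed in this Python environment"
    import pyspark

    has_avro = module_available("pyspark.sql.avro.functions")
    return (
        f"pyspark {pyspark.__version__}, "
        f"pyspark.sql.avro.functions available: {has_avro}"
    )


if __name__ == "__main__":
    print(describe_pyspark())
```

If this reports a PySpark version below 3.0, or reports the module as unavailable, the import in test1.py is likely to fail no matter which --packages value is passed.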

answered Oct 30, 2020 by MD
• 95,440 points
How can I read a dataframe using an Avro schema?

Hi @ana,

While working with spark-shell, you can also use --packages to add spark-avro_2.12 and its dependencies directly.

$ ./bin/spark-shell --packages org.apache.spark:spark-avro_2.12:2.4.4
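Note that the command above uses version 2.4.4 while the question uses 3.0.1. As far as I know, the spark-avro artifact is published per Spark release, so its version should normally match the Spark version you run. A small sketch that builds the coordinate from those two values follows; the function name spark_avro_coordinate is hypothetical.

```python
def spark_avro_coordinate(spark_version: str, scala_version: str = "2.12") -> str:
    """Build the Maven coordinate to pass to --packages.

    Assumes the spark-avro version should equal the Spark version
    and that the Scala suffix matches the Spark build in use.
    """
    return f"org.apache.spark:spark-avro_{scala_version}:{spark_version}"


if __name__ == "__main__":
    # Coordinates for the two Spark versions mentioned in this thread.
    print(spark_avro_coordinate("3.0.1"))
    print(spark_avro_coordinate("2.4.4"))
```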

Use the data source format "avro" (or "org.apache.spark.sql.avro"), and call load() to read the Avro file.

val personDF = spark.read.format("avro").load("person.avro")

Thank you, @MD, for providing the package details.
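Since the original question is about PySpark and about reading with a schema, here is a rough Python counterpart of the Scala line above. It is only a sketch: the record name and the fields "name" and "age" are invented for illustration, because the actual layout of person.avro is not given in this thread. It relies on the avroSchema option of the Avro data source, which accepts a schema as a JSON string.

```python
import json

# Hypothetical Avro schema; the real fields of person.avro are unknown.
PERSON_SCHEMA = json.dumps(
    {
        "type": "record",
        "name": "Person",
        "fields": [
            {"name": "name", "type": "string"},
            {"name": "age", "type": ["null", "int"], "default": None},
        ],
    }
)


def read_avro_with_schema(spark, path, schema_json=PERSON_SCHEMA):
    """Read Avro data at `path`, passing an explicit schema as JSON.

    `spark` is expected to be an active SparkSession started with the
    spark-avro package on the classpath (for example via --packages).
    """
    return (
        spark.read.format("avro")
        .option("avroSchema", schema_json)
        .load(path)
    )
```

Inside a job launched with spark-submit and --packages, you would call read_avro_with_schema(spark, "person.avro") on your existing session and then inspect the result with printSchema().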
