How to read Avro partition data
Hi Guys,
I have partitioned Avro data and want to read the files so that I can run my operations on them. How can I do that?
Hi @akhtar,
When you filter on the partition column while reading, Spark reads only the matching partition folder instead of scanning all of the Avro files.
import org.apache.spark.sql.functions.col

spark.read
  .format("avro")
  .load("person_partition.avro")
  .where(col("dob_year") === 2010)
  .show()
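For context, here is a minimal end-to-end sketch of how such a partitioned dataset might be produced and then read back. It assumes the external spark-avro module is on the classpath and that Spark runs in local mode. The sample rows, the `dob_month` column and the object name `ReadAvroPartition` are made up for illustration; only the `person_partition.avro` path and the `dob_year` column come from the answer above.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.col

object ReadAvroPartition {
  def main(args: Array[String]): Unit = {
    // Assumption: the spark-avro module is available, since Avro support
    // is not bundled with core Spark.
    val spark = SparkSession.builder()
      .appName("ReadAvroPartition")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical sample rows, used only to create a partitioned dataset.
    val people = Seq(
      ("James", 2010, 1),
      ("Maria", 2010, 6),
      ("Anna", 2012, 3)
    ).toDF("name", "dob_year", "dob_month")

    // partitionBy writes one subfolder per distinct value,
    // e.g. person_partition.avro/dob_year=2010/
    people.write
      .partitionBy("dob_year")
      .format("avro")
      .mode("overwrite")
      .save("person_partition.avro")

    // Filtering on the partition column lets Spark skip the other folders.
    val born2010 = spark.read
      .format("avro")
      .load("person_partition.avro")
      .where(col("dob_year") === 2010)

    // The physical plan should list the filter under PartitionFilters.
    born2010.explain()
    born2010.show()

    spark.stop()
  }
}
```

Loading the folder `person_partition.avro/dob_year=2010` directly is another option, but the `dob_year` column is then not part of the result unless the `basePath` read option points at the dataset root.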