How to get Spark dataset metadata?

I want to convert a dataset into another object, and I need the dataset's metadata before converting it. Is there any function in Spark that can help?

Thanks in advance.
Apr 26, 2018 in Apache Spark by Ashish

1 answer to this question.

There are a bunch of functions that can help you here.

For the schema of a dataset ds, you can use ds.schema.

You also have ds.schema.size, ds.schema.fields, and ds.schema.fieldNames.

Additionally, you can print the schema in a readable tree format with ds.printSchema().
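As a minimal sketch of how these calls fit together (the Person case class, the sample rows, and the local session here are just for illustration, not from the original question):

```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("schema-demo")
  .master("local[*]")
  .getOrCreate()
import spark.implicits._

// Hypothetical case class used only to build an example Dataset
case class Person(name: String, age: Int)
val ds = Seq(Person("Alice", 29), Person("Bob", 31)).toDS()

ds.schema            // StructType describing the columns
ds.schema.size       // number of fields (2 here)
ds.schema.fields     // Array[StructField]: name, dataType, nullable per column
ds.schema.fieldNames // Array(name, age)
ds.printSchema()     // pretty-prints the schema tree to stdout
```

ds.schema returns a StructType, so anything on that class (iterating fields, looking up a field by name, converting to DDL) is available for building your target object.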

Hope this helps
answered Apr 26, 2018 by kurt_cobain
