How to get Spark dataset metadata?

I want to convert a dataset into some other object, and I need the dataset's metadata before converting it. Is there a function in Spark that can help?

Thanks in advance.
Apr 26, 2018 in Apache Spark by Ashish

1 answer to this question.

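Several Dataset methods can help here.

For the schema of a dataset ds, use ds.schema, which returns a StructType.

From the schema you can read ds.schema.size (the number of columns), ds.schema.fields (an array of StructField objects, each carrying a column's name, data type, and nullability), and ds.schema.fieldNames (the column names).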

Also, you can see the details with ds.printSchema()
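
As a rough sketch of how these calls fit together, the Scala snippet below builds a small local dataset and reads its metadata. The Person case class, the DatasetMetadataSketch object, the describe helper, and the local[*] master are all assumptions made for illustration and are not part of the original question.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}
import org.apache.spark.sql.types.StructType

// Hypothetical record type, used only for this illustration.
case class Person(name: String, age: Int)

object DatasetMetadataSketch {
  // Hypothetical helper: (column name, type name, nullable) for each field.
  def describe(schema: StructType): Seq[(String, String, Boolean)] =
    schema.fields.toSeq.map(f => (f.name, f.dataType.simpleString, f.nullable))

  def main(args: Array[String]): Unit = {
    // Assumes a local Spark installation; adjust master for a real cluster.
    val spark = SparkSession.builder()
      .appName("dataset-metadata-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    val ds: Dataset[Person] = Seq(Person("Ann", 30), Person("Bob", 25)).toDS()

    val schema = ds.schema
    println(schema.size)                       // number of columns
    println(schema.fieldNames.mkString(", "))  // column names
    describe(schema).foreach(println)          // one tuple per column
    ds.printSchema()                           // tree-style schema printout

    spark.stop()
  }
}
```

Because describe only takes the StructType, it can be called on ds.schema before any conversion step without triggering a Spark job.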

Hope this helps
answered Apr 26, 2018 by kurt_cobain

