I am new to Apache Pig and have started working through the fundamentals. I came across the keyword "DISTINCT" but did not understand why it is used. Can anyone explain what this keyword is for?
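The DISTINCT operator is very simple: it removes duplicate records from a relation. It compares entire records (tuples), not individual fields, so two records count as duplicates only when every field matches. To remove duplicates based on one field, first project that field with FOREACH ... GENERATE and then apply DISTINCT to the result.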
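Here is a minimal sketch of both cases. The file name students.txt, the comma delimiter, and the schema (id, name, city) are assumptions made up for illustration, so adjust them to your own data.

```pig
-- Hypothetical input: a comma-separated file with possibly repeated lines.
students = LOAD 'students.txt' USING PigStorage(',')
           AS (id:int, name:chararray, city:chararray);

-- Whole-record deduplication: rows are dropped only if id, name AND city all match.
unique_students = DISTINCT students;
DUMP unique_students;

-- Field-level deduplication: project the field first, then apply DISTINCT.
cities = FOREACH students GENERATE city;
unique_cities = DISTINCT cities;
DUMP unique_cities;
```

If the file contains the line 1,Asha,Pune twice, unique_students keeps it once. Two students with different ids who both live in Pune remain separate records there, while unique_cities lists Pune only once. The order of the output records is not guaranteed.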