Sqoop incremental append by date

0 votes

Suppose I want to do an incremental append based on a date column. How do I specify the date for an incremental lastmodified import?

Feb 19 in Big Data Hadoop by Kisha
162 views

1 answer to this question.

0 votes

Consider a table with 3 records that you have already imported into HDFS using Sqoop:

+------+------------+----------+------+------------+
| sid  | city       | state    | rank | rDate      |
+------+------------+----------+------+------------+
|  101 | Chicago    | Illinois |    1 | 2014-01-25 |
|  101 | Schaumburg | Illinois |    3 | 2014-01-25 |
|  101 | Columbus   | Ohio     |    7 | 2014-01-25 |
+------+------------+----------+------+------------+

sqoop import --connect jdbc:mysql://localhost:3306/ydb --table yloc --username root -P

Now the table has additional records, but no existing records have been updated:

+------+------------+----------+------+------------+
| sid  | city       | state    | rank | rDate      |
+------+------------+----------+------+------------+
|  101 | Chicago    | Illinois |    1 | 2014-01-25 |
|  101 | Schaumburg | Illinois |    3 | 2014-01-25 |
|  101 | Columbus   | Ohio     |    7 | 2014-01-25 |
|  103 | Charlotte  | NC       |    9 | 2013-04-22 |
|  103 | Greenville | SC       |    9 | 2013-05-12 |
|  103 | Atlanta    | GA       |   11 | 2013-08-21 |
+------+------------+----------+------+------------+

Here you should use an incremental append with --check-column, which specifies the column to be examined when determining which rows to import.

sqoop import --connect jdbc:mysql://localhost:3306/ydb --table yloc --username root -P --check-column rank --incremental append --last-value 7

The above command imports only the rows whose rank is greater than 7, the value given in --last-value. In this table those are the three new rows.
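If you do not want to track --last-value by hand, the same import can be wrapped in a saved Sqoop job, which retains the last value between runs. The following is only a sketch: the job name yloc_append and the run helper are made up for illustration, and the script prints the commands unless RUN=1 is set.

```shell
#!/bin/sh
# Sketch only: "yloc_append" is a hypothetical job name.
set -eu

# Print the command by default; execute it only when RUN=1 is set.
run() {
  if [ "${RUN:-0}" = "1" ]; then
    "$@"
  else
    echo "$*"
  fi
}

JOB_NAME="yloc_append"

# Save the incremental append import as a job. The arguments after the
# bare "--" are the same import arguments used in the command above.
run sqoop job --create "$JOB_NAME" -- import \
  --connect jdbc:mysql://localhost:3306/ydb \
  --table yloc --username root -P \
  --check-column rank --incremental append --last-value 7

# Run the saved job; later runs reuse the last value stored in the job.
run sqoop job --exec "$JOB_NAME"
```

When an incremental import is run directly from the command line instead, Sqoop prints the value to pass as --last-value next time, so you can also record it yourself.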

Now consider a second case, where existing rows have been updated as well:

+------+------------+----------+------+------------+
| sid  | city       | state    | rank | rDate      |
+------+------------+----------+------+------------+
|  101 | Chicago    | Illinois |    1 | 2015-01-01 |
|  101 | Schaumburg | Illinois |    3 | 2014-01-25 |
|  101 | Columbus   | Ohio     |    7 | 2014-01-25 |
|  103 | Charlotte  | NC       |    9 | 2013-04-22 |
|  103 | Greenville | SC       |    9 | 2013-05-12 |
|  103 | Atlanta    | GA       |   11 | 2013-08-21 |
|  104 | Dallas     | Texas    |    4 | 2015-02-02 |
|  105 | Phoenix    | Arizona  |   17 | 2015-02-24 |
+------+------------+----------+------+------------+

Here we use --incremental lastmodified, which fetches the new and updated rows by comparing the rDate column against the date passed in --last-value.

sqoop import --connect jdbc:mysql://localhost:3306/ydb --table yloc --username root -P --check-column rDate --incremental lastmodified --last-value 2014-01-25 --target-dir yloc/loc
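
To answer the date part directly: the date goes in --last-value, written in the same form as the values in the check column. As a rough sketch, the cutoff can be kept in a variable and checked before the command is built. The names LAST_VALUE, is_iso_date and lastmodified_cmd are made up for this example, and the script only prints the command rather than running it.

```shell
#!/bin/sh
# Sketch only: LAST_VALUE and the helper names are illustrative.
set -eu

# Succeed only for values shaped like YYYY-MM-DD (shape check, not a
# full calendar validation).
is_iso_date() {
  case "$1" in
    [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) return 0 ;;
    *) return 1 ;;
  esac
}

# Print the lastmodified import command for the given cutoff date.
lastmodified_cmd() {
  is_iso_date "$1" || { echo "bad date: $1" >&2; return 1; }
  echo "sqoop import --connect jdbc:mysql://localhost:3306/ydb" \
    "--table yloc --username root -P" \
    "--check-column rDate --incremental lastmodified" \
    "--last-value $1 --target-dir yloc/loc"
}

LAST_VALUE="2014-01-25"
lastmodified_cmd "$LAST_VALUE"
```

If the check column holds a full timestamp, quote the value (for example "2014-01-25 00:00:00") so the shell passes it as a single argument. Depending on your Sqoop version, re-running a lastmodified import into a target directory that already exists may also require --append or --merge-key, so check the documentation for your release.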
answered Feb 19 by Omkar
• 67,140 points
