Sqoop incremental append by date

0 votes

Suppose I want to do an incremental append on the basis of a date. How do I specify the date for an incremental lastmodified import?

Feb 19, 2019 in Big Data Hadoop by Kisha
4,304 views

1 answer to this question.

0 votes

Consider a table with 3 records that you have already imported to HDFS using Sqoop:

+-----+------------+----------+------+------------+
| sid | city       | state    | rank | rDate      |
+-----+------------+----------+------+------------+
| 101 | Chicago    | Illinois |    1 | 2014-01-25 |
| 101 | Schaumburg | Illinois |    3 | 2014-01-25 |
| 101 | Columbus   | Ohio     |    7 | 2014-01-25 |
+-----+------------+----------+------+------------+

sqoop import --connect jdbc:mysql://localhost:3306/ydb --table yloc --username root -P

Now the table has additional records, but no existing records have been updated:

+-----+------------+----------+------+------------+
| sid | city       | state    | rank | rDate      |
+-----+------------+----------+------+------------+
| 101 | Chicago    | Illinois |    1 | 2014-01-25 |
| 101 | Schaumburg | Illinois |    3 | 2014-01-25 |
| 101 | Columbus   | Ohio     |    7 | 2014-01-25 |
| 103 | Charlotte  | NC       |    9 | 2013-04-22 |
| 103 | Greenville | SC       |    9 | 2013-05-12 |
| 103 | Atlanta    | GA       |   11 | 2013-08-21 |
+-----+------------+----------+------+------------+

Here you should use an incremental append with --check-column, which specifies the column to be examined when determining which rows to import.

sqoop import --connect jdbc:mysql://localhost:3306/ydb --table yloc --username root -P --check-column rank --incremental append --last-value 7

The above command imports only the new rows, that is, the rows whose rank is greater than the --last-value of 7.
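To check which rows an append on rank with a last value of 7 should pick up, the comparison can be mimicked locally. This is only a rough sketch using awk over the sample rows written out as comma-separated text; it does not call Sqoop, and the variable names (rows, last_value, new_rows) are made up for illustration.

```shell
#!/bin/sh
# Sample rows from the table above, written as sid,city,state,rank,rDate.
rows='101,Chicago,Illinois,1,2014-01-25
101,Schaumburg,Illinois,3,2014-01-25
101,Columbus,Ohio,7,2014-01-25
103,Charlotte,NC,9,2013-04-22
103,Greenville,SC,9,2013-05-12
103,Atlanta,GA,11,2013-08-21'

# Same value as --last-value in the sqoop command.
last_value=7

# Keep only rows whose rank (field 4) is strictly greater than last_value.
new_rows=$(printf '%s\n' "$rows" | awk -F, -v last="$last_value" '$4 > last')

printf '%s\n' "$new_rows"
```

Running it prints the three sid 103 rows (Charlotte, Greenville, Atlanta), which are the rows added since the first import.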

Now consider a second case, where existing rows have also been updated:

+-----+------------+----------+------+------------+
| sid | city       | state    | rank | rDate      |
+-----+------------+----------+------+------------+
| 101 | Chicago    | Illinois |    1 | 2015-01-01 |
| 101 | Schaumburg | Illinois |    3 | 2014-01-25 |
| 101 | Columbus   | Ohio     |    7 | 2014-01-25 |
| 103 | Charlotte  | NC       |    9 | 2013-04-22 |
| 103 | Greenville | SC       |    9 | 2013-05-12 |
| 103 | Atlanta    | GA       |   11 | 2013-08-21 |
| 104 | Dallas     | Texas    |    4 | 2015-02-02 |
| 105 | Phoenix    | Arizona  |   17 | 2015-02-24 |
+-----+------------+----------+------+------------+

Here we use incremental lastmodified with rDate as the check column, which fetches the new and updated rows based on their date. The date is passed through --last-value.

sqoop import --connect jdbc:mysql://localhost:3306/ydb --table yloc --username root -P --check-column rDate --incremental lastmodified --last-value 2014-01-25 --target-dir yloc/loc
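If the date changes from run to run, it can help to assemble the command from a variable instead of editing it by hand. The following is a minimal sketch, not part of Sqoop: build_lastmodified_cmd is a hypothetical helper that only prints the command for a given date and rejects anything that is not in the yyyy-mm-dd form used in the table. The connection string, table, and target directory are copied from the command above.

```shell
#!/bin/sh
# Hypothetical helper: prints (does not run) the lastmodified import command
# for the date given as the first argument.
build_lastmodified_cmd() {
  last_value="$1"
  # Accept only yyyy-mm-dd so a malformed date is caught before Sqoop runs.
  case "$last_value" in
    [0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]) ;;
    *)
      echo "expected a date like 2014-01-25, got: $last_value" >&2
      return 1
      ;;
  esac
  # printf repeats the %s format for each argument, joining the pieces.
  printf '%s' "sqoop import --connect jdbc:mysql://localhost:3306/ydb" \
    " --table yloc --username root -P" \
    " --check-column rDate --incremental lastmodified" \
    " --last-value $last_value --target-dir yloc/loc"
  printf '\n'
}

# Dry run: show the command for the date used in this answer.
build_lastmodified_cmd 2014-01-25
```

Once the printed command looks right, it can be copied into the terminal and run as usual.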
answered Feb 20, 2019 by Omkar
