Avoid duplication while streaming data into a BigQuery table

0 votes
While streaming data into a BigQuery table, I hit a network error, and I am not sure whether my data was already inserted. If I send it again, I may create duplicates. How do I avoid duplicate rows in the table?
Nov 20, 2019 in GCP by anonymous
• 6,260 points
3,060 views

1 answer to this question.

0 votes

If a streaming insert fails with a network error or a failing HTTP response code, there is no way to tell whether the insert actually succeeded.

If you simply re-send the request, you might end up with duplicated rows in your table.

To help protect your table against duplication, set the insertId property when sending your request. 

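BigQuery uses the insertId property for best-effort de-duplication, so when you retry, re-send each row with the same insertId it had on the first attempt.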
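
As a rough illustration, here is a minimal Python sketch of a streaming insert (tabledata.insertAll) request body in which every row carries an insertId. The helper names make_insert_id and build_insert_all_body and the sample rows are hypothetical, and hashing the row content is only one possible way to get an ID that stays stable across retries. The sketch builds the request body only and does not send anything to BigQuery.

```python
import hashlib
import json


def make_insert_id(row: dict) -> str:
    """Hypothetical helper: derive a stable ID from the row content,
    so that a retry of the same row produces the same insertId."""
    canonical = json.dumps(row, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_insert_all_body(rows: list) -> dict:
    """Hypothetical helper: build an insertAll request body in which
    each row has an "insertId" next to its "json" payload."""
    return {
        "rows": [{"insertId": make_insert_id(row), "json": row} for row in rows]
    }


# Sample rows, invented for this example.
rows = [
    {"event_id": "evt-1", "user": "alice", "amount": 10},
    {"event_id": "evt-2", "user": "bob", "amount": 25},
]

first_attempt = build_insert_all_body(rows)
retry_attempt = build_insert_all_body(rows)  # e.g. rebuilt after a network error

# The retry carries exactly the same insertId values as the first attempt,
# which is what allows the rows to be recognised as repeats.
assert first_attempt == retry_attempt
```

Note that with a content hash, two rows that are genuinely identical would also share an insertId, so if your data has a natural unique key (an event ID, for example) that is probably the safer choice. If you use the google-cloud-bigquery Python client instead of calling the REST API directly, its insert_rows_json method accepts a row_ids argument for the same purpose.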

Hope this helps!

answered Nov 20, 2019 by Sirajul
• 59,230 points
