Avoid duplication while streaming data into a BigQuery table

0 votes
While streaming data into a BigQuery table, I hit a network error, so I am not sure whether my data was already inserted. If I send it again, I may end up with duplicates. How do I avoid duplicate rows in the table?
Nov 20, 2019 in GCP by anonymous
• 6,260 points
3,401 views

1 answer to this question.

0 votes

If a streaming insert request fails, for example with a network error or a failure HTTP response code, there's no way to tell whether the insert actually succeeded.

If you simply re-send the request, you might end up with duplicated rows in your table.

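To help protect your table against duplication, set the insertId property on each row when sending your request, and reuse the same insertId when you retry that row.

BigQuery uses the insertId property for de-duplication on a best-effort basis, so a retried row that carries an insertId it has recently seen should not be inserted a second time.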
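
As a rough illustration, here is a minimal Python sketch of how a retry can carry the same insertId values as the first attempt. It only builds the request body in the shape used by the tabledata.insertAll REST method (a "rows" list whose entries hold "insertId" and "json") and does not call BigQuery. The helper names build_insert_all_body and content_insert_id, and the sample rows, are hypothetical and chosen just for this example.

```python
import hashlib
import json
import uuid


def build_insert_all_body(rows, insert_ids):
    """Pair each row with its insertId in a tabledata.insertAll style body.

    Hypothetical helper: it only builds the dictionary, it sends nothing.
    """
    if len(rows) != len(insert_ids):
        raise ValueError("need exactly one insertId per row")
    return {
        "rows": [
            {"insertId": insert_id, "json": row}
            for row, insert_id in zip(rows, insert_ids)
        ]
    }


def content_insert_id(row):
    """Alternative: derive the insertId from the row content.

    Assumption: two rows with identical content really are the same record.
    If identical rows can be legitimate, use a stored random ID instead.
    """
    canonical = json.dumps(row, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Sample rows, made up for the example.
rows = [{"user": "alice", "score": 10}, {"user": "bob", "score": 7}]

# Generate the IDs once, before the first attempt, and keep them with the rows.
insert_ids = [str(uuid.uuid4()) for _ in rows]

first_attempt = build_insert_all_body(rows, insert_ids)
# After a network error, rebuild the request from the SAME ids, not new ones.
retry_attempt = build_insert_all_body(rows, insert_ids)
```

The important point is that the IDs are created before the first send and stored alongside the rows, so a retry never generates fresh ones. If you use the google-cloud-bigquery Python client instead of the REST API, Client.insert_rows_json accepts a row_ids argument that serves the same purpose.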

Hope this helps!

answered Nov 20, 2019 by Sirajul
• 59,230 points
