Avoid duplication while streaming data into a BigQuery table

0 votes
While streaming data into a BigQuery table, I hit a network error, and now I am not sure whether my data was already inserted. If I send it again, I may end up with duplicates. How do I avoid duplicate rows in the table?
Nov 20, 2019 in GCP by anonymous
• 5,840 points
299 views

1 answer to this question.

0 votes

If a streaming insert fails with a network error or an error HTTP response code, there's no way to tell whether the insert succeeded.

If you try to simply re-send the request, you might end up with duplicated rows in your table. 

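To help protect your table against duplication, set the insertId property on each row when sending your request, and reuse the same insertId value for that row if you retry.

BigQuery uses the insertId property for de-duplication. This is done on a best-effort basis, so it reduces duplicates on retry but should not be treated as a strict guarantee.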
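
As a rough illustration, here is a minimal sketch of building a streaming insert request body in which every row carries an insertId that stays the same across retries. The helper names (`make_insert_id`, `build_insert_all_body`) and the sample rows are made up for this example, and deriving the ID from a hash of the row content is just one possible choice, not something the answer above prescribes. The sketch only builds the payload and does not call BigQuery.

```python
import hashlib
import json
from typing import Any, Dict, List


def make_insert_id(row: Dict[str, Any]) -> str:
    """Hypothetical helper: derive a stable ID from the row content.

    Serializing with sorted keys means the same row always yields the
    same ID, so a retry re-sends the same insertId.
    """
    canonical = json.dumps(row, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_insert_all_body(rows: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Hypothetical helper: build a streaming insert body.

    Each entry pairs the row data ("json") with its "insertId".
    """
    return {
        "rows": [{"insertId": make_insert_id(row), "json": row} for row in rows]
    }


# Sample rows, invented for illustration only.
rows = [
    {"user_id": 1, "event": "login"},
    {"user_id": 2, "event": "logout"},
]

first_attempt = build_insert_all_body(rows)
# After a network error, rebuild (or simply re-send) the same body:
retry = build_insert_all_body(rows)
```

Note that with a content hash, two rows that are legitimately identical would get the same insertId. If that can happen in your data, one alternative is to generate a random ID (for example with `uuid.uuid4()`) once per row, store it alongside the row before the first attempt, and reuse it on every retry. If you use the google-cloud-bigquery Python client rather than raw requests, its `insert_rows_json` method accepts a `row_ids` argument that should serve the same purpose.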

Hope this helps!

answered Nov 20, 2019 by Sirajul
• 54,190 points
