Avoid duplication while streaming data into a BigQuery table

While streaming data into a BigQuery table, I hit a network error, and now I can't tell whether the data was already inserted. If I send it again, I might create duplicate rows. How do I avoid duplicates in the table?
Nov 20, 2019 in GCP by anonymous

1 answer to this question.


If a streaming insert fails with a network error or an HTTP error response, there's no way to tell from the client side whether the rows were actually written.

If you simply re-send the request, you might end up with duplicate rows in your table.

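To help protect your table against duplication, set the insertId property on each row when sending your request, and reuse the same insertId when you retry that row.

BigQuery uses the insertId property for best-effort de-duplication: a retried row that carries the same insertId as an earlier attempt is treated as the same row. BigQuery only remembers an insertId for a short time (at least one minute), so retry promptly and don't treat this as a strict exactly-once guarantee.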
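As a rough illustration of the retry pattern, here is a minimal Python sketch using only the standard library. It builds a request body in the shape the tabledata.insertAll REST method expects (a "rows" list whose entries have "insertId" and "json"), and re-sends the exact same body after a network error. The helper names (make_insert_id, build_insert_all_body, send_with_retry) and the send callable are hypothetical, not part of any BigQuery client; deriving the insertId from a hash of the row content is just one possible choice and assumes that two rows with identical content really are the same row.

```python
import hashlib
import json
import time


def make_insert_id(row: dict) -> str:
    """Derive a stable ID from the row content so a retry reuses the same ID.

    Assumption: rows with identical content are the same logical row.
    """
    canonical = json.dumps(row, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_insert_all_body(rows: list) -> dict:
    """Build an insertAll-style request body with one insertId per row."""
    return {
        "rows": [{"insertId": make_insert_id(r), "json": r} for r in rows],
    }


def send_with_retry(send, body: dict, attempts: int = 3, delay: float = 0.0):
    """Call the caller-supplied send(body); on ConnectionError, re-send the SAME body.

    `send` is a hypothetical function that performs the actual HTTP request.
    """
    for attempt in range(1, attempts + 1):
        try:
            return send(body)
        except ConnectionError:
            if attempt == attempts:
                raise  # out of retries: surface the error to the caller
            time.sleep(delay)
```

The important point is that the insertId values are fixed before the first attempt and are not regenerated on retry. If your rows can legitimately repeat, generate a random ID (for example with uuid.uuid4()) once per row before the first send instead of hashing the content. If you use the google-cloud-bigquery Python client rather than raw REST, its insert_rows_json method accepts a row_ids argument that serves the same purpose.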

Hope this helps!

answered Nov 20, 2019 by Sirajul
