Avoid duplication while streaming data into a BigQuery table

When I tried to stream data into a BigQuery table, I hit a network error, and now I am not sure whether my data was already inserted. If I send it again, I may create duplicates. How do I avoid duplicate rows in the table?
Nov 20 in GCP by anonymous

1 answer to this question.


If the request fails with an error, such as a network error or an HTTP error response, there is no way to tell whether the streaming insert succeeded.

If you simply re-send the request, you might end up with duplicate rows in your table.

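To help protect your table against duplication, set the insertId property on each row when you send the request, and reuse the same insertId when you retry.

BigQuery uses the insertId property for best-effort de-duplication. It only remembers an insertId for a short window (about one minute, according to the documentation), so this reduces duplicates on quick retries but does not guarantee that none occur.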
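Here is a rough sketch of the idea in Python, using only the standard library. It builds a request body in the shape that the tabledata.insertAll REST method expects (a "rows" list where each entry has "insertId" and "json") and retries with the exact same body. The helper names (make_insert_id, build_insert_all_body, send_with_retry) and the send callable are made up for illustration; you would plug in whatever actually performs the HTTP call. Deriving the ID from a hash of the row content is just one option and is my assumption, not something BigQuery requires.

```python
import hashlib
import json
from typing import Any, Callable, Dict, List


def make_insert_id(row: Dict[str, Any]) -> str:
    """Derive a stable ID from the row content (hypothetical helper).

    Because the ID depends only on the content, a retry of the same row
    always carries the same insertId.
    """
    canonical = json.dumps(row, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def build_insert_all_body(rows: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Build a tabledata.insertAll style body with one insertId per row."""
    return {"rows": [{"insertId": make_insert_id(r), "json": r} for r in rows]}


def send_with_retry(
    send: Callable[[Dict[str, Any]], Any],
    body: Dict[str, Any],
    attempts: int = 3,
) -> Any:
    """Call send(body) up to `attempts` times, reusing the same body.

    `send` is a placeholder for the code that actually posts the request.
    Only ConnectionError is retried here; other errors propagate.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return send(body)  # same insertIds on every attempt
        except ConnectionError as exc:
            last_error = exc
    raise last_error
```

One caveat with hashing the content: two rows that are legitimately identical get the same insertId, so BigQuery may drop one of them. If that matters, generate a random ID (for example with uuid.uuid4()) once per row before the first attempt and keep it for the retries. If you use the google-cloud-bigquery Python client instead of raw REST, I believe insert_rows_json accepts a row_ids argument for the same purpose, but check the client documentation to confirm.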

Hope this helps!

answered Nov 20 by Sirajul
