Google PubSub Subscription cannot recover from StatusCode.UNAVAILABLE [code=8a75] error


Exceptions:

Exception in thread Thread-ConsumeBidirectionalStream: grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with (StatusCode.UNAVAILABLE, The service was unable to fulfill your request. Please try again. [code=8a75])>

The above exception was the direct cause of the following exception:

google.api_core.exceptions.ServiceUnavailable: 503 The service was unable to fulfill your request. Please try again. [code=8a75]

I'm building an IoT prototype that roughly follows Google's end-to-end sample (docs | code), and the subscriber fails with the error above whenever there are no messages in the queue. It happens about a minute after the subscriber first starts against an empty queue, and also about a minute after the queue is emptied, regardless of how many messages were processed first.

I found a workaround here on StackOverflow but can't get it working: all it seems to do is hide the error, and my subscriber still hangs and doesn't process any further messages. So my question is how to get this workaround policy working.

The relevant bits of code look like this:

from google.api_core.exceptions import ServiceUnavailable
from google.cloud import pubsub
from google.cloud.pubsub_v1.subscriber.message import Message

from google.cloud.pubsub_v1.subscriber.policy import thread
import grpc

class WorkaroundPolicy(thread.Policy):
    def on_exception(self, exception):
        # If we are seeing UNAVAILABLE then we need to retry (so return None)
        unavailable = grpc.StatusCode.UNAVAILABLE

        if isinstance(exception, ServiceUnavailable):
            logger.warning('Ignoring google.api_core.exceptions.ServiceUnavailable exception: {}'.format(exception))
            return
        elif getattr(exception, 'code', lambda: None)() in [unavailable]:
            logger.warning('Ignoring grpc.StatusCode.UNAVAILABLE (Original exception: {})'.format(exception))
            return

        # For anything else fall back on the parent class behaviour
        super(WorkaroundPolicy, self).on_exception(exception)


# Other imports and code omitted for brevity

def callback(msg: Message):
    try:
        data = json.loads(msg.data)
    except ValueError as e:
        logger.error('Loading Payload ({}) threw an Exception: {}.'.format(msg.data, e))
        # For the prototype, if we can't read it, then discard it
        msg.ack()
        return

    device_project_id = msg.attributes['projectId']
    device_registry_id = msg.attributes['deviceRegistryId']
    device_id = msg.attributes['deviceId']
    device_region = msg.attributes['deviceRegistryLocation']

    self._update_device_config(
      device_project_id,
      device_region,
      device_registry_id,
      device_id,
      data)

    msg.ack()

def run(self, project_name, subscription_name):
    # Specify a workaround policy to handle StatusCode.UNAVAILABLE [code=8a75] error (but may get CPU issues)
    #subscriber = pubsub.SubscriberClient(policy_class=WorkaroundPolicy)

    # Alternatively, instantiate subscriber without the workaround to see full exception stack
    subscriber = pubsub.SubscriberClient()

    # Build the fully qualified path: projects/<project>/subscriptions/<subscription>
    subscription_path = subscriber.subscription_path(project_name, subscription_name)
    subscription = subscriber.subscribe(subscription_path, callback)

    # Blocks the main thread while the subscription is open
    subscription.future.result()

    while True:
        time.sleep(60)

If it helps, the full version of this can be found in GitHub.
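
The duck-typed status check inside the workaround policy can be looked at in isolation. The following is only a sketch, not the library's own logic: `StatusCode` and `FakeRpcError` are hypothetical stand-ins (so it runs without grpc installed), and `is_unavailable` is a name made up for illustration. It also guards against `code` being a plain attribute rather than a method, which I believe is the case for `google.api_core` exceptions (there `code` holds the HTTP status), since calling a non-callable `code` would raise a TypeError inside `on_exception`.

```python
import enum


class StatusCode(enum.Enum):
    """Stand-in for grpc.StatusCode (assumption: only these two members are needed)."""
    OK = 0
    UNAVAILABLE = 14


class FakeRpcError(Exception):
    """Hypothetical exception that exposes code() as a method, like a gRPC call error."""

    def __init__(self, status):
        super().__init__(status.name)
        self._status = status

    def code(self):
        return self._status


def is_unavailable(exception, unavailable=StatusCode.UNAVAILABLE):
    """Return True only if the exception reports the UNAVAILABLE status.

    `code` may be missing, a method, or a plain attribute; only call it
    when it is actually callable.
    """
    code = getattr(exception, 'code', None)
    if callable(code):
        code = code()
    return code == unavailable


if __name__ == '__main__':
    print(is_unavailable(FakeRpcError(StatusCode.UNAVAILABLE)))
    print(is_unavailable(FakeRpcError(StatusCode.OK)))
    print(is_unavailable(ValueError('no code attribute')))
```

In the real policy the default would be `grpc.StatusCode.UNAVAILABLE`, and the check would sit alongside the `isinstance(exception, ServiceUnavailable)` test rather than replace it.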

Stack trace/command line output (without policy workaround)

(venv) Freds-MBP:iot-subscriber-issue Fred$ python Controller.py \
     --project_id=xyz-tests \
     --pubsub_subscription=simple-mqtt-controller \
     --service_account_json=/Users/Fred/_dev/gcp-credentials/simple-mqtt-controller-service-account.json

2018-03-21 09:36:20,975 INFO Controller Creating credentials from JSON Key File: "/Users/Fred/_dev/gcp-credentials/simple-mqtt-controller-service-account.json"...
2018-03-21 09:36:20,991 INFO Controller Creating service from discovery URL: "https://cloudiot.googleapis.com/$discovery/rest?version=v1"...
2018-03-21 09:36:20,992 INFO googleapiclient.discovery URL being requested: GET https://cloudiot.googleapis.com/$discovery/rest?version=v1
2018-03-21 09:36:21,508 INFO Controller Creating subscriber for project: "xyz-tests" and subscription: "simple-mqtt-controller"...
2018-03-21 09:36:23,200 INFO Controller Listening for messages on projects/xyz-tests/subscriptions/simple-mqtt-controller...

# This then occurs typically 60 seconds or so (sometimes up to 2 mins) later:

Exception in thread Thread-ConsumeBidirectionalStream:
Traceback (most recent call last):
  File "/Users/Fred/_dev/datacentricity-public-samples/iot-subscriber-issue/venv/lib/python3.6/site-packages/google/api_core/grpc_helpers.py", line 76, in next
    return six.next(self._wrapped)
  File "/Users/Fred/_dev/datacentricity-public-samples/iot-subscriber-issue/venv/lib/python3.6/site-packages/grpc/_channel.py", line 347, in __next__
    return self._next()
  File "/Users/Fred/_dev/datacentricity-public-samples/iot-subscriber-issue/venv/lib/python3.6/site-packages/grpc/_channel.py", line 341, in _next
    raise self
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with (StatusCode.UNAVAILABLE, The service was unable to fulfill your request. Please try again. [code=8a75])>

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/Users/Fred/_dev/datacentricity-public-samples/iot-subscriber-issue/venv/lib/python3.6/site-packages/google/cloud/pubsub_v1/subscriber/_consumer.py", line 349, in _blocking_consume
    for response in responses:
  File "/Users/Fred/_dev/datacentricity-public-samples/iot-subscriber-issue/venv/lib/python3.6/site-packages/google/cloud/pubsub_v1/subscriber/_consumer.py", line 476, in _pausable_iterator
    yield next(iterator)
  File "/Users/Fred/_dev/datacentricity-public-samples/iot-subscriber-issue/venv/lib/python3.6/site-packages/google/api_core/grpc_helpers.py", line 78, in next
    six.raise_from(exceptions.from_grpc_error(exc), exc)
  File "<string>", line 3, in raise_from
google.api_core.exceptions.ServiceUnavailable: 503 The service was unable to fulfill your request. Please try again. [code=8a75]

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/threading.py", line 916, in _bootstrap_inner
    self.run()
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/threading.py", line 864, in run
    self._target(*self._args, **self._kwargs)
  File "/Users/Fred/_dev/datacentricity-public-samples/iot-subscriber-issue/venv/lib/python3.6/site-packages/google/cloud/pubsub_v1/subscriber/_consumer.py", line 363, in _blocking_consume
    request_generator, response_generator)
  File "/Users/Fred/_dev/datacentricity-public-samples/iot-subscriber-issue/venv/lib/python3.6/site-packages/google/cloud/pubsub_v1/subscriber/_consumer.py", line 275, in _stop_request_generator
    if not response_generator.done():
AttributeError: '_StreamingResponseIterator' object has no attribute 'done'

^C

Traceback (most recent call last):
  File "Controller.py", line 279, in <module>
    if __name__ == '__main__':
  File "Controller.py", line 274, in main
    try:
  File "Controller.py", line 196, in run

  File "/Users/Fred/_dev/datacentricity-public-samples/iot-subscriber-issue/venv/lib/python3.6/site-packages/google/cloud/pubsub_v1/futures.py", line 111, in result
    err = self.exception(timeout=timeout)
  File "/Users/Fred/_dev/datacentricity-public-samples/iot-subscriber-issue/venv/lib/python3.6/site-packages/google/cloud/pubsub_v1/futures.py", line 133, in exception
    if not self._completed.wait(timeout=timeout):
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/threading.py", line 551, in wait
    signaled = self._cond.wait(timeout)
  File "/Library/Frameworks/Python.framework/Versions/3.6/lib/python3.6/threading.py", line 295, in wait
    waiter.acquire()
KeyboardInterrupt
(venv) Freds-MBP:iot-subscriber-issue Fred$

asked Oct 5, 2018 in Python by eatcodesleeprepeat

1 answer to this question.

I'm seeing similar behavior, except that regardless of the workaround policy, the process always hangs without any exception being thrown.

Even stranger, it works on a colleague's machine but not on mine. I have tried two other machines and a Docker container, and they hang too.

Next, I'm going to try stripping everything down to a minimal containerized example that I can take to Google, and hopefully get some feedback from them.

Let me know if you learn anything in the interim.
answered Oct 5, 2018 by Priyaj
