I have been stuck on a very problematic bug for several hours. About ten of us have been trying to work out how to fix it, but none of us could.

When importing tensorflow, I had to use

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

to keep the communication with Halite working. However, when a specific TensorFlow error occurs, I cannot catch it properly.

We tried the following:

log = logging.getLogger('tensorflow')
stderr = sys.stdout
sys.stdout = open("my_error.log", 'w')
## tf inference code here
sys.stdout = stderr

We also tried the same thing with stderr, but the problem is always the same. Could you suggest a way to set up a proper debugging environment?
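One possible setup, as a minimal sketch rather than a confirmed fix: reassigning sys.stdout or sys.stderr only affects writes made from Python, while native code writes to the underlying file descriptor directly, so those messages bypass the reassignment. The sketch below redirects file descriptor 2 itself to a log file and appends the Python traceback to the same file, leaving stdout untouched (assuming the Halite engine reads the bot's stdout). The names stderr_to_file and run_logged are hypothetical helpers, not part of TensorFlow or Halite.

```python
import contextlib
import os
import sys
import traceback


@contextlib.contextmanager
def stderr_to_file(path):
    """Send everything written to file descriptor 2 to `path`, then restore it."""
    sys.stderr.flush()
    saved_fd = os.dup(2)  # keep a copy of the real stderr
    log_fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)
    try:
        os.dup2(log_fd, 2)  # native and Python stderr writes now go to the file
        yield
    finally:
        sys.stderr.flush()
        os.dup2(saved_fd, 2)  # put the original stderr back
        os.close(saved_fd)
        os.close(log_fd)


def run_logged(fn, *args, log_path="my_error.log", **kwargs):
    """Hypothetical wrapper: call fn, log any exception to log_path, re-raise."""
    with stderr_to_file(log_path):
        try:
            return fn(*args, **kwargs)
        except Exception:
            # Write the traceback straight to the file so it does not depend
            # on what sys.stderr currently points to.
            with open(log_path, "a") as log_file:
                traceback.print_exc(file=log_file)
            raise
```

Wrapping the inference call as run_logged(my_inference_fn, inputs), where my_inference_fn stands for your own function, should leave the full traceback in my_error.log even when nothing reaches the console.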

I am still having problems using tensorflow as well, and implemented the min log level solution you listed. I also added…

tf_logger = logging.Logger('tensorflow')
tf_logger.propagate = False

…after the TF import but as I said I am still getting the occasional ‘Failed Communication’ error for no discernible reason.
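One thing that may be worth checking, offered as a sketch under assumptions: logging.Logger('tensorflow') constructs a new, unregistered logger object, so setting propagate on it does not touch the logger that logging.getLogger('tensorflow') returns. Assuming TensorFlow's Python-side messages go through the standard logger named 'tensorflow', the hypothetical helper below swaps that logger's handlers for a file handler so nothing it emits reaches the console. It would need to run after the TF import, and it does not cover messages printed by native code.

```python
import logging


def route_logger_to_file(name, path, level=logging.DEBUG):
    """Hypothetical helper: make the registered logger `name` write only to `path`."""
    logger = logging.getLogger(name)  # the shared, registered logger
    for handler in list(logger.handlers):
        logger.removeHandler(handler)  # drop any console handlers already attached
    file_handler = logging.FileHandler(path)
    file_handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    )
    logger.addHandler(file_handler)
    logger.setLevel(level)
    logger.propagate = False  # keep records away from the root logger's handlers
    return logger
```

For example, tf_logger = route_logger_to_file('tensorflow', 'tf_python.log') keeps those records in a file you can inspect after a failed game.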