
Date: 2025-05-20 23:39:31

The known issue is described here:

https://learn.microsoft.com/en-us/azure/hdinsight/hdinsight-known-issues

Spark History Service => Decommissioned node logs cannot be accessed directly from Spark / YARN UI (Expected behavior)

This issue can be very bothersome: a Spark job that ran recently (within the past hour) may stop showing its logs in the Spark UI. In my view this bug should be fixed as a priority.

In the meantime, here are a few alternative approaches that the product group (PG) suggested for customers to use:

Alternative #1: Manually construct the URL to the Job History to access the decommissioned aggregated logs.

Example:

https://<CLUSTERDNSNAME>.azurehdinsight.net/yarnui/jobhistory/logs/<Decommissioned worker node FQDN>/port/30050/<CONTAINER-ID>/<CONTAINER-ID>/root/stderr?start=-4096
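As a convenience, the URL pattern above can be assembled programmatically. This is only a sketch: the cluster name, node FQDN, and container ID below are hypothetical placeholders, and the port (30050) and `start=-4096` offset are taken directly from the example URL.

```python
# Sketch: build the YARN JobHistory URL for a decommissioned node's
# aggregated container logs, following the pattern shown above.
# All argument values in the usage example are hypothetical.
def history_log_url(cluster_dns_name, node_fqdn, container_id,
                    port=30050, log_file="stderr", start=-4096):
    return (
        f"https://{cluster_dns_name}.azurehdinsight.net/yarnui/jobhistory/"
        f"logs/{node_fqdn}/port/{port}/{container_id}/{container_id}/"
        f"root/{log_file}?start={start}"
    )

url = history_log_url(
    "mycluster",                                   # hypothetical cluster DNS name
    "wn0-myclus.abc123.cx.internal.cloudapp.net",  # hypothetical decommissioned node FQDN
    "container_1700000000000_0001_01_000002",      # hypothetical container ID
)
print(url)
```

Note that the container ID appears twice in the path, as in the example URL above.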

Alternative #2: Use the schedule-based autoscaling workflow. This allows developers time to debug job failures before the cluster scales down.

Alternative #3: Use the yarn logs command via the Azure CLI.
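For example, assuming SSH access to the cluster head node, the standard `yarn logs` command can fetch the aggregated logs for an application; the application ID below is a hypothetical placeholder.

```shell
# Fetch aggregated YARN logs for a finished application and save them
# to a local file. Replace the application ID with your own.
yarn logs -applicationId application_1700000000000_0001 > app_logs.txt
```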

Alternative #4: Use an open-source converter to translate the TFile-formatted logs in the Azure Storage account to plain text.

Posted by: David Beavon