HDFS_DELEGATION_TOKEN can't be found in cache
This error can appear in Hadoop’s NodeManager logs.
It usually means that NodeManager is trying to use an HDFS delegation token that has expired or was not renewed.
For example, you can hit this error during the application log aggregation process. The timeline is:
- Your application passes an HDFS delegation token to the NodeManager through the ContainerLaunchContext class, because NodeManager needs it to localize container resources.
- NodeManager uses the same HDFS delegation token to aggregate the logs: it transfers all application log files from the nodes to HDFS.
- You don’t start containers on some nodes for a long time, so the tokens held by the corresponding NodeManagers expire.
- When you kill your application, log aggregation is triggered, but some NodeManagers report the HDFS_DELEGATION_TOKEN error because they no longer hold a valid token.
To solve this error, your application must itself renew the token that it previously passed to NodeManager.
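A periodic renewal can be sketched roughly as follows. This is only an illustration of the scheduling idea, not Hadoop code: TokenRenewalScheduler, its Renewer interface and the "renew after 75% of the remaining validity" rule are all hypothetical. In a real application the Renewer would probably wrap Hadoop’s Token#renew(Configuration), which returns the new expiration time; that wiring is left out here.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper: renews a token repeatedly, well before each expiry. */
final class TokenRenewalScheduler implements AutoCloseable {

    /**
     * Hypothetical hook that performs one renewal and returns the token's new
     * expiration time in epoch milliseconds.
     */
    @FunctionalInterface
    interface Renewer {
        long renew() throws Exception;
    }

    private final ScheduledExecutorService executor =
            Executors.newSingleThreadScheduledExecutor();
    private final Renewer renewer;
    private final double fraction;      // share of the remaining validity to wait
    private final long minDelayMillis;  // floor that avoids a tight renewal loop

    TokenRenewalScheduler(Renewer renewer, double fraction, long minDelayMillis) {
        this.renewer = renewer;
        this.fraction = fraction;
        this.minDelayMillis = minDelayMillis;
    }

    /** Delay until the next renewal attempt, never shorter than minDelayMillis. */
    static long nextDelayMillis(long nowMillis, long expiryMillis,
                                double fraction, long minDelayMillis) {
        long remaining = Math.max(0L, expiryMillis - nowMillis);
        return Math.max(minDelayMillis, (long) (remaining * fraction));
    }

    /** Starts the renewal loop for a token that currently expires at expiryMillis. */
    void start(long expiryMillis) {
        long delay = nextDelayMillis(System.currentTimeMillis(), expiryMillis,
                fraction, minDelayMillis);
        executor.schedule(this::renewOnce, delay, TimeUnit.MILLISECONDS);
    }

    private void renewOnce() {
        try {
            long newExpiry = renewer.renew();
            start(newExpiry); // schedule the following renewal
        } catch (Exception e) {
            // Renewal failed or the scheduler was closed: stop the loop.
            System.err.println("Token renewal stopped: " + e);
        }
    }

    @Override
    public void close() {
        executor.shutdownNow();
    }

    public static void main(String[] args) throws Exception {
        // Fake renewer for demonstration: each renewal adds 200 ms of validity.
        Renewer fake = () -> {
            System.out.println("renewing token");
            return System.currentTimeMillis() + 200;
        };
        try (TokenRenewalScheduler scheduler =
                     new TokenRenewalScheduler(fake, 0.75, 10)) {
            scheduler.start(System.currentTimeMillis() + 200);
            Thread.sleep(1000); // let a few renewals happen
        }
    }
}
```

Renewing after a fraction of the remaining validity, rather than right at the expiry time, leaves some room for a slow or failed renewal call. The loop stops on the first failure, which is also what happens once the token can no longer be renewed.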
Important notice! After about 7 days the token expires for good and can’t be renewed any more. You don’t need to do anything at that point, because NodeManager will create its own token.
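The 7-day limit can be illustrated with a small model of how a renewal moves the expiration time. This is a simplified sketch, not Hadoop code: DelegationTokenLifetime is a hypothetical class, and the 1-day renew interval and 7-day maximum lifetime are assumptions meant to match Hadoop’s defaults for dfs.namenode.delegation.token.renew-interval and dfs.namenode.delegation.token.max-lifetime. The model also ignores the extra requirement that a token must not already be expired when it is renewed.

```java
import java.time.Duration;

/** Hypothetical model of a delegation token's lifetime; not Hadoop code. */
final class DelegationTokenLifetime {
    // Assumed values: a 1-day renew interval and a 7-day maximum lifetime.
    static final long RENEW_INTERVAL_MS = Duration.ofDays(1).toMillis();
    static final long MAX_LIFETIME_MS = Duration.ofDays(7).toMillis();

    /**
     * Returns the expiration time after a renewal performed at nowMs.
     * Each renewal extends validity by one renew interval, but never past
     * issuedAtMs + MAX_LIFETIME_MS; after that point renewal is refused.
     */
    static long expiryAfterRenewal(long issuedAtMs, long nowMs) {
        long maxDateMs = issuedAtMs + MAX_LIFETIME_MS;
        if (nowMs > maxDateMs) {
            throw new IllegalStateException("token is past its maximum lifetime");
        }
        return Math.min(maxDateMs, nowMs + RENEW_INTERVAL_MS);
    }

    public static void main(String[] args) {
        long issuedAtMs = 0L;
        long dayMs = Duration.ofDays(1).toMillis();
        // Try one renewal per day and show how far each one extends the token.
        for (int day = 0; day <= 8; day++) {
            try {
                long expiryMs = expiryAfterRenewal(issuedAtMs, day * dayMs);
                System.out.println("renewed on day " + day
                        + ", valid until day " + (double) expiryMs / dayMs);
            } catch (IllegalStateException e) {
                System.out.println("day " + day + ": " + e.getMessage());
            }
        }
    }
}
```

In this model, renewals late in the token’s life gain less and less time, and once the maximum lifetime has passed no renewal is possible, so a fresh token is the only way forward.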
By the way, make sure that you are using Hadoop version >= 2.6.0. Older versions have a bug that prevents NodeManager from creating its own token: https://issues.apache.org/jira/browse/YARN-2704.
If you still have any questions, feel free to ask me in the comments under this article or write me at firstname.lastname@example.org.