Spark Java access remote HDFS

Suppose we need to work with a different HDFS (clusterB, for instance) from our Spark Java application running on clusterA. First, you need to add the --conf key to your run command; the exact key depends on the Spark version. Second, when creating Spark's Java context, add the following. You need to go to clusterB and gather core-site.xml and hdfs-site.xml from there (the default location for Cloudera is /etc/hadoop/conf) […]
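
A minimal sketch of the context part, assuming the files copied from clusterB sit in the application's working directory (paths, the app name, and the hdfs://clusterB URI are illustrative; the --conf key names in the comment are the usual YARN cross-cluster flags, but check your Spark version's documentation):

```java
// Submit-time flag (the name depends on the Spark version), e.g.:
//   older Spark on YARN:  --conf spark.yarn.access.namenodes=hdfs://clusterB
//   newer Spark on YARN:  --conf spark.yarn.access.hadoopFileSystems=hdfs://clusterB
import org.apache.hadoop.fs.Path;
import org.apache.spark.SparkConf;
import org.apache.spark.api.java.JavaSparkContext;

public class RemoteHdfsSparkApp {
    public static void main(String[] args) {
        JavaSparkContext jsc = new JavaSparkContext(new SparkConf().setAppName("remote-hdfs-demo"));

        // Make clusterB's client configs visible to Spark's Hadoop configuration.
        jsc.hadoopConfiguration().addResource(new Path("core-site.xml"));
        jsc.hadoopConfiguration().addResource(new Path("hdfs-site.xml"));

        // Fully qualified paths on clusterB now resolve.
        System.out.println(jsc.textFile("hdfs://clusterB/user/demo/input.txt").count());

        jsc.stop();
    }
}
```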

READ MORE

Java access remote HDFS from current Hadoop cluster

Suppose we have our Java app running on Hadoop clusterA, and we want to access a remote HDFS based on Hadoop clusterB. Let's see how we can do it: you need to go to clusterB, gather core-site.xml and hdfs-site.xml from there (the default location for Cloudera is /etc/hadoop/conf), and put them near your app running on clusterA. […]
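
A minimal sketch of the client side, assuming the two XML files sit in the app's working directory (the class name and the listing of "/" are illustrative):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileStatus;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class RemoteHdfsClient {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Load clusterB's client configs copied next to the app.
        conf.addResource(new Path("core-site.xml"));
        conf.addResource(new Path("hdfs-site.xml"));

        // fs.defaultFS from those files now points at clusterB's namenode.
        try (FileSystem fs = FileSystem.get(conf)) {
            for (FileStatus status : fs.listStatus(new Path("/"))) {
                System.out.println(status.getPath());
            }
        }
    }
}
```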

READ MORE

HDFS DELEGATION TOKEN can’t be found in cache

This problem can appear in Hadoop's NodeManager logs. Usually it means that the NodeManager is trying to use an expired or non-renewed HDFS delegation token. For example, you can face this error during the app log aggregation process. The timeline is: your application passes an HDFS delegation token to the NodeManager through the ContainerLaunchContext class, because […]
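
For context, this is roughly how a YARN application hands HDFS delegation tokens to the NodeManager via ContainerLaunchContext (a sketch using the standard Hadoop/YARN APIs; the renewer name "yarn" and the helper method are illustrative):

```java
import java.nio.ByteBuffer;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.io.DataOutputBuffer;
import org.apache.hadoop.security.Credentials;
import org.apache.hadoop.yarn.api.records.ContainerLaunchContext;
import org.apache.hadoop.yarn.util.Records;

public class TokenPassing {
    // Obtain HDFS delegation tokens and attach them to the launch context;
    // this is the token the NodeManager later uses (e.g. for log aggregation).
    static ContainerLaunchContext buildContext(Configuration conf) throws Exception {
        Credentials credentials = new Credentials();
        FileSystem fs = FileSystem.get(conf);
        // The renewer (here "yarn") is what allows the RM to renew the token.
        fs.addDelegationTokens("yarn", credentials);

        DataOutputBuffer buffer = new DataOutputBuffer();
        credentials.writeTokenStorageToStream(buffer);
        ByteBuffer tokens = ByteBuffer.wrap(buffer.getData(), 0, buffer.getLength());

        ContainerLaunchContext ctx = Records.newRecord(ContainerLaunchContext.class);
        ctx.setTokens(tokens);
        return ctx;
    }
}
```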

READ MORE

ZooKeeper recursive watcher

If you need to set up recursive watchers (a watch on all nodes), the standard ZooKeeper Watcher class will not help much: it is installed on only one node (or one level forward when you call getChildren()), and it is also a one-time event. This means that after each watch trigger, you need to install a new […]
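
A minimal sketch of the re-registration idea with the plain ZooKeeper API (the class name and error handling are illustrative):

```java
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooKeeper;

public class RecursiveWatcher implements Watcher {
    private final ZooKeeper zk;
    private final String root;

    public RecursiveWatcher(ZooKeeper zk, String root) throws Exception {
        this.zk = zk;
        this.root = root;
        watchSubtree(root);
    }

    // Walk the subtree, arming an exists-watch and a child-watch on every node.
    private void watchSubtree(String path) throws Exception {
        if (zk.exists(path, this) == null) {  // also watches for node creation
            return;
        }
        try {
            for (String child : zk.getChildren(path, this)) {  // child watch
                watchSubtree(path + "/" + child);
            }
        } catch (KeeperException.NoNodeException raced) {
            // node deleted between exists() and getChildren(); exists-watch remains
        }
    }

    @Override
    public void process(WatchedEvent event) {
        System.out.println("event: " + event);
        try {
            // Watches are one-shot, so re-install them after every trigger.
            watchSubtree(event.getPath() != null ? event.getPath() : root);
        } catch (Exception e) {
            // illustrative: real code should handle connection loss / retries
        }
    }
}
```

As a side note, ZooKeeper 3.6+ also offers persistent recursive watches via addWatch() with AddWatchMode.PERSISTENT_RECURSIVE, which avoids the re-registration dance entirely.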

READ MORE

How to immediately terminate the Spring Boot Yarn container with an error

Imagine an error or exception occurs while running a Spring Boot Yarn container, and we need to kill the container from within itself and return an error code. You can use the @OnContainerStart annotation as described in this article: https://mchesnavsky.tech/how-to-set-up-exit-code-on-spring-boot-yarn-container. But if we need to stop the container immediately, we just need to call: – where the parameter is […]
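
A sketch of the pattern, assuming the immediate-termination call is a plain System.exit with a non-zero status (the excerpt elides the exact call, so treat that as an assumption; the class and method names are illustrative):

```java
import org.springframework.yarn.annotation.OnContainerStart;
import org.springframework.yarn.annotation.YarnComponent;

@YarnComponent
public class ContainerRunner {

    @OnContainerStart
    public void run() {
        try {
            doWork();
        } catch (Exception e) {
            // Assumption: terminate the JVM (and thus the YARN container)
            // immediately, reporting a non-zero exit code to the NodeManager.
            System.exit(1);
        }
    }

    private void doWork() {
        // container logic
    }
}
```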

READ MORE

KeeperErrorCode = ConnectionLoss for /hbase/hbaseid

IMPORTANT! If you are trying to install Apache Atlas and receiving this error, there is a separate article: https://mchesnavsky.tech/apache-atlas-building-installing/ Suppose that we are faced with these exceptions. The first: The second: The third: The hbase-client cannot connect to ZooKeeper. You need to pay attention to the address: if there is a real ZooKeeper instance at […]
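
When checking the address, these are the client settings that determine where hbase-client looks for ZooKeeper (a sketch; the host names are illustrative, and /hbase is the usual default parent znode):

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class HBaseZkCheck {
    public static void main(String[] args) throws Exception {
        Configuration conf = HBaseConfiguration.create();
        // A wrong quorum, port, or parent znode here is a typical cause of
        // "KeeperErrorCode = ConnectionLoss for /hbase/hbaseid".
        conf.set("hbase.zookeeper.quorum", "zk-host1,zk-host2,zk-host3");
        conf.set("hbase.zookeeper.property.clientPort", "2181");
        conf.set("zookeeper.znode.parent", "/hbase");

        try (Connection connection = ConnectionFactory.createConnection(conf)) {
            System.out.println("connected: " + !connection.isClosed());
        }
    }
}
```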

READ MORE

IgniteException: Work directory does not exist

Suppose that when starting the application, you encounter the following exception: It is assumed that the /tmp/ignite/work directory is located on the local file system. If the application is running on a cluster, then Ignite will try to create such a folder on each node of the cluster. Exception reasons: the account under which the […]
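
One common fix is to point Ignite at a directory the service account can actually create and write on every node (a sketch; the /var/lib/ignite/work path is illustrative):

```java
import org.apache.ignite.Ignite;
import org.apache.ignite.Ignition;
import org.apache.ignite.configuration.IgniteConfiguration;

public class IgniteWorkDirExample {
    public static void main(String[] args) {
        IgniteConfiguration cfg = new IgniteConfiguration();
        // The directory must exist (or be creatable) and be writable by the
        // account running the node, on every host in the cluster.
        cfg.setWorkDirectory("/var/lib/ignite/work");

        try (Ignite ignite = Ignition.start(cfg)) {
            System.out.println("started, workDir = " + cfg.getWorkDirectory());
        }
    }
}
```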

READ MORE