How Java logging should work

In this article you will find answers to these questions: Let’s start. To begin with, there are two main concepts in Java logging: The problem. Suppose you’re a Java library developer (let’s say the library is called our_library) rather than a standard Java application developer. You need logging in your library. Suppose you have chosen log4j. […]

Spark Java access remote HDFS

Suppose we need to work with a different HDFS (clusterB, for instance) from our Spark Java application running on clusterA. First, you need to add a --conf key to your run command. The key depends on the Spark version: Second, when creating Spark’s Java context, add this: You need to go to clusterB and gather core-site.xml and hdfs-site.xml from there (the default location for Cloudera is /etc/hadoop/conf) […]

Java access remote HDFS from current Hadoop cluster

Suppose our Java app runs on Hadoop clusterA and we want to access a remote HDFS on Hadoop clusterB. Let’s see how to do it: You need to go to clusterB, gather core-site.xml and hdfs-site.xml from there (the default location for Cloudera is /etc/hadoop/conf), and put them next to your app running on clusterA. […]

Java SSL certificate revocation check

There are two common ways to check TLS certificate revocation status: Certificate Revocation List (CRL) and Online Certificate Status Protocol (OCSP). The second option is the faster and more modern one. For OCSP to work, the OCSP URL must be made available in some way. There are at least two options: Your Certificate Authority (CA) automatically puts […]
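
As a rough illustration of the CRL versus OCSP choice, here is a minimal sketch that uses only the JDK's `PKIXRevocationChecker`. By default this checker prefers OCSP and falls back to CRLs. The class name `RevocationCheckSketch` and the method `newChecker` are hypothetical names chosen for this example, and the original post may use a different approach.

```java
import java.security.NoSuchAlgorithmException;
import java.security.cert.CertPathValidator;
import java.security.cert.PKIXRevocationChecker;
import java.security.cert.PKIXRevocationChecker.Option;
import java.util.EnumSet;
import java.util.Set;

// Hypothetical helper, not taken from the original post.
final class RevocationCheckSketch {

    // Creates a revocation checker from the JDK's PKIX validator.
    // With no options set, OCSP is tried first and CRLs are the fallback.
    static PKIXRevocationChecker newChecker(boolean allowCrlFallback)
            throws NoSuchAlgorithmException {
        CertPathValidator validator = CertPathValidator.getInstance("PKIX");
        PKIXRevocationChecker checker =
                (PKIXRevocationChecker) validator.getRevocationChecker();

        Set<Option> options = EnumSet.noneOf(Option.class);
        if (!allowCrlFallback) {
            // Use OCSP only: do not fall back to CRLs if OCSP fails.
            options.add(Option.NO_FALLBACK);
        }
        checker.setOptions(options);
        return checker;
    }

    public static void main(String[] args) throws Exception {
        PKIXRevocationChecker ocspOnly = newChecker(false);
        System.out.println("Options: " + ocspOnly.getOptions());
    }
}
```

The returned checker can then be registered on a `PKIXParameters` instance with `addCertPathChecker(...)` before validating a certificate path.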

How to create a Kafka Kerberos Java consumer

Suppose you need to create a Kafka Java consumer with Kerberos. The code will be: You don’t need to specify the java.security.auth.login.config Java property, because we set the SaslConfigs.SASL_JAAS_CONFIG property directly on the consumer. You just need to make the changes in the kafkaJaasConfiguration() method that your Kerberos configuration requires.
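
The post's own code is not reproduced in this excerpt, so the following is only a sketch of what an inline JAAS setup might look like. To stay free of dependencies it uses the plain string keys (`sasl.jaas.config` is the value behind `SaslConfigs.SASL_JAAS_CONFIG`). The class name, the keytab path, the principal, the broker address, the `SASL_PLAINTEXT` protocol and the `kafka` service name are all assumptions to adapt to your environment.

```java
import java.util.Properties;

// Hypothetical helper, not the original post's code.
final class KerberosConsumerConfig {

    // Builds the inline JAAS entry for keytab-based Kerberos login.
    // keytabPath and principal are placeholders supplied by the caller.
    static String kafkaJaasConfiguration(String keytabPath, String principal) {
        return "com.sun.security.auth.module.Krb5LoginModule required "
                + "useKeyTab=true storeKey=true "
                + "keyTab=\"" + keytabPath + "\" "
                + "principal=\"" + principal + "\";";
    }

    // Collects consumer properties; pass the result to new KafkaConsumer<>(props).
    static Properties consumerProperties(String bootstrapServers,
                                         String groupId,
                                         String jaasConfig) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("group.id", groupId);
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        // Assumption: brokers listen on SASL_PLAINTEXT; use SASL_SSL with TLS.
        props.put("security.protocol", "SASL_PLAINTEXT");
        props.put("sasl.mechanism", "GSSAPI");
        // Assumption: the broker's Kerberos service name is "kafka".
        props.put("sasl.kerberos.service.name", "kafka");
        // Setting the JAAS entry here replaces java.security.auth.login.config.
        props.put("sasl.jaas.config", jaasConfig);
        return props;
    }

    public static void main(String[] args) {
        String jaas = kafkaJaasConfiguration("/path/to/app.keytab", "app@EXAMPLE.COM");
        Properties props = consumerProperties("broker1:9092", "demo-group", jaas);
        System.out.println(props.getProperty("sasl.jaas.config"));
    }
}
```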

HDFS DELEGATION TOKEN can’t be found in cache

The problem can appear in Hadoop’s NodeManager logs. It usually means that the NodeManager is trying to use an expired or unrenewed HDFS delegation token. For example, you can hit this error during the application log aggregation process. The timeline is: Your application passes the HDFS delegation token to the NodeManager through the ContainerLaunchContext class, because […]