Java access remote HDFS from current Hadoop cluster

Suppose our Java app runs on Hadoop clusterA and we want to access the remote HDFS on Hadoop clusterB. Here is how to do it: go to clusterB, collect core-site.xml and hdfs-site.xml from there (the default location on Cloudera is /etc/hadoop/conf), and place them next to your app running on clusterA. […]

Java SSL certificate revocation check

There are two common ways to check TLS certificate revocation status: Certificate Revocation List (CRL) and Online Certificate Status Protocol (OCSP). The second option is the faster and more modern of the two. For it to work, the OCSP link must be provided in some way. There are at least two options: Your Certificate Authority (CA) automatically puts […]
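As a rough illustration of the two mechanisms in Java, the sketch below uses only JDK classes. It is not necessarily the approach the full article takes. The class name `RevocationCheckSketch` and both method names are hypothetical; the property names (`com.sun.net.ssl.checkRevocation`, `ocsp.enable`, `com.sun.security.enableCRLDP`) and `PKIXRevocationChecker` are standard JDK features.

```java
import java.security.Security;
import java.security.cert.CertPathValidator;
import java.security.cert.PKIXRevocationChecker;
import java.security.cert.PKIXRevocationChecker.Option;
import java.util.EnumSet;

// Hypothetical helper class, for illustration only.
class RevocationCheckSketch {

    // JVM-wide switches: ask JSSE trust managers to check revocation,
    // allow OCSP, and allow CRLs referenced by the certificate itself.
    static void enableJvmWideRevocationChecks() {
        System.setProperty("com.sun.net.ssl.checkRevocation", "true");
        Security.setProperty("ocsp.enable", "true");
        System.setProperty("com.sun.security.enableCRLDP", "true");
    }

    // Programmatic alternative: a PKIX revocation checker.
    // By default it prefers OCSP and falls back to CRLs;
    // ONLY_END_ENTITY limits the check to the leaf certificate.
    static PKIXRevocationChecker ocspFirstChecker() throws Exception {
        CertPathValidator validator = CertPathValidator.getInstance("PKIX");
        PKIXRevocationChecker checker =
                (PKIXRevocationChecker) validator.getRevocationChecker();
        checker.setOptions(EnumSet.of(Option.ONLY_END_ENTITY));
        return checker;
    }

    public static void main(String[] args) throws Exception {
        enableJvmWideRevocationChecks();
        PKIXRevocationChecker checker = ocspFirstChecker();
        System.out.println("Revocation checker options: " + checker.getOptions());
    }
}
```

To take effect during validation, the checker would be added to a `PKIXParameters` instance with `addCertPathChecker(checker)`. Adding `Option.PREFER_CRLS` would reverse the default order and try CRLs before OCSP.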

Kafka client gets stuck in CHECKING API VERSIONS state

If you face any of the following problems: the Kafka Consumer is not taking part in the rebalance process; the Kafka Consumer is not reading messages; partitions are not assigned to the Kafka Consumer even though it is subscribed to the topic; the Kafka Consumer connection gets stuck in the CHECKING_API_VERSIONS state; the Kafka Producer connection gets stuck in the CHECKING_API_VERSIONS state; then this is […]

Kafka broker Kerberos

Let’s see how to configure Kerberos between a Kafka broker and a Kafka client on the server side. The client side is covered here: https://mchesnavsky.tech/how-to-create-kafka-kerberos-java-consumer. The setup touches three files: <kafka_home>/conf/server.properties, <kafka_home>/bin/kafka-run-class.sh (an addition to KAFKA_OPTS), and /your/path/to/kafka_server_jaas.conf. Kerberos between Kafka brokers is configured with separate configuration keys, which are not covered in this article. The configuration above is for broker-client interaction.

Yarn is not aggregating application logs

First of all, check the NodeManager logs. There may be at least two problems: the “Log aggregation is not initialized” problem: https://mchesnavsky.tech/log-aggregation-is-not-initialized and the “HDFS DELEGATION TOKEN can’t be found in cache” problem: https://mchesnavsky.tech/hdfs-delegation-token-cant-be-found-in-cache. Please refer to the corresponding article, or leave a note in the comments below if you have a different problem.

Log aggregation is not initialized

You may encounter a Hadoop Yarn exception in the NodeManager logs stating that log aggregation is not initialized. It may happen because of an NM reboot. The newly launched NM inherits the running application and does not know how to collect logs from it. According to the Hadoop Yarn NodeManager source code, instances of log collector classes for each running application are stored […]

How to create Kafka Kerberos Java consumer

Suppose you need to create a Kafka Java consumer with Kerberos. You don’t need to specify the java.security.auth.login.config Java property, because we set the SaslConfigs.SASL_JAAS_CONFIG property directly on the consumer. You just need to make the changes in the kafkaJaasConfiguration() method that your Kerberos configuration requires.
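The idea of passing the JAAS configuration inline can be sketched as follows. This is not the article's original code. It uses plain string keys so it compiles without the Kafka client library; `"sasl.jaas.config"` is the value of SaslConfigs.SASL_JAAS_CONFIG. The class name, the method parameters, the body of `kafkaJaasConfiguration()`, the `SASL_PLAINTEXT` protocol, and the `kafka` service name are all assumptions to adjust for your environment.

```java
import java.util.Properties;

// Hypothetical sketch; in a real project the keys would come from
// ConsumerConfig, CommonClientConfigs and SaslConfigs constants.
class KerberosConsumerConfigSketch {

    // Builds an inline JAAS entry for keytab-based Kerberos login.
    // The keytab path and principal are placeholders supplied by the caller.
    static String kafkaJaasConfiguration(String keytabPath, String principal) {
        return "com.sun.security.auth.module.Krb5LoginModule required "
                + "useKeyTab=true storeKey=true "
                + "keyTab=\"" + keytabPath + "\" "
                + "principal=\"" + principal + "\";";
    }

    // Collects the consumer settings, including the inline JAAS entry,
    // so no java.security.auth.login.config system property is needed.
    static Properties consumerProperties(String bootstrapServers, String groupId,
                                         String keytabPath, String principal) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("group.id", groupId);
        props.put("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("security.protocol", "SASL_PLAINTEXT"); // assumed: no TLS
        props.put("sasl.mechanism", "GSSAPI");
        props.put("sasl.kerberos.service.name", "kafka"); // assumed service name
        props.put("sasl.jaas.config", kafkaJaasConfiguration(keytabPath, principal));
        return props;
    }
}
```

The resulting `Properties` object would then be passed to the `KafkaConsumer` constructor from the Kafka client library.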
