Storm event processor – GC log file per worker

For the last three months, I have been working with a new team building a Big Data analytics product for the telecom domain.

The Storm event processor is one of the main frameworks we use, and it is really great. You can read more about it in its official documentation (which has been improved).

Storm uses Workers to do your job. Each Worker is a single JVM and is administered internally by Storm (started, restarted if unresponsive, moved to another node of the cluster, etc.). For a single job you can run many Workers on your cluster, and Storm decides how to distribute them across the cluster nodes. By “node” I mean a running OS, whether on a VM or on a physical machine.

The tricky point here is that all Workers on a node read the same configuration file (STORM_HOME/conf/storm.yaml), even if they are processing different kinds of jobs. Additionally, this file has a single parameter (worker.childopts) that all Workers on the same node use to initialize their JVMs, that is, to set their JVM options.

Since we want to know how GC performs in each Worker, we need to monitor the GC log of each Worker/JVM.

As I said, the problem is that all Workers on a node read the same parameter from the same configuration file to initialize their JVMs, so it is not trivial to use a different GC log file for each Worker/JVM.

Fortunately, the Storm developers have exposed a “variable” that solves this problem. This variable is named “ID” and it is unique for each Worker on a given node (the same Worker ID can exist on different nodes).

For the Worker JVM options, we use this entry in our “storm.yaml” file:

worker.childopts: "-Xmx1024m -XX:MaxPermSize=256m -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -verbose:gc -Xloggc:/opt/storm/logs/gc-storm-worker-%ID%.log"

Note that you have to add “%” before and after the “ID” string so that Storm recognizes it as an internal variable.
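To make the substitution concrete, here is a minimal sketch in Java of what such a placeholder expansion amounts to. It is not Storm's actual code, just plain string replacement. The class name `WorkerOptsSketch`, the helper `expand`, and the ID values 6700 and 6701 are hypothetical and chosen only for illustration.

```java
// Illustration only: NOT Storm's implementation, just plain string replacement.
class WorkerOptsSketch {

    // Hypothetical helper: replaces every %ID% token with the given worker ID.
    static String expand(String childopts, String workerId) {
        return childopts.replace("%ID%", workerId);
    }

    public static void main(String[] args) {
        String childopts =
            "-Xmx1024m -Xloggc:/opt/storm/logs/gc-storm-worker-%ID%.log";

        // Hypothetical worker IDs for two Workers on the same node.
        String[] ids = {"6700", "6701"};
        for (String id : ids) {
            System.out.println(expand(childopts, id));
        }
    }
}
```

With these assumed IDs, the two Workers would write to gc-storm-worker-6700.log and gc-storm-worker-6701.log respectively, even though both read the same worker.childopts entry.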

Additionally, for the Supervisor JVM options (one Supervisor process runs on each node), we use this entry in our “storm.yaml” file:

supervisor.childopts: "-Xmx512m -XX:MaxPermSize=256m -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -verbose:gc -Xloggc:/opt/storm/logs/gc-storm-supervisor.log"

I have also included some memory settings (“-Xmx” and “-XX:MaxPermSize”), but these are just examples.
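If you want to confirm which GC log file a given JVM was actually started with, one option is to inspect the JVM's input arguments from inside the process. The sketch below uses the standard `java.lang.management` API. The class `GcLogPathSketch` and its helper `findGcLogPath` are hypothetical names, and where you would call this from your own code is left open.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Illustration only: reports the -Xloggc target of the JVM it runs in.
class GcLogPathSketch {

    private static final String PREFIX = "-Xloggc:";

    // Returns the path given to -Xloggc:, or null if the flag is absent.
    static String findGcLogPath(List<String> jvmArgs) {
        for (String arg : jvmArgs) {
            if (arg.startsWith(PREFIX)) {
                return arg.substring(PREFIX.length());
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Arguments the current JVM was started with
        // (for a Worker, these would come from worker.childopts).
        List<String> jvmArgs =
            ManagementFactory.getRuntimeMXBean().getInputArguments();
        System.out.println("GC log file: " + findGcLogPath(jvmArgs));
    }
}
```

Run with a `-Xloggc:` option, it prints the configured path. Without one, it prints `GC log file: null`.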

Please keep in mind that, at the time of writing, Storm requires Oracle HotSpot JDK 6 (JDK 7 and 8 are not yet supported). This is a serious drawback, but we hope it will be fixed soon.

Hope it helps!

Democracy Requires Free Software

Adrianos Dadis

Adrianos works as a senior software engineer in the telecom business domain. He is particularly interested in enterprise integration, multi-tier architecture and middleware services. He mainly works with WebLogic, JBoss, Java EE, Spring, Drools, Oracle SOA Suite and various ESBs.