Sunday, December 11, 2016

Livy Job Server Implementation

Livy Job Server Installation on Hortonworks Data Platform (HDP 2.4):
Download the compressed file from
http://archive.cloudera.com/beta/livy/livy-server-0.2.0.zip

Unzip the file, then make the configuration changes below.
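The download and unzip steps can be sketched as a shell session; the target directory /jobserver-livy is an assumption (it matches the LIVY_PID_DIR used in livy-env.sh below):

```shell
# Download the Livy server beta build (URL from above)
wget http://archive.cloudera.com/beta/livy/livy-server-0.2.0.zip

# Unpack it; /jobserver-livy is an assumed install directory
unzip livy-server-0.2.0.zip -d /jobserver-livy
```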

Configurations:

Add 'spark.master=yarn-cluster' to the config file '/usr/hdp/current/spark-client/conf/spark-defaults.conf'
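From a shell, the same setting can be appended in place (a sketch; if the cluster configs are managed by Ambari, make the change there instead so it is not overwritten):

```shell
# Append the YARN cluster master setting to spark-defaults.conf
echo 'spark.master=yarn-cluster' >> /usr/hdp/current/spark-client/conf/spark-defaults.conf
```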



livy-env.sh
Add the following entries:

export SPARK_HOME=/usr/hdp/current/spark-client
export HADOOP_HOME=/usr/hdp/current/hadoop-client/bin/
export HADOOP_CONF_DIR=/etc/hadoop/conf
export SPARK_CONF_DIR=$SPARK_HOME/conf
export LIVY_LOG_DIR=/jobserver-livy/logs
export LIVY_PID_DIR=/jobserver-livy
export LIVY_MAX_LOG_FILES=10
export HBASE_HOME=/usr/hdp/current/hbase-client/bin


log4j.properties
By default, logs go to the console at INFO level. The configuration below switches to a rolling log file with DEBUG enabled:

log4j.rootCategory=DEBUG, NotConsole
log4j.appender.NotConsole=org.apache.log4j.RollingFileAppender
log4j.appender.NotConsole.File=/Analytics/livy-server/logs/livy.log
log4j.appender.NotConsole.maxFileSize=20MB
log4j.appender.NotConsole.layout=org.apache.log4j.PatternLayout
log4j.appender.NotConsole.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{1}: %m%n

Copy the following libraries to the path '<LIVY SERVER INSTALL PATH>/rsc-jars':

hbase-common.jar
hbase-server.jar
hbase-client.jar
hbase-rest.jar
guava-11.0.2.jar
protobuf-java.jar
hbase-protocol.jar
spark-assembly.jar
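With the configuration and jars in place, the server can be started and checked over its REST API; the script path below is an assumption about the 0.2.0 zip layout, and 8998 is Livy's default port:

```shell
# Start the Livy server in the background (script path assumed from the zip layout)
cd /jobserver-livy/livy-server-0.2.0
./bin/livy-server &

# Verify it is up: an empty session list means the server is running
curl http://localhost:8998/sessions
```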


Integrating a web application with the Livy Job Server:
Copy the following libraries to the Tomcat classpath (its lib folder):

livy-api-0.2.0.jar
livy-client-http-0.2.0.jar
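The copy step might look like the following; the jar source location and $CATALINA_HOME are assumptions for your environment:

```shell
# Copy the Livy client jars into Tomcat's lib folder (paths are assumptions)
cp livy-api-0.2.0.jar livy-client-http-0.2.0.jar "$CATALINA_HOME/lib/"
```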


The servlet invokes the LivyClientUtil:

LivyClientUtil:
// Connect to a locally running Livy server
String livyUrl = "http://127.0.0.1:8998";
LivyClient client = new HttpClientFactory().createClient(new URL(livyUrl).toURI(), null);
// Upload the jar containing SparkReaderJob, then submit the job and
// wait up to 40 seconds for its String result
client.uploadJar(new File(jarPath)).get();
String jsonString = client.submit(new SparkReaderJob("08/19/2010")).get(40, TimeUnit.SECONDS);

SparkReaderJob:

public class SparkReaderJob implements Job<String> {
    private final String date;
    public SparkReaderJob(String date) { this.date = date; }

    @Override
    public String call(JobContext jc) throws Exception {
        JavaSparkContext jsc = jc.sc();
        // ... read the data for 'date' using jsc ...
        return "";  // Job<String> must return a String
    }
}


References:
http://livy.io/
https://github.com/cloudera/livy