Sunday, July 26, 2020

Install Postgres DB in docker

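Search Docker Hub for Postgres images:

docker search postgresql

Start a container from the postgres image, publishing port 5432 and keeping the database files under /data on the host:

docker run --name postgresqldb -e POSTGRES_USER=myusername -e POSTGRES_PASSWORD=mypassword -p 5432:5432 -v /data:/var/lib/postgresql/data -d postgres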


Check that the container is running:

docker ps

Start the container again after it has been stopped:

docker start postgresqldb


Install PG Admin




Stop the container:

docker stop postgresqldb


List the Docker images available locally

docker image ls

Tuesday, May 22, 2018

Working with Cloudera


HDFS file browser (NameNode web UI):

http://<hostname>:50070/explorer.html#/

Useful Unix Commands

Extract lines 1 to 10 from input.txt and write them to output.txt:
sed -n -e '1,10p' input.txt > output.txt
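
As a quick check of how this behaves, here is a small sketch that builds a 20-line sample file in a temporary directory and extracts the first 10 lines in three equivalent ways. The workdir variable and the generated file contents are made up for illustration.

```shell
# Hypothetical scratch directory and sample data, for illustration only
workdir=$(mktemp -d)
seq 1 20 > "$workdir/input.txt"

# -n suppresses sed's default output, so only the lines selected by 1,10p are printed
sed -n -e '1,10p' "$workdir/input.txt" > "$workdir/output.txt"

# 10q prints lines as usual but quits after line 10, so the rest of the file is never read
sed -e '10q' "$workdir/input.txt" > "$workdir/output_quit.txt"

# head is the usual shortcut when the range starts at line 1
head -n 10 "$workdir/input.txt" > "$workdir/output_head.txt"
```

The 1,10p form is the one to keep when the range does not start at line 1, for example '5,10p'.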

Wednesday, October 25, 2017

Run shell script using Oozie job

In this post I will show how to run a shell script as a job using the Oozie framework.

First, create a job.properties file:

nameNode=hdfs://localhost:8020
jobTracker=localhost:8050
queueName=default
oozie.wf.application.path=/<Hdfs Path>/sampleworkflow.xml
oozie.use.system.libpath=true
oozie.libpath=<shared lib path on Hdfs>

Create sampleworkflow.xml and place it in HDFS:
<workflow-app name="shell-wf" xmlns="uri:oozie:workflow:0.4">
    <start to="shell-node"/>
    <action name="shell-node">
        <shell xmlns="uri:oozie:shell-action:0.2">
            <job-tracker>${jobTracker}</job-tracker>
            <name-node>${nameNode}</name-node>
            <configuration>
                <property>
                    <name>mapred.job.queue.name</name>
                    <value>${queueName}</value>
                </property>
            </configuration>
            <exec>sample.sh</exec>
            <file>/<Hdfs Path of the shell script>/sample.sh</file>
        </shell>
        <ok to="end"/>
        <error to="fail"/>
    </action>
    <kill name="fail">
        <message>Shell action failed, error message[${wf:errorMessage(wf:lastErrorNode())}]</message>
    </kill>
    <end name="end"/>
</workflow-app>


Running the job:
oozie job -oozie http://localhost:11000/oozie -config job.properties -run


Java Client for Secure Web Socket

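This post shows how to connect to a WebSocket endpoint that is secured with Basic Authentication.

I will be using the javax.websocket-api jar. The API jar only contains the interfaces, so a client implementation (Tyrus, for example) also has to be on the classpath at runtime.

Create a WebSocketContainer object: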

WebSocketContainer container = ContainerProvider.getWebSocketContainer();

Use the container to connect to the service:

container.connectToServer(endpoint, clientConfig, new URI("wss://hostname:port/demo"));


The credentials are passed through the clientConfig object. Its configurator adds the Authorization header before the handshake request is sent:

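// Needs imports from java.util (Arrays, Base64, List, Map), java.nio.charset.StandardCharsets
// and javax.websocket.ClientEndpointConfig
ClientEndpointConfig.Configurator configurator = new ClientEndpointConfig.Configurator() {
    @Override
    public void beforeRequest(Map<String, List<String>> headers) {
        String credentials = "username:password";
        // java.util.Base64 (Java 8+) is used instead of sun.misc.BASE64Encoder,
        // which is an internal class and is not available on recent JDKs
        String encoded = Base64.getEncoder().encodeToString(credentials.getBytes(StandardCharsets.UTF_8));
        headers.put("Authorization", Arrays.asList("Basic " + encoded));
        System.out.println("Header set successfully");
    }
};

ClientEndpointConfig clientConfig = ClientEndpointConfig.Builder.create()
        .configurator(configurator)
        .build();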
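
To sanity-check the value that ends up in the Authorization header, the same string can be computed on the command line. This is only a sketch: it assumes the base64 utility (part of coreutils on most Linux systems) is installed, and username:password is the same placeholder as in the Java snippet, not real credentials.

```shell
# Placeholder credentials, matching the string used in beforeRequest()
credentials='username:password'

# printf '%s' is used instead of echo so that no trailing newline gets encoded
header="Basic $(printf '%s' "$credentials" | base64)"

echo "$header"
# prints: Basic dXNlcm5hbWU6cGFzc3dvcmQ=
```

If the server rejects the handshake with a 401, comparing this value with what the client actually sends is a quick first check.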


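The endpoint is the callback handler. Once the session with the service is established, onOpen() is called and we send a message using the session object. To receive messages, register a MessageHandler and override its onMessage() method. In the code below, msg is the String to send and is defined elsewhere in the client.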

// msg must be final (or effectively final) in the enclosing scope to be used here
Endpoint endpoint = new Endpoint() {
    @Override
    public void onOpen(Session session, EndpointConfig config) {
        // Register the handler before sending so that the reply is not missed
        session.addMessageHandler(new MessageHandler.Whole<String>() {
            @Override
            public void onMessage(String content) {
                System.out.println("Received message: " + content);
            }
        });
        try {
            System.out.println("Sending message to endpoint: " + msg);
            System.out.println("Session Id:: " + session.getId());
            session.getBasicRemote().sendText(msg);
        } catch (Exception e) {
            e.printStackTrace();
        }
    }
};

Thursday, October 19, 2017

Connect to Cassandra from RStudio

To connect to Cassandra from RStudio, you can use the RJDBC library.

Prerequisites:
Install the RJDBC library:
install.packages("RJDBC")


Place the cassandra-jdbc.jar library in the Cassandra libraries folder. In my environment that is '/usr/share/dse/cassandra/lib/'.

Make sure the Thrift protocol is enabled on the Cassandra cluster:
nodetool statusthrift


If Thrift is disabled, it can be enabled with the following command:

nodetool enablethrift


R Script:

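library(RJDBC)
# Load the Cassandra JDBC driver with every jar from the lib folder on the classpath
cassdrv <- JDBC("org.apache.cassandra.cql.jdbc.CassandraDriver",
                list.files("/usr/share/dse/cassandra/lib/", pattern = "jar$", full.names = TRUE))
# 9160 is the Thrift port; "test" is the keyspace
casscon <- dbConnect(cassdrv, "jdbc:cassandra://localhost:9160/test")
res <- dbGetQuery(casscon, "select * from emp")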