OS Version: Red Hat Enterprise Linux Server release 5.6
Hadoop Version: 1.0.1
Hosts: host1, host2
Step 1: Install Java on the Hosts
yum install java-1.6.0-openjdk-devel.x86_64
Check the Java version:
java -version
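The same package has to be installed on both hosts. A minimal sketch of doing the host2 install remotely from host1, assuming root SSH access to host2 and the same OpenJDK package name there:
ssh root@host2 'yum -y install java-1.6.0-openjdk-devel.x86_64'
ssh root@host2 'java -version'   # confirm the same JVM on host2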
Step 2: Create OS Group and User on hosts
Add Group:
/usr/sbin/groupadd hdfs
Add User:
/usr/sbin/useradd -G hdfs -d /home/hadoop hadoop
Change user password:
passwd hadoop
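Run the same groupadd, useradd, and passwd commands on host2 as well, so the hadoop account exists identically on both hosts. A quick sanity check on each host (the numeric uid/gid values will differ from system to system):
id hadoop
# expected to show something like: uid=...(hadoop) gid=...(hadoop) groups=...(hdfs)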
Step 3: Set up passwordless SSH between the hadoop users on both hosts
Log in as the hadoop user on both hosts.
hadoop@host1:
ssh-keygen -t rsa
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
Copy the file $HOME/.ssh/authorized_keys to /tmp/ on host2.
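One way to do the copy is scp; it will prompt for the hadoop password on host2, since the keys are not trusted yet:
scp $HOME/.ssh/authorized_keys hadoop@host2:/tmp/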
hadoop@host2:
ssh-keygen -t rsa
cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
cat /tmp/authorized_keys >> $HOME/.ssh/authorized_keys
Now copy $HOME/.ssh/authorized_keys back to /tmp on host1.
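Again, scp works for the copy back to host1:
scp $HOME/.ssh/authorized_keys hadoop@host1:/tmp/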
chmod 700 $HOME/.ssh
cd $HOME/.ssh
chmod 600 *
hadoop@host1:
cp /tmp/authorized_keys $HOME/.ssh/authorized_keys
chmod 700 $HOME/.ssh
cd $HOME/.ssh
chmod 600 *
ssh host1
ssh host2
hadoop@host2:
ssh host1
ssh host2
This completes the passwordless SSH setup for the hadoop user on both hosts.
Step 4: Download and install hadoop
Download location: http://www.apache.org/dyn/closer.cgi/hadoop/common/
Download the release of your choice.
Download hadoop-1.0.1.tar.gz, then gunzip and untar it.
mv hadoop-1.0.1 hadoop
Hadoop location: /home/hadoop/hadoop
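A minimal sketch of this step as shell commands, run as the hadoop user. The archive URL is an assumption (any mirror from the download page above works), and adding the bin directory to PATH is optional but lets the later steps call hadoop without a full path:
cd /home/hadoop
wget http://archive.apache.org/dist/hadoop/common/hadoop-1.0.1/hadoop-1.0.1.tar.gz
tar -xzf hadoop-1.0.1.tar.gz       # gunzip and untar in one step
mv hadoop-1.0.1 hadoop             # Hadoop now lives in /home/hadoop/hadoop
echo 'export PATH=$PATH:/home/hadoop/hadoop/bin' >> ~/.bash_profile
source ~/.bash_profile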
Step 5: Configure Hadoop
5.a: Update JAVA_HOME in /home/hadoop/hadoop/conf/hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/jre-1.6.0-openjdk.x86_64
5.b: Add the following entries to /home/hadoop/hadoop/conf/core-site.xml. Since this is a two-node cluster, fs.default.name should point at the master host (host1) rather than localhost, otherwise the DataNode on host2 cannot reach the NameNode.
<property>
<name>hadoop.tmp.dir</name>
<value>/home/hadoop/tmp</value>
<description>A base for other temporary directories.</description>
</property>
<property>
<name>fs.default.name</name>
<value>hdfs://host1:54310</value>
<description>The name of the default file system. A URI whose
scheme and authority determine the FileSystem implementation. The
uri's scheme determines the config property (fs.SCHEME.impl) naming
the FileSystem implementation class. The uri's authority is used to
determine the host, port, etc. for a filesystem.</description>
</property>
5.c: Add the following to /home/hadoop/hadoop/conf/mapred-site.xml, again using the master hostname so the TaskTracker on host2 can reach the JobTracker:
<property>
<name>mapred.job.tracker</name>
<value>host1:54311</value>
<description>The host and port that the MapReduce job tracker runs
at. If "local", then jobs are run in-process as a single map
and reduce task.
</description>
</property>
5.d: Add the following to /home/hadoop/hadoop/conf/hdfs-site.xml:
<property>
<name>dfs.replication</name>
<value>1</value>
<description>Default block replication.
The actual number of replications can be specified when the file is created.
The default is used if replication is not specified in create time.
</description>
</property>
5.e: Update /home/hadoop/hadoop/conf/masters with the name of the master node
[hadoop@host1 conf]$ more masters
host1
5.f: Update /home/hadoop/hadoop/conf/slaves with the names of the slave nodes
[hadoop@host1 conf]$ more slaves
host1
host2
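host2 needs the same installation and configuration before anything is started. A sketch, assuming identical paths on both hosts: copy the whole hadoop directory (conf included) to host2 and create the hadoop.tmp.dir on both nodes:
mkdir -p /home/hadoop/tmp
scp -r /home/hadoop/hadoop hadoop@host2:/home/hadoop/
ssh hadoop@host2 'mkdir -p /home/hadoop/tmp'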
5.g: Format the NameNode
[hadoop@host1 ~]$ hadoop namenode -format
12/05/09 09:42:03 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG: host = host1/66.228.160.91
STARTUP_MSG: args = [-format]
STARTUP_MSG: version = 1.0.1
STARTUP_MSG: build = https://svn.apache.org/repos/asf/hadoop/common/branches/branch-1.0 -r 1243785; compiled by 'hortonfo' on Tue Feb 14 08:15:38 UTC 2012
************************************************************/
12/05/09 09:42:03 INFO util.GSet: VM type = 64-bit
12/05/09 09:42:03 INFO util.GSet: 2% max memory = 17.77875 MB
12/05/09 09:42:03 INFO util.GSet: capacity = 2^21 = 2097152 entries
12/05/09 09:42:03 INFO util.GSet: recommended=2097152, actual=2097152
12/05/09 09:42:03 INFO namenode.FSNamesystem: fsOwner=hadoop
12/05/09 09:42:03 INFO namenode.FSNamesystem: supergroup=supergroup
12/05/09 09:42:03 INFO namenode.FSNamesystem: isPermissionEnabled=true
12/05/09 09:42:03 INFO namenode.FSNamesystem: dfs.block.invalidate.limit=100
12/05/09 09:42:03 INFO namenode.FSNamesystem: isAccessTokenEnabled=false accessKeyUpdateInterval=0 min(s), accessTokenLifetime=0 min(s)
12/05/09 09:42:03 INFO namenode.NameNode: Caching file names occuring more than 10 times
12/05/09 09:42:03 INFO common.Storage: Image file of size 112 saved in 0 seconds.
12/05/09 09:42:03 INFO common.Storage: Storage directory /home/hadoop/tmp/dfs/name has been successfully formatted.
12/05/09 09:42:03 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at host1/66.228.160.91
************************************************************/
Step 6: Start Hadoop
[hadoop@host1 bin]$ ./start-all.sh
starting namenode, logging to /home/hadoop/hadoop/libexec/../logs/hadoop-hadoop-namenode-host1.out
host2: starting datanode, logging to /home/hadoop/hadoop/libexec/../logs/hadoop-hadoop-datanode-host2.out
host1: starting datanode, logging to /home/hadoop/hadoop/libexec/../logs/hadoop-hadoop-datanode-host1.out
host1: starting secondarynamenode, logging to /home/hadoop/hadoop/libexec/../logs/hadoop-hadoop-secondarynamenode-host1.out
host1: Exception in thread "main" java.io.IOException: Cannot lock storage /home/hadoop/temp/dfs/namesecondary. The directory is already locked.
host1: at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.lock(Storage.java:602)
host1: at org.apache.hadoop.hdfs.server.common.Storage$StorageDirectory.analyzeStorage(Storage.java:455)
host1: at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$CheckpointStorage.recoverCreate(SecondaryNameNode.java:615)
host1: at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.initialize(SecondaryNameNode.java:175)
host1: at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.(SecondaryNameNode.java:129)
host1: at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.main(SecondaryNameNode.java:567)
starting jobtracker, logging to /home/hadoop/hadoop/libexec/../logs/hadoop-hadoop-jobtracker-host1.out
host2: starting tasktracker, logging to /home/hadoop/hadoop/libexec/../logs/hadoop-hadoop-tasktracker-host2.out
host1: starting tasktracker, logging to /home/hadoop/hadoop/libexec/../logs/hadoop-hadoop-tasktracker-host1.out
Note: the SecondaryNameNode exception above ("Cannot lock storage ... already locked") usually means a stale lock from a previous run; stopping Hadoop, removing the in_use.lock file (or the namesecondary directory itself) under the checkpoint directory, and running start-all.sh again normally clears it.
Checks:
[hadoop@host1 bin]$ jps
20358 JobTracker
20041 DataNode
20528 TaskTracker
19869 NameNode
27905 Jps
26450 oc4j.jar
20229 SecondaryNameNode
[hadoop@host2 ~]$ jps
12522 Jps
1737 TaskTracker
1572 DataNode
jps - Java Virtual Machine Process Status Tool; it lists the Java VMs that are running.
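Another quick health check is to ask the NameNode for a datanode report; both hosts should be listed as live nodes. The NameNode and JobTracker web UIs (http://host1:50070 and http://host1:50030 by default) show the same information in a browser:
[hadoop@host1 ~]$ hadoop dfsadmin -report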
Step 7: Running a Basic MapReduce Test
I am using the sample test from: http://wiki.apache.org/hadoop/WordCount
7.a: Create the input directory in HDFS:
[hadoop@host1 ~]$ hadoop dfs -mkdir input
There is no need to pre-create the output directory; the WordCount job creates it itself and will fail if it already exists.
Hadoop shell reference: http://hadoop.apache.org/common/docs/r0.17.1/hdfs_shell.html
7.b: Copy the source code from the wiki page to /home/hadoop/WordCount.java and compile it
Create the directory wordcount for the compiled classes:
mkdir -p /home/hadoop/wordcount
[hadoop@host1 ~]$ javac -classpath /home/hadoop/hadoop/hadoop-core-1.0.1.jar -d wordcount WordCount.java
[hadoop@host1 ~]$ jar -cvf /home/hadoop/wordcount.jar -C wordcount/ .
added manifest
adding: org/(in = 0) (out= 0)(stored 0%)
adding: org/myorg/(in = 0) (out= 0)(stored 0%)
adding: org/myorg/WordCount$Map.class(in = 1938) (out= 798)(deflated 58%)
adding: org/myorg/WordCount.class(in = 1546) (out= 749)(deflated 51%)
adding: org/myorg/WordCount$Reduce.class(in = 1611) (out= 649)(deflated 59%)
This will create the jar file to be used in the test.
7.c: Create input files:
file1:
Hello World Bye World
file2:
Hello Hadoop Goodbye Hadoop
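The two files can be created locally with a couple of echo commands (contents exactly as above):
echo "Hello World Bye World" > file1
echo "Hello Hadoop Goodbye Hadoop" > file2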
7.d: Copy the files over to Hadoop:
[hadoop@host1 ~]$ hadoop dfs -copyFromLocal file1 input
[hadoop@host1 ~]$ hadoop dfs -copyFromLocal file2 input
[hadoop@host1 ~]$ hadoop dfs -ls input
Found 2 items
-rw-r--r-- 1 hadoop supergroup 23 2012-05-09 09:44 /user/hadoop/input/file1
-rw-r--r-- 1 hadoop supergroup 28 2012-05-09 09:44 /user/hadoop/input/file2
7.e: Run the MapReduce program:
[hadoop@host1 ~]$ hadoop jar /home/hadoop/wordcount.jar org.myorg.WordCount /user/hadoop/input /user/hadoop/output
12/05/30 16:03:44 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
12/05/30 16:03:44 INFO mapred.FileInputFormat: Total input paths to process : 2
12/05/30 16:03:44 INFO mapred.JobClient: Running job: job_201205091119_0004
12/05/30 16:03:45 INFO mapred.JobClient: map 0% reduce 0%
12/05/30 16:03:58 INFO mapred.JobClient: map 100% reduce 0%
12/05/30 16:04:10 INFO mapred.JobClient: map 100% reduce 100%
12/05/30 16:04:15 INFO mapred.JobClient: Job complete: job_201205091119_0004
12/05/30 16:04:15 INFO mapred.JobClient: Counters: 30
12/05/30 16:04:15 INFO mapred.JobClient: Job Counters
12/05/30 16:04:16 INFO mapred.JobClient: Launched reduce tasks=1
12/05/30 16:04:16 INFO mapred.JobClient: SLOTS_MILLIS_MAPS=19996
12/05/30 16:04:16 INFO mapred.JobClient: Total time spent by all reduces waiting after reserving slots (ms)=0
12/05/30 16:04:16 INFO mapred.JobClient: Total time spent by all maps waiting after reserving slots (ms)=0
12/05/30 16:04:16 INFO mapred.JobClient: Launched map tasks=3
12/05/30 16:04:16 INFO mapred.JobClient: Data-local map tasks=3
12/05/30 16:04:16 INFO mapred.JobClient: SLOTS_MILLIS_REDUCES=10031
12/05/30 16:04:16 INFO mapred.JobClient: File Input Format Counters
12/05/30 16:04:16 INFO mapred.JobClient: Bytes Read=55
12/05/30 16:04:16 INFO mapred.JobClient: File Output Format Counters
12/05/30 16:04:16 INFO mapred.JobClient: Bytes Written=41
12/05/30 16:04:16 INFO mapred.JobClient: FileSystemCounters
12/05/30 16:04:16 INFO mapred.JobClient: FILE_BYTES_READ=79
12/05/30 16:04:16 INFO mapred.JobClient: HDFS_BYTES_READ=412
12/05/30 16:04:16 INFO mapred.JobClient: FILE_BYTES_WRITTEN=86679
12/05/30 16:04:16 INFO mapred.JobClient: HDFS_BYTES_WRITTEN=41
12/05/30 16:04:16 INFO mapred.JobClient: Map-Reduce Framework
12/05/30 16:04:16 INFO mapred.JobClient: Map output materialized bytes=91
12/05/30 16:04:16 INFO mapred.JobClient: Map input records=2
12/05/30 16:04:16 INFO mapred.JobClient: Reduce shuffle bytes=51
12/05/30 16:04:16 INFO mapred.JobClient: Spilled Records=12
12/05/30 16:04:16 INFO mapred.JobClient: Map output bytes=82
12/05/30 16:04:16 INFO mapred.JobClient: Total committed heap usage (bytes)=443023360
12/05/30 16:04:16 INFO mapred.JobClient: CPU time spent (ms)=1440
12/05/30 16:04:16 INFO mapred.JobClient: Map input bytes=51
12/05/30 16:04:16 INFO mapred.JobClient: SPLIT_RAW_BYTES=357
12/05/30 16:04:16 INFO mapred.JobClient: Combine input records=8
12/05/30 16:04:16 INFO mapred.JobClient: Reduce input records=6
12/05/30 16:04:16 INFO mapred.JobClient: Reduce input groups=5
12/05/30 16:04:16 INFO mapred.JobClient: Combine output records=6
12/05/30 16:04:16 INFO mapred.JobClient: Physical memory (bytes) snapshot=576110592
12/05/30 16:04:16 INFO mapred.JobClient: Reduce output records=5
12/05/30 16:04:16 INFO mapred.JobClient: Virtual memory (bytes) snapshot=2767060992
12/05/30 16:04:16 INFO mapred.JobClient: Map output records=8
7.f: Check for the output in hadoop:
[hadoop@host1 ~]$ hadoop dfs -ls output
Found 3 items
-rw-r--r-- 2 hadoop supergroup 0 2012-05-09 11:22 /user/hadoop/output/_SUCCESS
drwxr-xr-x - hadoop supergroup 0 2012-05-09 11:22 /user/hadoop/output/_logs
-rw-r--r-- 2 hadoop supergroup 41 2012-05-09 11:22 /user/hadoop/output/part-00000
7.g: Check the output:
[hadoop@host1 ~]$ hadoop dfs -cat /user/hadoop/output/part-00000
Bye 1
Goodbye 1
Hadoop 2
Hello 2
World 2
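To pull the result back to the local filesystem, -getmerge (or -copyToLocal) can be used; the local target path here is just an example:
[hadoop@host1 ~]$ hadoop dfs -getmerge output /tmp/wordcount-result.txt
[hadoop@host1 ~]$ cat /tmp/wordcount-result.txt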
This completes the quick setup of a two-node Hadoop cluster.
While setting this up and learning, I referred to the following links. My thanks and regards for all the help:
For single node setup: http://hadoop.apache.org/common/docs/current/single_node_setup.html#Local
For cluster setup: http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/
http://hortonworks.com/blog/set-up-apache-hadoop-in-minutes-with-rpms/