Apache OOZIE installation step-by-step on Ubuntu

1) Download "oozie-4.1.0.tar.gz"

2) Gunzip and untar it under /opt/ds/app/oozie

3) Change directory to /opt/ds/app/oozie/oozie-4.1.0

4) Execute

bin/mkdistro.sh -DskipTests -Dhadoopversion=2.2.0
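
mkdistro.sh drives a Maven build, so it fails early if Java or Maven is not on the PATH. A minimal pre-flight sketch follows; the helper name require_cmds is made up for this post and is not part of Oozie.

```shell
# require_cmds NAME...: report every listed command that is not on PATH.
# Hypothetical helper for illustration; Oozie does not ship it.
require_cmds() {
    missing=0
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "required command not found: $cmd" >&2
            missing=1
        fi
    done
    # Non-zero exit status if anything was missing.
    return "$missing"
}
```

It could be used as: require_cmds java mvn && bin/mkdistro.sh -DskipTests -Dhadoopversion=2.2.0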

5) Change directory to /opt/ds/app/oozie/oozie-4.1.0/distro/target/oozie-4.1.0-distro/oozie-4.1.0

6) Edit '.bashrc' and add

export OOZIE_VERSION=4.1.0

export OOZIE_HOME=/opt/ds/app/oozie/oozie-4.1.0/distro/target/oozie-4.1.0-distro/oozie-4.1.0

export PATH=$PATH:$OOZIE_HOME/bin
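
If you prefer not to repeat the long path, the same three exports can be derived from the version number. This is only a sketch: OOZIE_SRC is a helper variable invented here (Oozie itself does not read it), and the directory layout is assumed to be the one produced in steps 2 to 5.

```shell
# Assumption: OOZIE_SRC is a hypothetical helper variable pointing at the
# directory where the source tarball was unpacked (step 2).
export OOZIE_VERSION=4.1.0
OOZIE_SRC=/opt/ds/app/oozie/oozie-${OOZIE_VERSION}

# The distro built by mkdistro.sh lands under distro/target of the source tree.
export OOZIE_HOME=${OOZIE_SRC}/distro/target/oozie-${OOZIE_VERSION}-distro/oozie-${OOZIE_VERSION}
export PATH=$PATH:$OOZIE_HOME/bin
```

Run "source ~/.bashrc" (or open a new shell) so the variables are available in the current session.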

7) Make sure you are still in /opt/ds/app/oozie/oozie-4.1.0/distro/target/oozie-4.1.0-distro/oozie-4.1.0 (the OOZIE_HOME directory from step 6)

8) Make directory 'libext'

9) Execute:
>cp /opt/ds/app/oozie/oozie-4.1.0/hcataloglibs/target/oozie-4.1.0-hcataloglibs.tar.gz .
>tar xzvf oozie-4.1.0-hcataloglibs.tar.gz
>cp oozie-4.1.0/hadooplibs/hadooplib-2.3.0.oozie-4.1.0/* libext/
>cd libext/

10) Download 'ext-2.2.zip' and place it in the 'libext/' directory
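
Before building the WAR it can help to confirm that libext really holds both the Hadoop JARs and the ExtJS archive. The function below is a rough sketch of such a check; check_libext is a name made up for this post, and the only file names it assumes are the ones used in the steps above.

```shell
# check_libext DIR: succeed only if DIR holds ext-2.2.zip and at least one .jar.
# Hypothetical helper for illustration; not part of Oozie.
check_libext() {
    dir=$1
    if [ ! -f "$dir/ext-2.2.zip" ]; then
        echo "missing $dir/ext-2.2.zip" >&2
        return 1
    fi
    # Expand the glob into the positional parameters; if nothing matches,
    # $1 is the literal pattern (or empty) and the -e test fails.
    set -- "$dir"/*.jar
    if [ ! -e "$1" ]; then
        echo "no JAR files found in $dir" >&2
        return 1
    fi
    echo "libext looks complete: $# JAR(s) plus ext-2.2.zip"
}
```

For example, check_libext "$OOZIE_HOME/libext" should print the JAR count once both kinds of files are in place.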

11) Add the properties below for your user to Hadoop's "core-site.xml".

<property>
<name>hadoop.proxyuser.USERNAME.hosts</name>
<value>*</value>
</property>

<property>
<name>hadoop.proxyuser.USERNAME.groups</name>
<value>*</value>
</property>

Note: Replace USERNAME with your actual user name. In my case it is "dsuser".
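
To avoid typos when substituting the user name, the two blocks can be generated instead of edited by hand. A small sketch, assuming nothing beyond the property names shown above (proxyuser_props is a made-up helper name):

```shell
# proxyuser_props USER: print the two proxyuser <property> blocks for USER.
# Hypothetical helper for illustration; paste its output into core-site.xml.
proxyuser_props() {
    user=$1
    for key in hosts groups; do
        printf '<property>\n<name>hadoop.proxyuser.%s.%s</name>\n<value>*</value>\n</property>\n' "$user" "$key"
    done
}
```

Running proxyuser_props dsuser prints the same two properties as above with USERNAME replaced by dsuser.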

12) Now execute the command below from the shell:

oozie-setup.sh prepare-war
setting CATALINA_OPTS="$CATALINA_OPTS -Xmx1024m"

INFO: Adding extension: /usr/lib/oozie/oozie-bin/libext/activation-1.1.jar
.....................
..............................
New Oozie WAR file with added 'ExtJS library, JARs' at /opt/ds/app/oozie/oozie-4.1.0/distro/target/oozie-4.1.0-distro/oozie-4.1.0

INFO: Oozie is ready to be started.

13) Note that if the "ExtJS library" is not added to the WAR in the step above, the web console will not open.

14) The next step is to create the share lib:

oozie-setup.sh sharelib create -fs hdfs://abcdHost:54310
setting CATALINA_OPTS="$CATALINA_OPTS -Xmx1024m"
.....
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
the destination path for sharelib is: /user/dsuser/share/lib/lib_20150216191242
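
The timestamped destination path is handy to keep, for instance to list it in HDFS afterwards. The sketch below pulls it out of the command output; it assumes the log line is worded exactly as in the output above, and sharelib_path is a helper name made up for this post.

```shell
# sharelib_path: read the setup output on stdin and print only the HDFS
# destination path. Assumes the "the destination path for sharelib is:" line
# looks exactly like the one in the sample output.
sharelib_path() {
    sed -n 's/^the destination path for sharelib is: //p'
}
```

For example, piping the output of the sharelib create command through sharelib_path prints just the path, which can then be passed to "hdfs dfs -ls".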

15) The next step is to update "oozie-site.xml":

<property>
<name>oozie.service.HadoopAccessorService.hadoop.configurations</name>
<value>*=/opt/ds/app/hadoop-2.2.0/etc/hadoop</value>
<description>
Comma separated AUTHORITY=HADOOP_CONF_DIR, where AUTHORITY is the HOST:PORT of
the Hadoop service (JobTracker, HDFS). The wildcard '*' configuration is
used when there is no exact match for an authority. The HADOOP_CONF_DIR contains
the relevant Hadoop *-site.xml files. If the path is relative, it is looked up within
the Oozie configuration directory; the path can also be absolute (e.g. to point
to Hadoop client conf/ directories in the local filesystem).
</description>
</property>

<property>
<name>oozie.service.WorkflowAppService.system.libpath</name>
<value>/user/${user.name}/share/lib</value>
<description>
System library path to use for workflow applications.
This path is added to a workflow application if its job properties set
the property 'oozie.use.system.libpath' to true.
</description>
</property>

16) Create oozie DB

oozie-setup.sh db create -run

setting CATALINA_OPTS="$CATALINA_OPTS -Xmx1024m"

Validate DB Connection

DONE

Check DB schema does not exist

DONE

Check OOZIE_SYS table does not exist

DONE

Create SQL schema

DONE

Create OOZIE_SYS table

DONE

Oozie DB has been created for Oozie version '4.1.0'

The SQL commands have been written to: /tmp/ooziedb-8336919621541544603.sql

17) Start OOZIE

oozied.sh start

18) Verify that Oozie is running (the web console is served at the same URL, http://localhost:11000/oozie)

oozie admin -oozie http://localhost:11000/oozie -status
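
The server can take a little while to come up after oozied.sh start, so a single status call may fail at first. A rough polling sketch follows; wait_for_oozie is a made-up helper name, and it assumes the status command prints "System mode: NORMAL" once the server is healthy.

```shell
# wait_for_oozie URL [ATTEMPTS] [DELAY]: poll the status command until the
# server reports NORMAL mode. Hypothetical helper; not shipped with Oozie.
wait_for_oozie() {
    url=$1
    attempts=${2:-10}   # how many times to try
    delay=${3:-3}       # seconds to sleep between tries
    i=1
    while [ "$i" -le "$attempts" ]; do
        if oozie admin -oozie "$url" -status 2>/dev/null | grep -q 'System mode: NORMAL'; then
            echo "Oozie is up at $url"
            return 0
        fi
        sleep "$delay"
        i=$((i + 1))
    done
    echo "Oozie did not report NORMAL mode after $attempts attempts" >&2
    return 1
}
```

For example: wait_for_oozie http://localhost:11000/oozie 10 3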
