Monday, April 9, 2012

Hive ODBC Driver on Ubuntu

Apache Hive is used for managing large datasets residing in distributed storage. It provides a SQL-like query language (HiveQL) for interacting with those datasets, and a shell utility for running Hive queries. But if you need to use HiveQL from another application, some sort of driver is required.

The Hive ODBC Driver is a software library that implements the Open Database Connectivity (ODBC) API standard for the Hive database management system, enabling ODBC compliant applications to interact seamlessly (ideally) with Hive through a standard interface.

To learn how to install and use the driver, I went through https://cwiki.apache.org/Hive/hiveodbc.html. I followed the given steps using the latest versions of Hadoop, Hive and Thrift, but I could not get the code to compile. After going through many more internet pages, I finally compiled the code.

This article describes the steps I used to compile the code.

I tried the following steps on Ubuntu 10.10.

Install Apache Hadoop

Please refer to http://hadoop.apache.org/common/docs/r0.20.2/quickstart.html to download and install Apache Hadoop on Ubuntu.

Install Apache Hive

Please follow the instructions given at https://cwiki.apache.org/Hive/adminmanual-installation.html to download and install Apache Hive.

Install Thrift-0.5.0

1. Install required tools
$ sudo apt-get install libboost-dev libevent-dev python-dev automake pkg-config libtool flex bison
$ sudo apt-get install php5-dev
$ sudo apt-get install ant
$ sudo apt-get install openjdk-6-jdk
$ sudo apt-get install bjam
$ sudo apt-get install libboost-all-dev
2. Download Thrift-0.5.0
Get the source code for Thrift-0.5.0 from http://archive.apache.org/dist/incubator/thrift/0.5.0-incubating/thrift-0.5.0.tar.gz
3. Install Thrift
After extracting the tarball, build and install Thrift with the following commands:
# Configure and build thrift compiler and libraries
$ cd thrift-0.5.0
$ ./configure --without-csharp --without-ruby --prefix=<thrift_install_path>
$ make
# Install thrift
$ make install
# Configure, build, and install fb303
$ cd contrib/fb303
$ ./bootstrap.sh
$ ./configure --with-thriftpath=<thrift_install_path> --prefix=<thrift_install_path>
$ make && make install

Note: If compilation fails because of missing header files, add the required #include lines to the affected source files and rebuild.
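
Before running configure, it can help to confirm that the build tools from step 1 are actually on the PATH. The following is only a minimal sketch and not part of the original procedure: the function name check_build_tools is made up for illustration, and it assumes each package installs a command of the same name.

```shell
# check_build_tools <command>...
# Prints one "missing: <command>" line for each command not found on PATH.
# Returns 0 if every command is present, 1 otherwise.
check_build_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            missing=1
        fi
    done
    return "$missing"
}
```

For example, `check_build_tools automake libtool flex bison ant pkg-config` prints nothing and returns zero when all six commands are available.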

Build and test Hive Client

Build the Hive client by running the following command from HIVE_HOME:
$ ant compile-cpp -Dthrift.home=<THRIFT_HOME>

Execute the Hive client tests by running the following command from HIVE_HOME/odbc/:
$ ant test -Dthrift.home=<THRIFT_HOME>

To install the Hive client libraries onto your machine, run the following command from HIVE_HOME/odbc/:
$ sudo ant install -Dthrift.home=<THRIFT_HOME>

If you encounter any issues, please refer to the “Hive Client Build/Setup” section of https://cwiki.apache.org/Hive/hiveodbc.html

Build Unix ODBC Wrapper

In the unixODBC root directory, run the following command:
$ ./configure --enable-gui=no --prefix=<unixODBC_INSTALL_DIR>

Compile the unixODBC API wrapper with the following:
$ make

Install the unixODBC API wrapper by running the following from the unixODBC root directory:
$ sudo make install

If you encounter any issues, please refer to the “unixODBC API Wrapper Build/Setup” section of https://cwiki.apache.org/Hive/hiveodbc.html

After compilation, the driver will be located at <UnixODBC_BUILD_DIR>/Drivers/hive/.libs/libodbchive.so.1.0.0.

You can manually install the unixODBC API wrapper by doing the following:
$ cp <UnixODBC_BUILD_DIR>/Drivers/hive/.libs/libodbchive.so.1.0.0 <SYSTEM_INSTALL_DIR>
$ cd <SYSTEM_INSTALL_DIR>
$ ln -s libodbchive.so.1.0.0 libodbchive.so
$ ldconfig
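
The manual install above can be wrapped in a small function so that the copy and the symlink happen together. This is only a sketch: install_driver_lib is a hypothetical helper name, it assumes the driver was built at the path given above, and it leaves ldconfig to be run separately because that usually needs root.

```shell
# install_driver_lib <unixODBC_build_dir> <system_install_dir>
# Copies the built Hive driver into the install directory and creates the
# unversioned libodbchive.so symlink next to it.
install_driver_lib() {
    build_dir=$1
    install_dir=$2
    lib="$build_dir/Drivers/hive/.libs/libodbchive.so.1.0.0"

    if [ ! -f "$lib" ]; then
        echo "driver not found: $lib" >&2
        return 1
    fi

    cp "$lib" "$install_dir/" || return 1
    # -f replaces a stale link left behind by an earlier install
    ln -sf libodbchive.so.1.0.0 "$install_dir/libodbchive.so"
}
```

A call such as `install_driver_lib <UnixODBC_BUILD_DIR> <SYSTEM_INSTALL_DIR>` would then be followed by ldconfig, as in the manual steps.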

Connecting Driver to the Driver Manager

1. Export LD_LIBRARY_PATH and LD_PRELOAD with proper libraries
LD_LIBRARY_PATH should list the directories that contain the library files libodbchive.so, libhiveclient.so and libthrift.so. Generally all of these files are in /usr/local/lib/.
Use this command to export LD_LIBRARY_PATH:
$ export LD_LIBRARY_PATH=/usr/local/lib:$LD_LIBRARY_PATH

After that run this command:
$ export LD_PRELOAD=/usr/local/lib/libodbchive.so
2. Connecting Driver to the Driver Manager

Connect the Driver to a Driver Manager as described in the “Connecting the Driver to a Driver Manager” section of https://cwiki.apache.org/Hive/hiveodbc.html.
3. Testing with ISQL
You can then test the driver interactively with isql using the following command:
$ isql -v Hive
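
The steps above can be sketched as a small smoke test: first confirm that the three libraries from step 1 are reachable through LD_LIBRARY_PATH, then send a single query through isql in batch mode (its -b option). This is a hedged sketch rather than part of the original procedure. The function names are hypothetical, and it assumes the data source is named Hive as in the command above.

```shell
# find_in_ld_path <file_name>
# Prints the first LD_LIBRARY_PATH directory that contains the file.
# The body runs in a subshell so the IFS change does not leak out.
find_in_ld_path() (
    IFS=:
    set -f    # no glob expansion while splitting the path list
    for dir in ${LD_LIBRARY_PATH:-}; do
        if [ -n "$dir" ] && [ -e "$dir/$1" ]; then
            echo "$dir"
            exit 0
        fi
    done
    exit 1
)

# Reports where each library named in step 1 was found.
# Returns 1 if any of them is missing from LD_LIBRARY_PATH.
check_hive_odbc_libs() {
    rc=0
    for lib in libodbchive.so libhiveclient.so libthrift.so; do
        if dir=$(find_in_ld_path "$lib"); then
            echo "$lib: $dir"
        else
            echo "$lib: not found in LD_LIBRARY_PATH"
            rc=1
        fi
    done
    return "$rc"
}

# run_hive_query <dsn> <query>
# Pipes one query to isql in batch mode instead of typing it at the prompt.
run_hive_query() {
    if ! command -v isql >/dev/null 2>&1; then
        echo "isql not found; is unixODBC installed?" >&2
        return 127
    fi
    printf '%s\n' "$2" | isql "$1" -b
}
```

For example, `check_hive_odbc_libs && run_hive_query Hive 'SHOW TABLES;'` runs the query only if all three libraries were found.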