Hive WebHCat Installation (1.0.0) - Test Unsuccessful 2015-07-25 20:00

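Overview

WebHCat is already bundled with the Hive package, so there is nothing extra to download. Once it is configured, it can simply be started.

Setting environment variables

Add the following environment variables to /etc/bashrc: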

export HADOOP_PREFIX=$HADOOP_HOME
export HIVE_HOME=/opt/hive
export TEMPLETON_HOME=/opt/hive/hcatalog
export PYTHON_CMD=/usr/bin/python
export PATH=$PATH:$TEMPLETON_HOME

Make the environment variables take effect immediately:

source /etc/bashrc
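
To confirm that the variables really took effect, a quick check along these lines may help. This is only a sketch: check_webhcat_env is a name made up for this post, and it tests nothing more than that each variable is set and points to an existing path.

```shell
# Hypothetical helper: report variables from /etc/bashrc that are unset
# or that point to a path which does not exist.
# Prints one line per problem and nothing when everything looks fine.
check_webhcat_env() {
    for name in HADOOP_PREFIX HIVE_HOME TEMPLETON_HOME PYTHON_CMD; do
        # Look up the value of the variable whose name is stored in $name
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set"
        elif [ ! -e "$value" ]; then
            echo "$name points to a missing path: $value"
        fi
    done
}

check_webhcat_env
```

An empty HADOOP_PREFIX usually means HADOOP_HOME was not defined before the export lines above were read.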

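Modifying the configuration file

Notes

WebHCat configuration file path: /opt/hive/hcatalog/etc/webhcat/webhcat-default.xml

Configuring the hive-webhcat-1.0.0.jar file path

Copy this file to the default path specified in the configuration file: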

cd /opt/hive/hcatalog/share/webhcat/svr/lib
cp hive-webhcat-1.0.0.jar ..

Configure the zookeeper jar file path:

cp /opt/hive/lib/zookeeper-3.4.6.jar .
<property>
    <name>templeton.libjars</name>
    <value>${env.TEMPLETON_HOME}/share/webhcat/svr/lib/zookeeper-3.4.6.jar</value>
    <description>Jars to add to the classpath.</description>
</property>

Configuring the hadoop-streaming package

hdfs dfs -mkdir /user/templeton
hdfs dfs -put /opt/hadoop/share/hadoop/tools/lib/hadoop-streaming-2.4.1.jar /user/templeton
<property>
    <name>templeton.streaming.jar</name>
    <value>hdfs://ctrl:9000/user/templeton/hadoop-streaming-2.4.1.jar</value>
    <description>The hdfs path to the Hadoop streaming jar file.</description>
</property>
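
Before moving on, it may be worth confirming that the jar really landed in HDFS at the path named in templeton.streaming.jar. A minimal sketch, assuming the hdfs command is on PATH; hdfs_paths_exist is a hypothetical helper name, while `hdfs dfs -test -e` is the standard existence test.

```shell
# Hypothetical helper: succeed only if every given HDFS path exists.
hdfs_paths_exist() {
    for p in "$@"; do
        # "hdfs dfs -test -e" exits with 0 when the path exists
        if ! hdfs dfs -test -e "$p"; then
            echo "missing in HDFS: $p" >&2
            return 1
        fi
    done
}

hdfs_paths_exist /user/templeton/hadoop-streaming-2.4.1.jar \
    || echo "upload the streaming jar before starting WebHCat"
```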

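Hive-related configuration

The Hive tar.gz package used here was not downloaded directly (the download location for the original version could no longer be found). Instead, the hive directory that was already in use was packed into an archive.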

tar -zcvf apache-hive-1.0.0-bin.tar.gz hive
hdfs dfs -put ./apache-hive-1.0.0-bin.tar.gz /user/templeton
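
Because the archive is packed by hand, its internal layout decides what templeton.hive.path and templeton.hive.home must look like: the part after the archive name has to match an entry inside the tar. The following sketch checks that; archive_has_entry is a hypothetical helper, and it assumes the archive is still in the current directory.

```shell
# Hypothetical helper: succeed if the gzipped tar in $1 contains
# an entry whose name is exactly $2.
archive_has_entry() {
    tar -tzf "$1" 2>/dev/null | grep -Fx -- "$2" >/dev/null
}

# The archive was built from the "hive" directory, so the Hive
# launcher is expected at hive/bin/hive inside it.
archive_has_entry apache-hive-1.0.0-bin.tar.gz hive/bin/hive \
    || echo "hive/bin/hive not found in the archive"
```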
  <property>
    <name>templeton.hive.archive</name>
    <value>hdfs://ctrl:9000/user/templeton/apache-hive-1.0.0-bin.tar.gz</value>
    <description>The path to the Hive archive.</description>
  </property>

  <property>
    <name>templeton.hive.path</name>
    <value>apache-hive-1.0.0-bin.tar.gz/hive/bin/hive</value>
    <description>The path to the Hive executable.</description>
  </property>

  <property>
    <name>templeton.hive.home</name>
    <value>apache-hive-1.0.0-bin.tar.gz/hive</value>
    <description>
      The path to the Hive home within the tar.  This is needed if Hive is not installed on all
      nodes in the cluster and needs to be shipped to the target node in the cluster to execute Pig
      job which uses HCat, Hive query, etc.  Has no effect if templeton.hive.archive is not set.
    </description>
  </property>

  <property>
    <name>templeton.hcat.home</name>
    <value>apache-hive-1.0.0-bin.tar.gz/hive/hcatalog</value>
    <description>
      The path to the HCat home within the tar.  This is needed if Hive is not installed on all
      nodes in the cluster and needs to be shipped to the target node in the cluster to execute Pig
      job which uses HCat, Hive query, etc.  Has no effect if templeton.hive.archive is not set.
    </description>
  </property>

  <property>
    <name>templeton.hive.properties</name>
    <value>hive.metastore.uris=thrift://data02:9083\,thrift://data03:9083,hive.metastore.sasl.enabled=false</value>
    <description>Properties to set when running hive (during job submission).  This is expected to
        be a comma-separated prop=value list.  If some value is itself a comma-separated list the
        escape character is '\'</description>
  </property>
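
The description of templeton.hive.properties says that a comma inside a value has to be escaped with a backslash. That matters here, because hive.metastore.uris lists two metastores. As a rough sketch of the rule (join_hive_properties is a hypothetical helper, not part of WebHCat), the value can be assembled like this:

```shell
# Hypothetical helper: join prop=value pairs with commas, escaping any
# comma that appears inside a pair with a backslash.
join_hive_properties() {
    result=""
    for pair in "$@"; do
        # Turn every "," inside this pair into "\,"
        escaped=$(printf '%s' "$pair" | sed 's/,/\\,/g')
        # Add a separating comma only when result is not empty yet
        result="${result:+$result,}$escaped"
    done
    printf '%s\n' "$result"
}

join_hive_properties \
    "hive.metastore.uris=thrift://data02:9083,thrift://data03:9083" \
    "hive.metastore.sasl.enabled=false"
```

For the two pairs above this prints hive.metastore.uris=thrift://data02:9083\,thrift://data03:9083,hive.metastore.sasl.enabled=false, with only the comma between the two thrift URIs escaped.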

Other configuration

  <property>
    <name>templeton.zookeeper.hosts</name>
    <value>data01:2181,data02:2181,data03:2181</value>
    <description>ZooKeeper servers, as comma separated host:port pairs</description>
  </property>

  <property>
    <name>templeton.hcat</name>
    <value>${env.TEMPLETON_HOME}/bin/hcat.py</value>
    <description>The path to the hcatalog executable.</description>
  </property>
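
Since all of these properties are edited by hand, a single mistyped closing tag is enough to make the file unparseable. A well-formedness check before starting the server can rule that out. The sketch below is an assumption-laden example: xml_well_formed is a hypothetical helper, it relies on a Python interpreter (PYTHON_CMD from the environment section, falling back to python3), and it only checks XML syntax, not whether the property names are valid.

```shell
# Hypothetical helper: succeed if the file in $1 parses as XML.
xml_well_formed() {
    "${PYTHON_CMD:-python3}" -c \
        'import sys, xml.dom.minidom; xml.dom.minidom.parse(sys.argv[1])' \
        "$1" 2>/dev/null
}

conf=/opt/hive/hcatalog/etc/webhcat/webhcat-default.xml
xml_well_formed "$conf" || echo "not well-formed (or unreadable): $conf"
```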

Usage

Starting

sbin/webhcat_server.sh start
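
The server needs a moment before it answers requests, so a script that starts it and tests it right away may report a false failure. One way around that is to poll until the port responds. This is a sketch only: wait_for_url is a hypothetical helper, it assumes curl is installed, and the 30 attempts in the example are an arbitrary choice.

```shell
# Hypothetical helper: poll URL $1 once per second, up to $2 times
# (default 30). Succeeds as soon as the URL answers without an HTTP error.
wait_for_url() {
    url=$1
    tries=${2:-30}
    while [ "$tries" -gt 0 ]; do
        # -s: silent, -f: treat HTTP errors as failure
        if curl -sf -o /dev/null "$url"; then
            return 0
        fi
        tries=$((tries - 1))
        sleep 1
    done
    return 1
}

# Example (run after the start command):
#   wait_for_url http://localhost:50111/templeton/v1/status 30 \
#       || echo "WebHCat did not come up"
```

The same script stops the server again with sbin/webhcat_server.sh stop.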

Maintenance information

Listening port: 50111
Process name shown by jps: RunJar

Testing the installation

curl http://localhost:50111/templeton/v1/status | python -mjson.tool
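
According to the WebHCat documentation, a healthy server answers the status request with a small JSON object whose status field is ok, for example {"status":"ok","version":"v1"}. For use in a script, the check can be reduced to an exit code. The helper below is a hypothetical sketch that uses a plain grep pattern instead of a real JSON parser, so it only suits this simple response.

```shell
# Hypothetical helper: read a status response on stdin and succeed
# if it contains "status": "ok" (whitespace around the colon allowed).
status_is_ok() {
    grep -E '"status"[[:space:]]*:[[:space:]]*"ok"' >/dev/null
}

if curl -s http://localhost:50111/templeton/v1/status | status_is_ok; then
    echo "WebHCat is up"
else
    echo "WebHCat did not report status ok"
fi
```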
