Hue Installation Guide 2015-05-13 19:00

Prerequisites

Install the following components beforehand:

  • JDK
  • Maven
  • Git

Also install the following packages:

yum install -y gcc libxml2-devel libxslt-devel cyrus-sasl-devel mysql-devel python-devel python-setuptools python-simplejson sqlite-devel ant gmp-devel
yum install -y cyrus-sasl-plain cyrus-sasl-devel cyrus-sasl-gssapi

Installation

Download the source code

ssh ctrl
cd /opt/
git clone https://github.com/cloudera/hue.git

To install a specific branch, pass it to git clone with -b:

git clone -b branch-3.7.1 https://github.com/cloudera/hue.git

Modify the build configuration

  • Edit maven/pom.xml

Set the Hadoop and Spark versions to match your cluster:

<hadoop-mr1.version>2.4.1</hadoop-mr1.version>
<hadoop.version>2.4.1</hadoop.version>

Change hadoop-core to hadoop-common:

<artifactId>hadoop-core</artifactId>

Change it to:

<artifactId>hadoop-common</artifactId>

Set the hadoop-test version to 1.2.1 (why this is required is unknown):

<artifactId>hadoop-test</artifactId>
<version>1.2.1</version>
  • Delete the Hadoop v1 file that is no longer needed:
rm desktop/libs/hadoop/java/src/main/java/org/apache/hadoop/mapred/ThriftJobTrackerPlugin.java

Build

The build takes a long time, so be patient.

make apps

Edit the configuration after the build

vi desktop/conf/pseudo-distributed.ini

Start Hue

build/env/bin/hue runserver 0.0.0.0:8000

Integration configuration

Configuration file path

/opt/hue/desktop/conf/pseudo-distributed.ini

HDFS

Add the following to hdfs-site.xml on the NameNode:

<property>
  <name>dfs.webhdfs.enabled</name>
  <value>true</value>
</property>

Add the following to core-site.xml on the NameNode:

<property>
  <name>hadoop.proxyuser.hue.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.hue.groups</name>
  <value>*</value>
</property>
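
To confirm that the proxy-user entries actually made it into core-site.xml, the file can be parsed and checked. The following is only a minimal sketch: the function names are hypothetical, and it assumes the usual Hadoop layout of property elements with name and value children.

```python
import xml.etree.ElementTree as ET


def read_hadoop_properties(xml_text):
    """Parse a Hadoop *-site.xml document into a {name: value} dict."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.iter("property"):
        name = prop.findtext("name")
        if name is not None:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def missing_proxyuser_settings(props, user="hue"):
    """Return the hadoop.proxyuser.* keys for `user` that are absent."""
    required = [
        "hadoop.proxyuser.{}.hosts".format(user),
        "hadoop.proxyuser.{}.groups".format(user),
    ]
    return [key for key in required if key not in props]
```

Pass the contents of core-site.xml to read_hadoop_properties; an empty list from missing_proxyuser_settings means both entries are present.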

Hue configuration:

If Hue runs on the same node as the NameNode, the defaults can be kept.

[hadoop]

  [[hdfs_clusters]]

    [[[default]]]

      # Enter the filesystem uri
      fs_defaultfs=hdfs://localhost:8020

      # Use WebHdfs/HttpFs as the communication mechanism.
      # Domain should be the NameNode or HttpFs host.
      webhdfs_url=http://localhost:50070/webhdfs/v1
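
Before starting Hue it can help to check that the configured webhdfs_url answers a WebHDFS LISTSTATUS request. This is a rough sketch rather than anything Hue itself ships: the helper names are hypothetical, and it assumes the cluster is unsecured so that the user.name query parameter is accepted.

```python
import json
import urllib.parse
import urllib.request


def liststatus_url(webhdfs_url, path="/", user="hue"):
    """Build a WebHDFS LISTSTATUS URL; `path` must start with '/'."""
    query = urllib.parse.urlencode({"op": "LISTSTATUS", "user.name": user})
    return "{}{}?{}".format(webhdfs_url.rstrip("/"), urllib.parse.quote(path), query)


def list_directory(webhdfs_url, path="/", user="hue", timeout=5):
    """Return the entry names of an HDFS directory via WebHDFS."""
    url = liststatus_url(webhdfs_url, path, user)
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        body = json.load(resp)
    # WebHDFS wraps the listing as {"FileStatuses": {"FileStatus": [...]}}
    return [entry["pathSuffix"] for entry in body["FileStatuses"]["FileStatus"]]
```

Calling list_directory("http://localhost:50070/webhdfs/v1") on the Hue host should return the names under the HDFS root if dfs.webhdfs.enabled took effect.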

YARN

  • Start the JobHistoryServer:

mapred-site.xml:

<property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>0.0.0.0:19888</value>
</property>

/opt/hadoop/sbin/mr-jobhistory-daemon.sh start historyserver

  • Start the ProxyServer

yarn-site.xml:

<property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>${yarn.resourcemanager.hostname}:8088</value>
</property>
<property>
    <name>yarn.web-proxy.address</name>
    <value>${yarn.resourcemanager.hostname}:8888</value>
</property>

Start the YARN WebAppProxyServer:

yarn-daemon.sh start proxyserver

Once it has started, the WebAppProxyServer process appears in the output of jps.

Add the following to the Hue configuration:

[hadoop]

  [[yarn_clusters]]

    [[[default]]]

      # Enter the host on which you are running the ResourceManager
      resourcemanager_host=ctrl

      # Whether to submit jobs to this cluster
      submit_to=True

      # URL of the ResourceManager API
      resourcemanager_api_url=http://ctrl:8088

      # URL of the ProxyServer API
      proxy_api_url=http://ctrl:8888

      # URL of the HistoryServer API
      history_server_api_url=http://ctrl:19888

Note: if Hue cannot connect to YARN, check whether YARN is listening on 0.0.0.0 (all interfaces) or only on a specific address such as 192.168.1.0.
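
A quick connectivity check from the Hue host can narrow down this kind of listening-address problem. The sketch below is an illustration with hypothetical helper names; it relies on the ResourceManager and JobHistoryServer REST info endpoints (/ws/v1/cluster/info and /ws/v1/history/info) and on a plain TCP connect, and it falls back to port 80 when a URL carries no port, which is an assumption of this sketch.

```python
import socket
from urllib.parse import urlsplit


def yarn_info_urls(resourcemanager_api_url, history_server_api_url):
    """Return the REST 'info' endpoints for the two YARN services."""
    return {
        "resourcemanager": resourcemanager_api_url.rstrip("/") + "/ws/v1/cluster/info",
        "historyserver": history_server_api_url.rstrip("/") + "/ws/v1/history/info",
    }


def port_is_reachable(url, timeout=3.0):
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parts = urlsplit(url)
    try:
        with socket.create_connection((parts.hostname, parts.port or 80), timeout=timeout):
            return True
    except OSError:
        return False
```

If port_is_reachable("http://ctrl:8088") is False on the Hue host but True on the ResourceManager host itself, the service is probably bound to an address that Hue cannot reach.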

Hive

On the HiveServer2 node, set hive.server2.authentication to NOSASL in hive-site.xml:

<property>
    <name>hive.server2.authentication</name>
    <value>NOSASL</value>
    <description>
      Expects one of [nosasl, none, ldap, kerberos, pam, custom].
      Client authentication types.
        NONE: no authentication check
        LDAP: LDAP/AD based authentication
        KERBEROS: Kerberos/GSSAPI authentication
        CUSTOM: Custom authentication provider
                (Use with property hive.server2.custom.authentication.class)
        PAM: Pluggable authentication module
        NOSASL:  Raw transport
    </description>
</property>

Hue configuration:

[beeswax]

  # Host where HiveServer2 is running.
  # If Kerberos security is enabled, use fully-qualified domain name (FQDN).
  hive_server_host=localhost

  # Port where HiveServer2 Thrift server runs on.
  hive_server_port=10000

  # Hive configuration directory, where hive-site.xml is located
  hive_conf_dir=/opt/hive/conf

HBase

Start the HBase Thrift Server:

hbase thrift start &

Hue configuration:

[hbase]
  # Comma-separated list of HBase Thrift servers for clusters in the format of
  # '(name|host:port)'.
  # Use full hostname with security.
  # If using Kerberos we assume GSSAPI SASL, not PLAIN.
  hbase_clusters=(QingCloudHBase|localhost:9090)

  # HBase configuration directory, where hbase-site.xml is located.
  hbase_conf_dir=/opt/hbase/conf
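
The '(name|host:port)' format of hbase_clusters is easy to mistype, so a small parser can serve as a sanity check. This is a sketch with a hypothetical function name, not the parser Hue uses internally.

```python
import re

# One '(name|host:port)' entry; entries are separated by commas.
_CLUSTER_RE = re.compile(r"\(\s*([^|()]+?)\s*\|\s*([^:()\s]+)\s*:\s*(\d+)\s*\)")


def parse_hbase_clusters(value):
    """Split an hbase_clusters value into (name, host, port) tuples."""
    return [(name, host, int(port)) for name, host, port in _CLUSTER_RE.findall(value)]
```

An empty result for a non-empty setting suggests that the parentheses, the '|' separator, or the port is malformed.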

ZooKeeper

ZooKeeper needs its REST service running:

ssh data01
# Build ZooKeeper
cd /opt/zookeeper
cd /opt/zookeeper/src/contrib/rest
ant run

By default the REST service listens on port 9998. To test it, open http://data01:9998/znodes/v1/ .

Note: from any other directory, start the REST service with the following command:

/usr/bin/ant -f /opt/zookeeper/src/contrib/rest/build.xml run

Reference: Enabling the ZooKeeper REST service (3.4.6)

Hue configuration:

[zookeeper]

  [[clusters]]

    [[[default]]]
      # Zookeeper ensemble. Comma separated list of Host/Port.
      # e.g. localhost:2181,localhost:2182,localhost:2183
      host_ports=data01:2181,data02:2181,data03:2181

      # The URL of the REST contrib service (required for znode browsing).
      rest_url=http://data01:9998,http://data02:9998,http://data03:9998

      # Name of Kerberos principal when using security.
      ## principal_name=zookeeper
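
To check each server listed in host_ports, the ZooKeeper four-letter command ruok can be sent over a plain socket; a healthy server replies imok. The sketch below uses hypothetical helper names and assumes four-letter commands are enabled, as they are by default in 3.4.6 (newer releases may restrict them).

```python
import socket


def parse_host_ports(value):
    """Split 'host:port,host:port' into a list of (host, port) tuples."""
    pairs = []
    for item in value.split(","):
        host, _, port = item.strip().rpartition(":")
        pairs.append((host, int(port)))
    return pairs


def zk_is_ok(host, port, timeout=3.0):
    """Send 'ruok' and return True if the server answers 'imok'."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            reply = b""
            while len(reply) < 4:
                chunk = sock.recv(16)
                if not chunk:  # server closed the connection
                    break
                reply += chunk
            return reply == b"imok"
    except OSError:
        return False
```

For example, running zk_is_ok over parse_host_ports("data01:2181,data02:2181,data03:2181") shows which ensemble members respond.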
