December 2017

This work is licensed under the Creative Commons Attribution-ShareAlike 4.0 International License.
Visit https://creativecommons.org/licenses/by-sa/4.0/ to view the license.

PS: The web UI may show some errors, but they do not affect normal use. They only matter when connecting HUE; I found no information about them online and, having been busy lately, have not resolved them yet.

Profile

#HBase
HBASE_HOME=/usr/java/hbase/hbase2.0
export PATH=$HBASE_HOME/bin:$PATH

$HBASE_HOME/conf/hbase-env.sh

export JAVA_HOME=/usr/java/jdk/jdk8
export HBASE_MANAGES_ZK=true

$HBASE_HOME/conf/hbase-site.xml

<property>
        <name>hbase.rootdir</name>
        <value>hdfs://cat4:9000/hbase</value>
</property>
<property>
        <name>hbase.cluster.distributed</name>
        <value>true</value>
</property>
<property>
        <name>hbase.zookeeper.quorum</name>
        <!-- Using the hostname failed in testing; switched to the IP address -->
        <value>192.0.96.14</value>
</property>
<property>
        <name>dfs.replication</name>
        <value>1</value>
</property>

Then the HMaster kept crashing shortly after startup; the log showed this:

2018-06-10 07:00:34,604 ERROR [master/Cat4:16000] master.HMaster: ***** ABORTING master cat4,16000,1528628425902: Unhandled exception. Starting shutdown. *****
java.lang.IllegalStateException: The procedure WAL relies on the ability to hsync for proper operation during component failures, but the underlying filesystem does not support doing so. Please check the config value of 'hbase.procedure.store.wal.use.hsync' to set the desired level of robustness and ensure the config value of 'hbase.wal.dir' points to a FileSystem mount that can provide it.
        at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.rollWriter(WALProcedureStore.java:1043)
        at org.apache.hadoop.hbase.procedure2.store.wal.WALProcedureStore.recoverLease(WALProcedureStore.java:382)
        at org.apache.hadoop.hbase.procedure2.ProcedureExecutor.start(ProcedureExecutor.java:530)
        at org.apache.hadoop.hbase.master.HMaster.startProcedureExecutor(HMaster.java:1222)
        at org.apache.hadoop.hbase.master.HMaster.startServiceThreads(HMaster.java:1141)
        at org.apache.hadoop.hbase.master.HMaster.finishActiveMasterInitialization(HMaster.java:849)
        at org.apache.hadoop.hbase.master.HMaster.startActiveMasterManager(HMaster.java:2019)
        at org.apache.hadoop.hbase.master.HMaster.run(HMaster.java:553)
        at java.lang.Thread.run(Thread.java:748)

So there was an hsync problem, plus a log4j conflict problem whose log is not shown here.

Only then did I learn that the official HBase 2.0 binary package is built against Hadoop 2 by default: the HBase lib folder ships Hadoop 2.7 dependencies, and that version does not support hsync. There appear to be two solutions.

The first is to compile a Hadoop 3-based build from source by passing -Dhadoop.profile=3.0 to Maven; the build output lands at hbase/hbase-assembly/target/hbase-3.0.0-SNAPSHOT-bin.tar.gz:

git clone https://github.com/apache/hbase.git
cd hbase
mvn clean package -DskipTests assembly:single -Dhadoop.profile=3.0

The second is to disable the hsync check with the hbase-site.xml snippet below, though in my testing this seemed to cause various other problems:

<property>
  <name>hbase.unsafe.stream.capability.enforce</name>
  <value>false</value>
</property>

Done. Start it with start-hbase.sh.
The default web console port is 16010.


The exception looks roughly like this:

    Error: java.lang.RuntimeException: java.io.EOFException
            at org.apache.hadoop.io.WritableComparator.compare(WritableComparator.java:165)
            at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.compare(MapTask.java:1283)
            at org.apache.hadoop.util.QuickSort.fix(QuickSort.java:35)
            at org.apache.hadoop.util.QuickSort.sortInternal(QuickSort.java:87)
            at org.apache.hadoop.util.QuickSort.sort(QuickSort.java:63)
            at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.sortAndSpill(MapTask.java:1625)
            at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1505)
            at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:735)
            at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:805)
            at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
            at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:174)
            at java.security.AccessController.doPrivileged(Native Method)
            at javax.security.auth.Subject.doAs(Subject.java:422)
            at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1682)
            at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:168)
    Caused by: java.io.EOFException
            at java.io.DataInputStream.readFully(DataInputStream.java:197)
            at java.io.DataInputStream.readUTF(DataInputStream.java:609)
            at java.io.DataInputStream.readUTF(DataInputStream.java:564)
            at compare.Salary.readFields(Salary.java:73)
            at org.apache.hadoop.io.WritableComparator.compare(WritableComparator.java:158)
            ... 14 more

Cause:

@Override
public void write(DataOutput dataOutput) throws IOException {
    dataOutput.write(this.empno);      // BUG: write() emits only one byte
    dataOutput.writeUTF(this.ename);
    dataOutput.writeUTF(this.job);
    dataOutput.write(this.mgr);        // BUG: same here
    dataOutput.writeUTF(this.hiredate);
    dataOutput.write(this.sal);        // BUG
    dataOutput.write(this.comm);       // BUG
    dataOutput.write(this.deptno);     // BUG
}
@Override
public void readFields(DataInput dataInput) throws IOException {
    this.empno = dataInput.readInt();  // expects four bytes per int
    this.ename = dataInput.readUTF();
    this.job = dataInput.readUTF();
    this.mgr = dataInput.readInt();
    this.hiredate = dataInput.readUTF();
    this.sal = dataInput.readInt();
    this.comm = dataInput.readInt();
    this.deptno = dataInput.readInt();
}

readFields reads each int field with readInt, whose doc says:

    Reads four input bytes and returns an int value. Let a-d be the first through fourth bytes read.

So it consumes four bytes per field. But write serialized those fields with write, whose doc says:

    Writes to the output stream the eight low-order bits of the argument b. The 24 high-order bits of b are ignored.

That writes only a single byte, so readInt reaches the end of the stream before it can gather four bytes, which is exactly the EOFException above. The fix is to replace write with writeInt. The funny part is that the doc says the 24 high-order bits are simply "ignored", yet at read time you still get an exception, emmmm.


Pseudo-distributed mode:
hosts

127.0.0.1    Cat11

env:

# Jdk
JAVA_HOME=/usr/java/jdk/jdk8
export JAVA_HOME
PATH=$JAVA_HOME/bin:$PATH
export PATH
# Hadoop
HADOOP_HOME=/usr/java/hadoop/hadoop3
export HADOOP_HOME
PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
export PATH
export HDFS_NAMENODE_USER="root"
export HDFS_DATANODE_USER="root"
export HDFS_SECONDARYNAMENODE_USER="root"
export YARN_RESOURCEMANAGER_USER="root"
export YARN_NODEMANAGER_USER="root"

hadoop-env.sh

export JAVA_HOME=/usr/java/jdk/jdk8

hdfs-site.xml

<!-- Block replication factor; default is 3 -->
<property>
    <name>dfs.replication</name>
    <value>1</value>
</property>
<!-- Whether HDFS permission checking is enabled; default: true -->
<!--
 <property>
   <name>dfs.permissions</name>
   <value>false</value>
 </property>
 -->

core-site.xml

<!-- NameNode address -->
<property>
    <name>fs.defaultFS</name>
    <value>hdfs://Cat11:9000</value>
</property> 
<!-- Directory where HDFS stores its data; defaults to the Linux tmp directory -->
<property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/java/hadoop/hadoop3/tmp</value>
</property>

mapred-site.xml

<!-- MR jobs run on YARN -->
<property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
</property>

yarn-site.xml

<!-- ResourceManager address -->
<property>
    <name>yarn.resourcemanager.hostname</name>
    <value>Cat11</value>
</property>     
<!-- How the NodeManager runs MR tasks -->
<property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
</property>

Format the NameNode

hdfs namenode -format

Seeing "name has been successfully formatted." indicates success.
Now it can be started:

start-all.sh

Finally, a wordcount test produced an error:

hdfs dfs -put data /
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.*.jar wordcount /data /output/test

The error message looks roughly like this:

    Please check whether your etc/hadoop/mapred-site.xml contains the below configuration:
    <property>
      <name>yarn.app.mapreduce.am.env</name>
      <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
    </property>
    <property>
      <name>mapreduce.map.env</name>
      <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
    </property>
    <property>
      <name>mapreduce.reduce.env</name>
      <value>HADOOP_MAPRED_HOME=${full path of your hadoop distribution directory}</value>
    </property>

So configure the MapReduce environment variables as well:
mapred-site.xml

<property>
    <name>yarn.app.mapreduce.am.env</name>
    <value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
</property>
<property>
    <name>mapreduce.map.env</name>
    <value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
</property>
<property>
    <name>mapreduce.reduce.env</name>
    <value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
</property>

Done!


Fully distributed mode (one master, two workers; settings commented with "Hadoop3.1" are required additions relative to Hadoop 2):
hosts

192.0.96.11    Cat1
192.0.96.12    Cat2
192.0.96.13    Cat3

SSH (configure on all three machines; run ssh-keygen -t rsa first)

ssh-copy-id -i .ssh/id_rsa.pub root@cat1
ssh-copy-id -i .ssh/id_rsa.pub root@cat2
ssh-copy-id -i .ssh/id_rsa.pub root@cat3

env:

# Jdk
JAVA_HOME=/usr/java/jdk/jdk8
export JAVA_HOME
PATH=$JAVA_HOME/bin:$PATH
export PATH

# Hadoop
HADOOP_HOME=/usr/java/hadoop/hadoop3
export HADOOP_HOME
PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
export PATH
#Hadoop3.1
export HDFS_NAMENODE_USER="root"
export HDFS_DATANODE_USER="root"
export HDFS_SECONDARYNAMENODE_USER="root"
export YARN_RESOURCEMANAGER_USER="root"
export YARN_NODEMANAGER_USER="root"

hadoop-env.sh

export JAVA_HOME=/usr/java/jdk/jdk8

hdfs-site.xml

<!-- Block replication factor; default is 3 -->
<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>
<!-- Whether HDFS permission checking is enabled; default: true -->
<property>
  <name>dfs.permissions</name>
  <value>false</value>
</property>

core-site.xml

<!-- NameNode address -->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://Cat1:9000</value>
</property>    
<!-- Directory where HDFS stores its data; defaults to the Linux tmp directory -->
<property>
  <name>hadoop.tmp.dir</name>
  <value>/usr/java/hadoop/hadoop3/tmp</value>
</property>

mapred-site.xml

<!-- MR jobs run on YARN -->
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
<!--Hadoop3.1-->
<property>
  <name>yarn.app.mapreduce.am.env</name>
  <value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
</property>
<property>
  <name>mapreduce.map.env</name>
  <value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
</property>
<property>
  <name>mapreduce.reduce.env</name>
  <value>HADOOP_MAPRED_HOME=$HADOOP_HOME</value>
</property>
<!--Hadoop3.1-->

yarn-site.xml

<!-- ResourceManager address -->
<property>
  <name>yarn.resourcemanager.hostname</name>
  <value>Cat1</value>
</property>        
<!-- How the NodeManager runs MR tasks -->
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>

workers (this file was named slaves in Hadoop 2)
Format the NameNode (a common pitfall: the tmp, name, and data directories must not already exist when formatting)

    hdfs namenode -format

Success is indicated by: name has been successfully formatted.

Copy the installation to the other nodes (this also keeps the clusterID consistent):

scp -r hadoop3/ root@cat2:/usr/java/hadoop
scp -r hadoop3/ root@cat3:/usr/java/hadoop

Running

start-all.sh

Check the running status:

    # hdfs dfsadmin -report
    Configured Capacity: 13283360768 (12.37 GB)
    Present Capacity: 7522836480 (7.01 GB)
    DFS Remaining: 7522820096 (7.01 GB)
    DFS Used: 16384 (16 KB)
    DFS Used%: 0.00%
    Replicated Blocks:
            Under replicated blocks: 0
            Blocks with corrupt replicas: 0
            Missing blocks: 0
            Missing blocks (with replication factor 1): 0
            Pending deletion blocks: 0
    Erasure Coded Block Groups: 
            Low redundancy block groups: 0
            Block groups with corrupt internal blocks: 0
            Missing block groups: 0
            Pending deletion blocks: 0


-------------------------------------------------
Live datanodes (2):

Name: 192.0.96.12:9866 (Cat2)
Hostname: Cat2
Decommission Status : Normal
Configured Capacity: 6641680384 (6.19 GB)
DFS Used: 8192 (8 KB)
Non DFS Used: 3138756608 (2.92 GB)
DFS Remaining: 3502915584 (3.26 GB)
DFS Used%: 0.00%
DFS Remaining%: 52.74%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Sun May 27 04:29:04 EDT 2018
Last Block Report: Sun May 27 04:07:13 EDT 2018
Num of Blocks: 0



Name: 192.0.96.13:9866 (Cat3)
Hostname: Cat3
Decommission Status : Normal
Configured Capacity: 6641680384 (6.19 GB)
DFS Used: 8192 (8 KB)
Non DFS Used: 2621767680 (2.44 GB)
DFS Remaining: 4019904512 (3.74 GB)
DFS Used%: 0.00%
DFS Remaining%: 60.53%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Sun May 27 04:29:04 EDT 2018
Last Block Report: Sun May 27 04:07:13 EDT 2018
Num of Blocks: 0

Web Interfaces (3.1)

NameNode: http://nn_host:port/ (default HTTP port 9870)
ResourceManager: http://rm_host:port/ (default HTTP port 8088)
MapReduce JobHistory Server: http://jhs_host:port/ (default HTTP port 19888)

Web Interfaces (2.9.1)

NameNode: http://nn_host:port/ (default HTTP port 50070)
ResourceManager: http://rm_host:port/ (default HTTP port 8088)
MapReduce JobHistory Server: http://jhs_host:port/ (default HTTP port 19888)
