In a previous article I showed, step by step, how to build a Hadoop cluster, but I only provided the code, and I was afraid some readers would be left with questions. In this post I use a mix of text and screenshots to show, in one place, how to build a Hadoop high-availability cluster covering HadoopHA, Zookeeper, MySQL, HBase, Hive, Sqoop, Scala and Spark. If you are interested in the earlier articles, see this column:
大数据技术之Hadoop全生态组件学习与搭建  http://t.csdnimg.cn/LMyEn
This is a long article, so a table of contents is attached. The installation is done inside VM virtual machines. I hope this article helps you; if you think it is well written, please leave a like, thank you.
Contents
I. Creating the cluster
1. Creating the master host
2. Unpacking the installers and configuring environment variables
(1) Unpacking the installers
(2) Configuring environment variables
3. Creating the worker hosts
II. Configuring and installing the applications
1. HadoopHA and Zookeeper
2. HBase
3. Hive and MySQL
4. Sqoop
5. Scala and Spark

I. Creating the cluster
1. Creating the master host
First, create a new virtual machine in VMware named BigData01 to act as the master host. Pay attention to the memory setting: if the cluster is only for learning, it does not need much memory; if you are building it for work or some other demanding purpose, give it as much as you can. The virtual machine can then be powered on; the first boot requires initialization. During setup a new user is added, where "name" is the username and "password" is the password. Here we log in as the root (administrator) user, with the password chosen when the virtual machine was created. Upload the required installation packages (jdk, Hadoop, zookeeper, hbase, MySQL, the MySQL connector for Java, hive, sqoop, Scala, spark) into the Downloads directory on the Linux system.
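Before unpacking anything, it is worth confirming that every archive actually made it onto the machine; a quick listing (assuming the packages were uploaded to /root/Downloads as described above) is enough:

ls -lh /root/Downloads/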
2. Unpacking the installers and configuring environment variables
(1) Unpacking the installers
Open a terminal and extract the packages to /opt:
tar zxvf /root/Downloads/jdk-8u171-linux-x64.tar.gz -C/opt/
tar zxvf /root/Downloads/zookeeper-3.4.5.tar.gz -C/opt/
tar zxvf /root/Downloads/hadoop-2.7.5.tar.gz -C/opt/
mv /opt/zookeeper-3.4.5/ /opt/zookeeper
mv /opt/hadoop-2.7.5/ /opt/hadoopHA
tar zxvf /root/Downloads/hbase-1.2.6-bin.tar.gz -C/opt/
tar zxvf /root/Downloads/apache-hive-2.1.1-bin.tar.gz -C/opt/
mv /opt/apache-hive-2.1.1-bin/ /opt/hive

Remove the pre-installed MariaDB packages (use whatever version numbers the query below prints):
rpm -qa | grep mariadb
rpm -e --nodeps mariadb-libs-5.5.65-1.el7.x86_64
rpm -e --nodeps mariadb-5.5.68-1.el7.x86_64
rpm -e --nodeps mariadb-libs-5.5.68-1.el7.x86_64

Install MySQL:
cd /opt/
mkdir mysql
cd
tar xvf /root/Downloads/mysql-5.7.26-1.el7.x86_64.rpm-bundle.tar -C/opt/mysql
cd /opt/mysql/
rpm -ivh mysql-community-common-5.7.26-1.el7.x86_64.rpm
rpm -ivh mysql-community-libs-5.7.26-1.el7.x86_64.rpm
rpm -ivh mysql-community-libs-compat-5.7.26-1.el7.x86_64.rpm
rpm -ivh mysql-community-client-5.7.26-1.el7.x86_64.rpm
rpm -ivh mysql-community-server-5.7.26-1.el7.x86_64.rpm

Check the installation:
rpm -qa | grep mysql

mv /root/Downloads/mysql-connector-java-5.1.46-bin.jar /opt/hive/lib/
tar -zxvf /root/Downloads/sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz -C/opt/
mv /opt/sqoop-1.4.7.bin__hadoop-2.6.0/ /opt/sqoop

(2) Configuring environment variables
Create the required directories and configure the environment variables:
cd /opt/zookeeper
mkdir data
mkdir logs
cd
vim /etc/profile

export JAVA_HOME=/opt/jdk1.8.0_171
export HADOOP_HOME=/opt/hadoopHA
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export ZOOKEEPER_HOME=/opt/zookeeper
export PATH=$PATH:$ZOOKEEPER_HOME/bin
export HBASE_HOME=/opt/hbase-1.2.6
export PATH=$PATH:$HBASE_HOME/bin
export HIVE_HOME=/opt/hive
export HIVE_CONF_DIR=$HIVE_HOME/conf
export HCAT_HOME=$HIVE_HOME/hcatalog
export PATH=$PATH:$HIVE_HOME/bin
export SQOOP_HOME=/opt/sqoop
export PATH=$PATH:$SQOOP_HOME/bin
export SCALA_HOME=/usr/local/soft/scala-2.12.12
export PATH=$PATH:${SCALA_HOME}/bin
export SPARK_HOME=/opt/spark-3.2.1
export PATH=$PATH:${SPARK_HOME}/bin
export PATH=$PATH:${SPARK_HOME}/sbin

source /etc/profile
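After sourcing /etc/profile it is worth sanity-checking the variables; a minimal check like the one below (the exact version strings depend on your downloads, and the Scala and Spark paths will only resolve once those packages are installed in later steps) can save trouble later:

java -version
hadoop version
echo $ZOOKEEPER_HOME $HBASE_HOME $HIVE_HOME $SQOOP_HOME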
3. Creating the worker hosts
Shut down the master host, then clone two more virtual machines from its current state to act as the cluster's worker nodes, named BigData02 and BigData03 (BigData03 is created in the same way as BigData02).
II. Configuring and installing the applications
1. HadoopHA and Zookeeper
vim /etc/hosts
192.168.67.128 BigData01
192.168.67.129 BigData02
192.168.67.130 BigData03
(adjust to your actual IP addresses)
scp -r /etc/hosts BigData02:/etc/
scp -r /etc/hosts BigData03:/etc/
(answer yes and enter the password when prompted)

ssh-keygen -t rsa
cd ~/.ssh/
cat ./id_rsa.pub >> ./authorized_keys

The authorized key file then has to be sent to the two worker nodes:
# send it with scp
scp ./authorized_keys root@BigData02:~/.ssh/
scp ./authorized_keys root@BigData03:~/.ssh/
ssh-copy-id BigData02
ssh-copy-id BigData03
ssh-add    # start the ssh agent service
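Before moving on, it is worth confirming that passwordless SSH really works; a minimal check against the hostnames used above is:

ssh BigData02 hostname
ssh BigData03 hostname

If either command still prompts for a password, re-check the authorized_keys copies above.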
echo 1 > /opt/zookeeper/data/myid
cp /opt/zookeeper/conf/zoo_sample.cfg /opt/zookeeper/conf/zoo.cfg
vim /opt/zookeeper/conf/zoo.cfg

Change dataDir=/opt/zookeeper/data and add at the end:
server.1=BigData01:2888:3888
server.2=BigData02:2888:3888
server.3=BigData03:2888:3888

scp -r /opt/zookeeper root@BigData02:/opt/
scp -r /opt/zookeeper root@BigData03:/opt/

On the BigData02 VM: echo 2 > /opt/zookeeper/data/myid
On the BigData03 VM: echo 3 > /opt/zookeeper/data/myid

On all three nodes:
systemctl stop firewalld.service
zkServer.sh start
zkServer.sh status
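zkServer.sh status should report one leader and two followers across the three nodes. For a further check, the zkCli.sh client that ships with Zookeeper can be pointed at any node configured in zoo.cfg, for example:

zkCli.sh -server BigData01:2181
ls /
quit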
cd /opt/hadoopHA/
mkdir tmp
scp -r /opt/hadoopHA/tmp BigData02:/opt/hadoopHA/
scp -r /opt/hadoopHA/tmp BigData03:/opt/hadoopHA/

vim /opt/hadoopHA/etc/hadoop/hadoop-env.sh
export JAVA_HOME=/opt/jdk1.8.0_171

vim /opt/hadoopHA/etc/hadoop/core-site.xml

<property>
  <!-- HDFS communication address -->
  <name>fs.defaultFS</name>
  <value>hdfs://ns1</value>
</property>
<property>
  <!-- directory for files generated while hadoop is running (i.e. the temporary directory) -->
  <name>hadoop.tmp.dir</name>
  <value>/opt/hadoopHA/tmp</value>
</property>
<property>
  <!-- ZooKeeper addresses; port 2181 as configured in zoo.cfg -->
  <name>ha.zookeeper.quorum</name>
  <value>BigData01:2181,BigData02:2181,BigData03:2181</value>
</property>

vim /opt/hadoopHA/etc/hadoop/hdfs-site.xml

<property>
  <!-- HDFS nameservice ns1; must match core-site.xml -->
  <name>dfs.nameservices</name>
  <value>ns1</value>
</property>
<property>
  <!-- ns1 has two NameNodes, nn1 and nn2 -->
  <name>dfs.ha.namenodes.ns1</name>
  <value>nn1,nn2</value>
</property>
<property>
  <!-- RPC address of nn1 -->
  <name>dfs.namenode.rpc-address.ns1.nn1</name>
  <value>BigData01:9000</value>
</property>
<property>
  <!-- HTTP address of nn1 -->
  <name>dfs.namenode.http-address.ns1.nn1</name>
  <value>BigData01:50070</value>
</property>
<property>
  <!-- RPC address of nn2 -->
  <name>dfs.namenode.rpc-address.ns1.nn2</name>
  <value>BigData02:9000</value>
</property>
<property>
  <!-- HTTP address of nn2 -->
  <name>dfs.namenode.http-address.ns1.nn2</name>
  <value>BigData02:50070</value>
</property>
<property>
  <!-- where the NameNode metadata is stored on the JournalNodes -->
  <name>dfs.namenode.shared.edits.dir</name>
  <value>qjournal://BigData01:8485;BigData02:8485;BigData03:8485/ns1</value>
</property>
<property>
  <!-- directory where the JournalNodes keep the edits log -->
  <name>dfs.journalnode.edits.dir</name>
  <value>/opt/hadoopHA/tmp/dfs/journal</value>
</property>
<property>
  <!-- enable automatic NameNode failover -->
  <name>dfs.ha.automatic-failover.enabled</name>
  <value>true</value>
</property>
<property>
  <!-- failover proxy provider implementation -->
  <name>dfs.client.failover.proxy.provider.ns1</name>
  <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
  <!-- fencing method -->
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence</value>
</property>
<property>
  <!-- SSH private key needed by the fencing method -->
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/home/hadoop/.ssh/id_rsa</value>
</property>

vim /opt/hadoopHA/etc/hadoop/yarn-site.xml

<property>
  <!-- node that runs the resourcemanager -->
  <name>yarn.resourcemanager.hostname</name>
  <value>BigData01</value>
</property>
<property>
  <!-- Reducers fetch data via mapreduce_shuffle; the server loaded when the nodemanager starts -->
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
  <value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>

cd /opt/hadoopHA/etc/hadoop
cp mapred-site.xml.template mapred-site.xml
cd
vim /opt/hadoopHA/etc/hadoop/mapred-site.xml

<property>
  <!-- the MR (mapreduce) framework runs on YARN -->
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>

vim /opt/hadoopHA/etc/hadoop/slaves
BigData01
BigData02
BigData03

scp -r /opt/hadoopHA root@BigData02:/opt/
scp -r /opt/hadoopHA root@BigData03:/opt/

On all three nodes, start zookeeper and check its status:
zkServer.sh start
zkServer.sh status
jps    # check the processes
hadoop-daemon.sh start journalnode

On the master node:
hdfs namenode -format
scp -r /opt/hadoopHA/tmp/dfs BigData02:/opt/hadoopHA/tmp/
hadoop-daemon.sh start namenode
On the other NameNode (BigData02):
hdfs namenode -bootstrapStandby
hadoop-daemon.sh start namenode

Back on the master node:
hdfs zkfc -formatZK    # format the ZKFC state in Zookeeper

Start the services:
start-dfs.sh
start-yarn.sh

Check the processes:
jps
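With both NameNodes running, you can also confirm that the HA pair is healthy: hdfs haadmin works against the nameservice ns1 defined in hdfs-site.xml, and one NameNode should report active while the other reports standby.

hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2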
Before shutting the machines down, stop the cluster services:
stop-yarn.sh
stop-dfs.sh

The remaining components are configured in the same way, so screenshots are omitted below.

2. HBase
On BigData01:
tar zxvf /root/Downloads/hbase-1.2.6-bin.tar.gz -C/opt/

vim /etc/profile
export HBASE_HOME=/opt/hbase-1.2.6
export PATH=$PATH:$HBASE_HOME/bin
source /etc/profile

vim /opt/hbase-1.2.6/conf/hbase-env.sh
export JAVA_HOME=/opt/jdk1.8.0_171    (line 27)
export HBASE_MANAGES_ZK=false    (line 128; add the following below this line)
# Configure PermSize. Only needed in JDK7. You can safely remove it for JDK8
export HBASE_MASTER_OPTS="$HBASE_MASTER_OPTS -XX:PermSize=128m -XX:MaxPermSize=128m -XX:ReservedCodeCacheSize=256m"
export HBASE_REGIONSERVER_OPTS="$HBASE_REGIONSERVER_OPTS -XX:PermSize=128m -XX:MaxPermSize=128m -XX:ReservedCodeCacheSize=256m"

vim /opt/hbase-1.2.6/conf/hbase-site.xml

<property>
  <!-- directory where HBase stores its data; default is ${hbase.tmp.dir}/hbase -->
  <!-- the port must match Hadoop's fs.defaultFS -->
  <!-- ns1 is the value of dfs.nameservices in hdfs-site.xml (i.e. it matches Hadoop's fs.defaultFS) -->
  <name>hbase.rootdir</name>
  <value>hdfs://ns1/data/hbase_db</value>
</property>
<property>
  <!-- whether this is a distributed deployment -->
  <name>hbase.cluster.distributed</name>
  <value>true</value>
</property>
<property>
  <name>dfs.support.append</name>
  <value>true</value>
</property>
<property>
  <!-- list of zookeeper nodes -->
  <name>hbase.zookeeper.quorum</name>
  <value>BigData01,BigData02,BigData03</value>
</property>
<property>
  <!-- where zookeeper configuration, logs, etc. are stored -->
  <name>hbase.zookeeper.property.dataDir</name>
  <value>/opt/zookeeper-3.4.12</value>
</property>
<property>
  <name>hbase.zookeeper.property.clientPort</name>
  <value>2181</value>
</property>

vim /opt/hbase-1.2.6/conf/regionservers
BigData01
BigData02
BigData03

vim /opt/hbase-1.2.6/conf/backup-masters
BigData02

scp /opt/hadoopHA/etc/hadoop/hdfs-site.xml /opt/hbase-1.2.6/conf/
scp -r /etc/profile BigData02:/etc/
scp -r /etc/profile BigData03:/etc/
scp -r /opt/hbase-1.2.6 root@BigData02:/opt/
scp -r /opt/hbase-1.2.6 root@BigData03:/opt/

On the two worker nodes:
source /etc/profile

Start zookeeper on every node in turn (zkServer.sh start) and check the firewall:
systemctl stop firewalld

On BigData01:
start-dfs.sh
start-yarn.sh
start-hbase.sh
jps
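In the jps output you would expect HMaster on BigData01 (and on the backup master BigData02) and an HRegionServer on every host listed in regionservers; a quick filter, purely as a sanity check, is:

jps | grep -E 'HMaster|HRegionServer'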
On BigData03:
mr-jobhistory-daemon.sh start historyserver
jps

Open these pages in a browser to check (use your actual IP address):
http://192.168.67.128:16010
http://192.168.67.128:16030

HBase shell commands
(1) Basic shell commands
1. Start the shell and enter the HBase command-line environment
$ hbase shell
[hadoop@BigData01 ~]$ hbase shell
hbase(main):001:0>

2. Check the HBase running status
hbase(main):002:0> status
1 active master, 1 backup masters, 3 servers, 0 dead, 0.6667 average load

3. Check the version
hbase(main):003:0> version

4. Get help
hbase(main):004:0> help

5. Exit the shell
hbase(main):005:0> exit

(2) DDL commands
1. Create a table
create '<table name>', '<column family>', ... (here the table is student, with column families address and info)
hbase(main):001:0> create 'student', 'address', 'info'
0 row(s) in 2.9230 seconds
=> Hbase::Table - student

2. List all tables
hbase(main):002:0> list
TABLE
student
1 row(s) in 0.0920 seconds
=> ["student"]

3. Describe the table structure
hbase(main):003:0> describe 'student'
Table student is ENABLED
student
COLUMN FAMILIES DESCRIPTION
{NAME => 'address', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE',
DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true',
BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
{NAME => 'info', BLOOMFILTER => 'ROW', VERSIONS => '1', IN_MEMORY => 'false', KEEP_DELETED_CELLS => 'FALSE',
DATA_BLOCK_ENCODING => 'NONE', TTL => 'FOREVER', COMPRESSION => 'NONE', MIN_VERSIONS => '0', BLOCKCACHE => 'true',
BLOCKSIZE => '65536', REPLICATION_SCOPE => '0'}
2 row(s) in 0.1420 seconds

4. To alter the table structure, first disable the table
hbase(main):004:0> disable 'student'
0 row(s) in 2.4820 seconds

4.1) Add a column family
hbase(main):005:0> alter 'student', NAME => 'cf3', VERSIONS => 5
Updating all regions with the new schema...
1/1 regions updated.
Done.
0 row(s) in 2.3320 seconds

4.2) Delete a column family
hbase(main):007:0> alter 'student', NAME => 'cf3', METHOD => 'delete'
Updating all regions with the new schema...
1/1 regions updated.
Done.
0 row(s) in 2.2480 seconds

4.3) Re-enable the table
enable 'student'

5. Check whether a table exists
exists 'student'

6. Check whether a table is enabled
is_enabled 'student'

7. Check whether a table is disabled
is_disabled 'student'

8. Delete a table
First disable the table, then drop it:
disable 'test'
drop 'test'

(3) DML commands
Assume the student table has column families address = {province, city, university} and info = {height, weight, birthday, telephone, qq}, created with create 'student', 'address', 'info'. The row key is the name (a student number could also be used as the row key if needed).

1. Insert a record: put '<table>', '<row key>', '<column family:column>', '<value>'
put 'student','zhangsan','info:height','180'
put 'student','zhangsan','info:birthday','1990-01-20'
put 'student','zhangsan','info:weight','70'
put 'student','zhangsan','address:province','Hubei'
put 'student','zhangsan','address:city','Wuhan'
put 'student','zhangsan','address:university','Wenhua College'

2. Get one row of data: get '<table>', '<row key>'
hbase(main):011:0> get 'student', 'zhangsan'
COLUMN                 CELL
 address:city          timestamp=1521772686458, value=Wuhan
 address:province      timestamp=1521772681481, value=Hubei
 address:university    timestamp=1521772690856, value=Wenhua College
 info:birthday         timestamp=1521772670610, value=1990-01-20
 info:height           timestamp=1521772660840, value=180
 info:weight           timestamp=1521772675096, value=70
6 row(s) in 0.1980 seconds

3. Get all data in one column family for one row key
get 'student', 'zhangsan', 'info'

4. Get all data in one column of one column family for one row key
get 'student', 'zhangsan', 'info:birthday'

5. Update a record
put 'student', 'zhangsan', 'info:weight', '75'

6. Read data with a full-table scan
scan 'student'

7. Count the number of rows (row keys) in the table
count 'student'

8. Empty the whole table
truncate 'student'

9. Delete the value of one column for a given row key
delete 'student', 'zhangsan', 'info:weight'

(4) Running an HBase shell script
The commands can be written into a file, e.g. testHbaseData.sh, and then executed from the Linux shell:
$ hbase shell testHbaseData.sh
For example, testHbaseData.sh could contain:
put 'student','lisi','info:height','170'
put 'student','lisi','info:birthday','1991-06-20'
put 'student','lisi','info:weight','65'
put 'student','lisi','address:province','Hubei'
put 'student','lisi','address:city','Wuhan'
put 'student','lisi','address:university','Wuhan University'

3. Hive and MySQL
tar zxvf /root/Downloads/apache-hive-2.1.1-bin.tar.gz -C/opt/
mv /opt/apache-hive-2.1.1-bin/ /opt/hive

Stop the firewall and disable it on boot:
systemctl stop firewalld
systemctl disable firewalld

Remove the pre-installed MariaDB packages (use whatever version numbers the query prints):
rpm -qa | grep mariadb
rpm -e --nodeps mariadb-libs-5.5.65-1.el7.x86_64
rpm -e --nodeps mariadb-5.5.68-1.el7.x86_64
rpm -e --nodeps mariadb-libs-5.5.68-1.el7.x86_64

Install MySQL:
cd /opt/
mkdir mysql
cd
tar xvf /root/Downloads/mysql-5.7.26-1.el7.x86_64.rpm-bundle.tar -C/opt/mysql
cd /opt/mysql/
rpm -ivh mysql-community-common-5.7.26-1.el7.x86_64.rpm
rpm -ivh mysql-community-libs-5.7.26-1.el7.x86_64.rpm
rpm -ivh mysql-community-libs-compat-5.7.26-1.el7.x86_64.rpm
rpm -ivh mysql-community-client-5.7.26-1.el7.x86_64.rpm
rpm -ivh mysql-community-server-5.7.26-1.el7.x86_64.rpm

Check the installation:
rpm -qa | grep mysql

Edit the configuration file (vim /etc/my.cnf) and add the following below the symbolic-links=0 line:
default-storage-engine=innodb
innodb_file_per_table
collation-server=utf8_general_ci
init-connect='SET NAMES utf8'
character-set-server=utf8

Start the MySQL service:
mysqld --initialize --user=mysql
systemctl start mysqld
systemctl status mysqld    (it should show active (running) in green)

cat /var/log/mysqld.log | grep password    (find the default password and copy it; here it was :s/!:!8:kNrf)
mysql -uroot -p    (enter the copied password)
set password=password('123456');    (change the password)
update mysql.user set host='%' where user='root';    (allow the root user to log in remotely from any host)
flush privileges;    (reload the privilege tables)
quit;    (exit and log in again)
mysql -uroot -p    (log in with the new password; if this succeeds, MySQL is configured correctly and you can exit)

Hive configuration:
mv /root/Downloads/mysql-connector-java-5.1.46-bin.jar /opt/hive/lib/

Edit the Hive environment variables:
vim /etc/profile
export HIVE_HOME=/opt/hive
export HIVE_CONF_DIR=$HIVE_HOME/conf
export HCAT_HOME=$HIVE_HOME/hcatalog
export PATH=$PATH:$HIVE_HOME/bin
source /etc/profile

Configure Hive:
cd /opt/hive/conf/
cp hive-default.xml.template hive-site.xml
vim /opt/hive/conf/hive-site.xml

<name>javax.jdo.option.ConnectionPassword</name>
<value>123456</value>
<description>password to use against metastore database</description>

<property>
  <name>javax.jdo.option.ConnectionURL</name>
  <value>jdbc:mysql://BigData01:3306/hive?createDatabaseIfNotExist=true&amp;useSSL=false</value>
  <description>JDBC connect string for a JDBC metastore</description>
</property>
(lines 484-488; line 684: set the value to true)

<property>
  <name>javax.jdo.option.ConnectionDriverName</name>
  <value>com.mysql.jdbc.Driver</value>
  <description>Driver class name for a JDBC metastore</description>
</property>
(lines 928-932)

<property>
  <name>javax.jdo.option.ConnectionUserName</name>
  <value>root</value>
  <description>Username to use against metastore database</description>
</property>
(lines 953-957)

<name>hive.exec.scratchdir</name>
<value>/opt/hive/tmp</value>
<description>Location of Hive run time structured log file</description>
(lines 1513-1514)

<name>hive.exec.local.scratchdir</name>
<value>/opt/hive/tmp</value>

<name>hive.downloaded.resources.dir</name>
<value>/opt/hive/tmp/resources</value>

<name>hive.server2.logging.operation.log.location</name>
<value>/opt/hive/tmp/operation_logs</value>

Create the Hive scratch directory:
mkdir /opt/hive/tmp

Add the Hadoop proxy-user (remote login) settings to the Hadoop configuration:
vim /opt/hadoopHA/etc/hadoop/core-site.xml

<property>
  <name>hadoop.proxyuser.hadoop.hosts</name>
  <value>*</value>
</property>
<property>
  <name>hadoop.proxyuser.hadoop.groups</name>
  <value>*</value>
</property>

scp -r /opt/hadoopHA/etc/hadoop/core-site.xml root@BigData02:/opt/hadoopHA/etc/hadoop/
scp -r /opt/hadoopHA/etc/hadoop/core-site.xml root@BigData03:/opt/hadoopHA/etc/hadoop/

Initialize the Hive metastore:
cd /opt/hive/lib/
ll
mv log4j-slf4j-impl-2.4.1.jar log4j-slf4j-impl-2.4.1.jar.bak
schematool -initSchema -dbType mysql    (if this reports an error, read the message; it is most likely a mistake in a configuration file)
mysql -uroot -p
show databases;    (if a hive database appears, the configuration succeeded)

To start Hive, first start all required services and stop the firewall:
systemctl stop firewalld
stop-all.sh
start-all.sh
Start zookeeper on each node in turn (zkServer.sh start) and check the firewall.
On BigData01:
start-dfs.sh
start-yarn.sh
start-hbase.sh
jps
On BigData03:
mr-jobhistory-daemon.sh start historyserver

Once all of these prerequisite services are up, the metastore has to be started the first time Hive is launched:
hive --service metastore
If WARNING messages appear, it has started successfully; you can then launch the Hive CLI with: hive
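As a quick sanity check that Hive can reach the MySQL-backed metastore, a one-off query from the Linux shell is enough; this assumes the metastore service started above is still running, and test_db is just a throwaway name used for illustration:

hive -e "show databases;"
hive -e "create database if not exists test_db; show databases;"

If the default database (and test_db) are listed, the metastore connection is working.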
4. Sqoop
tar -zxvf /root/Downloads/sqoop-1.4.7.bin__hadoop-2.6.0.tar.gz -C/opt/
mv /opt/sqoop-1.4.7.bin__hadoop-2.6.0/ /opt/sqoop

vim /etc/profile
export SQOOP_HOME=/opt/sqoop
export PATH=$PATH:$SQOOP_HOME/bin
source /etc/profile

scp /opt/hive/lib/mysql-connector-java-5.1.46-bin.jar /opt/sqoop/lib/
cp sqoop-env-template.sh sqoop-env.sh
vim sqoop-env.sh
#Set path to where bin/hadoop is available
export HADOOP_COMMON_HOME=/opt/hadoopHA
#Set path to where hadoop-*-core.jar is available
export HADOOP_MAPRED_HOME=/opt/hadoopHA
#set the path to where bin/hbase is available
export HBASE_HOME=/opt/hbase-1.2.6
#Set the path to where bin/hive is available
export HIVE_HOME=/opt/hive
#Set the path for where zookeper config dir is
export ZOOCFGDIR=/opt/zookeeper
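A simple way to confirm that Sqoop can see both Hadoop and MySQL is to list the MySQL databases through the JDBC driver copied into its lib directory; the credentials below assume the root password (123456) set during the MySQL configuration, so adjust them if yours differ:

sqoop list-databases --connect jdbc:mysql://BigData01:3306 --username root --password 123456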
5. Scala and Spark
cp /opt/spark-3.2.1/conf/spark-env.sh.template /opt/spark-3.2.1/conf/spark-env.sh
cp /opt/spark-3.2.1/conf/workers.template /opt/spark-3.2.1/conf/workers

vim /opt/spark-3.2.1/conf/spark-env.sh
export SCALA_HOME=/opt/scala-2.12.15
export JAVA_HOME=/opt/jdk1.8.0_171
export SPARK_MASTER_IP=BigData01
export SPARK_WORKER_CORES=2
export SPARK_WORKER_MEMORY=2g
export HADOOP_CONF_DIR=/opt/hadoopHA
#export SPARK_MASTER_WEBUI_PORT=8080
#export SPARK_MASTER_PORT=7070

vim /opt/spark-3.2.1/conf/workers
BigData02
BigData03

scp -r /opt/spark-3.2.1/ BigData02:/opt/
scp -r /opt/spark-3.2.1/ BigData03:/opt/

vim /etc/profile
export SPARK_HOME=/opt/spark-3.2.1
export PATH=$PATH:${SPARK_HOME}/bin
export PATH=$PATH:${SPARK_HOME}/sbin
source /etc/profile

On the master node:
cd /opt/spark-3.2.1/sbin/
./start-all.sh
On each of the three nodes:
jps

III. Result
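Besides jps (Master on BigData01, Worker on BigData02 and BigData03), the Spark master web UI on BigData01 (port 8080, since the web UI port above was left commented out) should list both workers, and a short interactive session is an easy end-to-end check; the master URL below assumes Spark's default port 7077, since SPARK_MASTER_PORT was also left at its default:

/opt/spark-3.2.1/bin/spark-shell --master spark://BigData01:7077

Inside the shell, a small job such as sc.parallelize(1 to 100).sum() returning 5050.0 confirms that tasks are actually being executed on the cluster.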
Extension: Hadoop ecosystem components

Component and its role:
HDFS (Hadoop Distributed File System): stores and manages large data sets, providing highly reliable, scalable, high-throughput data storage.
MapReduce: distributed computing framework for processing large data sets in parallel, enabling computation and analysis over huge volumes of data.
YARN (Yet Another Resource Negotiator): cluster resource manager that schedules and manages the cluster's compute resources, allowing multiple tenants to run different jobs in parallel.
Hive: data-warehouse infrastructure on top of Hadoop providing a SQL-like query language (HiveQL) for processing and analysing structured data.
Pig: dataflow programming language and execution environment for transforming and analysing data on Hadoop, simplifying big-data processing.
HBase: distributed column-oriented database for unstructured and semi-structured data, offering highly reliable, high-performance storage and access.
ZooKeeper: distributed coordination service used to maintain server state, store configuration, and provide naming and cluster management.
Spark: fast, general-purpose big-data processing engine that can work in memory, giving efficient data analysis and computation.
Sqoop: efficient data transfer between Hadoop and traditional databases, supporting both import and export.
Oozie: workflow scheduling system for defining and managing Hadoop job workflows, automating their execution and scheduling.
Flume: distributed, reliable, highly available service for efficiently collecting, aggregating and moving large volumes of log data.
Ambari: management tool for installing, configuring, monitoring and managing Hadoop clusters through an intuitive user interface.
Tez: framework on top of Hadoop YARN that optimises execution speed so that engines such as Hive and Pig can process data faster.
Flink: open-source platform for stream and batch processing that can run on Hadoop, providing efficient data processing and analysis.