网站添加视频代码,移动端优化,百度营销,wordpress导航列表目录
Hlog WALs和oldWALs
整体流程
HMaster 初始化
定时执行
LogCleaner 日志清理类
ReplicationLogCleaner 日志清理类
总结
Hlog WALs和oldWALs
这里先介绍一下Hlog失效和Hlog删除的规则
HLog失效#xff1a;写入数据一旦从MemStore中刷新到磁盘#xff0c;…目录
Hlog WALs和oldWALs
整体流程
HMaster 初始化
定时执行
LogCleaner 日志清理类
ReplicationLogCleaner 日志清理类
总结
Hlog WALs和oldWALs
这里先介绍一下Hlog失效和Hlog删除的规则
HLog失效写入数据一旦从MemStore中刷新到磁盘HLog默认存储目录在/hbase/WALs下就会自动把数据移动到 /hbase/oldWALs 目录下此时并不会删除
Hlog删除Master启动时会启动一个线程定期去检查oldWALs目录下的可删除文件进行删除定期检查时间为 hbase.master.cleaner.interval 默认是1分钟 删除条件有两个 1.Hlog文件在参与主从复制否的话删除是的话不删除 2.Hlog文件是否在目录中存在 hbase.master.logcleaner.ttl 时间如果是则删除
整体流程 pos 格式流程图下载地址 链接https://pan.baidu.com/s/1szhpVn7RyegE0yqQedACIA 提取码ig9x 这里只介绍与wal相关的流程一下介绍的代码都在上图中标记类名方法名以及说明可以直接从源码中查看
HMaster 初始化
HMaster启动初始化 HMaster构造方法调用 startActiveMasterManager 方法
startActiveMasterManager 方法 调用 finishActiveMasterInitialization(status); 方法
在 finishActiveMasterInitialization 方法中会启动所有服务线程代码段如下
// start up all service threads.
status.setStatus(Initializing master service threads);
startServiceThreads(); startServiceThreads 方法代码如下 /** Start up all services. If any of these threads gets an unhandled exception* then they just die with a logged message. This should be fine because* in general, we do not expect the master to get such unhandled exceptions* as OOMEs; it should be lightly loaded. See what HRegionServer does if* need to install an unexpected exception handler.*/private void startServiceThreads() throws IOException{// Start the executor service poolsthis.service.startExecutorService(ExecutorType.MASTER_OPEN_REGION,conf.getInt(hbase.master.executor.openregion.threads, 5));this.service.startExecutorService(ExecutorType.MASTER_CLOSE_REGION,conf.getInt(hbase.master.executor.closeregion.threads, 5));this.service.startExecutorService(ExecutorType.MASTER_SERVER_OPERATIONS,conf.getInt(hbase.master.executor.serverops.threads, 5));this.service.startExecutorService(ExecutorType.MASTER_META_SERVER_OPERATIONS,conf.getInt(hbase.master.executor.serverops.threads, 5));this.service.startExecutorService(ExecutorType.M_LOG_REPLAY_OPS,conf.getInt(hbase.master.executor.logreplayops.threads, 10));// We depend on there being only one instance of this executor running// at a time. To do concurrency, would need fencing of enable/disable of// tables.// Any time changing this maxThreads to 1, pls see the comment at// AccessController#postCreateTableHandlerthis.service.startExecutorService(ExecutorType.MASTER_TABLE_OPERATIONS, 1);startProcedureExecutor();// Initial cleaner choreCleanerChore.initChorePool(conf);// Start log cleaner thread//获取定时日志清理时间从系统配置获取默认为10分钟int cleanerInterval conf.getInt(hbase.master.cleaner.interval, 60 * 1000);this.logCleaner new LogCleaner(cleanerInterval,this, conf, getMasterFileSystem().getOldLogDir().getFileSystem(conf),getMasterFileSystem().getOldLogDir());//将任务加入定时执行时间间隔为 cleanerInterval 该值在LogCleaner中已经设置为定时执行间隔getChoreService().scheduleChore(logCleaner);//start the hfile archive cleaner threadPath archiveDir HFileArchiveUtil.getArchivePath(conf);MapString, Object params new HashMapString, Object();params.put(MASTER, this);this.hfileCleaner new HFileCleaner(cleanerInterval, this, conf, getMasterFileSystem().getFileSystem(), archiveDir, params);getChoreService().scheduleChore(hfileCleaner);serviceStarted true;if (LOG.isTraceEnabled()) {LOG.trace(Started service threads);}if (!conf.getBoolean(HConstants.ZOOKEEPER_USEMULTI, true)) {try {replicationZKLockCleanerChore new ReplicationZKLockCleanerChore(this, this,cleanerInterval, this.getZooKeeper(), this.conf);getChoreService().scheduleChore(replicationZKLockCleanerChore);} catch (Exception e) {LOG.error(start replicationZKLockCleanerChore failed, e);}}try {replicationZKNodeCleanerChore new ReplicationZKNodeCleanerChore(this, cleanerInterval,new ReplicationZKNodeCleaner(this.conf, this.getZooKeeper(), this));getChoreService().scheduleChore(replicationZKNodeCleanerChore);} catch (Exception e) {LOG.error(start replicationZKNodeCleanerChore failed, e);}}
定时执行
其中这段代码是对我们HLog进行处理并加入调度定时执行 // Initial cleaner choreCleanerChore.initChorePool(conf);// Start log cleaner thread//获取定时日志清理时间从系统配置获取默认为10分钟int cleanerInterval conf.getInt(hbase.master.cleaner.interval, 60 * 1000);this.logCleaner new LogCleaner(cleanerInterval,this, conf, getMasterFileSystem().getOldLogDir().getFileSystem(conf),getMasterFileSystem().getOldLogDir());//将任务加入定时执行时间间隔为 cleanerInterval 该值在LogCleaner中已经设置为定时执行间隔getChoreService().scheduleChore(logCleaner); 加入调度后会周期性执行 LogCleaner.chore() 方法(在父类CleanerChore中) Overrideprotected void chore() {if (getEnabled()) {try {POOL.latchCountUp();if (runCleaner()) {if (LOG.isTraceEnabled()) {LOG.trace(Cleaned all WALs under oldFileDir);}} else {if (LOG.isTraceEnabled()) {LOG.trace(WALs outstanding under oldFileDir);}}} finally {POOL.latchCountDown();}// After each cleaner chore, checks if received reconfigure notification while cleaning.// First in cleaner turns off notification, to avoid another cleaner updating pool again.if (POOL.reconfigNotification.compareAndSet(true, false)) {// This cleaner is waiting for other cleaners finishing their jobs.// To avoid missing next chore, only wait 0.8 * period, then shutdown.POOL.updatePool((long) (0.8 * getTimeUnit().toMillis(getPeriod())));}} else {LOG.trace(Cleaner chore disabled! Not cleaning.);}}
上面代码中的runCleaner方法就是将我们CleanerTask加入任务队列中 public Boolean runCleaner() {CleanerTask task new CleanerTask(this.oldFileDir, true);POOL.submit(task);return task.join();}
LogCleaner 日志清理类 LogCleaner类是清理日志数据LogCleaner 父类 CleanerChore 类中的 私有类CleanerTask该类继承RecursiveTask类不做过多介绍想了解的可以百度 ForkJoinTask , 的 compute()方法是定时清理的关键这里获取了所有oldWALs目录下的文件并进行选择性删除
Overrideprotected Boolean compute() {LOG.trace(Cleaning under dir);ListFileStatus subDirs;ListFileStatus tmpFiles;final ListFileStatus files;try {// if dir doesnt exist, well get null back for both of these// which will fall through to succeeding.subDirs FSUtils.listStatusWithStatusFilter(fs, dir, new FileStatusFilter() {Overridepublic boolean accept(FileStatus f) {return f.isDirectory();}});if (subDirs null) {subDirs Collections.emptyList();}//获取oldWALs目录下文件tmpFiles FSUtils.listStatusWithStatusFilter(fs, dir, new FileStatusFilter() {Overridepublic boolean accept(FileStatus f) {return f.isFile();}});files tmpFiles null ? Collections.FileStatusemptyList() : tmpFiles;} catch (IOException ioe) {LOG.warn(failed to get FileStatus for contents of dir , ioe);return false;}boolean allFilesDeleted true;if (!files.isEmpty()) {allFilesDeleted deleteAction(new ActionBoolean() {Overridepublic Boolean act() throws IOException {//files 是oldWALs目录下所有文件return checkAndDeleteFiles(files);}}, files);}boolean allSubdirsDeleted true;if (!subDirs.isEmpty()) {final ListCleanerTask tasks Lists.newArrayListWithCapacity(subDirs.size());for (FileStatus subdir : subDirs) {CleanerTask task new CleanerTask(subdir, false);tasks.add(task);//任务task.fork();}allSubdirsDeleted deleteAction(new ActionBoolean() {Overridepublic Boolean act() throws IOException {return getCleanResult(tasks);}}, subdirs);}boolean result allFilesDeleted allSubdirsDeleted;// if and only if files and subdirs under current dir are deleted successfully, and// it is not the root dir, then task will try to delete it.if (result !root) {result deleteAction(new ActionBoolean() {Overridepublic Boolean act() throws IOException {return fs.delete(dir, false);}}, dir);}return result;} 上一步中调用了 checkAndDeleteFiles(files) 方法该方法的作用是通过每个清理程序运行给定的文件以查看是否应删除该文件并在必要时将其删除。输入参数是所有oldWALs目录下的文件 /*** Run the given files through each of the cleaners to see if it should be deleted, deleting it if* necessary.* 通过每个清理程序运行给定的文件以查看是否应删除该文件并在必要时将其删除。* param files List of FileStatus for the files to check (and possibly delete)* return true iff successfully deleted all files*/private boolean checkAndDeleteFiles(ListFileStatus files) {if (files null) {return true;}// first check to see if the path is validListFileStatus validFiles Lists.newArrayListWithCapacity(files.size());ListFileStatus invalidFiles Lists.newArrayList();for (FileStatus file : files) {if (validate(file.getPath())) {validFiles.add(file);} else {LOG.warn(Found a wrongly formatted file: file.getPath() - will delete it.);invalidFiles.add(file);}}IterableFileStatus deletableValidFiles validFiles;// check each of the cleaners for the valid filesfor (T cleaner : cleanersChain) {if (cleaner.isStopped() || getStopper().isStopped()) {LOG.warn(A file cleaner this.getName() is stopped, wont delete any more files in: this.oldFileDir);return false;}IterableFileStatus filteredFiles cleaner.getDeletableFiles(deletableValidFiles);// trace which cleaner is holding on to each fileif (LOG.isTraceEnabled()) {ImmutableSetFileStatus filteredFileSet ImmutableSet.copyOf(filteredFiles);for (FileStatus file : deletableValidFiles) {if (!filteredFileSet.contains(file)) {LOG.trace(file.getPath() is not deletable according to: cleaner);}}}deletableValidFiles filteredFiles;}IterableFileStatus filesToDelete Iterables.concat(invalidFiles, deletableValidFiles);return deleteFiles(filesToDelete) files.size();}
ReplicationLogCleaner 日志清理类
checkAndDeleteFiles方法中 又调用了 cleaner.getDeletableFiles(deletableValidFiles) getDeletableFiles方法在ReplicationLogCleaner类下是判断哪些文件该删除哪些不该删除删除条件就是文章开头提出的是否在参与复制中如果在参与则不删除不在则删除。 注所有在参与peer的数据都在 zookeeper 中 /hbase/replication/rs 目录下存储 比如在zookeeper目录下有这么个节点 /hbase/replication/rs/jast.zh,16020,1576397142865/Indexer_account_indexer_prd/jast.zh%2C16020%2C1576397142865.jast.zh%2C16020%2C1576397142865.regiongroup-0.1579283025645 那么我们再oldWALs目录下是不会删除掉这个数据的 [jastjast002 ~]$ hdfs dfs -du -h /hbase/oldWALs/jast015.zh%2C16020%2C1576397142865.jast015.zh%2C16020%2C1576397142865.regiongroup-0.1579283025645
256.0 M 512.0 M /hbase/oldWALs/jast015.zh%2C16020%2C1576397142865.jast015.zh%2C16020%2C1576397142865.regiongroup-0.1579283025645 Overridepublic IterableFileStatus getDeletableFiles(IterableFileStatus files) {// all members of this class are null if replication is disabled,// so we cannot filter the filesif (this.getConf() null) {return files;}final SetString wals;try {// The concurrently created new WALs may not be included in the return list,// but they wont be deleted because theyre not in the checking set.wals loadWALsFromQueues();} catch (KeeperException e) {LOG.warn(Failed to read zookeeper, skipping checking deletable files);return Collections.emptyList();}return Iterables.filter(files, new PredicateFileStatus() {Overridepublic boolean apply(FileStatus file) {String wal file.getPath().getName();//包含文件则保留不包含则删除boolean logInReplicationQueue wals.contains(wal);if (LOG.isDebugEnabled()) {if (logInReplicationQueue) {//包含文件保留LOG.debug(Found log in ZK, keeping: wal);} else {//不包含删除LOG.debug(Didnt find this log in ZK, deleting: wal);}}return !logInReplicationQueue;}});}
上一步调用了 loadWALsFromQueues 方法该方法作用是获取所有在复制队列中的wals文件并返回
/*** Load all wals in all replication queues from ZK. This method guarantees to return a* snapshot which contains all WALs in the zookeeper at the start of this call even there* is concurrent queue failover. However, some newly created WALs during the call may* not be included.** 从ZK加载所有复制队列中的所有wals。 即使存在并发队列故障转移* 此方法也保证在此调用开始时返回包含zookeeper中所有WAL的快照。* 但是可能不会包括通话过程中一些新创建的WAL。*/private SetString loadWALsFromQueues() throws KeeperException {for (int retry 0; ; retry) {int v0 replicationQueues.getQueuesZNodeCversion();ListString rss replicationQueues.getListOfReplicators();if (rss null || rss.isEmpty()) {LOG.debug(Didnt find any region server that replicates, wont prevent any deletions.);return ImmutableSet.of();}SetString wals Sets.newHashSet();for (String rs : rss) {//加载zookeeper下/hbase/replication/rs 目录下所有数据ListString listOfPeers replicationQueues.getAllQueues(rs);// if rs just died, this will be nullif (listOfPeers null) {continue;}//加载所有目录for (String id : listOfPeers) {ListString peersWals replicationQueues.getLogsInQueue(rs, id);if (peersWals ! null) {wals.addAll(peersWals);}}}int v1 replicationQueues.getQueuesZNodeCversion();if (v0 v1) {return wals;}LOG.info(String.format(Replication queue node cversion changed from %d to %d, retry %d,v0, v1, retry));}}
总结
至此我们可以发现删除的过程就是定期执行删除文件线程从oldWALs获取所有文件如果在peer复制队列中则不进行副本删除否则则删除