Ten, HDFS NameNode working mechanism


One, fsimage and edits files

1, basic concepts

txid:
The namenode assigns a unique id to every operation event (add, delete, or modify). This id is called the txid. It generally counts up from 0, and each additional operation increments it by 1.

fsimage:
It is an image, on the local disk, of the metadata the namenode holds in memory. However, fsimage usually does not contain the most recent operation events, so in practice there is still a gap between it and the in-memory metadata. What it records is not an operation log: it contains the serialized information of all directories and file inodes of the HDFS file system. It is generally named fsimage_txid, where the trailing txid is the txid of the latest operation event recorded by that fsimage.

edits:
It is an operation log file that records the add, delete, and modify operations performed on the namenode. If you know MySQL, it is somewhat similar to MySQL's binary log.
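The relationship between the two files can be sketched as "snapshot plus log replay". The following is only a toy model under stated assumptions: `Namespace`, `apply_edit`, and `load` are hypothetical names, the metadata is reduced to a set of paths, and only "add" and "delete" operations exist. It is not how the namenode is actually implemented.

```python
from dataclasses import dataclass, field


@dataclass
class Namespace:
    """Toy stand-in for the namenode's in-memory metadata (hypothetical)."""
    paths: set = field(default_factory=set)
    last_txid: int = 0


def apply_edit(ns, txid, op, path):
    """Apply one logged operation; txids must arrive in order, without gaps."""
    if txid != ns.last_txid + 1:
        raise ValueError(f"expected txid {ns.last_txid + 1}, got {txid}")
    if op == "add":
        ns.paths.add(path)
    elif op == "delete":
        ns.paths.discard(path)
    else:
        raise ValueError(f"unknown operation {op!r}")
    ns.last_txid = txid


def load(fsimage, edits):
    """Start from the snapshot, then replay only the edits newer than it."""
    ns = Namespace(set(fsimage["paths"]), fsimage["txid"])
    for txid, op, path in edits:
        if txid > ns.last_txid:
            apply_edit(ns, txid, op, path)
    return ns


# The snapshot was taken after txid 2; txids 3 and 4 exist only in the log.
fsimage = {"txid": 2, "paths": {"/test", "/tmp"}}
edits = [
    (1, "add", "/test"),
    (2, "add", "/tmp"),
    (3, "add", "/test/a.xml"),
    (4, "delete", "/tmp"),
]
ns = load(fsimage, edits)
print(sorted(ns.paths), ns.last_txid)
```

The snapshot alone lags behind the real state; only the snapshot together with the newer edits reproduces it.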

2, namenode directory structure

[[email protected] tmp]# tree dfs/name
dfs/name
├── current
│   ├── edits_0000000000000000001-0000000000000000002
│   ├── edits_0000000000000000003-0000000000000000004
│   ├── edits_0000000000000000005-0000000000000000006
│   ├── edits_0000000000000000007-0000000000000000008
│   ├── edits_0000000000000000009-0000000000000000009
│   ├── edits_0000000000000000010-0000000000000000011
│   ├── edits_0000000000000000012-0000000000000000013
│   ├── edits_0000000000000000014-0000000000000000015
│   ├── edits_0000000000000000016-0000000000000000017
│   ├── edits_0000000000000000018-0000000000000000019
│   ├── edits_0000000000000000020-0000000000000000021
│   ├── edits_0000000000000000022-0000000000000000024
│   ├── edits_0000000000000000025-0000000000000000026
│   ├── edits_inprogress_0000000000000000027
│   ├── fsimage_0000000000000000024
│   ├── fsimage_0000000000000000024.md5
│   ├── fsimage_0000000000000000026
│   ├── fsimage_0000000000000000026.md5
│   ├── seen_txid
│   └── VERSION
└── in_use.lock

In summary, it is actually simplified to the following structure:

dfs/name
├── current
│   ├── edits_txid1-txid2                  there may be several; these are old edits files produced by rolling
│   ├── edits_inprogress_txid3             the edits file currently in use
│   ├── fsimage_0000000000000000024        fsimage file
│   ├── fsimage_0000000000000000024.md5    md5 checksum of the fsimage file
│   ├── seen_txid                          records the latest txid
│   └── VERSION                            records some basic information about the HDFS cluster
└── in_use.lock                            lock file that prevents this directory from being used to start multiple namenodes

(1) The content of the VERSION file

# An HDFS cluster may have multiple namenodes. Different namenodes have different namespaceIDs, and each manages its own group of blockpoolIDs
namespaceID=983105879
# Cluster ID, globally unique
clusterID=CID-c12b7022-0c51-49c5-942f-edc889d37fee
# Creation time of the namenode storage directory. For a newly created storage system this attribute is 0, but after the file system is upgraded the value is updated to the new timestamp
cTime=1558262787574
# Marks whether the storage directory belongs to a namenode or a datanode
storageType=NAME_NODE
# A block pool ID identifies a block pool and is globally unique across clusters. When a new namespace is created (as part of formatting), a unique ID is generated and persisted. Generating a globally unique blockpoolID at creation time is more reliable than configuring one by hand. The NN persists the blockpoolID to disk and loads and reuses it on every subsequent startup.
blockpoolID=BP-473222668-192.168.50.121-1558262787574
# Version of the layout of HDFS's persistent on-disk data structures, a negative integer; it rarely needs attention in day-to-day use
layoutVersion=-63

(2) seen_txid
This file records the latest txid.

(3) The directory structure of the SNN (SecondaryNameNode) is the same as that of the namenode, except that it lacks some of the most recent edits files.

3. The relationship between the naming of the fsimage and edits files

The file names of the fsimage and edits files above all end in a long string of digits. What is it? It is the txid, and the two naming schemes follow some rules.

edits file:
Finished edits files are named edits_00000xxx-000000xxx, which gives the range of txids of the operation events recorded in that file. edits_inprogress_00000xxx is the edits file currently in use, and its number is the first txid of the segment now being written.

fsimage file:
Named fsimage_000000xxx, where the number is the txid of the latest event recorded in that fsimage file. Note that edits files are merged into the fsimage only when a checkpoint condition is triggered; otherwise no merge happens. So under normal circumstances, the txid in the edits file names is larger than the one in the fsimage file name.
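The naming rules above are enough to tell which edits files are not yet covered by the newest fsimage. The following is a minimal sketch, assuming only the file-name patterns shown in the directory listing; `pending_edits` and the helper `t` are hypothetical names and not part of any Hadoop API.

```python
import re

EDITS_RE = re.compile(r"^edits_(\d+)-(\d+)$")
INPROGRESS_RE = re.compile(r"^edits_inprogress_(\d+)$")
FSIMAGE_RE = re.compile(r"^fsimage_(\d+)$")  # does not match the .md5 files


def pending_edits(names):
    """Return (newest fsimage txid, edits files holding txids beyond it)."""
    fsimage_txids = [int(m.group(1)) for m in map(FSIMAGE_RE.match, names) if m]
    latest = max(fsimage_txids, default=0)
    pending = []
    for name in sorted(names):
        finished = EDITS_RE.match(name)
        if finished and int(finished.group(2)) > latest:
            # Finished segment whose last txid is newer than the fsimage
            pending.append(name)
        elif INPROGRESS_RE.match(name):
            # The segment being written is always newer than any fsimage
            pending.append(name)
    return latest, pending


def t(n):
    """Zero-pad a txid to the 19 digits used in the file names."""
    return str(n).zfill(19)


# The tail of the directory listing shown earlier.
names = [
    f"edits_{t(22)}-{t(24)}", f"edits_{t(25)}-{t(26)}",
    f"edits_inprogress_{t(27)}",
    f"fsimage_{t(24)}", f"fsimage_{t(24)}.md5",
    f"fsimage_{t(26)}", f"fsimage_{t(26)}.md5",
    "seen_txid", "VERSION",
]
latest, pending = pending_edits(names)
print(latest, pending)
```

For that listing, the newest fsimage ends at txid 26, so only the in-progress segment starting at txid 27 remains to be merged.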

4. View the contents of the fsimage file

//Format: hdfs oiv -p output-format -i input-file -o output-file
[[email protected] current]# hdfs oiv -p XML -i fsimage_0000000000000000037 -o /tmp/fsimage37.xml

As mentioned earlier, fsimage mainly records metadata: it describes the directory structure of HDFS, the files stored in each directory, and the metadata of those directories and files. Let's look at part of the output:



-63                     layout version, the same value as layoutVersion in VERSION
1
17e75c2a11685af3e043aa5e604dc831e5b14674

983105879               namespace ID, the same value as namespaceID in VERSION
1000
1014
0
1073741837
334

16407
16

Here is the important part: the records below hold the directory structure and its metadata.

16386
DIRECTORY               this is a directory; its name is test
test
1558263065070           modification time
root:supergroup:0755    permissions
-1
-1

16387                   this is a file named edit_new.xml
FILE
edit_new.xml
2
1558263065045
1558269494520
134217728
root:supergroup:0644    permissions
                        below is the block information: which blocks the file consists of
1073741825
1001
580

0


From the fsimage information above, we can see that it records the directory structure of the current file system and the corresponding metadata. The difference from edits is that edits records the operations performed on the file system.

5. View the content of the edits file

//Format: hdfs oev -p output-format (default XML) -i input-file -o output-file
[[email protected] current]# hdfs oev -i edits_inprogress_0000000000000000038 -o /tmp/edits_inprogess.xml

Again, here is an excerpt of the output:



-63

OP_START_LOG_SEGMENT    the type of operation; here it indicates that the log starts recording
38                      the txid of the operation; it is unique

Each RECORD records one operation:

OP_ADD_BLOCK            an operation like this one is part of uploading a file
34
/jdk-8u144-linux-x64.tar.gz._COPYING_

BLOCK
1073741825
134217728
1001

BLOCK
1073741826
0
1002

-2

Each RECORD records one operation. For example, an OP_ADD record represents the operation of adding a file. Generally speaking, a record also contains:
file path (PATH)
modification time (MTIME)
access time (ATIME)
client name (CLIENT_NAME)
client address (CLIENT_MACHINE)
permissions (PERMISSION_STATUS)
and other very useful information.
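Once the edits file has been converted to XML, its records can be summarized with a few lines of code. The following is a minimal sketch, assuming the usual layout of the viewer's XML output, in which each RECORD element contains an OPCODE and a DATA element with a TXID. The embedded sample is a hand-written illustration with made-up txids, not real output, and `summarize` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hand-written sample in the assumed RECORD/OPCODE/DATA/TXID layout.
SAMPLE = """<EDITS>
  <EDITS_VERSION>-63</EDITS_VERSION>
  <RECORD>
    <OPCODE>OP_START_LOG_SEGMENT</OPCODE>
    <DATA><TXID>38</TXID></DATA>
  </RECORD>
  <RECORD>
    <OPCODE>OP_ADD_BLOCK</OPCODE>
    <DATA>
      <TXID>39</TXID>
      <PATH>/jdk-8u144-linux-x64.tar.gz._COPYING_</PATH>
    </DATA>
  </RECORD>
</EDITS>
"""


def summarize(xml_text):
    """Return the (txid, opcode) pairs in file order and a count per opcode."""
    root = ET.fromstring(xml_text)
    records = [
        (int(rec.findtext("DATA/TXID")), rec.findtext("OPCODE"))
        for rec in root.iter("RECORD")
    ]
    return records, Counter(op for _, op in records)


records, counts = summarize(SAMPLE)
for txid, op in records:
    print(txid, op)
print(dict(counts))
```

For a real file, the text passed to `summarize` would be the content of the XML file written by `hdfs oev`, for example /tmp/edits_inprogess.xml from the command above.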

6. Manually roll the edits log

Format: hdfs dfsadmin -rollEdits
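The effect of a roll on the file names can be sketched as follows. This is only an illustration of the naming scheme described earlier, assuming that the in-progress file is closed under a name that covers its txid range and that a new in-progress file starts at the next txid. `roll_edits` is a hypothetical helper and does not call HDFS.

```python
PREFIX = "edits_inprogress_"


def roll_edits(inprogress_name, last_txid):
    """Return (finalized file name, new in-progress file name) after a roll.

    last_txid is the last txid that was written into the current segment.
    """
    if not inprogress_name.startswith(PREFIX):
        raise ValueError(f"not an in-progress edits file: {inprogress_name}")
    first_txid = int(inprogress_name[len(PREFIX):])
    if last_txid < first_txid:
        raise ValueError("last_txid is smaller than the segment's first txid")
    finalized = f"edits_{first_txid:019d}-{last_txid:019d}"
    new_inprogress = f"{PREFIX}{last_txid + 1:019d}"
    return finalized, new_inprogress


current = PREFIX + "27".zfill(19)
finalized, new_inprogress = roll_edits(current, 28)
print(finalized)
print(new_inprogress)
```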

7, NN, SNN, DN data directory configuration

(1) hadoop.tmp.dir is configured

If hadoop.tmp.dir is configured in core-site.xml, the respective data directories are as follows:

NN: ${hadoop.tmp.dir}/dfs/name             the fsimage and edits files are stored in this directory
SNN: ${hadoop.tmp.dir}/dfs/namesecondary   SNN data directory
DN: ${hadoop.tmp.dir}/dfs/data             datanode data directory

(2) Set the directory separately

If you do not set the value of hadoop.tmp.dir, then NN, SNN, and DN each need their data directory set manually. Otherwise the data files are generated under /tmp/hadoop-root/dfs/ (the default hadoop.tmp.dir is /tmp/hadoop-${user.name}, and the user here is root). The respective parameters are as follows:

/* all of these are set in hdfs-site.xml */
//If only dfs.namenode.name.dir is set, the fsimage and edits files are both stored in that directory, because dfs.namenode.edits.dir defaults to its value
NN: dfs.namenode.name.dir     sets the storage path of fsimage
    dfs.namenode.edits.dir    sets the storage path of edits

DN: dfs.datanode.data.dir     the datanode storage directory
SNN: dfs.namenode.checkpoint.dir     the SNN storage directory
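How these settings fall back on one another can be sketched with a small resolver. The default values in `DEFAULTS` are written from memory of the stock hdfs-default.xml and core-default.xml and should be treated as assumptions to check against your Hadoop version. `resolve` is a hypothetical helper, not Hadoop's own Configuration class.

```python
import re

# Assumed stock defaults; verify against your Hadoop version.
DEFAULTS = {
    "hadoop.tmp.dir": "/tmp/hadoop-${user.name}",
    "dfs.namenode.name.dir": "file://${hadoop.tmp.dir}/dfs/name",
    "dfs.namenode.edits.dir": "${dfs.namenode.name.dir}",
    "dfs.datanode.data.dir": "file://${hadoop.tmp.dir}/dfs/data",
    "dfs.namenode.checkpoint.dir": "file://${hadoop.tmp.dir}/dfs/namesecondary",
}
VAR_RE = re.compile(r"\$\{([^}]+)\}")


def resolve(key, site, user="root"):
    """Resolve one setting: site values override defaults, ${...} is expanded."""
    conf = {**DEFAULTS, **site, "user.name": user}
    value = conf[key]
    # Expand nested references until nothing changes (bounded to avoid loops);
    # unknown variables are left as they are.
    for _ in range(20):
        expanded = VAR_RE.sub(lambda m: conf.get(m.group(1), m.group(0)), value)
        if expanded == value:
            break
        value = expanded
    return value


# Nothing configured: everything ends up under /tmp/hadoop-root/dfs/
print(resolve("dfs.namenode.name.dir", {}))
# Only the fsimage path is configured: the edits path follows it
print(resolve("dfs.namenode.edits.dir", {"dfs.namenode.name.dir": "file:///data/nn"}))
# Only hadoop.tmp.dir is configured: the datanode directory is derived from it
print(resolve("dfs.datanode.data.dir", {"hadoop.tmp.dir": "/opt/hadoop/tmp"}))
```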

(3) Namenode multi-directory setting

When the namenode working directory is set separately, we can give dfs.namenode.name.dir multiple values, separated by commas. When hdfs namenode -format is run, every listed namenode directory is formatted, and the contents of the directories stay identical during operation. This can serve as an additional backup of the namenode data. For example:


<property>
  <name>dfs.namenode.name.dir</name>
  <value>file:///${hadoop.tmp.dir}/dfs/name1,file:///${hadoop.tmp.dir}/dfs/name2</value>
</property>

Two, namenode and SNN workflow
