The "three swordsmen" of text processing under Linux are grep, sed, and awk:
grep: text line filtering tool
sed: text line editor (stream editor)
awk: report generator (formats text output)
One. Regular expressions
1. Basic regular expressions
* Matches the preceding character zero or more times. (a* on its own matches every line, because zero occurrences also count, so it is meaningless as a filter; aa* matches lines that contain at least one a.)
. Matches any single character except a line break
^ $ Anchor the match to the beginning and the end of the line, respectively
[] Matches any one of the characters inside the brackets
\{n\} The preceding character appears exactly n times (\ is the escape character)
\{n,\} The preceding character appears at least n times
\{n,m\} The preceding character appears at least n times and at most m times
2. Extended regular expressions
The metacharacters below are written without escape characters (for example with grep -E)
+ Matches the preceding character one or more times
? Matches the preceding character zero or one time
| Matches any one of two or more alternatives
( ) Groups the enclosed pattern so that it is treated as a whole
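The difference between the basic and extended syntax is easiest to see side by side. The following is a minimal sketch that filters a made-up word list with grep; the sample words and the variable names are assumptions for illustration only.

```shell
# Sample input (hypothetical data, one word per line).
words=$(printf '%s\n' 'b' 'ab' 'aab' 'aaab' 'color' 'colour')

# Basic regex: aa* requires at least one "a" somewhere in the line.
at_least_one_a=$(printf '%s\n' "$words" | grep 'aa*')

# Basic regex: \{2\} needs the backslashes; ^ and $ anchor the whole line.
exactly_two_a=$(printf '%s\n' "$words" | grep '^a\{2\}b$')

# Extended regex (grep -E): no backslashes; ? makes the "u" optional.
optional_u=$(printf '%s\n' "$words" | grep -E '^colou?r$')

# Extended regex: ( ) groups and | chooses between alternatives.
alternatives=$(printf '%s\n' "$words" | grep -E '^(ab|color)$')

printf '%s\n' "$at_least_one_a" "$exactly_two_a" "$optional_u" "$alternatives"
```

Quoting the pattern in single quotes keeps the shell from interpreting *, ? and | before grep sees them.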
Two. Character extraction and replacement commands
1. cut
The default delimiter of the cut command is the tab character
-f Field (column) number
-d Delimiter
-c Character range
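As a small sketch of the three options, the following runs cut on invented records; the sample data and the variable names are assumptions, not taken from a real system.

```shell
# Hypothetical colon-separated records in the style of /etc/passwd.
records=$(printf '%s\n' 'root:x:0:0' 'alice:x:1000:1000')

# -d sets the delimiter, -f picks the fields (here fields 1 and 3).
name_and_uid=$(printf '%s\n' "$records" | cut -d ':' -f 1,3)

# -c selects by character position (here characters 1 to 4).
first_four=$(printf '%s\n' "$records" | cut -c 1-4)

# Without -d, cut splits on the tab character.
second_tab_field=$(printf 'one\ttwo\tthree\n' | cut -f 2)

printf '%s\n' "$name_and_uid" "$first_four" "$second_tab_field"
```

Note that cut treats every single delimiter as a field boundary, so input aligned with runs of spaces is usually easier to handle with awk.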
2. awk
Common parameters
print Print output
-F Specify the field delimiter
-v Assign a variable manually
1) printf formatted output
printf 'output type and format' output content
Output types:
%ns: Output a string. n is a number giving the minimum width in characters.
%ni: Output an integer. n is the minimum width in digits.
%m.nf: Output a floating-point number. m is the minimum total width and n is the number of decimal places. For example, %8.2f outputs at least 8 characters in total, of which 2 are decimal digits and 1 is the decimal point, leaving 5 for the integer part.
Output is right-aligned by default when a width is given
- Left-aligned
Example: view users whose ID is greater than or equal to 1 and less than or equal to 500:
cat /etc/passwd | awk 'BEGIN{FS=":"} $3>=1 && $3<=500 {printf "%-10s %-10d\n", $1, $3}'
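To make the width and alignment rules visible without depending on the contents of /etc/passwd, here is a minimal sketch on invented rows; the data, the | markers, and the variable names are assumptions for illustration.

```shell
# Hypothetical sample rows in the form name:score.
rows=$(printf '%s\n' 'ann:7.5' 'robert:12')

# %-8s left-aligns the name in 8 columns; %6.2f right-aligns the number
# in a field at least 6 characters wide with 2 decimal places.
formatted=$(printf '%s\n' "$rows" | awk -F ':' '{printf "%-8s|%6.2f|\n", $1, $2}')

echo "$formatted"
# ann     |  7.50|
# robert  | 12.00|
```

Unlike print, printf does not add a line break by itself, which is why the format string ends in \n.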
2) awk basic use
awk 'condition 1 {action 1} condition 2 {action 2}'
For example, extract the usage percentage of the root partition: df -h | grep /dev/sda3 | awk '{print $5}'
3) awk conditions
BEGIN Executed at the start of the awk program, before any data is read. The action after BEGIN runs only once, at the beginning of the program
END Executed after awk has processed all the data and is about to exit. The action after END runs only once, at the end of the program
Relational operators
A<=B Determine whether A is less than or equal to B
A~B Determine whether string A contains a substring that matches the expression B
A!~B Determine whether string A does not contain a substring that matches the expression B
/regular expression/ Select the lines that match the regular expression written between the slashes
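The condition types can be combined in one program. The sketch below runs them over invented passwd-style lines; the sample users and the variable names are assumptions for illustration.

```shell
# Hypothetical passwd-style sample in the form name:password:uid.
users=$(printf '%s\n' 'root:x:0' 'daemon:x:2' 'alice:x:1000' 'bob:x:1001')

report=$(printf '%s\n' "$users" | awk '
BEGIN { FS = ":"; print "start" }      # runs once, before any input is read
$3 >= 1000 { print $1; count++ }       # relational condition on field 3
$1 ~ /^d/  { print "d-user: " $1 }     # field 1 matches the regular expression
/^root/    { print "found root" }      # regex tested against the whole line
END { print "regular users: " count }  # runs once, after all input
')

echo "$report"
```

Every condition is tested against every input line, in the order written, so a single line can trigger more than one action.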
4) awk’s built-in variables
$0 The whole line that awk is currently processing. awk reads data line by line, and $0 holds the entire current line
$n The nth field of the current line
NF The total number of fields (columns) in the current line
NR The number of the line awk is currently processing, counted across all the data read so far
FS The input field separator. By default awk splits on any run of spaces or tabs; to use another separator (such as ":"), set the FS variable
OFS The output field separator (a space by default)
Example: list the most frequently used commands in the shell history:
history | awk -F'[ ]+' '{print $3}' | sort | uniq -c | sort -nr | head
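A short sketch of how the built-in variables behave, using invented input; the data and the variable names are assumptions for illustration.

```shell
# Hypothetical colon-separated input with a different field count per line.
data=$(printf '%s\n' 'a:b:c' 'd:e')

# NR is the line number, NF the field count, $NF the last field.
summary=$(printf '%s\n' "$data" | awk 'BEGIN{FS=":"} {print NR, NF, $NF}')

# Assigning to a field makes awk rebuild $0 with OFS between the fields.
rejoined=$(printf '%s\n' "$data" | awk 'BEGIN{FS=":"; OFS="-"} {$1=$1; print $0}')

printf '%s\n' "$summary" "$rejoined"
```

The $1=$1 assignment changes nothing in the field itself; it is only there to force awk to re-join the line, because OFS is not applied to an untouched $0.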
3. sed
sed is mainly used to select, replace, delete, and add data.
sed [options] '[action]' file name
Options:
-n: By default the sed command outputs all data to the screen. With this option, only the lines processed by the sed command are output to the screen.
Action:
a \: Append, add one or more lines after the current line. When adding multiple lines, every line except the last must end with "\" to indicate that the data is not yet complete.
i \: Insert, add one or more lines before the current line. When inserting multiple lines, every line except the last must end with "\" to indicate that the data is not yet complete.
d: Delete the specified lines.
p: Print, output the specified lines.
s: Replace one string with another. The format is "line range s/old string/new string/g" (similar to the replacement format in vim)
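The actions can be tried safely on piped input, since sed does not modify anything unless told to. The following is a minimal sketch; the three sample lines and the variable names are assumptions for illustration.

```shell
# Hypothetical three-line input.
text=$(printf '%s\n' 'one' 'two' 'three')

# -n with p: print only line 2.
only_second=$(printf '%s\n' "$text" | sed -n '2p')

# d: delete line 2 and output the rest.
without_second=$(printf '%s\n' "$text" | sed '2d')

# s with g: replace every "e" on every line.
replaced=$(printf '%s\n' "$text" | sed 's/e/E/g')

# s limited to a line range (lines 1 to 2 only).
ranged=$(printf '%s\n' "$text" | sed '1,2s/o/0/g')

# a\ appends the text on the following line after line 2.
appended=$(printf '%s\n' "$text" | sed '2a\
inserted')

printf '%s\n' "$only_second" "$without_second" "$replaced" "$ranged" "$appended"
```

Without -n, the 2p action would print line 2 twice: once by the default output and once by p.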
Three. Character processing commands
1. sort
sort [options] file name
-f: Ignore case
-b: Ignore the blanks at the beginning of each line
-n: Sort by numeric value (the default is to sort as strings)
-r: Reverse sort
-u: Remove duplicate lines, the same effect as the uniq command
-t: Specify the field separator. By default fields are separated by the transition from non-blank to blank characters
-k n[,m]: Sort by the specified field range, starting at field n and ending at field m (by default, the end of the line)
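The interaction between -t, -k and -n is the part that most often goes wrong, so here is a minimal sketch on invented data; the sample lines and the variable names are assumptions, and LC_ALL=C is set only to make the string ordering predictable.

```shell
# Hypothetical name:quantity lines, including one duplicate.
lines=$(printf '%s\n' 'pear:10' 'apple:9' 'fig:100' 'apple:9')

# Default: whole lines compared as strings.
by_string=$(printf '%s\n' "$lines" | LC_ALL=C sort)

# -t sets the separator, -k 2,2 sorts on field 2 only, -n compares numerically.
by_number=$(printf '%s\n' "$lines" | LC_ALL=C sort -t ':' -k 2,2 -n)

# -r reverses the order and -u drops the duplicate line.
unique_desc=$(printf '%s\n' "$lines" | LC_ALL=C sort -t ':' -k 2,2 -n -r -u)

printf '%s\n' "$by_string" "$by_number" "$unique_desc"
```

Without -n the second field would be compared as text, and 100 would sort before 9.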
2. uniq
uniq [options] file name
-i: Ignore case
3. wc
-l: Only count the number of lines
-w: Only count the number of words
-m: Only count the number of characters
-L: Print the length of the longest line
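A minimal sketch of uniq and wc on invented input follows; the sample data and the variable names are assumptions for illustration.

```shell
# Hypothetical list with repeated names in different capitalization.
names=$(printf '%s\n' 'Bob' 'bob' 'amy' 'amy' 'Bob')

# uniq merges only adjacent duplicate lines, so the final "Bob" survives.
adjacent=$(printf '%s\n' "$names" | uniq)

# -i ignores case when comparing adjacent lines.
ignore_case=$(printf '%s\n' "$names" | uniq -i)

# Hypothetical two-line text for counting.
sample=$(printf '%s\n' 'one two' 'three')
line_count=$(printf '%s\n' "$sample" | wc -l)   # lines
word_count=$(printf '%s\n' "$sample" | wc -w)   # words
char_count=$(printf '%s\n' "$sample" | wc -m)   # characters, line breaks included

printf '%s\n' "$adjacent" "$ignore_case"
echo "$line_count" "$word_count" "$char_count"
```

Because uniq only looks at neighbouring lines, it is normally used after sort, as in sort | uniq -c.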
Four. Conditional judgment
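1. Judging according to file type
Conditions are evaluated with the test command, which can also be written as [ condition ]
-d file Determine whether the file exists and is a directory
-e file Determine whether the file exists
-f file Determine whether the file exists and is a regular file
-s file Determine whether the file exists and is not empty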
2. Judging according to file permissions
3. Judgment between two files
File 1 -nt File 2 Determine whether the modification time of file 1 is newer than that of file 2 (true if newer)
File 1 -ot File 2 Determine whether the modification time of file 1 is older than that of file 2 (true if older)
File 1 -ef File 2 Determine whether file 1 and file 2 have the same inode number, that is, whether they are the same file. This is a good way to detect hard links
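File tests are easiest to understand on throwaway files. The sketch below builds a temporary directory and checks a few conditions using the standard test operators -d, -f, -e and -s together with -nt and -ef; the file names and the yes_no helper are hypothetical and exist only for this illustration.

```shell
# Helper (hypothetical): run a test command and report yes or no.
yes_no() { if "$@"; then echo yes; else echo no; fi; }

workdir=$(mktemp -d)
printf 'data\n' > "$workdir/full.txt"               # non-empty regular file
: > "$workdir/empty.txt"                            # empty regular file
touch -t 200001010000 "$workdir/old.txt"            # file dated 1 Jan 2000
ln "$workdir/full.txt" "$workdir/hardlink.txt"      # hard link, same inode

is_dir=$(yes_no [ -d "$workdir" ])
is_file=$(yes_no [ -f "$workdir/full.txt" ])
missing_exists=$(yes_no [ -e "$workdir/missing.txt" ])
empty_has_data=$(yes_no [ -s "$workdir/empty.txt" ])
full_is_newer=$(yes_no [ "$workdir/full.txt" -nt "$workdir/old.txt" ])
same_inode=$(yes_no [ "$workdir/full.txt" -ef "$workdir/hardlink.txt" ])

echo "$is_dir $is_file $missing_exists $empty_has_data $full_is_newer $same_inode"
rm -rf "$workdir"
```

The spaces after [ and before ] are required, because [ is a command and ] is its last argument.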
4. Integer judgment
Integer 1 -eq Integer 2 Equal to
Integer 1 -gt Integer 2 Greater than
Integer 1 -le Integer 2 Less than or equal to
5. String judgment
-z string Judge whether the string is empty (true if it is empty)
-n string Judge whether the string is non-empty (true if it is not empty)
String 1 == String 2 Judge whether string 1 is equal to string 2 (true if equal)
String 1 != String 2 Judge whether string 1 and string 2 are not equal (true if unequal)
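A minimal sketch of integer and string comparisons; the values, the variable names, and the yes_no helper are hypothetical. The example spells string equality as =, which is the portable form; bash also accepts == inside [ ].

```shell
# Helper (hypothetical): run a test command and report yes or no.
yes_no() { if "$@"; then echo yes; else echo no; fi; }

count=5
empty=""
name="alice"

int_equal=$(yes_no [ "$count" -eq 5 ])       # integer comparison
int_greater=$(yes_no [ "$count" -gt 10 ])
int_at_most=$(yes_no [ "$count" -le 5 ])
str_is_empty=$(yes_no [ -z "$empty" ])       # true for the empty string
str_not_empty=$(yes_no [ -n "$name" ])       # true for a non-empty string
str_equal=$(yes_no [ "$name" = "alice" ])    # string comparison
str_differs=$(yes_no [ "$name" != "bob" ])

echo "$int_equal $int_greater $int_at_most $str_is_empty $str_not_empty $str_equal $str_differs"
```

Quoting the variables matters: an unquoted empty variable disappears from the argument list and changes what test sees.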
6. Multiple condition judgments
Judgment 1 -a Judgment 2 Logical AND: the result is true only if both judgment 1 and judgment 2 are true
Judgment 1 -o Judgment 2 Logical OR: the result is true if either judgment 1 or judgment 2 is true
! Judgment Logical NOT: inverts the result of the judgment
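The three logical operators can be sketched as follows; the age value, the variable names, and the yes_no helper are hypothetical.

```shell
# Helper (hypothetical): run a test command and report yes or no.
yes_no() { if "$@"; then echo yes; else echo no; fi; }

age=30

# -a: both comparisons must hold.
working_age=$(yes_no [ "$age" -ge 18 -a "$age" -lt 65 ])

# -o: one true comparison is enough.
outside_range=$(yes_no [ "$age" -lt 18 -o "$age" -ge 65 ])

# !: inverts the comparison that follows.
not_thirty=$(yes_no [ ! "$age" -eq 30 ])

echo "$working_age $outside_range $not_thirty"
```

An equivalent and often more readable spelling chains separate tests with the shell operators && and ||, for example [ "$age" -ge 18 ] && [ "$age" -lt 65 ].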
Five. Process control
1. Single-branch if
if [ condition ]; then
  commands executed when the condition is true
fi
2. Two-branch if
if [ condition ]; then
  commands executed when the condition is true
else
  commands executed when the condition is not true
fi
3. Multi-branch if
if [ condition 1 ]; then
  commands executed when condition 1 is true
elif [ condition 2 ]; then
  commands executed when condition 2 is true
else
  commands executed when none of the conditions is true
fi
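As a concrete instance of the multi-branch form, here is a minimal sketch; the classify function and its messages are hypothetical names chosen for illustration.

```shell
# classify (hypothetical helper): report the sign of an integer argument.
classify() {
  if [ "$1" -lt 0 ]; then
    echo "negative"
  elif [ "$1" -eq 0 ]; then
    echo "zero"
  else
    echo "positive"
  fi
}

classify -3
classify 0
classify 12
```

The branches are tested from top to bottom and only the first one whose condition is true runs.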
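4. Multi-branch case statement
case can only test one kind of condition (whether a variable matches a value), while if can test many kinds
case $variable in
value 1)
  commands executed when the variable equals value 1
  ;;
value 2)
  commands executed when the variable equals value 2
  ;;
*)
  commands executed when the variable matches none of the values above
  ;;
esac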
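A minimal sketch of a case statement in use; the describe_answer function and its messages are hypothetical.

```shell
# describe_answer (hypothetical helper): map a reply to a message.
describe_answer() {
  case "$1" in
    yes)
      echo "confirmed"
      ;;
    no)
      echo "declined"
      ;;
    *)
      # Reached when none of the values above matched.
      echo "unknown"
      ;;
  esac
}

describe_answer yes
describe_answer maybe
```

Each branch must end with ;; so that execution stops there instead of being a syntax error or running into the next pattern.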
5. for loop
for variable in value 1 value 2 value 3; do
  commands
done
for ((initial value; loop control condition; variable change))
do
  commands
done
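Both loop forms can be sketched briefly. The values and the variable names below are hypothetical, and the second form with double parentheses assumes the script is run by bash rather than a plain POSIX sh.

```shell
# List form: the variable takes each value in turn.
joined=""
for fruit in apple banana cherry; do
  joined="$joined$fruit,"
done

# C-style form (bash): initial value; loop control condition; variable change.
total=0
for ((i = 1; i <= 100; i++)); do
  total=$((total + i))
done

echo "$joined"
echo "$total"
```

The list form suits a known set of items such as file names, while the C-style form suits counting.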
6. while loop
while [ condition ]; do