Regular expressions

A regular expression is a very powerful tool for searching, filtering, and replacing strings, and it comes up often in daily work. Whenever you need to find strings that follow some complex rule, a regular expression is the notation used to describe that rule. Regular expressions are divided into basic regular expressions and extended regular expressions. Here is a brief introduction.

Basic regular expressions

　　Basic regular expressions can be described in terms of their metacharacters. A metacharacter is a character that stands for a rule rather than for itself.

Metacharacters can be divided into four categories: character matching, number of matches, position matching, and grouping.

　 1. Character matching: certain symbols stand for a set of possible characters rather than one fixed character

　　. 　　　　 matches any single character (a digit, a letter, and so on), except a newline

　　[] 　　　　　 matches any single character listed in the square brackets. For example, [abcd] matches any one of the four characters a, b, c, d; [0-9] matches any single digit; [a-z] matches any single lowercase letter

　　[^]　　　　　 the exact opposite of the above: matches any single character that is not listed in the square brackets. For example, [^abcd] matches any character other than a, b, c, d. The ^ inverts the set

　　[:alnum:] 　　　　 letters and digits, that is, [A-Za-z0-9]. Like every class below, it is written inside a bracket expression, as in [[:alnum:]]

　　[:alpha:] 　　　　 uppercase and lowercase letters, that is, A-Z and a-z

　　[:lower:] 　　　　 lowercase letters

[:upper:] uppercase letters

　　[:blank:] 　　　　 blank characters (spaces and tabs)


　　[:space:] 　　　　 Horizontal and vertical blank characters. This match contains [:blank:]

　　[:cntrl:] 　　　　 non-printable control characters (such as backspace)

　　[:digit:] 　　　　 Decimal digits [0-9]

　　[:xdigit:]　　　　hexadecimal digits

　　[:graph:] 　　　　 printable non-blank characters

　　[:print:] Printable characters

　　[:punct:] 　　　　 punctuation symbols
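The character-matching symbols above can be tried out with grep. This is only a small sketch: it assumes a POSIX-style grep is available, and the sample strings are made up for illustration.

```shell
# Sample input: one candidate string per line (invented for this example).
sample='cat
cot
c9t
ct
cAt'

# "." needs exactly one character between c and t, so "ct" is not selected.
any_char=$(printf '%s\n' "$sample" | grep 'c.t')

# A bracket expression matches one character from the listed set.
listed=$(printf '%s\n' "$sample" | grep 'c[ao]t')

# A leading ^ inside the brackets inverts the set.
not_listed=$(printf '%s\n' "$sample" | grep 'c[^ao]t')

# Character classes go inside a bracket expression: [[:digit:]], not [:digit:].
digit=$(printf '%s\n' "$sample" | grep 'c[[:digit:]]t')
upper=$(printf '%s\n' "$sample" | grep 'c[[:upper:]]t')

printf 'any char:   %s\n' "$(echo $any_char)"
printf 'listed:     %s\n' "$(echo $listed)"
printf 'not listed: %s\n' "$(echo $not_listed)"
printf 'digit:      %s\n' "$digit"
printf 'upper:      %s\n' "$upper"
```

Note the double brackets in the last two patterns: the outer pair is the bracket expression and the inner [:name:] is the class.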

　　2. Number of matches: the symbols above each match a single character, but the text to be processed is often long, so being able to say how many times something repeats is very important.

　　　　　 As you may have guessed, the number of matches is also written with certain symbols, which state how many times the preceding character appears. They are as follows

　　* 　　　　　 matches the preceding character any number of times, including 0 times. It matches as many times as possible, which is called greedy mode, so take care. For example, .* can match an empty line, and can also match any content

　　\? 　　　　 Match the character before it 0 or 1 time


　　\+ 　　　 Match the preceding character at least once

　　\{n\} 　　　　 Match the preceding character n times

　　\{m,n\} 　　 Match the preceding character at least m times and at most n times

　　\{,n\}　　　　 Match the preceding character up to n times

　　\{n,\} Match the preceding character at least n times
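A quick way to see the difference between these repetition symbols is to run each one against the same input. The sketch below assumes GNU grep, since \?, \+ and \{,n\} are GNU extensions to the basic syntax; the sample strings are invented.

```shell
# Each line has a different number of "a" characters between c and t.
sample='ct
cat
caat
caaat'

zero_or_more=$(printf '%s\n' "$sample" | grep 'ca*t')          # any count, including 0
zero_or_one=$(printf '%s\n' "$sample" | grep 'ca\?t')          # 0 or 1
one_or_more=$(printf '%s\n' "$sample" | grep 'ca\+t')          # at least 1
exactly_two=$(printf '%s\n' "$sample" | grep 'ca\{2\}t')       # exactly 2
two_to_three=$(printf '%s\n' "$sample" | grep 'ca\{2,3\}t')    # 2 to 3
at_most_one=$(printf '%s\n' "$sample" | grep 'ca\{,1\}t')      # up to 1
at_least_two=$(printf '%s\n' "$sample" | grep 'ca\{2,\}t')     # 2 or more

printf 'ca*t       -> %s\n' "$(echo $zero_or_more)"
printf 'ca\\?t      -> %s\n' "$(echo $zero_or_one)"
printf 'ca\\+t      -> %s\n' "$(echo $one_or_more)"
printf 'ca\\{2\\}t   -> %s\n' "$(echo $exactly_two)"
printf 'ca\\{2,3\\}t -> %s\n' "$(echo $two_to_three)"
printf 'ca\\{,1\\}t  -> %s\n' "$(echo $at_most_one)"
printf 'ca\\{2,\\}t  -> %s\n' "$(echo $at_least_two)"
```

The surrounding c and t pin down both ends of the run of a characters, which is why a count such as \{2\} does not also select the line with three of them.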

　　3. Position matching: what is position matching? When processing a file we often cannot tell how many characters a line holds, or what comes before and after a piece of text. We only need to care about the beginning and the end

　　^ 　　　　 Match the beginning of the line

　　$ 　　　　 Match the end of the line

　　 The two anchors can be combined to match empty lines: ^$

　　 Then what does ^[[:space:]]*$ mean? It matches lines that are empty or contain nothing but whitespace.
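The two blank-line patterns are easy to compare side by side. A minimal sketch, assuming a POSIX-style grep; the four-line input is made up and contains one truly empty line plus one line holding only a space and a tab.

```shell
# Lines: "first", an empty line, a line with a space and a tab, "last".
sample=$(printf 'first\n\n \t\nlast\n')

# ^$ matches only lines with nothing at all between start and end.
empty_count=$(printf '%s\n' "$sample" | grep -c '^$')

# ^[[:space:]]*$ also accepts lines made up solely of whitespace.
blank_count=$(printf '%s\n' "$sample" | grep -c '^[[:space:]]*$')

# -v inverts the selection, which strips both kinds of blank line.
non_blank=$(printf '%s\n' "$sample" | grep -v '^[[:space:]]*$')

printf 'empty lines: %s\n' "$empty_count"
printf 'blank lines: %s\n' "$blank_count"
printf 'remaining:   %s\n' "$(echo $non_blank)"
```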

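Position matching also includes \< and \>. What do they mean? \< anchors the beginning of a word and \> anchors the end of a word. For example, \<a matches words beginning with a, and a\> matches words ending with a. This is word position matching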

It should be noted that digits and underscores count as part of a word, not as word separators. \< and \> can both be replaced with \b, as in \ba and a\b

　　\<word\> matches word only where it stands as an entire word
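Word anchors are easiest to understand by example. The sketch below assumes GNU grep, where \<, \> and \b are available as extensions; the sample words are invented, and data_1 is included to show that an underscore or digit does not end a word.

```shell
sample='alpha
beta
apex
data_1'

# \<a : an "a" at the beginning of a word.
starts_with_a=$(printf '%s\n' "$sample" | grep '\<a')

# a\> : an "a" at the end of a word. In data_1 the last "a" is followed
# by "_", which is still part of the word, so that line is not selected.
ends_with_a=$(printf '%s\n' "$sample" | grep 'a\>')

# \b matches at either edge of a word, so a\b behaves like a\> here.
ends_with_a_b=$(printf '%s\n' "$sample" | grep 'a\b')

# \<cat\> selects "cat" only when it is a whole word.
whole_word=$(printf 'a cat\ncatalog\ncat_1\n' | grep '\<cat\>')

printf 'starts with a: %s\n' "$(echo $starts_with_a)"
printf 'ends with a:   %s\n' "$(echo $ends_with_a)"
printf 'whole word:    %s\n' "$whole_word"
```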

　　4. Grouping: treat one or more characters together as a whole, so that it is convenient to reference them later

　　\(\) 　　　　 groups the enclosed characters. For example, \(hello linux\) puts the whole string into one group, which you can then reference with \1. If there is a second group, reference it with \2, and so on.

　　　　　　　　 When a group contains another group, the outermost group is the first group. For example, in \(group1\(group2\)\) the outermost group is \1 and its content is group1group2, while the inner group is \2

　　\| 　　　　 the "or" symbol: for example, a\|b matches a or b
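Grouping and back-references can be checked with a few short patterns. This is a sketch under the assumption that GNU grep is in use (\| is a GNU extension in the basic syntax); the sample lines are invented for illustration.

```shell
sample='hello linux hello linux
hello linux hello world
abab
abba'

# \1 must repeat exactly the text that the first group matched.
repeated=$(printf '%s\n' "$sample" | grep '\(hello linux\) \1')

# Nested groups are numbered by their opening parenthesis:
# \1 is the outer group ("ab"), \2 is the inner group ("b").
outer_ref=$(printf '%s\n' "$sample" | grep '^\(a\(b\)\)\1$')
inner_ref=$(printf '%s\n' "$sample" | grep '^\(a\(b\)\)\2a$')

# \| chooses between alternatives; the group limits its scope.
either=$(printf 'a\nb\nc\n' | grep '^\(a\|b\)$')

printf 'repeated group: %s\n' "$repeated"
printf 'outer ref:      %s\n' "$outer_ref"
printf 'inner ref:      %s\n' "$inner_ref"
printf 'a or b:         %s\n' "$(echo $either)"
```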

Extended regular expressions

　　 Extended regular expressions cover everything in basic regular expressions. The extension itself does not add much, but it is very important. It can be seen as a simplification of the basic regular expression, because many constructs in the extended syntax do not need the \ symbol

　　The character matching of the extended regular expression is the same as the basic regular expression.

　　Number of matches

　　* 　　　　 matches the preceding character any number of times, including 0 times. It matches as many times as possible, which is called greedy mode, so take care. For example, .* can match an empty line, and can also match any content

　　? 　　　　 Match the preceding character 0 or 1 time

　　+ 　　　　 Match the preceding character at least once

　　{n} Match the preceding character n times

　　{m,n} 　　 Match the preceding character at least m times, up to n times 　　　

　　{,n} 　　 Match the preceding character at most n times

　　{n,} Match the preceding character at least n times

　　 As you can see, the extended regular expression is essentially the same as the basic one, except that these metacharacters do not need to be escaped.
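The escaping difference can be seen by writing the same pattern both ways. A small sketch, assuming a grep that accepts -E to switch to the extended syntax; the sample strings are made up.

```shell
sample='ct
cat
caat
caaat'

# Basic syntax: the braces must be escaped.
bre=$(printf '%s\n' "$sample" | grep 'ca\{1,2\}t')

# Extended syntax (-E): the same pattern without backslashes.
ere=$(printf '%s\n' "$sample" | grep -E 'ca{1,2}t')

# The flip side: an unescaped + is an ordinary character in the basic
# syntax, but a repetition symbol in the extended syntax.
plus_basic=$(printf 'a+b\naab\n' | grep 'a+b')
plus_extended=$(printf 'a+b\naab\n' | grep -E 'a+b')

printf 'basic:    %s\n' "$(echo $bre)"
printf 'extended: %s\n' "$(echo $ere)"
printf 'a+b without -E selects: %s\n' "$plus_basic"
printf 'a+b with -E selects:    %s\n' "$plus_extended"
```

Because the same characters switch between literal and special depending on the syntax in use, it is worth checking which syntax a command expects before reusing a pattern.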

　　 Summary: basic and extended regular expressions are used in the same way, but on a Linux system some commands accept only the basic syntax, some use the extended syntax, and some support both.
