awk
Basic syntax
awk -Fs '/pattern/ {action}' input-file
(or)
awk -Fs '{action}' input-file
-F sets the field separator (s in the syntax above). If it is not specified, awk splits fields on runs of spaces and tabs by default.
/pattern/ and {action} need to be enclosed in single quotes.
/pattern/ is optional. If not specified, awk will process all records in the input file. If you specify a pattern, awk will only process records that match the specified pattern.
Awk program structure (BEGIN, body, and END areas)
BEGIN area
BEGIN area syntax:
BEGIN {awk-commands}
The commands in the BEGIN area are executed only once, at the beginning, before awk executes the body area commands.
The BEGIN area is well suited to printing header information and initializing variables.
The BEGIN area can contain one or more awk commands.
The keyword BEGIN must be written in capitals.
The BEGIN area is optional.
BODY area
/pattern/ {action}
Awk reads the input one line at a time and runs the body once for each line it reads.
END area
END {awk-commands}
The END area is executed only once, after all input has been processed.
awk -F ":" '/^root/{print}' passwd
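The three areas can be seen working together in a minimal sketch like the one below. The three-line input and the header and footer text are made up for illustration; they are not from the original notes.

```shell
# Hypothetical input: three user names, one per line.
report=$(printf '%s\n' alice bob carol |
  awk 'BEGIN { print "== users ==" }   # runs once, before any input is read
       { print NR ": " $0 }            # body: runs once per input line
       END { print "total: " NR }')    # runs once, after the last line
echo "$report"
```

The body has no pattern here, so it runs for every line, while BEGIN and END each contribute exactly one line of output.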
Built-in variables
awk 'BEGIN {FS=","} {print $2,$3}' employee.txt
awk 'BEGIN {print "test1","test2"}'
If you omit the comma, awk concatenates the values without inserting OFS, so there is no space between them in the output.
$ gawk 'BEGIN {print "Hello World!"} {print $0} END {print "byebye"}' data1
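The effect of the comma is easiest to see when OFS is set to something other than a space. A small sketch, assuming "-" as an arbitrary separator chosen only for illustration:

```shell
# With a comma, print joins its arguments with OFS.
with_comma=$(awk 'BEGIN { OFS="-"; print "test1", "test2" }')
# Without a comma, the two strings are simply concatenated.
without_comma=$(awk 'BEGIN { OFS="-"; print "test1" "test2" }')
echo "$with_comma"     # test1-test2
echo "$without_comma"  # test1test2
```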
$0 the whole record
$1 the first data field in the record
$2 the second data field in the record
$n the nth data field in the record
FIELDWIDTHS a space-separated list of numbers that defines the exact width of each field
FS input field separator
RS input record separator
OFS output field separator
ORS output record separator
ARGC number of command-line arguments
ARGIND index in ARGV of the current file
ARGV array containing the command-line arguments
CONVFMT number conversion format (see the printf statement), the default value is %.6g
ENVIRON associative array of the current shell environment variables and their values
ERRNO the system error reported when reading or closing an input file fails
FILENAME name of the data file currently being read by gawk
FNR record number within the current data file
IGNORECASE when set to a non-zero value, gawk ignores character case when matching strings
NF total number of fields in the current record
NR number of input records processed so far
OFMT output format for numbers, the default value is %.6g
RLENGTH length of the substring matched by the match function
RSTART starting position of the substring matched by the match function
Example:
Number of command-line arguments
awk '{print ARGC}' /etc/fstab /etc/inittab
Command-line arguments
awk 'BEGIN {print ARGV[0]}' /etc/fstab /etc/inittab
awk '{print FILENAME, "record number is", NR, "FNR is", FNR}' passwd
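The difference between NR and FNR only shows up with more than one input file. The sketch below uses two throwaway files (one.txt and two.txt are hypothetical names created in a temporary directory) to make that visible, and also prints ARGC for the same argument list.

```shell
# Create two small sample files in a scratch directory.
dir=$(mktemp -d)
printf 'a\nb\n' > "$dir/one.txt"
printf 'c\n' > "$dir/two.txt"

# NR keeps counting across files; FNR restarts at 1 for each file.
counts=$(cd "$dir" && awk '{ print FILENAME, NR, FNR }' one.txt two.txt)
# ARGC counts the program name plus the two file arguments.
argc=$(cd "$dir" && awk 'BEGIN { print ARGC }' one.txt two.txt)

echo "$counts"
echo "$argc"
rm -rf "$dir"
```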
Variables
Awk variable names start with a letter or underscore, and the subsequent characters can be digits, letters, or underscores. Keywords cannot be used as variable names.
Awk variables can be used directly without prior declaration. If you want to initialize a variable, it is best to do so in the BEGIN area, which is executed only once.
Custom variables
Define them with the -v option, or assign them directly in the program.
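Both ways of defining a custom variable can be sketched as follows. The variable names (name, count, total) are placeholders chosen for this example.

```shell
# name comes from the command line via -v; count is assigned inside the program.
greeting=$(awk -v name="world" 'BEGIN { count = 2; print "hello", name, count }')
# An undeclared variable starts out empty, which behaves as 0 in arithmetic.
total=$(awk 'BEGIN { total += 5; print total }')
echo "$greeting"
echo "$total"
```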
printf formatted output
Formatted output: printf "FORMAT", item1, item2, ...
(1) FORMAT must be specified.
(2) Lines are not broken automatically; the line break must be given explicitly with \n.
(3) FORMAT must contain a format specifier for each of the items that follow.
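A short sketch of the three rules above, using made-up values: each item gets its own specifier, and nothing ends the line unless \n is written.

```shell
# %-6s left-justifies a string in 6 columns, %3d right-justifies an integer
# in 3 columns, and %.2f prints two decimal places.
line=$(awk 'BEGIN { printf "%-6s|%3d|%.2f\n", "disk", 7, 3.14159 }')
# Two printf calls share one output line because only the second emits \n.
joined=$(awk 'BEGIN { printf "a"; printf "b\n" }')
echo "$line"
echo "$joined"
```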
Unary operator
Operator description
+ Unary plus, returns the number itself
- Negation
++ Increment
-- Decrement
Arithmetic operators
Operator description
+ Addition
- Subtraction
* Multiplication
/ Division
% Modulus (remainder)
awk 'NR%2 == 0 {print NR,$0}' passwd
String operators
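Assignment operator
Operator description
= Assignment
+= Add and assign
-= Subtract and assign
*= Multiply and assign
/= Divide and assign
%= Take the modulus and assign
Comparison operators
> Greater than
>= Greater than or equal to
< Less than
<= Less than or equal to
== Equal to
!= Not equal to
&& and
|| or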
Regular expression
Operator description
~ Match
!~ Does not match
awk -F: '$1~"ro"' passwd
The first field contains ro.
$ awk 'BEGIN {FS=":"; print "begin test"} {print $1} END {print "it is end"}' passwd
Match operator
$1 ~ /^data/
gawk -F: '$4 == 0 {print $1}' /etc/passwd
Line range
awk -F: '/^root\>/,/^nobody\>/ {print $1}' /etc/passwd
awk -F: '(NR>=10 && NR<=20) {print NR,$1}' /etc/passwd
(The parentheses are optional.)
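The match, comparison, and line-number patterns above can be tried without touching the real /etc/passwd. This sketch runs them against three invented passwd-style lines; the user names and IDs are assumptions for illustration only.

```shell
# Hypothetical passwd-style sample data.
data='root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
robert:x:1000:1000::/home/robert:/bin/bash'

# ~ keeps lines whose first field matches the regular expression.
matching=$(printf '%s\n' "$data" | awk -F: '$1 ~ /^ro/ { print $1 }')
# !~ keeps lines whose first field does not match.
not_matching=$(printf '%s\n' "$data" | awk -F: '$1 !~ /^ro/ { print $1 }')
# A numeric comparison on the fourth field (the group ID).
gid_zero=$(printf '%s\n' "$data" | awk -F: '$4 == 0 { print $1 }')
# A line range selected by record number.
by_line=$(printf '%s\n' "$data" | awk -F: 'NR>=2 && NR<=3 { print NR, $1 }')

printf '%s\n' "$matching" "$not_matching" "$gid_zero" "$by_line"
```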
awk structured commands
if
Single statement
if (conditional-expression) {statements; ...}
Multiple statements
if (conditional-expression)
{
action1; # executed in order
action2;
}
if else
if (conditional-expression)
action1
else
action2
if (condition) {statements; ...} else {statements; ...}
Ternary operator
conditional-expression ? action1 : action2;
while
while (condition)
{
actions
}
while (condition) {statements; ...}
do-while
do
{
action
}
while (condition)
for
for (initialization; condition; increment/decrement)
for (expr1; expr2; expr3) {statements; ...}
if-then-else statement:
if (condition) statement1; else statement2
while statement:
while (condition)
{
statements
}
do-while statement:
do {
statements
} while (condition)
for statement:
for(variable assignment; condition; iteration process)
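The statements summarized above can be combined in a single BEGIN block. The following is only a sketch with arbitrary numbers, meant to show the syntax of each construct rather than a realistic task.

```shell
result=$(awk 'BEGIN {
    sum = 0
    for (i = 1; i <= 5; i++) sum += i        # for: adds 1 through 5
    n = 3; fact = 1
    while (n > 1) { fact *= n; n-- }         # while: computes 3 factorial
    do { tries++ } while (tries < 2)         # do-while: body runs at least once
    label = (sum > 10) ? "big" : "small"     # ternary operator
    if (fact == 6) { print sum, fact, tries, label } else { print "unexpected" }
}')
echo "$result"
```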
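Example
seq 10 | awk 'i=0{print $0}'
i=0 evaluates to false, so nothing is printed.
seq 10 | awk 'i=1{print $0}'
i=1 evaluates to true, so every line is printed. The braces make no difference here, because printing the line is the default action.
seq 10 | awk 'i=!i{print i, $0}'
At the beginning i is unassigned, so !i is true (namely 1) and the line is printed; on the next line it becomes false (0) and nothing is printed. Only the odd lines are printed.
seq 10 | awk '!(i=!i){print i, $0}'
Same as above, but the even lines are printed.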
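The toggling behaviour can be checked on a shorter input. This sketch feeds four lines through the same two patterns; printf is used instead of seq only to keep the input explicit.

```shell
# i flips between 1 and 0 on every line; the pattern is true on lines 1 and 3.
odd=$(printf '%s\n' 1 2 3 4 | awk 'i=!i { print i, $0 }')
# Negating the assignment makes the pattern true on lines 2 and 4 instead.
even=$(printf '%s\n' 1 2 3 4 | awk '!(i=!i) { print i, $0 }')
echo "$odd"
echo "$even"
```

The first column shows the value of i at the moment the line is printed, which is why it is 1 for the odd lines and 0 for the even ones.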
Get the disk utilization and display it
df -h | awk -F "[[:space:]]+|%" '/^\/dev\/sd/ {if ($5 > 10) print $1, $5}'
awk '/^[[:space:]]*linux16/ {i=1; while (i<=NF) {print $i, length($i); i++}}' /boot/grub2/grub.cfg
for
for (variable assignment; condition; iteration process)
{for-body}
awk 'BEGIN {wkd["mo"]="monday"; wkd["fr"]="friday"; wkd["sat"]="saturday"; for (i in wkd) {print i, wkd[i]}}'
awk 'BEGIN {sum=0; for (i=1;i<=100;i++) {sum+=i} print sum}'
next:
Ends the processing of the current line early and moves directly on to the next line (the next iteration of awk's own implicit loop).
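A typical use of next is to discard lines early so that later rules never see them. A minimal sketch, assuming a made-up input with a comment line and an empty line:

```shell
kept=$(printf '%s\n' '# comment' 'alpha' '' 'beta' |
  awk '/^#/ { next }        # skip comment lines
       $0 == "" { next }    # skip empty lines
       { print NR, $0 }')   # only reached by the remaining lines
echo "$kept"
```

NR still counts the skipped lines, so the surviving lines keep their original line numbers.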
Array
array[index-expression]
index-expression:
(1) Any string can be used as an index; string literals should be enclosed in double quotes.
(2) If an array element does not exist yet, awk creates it automatically when it is referenced and initializes its value to the empty string.
(3) To test whether an element exists in the array, use the form "index in array"; to traverse the array, use for (index in array).
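The three points above can be observed in one short run. The state names below are sample input chosen for this sketch, not output from a real netstat.

```shell
summary=$(printf '%s\n' LISTEN ESTABLISHED LISTEN |
  awk '{ state[$1]++ }                       # string index, created on first use
       END {
           if ("LISTEN" in state) print "LISTEN", state["LISTEN"]
           if (!("CLOSED" in state)) print "CLOSED absent"
           n = 0; for (s in state) n++       # the "in" tests did not add elements
           print "distinct", n
           x = state["TIME_WAIT"]            # a plain reference creates the element
           created = ("TIME_WAIT" in state)
           print "created", created
       }')
echo "$summary"
```

Testing with "index in array" leaves the array untouched, whereas merely reading state["TIME_WAIT"] is enough to create it, which is why the membership test is the safer way to check for existence.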
Count the connections in each TCP state
netstat -tan | awk '/^tcp/ {state[$NF]++} END {for (i in state) {print i, state[i]}}'
Take the top ten IPs from access.log and add them to the firewall
awk '{ip[$1]++} END {for (i in ip) {print i, "connections:", ip[i]}}' access_log | sort -nr -k 3 | head
Add to the iptables firewall
iptables -A INPUT -s IP -j REJECT
Top ten IPs connected to the local host
awk '{split($5,ip,":"); count[ip[1]]++; print ip[1], "links", count[ip[1]]}' ss.log | sort -nr -k 3 | head
awk -F "[[:space:]]+|:" '{ip[$6]++} END {for (i in ip) {print "summary", i, "links", ip[i]}}' ss.log | sort -nr -k4
Take the IPs in the log, using only the lines that start with a digit
awk '/^[0-9]/ {ip[$1]++} END {for (i in ip) print i, ip[i]}' access_log
Add the IPs with more than 100 connections to the firewall
while true; do
awk '/^[0-9]/ {ip[$1]++} END {for (i in ip) {if (ip[i]>100) print i}}' access_log | while read line; do iptables -A INPUT -s "$line" -j REJECT; done
sleep 10
done
Get random number
awk 'BEGIN {srand(); for (i=1;i<=10;i++) {print rand()}}'
String operation
length([s]): returns the length of the specified string
sub(r,s,[t]): searches the string t for content that matches the pattern r and replaces the first match with s
gsub(r,s,[t]): like sub, but replaces every match
echo "2008:08:08 08:08:08" | awk 'gsub(/:/,"-",$0)'
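The difference between replacing the first match and replacing every match can be sketched on a shortened version of the same date string:

```shell
# sub replaces only the first colon.
first_only=$(echo "2008:08:08" | awk '{ sub(/:/, "-"); print }')
# gsub replaces every colon and returns the number of replacements.
all=$(echo "2008:08:08" | awk '{ n = gsub(/:/, "-"); print n, $0 }')
# length returns the number of characters in its argument.
len=$(awk 'BEGIN { print length("awk") }')
echo "$first_only"
echo "$all"
echo "$len"
```

When the third argument is omitted, both functions operate on $0.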
Homework:
1 blog.magedu.com
2 www.magedu.com
3 hhhh.magedu.com
4 dddd.magedu.com
5 b333.magedu.com
6 bkkk.magedu.com
7 ssss.magedu.com
8 wog.magedu.com
9 ulog.magedu.com
Take the host name
awk -F "[ .]" '{print $2}' soho.txt
Once the delimiters (space and dot) are settled, the host name is the second field.
Count the occurrences of each file system type in fstab
awk '/^UUID/ {fs[$3]++} END {for (i in fs) {print i, fs[i]}}' fstab
Count the word occurrences in fstab
grep -wEo "[[:alpha:]]+" fstab | awk '{word[$1]++} END {for (i in word) {print i, word[i]}}'
Extract numbers
echo "[email protected]%9&Bdh7dq+YVixp3vpw" | awk 'gsub(/[^[:digit:]]/,"",$0)'
Generate random numbers
awk 'BEGIN {srand(); for (i=1;i<=200;i++) {if (i==200) {printf "%d", int(rand()*100)} else {printf "%d,", int(rand()*100)}}}'
Find the maximum and minimum of the random numbers above
awk -F "," '{MAX=$1; MIN=$1; for (i=1;i<=NF;i++) {if ($i>=MAX) {MAX=$i}; if ($i<=MIN) {MIN=$i}}} END {print "MAX=",MAX,"MIN=",MIN}' soho.txt
http://mail.magedu.com/index.html
http://www.magedu.com/test.html
http://study.magedu.com/index.html
http://blog.magedu.com/index.html
http://www.magedu.com/images/logo.jpg
Take the fully qualified domain name
Determine the separator, select the field, then count and print
awk -F"/" '{FQ[$3]++} END {for (i in FQ) print i, FQ[i]}' soho.txt | sort -rn -k 2
Example:
inode|beginnumber|endnumber|counts|
106|3363120000|3363129999|10000|
106|3368560000|3368579999|20000|
310|3337000000|3337000100|101|
310|3342950000|3342959999|10000|
310|3362120960|3362120961|2|
311|3313460102|3313469999|9898|
311|3313470000|3313499999|30000|
311|3362120962|3362120963|2|
Output format
310|3337000000|3362120961|10103|
311|3313460102|3362120963|39900|
106|3363120000|3368579999|30000|
awk -F'|' -v OFS='|' '/^[0-9]/ {inode[$1]++; if (!bn[$1]) {bn[$1]=$2} else if (bn[$1]>$2) {bn[$1]=$2}; if (en[$1]<$3) en[$1]=$3; cnt[$1]+=$(NF-1)} END {for (i in inode) print i, bn[i], en[i], cnt[i]}' soho.txt
Use the awk command to calculate the total size of the files in a directory
find . -maxdepth 1 -type f -ls | awk '{sum+=$7} END {print sum}'
Count the ten IPs with the most connections to the local host
netstat -an | head | awk -F "[[:space:]]+|:" 'NR>2 {print $6}'
netstat -an | head | awk -F "[[:space:]]+|:" 'NR>2 {ip[$6]++} END {for (i in ip) print i, ip[i]}' | sort -nr -k 2 | head