linux-awk-3 - AWK, linux

awk

Basic syntax
awk -Fs '/pattern/ {action}' input-file
(or)
awk -Fs '{action}' input-file

-F sets the field separator; the s that follows it stands for the separator character. If -F is not specified, fields are separated by runs of spaces and tabs by default.
/pattern/ and {action} need to be enclosed together in single quotes.
/pattern/ is optional. If it is not specified, awk processes every record in the input file. If you specify a pattern, awk only processes the records that match it.
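
A minimal sketch of the syntax described above. The three passwd-style lines are invented sample data, and the shell variable names (sample, with_pattern, without_pattern) are only illustrative.

```shell
# Invented sample input; a real run would read a file such as /etc/passwd.
sample='root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
rootkit:x:1001:1001::/home/rootkit:/bin/sh'

# With a pattern: only records that start with "root" are processed.
with_pattern=$(printf '%s\n' "$sample" | awk -F: '/^root/ {print $1}')
echo "$with_pattern"

# Without a pattern: every record is processed.
without_pattern=$(printf '%s\n' "$sample" | awk -F: '{print $1}')
echo "$without_pattern"
```

The first command should print root and rootkit; the second prints all three user names.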

Awk program structure (BEGIN, body, and END areas)

BEGIN area
BEGIN area syntax:
BEGIN {awk-commands}
The commands in the BEGIN area are executed only once, at the start, before awk executes the body area commands.
The BEGIN area is well suited to printing header information and initializing variables.
The BEGIN area can contain one or more awk commands.
The keyword BEGIN must be in upper case.
The BEGIN area is optional.

BODY area
/pattern/ {action}
Awk reads the input one line at a time and executes the body area once for each line.

END area
END {awk-commands}
The END area is executed only once, after all input has been read.

awk -F ":" '/^root/ {print}' passwd
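
To see the three areas working together, here is a small sketch that counts input lines. The input (a, b, c) and the variable names count and report are made up for the illustration.

```shell
# BEGIN and END each run once; the body runs once per input line.
report=$(printf 'a\nb\nc\n' | awk '
BEGIN { count = 0; print "header" }   # runs before any input is read
      { count++; print "line:", $0 }  # runs for every input line
END   { print "total:", count }       # runs after the last line
')
echo "$report"
```

The output starts with header, lists the three lines, and ends with total: 3.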

Built-in variables
awk 'BEGIN {FS=","} {print $2,$3}' employee.txt
awk 'BEGIN {print "test1","test2"}'
If the items given to print are not separated by commas, awk concatenates them without using OFS, so there is no space between them in the output.
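
The effect of the comma can be checked with a short sketch. OFS is set to "-" here only to make the separator visible; the variable names are illustrative.

```shell
# With a comma, print joins the items with OFS.
with_comma=$(awk 'BEGIN {OFS="-"; print "test1", "test2"}')
# Without a comma the two strings are concatenated and OFS is not used.
without_comma=$(awk 'BEGIN {OFS="-"; print "test1" "test2"}')
echo "$with_comma"
echo "$without_comma"
```

This prints test1-test2 followed by test1test2.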

$ gawk 'BEGIN {print "Hello World!"} {print $0} END {print "byebye"}' data1
Built-in variables
$0 the whole record
$1 the first data field in the record
$2 the second data field in the record
$n the nth data field in the record
FIELDWIDTHS a space-separated list of numbers that defines the exact width of each field (gawk)
FS input field separator
RS input record separator
OFS output field separator
ORS output record separator

ARGC number of command-line arguments
ARGIND index in ARGV of the file currently being processed (gawk)
ARGV array containing the command-line arguments
CONVFMT conversion format for numbers (see the printf statement), default %.6g
ENVIRON associative array of the current shell environment variables and their values
ERRNO description of the system error that occurred while reading or closing an input file (gawk)
FILENAME name of the data file currently being read
FNR record number within the current data file
IGNORECASE when set to a non-zero value, string and regular-expression matching ignores case (gawk)
NF total number of fields in the current record
NR number of input records processed so far
OFMT output format for numbers, default %.6g
RLENGTH length of the substring matched by the match function
RSTART starting position of the substring matched by the match function
 
Example:
Number of command-line arguments
awk 'BEGIN {print ARGC}' /etc/fstab /etc/inittab
Command-line arguments
awk 'BEGIN {print ARGV[0]}' /etc/fstab /etc/inittab

awk '{print FILENAME, "record number is", NR, "FNR is", FNR}' awk passwd
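
The difference between NR and FNR only shows up with more than one input file. The sketch below uses two throwaway files, first.txt and second.txt, created in a temporary directory; the file names and contents are assumptions made for the example.

```shell
tmpdir=$(mktemp -d)
printf 'a\nb\n' > "$tmpdir/first.txt"
printf 'c\n'    > "$tmpdir/second.txt"

# NR keeps counting across files; FNR restarts at 1 for each file.
nr_fnr=$(cd "$tmpdir" && awk '{print FILENAME, NR, FNR}' first.txt second.txt)
echo "$nr_fnr"

rm -r "$tmpdir"
```

The last line of output is second.txt 3 1: the third record overall, but the first record of its own file.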

Variables

Awk variable names start with a letter or an underscore; the following characters can be digits, letters, or underscores. Keywords cannot be used as variable names.
Awk variables can be used directly without prior declaration. If you want to initialize a variable, it is best to do so in the BEGIN area, which is executed only once.

Custom variables
Define them with the -v option on the command line, or assign them directly inside the program.
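
A short sketch of both ways of defining a variable, plus what an unassigned variable looks like. The names greeting and nothing are invented for the example.

```shell
# -v assigns the variable before the program starts, so BEGIN can already see it.
from_option=$(awk -v greeting="hello" 'BEGIN {print greeting}')
# A variable can also be assigned inside the program itself.
from_program=$(awk 'BEGIN {greeting = "hello"; print greeting}')
# An unassigned variable is usable right away: empty as a string, 0 as a number.
undeclared=$(awk 'BEGIN {print "[" nothing "]", nothing + 0}')
echo "$from_option"; echo "$from_program"; echo "$undeclared"
```

The first two commands print hello; the third prints [] 0.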

printf formatted output
Formatted output: printf "FORMAT", item1, item2, ...
(1) FORMAT must be specified.
(2) printf does not add a line break automatically; the newline character \n has to be given explicitly.
(3) FORMAT must contain a format specifier for each of the items that follow it.
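
The three rules can be tried with a small sketch. The two name:uid input lines and the chosen widths (%-10s, %03d) are assumptions picked for illustration.

```shell
# One format specifier per item: %-10s for the name, %03d for the uid.
formatted=$(printf 'root:0\ndaemon:1\n' | awk -F: '{printf "%-10s uid=%03d\n", $1, $2}')
echo "$formatted"

# printf adds no newline by itself, so the two calls end up on the same line.
joined=$(awk 'BEGIN {printf "a"; printf "b\n"}')
echo "$joined"
```

The names are left-aligned in a 10-character column and the uids are zero-padded to three digits; the second command prints ab on a single line.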
 
 

Unary operators
Operator description
+ Unary plus; returns the number itself
- Negation
++ Increment
-- Decrement
 
Arithmetic operators

Operator description
+ Addition
- Subtraction
* Multiplication
/ Division
% Modulo (remainder)
 
awk 'NR%2 == 0 {print NR,$0}' passwd
 
 String operators
 
 Assignment operator
 Operator description
 =
 +=
 -=
 *=
 /=
 %=
Comparison operators
 
 >
 >=
 <
 <=
 ==
 !=
 && and 
 || or
 
 
Regular expression

Operator description
~ Matches
!~ Does not match

awk -F: '$1 ~ "ro"' passwd (prints the lines whose first field contains ro)

$ awk 'BEGIN {FS=":"; print "begin test"} {print $1} END {print "it is end"}' passwd

Match operator
$1 ~ /^data/

gawk -F: '$4 == 0 {print $1}' /etc/passwd

Line range
awk -F: '/^root\>/,/^nobody\>/ {print $1}' /etc/passwd
awk -F: '(NR>=10 && NR<=20) {print NR,$1}' /etc/passwd (the parentheses are optional)
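
Both ways of selecting a range can be compared on a tiny input. The five words below are invented sample lines; the sketch uses plain ^two and ^four rather than the gawk-only \> word boundary so that it runs on any awk.

```shell
lines=$(printf 'one\ntwo\nthree\nfour\nfive\n')

# Pattern range: from the first line matching /^two/ through the next line matching /^four/.
by_pattern=$(printf '%s\n' "$lines" | awk '/^two/,/^four/ {print NR, $0}')
# Record-number range: both comparisons must name NR explicitly.
by_number=$(printf '%s\n' "$lines" | awk 'NR>=2 && NR<=4 {print NR, $0}')

echo "$by_pattern"
echo "$by_number"
```

Both commands select lines 2 to 4 of this input.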
 


awk structured commands
if
Single statement
if (conditional-expression) {statements; ...}

Multiple statements
if (conditional-expression)
{
    action1; # executed in order
    action2;
}

if else
if (conditional-expression)
    action1
else
    action2

if (condition) {statements; ...} else {statements; ...}

Ternary operator
conditional-expression ? action1 : action2;
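
As a sketch of if-else next to the ternary operator, the following labels two invented numbers (5 and 16). The variable names label and parity and the threshold of 10 are assumptions for the example.

```shell
graded=$(printf '5\n16\n' | awk '{
    if ($1 > 10) { label = "big" } else { label = "small" }   # if-else form
    parity = ($1 % 2 == 0) ? "even" : "odd"                   # ternary form
    print $1, label, parity
}')
echo "$graded"
```

This prints 5 small odd and 16 big even.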
while
while (condition)
{
    actions
}

while (condition) {statements; ...}

do-while
do
{
    action
}
while (condition)

for
for (initialization; condition; increment/decrement)
for (expr1; expr2; expr3) {statements; ...}

 if-then-else statement:
 if (condition) statement1; else statement2
 while statement:
 while (condition)
 {
 statements
} 
 do-while statement:
 do {
 statements
} while (condition)
 for statement:
 for(variable assignment; condition; iteration process) 
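
The three loop forms can be compared by computing the same sum, 1 to 5, with each of them. This is only a sketch; the accumulator names w, d, and f are arbitrary.

```shell
loop_sums=$(awk 'BEGIN {
    i = 1; w = 0
    while (i <= 5) { w += i; i++ }          # while: condition is tested first
    i = 1; d = 0
    do { d += i; i++ } while (i <= 5)       # do-while: body runs at least once
    f = 0
    for (i = 1; i <= 5; i++) { f += i }     # for: init; condition; step
    print w, d, f
}')
echo "$loop_sums"
```

All three loops reach the same result, so the output is 15 15 15.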
Example
seq 10 | awk 'i=0 {print $0}' (i=0 is false, so nothing is printed)
seq 10 | awk 'i=1 {print $0}' (i=1 is true, so every line is printed; printing is decided by the pattern, not by the braces)
seq 10 | awk 'i=!i {print i, $0}' (i is unset at first, so !i is true, i.e. 1, and the line is printed; on the next line it becomes false, i.e. 0, and the line is skipped; only odd lines are printed)
seq 10 | awk '!(i=!i) {print i, $0}' (the same toggle, negated; only even lines are printed)
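
The toggle trick can be checked on six numbered lines. The sketch leaves out the action block, relying on awk's default action of printing the line when the pattern is true.

```shell
# i flips between 1 and 0 on every line, so the pattern is true on lines 1, 3, 5.
odd_lines=$(printf '%s\n' 1 2 3 4 5 6 | awk 'i=!i')
# Negating the same assignment selects the other half: lines 2, 4, 6.
even_lines=$(printf '%s\n' 1 2 3 4 5 6 | awk '!(i=!i)')
echo "$odd_lines"
echo "$even_lines"
```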
 
Show the disk utilization of the /dev/sd* file systems that are more than 10% full
df -h | awk -F "[[:space:]]+|%" '/^\/dev\/sd/ {if ($5 > 10) print $1, $5}'

awk '/^[[:space:]]*linux16/ {i=1; while (i<=NF) {print $i, length($i); i++}}' /boot/grub2/grub.cfg
 
for

for (variable assignment; condition; iteration step)
{for-body}

awk 'BEGIN {wkd["mo"]="monday"; wkd["fr"]="friday"; wkd["sat"]="saturday"; for (i in wkd) {print i, wkd[i]}}'


awk 'BEGIN {sum=0; for (i=1; i<=100; i++) {sum+=i} print sum}'
 
 
next:
Stops processing the current line early and moves straight on to the next input line (the next iteration of awk's own implicit loop over the input).
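
A common use of next is to skip lines before the main rule sees them. In this sketch the four input lines (a comment, two words, and an empty line) are invented.

```shell
kept=$(printf '# comment\nalpha\n\nbeta\n' | awk '
/^#/  { next }        # skip comment lines; the rules below never see them
/^$/  { next }        # skip empty lines
      { print NR, $0 }
')
echo "$kept"
```

Only records 2 and 4 reach the final rule, so the output is 2 alpha and 4 beta.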
 
 
Array

array[index-expression]
index-expression:
(1) Any string can be used as an index; a string literal must be enclosed in double quotes.
(2) If an array element does not exist yet, awk creates it automatically when it is referenced and initializes its value to the empty string.
(3) To test whether an element exists in the array, use the form "index in array"; to traverse the array, use for (index in array).
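
Points (2) and (3) interact: a membership test leaves the array untouched, while a plain reference creates the element. A minimal sketch, with the array wkd and the variables before, value, and after chosen for illustration:

```shell
array_demo=$(awk 'BEGIN {
    wkd["mo"] = "monday"
    before = ("fr" in wkd)      # membership test; does not create wkd["fr"]
    value = wkd["fr"]           # a plain reference creates the element as ""
    after = ("fr" in wkd)
    print before, after, "[" value "]"
}')
echo "$array_demo"
```

The output is 0 1 []: the element is absent before the reference, present after it, and its value is the empty string.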

Count the TCP connections in each state
netstat -tan | awk '/^tcp/ {state[$NF]++} END {for (i in state) {print i, state[i]}}'

Take the top ten IPs from access_log and add them to the firewall

awk '{ip[$1]++} END {for (i in ip) {print i, "connections", ip[i]}}' access_log | sort -nr -k 3 | head

Add an IP to the iptables firewall

iptables -A INPUT -s IP -j REJECT

Top ten IPs connected to the local host

awk '{split($5,ip,":"); count[ip[1]]++; print ip[1], "links", count[ip[1]]}' ss.log | sort -nr -k 3 | head

awk -F "[[:space:]]+|:" '{ip[$6]++} END {for (i in ip) {print "summary", i, "links", ip[i]}}' ss.log | sort -nr -k4

Take the IPs from the log (lines that start with a digit)
awk '/^[0-9]/ {ip[$1]++} END {for (i in ip) print i, ip[i]}' access_log

Add IPs with more than 100 connections to the firewall
while true; do
    awk '/^[0-9]/ {ip[$1]++} END {for (i in ip) {if (ip[i] > 100) print i}}' access_log | while read -r line; do
        echo "$line"
        iptables -A INPUT -s "$line" -j REJECT
    done
    sleep 10
done

Get random numbers
awk 'BEGIN {srand(); for (i=1; i<=10; i++) {print rand()}}'

String operation
length([s]): returns the length of the string s, or of $0 if s is omitted
sub(r,s,[t]): searches the string t for the pattern r and replaces the first match with s; t defaults to $0
gsub(r,s,[t]): like sub, but replaces every match
echo "2008:08:08 08:08:08" | awk 'gsub(/:/,"-",$0)'
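
The difference between replacing the first match and replacing every match can be seen in a short sketch. The input string is a shortened version of the date above, and the variable names are illustrative.

```shell
# sub replaces only the first match in $0.
first_only=$(echo "2008:08:08" | awk '{sub(/:/, "-"); print}')
# gsub replaces every match and returns the number of replacements made.
all_matches=$(echo "2008:08:08" | awk '{n = gsub(/:/, "-"); print n, $0}')
# length returns the number of characters in its argument.
str_length=$(awk 'BEGIN {print length("awk")}')
echo "$first_only"; echo "$all_matches"; echo "$str_length"
```

This prints 2008-08:08, then 2 2008-08-08, then 3. Because gsub returns the replacement count, using it alone as a pattern prints a line only when at least one replacement was made.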

Homework:

1 blog.magedu.com
2 www.magedu.com
3 hhhh.magedu.com
4 dddd.magedu.com
5 b333.magedu.com
6 bkkk.magedu.com
7 ssss.magedu.com
8 wog.magedu.com
9 ulog.magedu.com
Take the host name: decide on the delimiter first, then select the field
awk -F "[ .]" '{print $2}' soho.txt

Count the occurrences of each file system type in fstab
awk '/^UUID/ {fs[$3]++} END {for (i in fs) {print i, fs[i]}}' fstab

Count word occurrences in fstab
grep -wEo "[[:alpha:]]+" fstab | awk '{word[$1]++} END {for (i in word) {print i, word[i]}}'

Extract numbers
echo "[email protected]%9&Bdh7dq+YVixp3vpw" | awk 'gsub(/[^[:digit:]]/, "", $0)'

Generate random numbers
awk 'BEGIN {srand(); for (i=1; i<=200; i++) {if (i==200) {printf "%d", int(rand()*100)} else {printf "%d,", int(rand()*100)}}}'

Find the maximum and minimum of the random numbers above
awk -F "," '{MAX=$1; MIN=$1; for (i=1; i<=NF; i++) {if ($i >= MAX) {MAX=$i}; if ($i <= MIN) {MIN=$i}}} END {print "MAX=", MAX, "MIN=", MIN}' soho.txt

http://mail.magedu.com/index.html
http://www.magedu.com/test.html
http://study.magedu.com/index.html
http://blog.magedu.com/index.html
http://www.magedu.com/images/logo.jpg

Take the fully qualified domain names: decide on the separator, select the field, count, and print
awk -F"/" '{FQ[$3]++} END {for (i in FQ) print i, FQ[i]}' soho.txt | sort -rn -k 2

Example:

inode|beginnumber|endnumber|counts|
106|3363120000|3363129999|10000|
106|3368560000|3368579999|20000|
310|3337000000|3337000100|101|
310|3342950000|3342959999|10000|
310|3362120960|3362120961|2|
311|3313460102|3313469999|9898|
311|3313470000|3313499999|30000|
311|3362120962|3362120963|2|

Output format
310|3337000000|3362120961|10103|
311|3313460102|3362120963|39900|
106|3363120000|3368579999|30000|

awk -F'|' -v OFS='|' '/^[0-9]/ {inode[$1]++; if (!bn[$1]) {bn[$1]=$2} else if (bn[$1]>$2) {bn[$1]=$2}; if (en[$1]<$3) en[$1]=$3; cnt[$1]+=$(NF-1)} END {for (i in inode) print i, bn[i], en[i], cnt[i], ""}' soho.txt
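
The aggregation can be tried without a soho.txt file. The sketch below embeds the sample rows from this example, tracks the minimum begin number, maximum end number, and total count per inode, and pipes the result through sort, because the order of for (i in array) is unspecified. Testing membership with "in" and appending an empty last field to produce the trailing | are choices made for this sketch, not the only way to write it.

```shell
merged=$(printf '%s\n' \
  'inode|beginnumber|endnumber|counts|' \
  '106|3363120000|3363129999|10000|' \
  '106|3368560000|3368579999|20000|' \
  '310|3337000000|3337000100|101|' \
  '310|3342950000|3342959999|10000|' \
  '310|3362120960|3362120961|2|' \
  '311|3313460102|3313469999|9898|' \
  '311|3313470000|3313499999|30000|' \
  '311|3362120962|3362120963|2|' |
awk -F'|' -v OFS='|' '
/^[0-9]/ {
    if (!($1 in bn) || $2 < bn[$1]) bn[$1] = $2   # smallest begin number per inode
    if (!($1 in en) || $3 > en[$1]) en[$1] = $3   # largest end number per inode
    cnt[$1] += $4                                 # running total of counts
}
END { for (i in cnt) print i, bn[i], en[i], cnt[i], "" }
' | sort)
echo "$merged"
```

Apart from the row order, the result matches the output format listed above.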

Use awk to calculate the total size of the files in a directory
find . -maxdepth 1 -type f -ls | awk '{sum+=$7} END {print sum}'

Count the 10 IPs with the most connections to the local host

netstat -an | head | awk -F "[[:space:]]+|:" 'NR>2 {print $6}'

netstat -an | head | awk -F "[[:space:]]+|:" 'NR>2 {ip[$6]++} END {for (i in ip) print i, ip[i]}' | sort -nr -k 2 | head
