linux-awk-3

awk

Basic syntax
awk -Fs '/pattern/ {action}' input-file
(or)
awk -Fs '{action}' input-file

-F sets the field separator (s in the syntax above). If it is not specified, awk splits fields on runs of whitespace (spaces and tabs) by default.
/pattern/ and {action} need to be enclosed in single quotes.
/pattern/ is optional. If it is omitted, awk processes every record in the input file. If you specify a pattern, awk only processes the records that match it.
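A minimal sketch of the two forms above. The three colon-separated records are invented for illustration and are piped in on standard input rather than read from a real input-file.

```shell
# Invented sample records; a real run would name an input file instead.
data='root:x:0
daemon:x:1
rootkit:x:99'

# With a pattern: only records that start with "root" are processed.
with_pattern=$(printf '%s\n' "$data" | awk -F: '/^root/ {print $1}')

# Without a pattern: every record is processed.
without_pattern=$(printf '%s\n' "$data" | awk -F: '{print $3}')

echo "$with_pattern"
echo "$without_pattern"
```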

Awk program structure (BEGIN, body, and END areas)

BEGIN area
BEGIN area syntax:
BEGIN {awk-commands}
The commands in the BEGIN area are executed only once, at the start, before awk runs the body area commands.
The BEGIN area is well suited to printing header information and initializing variables.
The BEGIN area can contain one or more awk commands.
The keyword BEGIN must be written in capital letters.
The BEGIN area is optional.

BODY area
/pattern/ {action}
The body is executed once for every input line: awk reads one line, runs the body on it, then reads the next.

END area
END {awk-commands}
The commands in the END area are executed only once, after all input has been read.
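The three areas can be seen together in a small sketch. The three input lines and the variable name n are made up for the example.

```shell
# Three invented input lines stand in for a real file.
report=$(printf 'a\nb\nc\n' | awk '
BEGIN { print "start"; n = 0 }   # runs once, before any input is read
      { n++ }                    # runs once per input line
END   { print "lines:", n }      # runs once, after the last line
')
echo "$report"
```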

awk -F ":" '/^root/{print}' passwd

Built-in variables
awk 'BEGIN {FS=","} {print $2,$3}' employee.txt
awk 'BEGIN {print "test1","test2"}'
If you do not put a comma between the items, awk does not insert OFS, so the values are printed with nothing between them.
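A short sketch of the difference, with OFS set to "-" only to make the separator visible:

```shell
# A comma between print items inserts OFS; items written side by side are concatenated.
with_comma=$(awk 'BEGIN { OFS = "-"; print "test1", "test2" }')
no_comma=$(awk 'BEGIN { OFS = "-"; print "test1" "test2" }')

echo "$with_comma"
echo "$no_comma"
```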

$ gawk 'BEGIN {print "Hello World!"} {print $0} END {print "byebye"}' data1
Built-in variables
$0 the whole record
$1 the first data field in the record
$2 the second data field in the record
$n the nth data field in the record
FIELDWIDTHS a space-separated list of numbers that defines the exact width of each field
FS input field separator
RS input record separator
OFS output field separator
ORS output record separator

ARGC number of command-line arguments
ARGIND index in ARGV of the file currently being processed
ARGV array containing the command-line arguments
CONVFMT conversion format for numbers (see the printf statement); the default value is %.6g
ENVIRON associative array of the current shell environment variables and their values
ERRNO the system error reported when reading or closing an input file fails
FILENAME name of the data file gawk is currently reading
FNR record number within the current data file
IGNORECASE when set to non-zero, gawk ignores character case in regular expression and string matching
NF number of fields in the current record
NR number of input records processed so far
OFMT output format for numbers; the default value is %.6g
RLENGTH length of the substring matched by the match function
RSTART starting position of the substring matched by the match function

Example:
Number of command line parameters
awk '{print ARGC}' /etc/fstab /etc/inittab
Command line parameters
awk 'BEGIN {print ARGV[0]}' /etc/fstab /etc/inittab

awk '{print FILENAME, "record number is", NR, "FNR is", FNR}' passwd
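The difference between NR and FNR only shows up with more than one input file. The sketch below uses two throwaway files (hypothetical names one.txt and two.txt in a temporary directory) to make it visible, together with NF.

```shell
# Two throwaway files with invented contents.
dir=$(mktemp -d)
printf 'a b\nc d e\n' > "$dir/one.txt"
printf 'f\n' > "$dir/two.txt"

# NR keeps counting across files, FNR restarts for each file, NF is per record.
counts=$(cd "$dir" && awk '{ print FILENAME, NR, FNR, NF }' one.txt two.txt)
echo "$counts"

rm -r "$dir"
```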

Variables

Awk variable names start with a letter or underscore; the following characters can be digits, letters, or underscores. Keywords cannot be used as variable names.
Awk variables can be used directly without being declared first. If you want to initialize a variable, it is best to do so in the BEGIN area, which is executed only once.

Custom variables
Defined with the -v option on the command line, or assigned directly in the program
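Both ways of defining a custom variable, as a small sketch; the variable name and its value are arbitrary.

```shell
# -v assigns the variable before the program starts, so BEGIN can already see it.
from_option=$(awk -v name="awk" 'BEGIN { print "hello", name }')

# A variable can also simply be assigned inside the program.
from_program=$(awk 'BEGIN { name = "awk"; print "hello", name }')

echo "$from_option"
echo "$from_program"
```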

printf formatted output
Formatted output: printf "FORMAT", item1, item2, ...
(1) FORMAT must be specified
(2) There is no automatic line break; the newline control character \n must be given explicitly
(3) FORMAT must contain a format specifier for each of the items that follow
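The three rules can be illustrated with a short sketch; the item values are invented.

```shell
# One format specifier per item: %s for the string, %03d for the integer, %.2f for the float.
line=$(awk 'BEGIN { printf "%s has %03d items costing %.2f\n", "cart", 7, 3.5 }')

# printf adds no newline by itself, so two calls end up on the same line.
joined=$(awk 'BEGIN { printf "a"; printf "b\n" }')

echo "$line"
echo "$joined"
```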



Unary operators
Operator description
+ Unary plus, returns the number itself
- Negation
++ Increment
-- Decrement

Arithmetic operators

Operator description
+
-
*
/
%

awk 'NR%2 == 0 {print NR,$0}' passwd

String operators

Assignment operator
Operator description
=
+=
-=
*=
/=
%=
Comparison operators

>
>=
<
<=
==
!=
&& and
|| or


Regular expression

Operator description
~ Matches
!~ Does not match

awk -F: '$1 ~ "ro"' passwd    # the first field contains "ro"

$ awk 'BEGIN {FS=":"; print "begin test"} {print $1} END {print "it is end"}' passwd

Match operator
$1 ~ /^data/
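A sketch of ~ and !~ applied to a single field. The comma-separated records are invented.

```shell
# Invented records: a name and a value per line.
data='data1,10
other,20
data2,30'

# ~ keeps records whose first field matches the regular expression.
matched=$(printf '%s\n' "$data" | awk -F, '$1 ~ /^data/ { print $2 }')

# !~ keeps the records whose first field does not match.
unmatched=$(printf '%s\n' "$data" | awk -F, '$1 !~ /^data/ { print $2 }')

echo "$matched"
echo "$unmatched"
```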

gawk -F: '$4 == 0 {print $1}' /etc/passwd

Line range
awk -F: '/^root\>/,/^nobody\>/ {print $1}' /etc/passwd
awk -F: '(NR>=10 && NR<=20) {print NR,$1}' /etc/passwd    # the parentheses are optional
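Both kinds of line range can be tried without touching /etc/passwd. This sketch uses seq to generate the numbers 1 to 10 as stand-in input.

```shell
# Pattern range: from the first line matching /^3$/ through the next line matching /^5$/.
by_pattern=$(seq 10 | awk '/^3$/,/^5$/')

# Record-number range: lines 8 and 9 only.
by_number=$(seq 10 | awk 'NR>=8 && NR<=9 {print NR, $0}')

echo "$by_pattern"
echo "$by_number"
```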



awk structured commands
if
single statement
if (conditional-expression) {statements; ...}

multiple statements
if (conditional-expression)
{
action1; #Execute in turn
action2;
}

if else
if (conditional-expression)
action1
else
action2

if(condition) {statements;…} else {statements;…}


Ternary operator
conditional-expression ? action1 : action2;

while
while (condition)
{

Actions

}


while (condition) {statements;…}

do-while

do
{
action
}
while (condition)

for

for(initialization;condition;increment/decrement)
for(expr1;expr2;expr3) {statements;…}

if-then-else statement:
if (condition) statement1; else statement2
while statement:
while (condition)
{
statements
}
do-while statement:
do {
statements
} while (condition)
for statement:
for(variable assignment; condition; iteration process)
Example
seq 10 | awk 'i=0 {print $0}'    # i=0 evaluates to 0 (false), so nothing is printed
seq 10 | awk 'i=1 {print $0}'    # i=1 evaluates to 1 (true), so every line is printed, with or without the braces
seq 10 | awk 'i=!i {print i, $0}'    # i starts out unset, so !i is true (1) and the line is printed; on the next line it is false (0) and nothing is printed, so only odd lines are printed
seq 10 | awk '!(i=!i) {print i, $0}'    # the same toggle negated, so only even lines are printed
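The toggle can be checked on a shorter input. This sketch drops the action block, so awk falls back to its default action of printing the line.

```shell
# i flips between 1 and 0 on every line; the assignment's value is the pattern.
odd=$(seq 6 | awk 'i=!i')
even=$(seq 6 | awk '!(i=!i)')

echo "$odd"
echo "$even"
```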

Extract the disk utilization and display it
df -h | awk -F "[[:space:]]+|%" '/^\/dev\/sd/ {if ($5 > 10) print $1, $5}'

awk '/^[[:space:]]*linux16/ {i=1; while (i<=NF) {print $i, length($i); i++}}' /boot/grub2/grub.cfg
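The if-else, ternary, and while forms can be combined in one small sketch. The input line and the variable names size and parity are invented for the example.

```shell
# Walk over the fields of one invented line and classify each number.
result=$(printf '5 12 7\n' | awk '{
    i = 1
    while (i <= NF) {
        if ($i > 10) size = "big"; else size = "small"     # if-else
        parity = ($i % 2 == 0) ? "even" : "odd"            # ternary operator
        print $i, size, parity
        i++
    }
}')
echo "$result"
```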

for

for(variable assignment;condition;iteration process)
{for-body }

awk 'BEGIN{wkd["mo"]="monday"; wkd["fr"]="friday"; wkd["sat"]="saturday"; for (i in wkd) {print i, wkd[i]}}'


awk 'BEGIN{sum=0; for (i=1;i<=100;i++) {sum+=i}; print sum}'


next:
Ends processing of the current line early and moves straight on to the next input line (awk's own implicit loop over the input)
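A sketch of next used to skip comment lines. The four input lines are invented.

```shell
# Lines starting with "#" hit next, so the second rule never sees them.
kept=$(printf '# comment\nalpha\n# another\nbeta\n' | awk '
/^#/ { next }          # stop here and read the next input line
     { print NR, $0 }  # only reached for non-comment lines
')
echo "$kept"
```

NR still counts the skipped lines, which is why the output is numbered 2 and 4.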


Array

array[index-expression]
index-expression:
(1) Any string can be used; string literals should be enclosed in double quotes
(2) If an array element does not already exist, awk creates it automatically when it is referenced and initializes its value to the empty string
(3) To test whether an element exists in the array, use the form "index in array"; to traverse the array, use for (var in array)
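The three rules in a minimal sketch; the array name count and its keys are arbitrary.

```shell
summary=$(awk 'BEGIN {
    count["a"]++                              # created on first reference, empty value counts as 0
    count["a"]++
    if ("a" in count) print "a seen", count["a"]
    if (!("b" in count)) print "b absent"     # the in test does not create the element
    n = 0
    for (k in count) n++                      # traverse all indexes
    print "elements:", n
}')
echo "$summary"
```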

Count connections by state
netstat -tan | awk '/^tcp/ {state[$NF]++} END{for (i in state) {print i, state[i]}}'

Take the top ten IPs from access.log and add them to the firewall

awk '{ip[$1]++} END{for (i in ip) {print i, "connections", ip[i]}}' access_log | sort -nr -k 3 | head

Add to the iptables firewall

iptables -A INPUT -s IP -j REJECT



Top ten IPs connected to the local host

awk '{split($5,ip,":"); count[ip[1]]++; print ip[1], "links", count[ip[1]]}' ss.log | sort -nr -k 3 | head

awk -F "[[:space:]]+|:" '{ip[$6]++} END{for (i in ip) {print "summary", i, "links", ip[i]}}' ss.log | sort -nr -k4


Take the IPs from the log, matching only lines that start with a digit
awk '/^[0-9]/ {ip[$1]++} END{for (i in ip) print i, ip[i]}' access_log
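The counting idiom can be tried without a real log file. The sketch below uses four invented log lines; the sort keys are added only to make the order predictable, since for (i in ip) visits the indexes in no particular order.

```shell
# Invented log lines; the last one does not start with a digit and is ignored.
log='10.0.0.1 GET /
10.0.0.2 GET /a
10.0.0.1 GET /b
- malformed line'

# Count per IP, then sort by count (descending) and by IP to break ties.
top=$(printf '%s\n' "$log" \
    | awk '/^[0-9]/ {ip[$1]++} END {for (i in ip) print i, ip[i]}' \
    | sort -k2,2nr -k1,1)
echo "$top"
```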

Add IPs with more than 100 connections to the firewall
while true; do
awk '/^[0-9]/ {ip[$1]++} END{for (i in ip) {if (ip[i]>100) print i}}' access_log | while read line; do echo "$line"; done
sleep 10
done

To actually block the addresses, replace the echo in the inner loop with:
do iptables -A INPUT -s $line -j REJECT

Get random numbers
awk 'BEGIN {srand(); for (i=1;i<=10;i++) {print rand()}}'

String operations
length([s]): returns the length of the specified string
sub(r,s,[t]): searches the string t for text matching the pattern r and replaces the first match with s
gsub(r,s,[t]): like sub, but replaces every match
echo "2008:08:08 08:08:08" | awk 'gsub(/:/,"-",$0)'
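A short sketch comparing length, sub, and gsub on invented input strings. When the third argument is omitted, sub and gsub work on $0.

```shell
# length returns the number of characters.
len=$(echo "hello" | awk '{ print length($0) }')

# sub replaces only the first match.
first_only=$(echo "08:08:08" | awk '{ sub(/:/, "-"); print }')

# gsub replaces every match and returns how many replacements it made.
all=$(echo "08:08:08" | awk '{ n = gsub(/:/, "-"); print n, $0 }')

echo "$len"
echo "$first_only"
echo "$all"
```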

Homework:

1 blog.magedu.com
2 www.magedu.com
3 hhhh.magedu.com
4 dddd.magedu.com
5 b333.magedu.com
6 bkkk.magedu.com
7 ssss.magedu.com
8 wog.magedu.com
9 ulog.magedu.com
Take the host name (after settling on the delimiter, pick the field)
awk -F "[ .]" '{print $2}' soho.txt


Count the occurrences of each file system type in fstab
awk '/^UUID/ {fs[$3]++} END{for (i in fs) {print i, fs[i]}}' fstab

Count word occurrences in fstab
grep -wEo "[[:alpha:]]+" fstab | awk '{word[$1]++} END{for (i in word) {print i, word[i]}}'

Extract numbers
echo "[email protected]%9&Bdh7dq+YVixp3vpw" | awk 'gsub(/[^[:digit:]]/,"",$0)'

Generate random numbers
awk 'BEGIN{srand(); for (i=1;i<=200;i++) {if (i==200) {printf "%d", int(rand()*100)} else {printf "%d,", int(rand()*100)}}}'

Take the max and min of the random numbers above
awk -F "," '{MAX=$1; MIN=$1; for (i=1;i<=NF;i++) {if ($i>=MAX) {MAX=$i}; if ($i<=MIN) {MIN=$i}}} END{print "MAX=",MAX, "MIN=",MIN}' soho.txt
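The same scan can be checked on a short, fixed list instead of random output. This is a sketch with four invented numbers; adding 0 to each field is an extra precaution (not in the command above) that forces a numeric rather than string comparison.

```shell
# Scan the fields of one comma-separated line, tracking the largest and smallest value.
extremes=$(echo "42,7,99,15" | awk -F, '{
    max = min = $1 + 0
    for (i = 2; i <= NF; i++) {
        v = $i + 0                # force numeric context
        if (v > max) max = v
        if (v < min) min = v
    }
    print "MAX=" max, "MIN=" min
}')
echo "$extremes"
```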



http://mail.magedu.com/index.html
http://www.magedu.com/test.html
http://study.magedu.com/index.html
http://blog.magedu.com/index.html
http://www.magedu.com/images/logo.jpg

Take the fully qualified domain name
Determine the separator, select the field, count and print
awk -F"/" '{FQ[$3]++} END{for (i in FQ) print i, FQ[i]}' soho.txt | sort -rn -k 2

Example:

inode|beginnumber|endnumber|counts|
106|3363120000|3363129999|10000|
106|3368560000|3368579999|20000|
310|3337000000|3337000100|101|
310|3342950000|3342959999|10000|
310|3362120960|3362120961|2|
311|3313460102|3313469999|9898|
311|3313470000|3313499999|30000|
311|3362120962|3362120963|2|

Output format
310|3337000000|3362120961|10103|
311|3313460102|3362120963|39900|
106|3363120000|3368579999|30000|


awk -F'|' -v OFS='|' '/^[0-9]/ {inode[$1]++; if (!bn[$1]) {bn[$1]=$2} else if (bn[$1]>$2) {bn[$1]=$2}; if (en[$1]<$3) en[$1]=$3; cnt[$1]+=$(NF-1)} END{for (i in inode) print i, bn[i], en[i], cnt[i]}' soho.txt
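A reduced sketch of the same aggregation, run on the header plus the 106 and 310 rows from the sample so the result can be checked by hand. The array names lo, hi, and cnt are chosen for the sketch, and the final sort is added only because for (i in ...) returns the keys in no fixed order. Note that print does not emit the trailing | shown in the output format.

```shell
rows='inode|beginnumber|endnumber|counts|
106|3363120000|3363129999|10000|
106|3368560000|3368579999|20000|
310|3337000000|3337000100|101|
310|3342950000|3342959999|10000|
310|3362120960|3362120961|2|'

# Per inode: smallest begin number, largest end number, and the sum of counts.
merged=$(printf '%s\n' "$rows" | awk -F'|' -v OFS='|' '
/^[0-9]/ {
    if (!($1 in lo) || $2 < lo[$1]) lo[$1] = $2
    if ($3 > hi[$1]) hi[$1] = $3
    cnt[$1] += $4
}
END { for (i in cnt) print i, lo[i], hi[i], cnt[i] }' | sort)
echo "$merged"
```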


Use the awk command to calculate the total size of files in a directory
find . -maxdepth 1 -type f -ls | awk '{sum+=$7} END {print sum}'


Count the 10 IPs with the most connections to the local host

netstat -an | head | awk -F "[[:space:]]+|:" 'NR>2 {print $6}'

netstat -an | head | awk -F "[[:space:]]+|:" 'NR>2 {ip[$6]++} END{for (i in ip) print i, ip[i]}' | sort -nr -k 2 | head
