Commonly used advanced awk commands for analyzing logs
Analyzing access logs (Nginx as an example)
Log format:
'$remote_addr - $remote_user [$time_local] "$request" $status $body_bytes_sent "$http_referer" "$http_user_agent" "$http_x_forwarded_for"'
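The commands that follow rely on awk's default whitespace splitting, so it helps to see which field number each value lands in. The sketch below is only an illustration: the sample line and its values are invented, and it assumes the client address, a literal dash, and the remote user are written as three space-separated fields.

```shell
# Hypothetical sample line (all values invented for illustration).
line='192.0.2.10 - - [02/Jan/2017:00:02:30 +0800] "GET /index.html HTTP/1.1" 200 612 "-" "curl/7.50.0" "-"'

# awk splits on whitespace, so the quoted request occupies $6 through $8.
fields=$(printf '%s\n' "$line" | awk '{print "ip=" $1, "time=" $4, "url=" $7, "status=" $9, "bytes=" $10}')
echo "$fields"
```

Under that assumption, $1 is the client IP, $4 the timestamp (with its leading bracket), $7 the URL, $9 the status code, and $10 the response size.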
Count the number of visits per IP:
# awk '{a[$1]++} END{for(i in a) print i,a[i]}' access.log
List the IPs with more than 100 visits:
# awk '{a[$1]++} END{for(i in a){if(a[i]>100) print i,a[i]}}' access.log
Count the visits per IP and show the top 10:
# awk '{a[$1]++} END{for(i in a) print i,a[i] | "sort -k2 -nr | head -10"}' access.log
Count the visits per IP during a given time period:
# awk '$4>="[02/Jan/2017:00:02:00" && $4<="[02/Jan/2017:00:03:00" {a[$1]++} END{for(i in a) print i,a[i]}' access.log
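The time filter compares $4 as a plain string, so it is dependable only when both bounds fall on the same day; across days or months the dd/Mon/yyyy layout no longer sorts in time order. A minimal sketch on an invented four-line log (the temporary file and every value in it are assumptions for illustration):

```shell
# Build a small hypothetical log in a temporary file.
log=$(mktemp)
cat > "$log" <<'EOF'
192.0.2.1 - - [02/Jan/2017:00:01:59 +0800] "GET / HTTP/1.1" 200 10 "-" "-" "-"
192.0.2.1 - - [02/Jan/2017:00:02:10 +0800] "GET / HTTP/1.1" 200 10 "-" "-" "-"
192.0.2.2 - - [02/Jan/2017:00:02:45 +0800] "GET / HTTP/1.1" 200 10 "-" "-" "-"
192.0.2.1 - - [02/Jan/2017:00:03:01 +0800] "GET / HTTP/1.1" 200 10 "-" "-" "-"
EOF

# Keep only requests between 00:02:00 and 00:03:00, then count per IP.
# The trailing sort just makes the output order predictable.
hits=$(awk '$4>="[02/Jan/2017:00:02:00" && $4<="[02/Jan/2017:00:03:00" {a[$1]++} END{for(i in a) print i, a[i]}' "$log" | sort)
echo "$hits"
rm -f "$log"
```

Only the two requests inside the window are counted; the ones at 00:01:59 and 00:03:01 are skipped.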
Count the visits in the last minute (date -d requires GNU date):
# date=$(date -d '-1 minute' +%d/%b/%Y:%H:%M)
# awk -v date="$date" '$4~date{c++} END{print c+0}' access.log
List the 10 most visited pages:
# awk '{a[$7]++} END{for(i in a) print i,a[i] | "sort -k2 -nr | head -n10"}' access.log
Count the requests for each URL and the total size of the returned content:
# awk '{a[$7]++; size[$7]+=$10} END{for(i in a) print a[i],i,size[i]}' access.log
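To make the two arrays concrete, here is a small sketch on an invented three-line log. The file contents are assumptions for illustration, and the trailing sort is added only so the order is predictable, since awk's for-in loop visits keys in no defined order.

```shell
# Build a small hypothetical log in a temporary file.
log=$(mktemp)
cat > "$log" <<'EOF'
192.0.2.1 - - [02/Jan/2017:00:02:10 +0800] "GET /a HTTP/1.1" 200 100 "-" "-" "-"
192.0.2.2 - - [02/Jan/2017:00:02:11 +0800] "GET /a HTTP/1.1" 200 150 "-" "-" "-"
192.0.2.1 - - [02/Jan/2017:00:02:12 +0800] "GET /b HTTP/1.1" 404 20 "-" "-" "-"
EOF

# a[] counts requests per URL ($7); size[] sums the bytes sent ($10) per URL.
stats=$(awk '{a[$7]++; size[$7]+=$10} END{for(i in a) print a[i], i, size[i]}' "$log" | sort -k2,2)
echo "$stats"
rm -f "$log"
```

Each output line reads: request count, URL, total bytes.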
Count the status codes returned to each IP:
# awk '{a[$1" "$9]++} END{for(i in a) print i,a[i]}' access.log
Count the 404 responses for each IP:
# awk '$9==404{a[$1" "$9]++} END{for(i in a) print i,a[i]}' access.log
Appendix: sort -k chooses which column the output is sorted by. The number after -k is the column where the sort key starts, so -k2 sorts starting from the second column; add -n for numeric order and -r for descending order.
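A minimal sketch of sort -k on "IP count" pairs, similar to what the counting commands print. The input values are invented for illustration; -k2,2 restricts the key to the second column only, and the nr suffix makes that key numeric and descending.

```shell
# Hypothetical "IP count" pairs, sorted by the second column, largest first.
sorted=$(printf '%s\n' '192.0.2.1 7' '192.0.2.2 120' '192.0.2.3 35' | sort -k2,2nr)
echo "$sorted"

# The first line is the IP with the highest count.
top=$(printf '%s\n' "$sorted" | head -n 1)
echo "$top"
```

Without -n the counts would be compared as text, which would place 7 ahead of 120 in descending order.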