Content
Let’s find all of the unique HTTP status codes in an Apache web server log file named access.log. To do this, print the ninth whitespace-separated field of each log line with the awk command.
$ tail -1 access.log
18.19.20.21 - - [19/Apr/2014:19:51:20 -0400] "GET / HTTP/1.1" 200 7136 "-" "Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/33.0.1750.154 Safari/537.36"
$ tail -1 access.log | awk '{print $9}'
200
$ awk '{print $9}' access.log | sort | uniq
200
301
302
404
$
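Field nine is the status code only as long as none of the earlier fields contain extra spaces. As a hedged alternative, here is a minimal sketch that splits each line on double quotes instead, so the status code is the first word after the quoted request. It assumes the combined log format shown above; the sample lines (including the user name with a space) and the variable names `sample_log` and `status_codes` are made up for illustration.

```shell
# Hypothetical sample data in the combined log format (not from a real server).
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
18.19.20.21 - - [19/Apr/2014:19:51:20 -0400] "GET / HTTP/1.1" 200 7136 "-" "Mozilla/5.0"
18.19.20.22 - - [19/Apr/2014:19:52:01 -0400] "GET /old HTTP/1.1" 301 512 "-" "Mozilla/5.0"
18.19.20.23 - john smith [19/Apr/2014:20:03:44 -0400] "GET /missing HTTP/1.1" 404 210 "-" "curl/7.30.0"
EOF

# With '"' as the field separator, $3 is the text between the quoted request
# and the quoted referer, e.g. " 200 7136 ". Its first word is the status code.
status_codes=$(awk -F'"' '{split($3, parts, " "); print parts[1]}' "$sample_log" | sort -u)
echo "$status_codes"

rm -f "$sample_log"
```

On the third sample line, `$9` would be `HTTP/1.1"` rather than `404`, because the user name shifts every later field by one. The quote-based split is not bulletproof either: a request string that itself contains a double quote would throw it off.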
Let’s take it a step further and count how many times each status code occurs.
$ awk '{print $9}' access.log | sort | uniq -c | sort -nr
5641 200
207 301
86 404
18 302
2 304
$
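The same tally can also be kept inside awk with an associative array, which avoids the first sort. The following is a minimal sketch, not a replacement for the pipeline above. The list of codes in `sample_codes` is invented and stands in for the output of `awk '{print $9}' access.log`.

```shell
# Hypothetical status codes, one per line, standing in for the ninth field
# of a real access.log.
sample_codes=$(printf '%s\n' 200 404 200 301 200 404)

# Tally each code in an awk array, print "count code" pairs at the end,
# then sort by count, highest first.
status_counts=$(printf '%s\n' "$sample_codes" |
  awk '{count[$1]++} END {for (code in count) print count[code], code}' |
  sort -nr)
echo "$status_counts"
```

Against the real log file, the awk program would read `$9` instead of `$1` and take access.log as its argument. The final sort is still needed because awk does not guarantee any order when looping over an array.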
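Now let’s extract the status code and hour from the access.log file and count the occurrences of each combination. Next, let’s sort them by number of occurrences. This will show us the hours during which the website was most active. Note that uniq -c only merges adjacent identical lines, so without a sort in front of it the same status and hour pair can appear more than once in the output; add sort before uniq -c to get a single total per pair.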
$ cat access.log | awk '{print $9, $4}' | cut -c 1-4,18-19 | uniq -c | sort -n | tail
72 200 09
76 200 06
81 200 06
82 200 06
83 200 06
83 200 06
84 200 06
109 200 20
122 200 20
383 200 10
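To get one total per status and hour pair, the lines need to be sorted before they reach uniq -c. Here is a minimal sketch of that variant. It also pulls the hour out of the timestamp by splitting on `:` rather than by character position, which is an alternative to the cut command above and not the original method. The sample log lines and the variable names `sample_log` and `hourly_counts` are made up for illustration.

```shell
# Hypothetical sample data; the hours are deliberately interleaved so that
# an unsorted "uniq -c" would report the same pair more than once.
sample_log=$(mktemp)
cat > "$sample_log" <<'EOF'
1.1.1.1 - - [19/Apr/2014:06:10:00 -0400] "GET / HTTP/1.1" 200 100 "-" "-"
1.1.1.2 - - [19/Apr/2014:10:11:00 -0400] "GET / HTTP/1.1" 200 100 "-" "-"
1.1.1.3 - - [19/Apr/2014:06:12:00 -0400] "GET /x HTTP/1.1" 404 100 "-" "-"
1.1.1.4 - - [19/Apr/2014:06:13:00 -0400] "GET / HTTP/1.1" 200 100 "-" "-"
1.1.1.5 - - [19/Apr/2014:10:14:00 -0400] "GET / HTTP/1.1" 200 100 "-" "-"
1.1.1.6 - - [19/Apr/2014:06:15:00 -0400] "GET / HTTP/1.1" 200 100 "-" "-"
EOF

# $4 looks like "[19/Apr/2014:06:10:00"; splitting it on ":" puts the hour in t[2].
# Sort first so identical "status hour" lines are adjacent, then count them.
# uniq -c pads its counts with spaces, so the last awk trims that padding.
hourly_counts=$(awk '{split($4, t, ":"); print $9, t[2]}' "$sample_log" |
  sort | uniq -c | sort -n | awk '{print $1, $2, $3}')
echo "$hourly_counts"

rm -f "$sample_log"
```

As with the original pipeline, the busiest combinations end up at the bottom, so piping the result through tail shows the most active hours.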