grep, sed, and awk for fun and profit

I recently moved this website to a new server. After doing that, I noticed a lot more captcha failures.
I don’t think there are actually more automated attempts; rather, logcheck on the new server is not yet
configured to ignore these log messages, so I now see them. Tonight, I thought I’d do something about them.

The log messages

The log messages look like this:

Aug  6 23:09:06 gelt FreeBSDDiary[43547]: captcha failure: user='Marina' IP='95.65.75.160' email='marinapetrova08@gmail.com'
Aug  6 23:21:18 gelt FreeBSDDiary[43870]: captcha failure: user='MigoneWoorope' IP='173.242.116.186' email='jeffer.s.o.n.v.3.v@gmail.com'
Aug  6 23:50:19 gelt FreeBSDDiary[47046]: captcha failure: user='apelealcoxy' IP='109.230.251.74' email='southfan@southpark.jatsu.pl'

These messages are created by some custom code I have added to Phorum.

Granted, the message isn’t exactly easy to parse. If I were doing this again, I would rethink my output.
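
Even so, the single quotes make a workable delimiter. The following is a minimal sketch, assuming the field values themselves never contain a single quote: splitting on `'` with awk puts the user in field 2, the IP address in field 4, and the email in field 6. The `line` and `parsed` variables are only for illustration; the message is one of the samples above.

```shell
# One of the sample log lines from above.
line="Aug  6 23:09:06 gelt FreeBSDDiary[43547]: captcha failure: user='Marina' IP='95.65.75.160' email='marinapetrova08@gmail.com'"

# Split each line on single quotes: $2 = user, $4 = IP, $6 = email.
parsed=$(printf '%s\n' "$line" | awk -F"'" '{ print $4, $2, $6 }')
echo "$parsed"
```

This prints the IP address first, followed by the user name and the email address.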

Parsing the output

My goal: get a list of IP addresses and the number of failed captchas for each. This is my first attempt:

$ bunzip2 -c /var/log/messages.* | grep 'captcha failure' | sed "s/.*IP='\(.*\)' email.*/\1/" | sort | uniq -c | sort -rn
  25 95.65.75.160
  13 77.65.48.239
  12 195.162.68.141
  10 68.185.116.91
   8 89.76.212.140
   7 188.92.77.196
   6 199.15.234.226
   5 95.64.12.21
   5 193.105.210.64
   5 190.202.87.131
   4 67.205.96.23
   4 204.124.182.82
   4 182.50.141.198
   4 110.85.4.110
   4 109.73.76.45
   3 79.133.133.158
   3 78.108.79.129
   3 77.127.24.215
   3 50.7.240.10
   3 46.118.42.143
   3 31.192.105.125
   3 188.165.254.157
   3 182.50.142.66
   3 173.242.122.112
   3 121.97.116.217
   3 117.88.218.15
   3 109.230.251.74
   3 109.230.244.101
   3 109.230.222.225
   3 109.230.220.230
   2 98.126.54.66
   2 96.45.173.2
   2 95.84.137.56
   2 91.224.160.90
   2 91.210.157.234
   2 91.207.8.46
   2 89.28.124.238
   2 89.105.246.13
   2 87.70.43.111
   2 77.87.32.102
   2 77.70.41.8
   2 77.125.105.161
   2 68.223.157.219
   2 58.8.220.166
   2 58.8.154.111
   2 46.118.42.41
   2 46.118.229.58
   2 31.184.236.44
   2 31.184.236.31
   2 221.206.40.125
   2 209.226.31.161
   2 208.76.55.91
   2 195.218.182.253
   2 184.106.170.252
   2 178.210.32.201
   2 178.168.47.154
   2 119.93.74.238
   2 109.230.246.238
   2 109.230.244.129
   2 109.186.24.10
   1 98.254.20.226
   1 98.220.58.252
   1 95.79.39.180
   1 95.78.65.113
   1 95.69.216.91
   1 95.220.216.191
   1 95.168.183.233
   1 95.135.16.67
   1 95.133.71.169
   1 95.133.207.140
   1 95.132.98.135
   1 94.232.65.104
   1 94.23.248.199
   1 94.190.47.108
   1 94.179.167.19
   1 94.179.148.219
   1 94.141.37.123
   1 93.182.133.94
   1 93.166.121.107
   1 93.157.169.18
   1 93.127.27.110
   1 92.60.232.11
   1 92.39.76.212
   1 91.224.160.132
   1 91.214.186.131
   1 91.210.104.246
   1 89.208.32.87
   1 89.139.10.153
   1 88.196.166.18
   1 88.190.26.16
   1 87.69.95.86
   1 87.68.52.89
   1 87.249.3.2
   1 85.122.23.124
   1 84.251.45.75
   1 83.21.213.143
   1 83.149.44.243
   1 83.139.165.126
   1 80.98.175.191
   1 80.87.145.14
   1 80.240.203.100
   1 79.133.140.123
   1 77.92.233.198
   1 72.64.185.93
   1 72.46.131.108
   1 71.241.146.196
   1 71.188.61.102
   1 71.184.168.232
   1 69.181.42.19
   1 68.41.239.107
   1 67.169.121.228
   1 65.78.173.203
   1 61.90.31.61
   1 59.58.154.44
   1 58.8.116.16
   1 58.8.100.13
   1 50.56.95.138
   1 46.251.237.188
   1 46.21.144.176
   1 46.17.96.12
   1 46.146.95.26
   1 41.190.16.17
   1 23.19.39.197
   1 222.165.130.214
   1 221.7.159.224
   1 218.92.8.165
   1 218.24.196.122
   1 217.77.222.158
   1 217.196.164.35
   1 216.24.192.168
   1 213.87.136.220
   1 212.87.241.135
   1 207.204.243.16
   1 204.145.80.57
   1 203.148.95.71
   1 202.181.176.3
   1 195.191.55.204
   1 195.190.13.54
   1 194.11.24.156
   1 193.105.210.113
   1 192.251.226.206
   1 188.92.76.221
   1 188.27.105.149
   1 188.26.145.208
   1 188.233.18.255
   1 188.232.72.141
   1 188.163.64.194
   1 188.143.233.14
   1 188.143.233.111
   1 188.143.232.164
   1 188.143.232.157
   1 188.143.232.109
   1 188.134.30.212
   1 187.76.192.186
   1 184.171.170.75
   1 184.107.41.143
   1 183.16.116.7
   1 178.162.155.241
   1 178.137.17.213
   1 178.122.42.126
   1 178.122.40.252
   1 178.121.103.57
   1 175.42.82.224
   1 174.36.42.78
   1 174.142.19.206
   1 173.234.229.179
   1 173.0.59.196
   1 141.105.65.153
   1 125.39.93.39
   1 125.120.185.56
   1 125.109.198.150
   1 123.121.216.142
   1 122.193.26.244
   1 121.54.84.26
   1 118.97.164.78
   1 117.41.235.212
   1 116.252.185.10
   1 115.87.242.24
   1 112.111.184.192
   1 109.95.196.34
   1 109.87.152.237
   1 109.230.251.99
   1 109.230.251.228
   1 109.230.251.184
   1 109.230.251.121
   1 109.230.244.111
   1 109.230.217.104
   1 109.230.216.123
   1 109.172.78.18
   1 108.48.26.155

That is 190 distinct IP addresses. Well, I don’t mind. I’ll redirect them all… Or perhaps just
the top 10 offenders:

$ bunzip2 -c /var/log/messages.* | grep 'captcha failure' | sed "s/.*IP='\(.*\)' email.*/\1/" | sort | uniq -c | sort -rn | head -10 | awk '{print $2}' | sort
188.92.77.196
190.202.87.131
193.105.210.64
195.162.68.141
199.15.234.226
68.185.116.91
77.65.48.239
89.76.212.140
95.64.12.21
95.65.75.160
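
The cut-off of 10 is arbitrary, so it may be handy to make it a parameter. Here is a minimal sketch that wraps the same grep, sed, sort, and uniq steps in a shell function. The name `top_offenders` and its optional count argument are made up for illustration, and the sed pattern assumes the `IP='...' email=` layout shown in the sample messages.

```shell
# top_offenders [N]: read log lines on stdin and print the N most frequent
# captcha-failure IP addresses with their counts, highest count first.
# N defaults to 10. (Hypothetical helper, not part of the original setup.)
top_offenders() {
    grep 'captcha failure' |
        sed "s/.*IP='\(.*\)' email.*/\1/" |
        sort | uniq -c | sort -rn | head -n "${1:-10}"
}

# Possible usage against the rotated logs, assuming the same path as above:
#   bunzip2 -c /var/log/messages.* | top_offenders 10
```

Sorting with `-n` compares the counts as numbers instead of relying on the padding that `uniq -c` adds.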

Hmmm, now I’ll add those IP addresses to my virtual host definition. But first, some help from awk to write the RewriteCond lines:

$ bunzip2 -c /var/log/messages.* | grep 'captcha failure' | sed "s/.*IP='\(.*\)' email.*/\1/" | sort | uniq -c | sort -rn | head -10 | awk '{print "RewriteCond %{REMOTE_ADDR} " $2 " [OR]"}' | sort
RewriteCond %{REMOTE_ADDR} 188.92.77.196 [OR]
RewriteCond %{REMOTE_ADDR} 190.202.87.131 [OR]
RewriteCond %{REMOTE_ADDR} 193.105.210.64 [OR]
RewriteCond %{REMOTE_ADDR} 195.162.68.141 [OR]
RewriteCond %{REMOTE_ADDR} 199.15.234.226 [OR]
RewriteCond %{REMOTE_ADDR} 68.185.116.91 [OR]
RewriteCond %{REMOTE_ADDR} 77.65.48.239 [OR]
RewriteCond %{REMOTE_ADDR} 89.76.212.140 [OR]
RewriteCond %{REMOTE_ADDR} 95.64.12.21 [OR]
RewriteCond %{REMOTE_ADDR} 95.65.75.160 [OR]

Using that, I redirect those IP addresses to another URL, where they cannot log in or register.
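
Two details are worth tidying before pasting generated lines into a virtual host. The pattern in a RewriteCond is a regular expression, so an unescaped dot matches any character and an unanchored address can also match inside a longer one. In addition, the last condition in a group normally carries no [OR] flag. The following sketch, assuming a list of addresses with one per line on stdin, handles both with sed alone; the function name `rewrite_conds` is hypothetical.

```shell
# rewrite_conds: turn IP addresses (one per line on stdin) into RewriteCond
# lines. (Hypothetical helper for illustration.)
#   1st expression: escape each dot so it only matches a literal dot
#   2nd expression: wrap the address in ^...$ and append [OR]
#   3rd expression: drop the [OR] flag from the final line only
rewrite_conds() {
    sed -e 's/\./\\./g' \
        -e 's/.*/RewriteCond %{REMOTE_ADDR} ^&$ [OR]/' \
        -e '$s/ \[OR\]$//'
}

printf '%s\n' 188.92.77.196 95.65.75.160 | rewrite_conds
```

For the two addresses above this prints:

RewriteCond %{REMOTE_ADDR} ^188\.92\.77\.196$ [OR]
RewriteCond %{REMOTE_ADDR} ^95\.65\.75\.160$

The RewriteRule that follows these conditions is not shown in the post, so it is left out here as well.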

1 thought on “grep, sed, and awk for fun and profit”

  1. Cool, looks like my parsing stuff I do on apache files when
    people try (or succeed) to ddos apache.

    Wouldn’t it be cool when people like you collecting data
    of offenders could forward those found IP’s to a sort of collecting station
    on the internet, where the IP’s would be diminished in value.
    In an easy way, like just pipe the found IP’s to a locally running daemon.
