I was browsing the Internet and came across "Replacing a one-line bash script with 74 lines of Go #golang". Further review led me to the programmer's blog: https://milocast.com/purge.html
As such, I felt challenged, whether or not a challenge was intended. So today I convert a common one-liner to Go: finding the top IPs making requests, based on an Apache access log.
The access log
[mharris@mori BashToGo]$ wc -l access.log
2030592 access.log
The bash one-liner
[mharris@mori BashToGo]$ awk '{print $1}' access.log | sort | uniq -c | sort -rnk1 | head
295325 71.41.191.106
100954 4.30.110.85
83175 ::1
54092 93.160.60.22
41913 63.133.222.6
30961 189.207.146.56
19143 167.114.172.229
16842 167.114.156.198
16653 54.172.32.70
16079 14.141.28.114
Some profiling
real 0m5.577s
user 0m5.976s
sys 0m0.201s
Now, I’m not much of a Golang programmer, seeing as I just started writing in it, but doing these little projects gives me something to practice on. I’m sure there are some inefficiencies here, and I’m gonna check IRC once I publish and/or wait for comments.
// awk '{print $1}' access.log | sort | uniq -c | sort -rnk1 | head
// to GoLang because ADHD + Boredom @ 1:45AM on a Friday
package main

import (
	"bufio"
	"fmt"
	"io"
	"log"
	"os"
	"sort"
	"strings"
)

// Pair holds one IP address and the number of requests it made.
type Pair struct {
	Key   string
	Value int
}

// PairList implements sort.Interface, ordering by ascending Value.
type PairList []Pair

func (p PairList) Swap(i, j int)      { p[i], p[j] = p[j], p[i] }
func (p PairList) Len() int           { return len(p) }
func (p PairList) Less(i, j int) bool { return p[i].Value < p[j].Value }

func main() {
	file := "access.log"
	ips := ParseIpsFromFile(file)
	unique := UniqueArrayCount(ips)
	sorted := sortMapByValue(unique)
	// Like head, print at most 10 entries; slicing [0:10] would panic on fewer.
	if len(sorted) > 10 {
		sorted = sorted[:10]
	}
	for _, item := range sorted {
		fmt.Printf("%d %s\n", item.Value, item.Key)
	}
}

// ParseIpsFromFile returns the first space-separated field of every line.
func ParseIpsFromFile(file string) (ips []string) {
	fh, err := os.Open(file)
	if err != nil {
		log.Fatal(err)
	}
	defer fh.Close()
	bf := bufio.NewReader(fh)
	continued := false // true while reading the rest of an over-long line
	for {
		line, isPrefix, err := bf.ReadLine()
		if err == io.EOF {
			break
		}
		if err != nil {
			log.Fatal(err)
		}
		// Only the first chunk of a line starts with the IP.
		if !continued {
			s := strings.Split(string(line), " ")
			ips = append(ips, s[0])
		}
		continued = isPrefix
	}
	return
}

// UniqueArrayCount counts how many times each IP occurs.
func UniqueArrayCount(ips []string) map[string]int {
	m := map[string]int{}
	for _, ip := range ips {
		m[ip]++ // a missing key reads as 0
	}
	return m
}

// sortMapByValue returns the map's entries sorted by descending count.
func sortMapByValue(m map[string]int) PairList {
	p := make(PairList, len(m))
	i := 0
	for k, v := range m {
		p[i] = Pair{k, v}
		i++
	}
	sort.Sort(sort.Reverse(p))
	return p
}
Now, surprisingly, parsing 2 million lines of text wasn’t much faster than bash: about 3.8s of wall-clock time against 5.6s.
Profiling
[mharris@mori BashToGo]$ time ./BashToGo
295325 71.41.191.106
100954 4.30.110.85
83175 ::1
54092 93.160.60.22
41913 63.133.222.6
30961 189.207.146.56
19143 167.114.172.229
16842 167.114.156.198
16653 54.172.32.70
16079 14.141.28.114
real 0m3.765s
user 0m4.053s
sys 0m0.367s
Next: finding out how to improve performance, or shall we say, code optimization for the sake of code optimization.