June 25, 2016 Marie H.

One-line bash script to Golang - An unasked challenge

I was browsing the Internet and came across "Replacing a one-line bash script with 74 lines of Go #golang". A closer look led me to the programmer's blog: https://milocast.com/purge.html

So I took it as a challenge, whether or not one was issued. Today I convert a common one-liner to Go: finding the top IPs making requests, based on an Apache access log.

The access log

[mharris@mori BashToGo]$ wc -l access.log 
2030592 access.log

The bash one liner

[mharris@mori BashToGo]$ awk '{print $1}' access.log  | sort | uniq -c | sort -rnk1 | head
 295325 71.41.191.106
 100954 4.30.110.85
  83175 ::1
  54092 93.160.60.22
  41913 63.133.222.6
  30961 189.207.146.56
  19143 167.114.172.229
  16842 167.114.156.198
  16653 54.172.32.70
  16079 14.141.28.114

Some profiling

real    0m5.577s
user    0m5.976s
sys     0m0.201s

Now, I’m not much of a Golang programmer, seeing as I just started writing in it, but these little projects give me something to practice on. I’m sure there are some inefficiencies here, and I’m going to check IRC once I publish and/or wait for comments.

// awk '{print $1}' access.log  | sort | uniq -c | sort -rnk1 | head
// to GoLang because ADHD + Boredom @ 1:45AM on a Friday
package main

import (
    "bufio"
    "fmt"
    "io"
    "log"
    "os"
    "sort"
    "strings"
)

type Pair struct {
    Key string
    Value int
}

type PairList []Pair
func (p PairList) Swap(i, j int) { p[i], p[j] = p[j], p[i] }
func (p PairList) Len() int { return len(p) }
func (p PairList) Less(i, j int) bool { return p[i].Value < p[j].Value }

func main() {
    file := "access.log"
    ips := ParseIpsFromFile(file)
    unique := UniqueArrayCount(ips)
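    sorted := sortMapByValue(unique)
    // Print at most 10 entries; sorted[0:10] would panic on fewer than 10 unique IPs.
    top := len(sorted)
    if top > 10 {
        top = 10
    }
    for _, item := range sorted[:top] {
        fmt.Printf("%d %s\n", item.Value, item.Key)
    }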
}

func ParseIpsFromFile(file string) (ips []string) {
    fh, err := os.Open(file)
    if err != nil {
        log.Fatal(err)
    }
    defer fh.Close()

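    bf := bufio.NewReader(fh)
    continuation := false // true while reading the tail of an over-long line
    for {
        line, isPrefix, err := bf.ReadLine()
        if err == io.EOF {
            break
        }

        if err != nil {
            log.Fatal(err)
        }

        // ReadLine hands back lines longer than its buffer in fragments.
        // Only the first fragment starts with the client IP, so skip the rest.
        wasContinuation := continuation
        continuation = isPrefix
        if wasContinuation {
            continue
        }

        s := strings.Split(string(line), " ")
        ips = append(ips, s[0])
    }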

    return
}

func UniqueArrayCount(ips []string) map[string]int {
    m := map[string]int{}
    for _, ip := range ips {
        m[ip]++ // a missing key reads as 0, so no existence check is needed
    }
    return m
}

func sortMapByValue(m map[string]int) PairList {
    p := make(PairList, len(m))
    i := 0
    for k, v := range m {
        p[i] = Pair{k, v}
        i++
    }
    sort.Sort(sort.Reverse(p))
    return p
}

Surprisingly, parsing 2 million lines of text in Go wasn’t dramatically faster: about 3.8 seconds of wall-clock time, compared with about 5.6 seconds for the bash pipeline:

Profiling

[mharris@mori BashToGo]$ time ./BashToGo 
295325 71.41.191.106
100954 4.30.110.85
83175 ::1
54092 93.160.60.22
41913 63.133.222.6
30961 189.207.146.56
19143 167.114.172.229
16842 167.114.156.198
16653 54.172.32.70
16079 14.141.28.114

real    0m3.765s
user    0m4.053s
sys     0m0.367s

Next up: figuring out how to improve performance, or shall we say, code optimization for the sake of code optimization.
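One possible direction, sketched here and not benchmarked: the program above first collects every IP into a 2-million-element slice and only then counts. Counting directly while scanning avoids that intermediate slice. The sketch below is an assumption-laden rewrite, not a measured improvement. It reads the log from standard input rather than a hard-coded file name, firstField and topN are hypothetical helper names, sort.Slice needs Go 1.8 or newer, and the 1 MiB line limit is a guess about the log format.

```go
// Sketch: count IPs while scanning instead of collecting them first.
package main

import (
	"bufio"
	"fmt"
	"log"
	"os"
	"sort"
	"strings"
)

// Pair mirrors the Pair type used in the program above.
type Pair struct {
	Key   string
	Value int
}

// firstField (hypothetical helper) returns the text before the first
// space, which matches strings.Split(line, " ")[0].
func firstField(line string) string {
	if i := strings.IndexByte(line, ' '); i >= 0 {
		return line[:i]
	}
	return line
}

// topN (hypothetical helper) returns up to n pairs, highest count first.
func topN(counts map[string]int, n int) []Pair {
	pairs := make([]Pair, 0, len(counts))
	for k, v := range counts {
		pairs = append(pairs, Pair{k, v})
	}
	sort.Slice(pairs, func(i, j int) bool {
		if pairs[i].Value != pairs[j].Value {
			return pairs[i].Value > pairs[j].Value
		}
		return pairs[i].Key < pairs[j].Key // deterministic order for ties
	})
	if len(pairs) > n {
		pairs = pairs[:n]
	}
	return pairs
}

func main() {
	counts := map[string]int{}
	sc := bufio.NewScanner(os.Stdin)
	// Assumption: no log line exceeds 1 MiB (Scanner's default limit is 64 KiB).
	sc.Buffer(make([]byte, 0, 64*1024), 1024*1024)
	for sc.Scan() {
		counts[firstField(sc.Text())]++
	}
	if err := sc.Err(); err != nil {
		log.Fatal(err)
	}
	for _, p := range topN(counts, 10) {
		fmt.Printf("%d %s\n", p.Value, p.Key)
	}
}
```

Run it as ./BashToGo < access.log. Whether it actually beats the version above is something to measure with time, not assume.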