I was curious to know how many books, on average, I read a month. I don’t expose this information directly on bookpiles either, although you could extract it from the RSS feed. It’s one of the features I would like to add once I understand what information I want to present and how best to present it.
In the meantime, I ran a query in the database and came up with this:
2009-02 2
2009-03 4
2009-04 3
2009-05 3
2009-06 3
2009-07 2
2009-08 3
2009-09 3
2009-10 3
2009-11 4
2009-12 1
2010-01 0
2010-02 1
2010-03 1
2010-04 6
2010-05 2
2010-06 5
2010-07 7
2010-08 2
2010-09 7
2010-10 4
I felt 80% done. Then, I realized I didn’t quite know how I would extract, from the command-line, the sum, mean, standard deviation, minimum and maximum value. Of course, I could run it through R. Or Excel… The question wasn’t how to do statistics in general — it was how to do it as a filter … easily … right now.
A little research didn’t turn up any obvious answer. (Please correct me if I missed an obvious solution.)
I wrote my own in awk. (awk is present on ALL the machines I use)
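NR == 1 {min = max = $1 + 0}   # seed min and max from the first value
$1 < min {min = $1 + 0}
$1 > max {max = $1 + 0}
{sum += $1; sumsq += $1*$1}
END {
  if (NR == 0) exit   # no input, nothing to report
  print "lines: ", NR;
  print "min: ", min;
  print "max: ", max;
  print "sum: ", sum;
  print "mean: ", sum/NR;
  print "stddev:", sqrt(sumsq/NR - (sum/NR)^2)   # population standard deviation
}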
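For reference, the stddev line relies on the usual identity that the variance is the mean of the squares minus the square of the mean, which is what lets the script work in a single pass with only two running totals. Written out, with N standing for NR and x_i for the value on line i (my notation, not anything from the script):

```latex
\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N} x_i^2 \;-\; \left(\frac{1}{N}\sum_{i=1}^{N} x_i\right)^2}
```

Because it divides by N rather than N - 1, this is the population standard deviation, not the sample one.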
Here’s what the output looks like:
I included it in my dotfiles: the awk code and a bootstrap shell script (used above).
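As a rough sketch of how a filter like this can be wired into a pipeline: the function name `stats`, the `cut` step and the three sample rows are assumptions for illustration, not the actual bootstrap script from the dotfiles.

```shell
#!/bin/sh
# stats: hypothetical helper name; reads one number per line on stdin
# and prints count, min, max, sum, mean and population stddev.
stats() {
  awk '
    NR == 1  { min = max = $1 + 0 }   # seed min/max from the first value
    $1 < min { min = $1 + 0 }
    $1 > max { max = $1 + 0 }
             { sum += $1; sumsq += $1 * $1 }
    END {
      if (NR == 0) exit 1             # no input: nothing to report
      mean = sum / NR
      var = sumsq / NR - mean * mean
      if (var < 0) var = 0            # guard against rounding just below zero
      print "lines: ", NR
      print "min: ", min
      print "max: ", max
      print "sum: ", sum
      print "mean: ", mean
      print "stddev:", sqrt(var)
    }
  '
}

# Keep only the count column of "YYYY-MM count" rows, then summarize.
printf '%s\n' '2010-08 2' '2010-09 7' '2010-10 4' | cut -d' ' -f2 | stats
```

Wrapping the awk program in a shell function keeps it usable as a plain filter: anything that emits one number per line can be piped straight into it.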