Archive for August, 2007

Renaming Thousands of Files

Suppose I told you I had to rename 1,000 files, say to change their
extensions or to turn hyphens into underscores:

  • how long would it take you?
  • what tools would you use?
  • what would you do?
  • how much would that answer change for 10,000/100,000/1,000,000 files?

Take a moment to think, please, before you keep reading.

This was the situation I faced this week, and it reminded me of Steve Yegge’s phone interview blog post. You should read it for yourself, but here’s the problem statement:

Last year my team had to remove all the phone numbers from 50,000 Amazon web page templates, since many of the numbers were no longer in service, and we also wanted to route all customer contacts through a single page.

Let’s say you’re on my team, and we have to identify the pages having probable U.S. phone numbers in them. To simplify the problem slightly, assume we have 50,000 HTML files in a Unix directory tree, under a directory called “/website”. We have 2 days to get a list of file paths to the editorial staff. You need to give me a list of the .html files in this directory tree that appear to contain phone numbers in the following two formats: (xxx) xxx-xxxx and xxx-xxx-xxxx.

How would you solve this problem? Keep in mind our team is on a short (2-day) timeline.

These are not “never-gonna-happen” situations. Your set of skills should include both “enterprise” problem solving and “low-level” scripting.
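To give an idea of how little scripting the quoted problem needs, here is a minimal sketch, not the answer from the original post. The function name list_phone_pages is made up for illustration, and the pattern only looks for the two formats mentioned, so it will also flag digit runs that merely look like phone numbers.

```shell
# list_phone_pages ROOT
# Hypothetical helper: prints the path of every .html file under ROOT
# that appears to contain (xxx) xxx-xxxx or xxx-xxx-xxxx.
list_phone_pages() {
  # Extended regex: the first alternative has literal parentheses,
  # the second is the all-hyphen format.
  pattern='\([0-9]{3}\) [0-9]{3}-[0-9]{4}|[0-9]{3}-[0-9]{3}-[0-9]{4}'
  # grep -l prints only the names of the files that match.
  find "$1" -type f -name '*.html' -exec grep -lE "$pattern" {} +
}
```

Something like “list_phone_pages /website > pages-to-review” would produce the list for the editorial staff.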

For the curious, here’s how I solved the renaming problem:

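find . -name '*.TXT' > src   # list the files to rename
cp src dest                  # the new names start as a copy of the old ones
vim dest                     # edit the new names with regex substitutions
paste src dest > todo        # old and new names side by side
vim todo                     # prefix every line with "mv"
source todo                  # run the renames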

A good old “find”, some vim regular expression magic, “paste”, and more vim magic (to add “mv” to every line). Another advantage to this technique is that you’ll be able to “preview” the changes before you source the file.
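If you would rather skip the interactive vim steps, the same “generate, preview, then run” idea can be scripted. This is only a sketch under a few assumptions: the helper name make_rename_script is invented, it hard-codes the .TXT to .txt and hyphen-to-underscore rules, and it assumes file names contain no single quotes or newlines.

```shell
# make_rename_script DIR OUTFILE
# Hypothetical helper: writes one "mv" command per *.TXT file under DIR
# into OUTFILE, so the renames can be reviewed before anything is moved.
make_rename_script() {
  dir=$1
  out=$2
  : > "$out"                                   # start with an empty script
  find "$dir" -type f -name '*.TXT' | while IFS= read -r src; do
    base=${src##*/}                            # file name without directory
    # New name: .TXT becomes .txt, hyphens become underscores.
    new=$(printf '%s\n' "${base%.TXT}.txt" | sed 's/-/_/g')
    printf "mv -- '%s' '%s'\n" "$src" "${src%/*}/$new" >> "$out"
  done
}
```

Run “make_rename_script . todo”, read through todo, and only then run it with “sh todo”. Only the file name is rewritten, so hyphens in directory names are left alone.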

Read Full Post »

A while back, I wanted to come up with programming exercises for Ruby.

This time, it’s Erlang.

I dug up the code I had done then. I felt that I would need this for future reference. Here’s what I coded to learn Ruby:

  • anagram/permutations (list/string – difference?)
  • tower of hanoi
  • reverse polish notation calculator
  • n choose k
  • threads: print “red” / print “blue”
  • threads: multiple dns lookups
  • letter histogram
  • interactive Celsius/Fahrenheit converter
  • roman/arabic numbers converter

To this, I want to add:

  • streams? (next business day? next Monday the 17th?)
  • subsets of a list
  • standard deviation

The advantage of always coding the same algorithm, especially when learning, is that it becomes less about the logic itself, which you presumably already worked out, and more about the language.

I’d be interested to find out what other people are using.

Read Full Post »

After my recent disappointments with the state of syntax highlighting for code embedded in HTML, I did a little bit of research. I would also like to thank Marc-André Cournoyer for his recommendation.

I found a way to do it with TextMate, or with other blogging plugins. However, I’m not always on my Mac, and I don’t like locking myself into proprietary solutions.

That’s when I remembered that Vim has all you need to turn the code you are seeing into HTML with the exact same colors. If you need to turn a snippet (or the whole file, for that matter) into HTML, just select it and type “:TOhtml”. A new buffer will open with your code wrapped in old-style HTML!

If you are more into CSS and are ready to leave HTML 3.2 behind, you can toggle a flag “:let html_use_css=1” before running “:TOhtml”. You’ll get semantic CSS like this:

.Statement { color: #ffff00; }
.Constant { color: #ff6060; }
.PreProc { color: #ff40ff; }
.Comment { color: #8080ff; }

There are a couple of things to keep in mind. Vim will use your current color scheme to HTMLize the code. If you’re not satisfied with your current color scheme you can switch with “:colorscheme camo”, for example.

If you want another incentive: Vim currently supports 481 syntaxes (!) and more are added all the time.

Finally, here’s the shell script I use to automate the process:


# $1 is the file to convert; quoted so paths with spaces work.
gvim +'colorscheme camo' \
     +'let html_use_css=1' \
     +'runtime! syntax/2html.vim' \
     +'wq' \
     +'q' "$1"

Most of my inspiration came from this article.

Read Full Post »

go source code

I wish I had a better way to publish the source code and show it highlighted.

Any suggestions?

Read Full Post »

Not all directories are created equal. When you work on a specific machine, there are directories where you are bound to spend more time than others. The same thing happens on the web: there are pages you will want to visit more often than others. Thankfully, this problem has already been solved with bookmarks. I’m just bringing bookmarks to bash.

For years, I’ve had different systems to help me move faster between my often-used directories. I’ve tried soft links, aliases, and a few other tools. I’m not inventing anything new; I’m just making it more lightweight.

Here’s a screenshot that will explain how “go” works.


You just have to create a flat file named “.gorc” and place it in your home directory. It should contain one path per line, like this:
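As an illustration only, the snippet below builds a small bookmark file and shows how a numeric argument maps to a line of it. The paths are placeholders, not my actual bookmarks, and the file is written to a temporary location (the variable gorc is made up) rather than to the real ~/.gorc.

```shell
# Hypothetical bookmark file: one path per line, nothing else.
gorc="${TMPDIR:-/tmp}/gorc.example"
printf '%s\n' /etc /var/log /usr/local/bin > "$gorc"

# Bookmark N is simply line N of the file, so a lookup for
# bookmark 2 boils down to this:
sed -n 2p "$gorc"    # prints /var/log
```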


If you need to add/edit/remove paths, just fire up your favorite text editor. You can also easily append the current directory to it with:

echo "$PWD" >> ~/.gorc

Finally, here’s the source code.

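function go() {
  if [ ! -f "$HOME/.gorc" ]; then
    echo "$HOME/.gorc does not exist..."
    return 1
  fi

  local dest
  if [ -n "$1" ]; then
    # "go N" jumps straight to line N of ~/.gorc.
    dest=$(sed -n "${1}p" "$HOME/.gorc")
  else
    # No argument: show the numbered list and ask for a choice.
    # $places is left unquoted on purpose so each path becomes one
    # argument; paths containing spaces are not supported.
    local places=$(cat "$HOME/.gorc")
    dest=$(pick_from_list $places)
  fi

  [ -n "$dest" ] && cd "$dest"
}

# Print the arguments as a numbered list on stderr, so the list is
# not captured when pick_from_list runs inside $(...).
function print_list() {
  local i=0 item

  for item in "$@"; do
    ((i++))
    echo "$i. $item" >&2
  done
}

function pick_from_list() {
  print_list "$@"

  local n
  read -p "${PROMPT-">"} " n
  ((n--)) # zero-based index shift

  if ((n < 0 || n >= $#)); then
    return 1
  fi

  # Drop the first n arguments so the chosen one becomes $1.
  shift $n

  echo "$1"
}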

Just make sure you “source” the file in your .bashrc and you’re good to go.

Read Full Post »