simplest way to benchmark a URL

Since I will forget this:

ab -c 10 -n 100

ab is a simple benchmarking tool from Apache. -c 10 means 10 concurrent connections, -n 100 means 100 requests. Note that it gave me Benchmarking localhost (be patient)...apr_socket_recv: Connection refused (61) when I tried it on http://localhost:3000/, but that seems to be an IPv6 thing and replacing that with seemed to work.

You can also do this with POST requests if you have HTTP Basic auth:

echo "observation[latitude]=38&observation[longitude]=-121" > post.txt && \
ab -c 10 -n 100 \
  -A username:password \
  -T application/x-www-form-urlencoded \
  -p post.txt \
  "" && \
rm post.txt

Haven’t found a way to just give it POST params without creating a file, but whatever, this works.
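
A minimal sketch of a wrapper that still writes the POST body to a file, but lets mktemp pick the name and removes it whether or not ab succeeds. The function name bench_post is made up for illustration and is not part of ab; the ab flags are the same ones used above.

```shell
# Hypothetical helper, not part of ab.
# Usage: bench_post URL BODY [extra ab options...]
bench_post() {
  local url="$1" body="$2"
  shift 2
  local tmp status
  tmp="$(mktemp)" || return 1
  # Write the form-encoded body without a trailing newline
  printf '%s' "$body" > "$tmp"
  ab -c 10 -n 100 \
    -T application/x-www-form-urlencoded \
    -p "$tmp" \
    "$@" "$url"
  status=$?
  # Clean up even if ab failed, then report ab's exit status
  rm -f "$tmp"
  return "$status"
}
```

Extra options such as -A username:password go after the body argument and are passed straight through to ab.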

PDF cheatsheet

Some common PDF-related commands I find myself doing and forgetting.

# Convert a bunch of images to an OCR'd PDF
# Relies on my own img2pdf script:
img2pdf *.jpg
# Make a PDF of scanned page images searchable
# pdfimages comes with Poppler, which you'll need to get img2pdf working
mogrify -negate *.pbm # pdfimages seems to invert colors for some reason
img2pdf *.pbm

single day from GPX

I often carry around a GPS with me as I’m exploring, and I don’t like to remember to clear the track log, so every time I pull a GPX file off the device it’s enormous and has a bunch of extraneous data I don’t want. What I generally want is the track from the previous day. I haven’t found a completely satisfactory way to do this yet, but this comes close:

gpsbabel -t -i gpx -f in.gpx \
  -x nuketypes,waypoints,routes \
  -x track,start=20150729,stop=20150730 \
  -o gpx -F out.gpx

The nuketypes filter removes all the waypoints and routes, and the track filter’s start and stop options restrict the track points by timestamp (in whatever time zone the GPX file uses, which for me is UTC). What’s annoying is that the result inevitably includes a few leftover track points from the previous day. I could leave them in, but usually I’m only filtering my tracklog like this to share it with other people, so I don’t want a couple of extraneous points. I can view the data, find the earliest track point that is actually correct, and use that for the time filter, but that requires an additional step using a GUI.
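
Since what I usually want is the previous day, the two dates can be computed rather than typed. A rough sketch, assuming the GPX timestamps are UTC as described above; the function names utc_day and yesterday_filter are made up, and the GNU/BSD detection via `date --version` is my own assumption about the two common date implementations.

```shell
# Hypothetical helper: print the UTC date at a signed day offset
# (e.g. -1 or +0) as YYYYMMDD.
utc_day() {
  local offset="$1"
  if date --version >/dev/null 2>&1; then
    # GNU date (Linux)
    date -u -d "${offset} days" +%Y%m%d
  else
    # BSD/macOS date
    date -u -v"${offset}"d +%Y%m%d
  fi
}

# Hypothetical helper: build the gpsbabel track filter for yesterday (UTC)
yesterday_filter() {
  printf 'track,start=%s,stop=%s\n' "$(utc_day -1)" "$(utc_day +0)"
}
```

The output drops into the command in place of the hard-coded dates, e.g. -x "$(yesterday_filter)". It does nothing about the leftover points from the day before.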

FWIW, taking a quick look at the file is as easy as converting to KML and using Google Earth:

ogr2ogr -f KML gpx.kml out.gpx

Not a new problem, but something I thought would have been easier using open source software, thus I’m documenting my solution. With some research and experimentation, I adapted this script into something that will take a collection of images of text (e.g. pages from a book or a paper) and convert them into a PDF you can search. You will need to install some other packages, and my instructions here assume you’re using homebrew on a Mac, but the script should be adaptable to any platform that can run tesseract, imagemagick, and ghostscript.

I will say that it’s WAY slower than the built-in OCR functionality on some scanners / printers I’ve seen. Not sure why. And FWIW, the process of editing and cropping scanned pages still takes a lot of time.

FWIW, you can also use ghostscript to add author/title metadata:
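
A minimal sketch of one way to do that, assuming the pdfmark approach with ghostscript's pdfwrite device. The function names and the file arguments are hypothetical, and this is not necessarily the command I originally used.

```shell
# Hypothetical helper: print a pdfmark snippet setting title and author.
# Assumes neither string contains backslashes or unbalanced parentheses,
# which would need escaping in PostScript.
pdfmarks() {
  printf '[ /Title (%s)\n  /Author (%s)\n  /DOCINFO pdfmark\n' "$1" "$2"
}

# Hypothetical wrapper: set_pdf_meta IN.pdf OUT.pdf TITLE AUTHOR
set_pdf_meta() {
  local marks status
  marks="$(mktemp)" || return 1
  pdfmarks "$3" "$4" > "$marks"
  # pdfwrite rewrites the whole PDF, applying the pdfmark file last
  gs -q -dBATCH -dNOPAUSE -sDEVICE=pdfwrite \
    -sOutputFile="$2" "$1" "$marks"
  status=$?
  rm -f "$marks"
  return "$status"
}
```

Something like set_pdf_meta in.pdf out.pdf "Some Title" "Some Author" writes a new file and leaves the original untouched.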

Just tried to set up cocoapods and ran into this error

checking for -std=c99 option to compiler... yes
checking for CoreFoundation... no
checking for main() in -lCoreFoundation... no
CoreFoundation is needed to build the Xcodeproj C extension.

Dislike. Finally found an answer that worked for me and didn’t resort to circumventing my homebrew / rvm setup, though it did involve installing a new ruby:

brew link autoconf
rvm install ruby-2.0.0-p353 --with-gcc=clang --verify-downloads 1
rvm use ruby-2.0.0-p353
gem install cocoapods

windshaft on ubuntu

Just set up Windshaft on Ubuntu for the first time, and since there were a few hiccups I figured I’d give a rough outline of my installation process:

# Use mapnik 2.2
# I was getting some errors with interactivity requests on an older version of
# mapnik
sudo add-apt-repository ppa:mapnik/v2.2.0
sudo apt-get update
sudo apt-get install libmapnik libmapnik-dev mapnik-utils python-mapnik
# Uninstall system gyp, it was causing problems for me when compiling mapnik
# node extensions, e.g. gyp 'module' object has no attribute 'script_main'
sudo aptitude remove gyp
# Use nvm instead of the package nodejs,
# which didn't seem to work for me, though that may have been tangled up with
# the mapnik 2.2 issue. Either way, nvm should install a working version of
# nodejs w/ npm.
curl | sh
nvm install 0.10.26 # or whatever works
# install windshaft with npm
npm install windshaft

I’m also using nginx and Passenger to serve the web app. I’ll assume you know how to install those two things and skip to my nginx conf:

http {
    passenger_root /path/to/passenger;
    passenger_ruby /path/to/ruby;
    passenger_nodejs /home/inaturalist/.nvm/v0.10.26/bin/node;
    server {
        listen 80;
        passenger_enabled on;
        passenger_app_root /path/to/app;
        passenger_document_root /path/to/app/public;
        error_log /var/log/nginx/your-error.log;
        access_log /var/log/nginx/your-access.log;
    }
}

One thing that threw me for a while is that Passenger won’t work with older versions of node, so make sure you’re using 1.0 or higher.

Also note that console.log in Node will write to /var/log/nginx/error.log, not to one of your server’s custom log files.

ridiculous rails boot times

I recently ran safe-upgrade on Ubuntu, which involved updates to a bunch of stuff, including Linux headers and postgres, and now my Rails boot time is 3x longer. I still don’t know why, which is frustrating, but I did learn about a few things along the way.

The first is Bumbler, a tool for inspecting gem load times, among other things.

The second is Passenger’s passenger_start_timeout setting, which is how I’m addressing my problem without really addressing my problem.

The third is that Rackspace now has a “Performance” VPS product that seems to be both faster and cheaper than their old VPSs. Unfortunately transitioning is non-trivial, since you can’t do it for 1st gen cloud servers, and if you want to create a Performance Cloud Server from a Next Gen image you can only do it from a 1GB Next Gen server.

# reproject into WGS84 lat/lon
gdalwarp -t_srs EPSG:4326 -dstnodata 0 input.tif output.tif
gdal_translate -of vrt -expand rgba output.tif output.vrt
gdal2tiles.py -p geodetic -k output.vrt

This mostly works, but the nodata from the original GeoTIFF doesn’t get preserved as a PNG alpha channel in the KMZ tiles. Still need to figure that out.

visualizing a rails schema

Really didn’t find a perfect solution, which would allow me to optionally specify a model or a set of models and show all the attributes and relationships for only those models. railroady comes pretty close though, especially when using graphviz output and OmniGraffle’s layout engines. rails-erd made a decent full-model diagram too.


  gem "rails-erd"
  gem "railroady"


railroady -M --hide-through -i \
  -s app/models/*observation*,app/models/photo.rb,app/models/sound.rb,app/models/taxon.rb \


Hierarchical layouts seemed to work best, with some tweaking.

iNat observation model and some associated models.

I’m sure this is all irrelevant if you use cocoapods, but I don’t (yet). My approach to installing SSZipArchive was to add it as a git submodule and drag the folder including SSZipArchive.h and minizip into my project, adding it by reference. This caused RestKit compilation to barf, mostly a lot of parse issues in RKURL.h like Expected identifier or '(', which is ridiculous b/c nothing changed in RestKit.

I eventually found my solution: change the file type of the minizip .c files to “Objective-C Source.”


Now my project builds and everything works normally. Why this solution works is beyond my extremely limited knowledge of C and Xcode. These files look like C and/or C++, so why does telling Xcode to treat them like Objective-C even work?