Nov 13

Practical Bayesian Optimization with Spearmint

Recently, I’ve become interested in using Gaussian Processes for hyperparameter optimization. Coincidentally, there’s a great  NIPS 2012 paper by Jasper Snoek, Hugo Larochelle, and Ryan Adams about that very topic. Thankfully, not only is the paper insightful, but they also have Python source code available, called “spearmint” (I guess they chose the name so that it sounds like “experiment”). This post documents my experience using this code on a simple toy example.  This is mostly about the nuts and bolts of using the code, but I definitely recommend reading the paper.

The problem this code solves is the following:  given a set of N input/output pairs about a function, what next points should I query to maximize the expected improvement in my optimization task?

To demo this code out, I decided to look at a simple, very unrealistic problem: take a simple 1D polynomial and watch the algorithm explore for the minimum.  I chose this polynomial, shown here courtesy of Wolfram Alpha:

To cut to the chase, here’s the result, as an animated video.  Each set of red points mark the locations the algorithm wants to query next, given all black points.  Overall, not bad.

Download Video: MP4

Under the Hood

You can find all the code needed to generate this example (aside from spearmint, that is), in this zip file. Extract it to a subdirectory of your spearmint folder named “simplepoly”.

I used the “spearmint-lite” version of the code, which is the least automated and (IMHO) most understandable of the versions of spearmint available. The gist is that you provide a plain text file “results.dat” that looks like this:

  4.68750 0  -1.50000
  0.00000 0   0.00000
  0.93750 0   1.50000

where the first column is the function value, and the third column is your input “x” value. The second column here is just filler, but can be used by spearmint to estimate *how long* (in seconds) the each function evaluation will take.

You then ask spearmint where to evaluate next, via “python spearmint-lite.py –n=3”, and it will add 3 lines to “results.data” that indicate the three “pending” input points it deems the best to fill in next, under the chosen acquisition function (here this is expected improvement).

  4.68750 0  -1.50000
  0.00000 0   0.00000
  0.93750 0   1.50000
P P 0.571008983601 
P P -0.3486328125 
P P 0.1845703125 

That’s it. I’ve wrapped the whole process up in a bash script, that goes through 10 iterations of the two-step pattern (1) ask spearmint what 3 points to query next, (2) evaluate function at those points. Spearmint wants to have optimization problem located in a subdirectory relative to the main code, so that explains the cd-ing back and forth. I also force the optimization to “start fresh” with an original set of 3 input/output pairs in “orig_results.dat”.

cp orig_results.dat results.dat
for i in {1..10}
    cd ../
    echo 'Fetching next query points from spearmint...'
    python spearmint-lite.py --n=3 --method=GPEIOptChooser \
              --method-args=mcmc_iters=25,noiseless=1 simplepoly

    cd simplepoly/
    echo 'Evaluating function at given points...'
    python simplepoly.py

“simplepoly.py” is a simple python script to evaluate my simple polynomial function at pending points and fill in “results.dat” appropriately. This is simple enough, so I’ll let that code speak for itself.

I should mention that this is a constrained optimization problem. I told spearmint in advance to only consider the range (-1.5, +1.5) for this function. This is specified in the “config.json” file, along with the dimensionality of the problem (just a single input variable here).

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <s> <strike> <strong>