Simple Thinking Sphinx on Dreamhost

*** Please note: without a cron job this will probably not work for more than a day of light use (if at all). And it isn't at all authorised by Dreamhost! ***

For a recent client project I've used a Dreamhost unlimited account. The value for money, the resources available and the fact that you don't have to build or set up the server environment make it an easy win for a site that isn't going to have a huge amount of traffic or need much processing.

Post-launch I got to work putting together a basic search engine. Here's a quick run-through of the steps it took to get a very simple Sphinx instance working on Dreamhost, along with a few hurdles thrown in the way by various googled articles.

Development Environment

Using the guide from Freelancing God (FG), install Sphinx locally:

curl -O http://sphinxsearch.com/downloads/sphinx-0.9.8-rc2.tar.gz
tar zxvf sphinx-0.9.8-rc2.tar.gz
cd sphinx-0.9.8-rc2
./configure
make
sudo make install

Then install the Thinking Sphinx (TS) plugin into your application:

script/plugin install git://github.com/freelancing-god/thinking-sphinx.git

Any problems with that, check out the FG page linked.

Getting a basic search going

Following tutorials such as the Sphinx Railscast will get you there pretty quick.

In your searchable model you need to define an index:


class Page < ActiveRecord::Base
  # Fields that Sphinx will index for full-text search
  define_index do
    indexes :title
    indexes :long
    indexes :short
  end

  # ... rest of the model
end

Run the indexer and start the Sphinx instance:


rake thinking_sphinx:index
rake thinking_sphinx:start

After this you'll be able to search on your model. So, using script/console:

@searched_pages = Page.search("query")

will return what you're looking for!

Setting up Dreamhost

First things first: you need to install Sphinx under your home directory (in ~/local), as posted by Hugh Evans:

cd ~/
mkdir -p local
wget http://sphinxsearch.com/downloads/sphinx-0.9.8.1.tar.gz
tar -xzf sphinx-0.9.8.1.tar.gz
cd sphinx-0.9.8.1/
./configure --prefix=$HOME/local/ --exec-prefix=$HOME/local/
make
make install

Then add the new bin directory to your PATH:

echo 'export PATH="$PATH:$HOME/local/bin"' >> ~/.bash_profile
source ~/.bash_profile
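
If you want to confirm the PATH change took, here's a minimal Ruby sketch that looks for the two Sphinx binaries the TS rake tasks rely on. The `find_in_path` helper is my own illustration, not part of Sphinx or Thinking Sphinx.

```ruby
# Hypothetical helper: returns the full path of the first executable
# called `name` found in the colon-separated `path` string, or nil.
def find_in_path(name, path = ENV["PATH"].to_s)
  path.split(File::PATH_SEPARATOR).each do |dir|
    next if dir.empty?
    candidate = File.join(File.expand_path(dir), name)
    return candidate if File.file?(candidate) && File.executable?(candidate)
  end
  nil
end

# Report where (or whether) each Sphinx binary was found.
%w[indexer searchd].each do |bin|
  puts "#{bin}: #{find_in_path(bin) || 'not found on PATH'}"
end
```

Both should resolve to somewhere under ~/local/bin if the install above went through.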

You can choose to set up a cron task at this point too, but I'm not going into that here.

Also, at this point there's talk of using Sphinx being against the TOS in DH's eyes... but we'll see whether the process gets killed or not!
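
To find out whether searchd has been killed, one option is to test the process named in its pid file. This is only a sketch: `searchd_running?` is a hypothetical helper, and it assumes the pid file contains nothing but the process id.

```ruby
# Hypothetical helper: true if the process recorded in pid_file is alive.
def searchd_running?(pid_file)
  return false unless File.file?(pid_file)
  pid = File.read(pid_file).to_i
  return false unless pid > 0
  Process.kill(0, pid) # signal 0 only checks that the process exists
  true
rescue Errno::ESRCH    # no such process: it was killed or never started
  false
rescue Errno::EPERM    # process exists but belongs to another user
  true
end

# Possible use from a scheduled script (placeholder paths):
#   pid = "/home/YOURUSERNAME/DOMAIN.co.uk/shared/log/searchd.production.pid"
#   unless searchd_running?(pid)
#     system("cd /home/YOURUSERNAME/DOMAIN.co.uk/current && " \
#            "rake thinking_sphinx:start RAILS_ENV=production")
#   end
```

The commented usage reuses the same rake task as elsewhere in this post. Whether Dreamhost tolerates a script like that is a separate question.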

Configuring Sphinx for DH

Create a file called sphinx.yml in the RAILS_ROOT/config/ folder.

Because Dreamhost uses an externally referenced MySQL server instead of localhost, you need to set up the sql_* parameters:


  sql_host: "mysql.YOURDOMAIN"
  sql_port: 3306
  sql_user: "USER"
  sql_password: "PASSWORD"
  sql_database: "DATABASE"

And because you installed Sphinx in your local area:


  bin_path: '/home/YOURUSERNAME/local/bin'

Finally, after setting whatever memory or fine-tuning settings you require, set up the locations for the Sphinx files:


  config_file: "/home/YOURUSERNAME/DOMAIN.co.uk/shared/production.sphinx.conf"
  searchd_log_file: "/home/YOURUSERNAME/DOMAIN.co.uk/shared/log/searchd.log"
  query_log_file: "/home/YOURUSERNAME/DOMAIN.co.uk/shared/log/searchd.query.log"
  pid_file: "/home/YOURUSERNAME/DOMAIN.co.uk/shared/log/searchd.production.pid"
  searchd_file_path: "/home/YOURUSERNAME/DOMAIN.co.uk/shared/db/sphinx"
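
Since the point of these paths is that they survive a deploy, a quick check that they all sit under the shared folder can save some head-scratching. Here's a rough Ruby sketch, assuming the settings are nested under a `production:` key in sphinx.yml. The `paths_outside_shared` helper is my own, and the sample deliberately puts the pid file in the wrong place to show what gets flagged.

```ruby
require "yaml"

# The path settings used in this post.
PATH_KEYS = %w[config_file searchd_log_file query_log_file pid_file searchd_file_path]

# Hypothetical helper: names of path settings that are NOT under shared_dir.
def paths_outside_shared(settings, shared_dir)
  prefix = shared_dir.chomp("/") + "/"
  PATH_KEYS.select do |key|
    value = settings[key]
    value && !value.start_with?(prefix)
  end
end

# Sample config; pid_file points at current/ on purpose.
sample = YAML.safe_load(<<~YML)
  production:
    bin_path: '/home/YOURUSERNAME/local/bin'
    config_file: "/home/YOURUSERNAME/DOMAIN.co.uk/shared/production.sphinx.conf"
    pid_file: "/home/YOURUSERNAME/DOMAIN.co.uk/current/log/searchd.production.pid"
YML

p paths_outside_shared(sample["production"], "/home/YOURUSERNAME/DOMAIN.co.uk/shared")
# prints ["pid_file"]
```

For a real check, swap the heredoc for the contents of your own config/sphinx.yml.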

That should have you ready to start deploying.

Deploying

Using Git + Capistrano for deployment (and Passenger as the HTTP server), my deploy.rb's namespace block looks like this:


namespace :deploy do
  task :restart do
    after_symlink
    restart_sphinx
    run "touch #{deploy_to}/current/tmp/restart.txt"
  end
  
  task :start do 
    # nothing  (this avoids the 'spin' script issue)
  end
  
  desc "Re-establish symlinks"
  task :after_symlink do
    run <<-CMD
      rm -fr #{release_path}/db/sphinx &&
      ln -nfs #{shared_path}/db/sphinx #{release_path}/db/sphinx
    CMD
  end
  
  desc "Stop the sphinx server"
  task :stop_sphinx , :roles => :app do
    run "cd #{current_path} && rake thinking_sphinx:stop RAILS_ENV=production"
  end

  desc "Start the sphinx server" 
  task :start_sphinx, :roles => :app do
    run "cd #{current_path} && rake thinking_sphinx:configure RAILS_ENV=production && rake thinking_sphinx:index RAILS_ENV=production && rake thinking_sphinx:start RAILS_ENV=production"
  end

  desc "Restart the sphinx server"
  task :restart_sphinx, :roles => :app do
    stop_sphinx
    start_sphinx
  end

end

There's probably a neater way to do this, but basically this makes sure Sphinx's indexes and conf files live in the shared deployment folder.
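
One possible tidy-up for those long rake command strings is a small helper that builds them. This is a sketch only: `sphinx_rake` is a name I've made up, and I haven't run it inside a real deploy.

```ruby
# Hypothetical helper: builds one shell command that changes into `dir`
# and runs the given thinking_sphinx rake tasks in order.
def sphinx_rake(dir, tasks, env = "production")
  steps = tasks.map { |t| "rake thinking_sphinx:#{t} RAILS_ENV=#{env}" }
  (["cd #{dir}"] + steps).join(" && ")
end

puts sphinx_rake("/path/to/current", [:configure, :index, :start])
```

Inside deploy.rb the start task body would then shrink to something like run sphinx_rake(current_path, [:configure, :index, :start]), assuming the helper is defined at the top of the file.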

I recommend you try all this in a staging area first, obviously. You can use Dreamhost's control panel to set up a staging subdomain with a new database in whatever fashion you prefer.

Any problems with this script, flag them up, please! This is as much for my future reference as for you googlers out there.