*** Please note: this will probably not work for more than a day of light use (if at all) without a cron job. And it isn’t at all authorised by Dreamhost! ***
For a recent client project I used a Dreamhost unlimited account. Given the value for the resources available, and the fact that you don’t have to build or set up the server environment, it’s an easy win for a site that isn’t going to see a huge amount of traffic or a large amount of processing.
Post-launch I got to work putting together a basic search engine. Here’s a quick run-through of the steps it took to get a very simple Sphinx instance working on Dreamhost, plus a few hurdles that various googled articles threw in the way.
Development Environment
Using the guide from FG, install Sphinx locally:
curl -O http://sphinxsearch.com/downloads/sphinx-0.9.8-rc2.tar.gz
tar zxvf sphinx-0.9.8-rc2.tar.gz
cd sphinx-0.9.8-rc2
./configure
make
sudo make install
Then install the Thinking Sphinx (TS) plugin into your application:
script/plugin install git://github.com/freelancing-god/thinking-sphinx.git
If you have any problems with that, check out the FG guide.
Getting a basic search going
Following tutorials such as the Sphinx Railscast will get you there pretty quickly.
In your searchable model you need to define an index:
class Page < ActiveRecord::Base
  define_index do
    indexes :title
    indexes :long
    indexes :short
  end
  # ...
end
Run the indexer and start the Sphinx instance:
rake thinking_sphinx:index
rake thinking_sphinx:start
After this you'll be able to search on your model. So, using script/console,
@searched_pages = Page.search("query")
will return what you're looking for!
Setting up Dreamhost
First things first: you need to install Sphinx in a local area under your home directory, as posted by Hugh Evans:
cd ~/
mkdir -p local
wget http://sphinxsearch.com/downloads/sphinx-0.9.8.1.tar.gz
tar -xzf sphinx-0.9.8.1.tar.gz
cd sphinx-0.9.8.1/
./configure --prefix=$HOME/local/ --exec-prefix=$HOME/local/
make
make install
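Then add the new bin directory to your PATH (single quotes on the outside, so $PATH is expanded when the profile is read rather than when the line is written):
echo 'export PATH="$PATH:$HOME/local/bin"' >> ~/.bash_profile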
source ~/.bash_profile
You can choose to set up a cron task at this point too, but I'm not going into that.
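For what it's worth, here is a rough sketch of the kind of check a cron job could run to see whether searchd is still alive. It assumes searchd writes its process id to a pid file. The helper name searchd_running? is my own illustration, not something that ships with Thinking Sphinx or Dreamhost.

```ruby
# Hypothetical helper (not part of Thinking Sphinx): returns true when the
# process recorded in pid_file is still alive, false otherwise.
def searchd_running?(pid_file)
  return false unless File.exist?(pid_file)

  pid = File.read(pid_file).to_i
  return false if pid <= 0 # empty or garbled pid file

  # Signal 0 delivers nothing; it only checks that the process exists.
  Process.kill(0, pid)
  true
rescue Errno::ESRCH
  false # no such process: searchd has died or been killed
rescue Errno::EPERM
  true  # the process exists but belongs to another user
end
```

A script run from cron every few minutes could call this with the path of your searchd pid file and run rake thinking_sphinx:start RAILS_ENV=production whenever it returns false.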
Also, at this point there's talk of running Sphinx being against the TOS in DH's eyes... but we'll see whether the process gets killed or not!
Configuring Sphinx for DH
Create a file called sphinx.yml in the RAILS_ROOT/config/ folder.
Because Dreamhost uses an externally referenced MySQL server instead of localhost, you need to set up the sql_* parameters:
sql_host: "mysql.YOURDOMAIN"
sql_port: 3306
sql_user: "USER"
sql_password: "PASSWORD"
sql_database: "DATABASE"
And because you installed Sphinx in your local area:
bin_path: '/home/YOURUSERNAME/local/bin'
Finally, after setting whatever memory or fine-tuning settings you need, set up the locations for the Sphinx files:
config_file: "/home/YOURUSERNAME/DOMAIN.co.uk/shared/production.sphinx.conf"
searchd_log_file: "/home/YOURUSERNAME/DOMAIN.co.uk/shared/log/searchd.log"
query_log_file: "/home/YOURUSERNAME/DOMAIN.co.uk/shared/log/searchd.query.log"
pid_file: "/home/YOURUSERNAME/DOMAIN.co.uk/shared/log/searchd.production.pid"
searchd_file_path: "/home/YOURUSERNAME/DOMAIN.co.uk/shared/db/sphinx"
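To show how those fragments fit together in one file, here is a small sketch that builds the settings as a Ruby hash and dumps them as YAML. It assumes the settings sit under a production: key, since Thinking Sphinx reads sphinx.yml per environment. The values are the same placeholders as above, and the shared variable is only there to save typing.

```ruby
require 'yaml'

# Placeholder path, exactly as in the fragments above; swap in your own.
shared = "/home/YOURUSERNAME/DOMAIN.co.uk/shared"

settings = {
  # Assumption: settings are keyed by Rails environment.
  "production" => {
    "sql_host"          => "mysql.YOURDOMAIN",
    "sql_port"          => 3306,
    "sql_user"          => "USER",
    "sql_password"      => "PASSWORD",
    "sql_database"      => "DATABASE",
    "bin_path"          => "/home/YOURUSERNAME/local/bin",
    "config_file"       => "#{shared}/production.sphinx.conf",
    "searchd_log_file"  => "#{shared}/log/searchd.log",
    "query_log_file"    => "#{shared}/log/searchd.query.log",
    "pid_file"          => "#{shared}/log/searchd.production.pid",
    "searchd_file_path" => "#{shared}/db/sphinx"
  }
}

# Print the YAML that would go into RAILS_ROOT/config/sphinx.yml.
puts settings.to_yaml
```

Writing the file by hand is just as good; the point is only that every setting ends up nested under the environment name.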
That should have you ready to start deploying.
Deploying
Using Git + Capistrano for deployment (and Passenger for the http server) my deploy.rb's namespace area looks like this:
namespace :deploy do
  task :restart do
    after_symlink
    restart_sphinx
    run "touch #{deploy_to}/current/tmp/restart.txt"
  end

  task :start do
    # nothing (this avoids the 'spin' script issue)
  end

  desc "Re-establish symlinks"
  task :after_symlink do
    run <<-CMD
      rm -fr #{release_path}/db/sphinx &&
      ln -nfs #{shared_path}/db/sphinx #{release_path}/db/sphinx
    CMD
  end

  desc "Stop the sphinx server"
  task :stop_sphinx, :roles => :app do
    run "cd #{current_path} && rake thinking_sphinx:stop RAILS_ENV=production"
  end

  desc "Start the sphinx server"
  task :start_sphinx, :roles => :app do
    run "cd #{current_path} && rake thinking_sphinx:configure RAILS_ENV=production && rake thinking_sphinx:index RAILS_ENV=production && rake thinking_sphinx:start RAILS_ENV=production"
  end

  desc "Restart the sphinx server"
  task :restart_sphinx, :roles => :app do
    stop_sphinx
    start_sphinx
  end
end
There's probably a neater way to do this, but basically this makes sure Sphinx's indexes and conf files live in the shared deployment folder.
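One possibly neater option, as a sketch only: build the chained rake command from a list of task names instead of typing it out. The helper name sphinx_rake and the /path/to/current path are made up for illustration. Only the thinking_sphinx rake tasks and RAILS_ENV=production come from the script above.

```ruby
# Hypothetical helper: builds one shell command that changes into path and
# runs the given thinking_sphinx rake tasks in order, stopping at the first
# one that fails (because the parts are joined with &&).
def sphinx_rake(path, *tasks)
  rakes = tasks.map { |t| "rake thinking_sphinx:#{t} RAILS_ENV=production" }
  (["cd #{path}"] + rakes).join(" && ")
end

# Example with a placeholder path; in deploy.rb you would pass current_path.
puts sphinx_rake("/path/to/current", :configure, :index, :start)
```

Inside deploy.rb the start task body could then shrink to something like run sphinx_rake(current_path, :configure, :index, :start), though I haven't tried that against a live deploy.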
I recommend you try all this in a staging area first, obviously... and you can use Dreamhost's control panel to set up a staging subdomain with a new database in whatever fashion you prefer.
If you spot any problems with this script, flag them up, please! This is as much for my future reference as for you googlies out there.