[Update: I now use Midwestern Mac, LLC’s hosted Apache Solr search for Drupal to host Solr; it’s a lot easier than setting up (or even needing) a VPS for Solr.]

At the Archdiocese of St. Louis, I manage more than 15 separate Drupal websites (plus a few others), and I have often wanted to use Apache Solr for search across all of them. I finally had some time to tackle this, and I now have a pretty good (and very fast) Solr server set up. It is shared across all these sites, which so far run on two different web servers at two different hosting companies.

The main Archdiocesan sites (archstl.org, archstldev.com, and stlouisreview.com) are all hosted via SoftLayer in Dallas, while Catholic Youth Apostolate sites (like stlyouth.org and cycstl.net) are hosted via Hot Drupal in North Carolina.

I was able to set up a Linode (linode.com) for less than $20 to run Apache Solr via Jetty, and that server is accessible to all our other servers, which send it index data and receive search results. This frees our main web servers from expensive MySQL search queries and from the large databases that result from storing 20k+ nodes’ search data in the main site DB.

You can find the process by which I set up the search server in this issue on the Development website. The best thing about this system is that I can really make the search server fly; ping takes about 30-40ms between the search server and our other servers, and query results take only about 150-250ms to reach the websites.
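If you want to check numbers like these for your own setup, here is a minimal Python sketch that times a few requests against a Solr select handler. It is not how I measured the figures above. The host name, port, and `/solr` path in the usage note are placeholders, and the helper names (`build_select_url`, `http_fetch`, `time_query`) are made up for illustration. It assumes your Solr exposes the standard `select` handler and accepts `wt=json`.

```python
import time
import urllib.parse
import urllib.request


def build_select_url(base_url, query, rows=10):
    """Build a Solr select URL that asks for a JSON response."""
    params = urllib.parse.urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{base_url.rstrip('/')}/select?{params}"


def http_fetch(url, timeout=5.0):
    """Fetch a URL and return the raw response body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def time_query(url, fetch=http_fetch, repeats=5):
    """Return the round-trip time of each request, in milliseconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fetch(url)  # the response body is discarded; only the timing matters
        timings.append((time.perf_counter() - start) * 1000.0)
    return timings
```

For example, `time_query(build_select_url("http://solr.example.com:8983/solr", "drupal"))` would return five timings in milliseconds for a hypothetical host. Run it from the web server rather than your desktop, since the point is the latency between the two servers.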

Any large organization looking to vastly improve search performance (and usability) should look into setting up a dedicated search VPS or server, depending on search traffic. This is especially easy on a Drupal site, where the Apache Solr Search Integration module plugs in right out of the box.

Our Linode Solr server typically sits close to idle, even at peak hours (right now its load average is showing 0.00, 0.00, 0.00), and I’ll probably set it up to do some other off-site tasks as well, since it has spare CPU, memory, and disk space (and a really fat pipe to the Internet!).
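Those three numbers are the 1, 5, and 15 minute load averages. If you would rather check them from a script than eyeball them, a rough Python sketch might look like the following. The `is_idle` and `describe_load` names and the 0.1-per-CPU threshold are my own assumptions for illustration, not a standard definition of "idle", and `os.getloadavg()` is only available on Unix-like systems.

```python
import os


def is_idle(load_averages, cpu_count, threshold=0.1):
    """Return True if every load average is below threshold per CPU."""
    return all(load / cpu_count < threshold for load in load_averages)


def describe_load():
    """Read the 1, 5, and 15 minute load averages (Unix only)."""
    one, five, fifteen = os.getloadavg()
    cpus = os.cpu_count() or 1  # cpu_count() can return None
    state = "idle" if is_idle((one, five, fifteen), cpus) else "busy"
    return f"{one:.2f}, {five:.2f}, {fifteen:.2f} ({state})"
```

Calling `describe_load()` on the search server returns the three averages plus a rough idle/busy label, which is handy before deciding whether the box has room for extra jobs.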