Setting up an Apache Solr Search Server (for many sites/hosts)

In the Archdiocese of St. Louis, I manage more than 15 separate Drupal websites (plus a few others), and I have often wanted to use Apache Solr for search across all of them. I finally had some time to tackle this, and I now have a pretty good (and very fast) Solr server set up, shared across all these sites on two (so far) different web servers at two different hosting companies.

The main Archdiocesan sites (archstl.org, archstldev.com, and stlouisreview.com) are all hosted via SoftLayer in Dallas, while Catholic Youth Apostolate sites (like stlyouth.org and cycstl.net) are hosted via Hot Drupal in North Carolina.

I was able to set up a Linode (linode.com) for less than $20 to run Apache Solr via Jetty, and that server is accessible to all our other servers for sending and receiving search index data. This keeps our main web servers free from expensive MySQL search queries and from the large databases that result from storing 20k+ nodes' search data in the main site DB.

You can find the process by which I set up the search server in this issue on the Development website. The best thing about this system is that I can really make the search server fly; ping takes about 30-40ms between the search server and our other servers, and queries only take about 150-250ms to reach the websites.
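As a rough illustration of how query times like these could be measured from one of the web servers, here is a minimal Python sketch. It assumes a Solr instance that exposes the standard /select request handler and returns JSON; the host name solr.example.com, the port, and the function names build_select_url and timed_search are placeholders for illustration, not the actual setup described in this post.

```python
import json
import time
import urllib.parse
import urllib.request


def build_select_url(base_url, query, rows=10):
    """Build a URL for Solr's /select handler (base_url is an assumed Solr root)."""
    params = urllib.parse.urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{base_url.rstrip('/')}/select?{params}"


def timed_search(base_url, query, rows=10, timeout=5.0):
    """Run one search and return (round-trip time in ms, number of matches)."""
    url = build_select_url(base_url, query, rows)
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as response:
        body = json.load(response)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    # Solr's JSON response reports the total match count under response.numFound.
    return elapsed_ms, body["response"]["numFound"]
```

Calling something like timed_search("http://solr.example.com:8983/solr", "parish") a handful of times from each web server would give a rough picture of round-trip time as the site sees it, which includes network latency on top of Solr's own query time.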

Any large organization looking to vastly improve search performance (and usability) should look into setting up a dedicated search VPS or server, depending on search traffic. This is especially easy on a Drupal site, where the Apache Solr Search Integration module plugs in right out of the box.

Our Linode Solr server typically sits close to idle, even at peak hours (right now its load average is showing 0.00, 0.00, 0.00), and I'll probably set it up to do some other tasks off-site as well, since it has spare CPU, memory, and disk space available (and a really fat pipe to the Internet!).

Comments

Steve's picture

I also wrote a blog post on running multi-site Solr on Ubuntu that merges the handbook page you mentioned with Tomcat (for those who might want to use Tomcat instead of Jetty).

oscatholic's picture

I noticed that post at one point, but was more interested in Jetty, since it was a little simpler. I do have Tomcat running on the server, but it was giving me some headaches.

Is there any significant performance difference between Jetty and Tomcat?

Advancing the faith.

Steve's picture

Well, this post is very similar to yours in that it also uses Jetty. He mentions that it performs well and uses fewer resources than Tomcat. I might have to try Jetty myself just to see if I can see a difference.

oscatholic's picture

Seeing that I only get about 200 searches per hour, it's not a huge deal if there's a difference of a few ms here or there, but as traffic grows, it could become more important.

Advancing the faith.
