Sphinx - Xavier Shay's Blog

Integration testing with Cucumber, RSpec and Thinking Sphinx
Xavier Shay, 2008-10-01T06:11:00Z

<p>Ideally you would include sphinx in your integration tests. It&#8217;s really just like your database. In practice, this is problematic. Ensuring the DB is started and triggering a re-index after each model load is doable, if slow, with a small bit of hacking on thinking sphinx (hint: change the initializer for <code>ThinkingSphinx::Configuration</code> to allow you to specify the environment). Here&#8217;s the rub though: if you&#8217;re using transactional fixtures, the sphinx indexer won&#8217;t be able to see any of your data, because the indexer reads over its own database connection and each test&#8217;s records sit in a transaction that is never committed. Turning transactional fixtures off can really slow down your tests, and once you add in the re-indexing time you&#8217;re going to be making a few cups of coffee while they run.</p>
<p>One approach I&#8217;ve been taking is to stub out the <code>search</code> methods with <a href="http://github.com/btakita/rr/tree/master">RR</a>. I know, I know, stubbing in your integration tests is evil. I&#8217;m being pragmatic here. For most applications your search is trivial (find me results for this keyword), and if you unit test your <code>define_index</code> block you&#8217;re pretty well covered. To go one step further you could unit test your controllers with an expectation on the search method, or have a separate suite of non-transactional integration tests running against sphinx. I like the latter, but haven&#8217;t done it yet.</p>
<p>Enough talk! Here&#8217;s the magic you need to get it working with <a href="http://github.com/aslakhellesoy/cucumber/tree/master">cucumber</a>:</p>
<table class="CodeRay"><tr>
<td class="code"><pre><span class="c"># features/steps/env.rb</span>
require <span class="s"><span class="dl">'</span><span class="k">rr</span><span class="dl">'</span></span>
<span class="c"># Make RR's stub/mock methods available inside step definitions</span>
<span class="co">Cucumber</span>::<span class="co">Rails</span>::<span class="co">World</span>.send(<span class="sy">:include</span>, <span class="co">RR</span>::<span class="co">Adapters</span>::<span class="co">RRMethods</span>)

<span class="c"># features/steps/*_steps.rb</span>
<span class="co">Given</span> <span class="rx"><span class="dl">/</span><span class="k">a car with model '(</span><span class="ch">\w</span><span class="k">+)' exists</span><span class="dl">/</span></span> <span class="r">do</span> |model|
  car = <span class="co">Car</span>.create!(<span class="sy">:model</span> =&gt; model)
  <span class="c"># Stub the search so the step never touches the sphinx index</span>
  stub(<span class="co">Car</span>).search(model) { [car] }
<span class="r">end</span>
</pre></td>
</tr></table>

Finding related content with Sphinx
2008-05-31T16:50:00Z

<p>Previous efforts to <a href="http://rhnh.net/2008/04/16/classifier-gem-rubbish-for-recommending-posts">find related posts with the classifier gem</a> bore no fruit, so I tried another approach using sphinx. It turned out to be a winner.</p>
<p>The basic theory is to index all posts by tag, then to find related posts just use the current post&#8217;s tags as a search string. Remember to exclude the current post from the search results. For this blog, I use tags for the main categories, which were corrupting the results: almost everything is tagged &#8216;Ruby&#8217;, so that tag doesn&#8217;t add any value in determining likeness.
So rather than indexing all tags I excluded some of the main ones.</p>
<table class="CodeRay"><tr>
<td class="code"><pre><span class="r">class</span> <span class="cl">Post</span> &lt; <span class="co">ActiveRecord</span>::<span class="co">Base</span>
  has_many <span class="sy">:searchable_tags</span>,
    <span class="sy">:through</span> =&gt; <span class="sy">:taggings</span>,
    <span class="sy">:source</span> =&gt; <span class="sy">:tag</span>,
    <span class="sy">:conditions</span> =&gt; <span class="s"><span class="dl">&quot;</span><span class="k">tags.name NOT IN ('Ruby', 'Code', 'Life')</span><span class="dl">&quot;</span></span>

  <span class="r">def</span> <span class="fu">related_posts</span>(number = <span class="i">3</span>)
    <span class="c"># Fetch one extra result in case the current post matches itself</span>
    <span class="co">Post</span>.search(<span class="sy">:limit</span> =&gt; number + <span class="i">1</span>, <span class="sy">:conditions</span> =&gt; {
      <span class="sy">:tag_list</span> =&gt; tag_list.join(<span class="s"><span class="dl">&quot;</span><span class="k">|</span><span class="dl">&quot;</span></span>)
    }).reject {|x| x == <span class="pc">self</span> }.first(number)
  <span class="r">end</span>

  define_index <span class="r">do</span>
    indexes searchable_tags(<span class="sy">:name</span>), <span class="sy">:as</span> =&gt; <span class="sy">:tag_list</span>
    <span class="c"># If you want to use this for normal search as well you'll have to</span>
    <span class="c"># add in title/body here as well</span>
  <span class="r">end</span>
<span class="r">end</span>
</pre></td>
</tr></table>
<p>For a more complete example, see the relevant <span class="caps">RHNH</span> commits: <a href="http://gitorious.org/projects/enki/repos/rhnh/commits/cdc0bfec73499a83c9ea299a6e1d09c7eb2a56d3">cdc0bf</a> and <a href="http://gitorious.org/projects/enki/repos/rhnh/commits/d4d844dc1cad1c55888342b8dc8dc9683efffbe3">d4d844</a>.</p>
<p>Showing links to related content is a good way to stop the bottom of your page from being a &#8216;dead end&#8217;. In the event that no related posts are found, I&#8217;m linking to the archives instead.</p>
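<p>To make the moving parts concrete, here is a minimal plain-Ruby sketch of the same idea with no Rails or sphinx involved. Everything in it is a hypothetical stand-in rather than the blog&#8217;s actual code: the <code>Post</code> struct, the <code>EXCLUDED_TAGS</code> constant, and the in-memory ranking by shared tags (which only approximates what sphinx does with the <code>|</code> query). It shows the three steps described above: drop the category tags, exclude the current post, and fall back to the archives when nothing matches.</p>

```ruby
# Hypothetical stand-in for a blog post; not the real ActiveRecord model.
Post = Struct.new(:title, :tags)

# Category tags that say nothing about likeness (assumed list, mirrors the
# NOT IN clause in the association above).
EXCLUDED_TAGS = %w[Ruby Code Life].freeze

# Tags that are worth matching on.
def searchable_tags(post)
  post.tags - EXCLUDED_TAGS
end

# The search string: the remaining tags OR-ed together.
def related_query(post)
  searchable_tags(post).join("|")
end

# In-memory approximation of the search: rank other posts by the number of
# searchable tags they share with the current post.
def related_posts(post, all_posts, number = 3)
  wanted = searchable_tags(post)
  all_posts
    .reject { |other| other.equal?(post) }                      # exclude the current post
    .map { |other| [other, (searchable_tags(other) & wanted).size] }
    .select { |_, shared| shared > 0 }
    .sort_by { |_, shared| -shared }
    .first(number)
    .map(&:first)
end

# Titles to link at the bottom of the page, or the archives as a fallback.
def related_links(post, all_posts)
  related = related_posts(post, all_posts)
  related.empty? ? ["Archives"] : related.map(&:title)
end
```

<p>With a real index you would let sphinx do the ranking; the sketch is only meant to show why excluding the broad category tags matters, since a post tagged only &#8216;Ruby&#8217; and &#8216;Code&#8217; ends up with an empty query and gets the archives link.</p>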