Re: Searching in sub-section of site

foobar3001

2008-05-26

I should add some additional information that could be important:

I am using Nutch 0.9 and the command line to test the results, like so:

    $ nutch-0.9/bin/nutch org.apache.nutch.searcher.NutchBean "... [search terms] ..."

Also, in the nutch-site.xml file, I have enabled the site and url query
plugins. At least I think I have. The relevant lines are these:

  <property>
    <name>plugin.includes</name>
    <value>protocol-http|urlfilter-regex|parse-(text|html|oo|pdf|msword|msexcel|mspowerpoint|zip|)|analysis-(fr|en|de)|index-basic|query-(basic|site|url)|summary-basic|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
    ....
  </property>

As far as I can tell, the query-url and query-site plugins are specified.
However, even when I put "url:..." in the query string, the 'url:' prefix
seems to be ignored, and whatever comes after it is treated as a full-text
search term rather than as something that limits the results by URL.
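
One way to double-check that the pattern really covers those plugin ids is a small standalone test. This is only a sketch: it assumes that Nutch treats plugin.includes as a regular expression that must match a whole plugin id, and the class name PluginIncludesCheck and the method isIncluded are made up for illustration, not part of Nutch.

```java
import java.util.regex.Pattern;

public class PluginIncludesCheck {
    // plugin.includes value copied verbatim from the nutch-site.xml excerpt.
    static final String INCLUDES =
        "protocol-http|urlfilter-regex|parse-(text|html|oo|pdf|msword|msexcel|mspowerpoint|zip|)"
        + "|analysis-(fr|en|de)|index-basic|query-(basic|site|url)|summary-basic"
        + "|scoring-opic|urlnormalizer-(pass|regex|basic)";

    // Assumption: a plugin is included when the whole id matches the regex.
    static boolean isIncluded(String includesRegex, String pluginId) {
        return Pattern.compile(includesRegex).matcher(pluginId).matches();
    }

    public static void main(String[] args) {
        // "query-more" is not in the pattern and serves as a negative control.
        String[] ids = {"query-basic", "query-site", "query-url", "query-more"};
        for (String id : ids) {
            System.out.println(id + " -> " + isIncluded(INCLUDES, id));
        }
    }
}
```

Under that assumption, query-site and query-url both match, so the pattern itself does not look like the reason the 'url:' prefix is ignored.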


--
Sent from the Nutch - User mailing list archive at Nabble.com.
