Some quick help please- No search results on nutch-0.8.1

nutch_newbie

2008-06-12


I would truly appreciate quick help since I'm very short on time. Thanks in
advance.
I have FC5, Java 1.6.0_06, Tomcat 5.5.16, and nutch-0.8.1.
I went through many tutorials and forums trying to find my mistake, but no
luck...
Here is the part I changed in my crawl-urlfilter.txt:
# accept hosts in MY.DOMAIN.NAME
+^http://([a-z0-9]*\.)*MY.DOMAIN.NAME/
+^http://([a-z0-9]*\.)*en.wikipedia.org/
+^http://([a-z0-9]*\.)*google.com/
+^http://([a-z0-9]*\.)*search.yahoo.com/
+^http://([a-z0-9]*\.)*apache.org/
+^http://([a-z0-9]*\.)*yahoo.com/
+^http://([a-z0-9]*\.)*amazon.com/
+^http://([a-z0-9]*\.)*about.com/
+^http://([a-z0-9]*\.)*bartleby.com/
+^http://([a-z0-9]*\.)*cnn.com/
+^http://([a-z0-9]*\.)*download.com/
+^http://([a-z0-9]*\.)*reference.com/
+^http://([a-z0-9]*\.)*wikipedia.org/
+^http://([a-z0-9]*\.)*www.weather.com/
+^http://([a-z0-9]*\.)*nih.gov/
+^http://([a-z0-9]*\.)*usa.gov/
+^http://([a-z0-9]*\.)*monster.com/
+^http://([a-z0-9]*\.)*time.com/time/
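
For anyone who wants to check which URLs rules like these let through, here is a minimal Java sketch. It assumes the filter works the way I understand Nutch's regex URL filter to work: each line starts with `+` (accept) or `-` (reject), the rest is a regular expression, the first rule that matches anywhere in the URL decides, and a URL that matches no rule is dropped. The class `UrlFilterSketch` and its methods are hypothetical names for illustration and are not part of Nutch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class UrlFilterSketch {
    // One rule: '+' accepts, '-' rejects; the rest of the line is the regex.
    private static final class Rule {
        final boolean accept;
        final Pattern pattern;

        Rule(boolean accept, Pattern pattern) {
            this.accept = accept;
            this.pattern = pattern;
        }
    }

    private final List<Rule> rules = new ArrayList<>();

    // Builds the rule list from filter lines, skipping blanks and '#' comments.
    public UrlFilterSketch(String... lines) {
        for (String line : lines) {
            String trimmed = line.trim();
            if (trimmed.isEmpty() || trimmed.startsWith("#")) {
                continue;
            }
            char sign = trimmed.charAt(0);
            if (sign != '+' && sign != '-') {
                throw new IllegalArgumentException("Rule must start with + or -: " + line);
            }
            rules.add(new Rule(sign == '+', Pattern.compile(trimmed.substring(1))));
        }
    }

    // Assumed semantics: first matching rule wins, no match means rejected.
    public boolean accepts(String url) {
        for (Rule rule : rules) {
            if (rule.pattern.matcher(url).find()) {
                return rule.accept;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        UrlFilterSketch filter = new UrlFilterSketch(
            "# accept hosts in MY.DOMAIN.NAME",
            "+^http://([a-z0-9]*\\.)*en.wikipedia.org/",
            "+^http://([a-z0-9]*\\.)*apache.org/");

        System.out.println(filter.accepts("http://lucene.apache.org/nutch/"));    // true
        System.out.println(filter.accepts("http://en.wikipedia.org/wiki/Nutch")); // true
        System.out.println(filter.accepts("http://apache.org"));                  // false: no trailing slash
        System.out.println(filter.accepts("https://www.apache.org/"));            // false: rule is http only
    }
}
```

Under these assumptions, a seed URL written without a trailing slash or with `https` would be filtered out even though its host is listed.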

That looks right to me... Here is the nutch-site.xml:

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<!-- Put site-specific property overrides in this file. -->

<configuration>
<property>
 <name>searcher.dir</name>
 <value>/usr/nutch-0.8.1/crawl/</value>
</property>
<property>
 <name>plugin.includes</name>
 <value>protocol-file|protocol-http|parse-(text|html)|index-basic|query-(basic|site|url)</value>
</property>
<property>
 <name>http.agent.name</name>
 <value>Kate</value>
 <description>Kate</description>
</property>
<property>
 <name>http.agent.description</name>
 <value>Nutch spiderman</value>
 <description>Nutch spiderman</description>
</property>
<property>
 <name>http.agent.email</name>
 <value>MyEmail</value>
 <description>kateiafrika@(protected)</description>
</property>

</configuration>

That looks right too.
I ran the crawler, and it seemed to finish just fine. At
localhost:8080/nutch-0.8.1 the Nutch search page is displayed, but
whatever I search for, the results always say "Hits 0-0 (out of
about 0 total matching pages): "
Can somebody please tell me what I'm doing wrong or not doing?
Thank you.
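
One way to narrow this down is to check whether the directory named in searcher.dir actually holds a finished crawl. The Java sketch below is a rough check and not a confirmed diagnosis. It assumes a completed crawl directory contains the subdirectories crawldb, linkdb, segments, indexes and index, which is the layout I understand the 0.8 crawl command to produce. The class `CrawlDirCheck` and its method are hypothetical and not part of Nutch.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

public class CrawlDirCheck {
    // Assumed layout of a finished crawl directory; not read from Nutch itself.
    static final String[] EXPECTED = {"crawldb", "linkdb", "segments", "indexes", "index"};

    // Returns the expected subdirectory names that are absent under crawlDir.
    public static List<String> missingSubdirs(File crawlDir) {
        List<String> missing = new ArrayList<>();
        for (String name : EXPECTED) {
            if (!new File(crawlDir, name).isDirectory()) {
                missing.add(name);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Defaults to the searcher.dir value from the nutch-site.xml in this post.
        File crawlDir = new File(args.length > 0 ? args[0] : "/usr/nutch-0.8.1/crawl/");
        List<String> missing = missingSubdirs(crawlDir);
        if (missing.isEmpty()) {
            System.out.println("All expected subdirectories found in " + crawlDir);
        } else {
            System.out.println("Missing under " + crawlDir + ": " + missing);
        }
    }
}
```

If nothing is reported missing, the next thing worth checking would be whether the copy of nutch-site.xml that the web application loads under Tomcat carries the same searcher.dir value as the one edited for the crawl.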

--
Sent from the Nutch - User mailing list archive at Nabble.com.
