Using nutch-2008-07-28_04-01-14, if I enable the index-more plugin I get:
\hadoop-Doron\mapred\local\index\_-74404655 autoCommit=true
mergePolicy=org.apache.lucene.index.LogByteSizeMergePolicy@(protected)
ndex.ConcurrentMergeScheduler@(protected)
maxBuffereDeleteTerms=-1 maxFieldLength=10000 index=
Exception in thread "main" java.io.IOException: Job failed!
    at org.apache.hadoop.mapred.JobClient.runJob(JobClient.java:1062)
    at org.apache.nutch.indexer.Indexer.index(Indexer.java:311)
    at org.apache.nutch.crawl.Crawl.main(Crawl.java:145)
Removing the plugin makes the crawl work.
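For reference, index-more is switched on and off through the plugin.includes property in conf/nutch-site.xml. A rough sketch of that property follows; the plugin list is an assumption (roughly the stock list with index-more added), not a copy of the configuration used in the failing crawl:

```xml
<!-- conf/nutch-site.xml: illustrative plugin list, assumed rather than copied from the failing setup -->
<property>
  <name>plugin.includes</name>
  <!-- index-(basic|more) enables index-more; changing it back to index-basic removes the plugin -->
  <value>protocol-http|urlfilter-regex|parse-(text|html|js)|index-(basic|more)|query-(basic|site|url)|summary-basic|scoring-opic|urlnormalizer-(pass|regex|basic)</value>
</property>
```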
According to http://www.nabble.com/index-more-problem--td16757538.html, the
empty index= part is the issue, but that thread does not say how to fix it.
Also, this used to work fine in Nutch 0.9.
Any ideas how to fix this?