Java Mailing List Archive

http://www.java2.5341.com/

nutch-user.lucene

FastSavedException for MS Word

V Sridhar

2008-08-22


Hi,


I did a crawl on <!-- URL SNIPPED -->



The error is:


Error parsing: <!-- URL Snipped --> documentname.doc failed(2,0): Can't be handled as Microsoft document.
org.apache.nutch.parse.msword.FastSavedException: Fast-saved files are
unsupported at this time.

Are there any workarounds? Around 40% of the documents are giving this
error.


Rgds,
Sridhar


