Hello,
What is the current status of the RTF parser? I saw that there was a license
problem, and I was unable to find the code for the RTFParser.
When I crawl RTF files, I almost always get the following error:
Error parsing: xxx.rtf : failed(2,0): Can't be handled as Microsoft document.
java.io.IOException: Invalid header signature; read 7015536635646467195, expected -2226271756974174256
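In case it helps with diagnosis, here is a small sketch that decodes the two numbers from the exception as 8 little-endian bytes. The class name SignatureDecode and its helper methods are my own and are not part of Nutch. I am assuming the check compares the first 8 bytes of the file against the OLE2 (Microsoft compound document) signature.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Nutch: decodes the "read" and "expected"
// values from the exception message into the bytes they stand for.
public class SignatureDecode {

    // Lay the long out as 8 bytes, least significant byte first.
    static byte[] toLittleEndianBytes(long value) {
        return ByteBuffer.allocate(8)
                .order(ByteOrder.LITTLE_ENDIAN)
                .putLong(value)
                .array();
    }

    // Interpret the 8 bytes as ASCII text.
    static String toAscii(long value) {
        return new String(toLittleEndianBytes(value), StandardCharsets.US_ASCII);
    }

    // Render the 8 bytes as space-separated hex pairs.
    static String toHex(long value) {
        StringBuilder sb = new StringBuilder();
        for (byte b : toLittleEndianBytes(value)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        long read = 7015536635646467195L;      // value reported as "read"
        long expected = -2226271756974174256L; // value reported as "expected"

        System.out.println("read     = " + toAscii(read));   // {\rtf1\a
        System.out.println("expected = " + toHex(expected)); // D0 CF 11 E0 A1 B1 1A E1
    }
}
```

The "read" value decodes to the text {\rtf1\a, which is how an RTF file begins, while the "expected" value is the byte sequence D0 CF 11 E0 A1 B1 1A E1. If my assumption is right, the file itself is a valid RTF file and the error comes from it being handed to the Microsoft document parser rather than to an RTF parser.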
This error was also pointed out by V. Shridar in an earlier mail, but
unfortunately there was no response.
Should I give up on indexing RTF files altogether?
--
Sent from the Nutch - User mailing list archive at Nabble.com.