Java Mailing List Archive

Re: Parsing MSWord

Sertic Mirko, Bedag

2008-11-12

Hi

You can also use a tool called "antiword" to extract the text from a .doc file and then
pass that text to Lucene for indexing.

See: http://en.wikipedia.org/wiki/Antiword
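
A minimal sketch of the extraction step, assuming an "antiword" binary is installed and on the PATH, that it writes the extracted text to standard output, and that the output is in the platform default charset. The class and method names (AntiwordExtractor, buildCommand, extractText) are made up for illustration and are not part of Lucene or antiword; the code needs Java 9 or later.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.util.List;

public class AntiwordExtractor {

    // Builds the command line. Assumption: "antiword" is on the PATH.
    public static List<String> buildCommand(String docPath) {
        return List.of("antiword", docPath);
    }

    // Runs antiword on the given .doc file and returns what it prints to stdout.
    public static String extractText(String docPath)
            throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(buildCommand(docPath));
        // Discard stderr so a chatty child process cannot block on a full pipe.
        pb.redirectError(ProcessBuilder.Redirect.DISCARD);
        Process process = pb.start();
        // Read all of stdout before waiting for the process to exit.
        byte[] out = process.getInputStream().readAllBytes();
        int exit = process.waitFor();
        if (exit != 0) {
            throw new IOException("antiword exited with status " + exit);
        }
        // Assumption: the output uses the platform default charset.
        return new String(out, Charset.defaultCharset());
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: AntiwordExtractor <file.doc>");
            return;
        }
        System.out.println(extractText(args[0]));
    }
}
```

The returned string can then be added to a Lucene document as the content field. That part is left out here because the field API differs between Lucene versions.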

Regards
Mirko

-----Original Message-----
From: dipesh [mailto:dipshrestha@(protected)]
Sent: Wednesday, 12 November 2008 04:38
To: java-user@(protected)
Subject: Parsing MSWord

Hello,
I wanted to know if there are classes in Lucene that support parsing MSWord
documents.
Many thanks,
Dipesh

----------------------------------------
"Help Ever Hurt Never" - Baba