Nutch is working great for me except on some text files with no extension in
the filename.
For example, a file called "MOI1993-NYN-350-0244-J_62037" that contains text
gets the following error:
Error parsing:
file:/home/nsnyder/AL_PROBLEM_FILES/MOI1993-NYN-350-0244-J_62037:
org.apache.nutch.parse.ParseException: parser not found for
contentType=application/octet-stream
url=file:/home/nsnyder/AL_PROBLEM_FILES/MOI1993-NYN-350-0244-J_62037
I can open this file in vi and see the plain text. What can I do to make
Nutch detect the content type as text rather than
application/octet-stream?
--
Sent from the Nutch - User mailing list archive at Nabble.com.