Home » nutch-user.lucene »

Streaming.jar for Nutch?

Chris Anderson

2008-06-09

We're planning to run some Ruby parsers on the fetched content from a
Nutch crawl. It seems like the best way to do this would be through an
interface like Hadoop's streaming.jar, but streaming.jar expects a
line-based input format.
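
A minimal sketch of one way around the line-based constraint, not something described in this post: flatten each fetched record into a single line before it reaches the streaming script. It relies on the Hadoop streaming convention that the key is the text up to the first tab and the value is the rest of the line, and it base64-encodes the page bytes so that embedded newlines and tabs cannot break the framing. The function names encode_record and decode_record are hypothetical, Python is used only for illustration, and reading the records out of the Nutch segment is assumed to happen elsewhere.

```python
import base64
from typing import Tuple


def encode_record(url: str, content: bytes) -> str:
    """Pack one fetched page into a single 'key<TAB>value' line.

    Hypothetical helper: the URL is the key, and the raw page bytes are
    base64-encoded so the value contains no newline or tab characters.
    """
    if "\t" in url or "\n" in url:
        raise ValueError("URL must not contain a tab or newline")
    return url + "\t" + base64.b64encode(content).decode("ascii")


def decode_record(line: str) -> Tuple[str, bytes]:
    """Reverse encode_record inside the streaming script."""
    url, _, payload = line.rstrip("\n").partition("\t")
    return url, base64.b64decode(payload)
```

A Ruby parser on the receiving end would do the equivalent of decode_record on each line of standard input. Base64 inflates the payload by roughly a third, which is the cost of keeping the framing line-safe.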

Has anyone written a version of streaming.jar for Nutch? We're working
on one, so if you'd like to collaborate (or have any advice), please
reply!

Thanks,
Chris

--
Chris Anderson
http://jchris.mfdz.com