nutch-user.lucene

Using S3 with Hadoop/Nutch

Kevin MacDonald

2008-09-30

Does anyone have experience configuring Hadoop to use S3 when running Nutch? I
tried modifying my hadoop-site.xml configuration file, and Hadoop now appears
to be using S3. The problem seems to be that, once configured this way, Hadoop
looks ONLY at S3 for all files. For example, it tries to find a /tmp folder
there, and when running a crawl it looks in S3 for the seed URLs folder. Are
there steps needed to prepare an S3 bucket for use by Hadoop so that a Nutch
crawl can run?
Kevin
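
For context, a hadoop-site.xml change of the kind described above might look roughly like the sketch below. The bucket name and credential values are placeholders and are not taken from the original message; the property names are the ones Hadoop's S3 block filesystem used at the time. Setting fs.default.name to an S3 URI makes S3 the default filesystem, which is consistent with Hadoop resolving /tmp and the seed URLs folder there rather than on local disk or HDFS.

```xml
<?xml version="1.0"?>
<!-- Sketch of a hadoop-site.xml fragment; all values are placeholders. -->
<configuration>
  <!-- Makes S3 the default filesystem, so unqualified paths resolve
       inside the bucket. MY_BUCKET is a hypothetical bucket name. -->
  <property>
    <name>fs.default.name</name>
    <value>s3://MY_BUCKET</value>
  </property>
  <!-- AWS credentials for the s3:// scheme (placeholder values). -->
  <property>
    <name>fs.s3.awsAccessKeyId</name>
    <value>YOUR_ACCESS_KEY_ID</value>
  </property>
  <property>
    <name>fs.s3.awsSecretAccessKey</name>
    <value>YOUR_SECRET_ACCESS_KEY</value>
  </property>
</configuration>
```

Under a configuration like this, a path such as the seed URLs folder would have to exist in the bucket itself, since unqualified paths no longer refer to the local filesystem.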