Java Mailing List Archive

http://www.java2.5341.com/


Crawling MOSS 2007 content using Nutch via GSA connector

Lukas Vlcek

2008-04-24


Hi,

Has anybody tried crawling content from MOSS 2007 using the Google
SharePoint connector? (
http://code.google.com/enterprise/download/sharepoint.html)
As far as I understand, the SharePoint connector (Apache-licensed Java code)
should be able to extract a list of document URLs from an MS SharePoint server.
Nutch should then be able to crawl and index these documents. Has anybody
tried this so far?
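
To make the idea concrete, here is a minimal sketch of the hand-off I have in mind. It assumes the document URLs have already been obtained from the connector somehow (the list in main is made-up sample data, and SeedListWriter, normalize and writeSeedFile are hypothetical names, not part of the connector or of Nutch). It only writes them as a plain-text seed list, one URL per line, which is the format Nutch reads when injecting URLs.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Hypothetical helper: turns a list of document URLs into a Nutch seed file.
public class SeedListWriter {

    // Trims entries, drops blanks and non-http(s) URLs, removes duplicates,
    // and keeps the original order.
    static List<String> normalize(List<String> rawUrls) {
        Set<String> seen = new LinkedHashSet<>();
        for (String raw : rawUrls) {
            if (raw == null) {
                continue;
            }
            String url = raw.trim();
            String lower = url.toLowerCase(Locale.ROOT);
            if (lower.startsWith("http://") || lower.startsWith("https://")) {
                seen.add(url);
            }
        }
        return new ArrayList<>(seen);
    }

    // Writes the normalized URLs to <seedDir>/seed.txt, one URL per line.
    static Path writeSeedFile(List<String> urls, Path seedDir) throws IOException {
        Files.createDirectories(seedDir);
        Path seedFile = seedDir.resolve("seed.txt");
        Files.write(seedFile, normalize(urls), StandardCharsets.UTF_8);
        return seedFile;
    }

    public static void main(String[] args) throws IOException {
        // Assumption: in a real setup this list would come from the
        // SharePoint connector; these URLs are invented for illustration.
        List<String> fromConnector = Arrays.asList(
                "http://moss.example.com/Docs/report.doc",
                "http://moss.example.com/Docs/plan.pdf",
                "http://moss.example.com/Docs/report.doc");
        Path seedFile = writeSeedFile(fromConnector, Paths.get("urls"));
        System.out.println("Wrote seed list to " + seedFile);
    }
}
```

The resulting urls directory could then be passed to Nutch, for example with bin/nutch inject crawl/crawldb urls. The sketch ignores authentication against the SharePoint server, which the fetcher would most likely also need.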

Regards,
Lukas

--
http://blog.lukas-vlcek.com/