nutch-user.lucene

Document segment size and search performance?

wuqi

2008-06-03

Hi,
As we all know, the searcher uses "parse_text" in each segment to generate snippets. I would like to know which of the two layouts below lets the searcher retrieve parse_text faster:
1. 50 segments * 10,000 pages/segment
2. 5 segments * 100,000 pages/segment
With more segments and fewer pages per segment, it seems we need to open more segment files, and hence use more memory? With more pages in a segment, it might take longer to fetch a given page? Finding a page among 10,000 pages should be faster than finding one among 100,000 pages?
For a search engine with about 10M documents, how many segment directories should I have?
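
To make the trade-off concrete, here is a rough back-of-the-envelope sketch in Java. It assumes that a lookup inside one segment behaves like a binary search over sorted keys, and that the searcher already knows which segment holds the page. Both are my assumptions, not something I have confirmed in the Nutch code. The class and method names are made up for illustration.

```java
public class SegmentLookupEstimate {

    // Worst-case key comparisons for a binary search over n sorted entries
    // (assumption: a per-segment lookup behaves like a binary search).
    static int binarySearchSteps(long n) {
        int steps = 0;
        while ((1L << steps) < n) {
            steps++;
        }
        return steps;
    }

    // Number of segments needed to hold totalPages, rounded up.
    static long segmentsNeeded(long totalPages, long pagesPerSegment) {
        return (totalPages + pagesPerSegment - 1) / pagesPerSegment;
    }

    public static void main(String[] args) {
        long totalPages = 10_000_000L;            // about 10M documents
        long[] pagesPerSegment = {10_000L, 100_000L};

        for (long pages : pagesPerSegment) {
            System.out.println(pages + " pages/segment: "
                    + segmentsNeeded(totalPages, pages) + " segments, about "
                    + binarySearchSteps(pages) + " comparisons per lookup");
        }
    }
}
```

Under these assumptions the per-lookup difference is small (about 14 versus 17 comparisons), while the number of segments to keep open differs by a factor of ten (1,000 versus 100 for 10M documents), so the open-file and memory cost may matter more than the search inside a segment.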

Thanks
-Qi

