Multiple collections


2004-12-23       - By Jim Lynch


I'm investigating search engines and have started to look at Lucene.  I
have a couple of questions, however.  The FAQ seems to indicate that we
can't search and index at the same time.  Is that still true, given that
the FAQ is a few years old now?  If so, does Lucene do the locking itself
or do I have to do it myself?
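
For reference, hand-rolled locking of the kind the question alludes to might look like the minimal sketch below. It uses only java.util.concurrent; GuardedIndex and its methods are hypothetical stand-ins and not part of Lucene, and whether Lucene needs anything like this is exactly the open question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical stand-in for an index; none of this is Lucene API.
class GuardedIndex {
    private final ReadWriteLock lock = new ReentrantReadWriteLock();
    private final List<String> documents = new ArrayList<>();

    // Indexing takes the exclusive lock, so no search runs mid-update.
    void add(String text) {
        lock.writeLock().lock();
        try {
            documents.add(text);
        } finally {
            lock.writeLock().unlock();
        }
    }

    // Any number of searches may hold the shared lock at the same time.
    List<String> search(String term) {
        lock.readLock().lock();
        try {
            List<String> matches = new ArrayList<>();
            String needle = term.toLowerCase();
            for (String doc : documents) {
                if (doc.toLowerCase().contains(needle)) {
                    matches.add(doc);
                }
            }
            return matches;
        } finally {
            lock.readLock().unlock();
        }
    }
}
```

With this shape, searches block only while an update is in progress, and updates wait for in-flight searches to finish.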

We currently have about 4 million documents comprising about 16
million terms.  This is broken up into about 50 different collections,
which are separate "databases".  Some of these collections are produced
by a web crawler, some are produced by indexing a static file tree and
some are produced via a feed from another system, which either adds new
documents to a collection or replaces a document.  I really have two
questions.  Is this too much data for Lucene?  And is there a way to
keep separate collections (probably indexes) and search all of them, or
more usually just a subset, at once?  I see the MultiSearcher object
that may be the ticket, but IMHO the Javadocs leave a lot to be desired
in the way of documentation.  They seem to completely leave out the
"glue" and examples.
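
To make the second question concrete, here is a minimal sketch of the behaviour being asked for: several separate collections, a search over a chosen subset, and one merged result list. It is a toy in-memory model in plain Java; CollectionSearch, Hit and the occurrence-count score are all assumptions made for illustration, and none of it is Lucene's API or how MultiSearcher works internally.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: hypothetical names, no Lucene classes involved.
class CollectionSearch {
    // One hit: the collection it came from, the document, and a crude score.
    static final class Hit {
        final String collection;
        final String document;
        final int score;

        Hit(String collection, String document, int score) {
            this.collection = collection;
            this.document = document;
            this.score = score;
        }
    }

    private final Map<String, List<String>> collections = new LinkedHashMap<>();

    void add(String collection, String document) {
        collections.computeIfAbsent(collection, k -> new ArrayList<>()).add(document);
    }

    // Search only the named collections and merge the hits, best score first.
    List<Hit> search(String term, List<String> selected) {
        List<Hit> hits = new ArrayList<>();
        String needle = term.toLowerCase();
        for (String name : selected) {
            List<String> docs = collections.get(name);
            if (docs == null) {
                continue; // unknown collection name: skip it
            }
            for (String doc : docs) {
                int score = countOccurrences(doc.toLowerCase(), needle);
                if (score > 0) {
                    hits.add(new Hit(name, doc, score));
                }
            }
        }
        hits.sort(Comparator.comparingInt((Hit h) -> h.score).reversed());
        return hits;
    }

    // Score is the number of non-overlapping occurrences of the term.
    private static int countOccurrences(String text, String needle) {
        if (needle.isEmpty()) {
            return 0;
        }
        int count = 0;
        int from = 0;
        while ((from = text.indexOf(needle, from)) >= 0) {
            count++;
            from += needle.length();
        }
        return count;
    }
}
```

The point of the sketch is the shape of the call: the caller names the collections to include, and gets back a single ranked list that still records which collection each hit came from.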

Thanks for any advice.

Jim.



