Stopwords in phrases


2004-12-21 - By Ravi


I want to be able to use stopwords in exact phrase searches. I have
looked at Nutch and used the same approach: replace common words with
n-grams (see net.nutch.analysis.CommonGrams).
So if "to", "be", "or" and "not" are stop words, then for the string "to be
or not to be" the analyzer produces the following tokens:

[to-be, to-be-or, to-be-or-not, to-be-or-not-to, to-be-or-not-to-be,
be-or, be-or-not, be-or-not-to, be-or-not-to-be, or-not, or-not-to,
or-not-to-be, not-to, not-to-be, to-be]
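
To make the token list above concrete, here is a minimal sketch in plain Java of one way such grams could be generated. It is not Nutch's CommonGrams code and it does not use any Lucene classes; the class name StopwordGrams and the method indexTokens are made up for illustration, and the handling of non-stop words (emitted unchanged) is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Lucene or Nutch.
class StopwordGrams {

    // For every stop word, emit one hyphen-joined gram for each longer run of
    // consecutive stop words that starts at that word. Non-stop words are
    // emitted unchanged (an assumption; the post only shows stop words).
    static List<String> indexTokens(List<String> words, Set<String> stopWords) {
        List<String> tokens = new ArrayList<>();
        for (int i = 0; i < words.size(); i++) {
            if (!stopWords.contains(words.get(i))) {
                tokens.add(words.get(i));
                continue;
            }
            StringBuilder gram = new StringBuilder(words.get(i));
            // Extend the gram one stop word at a time, emitting each extension.
            for (int j = i + 1; j < words.size() && stopWords.contains(words.get(j)); j++) {
                gram.append('-').append(words.get(j));
                tokens.add(gram.toString());
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        Set<String> stop = new HashSet<>(Arrays.asList("to", "be", "or", "not"));
        List<String> words = Arrays.asList("to", "be", "or", "not", "to", "be");
        // Prints the 15 grams listed above, in the same order.
        System.out.println(indexTokens(words, stop));
    }
}
```

Note that in this sketch a stop word with no stop word after it produces no token of its own.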

This is exactly what I wanted from the analyzer during indexing,
but I'm having a problem with the search.
When I search on "not to be", the analyzer converts my search
into
content:"not-to not-to-be to-be", because it produces the
tokens "not-to", "not-to-be" and "to-be".

I'm getting 0 results on this, as there is no token "not-to not-to-be
to-be" in the index.

I want just "not-to-be" from the analyzer during the search, so that when I
search on "not to be" I get the document which has "not-to-be" as a
token.
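
The query-side behaviour I am after can be sketched in plain Java as follows. This is only an illustration under assumptions, not Lucene or Nutch code: the class name QueryGrams and the method queryTokens are hypothetical, and the idea is simply that at search time each maximal run of consecutive stop words collapses into a single gram instead of all of its sub-grams.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of Lucene or Nutch.
class QueryGrams {

    // Emit exactly one hyphen-joined gram per maximal run of two or more
    // consecutive stop words. Non-stop words pass through unchanged.
    // A stop word standing alone is dropped (an assumption of this sketch).
    static List<String> queryTokens(List<String> words, Set<String> stopWords) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < words.size()) {
            if (!stopWords.contains(words.get(i))) {
                tokens.add(words.get(i));
                i++;
                continue;
            }
            // Find the end (exclusive) of the current run of stop words.
            int end = i + 1;
            while (end < words.size() && stopWords.contains(words.get(end))) {
                end++;
            }
            if (end - i >= 2) {
                tokens.add(String.join("-", words.subList(i, end)));
            }
            i = end;
        }
        return tokens;
    }

    public static void main(String[] args) {
        Set<String> stop = new HashSet<>(Arrays.asList("to", "be", "or", "not"));
        // Prints [not-to-be]
        System.out.println(queryTokens(Arrays.asList("not", "to", "be"), stop));
    }
}
```

One way to get both behaviours might be a flag on the analyzer, or two analyzers that share the same stop word list, with one used for indexing and the other for searching.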

How can I use the same analyzer to get different results when indexing
and when searching?

Thanks in advance,
Ravi.


