public class SubtokenFilter
extends org.apache.lucene.analysis.TokenFilter
Currently, the StandardTokenizer treats anything of the form 'alpha.alpha.alpha' as a single token, because it may be a server hostname (like "www.atlassian.com"). This is useful, but it prevents searches on the individual words between the dots. For example, a search for 'NullPointerException' does not match when 'java.lang.NullPointerException' has been indexed. This filter emits the individual words as well as the full phrase, so that searches can match either. (JRA-6397)
In addition, a comma-separated list of numbers (e.g. "123,456,789") is not tokenized at the commas, which prevents searching on just "123". This filter emits the individual numbers as well as the full list, so that searches can match either. (JRA-7774)
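The behaviour described above can be illustrated with a small, Lucene-free sketch. It is not the actual SubtokenFilter implementation: the class name `SubtokenSketch`, the `expand` method, the regular expressions, and the choice to emit the full token first are all assumptions made for illustration. The real filter works on a Lucene TokenStream and may detect these tokens differently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of the JIRA or Lucene API.
public class SubtokenSketch {

    // Assumed shape of an 'alpha.alpha.alpha' token such as a hostname or class name.
    private static final Pattern DOTTED =
            Pattern.compile("[A-Za-z][A-Za-z0-9]*(\\.[A-Za-z][A-Za-z0-9]*)+");

    // Assumed shape of a comma-separated list of numbers such as "123,456,789".
    private static final Pattern NUMBER_LIST = Pattern.compile("\\d+(,\\d+)+");

    /**
     * Returns the full token followed by its sub-tokens.
     * Tokens that match neither pattern are returned unchanged.
     */
    public static List<String> expand(String token) {
        List<String> result = new ArrayList<>();
        result.add(token); // the full phrase stays searchable

        String separatorRegex = null;
        if (DOTTED.matcher(token).matches()) {
            separatorRegex = "\\.";
        } else if (NUMBER_LIST.matcher(token).matches()) {
            separatorRegex = ",";
        }

        if (separatorRegex != null) {
            for (String part : token.split(separatorRegex)) {
                result.add(part); // each word or number becomes searchable too
            }
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(expand("java.lang.NullPointerException"));
        System.out.println(expand("123,456,789"));
        System.out.println(expand("hello"));
    }
}
```

With this sketch, a lookup for either "NullPointerException" or "java.lang.NullPointerException" would find a matching term, which is the effect the filter is meant to achieve in the index.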
Constructor and Description |
---|
SubtokenFilter(org.apache.lucene.analysis.TokenStream tokenStream) |
Modifier and Type | Method and Description |
---|---|
boolean | incrementToken() |
void | reset() |
Methods inherited from class org.apache.lucene.util.AttributeSource: addAttribute, addAttributeImpl, captureState, clearAttributes, cloneAttributes, copyTo, endAttributes, equals, getAttribute, getAttributeClassesIterator, getAttributeFactory, getAttributeImplsIterator, hasAttribute, hasAttributes, hashCode, reflectAsString, reflectWith, removeAllAttributes, restoreState, toString
public SubtokenFilter(org.apache.lucene.analysis.TokenStream tokenStream)
public final boolean incrementToken() throws IOException
Specified by: incrementToken in class org.apache.lucene.analysis.TokenStream
Throws: IOException
public void reset() throws IOException
Overrides: reset in class org.apache.lucene.analysis.TokenFilter
Throws: IOException
Copyright © 2002-2024 Atlassian. All Rights Reserved.