| Interface Summary | |
|---|---|
| TokenFilterFactory | A TokenFilterFactory creates a TokenFilter to transform one TokenStream into another. |
| TokenizerFactory | A TokenizerFactory breaks up a stream of characters into tokens. |
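
The relationship between the two interfaces can be illustrated with a minimal sketch. It is self-contained and does not use the real Solr or Lucene types: `SimpleTokenizerFactory`, `SimpleTokenFilterFactory`, `FactorySketch`, and `analyze` are hypothetical names invented for illustration, and a token stream is approximated by an `Iterator<String>`. It only shows the shape of the idea, assuming a tokenizer factory that turns characters into tokens and a filter factory that wraps one stream in another.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Locale;

// Hypothetical, simplified stand-ins; NOT the real Solr/Lucene interfaces.
class FactorySketch {

    /** Stand-in for TokenizerFactory: breaks a stream of characters into tokens. */
    interface SimpleTokenizerFactory {
        Iterator<String> create(Reader input) throws IOException;
    }

    /** Stand-in for TokenFilterFactory: transforms one token stream into another. */
    interface SimpleTokenFilterFactory {
        Iterator<String> create(Iterator<String> input);
    }

    /** Reads all characters and splits them on runs of whitespace. */
    static final SimpleTokenizerFactory WHITESPACE = input -> {
        StringBuilder text = new StringBuilder();
        char[] buf = new char[1024];
        int n;
        while ((n = input.read(buf)) != -1) {
            text.append(buf, 0, n);
        }
        List<String> tokens = new ArrayList<>();
        for (String piece : text.toString().split("\\s+")) {
            if (!piece.isEmpty()) {
                tokens.add(piece);
            }
        }
        return tokens.iterator();
    };

    /** Wraps a token stream and lower-cases each token as it is pulled. */
    static final SimpleTokenFilterFactory LOWERCASE = input -> new Iterator<String>() {
        @Override public boolean hasNext() { return input.hasNext(); }
        @Override public String next() { return input.next().toLowerCase(Locale.ROOT); }
    };

    /** Chains the tokenizer and the filter, then collects the resulting tokens. */
    static List<String> analyze(String text) {
        try {
            Iterator<String> stream = LOWERCASE.create(WHITESPACE.create(new StringReader(text)));
            List<String> out = new ArrayList<>();
            while (stream.hasNext()) {
                out.add(stream.next());
            }
            return out;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(analyze("Hello  Solr World")); // prints [hello, solr, world]
    }
}
```

The filter here is lazy: it transforms each token only when the consumer asks for it, which is one plausible way to compose several filters without buffering the whole stream.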

| Class Summary | |
|---|---|
| BaseTokenFilterFactory | Simple abstract implementation that handles init arg processing. |
| BaseTokenizerFactory | Simple abstract implementation that handles init arg processing. |
| BufferedTokenStream | Handles input and output buffering of a TokenStream. |
| EdgeNGramTokenizerFactory | Creates new instances of EdgeNGramTokenizer. |
| EnglishPorterFilterFactory | |
| HTMLStripReader | A Reader that wraps another reader and attempts to strip out HTML constructs. |
| HTMLStripStandardTokenizerFactory | |
| HTMLStripWhitespaceTokenizerFactory | |
| HyphenatedWordsFilter | When the plain text is extracted from documents, we will often have many words hyphenated and broken into two lines. |
| HyphenatedWordsFilterFactory | Factory for HyphenatedWordsFilter. |
| ISOLatin1AccentFilterFactory | Factory for ISOLatin1AccentFilter. |
| KeywordTokenizerFactory | |
| LengthFilter | |
| LengthFilterFactory | |
| LetterTokenizerFactory | |
| LowerCaseFilterFactory | |
| LowerCaseTokenizerFactory | |
| NGramTokenizerFactory | Creates new instances of NGramTokenizer. |
| PatternReplaceFilter | A TokenFilter which applies a Pattern to each token in the stream, replacing match occurrences with the specified replacement string. |
| PatternReplaceFilterFactory | |
| PatternTokenizerFactory | This tokenizer uses regex pattern matching to construct distinct tokens for the input stream. |
| PhoneticFilter | Create tokens for phonetic matches. |
| PhoneticFilterFactory | Creates tokens based on phonetic encoders (see http://jakarta.apache.org/commons/codec/api-release/org/apache/commons/codec/language/package-summary.html). Takes two arguments: "encoder" (required), one of "DoubleMetaphone", "Metaphone", "Soundex", "RefinedSoundex"; and "inject" (default=true), which adds tokens to the stream with the offset=0. |
| PorterStemFilterFactory | |
| RemoveDuplicatesTokenFilter | A TokenFilter which filters out Tokens at the same position and Term text as the previous token in the stream. |
| RemoveDuplicatesTokenFilterFactory | |
| SnowballPorterFilterFactory | Factory for SnowballFilters, with configurable language. Browsing the code, SnowballFilter uses reflection to adapt to Lucene... |
| SolrAnalyzer | |
| StandardFilterFactory | |
| StandardTokenizerFactory | |
| StopFilterFactory | |
| SynonymFilter | SynonymFilter handles multi-token synonyms with variable position increment offsets. |
| SynonymFilterFactory | |
| SynonymMap | Mapping rules for use with SynonymFilter |
| TokenizerChain | |
| TrimFilter | Trims leading and trailing whitespace from Tokens in the stream. |
| TrimFilterFactory | |
| WhitespaceTokenizerFactory | |
| WordDelimiterFilterFactory | |
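
Three of the filter descriptions in the class summary (TrimFilter, PatternReplaceFilter, RemoveDuplicatesTokenFilter) are concrete enough to sketch. The code below is a rough illustration of the behavior those descriptions state, not the actual implementations: `FilterSketch`, `Tok`, and the method names are hypothetical, tokens are modeled as plain strings (or a text plus an integer position), and replacing every match rather than only the first is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-ins; the real filters operate on Lucene Tokens, not strings.
class FilterSketch {

    /** Minimal token model: term text plus an absolute position in the stream. */
    static final class Tok {
        final String text;
        final int position;

        Tok(String text, int position) {
            this.text = text;
            this.position = position;
        }
    }

    /** As described for TrimFilter: trim leading and trailing whitespace. */
    static List<String> trim(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            out.add(t.trim());
        }
        return out;
    }

    /** As described for PatternReplaceFilter: replace pattern matches in each token. */
    static List<String> patternReplace(List<String> tokens, Pattern pattern, String replacement) {
        List<String> out = new ArrayList<>();
        for (String t : tokens) {
            // Assumption: every match is replaced, not just the first one.
            out.add(pattern.matcher(t).replaceAll(replacement));
        }
        return out;
    }

    /** As described for RemoveDuplicatesTokenFilter: drop a token that has the
     *  same position and term text as the previous token in the stream. */
    static List<Tok> removeDuplicates(List<Tok> tokens) {
        List<Tok> out = new ArrayList<>();
        Tok prev = null;
        for (Tok t : tokens) {
            boolean duplicate = prev != null
                    && prev.position == t.position
                    && prev.text.equals(t.text);
            if (!duplicate) {
                out.add(t);
            }
            prev = t;
        }
        return out;
    }

    public static void main(String[] args) {
        System.out.println(trim(Arrays.asList("  solr ", "lucene")));
        System.out.println(patternReplace(Arrays.asList("foo-bar", "baz"), Pattern.compile("-"), "_"));
        List<Tok> deduped = removeDuplicates(Arrays.asList(
                new Tok("run", 0), new Tok("run", 0), new Tok("fast", 1)));
        System.out.println(deduped.size());
    }
}
```

Duplicates at the same position typically arise when an earlier stage (for example a stemmer applied after synonym injection) maps two distinct tokens to the same text, which is why the comparison considers position as well as text.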