Uses of Package
org.apache.lucene.analysis.miscellaneous
Packages that use org.apache.lucene.analysis.miscellaneous
Package
Description
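org.apache.lucene.analysis.custom: A general-purpose Analyzer that can be created with a builder-style API.
org.apache.lucene.analysis.miscellaneous: Miscellaneous Tokenstreams.
org.apache.lucene.analysis.nl: Analyzer for Dutch.
org.apache.lucene.analysis.no: Analyzer for Norwegian.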
Classes in org.apache.lucene.analysis.miscellaneous used by org.apache.lucene.analysis.custom
Class
Description
Abstract parent class for analysis factories that create ConditionalTokenFilter instances.
Classes in org.apache.lucene.analysis.miscellaneous used by org.apache.lucene.analysis.miscellaneous
Class
Description
A filter to apply normal capitalization rules to Tokens.
Removes words that are too long or too short from the stream.
Attribute providing access to the term builder and UTF-16 conversion.
Allows skipping TokenFilters based on the current set of attributes.
Abstract parent class for analysis factories that create ConditionalTokenFilter instances.
Characters before the delimiter are the "token"; the textual integer after it is the term frequency.
When plain text is extracted from documents, many words are often hyphenated and broken across two lines.
Marks terms as keywords via the KeywordAttribute.
Removes words that are too long or too short from the stream.
A TokenFilter which filters out Tokens at the same position and with the same term text as the previous token in the stream.
This filter normalizes the use of the interchangeable Scandinavian characters æÆäÄöÖøØ and folded variants (aa, ao, ae, oe and oo) by transforming them to åÅæÆøØ.
This Normalizer does the heavy lifting for a set of Scandinavian normalization filters, normalizing the use of the interchangeable Scandinavian characters æÆäÄöÖøØ and folded variants (aa, ao, ae, oe and oo) by transforming them to åÅæÆøØ.
List of possible foldings that can be used when configuring the filter.
A read-only 4-byte FST-backed map that allows fast case-insensitive key-value lookups for StemmerOverrideFilter.
Deprecated.
Deprecated.
A WDF concatenated 'run'.
A WDF concatenated 'run'.
A BreakIterator-like API for iterating over subwords in text, according to WordDelimiterGraphFilter rules.
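The delimited term frequency entry above (characters before the delimiter are the "token", the textual integer after it is the term frequency) can be illustrated with a minimal, Lucene-free sketch. The class DelimitedTermFrequencySketch, its nested TermAndFrequency holder, and the parse method are hypothetical names invented for this illustration and are not part of the Lucene API. The sketch also assumes that the first occurrence of the delimiter is the split point and that a token without a delimiter keeps a frequency of 1; the actual filter may handle these cases differently.

```java
public final class DelimitedTermFrequencySketch {

    /** Hypothetical holder for a parsed token; this is not a Lucene type. */
    public static final class TermAndFrequency {
        public final String term;
        public final int frequency;

        TermAndFrequency(String term, int frequency) {
            this.term = term;
            this.frequency = frequency;
        }
    }

    /**
     * Splits a raw token such as "lucene|3" into its term text and term frequency.
     * Assumption: the first delimiter found is the split point.
     * Assumption: a token with no delimiter is returned unchanged with frequency 1.
     */
    public static TermAndFrequency parse(String raw, char delimiter) {
        int idx = raw.indexOf(delimiter);
        if (idx < 0) {
            return new TermAndFrequency(raw, 1);
        }
        // Integer.parseInt throws NumberFormatException if the suffix is not an integer.
        int frequency = Integer.parseInt(raw.substring(idx + 1));
        if (frequency < 1) {
            throw new IllegalArgumentException("term frequency must be positive: " + raw);
        }
        return new TermAndFrequency(raw.substring(0, idx), frequency);
    }

    public static void main(String[] args) {
        TermAndFrequency parsed = parse("lucene|3", '|');
        System.out.println(parsed.term + " -> " + parsed.frequency); // prints "lucene -> 3"
    }
}
```

In an actual analysis chain the frequency would be carried as a token attribute rather than returned in a holder object; the holder is used here only to keep the sketch self-contained.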
Classes in org.apache.lucene.analysis.miscellaneous used by org.apache.lucene.analysis.nl
Class
Description
A read-only 4-byte FST-backed map that allows fast case-insensitive key-value lookups for StemmerOverrideFilter.
Classes in org.apache.lucene.analysis.miscellaneous used by org.apache.lucene.analysis.no
Class
Description
This Normalizer does the heavy lifting for a set of Scandinavian normalization filters, normalizing the use of the interchangeable Scandinavian characters æÆäÄöÖøØ and folded variants (aa, ao, ae, oe and oo) by transforming them to åÅæÆøØ.