Package com.ibm.icu.impl.coll
Class CollationFastLatin
java.lang.Object
com.ibm.icu.impl.coll.CollationFastLatin
-
Field Summary
Fields
Modifier and Type | Field | Description
(package private) static final int | BAIL_OUT |
static final int | BAIL_OUT_RESULT | Comparison return value when the regular comparison must be used.
(package private) static final int | CASE_AND_TERTIARY_MASK |
(package private) static final int | CASE_MASK |
(package private) static final int | COMMON_SEC |
(package private) static final int | COMMON_SEC_PLUS_OFFSET |
(package private) static final int | COMMON_TER |
(package private) static final int | COMMON_TER_PLUS_OFFSET |
(package private) static final int | CONTR_CHAR_MASK | Contraction result first word bits 8..0 contain the second contraction character, as a char index 0..NUM_FAST_CHARS-1.
(package private) static final int | CONTR_LENGTH_SHIFT | Contraction result first word bits 10..9 contain the result length: 1=bail out, 2=one mini CE, 3=two mini CEs
(package private) static final int | CONTRACTION | Contraction with one fast Latin character.
(package private) static final int | EOS |
(package private) static final int | EXPANSION | An expansion encodes two CEs.
(package private) static final int | INDEX_MASK |
static final int | LATIN_LIMIT |
static final int | LATIN_MAX |
(package private) static final int | LATIN_MAX_UTF8_LEAD |
(package private) static final int | LONG_INC |
(package private) static final int | LONG_PRIMARY_MASK |
(package private) static final int | LOWER_CASE |
(package private) static final int | MAX_LONG |
(package private) static final int | MAX_SEC_AFTER |
(package private) static final int | MAX_SEC_BEFORE |
(package private) static final int | MAX_SEC_HIGH |
(package private) static final int | MAX_SHORT | The highest primary weight is reserved for U+FFFF.
(package private) static final int | MAX_TER_AFTER |
(package private) static final int | MERGE_WEIGHT |
(package private) static final int | MIN_LONG | Encodes one CE with a long/low mini primary (there are 128).
(package private) static final int | MIN_SEC_AFTER |
(package private) static final int | MIN_SEC_BEFORE |
(package private) static final int | MIN_SEC_HIGH |
(package private) static final int | MIN_SHORT | Encodes one CE with a short/high primary (there are 60), plus a secondary CE if the secondary weight is high.
(package private) static final int | NUM_FAST_CHARS |
(package private) static final int | PUNCT_LIMIT |
(package private) static final int | PUNCT_START |
(package private) static final int | SEC_INC |
(package private) static final int | SEC_OFFSET | Lookup: Add this offset to secondary weights, except for completely ignorable CEs.
(package private) static final int | SECONDARY_MASK |
(package private) static final int | SHORT_INC |
(package private) static final int | SHORT_PRIMARY_MASK |
(package private) static final int | TER_OFFSET | Lookup: Add this offset to tertiary weights, except for completely ignorable CEs.
(package private) static final int | TERTIARY_MASK |
(package private) static final int | TWO_CASES_MASK |
(package private) static final int | TWO_COMMON_SEC_PLUS_OFFSET |
(package private) static final int | TWO_COMMON_TER_PLUS_OFFSET |
(package private) static final int | TWO_LONG_PRIMARIES_MASK |
(package private) static final int | TWO_LOWER_CASES |
(package private) static final int | TWO_SEC_OFFSETS |
(package private) static final int | TWO_SECONDARIES_MASK |
(package private) static final int | TWO_SHORT_PRIMARIES_MASK |
(package private) static final int | TWO_TER_OFFSETS |
(package private) static final int | TWO_TERTIARIES_MASK |
static final int | VERSION | Fast Latin format version (one byte 1..FF).
-
Constructor Summary
Constructors
Modifier | Constructor | Description
private | CollationFastLatin() |
-
Method Summary
Modifier and Type | Method | Description
static int | compareUTF16(char[] table, char[] primaries, int options, CharSequence left, CharSequence right, int startIndex) |
private static int | getCases(int variableTop, boolean strengthIsPrimary, int pair) |
(package private) static int | getCharIndex(char c) |
static int | getOptions(CollationData data, CollationSettings settings, char[] primaries) | Computes the options value for the compare functions and writes the precomputed primary weights.
private static int | getPrimaries(int variableTop, int pair) |
private static int | getQuaternaries(int variableTop, int pair) |
private static int | getSecondaries(int variableTop, int pair) |
private static int | getSecondariesFromOneShortCE(int ce) |
private static int | getTertiaries(int variableTop, boolean withCaseBits, int pair) |
private static int | lookup(char[] table, int c) |
private static long | nextPair(char[] table, int c, int ce, CharSequence s16, int sIndex) | Java returns a negative result (use the '~' operator) if sIndex is to be incremented.
-
Field Details
-
VERSION
public static final int VERSION
Fast Latin format version (one byte 1..FF). Must be incremented for any runtime-incompatible changes, in particular, for changes to any of the following constants. When the major version number of the main data format changes, we can reset this fast Latin version to 1.
-
LATIN_MAX
public static final int LATIN_MAX
-
LATIN_LIMIT
public static final int LATIN_LIMIT
-
LATIN_MAX_UTF8_LEAD
static final int LATIN_MAX_UTF8_LEAD
-
PUNCT_START
static final int PUNCT_START
-
PUNCT_LIMIT
static final int PUNCT_LIMIT
-
NUM_FAST_CHARS
static final int NUM_FAST_CHARS
-
SHORT_PRIMARY_MASK
static final int SHORT_PRIMARY_MASK
-
INDEX_MASK
static final int INDEX_MASK
-
SECONDARY_MASK
static final int SECONDARY_MASK
-
CASE_MASK
static final int CASE_MASK
-
LONG_PRIMARY_MASK
static final int LONG_PRIMARY_MASK
-
TERTIARY_MASK
static final int TERTIARY_MASK
-
CASE_AND_TERTIARY_MASK
static final int CASE_AND_TERTIARY_MASK
-
TWO_SHORT_PRIMARIES_MASK
static final int TWO_SHORT_PRIMARIES_MASK
-
TWO_LONG_PRIMARIES_MASK
static final int TWO_LONG_PRIMARIES_MASK
-
TWO_SECONDARIES_MASK
static final int TWO_SECONDARIES_MASK
-
TWO_CASES_MASK
static final int TWO_CASES_MASK
-
TWO_TERTIARIES_MASK
static final int TWO_TERTIARIES_MASK
-
CONTRACTION
static final int CONTRACTION
Contraction with one fast Latin character. Use INDEX_MASK to find the start of the contraction list after the fixed table. The first entry contains the default mapping. Otherwise use CONTR_CHAR_MASK for the contraction character index (in ascending order). Use CONTR_LENGTH_SHIFT for the length of the entry (1=BAIL_OUT, 2=one CE, 3=two CEs). Also, U+0000 maps to a contraction entry, so that the fast path need not check for NUL termination. It usually maps to a contraction list with only the completely ignorable default value.
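The contraction list layout described above can be sketched as a small scan over the table. This is a minimal illustration, not the ICU implementation: the class and method names are hypothetical, the two constant values are inferred from the bit ranges given for CONTR_CHAR_MASK (bits 8..0) and CONTR_LENGTH_SHIFT (bits 10..9), and the caller is assumed to have already computed where the list starts.

```java
/** Sketch of scanning a fast Latin contraction list (hypothetical helper, not ICU code). */
final class ContractionListSketch {
    // Assumed values, inferred from the bit ranges described on this page.
    static final int CONTR_CHAR_MASK = 0x1ff;  // bits 8..0: char index of the second character
    static final int CONTR_LENGTH_SHIFT = 9;   // bits 10..9 hold the entry length in words

    /**
     * Returns the index of the entry whose head word names charIndex,
     * or listStart (the default mapping) if the list has no such entry.
     */
    static int findEntry(char[] table, int listStart, int charIndex) {
        int i = listStart;
        int head = table[i];  // the first entry contains the default mapping
        int x;
        do {
            i += head >> CONTR_LENGTH_SHIFT;  // step over the current entry
            head = table[i];
            x = head & CONTR_CHAR_MASK;
        } while (x < charIndex);  // entries are in ascending char-index order
        // The terminator word carries CONTR_CHAR_MASK, so the loop always stops.
        return x == charIndex ? i : listStart;
    }

    /** Number of mini CEs in an entry; 0 means the entry is a bail-out (length 1). */
    static int ceCount(char[] table, int entry) {
        return (table[entry] >> CONTR_LENGTH_SHIFT) - 1;
    }
}
```

Because each list ends with a word containing CONTR_CHAR_MASK, which is at least as large as any valid char index, the scan needs no separate bounds check.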
-
EXPANSION
static final int EXPANSION
An expansion encodes two CEs. Use INDEX_MASK to find the pair of CEs after the fixed table. The higher a mini CE value, the easier it is to process. For expansions and higher, no context needs to be considered.
-
MIN_LONG
static final int MIN_LONG
Encodes one CE with a long/low mini primary (there are 128). All potentially-variable primaries must be in this range, to make the short-primary path as fast as possible.
-
LONG_INC
static final int LONG_INC
-
MAX_LONG
static final int MAX_LONG
-
MIN_SHORT
static final int MIN_SHORT
Encodes one CE with a short/high primary (there are 60), plus a secondary CE if the secondary weight is high. Fast handling: At least all letter primaries should be in this range.
-
SHORT_INC
static final int SHORT_INC
-
MAX_SHORT
static final int MAX_SHORT
The highest primary weight is reserved for U+FFFF.
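The ordering "the higher a mini CE value, the easier it is to process" suggests a simple threshold cascade. The sketch below shows one way such a classification could look. The numeric values are assumptions, since this page does not list them (the Constant Field Values page is the authority). They were chosen so that the long and short ranges contain the 128 and 60 primaries mentioned above. The class name and the category labels are hypothetical.

```java
/** Sketch of classifying a 16-bit mini CE by range (assumed constant values). */
final class MiniCeRangesSketch {
    // Assumed values; not stated on this page.
    static final int CONTRACTION = 0x400;
    static final int EXPANSION = 0x800;
    static final int MIN_LONG = 0xc00;
    static final int LONG_INC = 8;
    static final int MAX_LONG = 0xff8;
    static final int MIN_SHORT = 0x1000;
    static final int SHORT_INC = 0x400;
    static final int MAX_SHORT = 0xfc00;

    /** Tests the highest range first, so the most common case exits earliest. */
    static String classify(int miniCe) {
        if (miniCe >= MIN_SHORT) return "short primary";
        if (miniCe >= MIN_LONG) return "long primary";
        if (miniCe >= EXPANSION) return "expansion";
        if (miniCe >= CONTRACTION) return "contraction";
        return "special";  // e.g. ignorable, EOS, BAIL_OUT
    }

    /** Number of distinct long mini primaries under the assumed values. */
    static int longPrimaryCount() {
        return (MAX_LONG - MIN_LONG) / LONG_INC + 1;
    }

    /** Number of distinct short mini primaries under the assumed values. */
    static int shortPrimaryCount() {
        return (MAX_SHORT - MIN_SHORT) / SHORT_INC + 1;
    }

    public static void main(String[] args) {
        System.out.println(longPrimaryCount() + " long, " + shortPrimaryCount() + " short");
    }
}
```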
-
MIN_SEC_BEFORE
static final int MIN_SEC_BEFORE
-
SEC_INC
static final int SEC_INC
-
MAX_SEC_BEFORE
static final int MAX_SEC_BEFORE
-
COMMON_SEC
static final int COMMON_SEC
-
MIN_SEC_AFTER
static final int MIN_SEC_AFTER
-
MAX_SEC_AFTER
static final int MAX_SEC_AFTER
-
MIN_SEC_HIGH
static final int MIN_SEC_HIGH
-
MAX_SEC_HIGH
static final int MAX_SEC_HIGH
-
SEC_OFFSET
static final int SEC_OFFSET
Lookup: Add this offset to secondary weights, except for completely ignorable CEs. Must be greater than any special value, e.g., MERGE_WEIGHT. The exact value is not relevant for the format version.
-
COMMON_SEC_PLUS_OFFSET
static final int COMMON_SEC_PLUS_OFFSET
-
TWO_SEC_OFFSETS
static final int TWO_SEC_OFFSETS
-
TWO_COMMON_SEC_PLUS_OFFSET
static final int TWO_COMMON_SEC_PLUS_OFFSET
-
LOWER_CASE
static final int LOWER_CASE
-
TWO_LOWER_CASES
static final int TWO_LOWER_CASES
-
COMMON_TER
static final int COMMON_TER
-
MAX_TER_AFTER
static final int MAX_TER_AFTER
-
TER_OFFSET
static final int TER_OFFSET
Lookup: Add this offset to tertiary weights, except for completely ignorable CEs. Must be greater than any special value, e.g., MERGE_WEIGHT. Must be greater than case bits as well, so that with combined case+tertiary weights plus the offset the tertiary bits do not spill over into the case bits. The exact value is not relevant for the format version.
-
COMMON_TER_PLUS_OFFSET
static final int COMMON_TER_PLUS_OFFSET
-
TWO_TER_OFFSETS
static final int TWO_TER_OFFSETS
-
TWO_COMMON_TER_PLUS_OFFSET
static final int TWO_COMMON_TER_PLUS_OFFSET
-
MERGE_WEIGHT
static final int MERGE_WEIGHT
-
EOS
static final int EOS
-
BAIL_OUT
static final int BAIL_OUT
-
CONTR_CHAR_MASK
static final int CONTR_CHAR_MASK
Contraction result first word bits 8..0 contain the second contraction character, as a char index 0..NUM_FAST_CHARS-1. Each contraction list is terminated with a word containing CONTR_CHAR_MASK.
-
CONTR_LENGTH_SHIFT
static final int CONTR_LENGTH_SHIFT
Contraction result first word bits 10..9 contain the result length: 1=bail out, 2=one mini CE, 3=two mini CEs
-
BAIL_OUT_RESULT
public static final int BAIL_OUT_RESULT
Comparison return value when the regular comparison must be used. The exact value is not relevant for the format version.
-
-
Constructor Details
-
CollationFastLatin
private CollationFastLatin()
-
-
Method Details
-
getCharIndex
static int getCharIndex(char c) -
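This page gives no description for getCharIndex, but the constant names suggest that it maps a character to the 0..NUM_FAST_CHARS-1 index used in contraction entries. The sketch below is one plausible reading, under loudly stated assumptions: the numeric ranges are assumed rather than taken from this page, and the class and method names are hypothetical.

```java
/** Sketch of mapping a char to a fast Latin char index (assumed ranges, not ICU code). */
final class CharIndexSketch {
    // Assumed values; this page does not list them.
    static final int LATIN_MAX = 0x17f;
    static final int LATIN_LIMIT = LATIN_MAX + 1;
    static final int PUNCT_START = 0x2000;
    static final int PUNCT_LIMIT = 0x2040;
    // Latin characters first, then the punctuation block packed directly after them.
    static final int NUM_FAST_CHARS = LATIN_LIMIT + (PUNCT_LIMIT - PUNCT_START);

    /** Returns an index in 0..NUM_FAST_CHARS-1, or -1 if c is outside both ranges. */
    static int charIndex(char c) {
        if (c <= LATIN_MAX) {
            return c;
        }
        if (PUNCT_START <= c && c < PUNCT_LIMIT) {
            return c - PUNCT_START + LATIN_LIMIT;
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println(charIndex('a') + " " + charIndex('\u2019') + " " + charIndex('\u4e00'));
    }
}
```

Under these assumptions every valid index fits in the nine bits that CONTR_CHAR_MASK reserves for the contraction character.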
getOptions
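public static int getOptions(CollationData data, CollationSettings settings, char[] primaries)
Computes the options value for the compare functions and writes the precomputed primary weights into primaries. Returns -1 if the Latin fast path is not supported for the data and settings. The capacity of the primaries array must be LATIN_LIMIT. -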
compareUTF16
public static int compareUTF16(char[] table, char[] primaries, int options, CharSequence left, CharSequence right, int startIndex) -
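Taken together, the -1 return of getOptions and the BAIL_OUT_RESULT return of compareUTF16 imply a caller-side fallback. The sketch below illustrates that control flow with stand-in functions instead of the real ICU classes. The Compare interface, the class name, and the value chosen for BAIL_OUT_RESULT are all hypothetical; the page itself says the exact value is not relevant.

```java
/** Sketch of a caller that falls back to the regular comparison (stand-in types only). */
final class FastPathFallbackSketch {
    // Placeholder value for this sketch only.
    static final int BAIL_OUT_RESULT = -2;

    /** Hypothetical stand-in for either the fast path or the regular comparison. */
    interface Compare {
        int compare(CharSequence left, CharSequence right);
    }

    static int compare(int options, Compare fast, Compare regular,
                       CharSequence left, CharSequence right) {
        int result = BAIL_OUT_RESULT;
        if (options >= 0) {
            // A negative options value means the fast path is not supported at all.
            result = fast.compare(left, right);
        }
        if (result == BAIL_OUT_RESULT) {
            // The fast path was skipped or gave up, so use the regular comparison.
            result = regular.compare(left, right);
        }
        return result;
    }

    public static void main(String[] args) {
        Compare alwaysBail = (a, b) -> BAIL_OUT_RESULT;
        Compare regular = (a, b) -> Integer.signum(a.toString().compareTo(b.toString()));
        System.out.println(compare(0, alwaysBail, regular, "a", "b"));
    }
}
```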
lookup
private static int lookup(char[] table, int c) -
nextPair
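private static long nextPair(char[] table, int c, int ce, CharSequence s16, int sIndex)
The Java version returns a negative result if sIndex is to be incremented; apply the '~' operator to recover the actual value. The C++ version modifies sIndex directly. -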
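The '~' convention can be illustrated in isolation. The sketch below assumes that a real result never uses the sign bit of the long, so a bitwise complement is always negative and can double as an "advance the index" flag. The class and both helper methods are hypothetical and only mimic the signalling, not the lookup itself.

```java
/** Sketch of the "negative result means advance the index" idiom (hypothetical helpers). */
final class NextPairIdiomSketch {
    /** Producer side: complement the result when one more character was consumed. */
    static long encode(long pair, boolean consumedExtraChar) {
        return consumedExtraChar ? ~pair : pair;
    }

    /** Consumer side: returns {pair, updated index}. */
    static long[] decode(long result, int sIndex) {
        if (result < 0) {
            // Undo the complement and step past the consumed character.
            return new long[] { ~result, sIndex + 1 };
        }
        return new long[] { result, sIndex };
    }

    public static void main(String[] args) {
        long[] r = decode(encode(0x12345678L, true), 4);
        System.out.println(Long.toHexString(r[0]) + " at index " + r[1]);
    }
}
```

Returning a flagged value works around the fact that Java cannot pass an int by reference, which is what the C++ version relies on.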
getPrimaries
private static int getPrimaries(int variableTop, int pair) -
getSecondariesFromOneShortCE
private static int getSecondariesFromOneShortCE(int ce) -
getSecondaries
private static int getSecondaries(int variableTop, int pair) -
getCases
private static int getCases(int variableTop, boolean strengthIsPrimary, int pair) -
getTertiaries
private static int getTertiaries(int variableTop, boolean withCaseBits, int pair) -
getQuaternaries
private static int getQuaternaries(int variableTop, int pair)
-