Module org.apache.lucene.core
Class Lucene99ScalarQuantizedVectorsWriter
java.lang.Object
org.apache.lucene.codecs.KnnVectorsWriter
org.apache.lucene.codecs.hnsw.FlatVectorsWriter
org.apache.lucene.codecs.lucene99.Lucene99ScalarQuantizedVectorsWriter
- All Implemented Interfaces:
Closeable, AutoCloseable, Accountable
Writes quantized vector values and metadata to index segments.
-
Nested Class Summary
Nested Classes
Modifier and Type, Class, Description
(package private) static class
(package private) static class
(package private) static class
Returns a merged view over all the segment's QuantizedByteVectorValues.
(package private) static final class
(package private) static final class
(package private) static class
(package private) static class
(package private) static final class
Nested classes/interfaces inherited from class org.apache.lucene.codecs.KnnVectorsWriter
KnnVectorsWriter.MergedVectorValues
-
Field Summary
Fields
Modifier and Type, Field, Description
private final byte bits
private final boolean compress
private final Float confidenceInterval
private final List<Lucene99ScalarQuantizedVectorsWriter.FieldWriter> fields
private boolean finished
private final IndexOutput meta
private static final float QUANTILE_RECOMPUTE_LIMIT
private final IndexOutput quantizedVectorData
private final FlatVectorsWriter rawVectorDelegate
private static final float REQUANTIZATION_LIMIT
private final SegmentWriteState segmentWriteState
private static final long SHALLOW_RAM_BYTES_USED
private final int version
Fields inherited from class org.apache.lucene.codecs.hnsw.FlatVectorsWriter
vectorsScorer
Fields inherited from interface org.apache.lucene.util.Accountable
NULL_ACCOUNTABLE
-
Constructor Summary
Constructors
Modifier, Constructor, Description
private Lucene99ScalarQuantizedVectorsWriter(SegmentWriteState state, int version, Float confidenceInterval, byte bits, boolean compress, FlatVectorsWriter rawVectorDelegate, FlatVectorsScorer scorer)
Lucene99ScalarQuantizedVectorsWriter(SegmentWriteState state, Float confidenceInterval, byte bits, boolean compress, FlatVectorsWriter rawVectorDelegate, FlatVectorsScorer scorer)
Lucene99ScalarQuantizedVectorsWriter(SegmentWriteState state, Float confidenceInterval, FlatVectorsWriter rawVectorDelegate, FlatVectorsScorer scorer)
-
Method Summary
Modifier and Type, Method, Description
addField
    Add a new field for indexing
(package private) static ScalarQuantizer buildScalarQuantizer(FloatVectorValues floatVectorValues, int numVectors, VectorSimilarityFunction vectorSimilarityFunction, Float confidenceInterval, byte bits)
void close()
void finish()
    Called once at the end before close
void flush(int maxDoc, Sorter.DocMap sortMap)
    Flush all buffered data on disk
private static QuantizedVectorsReader getQuantizedKnnVectorsReader(KnnVectorsReader vectorsReader, String fieldName)
private static ScalarQuantizer getQuantizedState(KnnVectorsReader vectorsReader, String fieldName)
static ScalarQuantizer mergeAndRecalculateQuantiles(MergeState mergeState, FieldInfo fieldInfo, Float confidenceInterval, byte bits)
    Merges the quantiles of the segments and recalculates the quantiles if necessary.
void mergeOneField(FieldInfo fieldInfo, MergeState mergeState)
    Write field for merging
CloseableRandomVectorScorerSupplier mergeOneFieldToIndex(FieldInfo fieldInfo, MergeState mergeState)
    Write the field for merging, providing a scorer over the newly merged flat vectors.
private Lucene99ScalarQuantizedVectorsWriter.ScalarQuantizedCloseableRandomVectorScorerSupplier mergeOneFieldToIndex(SegmentWriteState segmentWriteState, FieldInfo fieldInfo, MergeState mergeState, ScalarQuantizer mergedQuantizationState)
(package private) static ScalarQuantizer mergeQuantiles(List<ScalarQuantizer> quantizationStates, IntArrayList segmentSizes, byte bits)
long ramBytesUsed()
    Return the memory usage of this object in bytes.
(package private) static boolean shouldRecomputeQuantiles(ScalarQuantizer mergedQuantizationState, List<ScalarQuantizer> quantizationStates)
    Returns true if the quantiles of the merged state are too far from the quantiles of the individual states.
(package private) static boolean shouldRequantize(ScalarQuantizer existingQuantiles, ScalarQuantizer newQuantiles)
    Returns true if the quantiles of the new quantization state are too far from the quantiles of the existing quantization state.
private void writeField(Lucene99ScalarQuantizedVectorsWriter.FieldWriter fieldData, int maxDoc, ScalarQuantizer scalarQuantizer)
private void writeMeta(FieldInfo field, int maxDoc, long vectorDataOffset, long vectorDataLength, Float confidenceInterval, byte bits, boolean compress, Float lowerQuantile, Float upperQuantile, DocsWithFieldSet docsWithField)
static DocsWithFieldSet writeQuantizedVectorData(IndexOutput output, QuantizedByteVectorValues quantizedByteVectorValues, byte bits, boolean compress)
    Writes the vector values to the output and returns a set of documents that contains vectors.
private void writeQuantizedVectors(Lucene99ScalarQuantizedVectorsWriter.FieldWriter fieldData, ScalarQuantizer scalarQuantizer)
private void writeSortedQuantizedVectors(Lucene99ScalarQuantizedVectorsWriter.FieldWriter fieldData, int[] ordMap, ScalarQuantizer scalarQuantizer)
private void writeSortingField(Lucene99ScalarQuantizedVectorsWriter.FieldWriter fieldData, int maxDoc, Sorter.DocMap sortMap, ScalarQuantizer scalarQuantizer)
Methods inherited from class org.apache.lucene.codecs.hnsw.FlatVectorsWriter
getFlatVectorScorer
Methods inherited from class org.apache.lucene.codecs.KnnVectorsWriter
mapOldOrdToNewOrd, merge
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface org.apache.lucene.util.Accountable
getChildResources
-
Field Details
-
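SHALLOW_RAM_BYTES_USED
private static final long SHALLOW_RAM_BYTES_USED
-
QUANTILE_RECOMPUTE_LIMIT
private static final float QUANTILE_RECOMPUTE_LIMIT
-
REQUANTIZATION_LIMIT
private static final float REQUANTIZATION_LIMIT
-
segmentWriteState
private final SegmentWriteState segmentWriteState
-
fields
private final List<Lucene99ScalarQuantizedVectorsWriter.FieldWriter> fields
-
meta
private final IndexOutput meta
-
quantizedVectorData
private final IndexOutput quantizedVectorData
-
confidenceInterval
private final Float confidenceInterval
-
rawVectorDelegate
private final FlatVectorsWriter rawVectorDelegate
-
bits
private final byte bits
-
compress
private final boolean compress
-
version
private final int version
-
finished
private boolean finished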
-
-
Constructor Details
-
Lucene99ScalarQuantizedVectorsWriter
public Lucene99ScalarQuantizedVectorsWriter(SegmentWriteState state, Float confidenceInterval, FlatVectorsWriter rawVectorDelegate, FlatVectorsScorer scorer) throws IOException
- Throws:
IOException
-
Lucene99ScalarQuantizedVectorsWriter
public Lucene99ScalarQuantizedVectorsWriter(SegmentWriteState state, Float confidenceInterval, byte bits, boolean compress, FlatVectorsWriter rawVectorDelegate, FlatVectorsScorer scorer) throws IOException
- Throws:
IOException
-
Lucene99ScalarQuantizedVectorsWriter
private Lucene99ScalarQuantizedVectorsWriter(SegmentWriteState state, int version, Float confidenceInterval, byte bits, boolean compress, FlatVectorsWriter rawVectorDelegate, FlatVectorsScorer scorer) throws IOException
- Throws:
IOException
-
-
Method Details
-
addField
Description copied from class: FlatVectorsWriter
Add a new field for indexing
- Specified by:
addField in class FlatVectorsWriter
- Parameters:
fieldInfo - fieldInfo of the field to add
- Returns:
a writer for the field
- Throws:
IOException - if an I/O error occurs when adding the field
-
mergeOneField
public void mergeOneField(FieldInfo fieldInfo, MergeState mergeState) throws IOException
Description copied from class: KnnVectorsWriter
Write field for merging
- Overrides:
mergeOneField in class KnnVectorsWriter
- Throws:
IOException
-
mergeOneFieldToIndex
public CloseableRandomVectorScorerSupplier mergeOneFieldToIndex(FieldInfo fieldInfo, MergeState mergeState) throws IOException
Description copied from class: FlatVectorsWriter
Write the field for merging, providing a scorer over the newly merged flat vectors. This way any additional merging logic can be implemented by the user of this class.
- Specified by:
mergeOneFieldToIndex in class FlatVectorsWriter
- Parameters:
fieldInfo - fieldInfo of the field to merge
mergeState - mergeState of the segments to merge
- Returns:
a scorer over the newly merged flat vectors, which should be closed as it holds a temporary file handle to read over the newly merged vectors
- Throws:
IOException - if an I/O error occurs when merging
-
flush
public void flush(int maxDoc, Sorter.DocMap sortMap) throws IOException
Description copied from class: KnnVectorsWriter
Flush all buffered data on disk
- Specified by:
flush in class KnnVectorsWriter
- Throws:
IOException
-
finish
public void finish() throws IOException
Description copied from class: KnnVectorsWriter
Called once at the end before close
- Specified by:
finish in class KnnVectorsWriter
- Throws:
IOException
-
ramBytesUsed
public long ramBytesUsed()
Description copied from interface: Accountable
Return the memory usage of this object in bytes. Negative values are illegal.
-
writeField
private void writeField(Lucene99ScalarQuantizedVectorsWriter.FieldWriter fieldData, int maxDoc, ScalarQuantizer scalarQuantizer) throws IOException
- Throws:
IOException
-
writeMeta
private void writeMeta(FieldInfo field, int maxDoc, long vectorDataOffset, long vectorDataLength, Float confidenceInterval, byte bits, boolean compress, Float lowerQuantile, Float upperQuantile, DocsWithFieldSet docsWithField) throws IOException
- Throws:
IOException
-
writeQuantizedVectors
private void writeQuantizedVectors(Lucene99ScalarQuantizedVectorsWriter.FieldWriter fieldData, ScalarQuantizer scalarQuantizer) throws IOException
- Throws:
IOException
-
writeSortingField
private void writeSortingField(Lucene99ScalarQuantizedVectorsWriter.FieldWriter fieldData, int maxDoc, Sorter.DocMap sortMap, ScalarQuantizer scalarQuantizer) throws IOException
- Throws:
IOException
-
writeSortedQuantizedVectors
private void writeSortedQuantizedVectors(Lucene99ScalarQuantizedVectorsWriter.FieldWriter fieldData, int[] ordMap, ScalarQuantizer scalarQuantizer) throws IOException
- Throws:
IOException
-
mergeOneFieldToIndex
private Lucene99ScalarQuantizedVectorsWriter.ScalarQuantizedCloseableRandomVectorScorerSupplier mergeOneFieldToIndex(SegmentWriteState segmentWriteState, FieldInfo fieldInfo, MergeState mergeState, ScalarQuantizer mergedQuantizationState) throws IOException
- Throws:
IOException
-
mergeQuantiles
static ScalarQuantizer mergeQuantiles(List<ScalarQuantizer> quantizationStates, IntArrayList segmentSizes, byte bits)
-
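This page gives no description of how mergeQuantiles combines its inputs. One plausible reading of the signature, which pairs each quantization state with a segment size, is a size-weighted average of the per-segment lower and upper quantiles. The sketch below illustrates that idea only. It is an assumption, not the documented behavior, and MergeQuantilesSketch and its plain-array parameters are hypothetical stand-ins for ScalarQuantizer and IntArrayList.

```java
final class MergeQuantilesSketch {
  /**
   * Returns {lower, upper}, where each value is the mean of the per-segment
   * quantiles weighted by the number of vectors in that segment.
   * Assumption: weighting by segment size; the real method may differ.
   */
  static float[] mergeQuantiles(float[] lowers, float[] uppers, int[] segmentSizes) {
    if (lowers.length != uppers.length || lowers.length != segmentSizes.length) {
      throw new IllegalArgumentException("one lower, upper and size per segment is required");
    }
    double lowerSum = 0;
    double upperSum = 0;
    long totalVectors = 0;
    for (int i = 0; i < segmentSizes.length; i++) {
      lowerSum += (double) lowers[i] * segmentSizes[i];
      upperSum += (double) uppers[i] * segmentSizes[i];
      totalVectors += segmentSizes[i];
    }
    if (totalVectors == 0) {
      throw new IllegalArgumentException("no vectors to merge");
    }
    return new float[] {(float) (lowerSum / totalVectors), (float) (upperSum / totalVectors)};
  }
}
```

Weighting by size keeps a small segment with unusual quantiles from dominating the merged state.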
shouldRecomputeQuantiles
static boolean shouldRecomputeQuantiles(ScalarQuantizer mergedQuantizationState, List<ScalarQuantizer> quantizationStates)
Returns true if the quantiles of the merged state are too far from the quantiles of the individual states.
- Parameters:
mergedQuantizationState - The merged quantization state
quantizationStates - The quantization states of the individual segments
- Returns:
true if the quantiles should be recomputed
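As an illustration of the kind of check described above, the following sketch flags a merged state whose quantiles differ from any segment's quantiles by more than a fraction of the merged range. It is not the Lucene implementation: the Quantiles class, the shouldRecompute name, and the divisor parameter are hypothetical. The real threshold depends on the private QUANTILE_RECOMPUTE_LIMIT constant, whose value is not shown on this page.

```java
import java.util.List;

final class QuantileDriftSketch {
  /** Hypothetical stand-in for a quantization state: only the two quantiles. */
  static final class Quantiles {
    final float lower;
    final float upper;

    Quantiles(float lower, float upper) {
      this.lower = lower;
      this.upper = upper;
    }
  }

  /**
   * True when any segment's lower or upper quantile is further from the merged
   * quantile than (merged range / divisor). The divisor is an assumed tuning knob.
   */
  static boolean shouldRecompute(Quantiles merged, List<Quantiles> segments, float divisor) {
    float limit = (merged.upper - merged.lower) / divisor;
    for (Quantiles segment : segments) {
      if (Math.abs(segment.upper - merged.upper) > limit
          || Math.abs(segment.lower - merged.lower) > limit) {
        return true;
      }
    }
    return false;
  }
}
```

A larger divisor makes the check stricter, so recomputation is triggered by smaller differences between the merged and per-segment quantiles.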
-
getQuantizedKnnVectorsReader
private static QuantizedVectorsReader getQuantizedKnnVectorsReader(KnnVectorsReader vectorsReader, String fieldName)
-
getQuantizedState
private static ScalarQuantizer getQuantizedState(KnnVectorsReader vectorsReader, String fieldName)
-
mergeAndRecalculateQuantiles
public static ScalarQuantizer mergeAndRecalculateQuantiles(MergeState mergeState, FieldInfo fieldInfo, Float confidenceInterval, byte bits) throws IOException
Merges the quantiles of the segments and recalculates the quantiles if necessary.
- Parameters:
mergeState - The merge state
fieldInfo - The field info
confidenceInterval - The confidence interval
bits - The number of bits
- Returns:
The merged quantiles
- Throws:
IOException - If there is a low-level I/O error
-
buildScalarQuantizer
static ScalarQuantizer buildScalarQuantizer(FloatVectorValues floatVectorValues, int numVectors, VectorSimilarityFunction vectorSimilarityFunction, Float confidenceInterval, byte bits) throws IOException
- Throws:
IOException
-
shouldRequantize
static boolean shouldRequantize(ScalarQuantizer existingQuantiles, ScalarQuantizer newQuantiles)
Returns true if the quantiles of the new quantization state are too far from the quantiles of the existing quantization state. A large enough difference implies that floating point values would shift into different quantization buckets.
- Parameters:
existingQuantiles - The existing quantiles for a segment
newQuantiles - The new quantiles for a segment, either merged or fully recalculated
- Returns:
true if the floating point values should be requantized
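To make the bucket shift concrete, here is a minimal sketch of linear scalar quantization that counts how many components of a vector land in a different bucket once the quantiles change. BucketShiftSketch and both of its methods are hypothetical, and the clamp-and-round formula is a common textbook scheme assumed for illustration. It is not taken from the ScalarQuantizer source.

```java
final class BucketShiftSketch {
  /** Clamps value to [lower, upper] and maps it linearly to a bucket in [0, 2^bits - 1]. */
  static int bucket(float value, float lower, float upper, int bits) {
    int maxBucket = (1 << bits) - 1;
    float clamped = Math.max(lower, Math.min(upper, value));
    return Math.round((clamped - lower) * maxBucket / (upper - lower));
  }

  /** Counts components whose bucket differs between the existing and the new quantiles. */
  static int shiftedComponents(
      float[] vector, float oldLower, float oldUpper, float newLower, float newUpper, int bits) {
    int shifted = 0;
    for (float component : vector) {
      if (bucket(component, oldLower, oldUpper, bits)
          != bucket(component, newLower, newUpper, bits)) {
        shifted++;
      }
    }
    return shifted;
  }
}
```

For example, with 7 bits the value 0.5 falls in bucket 95 under quantiles [-1, 1] but in bucket 92 under [-1.1, 1.1], so bytes written with the old quantiles would no longer match the new state.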
-
writeQuantizedVectorData
public static DocsWithFieldSet writeQuantizedVectorData(IndexOutput output, QuantizedByteVectorValues quantizedByteVectorValues, byte bits, boolean compress) throws IOException
Writes the vector values to the output and returns a set of documents that contains vectors.
- Throws:
IOException
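The contract described above (write every vector, return the set of documents that had one) can be sketched with JDK types only. In this sketch ByteArrayOutputStream stands in for IndexOutput, BitSet stands in for DocsWithFieldSet, and parallel arrays stand in for QuantizedByteVectorValues. All of these substitutions, and the WriteQuantizedSketch name, are assumptions for illustration. The bits and compress parameters of the real method are not modeled.

```java
import java.io.ByteArrayOutputStream;
import java.util.BitSet;

final class WriteQuantizedSketch {
  /**
   * Appends each quantized vector to the output in order and records which
   * document ids contributed a vector.
   */
  static BitSet writeVectors(ByteArrayOutputStream output, int[] docIds, byte[][] vectors) {
    if (docIds.length != vectors.length) {
      throw new IllegalArgumentException("one doc id per vector is required");
    }
    BitSet docsWithField = new BitSet();
    for (int i = 0; i < docIds.length; i++) {
      // Write the raw quantized bytes for this document.
      output.write(vectors[i], 0, vectors[i].length);
      // Remember that this document has a vector for the field.
      docsWithField.set(docIds[i]);
    }
    return docsWithField;
  }
}
```

Returning the document set from the same pass avoids iterating the vector values a second time when the caller later needs to record which documents have a vector.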
-
close
public void close() throws IOException
- Throws:
IOException
-