Module org.apache.lucene.sandbox
Class SampleReader
java.lang.Object
org.apache.lucene.index.KnnVectorValues
org.apache.lucene.index.FloatVectorValues
org.apache.lucene.sandbox.codecs.quantization.SampleReader
- All Implemented Interfaces:
HasIndexSlice
A reader of vector values that samples a subset of the vectors.
-
Nested Class Summary
Nested classes/interfaces inherited from class org.apache.lucene.index.KnnVectorValues
KnnVectorValues.DocIndexIterator
-
Field Summary
Fields (modifier and type, field):
private final FloatVectorValues origin
private final IntUnaryOperator sampleFunction
private final int sampleSize
-
Constructor Summary
Constructors
SampleReader(FloatVectorValues origin, int sampleSize, IntUnaryOperator sampleFunction)
-
Method Summary
Methods (modifier and type, method, description):
copy()
    Creates a new copy of this KnnVectorValues.
static SampleReader createSampleReader(FloatVectorValues origin, int k, long seed)
int dimension()
    Return the dimension of the vectors.
Bits getAcceptOrds(Bits acceptDocs)
    Returns a Bits accepting docs accepted by the argument and having a vector value.
IndexInput getSlice()
    Returns an IndexInput from which to read this instance's values.
int getVectorByteLength()
    Returns the vector byte length, defaults to dimension multiplied by float byte size.
int ordToDoc(int ord)
    Return the docid of the document indexed with the given vector ordinal.
static int[] reservoirSample(int n, int k, long seed)
    Sample k elements from n elements according to the reservoir sampling algorithm.
static int[] reservoirSampleFromArray(int[] origin, int k, long seed)
    Sample k elements from the origin array using the reservoir sampling algorithm.
int size()
    Return the number of vectors for this field.
float[] vectorValue(int targetOrd)
    Return the vector value for the given vector ordinal, which must be in [0, size() - 1]; otherwise IndexOutOfBoundsException is thrown.
Methods inherited from class org.apache.lucene.index.FloatVectorValues
checkField, fromFloats, getEncoding, scorer
Methods inherited from class org.apache.lucene.index.KnnVectorValues
createDenseIterator, createSparseIterator, fromDISI, iterator
-
Field Details
-
origin
-
sampleSize
private final int sampleSize
-
sampleFunction
-
-
Constructor Details
-
SampleReader
SampleReader(FloatVectorValues origin, int sampleSize, IntUnaryOperator sampleFunction)
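This page does not document what sampleFunction does. One plausible reading of the signature, stated here as an assumption, is that it translates an ordinal within the sample (0 to sampleSize - 1) into an ordinal of origin. The sketch below illustrates that idea with a hypothetical helper class, SampleMappingSketch, which is not part of Lucene.

```java
import java.util.function.IntUnaryOperator;

/** Hypothetical illustration of an ordinal mapping; not part of Lucene. */
final class SampleMappingSketch {

  /**
   * Builds an operator that maps a sample ordinal to the original ordinal
   * stored at that position in sampledOrds (assumed semantics).
   */
  static IntUnaryOperator mappingFor(int[] sampledOrds) {
    int[] copy = sampledOrds.clone(); // defensive copy so later edits do not leak in
    return sampleOrd -> copy[sampleOrd];
  }

  public static void main(String[] args) {
    int[] sampledOrds = {4, 17, 42}; // made-up ordinals chosen from a larger set
    IntUnaryOperator sampleFunction = mappingFor(sampledOrds);
    for (int ord = 0; ord < sampledOrds.length; ord++) {
      System.out.println(ord + " -> " + sampleFunction.applyAsInt(ord));
    }
  }
}
```

Under this reading, sampleSize would equal the length of the array backing the mapping.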
-
-
Method Details
-
size
public int size()
Description copied from class: KnnVectorValues
Return the number of vectors for this field.
- Specified by:
size in class KnnVectorValues
- Returns:
the number of vectors returned by this iterator
-
dimension
public int dimension()
Description copied from class: KnnVectorValues
Return the dimension of the vectors.
- Specified by:
dimension in class KnnVectorValues
-
copy
Description copied from class: KnnVectorValues
Creates a new copy of this KnnVectorValues. This is helpful when you need to access different values at once, to avoid overwriting the underlying vector returned.
- Specified by:
copy in class FloatVectorValues
- Throws:
IOException
-
getSlice
Description copied from interface: HasIndexSlice
Returns an IndexInput from which to read this instance's values.
- Specified by:
getSlice in interface HasIndexSlice
-
vectorValue
Description copied from class: FloatVectorValues
Return the vector value for the given vector ordinal, which must be in [0, size() - 1]; otherwise IndexOutOfBoundsException is thrown. The returned array may be shared across calls.
- Specified by:
vectorValue in class FloatVectorValues
- Returns:
the vector value
- Throws:
IOException
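Because the returned array may be shared across calls, a caller that keeps vectors beyond the next call should copy them first. The following is a minimal sketch of that pattern. ReusingReader is a hypothetical stand-in that reuses one buffer; it is not a Lucene class and does not claim to reflect how SampleReader is implemented.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of copying a shared vector buffer; not part of Lucene. */
final class SharedArraySketch {

  /** Stand-in reader that returns the same buffer from every call. */
  static final class ReusingReader {
    private final float[][] data;
    private final float[] buffer;

    ReusingReader(float[][] data) {
      this.data = data;
      this.buffer = new float[data[0].length];
    }

    int size() {
      return data.length;
    }

    float[] vectorValue(int ord) {
      // Overwrites the shared buffer; earlier results change too.
      System.arraycopy(data[ord], 0, buffer, 0, buffer.length);
      return buffer;
    }
  }

  /** Collects every vector, cloning each one so later calls cannot overwrite it. */
  static List<float[]> collect(ReusingReader reader) {
    List<float[]> out = new ArrayList<>();
    for (int ord = 0; ord < reader.size(); ord++) {
      out.add(reader.vectorValue(ord).clone());
    }
    return out;
  }

  public static void main(String[] args) {
    ReusingReader reader = new ReusingReader(new float[][] {{1f, 2f}, {3f, 4f}});
    List<float[]> vectors = collect(reader);
    System.out.println(vectors.get(0)[0] + " " + vectors.get(1)[0]);
  }
}
```

Without the clone() call, every element of the list would reference the same buffer and hold the last vector read.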
-
getVectorByteLength
public int getVectorByteLength()
Description copied from class: KnnVectorValues
Returns the vector byte length, defaults to dimension multiplied by float byte size.
- Overrides:
getVectorByteLength in class KnnVectorValues
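The default described here is plain arithmetic: a Java float occupies Float.BYTES (4) bytes, so the byte length is the dimension times four. The helper below is a hypothetical illustration of that formula, not Lucene code, and says nothing about what SampleReader's override returns.

```java
/** Hypothetical illustration of the default byte-length formula; not part of Lucene. */
final class VectorByteLengthSketch {

  /** Dimension multiplied by the size of one float in bytes. */
  static int vectorByteLength(int dimension) {
    return dimension * Float.BYTES;
  }

  public static void main(String[] args) {
    // A 768-dimensional float vector occupies 768 * 4 = 3072 bytes.
    System.out.println(vectorByteLength(768));
  }
}
```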
-
ordToDoc
public int ordToDoc(int ord)
Description copied from class: KnnVectorValues
Return the docid of the document indexed with the given vector ordinal. This default implementation returns the argument and is appropriate for dense values implementations where every doc has a single value.
- Overrides:
ordToDoc in class KnnVectorValues
-
getAcceptOrds
Description copied from class: KnnVectorValues
Returns a Bits accepting docs accepted by the argument and having a vector value.
- Overrides:
getAcceptOrds in class KnnVectorValues
-
createSampleReader
-
reservoirSample
public static int[] reservoirSample(int n, int k, long seed)
Sample k elements from n elements according to the reservoir sampling algorithm.
- Parameters:
n - number of elements
k - number of samples
seed - random seed
- Returns:
array of k samples
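For readers unfamiliar with reservoir sampling, here is a minimal sketch of the classic variant (Algorithm R) over the indices 0 to n - 1. It is an independent illustration, not Lucene's implementation. The class name ReservoirSketch, the use of java.util.Random, and the choice to return min(n, k) elements when k exceeds n are all assumptions, so its output for a given seed need not match reservoirSample.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical illustration of Algorithm R; not Lucene's implementation. */
final class ReservoirSketch {

  /** Picks min(n, k) distinct indices from [0, n), each subset equally likely. */
  static int[] sample(int n, int k, long seed) {
    if (n < 0 || k < 0) {
      throw new IllegalArgumentException("n and k must be non-negative");
    }
    int size = Math.min(n, k);
    int[] reservoir = new int[size];
    // Fill the reservoir with the first `size` indices.
    for (int i = 0; i < size; i++) {
      reservoir[i] = i;
    }
    Random random = new Random(seed);
    // Index i replaces a random slot with probability size / (i + 1).
    for (int i = size; i < n; i++) {
      int j = random.nextInt(i + 1);
      if (j < size) {
        reservoir[j] = i;
      }
    }
    return reservoir;
  }

  public static void main(String[] args) {
    System.out.println(Arrays.toString(sample(1000, 5, 42L)));
  }
}
```

Sampling from an existing array, as reservoirSampleFromArray is described as doing, can be sketched in the same way by storing origin[i] in the reservoir instead of i.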
-
reservoirSampleFromArray
public static int[] reservoirSampleFromArray(int[] origin, int k, long seed)
Sample k elements from the origin array using the reservoir sampling algorithm.
- Parameters:
origin - original array
k - number of samples
seed - random seed
- Returns:
array of k samples
-