Search/Lucene/Index/SegmentInfo.php

Zend Framework
LICENSE
This source file is subject to the new BSD license that is bundled with this package in the file LICENSE.txt.
It is also available through the world-wide-web at this URL: http://framework.zend.com/license/new-bsd
If you did not receive a copy of the license and are unable to obtain it through the world-wide-web, please send an email to license@zend.com so we can send you a copy immediately.
Category
Zend  
Copyright
Copyright (c) 2005-2014 Zend Technologies USA Inc. (http://www.zend.com)  
License
New BSD License  
Package
Zend_Search_Lucene  
Subpackage
Index  
Version
$Id$  

\Zend_Search_Lucene_Index_SegmentInfo

Package: Zend_Search_Lucene\Index

Constants

Constant FULL_SCAN_VS_FETCH_BOUNDARY = 5
"Full scan vs fetch" boundary.
If the filter selectivity is less than this value, a full scan is performed, since fetching individual term entries carries some additional overhead.
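The boundary can be read as a simple threshold test. The sketch below is only an illustration: the function name `chooseScanStrategy` and the definition of selectivity as the segment document count divided by the filter size are assumptions, not something this page states.

```php
<?php
// Stand-in for the class constant, defined here so the sketch is self-contained.
define('FULL_SCAN_VS_FETCH_BOUNDARY', 5);

/**
 * Pick a strategy for reading a term's entries under a document filter.
 * Assumption: selectivity = documents in segment / documents in filter.
 */
function chooseScanStrategy($docCount, $filterSize)
{
    if ($filterSize === 0) {
        return 'empty'; // nothing can match an empty filter
    }

    $selectivity = $docCount / $filterSize;

    // Below the boundary a full scan is preferred, as described above.
    return ($selectivity < FULL_SCAN_VS_FETCH_BOUNDARY) ? 'full scan' : 'fetch';
}
```

With these assumptions, a filter that keeps half of a 100-document segment falls below the boundary, while a filter that keeps only 10 documents does not.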
Constant SM_TERMS_ONLY = 0
Scan modes
Constant SM_FULL_INFO = 1
Constant SM_MERGE_INFO = 2

Properties

Property private integer $_delGen
Delete file generation number
-2 means autodetect the latest delete generation
-1 means there is no delete file
0 means a pre-2.1 format delete file
X specifies the generation of the delete file in use
Details
Type
integer
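The special values of the delete generation number can be summarised in a small lookup. This is a minimal sketch; `describeDelGen()` is a hypothetical helper and is not part of the class.

```php
<?php
/**
 * Describe a delete generation number using the special values
 * documented for $_delGen. Hypothetical helper for illustration only.
 */
function describeDelGen($delGen)
{
    switch ($delGen) {
        case -2:
            return 'autodetect latest delete generation';
        case -1:
            return 'no delete file';
        case 0:
            return 'pre-2.1 format delete file';
        default:
            // Any other value names the generation of the delete file in use.
            return 'delete file generation ' . $delGen;
    }
}
```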
Property private mixed $_deleted = null
List of deleted documents.
A bitset if the bitset extension is loaded, or an array otherwise.
Default value
null
Details
Type
mixed
Property private boolean $_deletedDirty = false
Update flag for $this->_deleted
Default value
false
Details
Type
boolean
Property private \Zend_Search_Lucene_Storage_Directory_Filesystem $_directory
File system adapter.
Property private integer $_docCount
Number of docs in a segment
Details
Type
integer
Property private array|null $_docMap = null
Map of the document IDs. Used to get the new docID after removing deleted documents.
It is not very efficient in terms of memory usage, but it is much faster than other methods.
Default value
null
Details
Type
array | null
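A document map of this kind can be sketched as follows. The function name `buildDocMap()` is hypothetical, and the sketch assumes the array form of the deleted list, keyed by deleted document ID; the class's own construction of `$_docMap` may differ.

```php
<?php
/**
 * Build a map from old document IDs to new, compacted IDs.
 * Assumption: $deleted is an array keyed by deleted document ID.
 */
function buildDocMap($docCount, array $deleted)
{
    $docMap = array();
    $newId  = 0;

    for ($oldId = 0; $oldId < $docCount; $oldId++) {
        if (isset($deleted[$oldId])) {
            continue; // deleted documents get no new ID
        }
        $docMap[$oldId] = $newId++;
    }

    return $docMap;
}

// Five documents, of which documents 1 and 3 are deleted.
$docMap = buildDocMap(5, array(1 => 1, 3 => 1));
```

The map trades memory for speed: one array entry per surviving document, in exchange for a constant-time lookup of the new ID.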
Property private array $_fields
Segment fields. Array of Zend_Search_Lucene_Index_FieldInfo objects for this segment.
Details
Type
array
Property private array $_fieldsDicPositions
Field positions in a dictionary.
(The term dictionary contains fields ordered by name.)
Details
Type
array
Property private \Zend_Search_Lucene_Storage_File $_frqFile = null
Frequencies file object for stream-like terms reading
Default value
null
Details
Type
\Zend_Search_Lucene_Storage_File
Property private integer $_frqFileOffset
Actual offset of the .frq file data
Details
Type
integer
Property private boolean $_hasSingleNormFile
Segment has single norms file
If true, one .nrm file is used for all fields; otherwise .fN files are used.
Details
Type
boolean
Property private integer $_indexInterval
Segment index interval
Details
Type
integer
Property private boolean $_isCompound
Use compound segment file (*.cfs) to collect all other segment files (excluding .del files)
Details
Type
boolean
Property private \Zend_Search_Lucene_Index_Term $_lastTerm = null
Last Term in a terms stream
Default value
null
Details
Type
\Zend_Search_Lucene_Index_Term
Property private \Zend_Search_Lucene_Index_TermInfo $_lastTermInfo = null
Last TermInfo in a terms stream
Default value
null
Details
Type
\Zend_Search_Lucene_Index_TermInfo
Property private array|null $_lastTermPositions
An array of all term positions in the documents.
Array structure: array( docId => array( pos1, pos2, ...), ...)
Is set to null if term positions loading has to be skipped.
Details
Type
array | null
Property private string $_name
Segment name
Details
Type
string
Property private array $_norms = array()
Normalization factors.
An array fieldName => normVector.
normVector is a binary string. Each byte corresponds to an indexed document in a segment and encodes a normalization factor (float value, encoded by Zend_Search_Lucene_Search_Similarity::encodeNorm()).
Default value
array()
Details
Type
array
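Because a norm vector holds one byte per document, the raw value for a document can be read by indexing into the string. The sketch below stops at the raw byte and does not decode it into a float; `rawNormByte()` is a hypothetical helper and the byte values are made up.

```php
<?php
/**
 * Read the raw norm byte for one document from a norm vector.
 * Decoding the byte into a float is left to the similarity class.
 * Hypothetical helper for illustration only.
 */
function rawNormByte($normVector, $docId)
{
    if ($docId < 0 || $docId >= strlen($normVector)) {
        return null; // no norm stored for this document
    }

    return ord($normVector[$docId]);
}

// Norm vector for three documents, using made-up encoded norms.
$normVector = chr(124) . chr(120) . chr(116);
```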
Property private \Zend_Search_Lucene_Storage_File $_prxFile = null
Positions file object for stream-like terms reading
Default value
null
Details
Type
\Zend_Search_Lucene_Storage_File
Property private integer $_prxFileOffset
Actual offset of the .prx file in the compound file
Details
Type
integer
Property private array $_segFileSizes
Associative array where the key is the file name and the value is the file size (.cfs).
Details
Type
array
Property private array $_segFiles
Associative array where the key is the file name and the value is the data offset in a compound segment file (.cfs).
Details
Type
array
Property private $_sharedDocStoreOptions
Details
Type
n/a
Property private integer $_skipInterval
Segment skip interval
Details
Type
integer
Property private integer $_termCount = 0
Actual number of terms in term stream
Default value
0
Details
Type
integer
Property private array $_termDictionary
Term Dictionary Index
Array of arrays (Zend_Search_Lucene_Index_Term objects are represented as arrays because of performance considerations):
[0] -> $termValue
[1] -> $termFieldNum
The corresponding Zend_Search_Lucene_Index_TermInfo object is stored in $_termDictionaryInfos.
Details
Type
array
Property private array $_termDictionaryInfos
Term Dictionary Index TermInfos
Array of arrays (Zend_Search_Lucene_Index_TermInfo objects are represented as arrays because of performance considerations):
[0] -> $docFreq
[1] -> $freqPointer
[2] -> $proxPointer
[3] -> $skipOffset
[4] -> $indexPointer
Details
Type
array
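The two dictionary arrays are parallel: entry i of one describes the same term as entry i of the other. A minimal sketch of that layout, with made-up values and a hypothetical helper `docFreqAt()`:

```php
<?php
// Made-up entries; only the index layout follows the property descriptions.
$termDictionary = array(
    array('apple', 0),   // [0] term value, [1] term field number
    array('banana', 0),
);

$termDictionaryInfos = array(
    // [0] docFreq, [1] freqPointer, [2] proxPointer, [3] skipOffset, [4] indexPointer
    array(3, 0, 0, 0, 0),
    array(1, 12, 40, 0, 64),
);

/**
 * Return the document frequency stored for dictionary entry $i.
 * Hypothetical helper; plain numeric indexes are used instead of named keys.
 */
function docFreqAt(array $infos, $i)
{
    return $infos[$i][0];
}
```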
Property private array $_termInfoCache = array()
TermInfo cache
Size is 1024. Numbers are used instead of class constants because of performance considerations.
Default value
array()
Details
Type
array
Property private integer $_termNum = 0
Overall number of terms in term stream
Default value
0
Details
Type
integer
Property private integer $_termsScanMode
Terms scan mode
Values:
self::SM_TERMS_ONLY - terms are scanned, no additional info is retrieved
self::SM_FULL_INFO - terms are scanned, frequency and position info is retrieved
self::SM_MERGE_INFO - terms are scanned, frequency and position info is retrieved, document numbers are compacted (shifted if the segment has deleted documents)
Details
Type
integer
Property private \Zend_Search_Lucene_Storage_File $_tisFile = null
Term Dictionary file object for stream-like terms reading
Default value
null
Details
Type
\Zend_Search_Lucene_Storage_File
Property private integer $_tisFileOffset
Actual offset of the .tis file data
Details
Type
integer
Property private boolean $_usesSharedDocStore
True if segment uses shared doc store
Details
Type
boolean

Methods

method public __construct(\Zend_Search_Lucene_Storage_Directory $directory, string $name, integer $docCount, integer $delGen = 0, array | null $docStoreOptions = null, boolean $hasSingleNormFile = false, boolean $isCompound = null) : void

Zend_Search_Lucene_Index_SegmentInfo constructor

Parameters
Name Type Description
$directory \Zend_Search_Lucene_Storage_Directory
$name string
$docCount integer
$delGen integer
$docStoreOptions array | null
$hasSingleNormFile boolean
$isCompound boolean
method private _cleanUpTermInfoCache() : void

method private _deletedCount() : integer

Returns number of deleted documents.

Returns
Type Description
integer
method private _detectLatestDelGen() : integer

Detect latest delete generation

Is actually used from the writeChanges() method, or from the constructor if it is invoked from the index writer. In both cases the index write lock is already obtained, so we need not care about it.
Returns
Type Description
integer
method private _getFieldPosition(integer $fieldNum) : integer

Get field position in a fields dictionary

Parameters
Name Type Description
$fieldNum integer
Returns
Type Description
integer
method private _load21DelFile() : mixed

Load 2.1+ format deletions file

Returns a bitset or an array depending on bitset extension availability
Returns
Type Description
mixed
method private _loadDelFile() : mixed

Load deletions file

Returns a bitset or an array depending on bitset extension availability
Returns
Type Description
mixed
Throws
Exception Description
\Zend_Search_Lucene_Exception
method private _loadDictionaryIndex() : void

Load terms dictionary index

Throws
Exception Description
\Zend_Search_Lucene_Exception
method private _loadNorm(integer $fieldNum) : void

Load normalization factors from an index file

Parameters
Name Type Description
$fieldNum integer
Throws
Exception Description
\Zend_Search_Lucene_Exception
method private _loadPre21DelFile() : mixed

Load pre-2.1 deletions file

Returns a bitset or an array depending on bitset extension availability
Returns
Type Description
mixed
Throws
Exception Description
\Zend_Search_Lucene_Exception
method public closeTermsStream() : void

Close terms stream

Should be used for resource clean-up if the stream is not read up to the end
method public compoundFileLength(string $extension) : integer

Get compound file length

Parameters
Name Type Description
$extension string
Returns
Type Description
integer
method public count() : integer

Returns the total number of documents in this segment (including deleted documents).

Returns
Type Description
integer
method public currentTerm() : \Zend_Search_Lucene_Index_Term | null

Returns term in current position

Returns
Type Description
\Zend_Search_Lucene_Index_Term | null
method public currentTermPositions() : array

Returns an array of all term positions in the documents.

Return array structure: array( docId => array( pos1, pos2, ...), ...)
Returns
Type Description
array
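The positions structure is a nested array keyed by document ID, so it can be walked with a plain foreach. As a small illustration, the sketch below derives a docId => freq array from it by counting positions per document; `positionsToFreqs()` is a hypothetical helper and the sample data is made up.

```php
<?php
/**
 * Derive array(docId => freq) from array(docId => array(pos1, pos2, ...)).
 * Hypothetical helper for illustration only.
 */
function positionsToFreqs(array $positions)
{
    $freqs = array();

    foreach ($positions as $docId => $docPositions) {
        // The frequency of a term in a document is its number of positions.
        $freqs[$docId] = count($docPositions);
    }

    return $freqs;
}

// The term occurs at positions 2 and 17 in document 0 and at position 5 in document 4.
$positions = array(0 => array(2, 17), 4 => array(5));
$freqs     = positionsToFreqs($positions);
```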
method public delete($id) : void

Deletes a document from the index segment.

$id is an internal document id
Parameters
Name Type Description
$id
method public getDelGen() : integer

Returns actual deletions file generation number.

Returns
Type Description
integer
method public getField(integer $fieldNum) : \Zend_Search_Lucene_Index_FieldInfo

Returns field info for specified field

Parameters
Name Type Description
$fieldNum integer
Returns
Type Description
\Zend_Search_Lucene_Index_FieldInfo
method public getFieldInfos() : array

Returns array of FieldInfo objects.

Returns
Type Description
array
method public getFieldNum(string $fieldName) : integer

Returns field index or -1 if field is not found

Parameters
Name Type Description
$fieldName string
Returns
Type Description
integer
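The "-1 if not found" convention means callers should compare against -1 rather than test for a falsy value, since 0 is a valid field number. A minimal sketch of such a lookup, assuming a plain array that maps field number to field name (`findFieldNum()` is hypothetical):

```php
<?php
/**
 * Find a field number by name, returning -1 when the field is unknown.
 * Assumption: $fieldNames maps field number => field name.
 */
function findFieldNum(array $fieldNames, $fieldName)
{
    $fieldNum = array_search($fieldName, $fieldNames, true);

    // array_search() returns false on a miss; translate that to -1.
    return ($fieldNum === false) ? -1 : $fieldNum;
}

$fieldNames = array('title', 'body');
```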
method public getFields(boolean $indexed = false) : array

Returns array of fields.

If the $indexed parameter is true, only indexed fields are returned.
Parameters
Name Type Description
$indexed boolean
Returns
Type Description
array
method public getName() : string

Return segment name

Returns
Type Description
string
method public getTermInfo(\Zend_Search_Lucene_Index_Term $term) : \Zend_Search_Lucene_Index_TermInfo

Scans terms dictionary and returns term info

Parameters
Name Type Description
$term \Zend_Search_Lucene_Index_Term
Returns
Type Description
\Zend_Search_Lucene_Index_TermInfo
method public hasDeletions() : boolean

Returns true if any documents have been deleted from this index segment.

Returns
Type Description
boolean
method public hasSingleNormFile() : boolean

Returns true if segment has single norms file.

Returns
Type Description
boolean
method public isCompound() : boolean

Returns true if segment is stored using compound segment file.

Returns
Type Description
boolean
method public isDeleted($id) : boolean

Checks whether the document is deleted

Parameters
Name Type Description
$id
Returns
Type Description
boolean
method public nextTerm() : \Zend_Search_Lucene_Index_Term | null

Scans terms dictionary and returns next term

Returns
Type Description
\Zend_Search_Lucene_Index_Term | null
method public norm(integer $id, string $fieldName) : float

Returns normalization factor for the specified document

Parameters
Name Type Description
$id integer
$fieldName string
Returns
Type Description
float
method public normVector(string $fieldName) : string

Returns norm vector, encoded in a byte string

Parameters
Name Type Description
$fieldName string
Returns
Type Description
string
method public numDocs() : integer

Returns the total number of non-deleted documents in this segment.

Returns
Type Description
integer
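The relationship between count(), numDocs(), delete(), isDeleted() and hasDeletions() can be illustrated with a toy model. `ToySegment` below is a hypothetical stand-in that keeps deletions in an array keyed by document ID; it mirrors the documented behaviour only and does not read or write any index files.

```php
<?php
/** Toy segment that tracks deletions in an array keyed by document ID. */
class ToySegment
{
    private $_docCount;
    private $_deleted = array();

    public function __construct($docCount)
    {
        $this->_docCount = $docCount;
    }

    /** Mark an internal document ID as deleted. */
    public function delete($id)
    {
        $this->_deleted[$id] = 1;
    }

    public function isDeleted($id)
    {
        return isset($this->_deleted[$id]);
    }

    public function hasDeletions()
    {
        return count($this->_deleted) > 0;
    }

    /** Total number of documents, including deleted ones. */
    public function count()
    {
        return $this->_docCount;
    }

    /** Number of documents that are not deleted. */
    public function numDocs()
    {
        return $this->_docCount - count($this->_deleted);
    }
}

$segment = new ToySegment(4);
$segment->delete(2);
```

After the single delete() call, count() still reports all four documents while numDocs() reports the three that remain.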
method public openCompoundFile(string $extension, boolean $shareHandler = true) : \Zend_Search_Lucene_Storage_File

Opens index file stored within compound index file

Parameters
Name Type Description
$extension string
$shareHandler boolean
Returns
Type Description
\Zend_Search_Lucene_Storage_File
Throws
Exception Description
\Zend_Search_Lucene_Exception
method public resetTermsStream() : integer

Reset terms stream

$startId - id for the first document
$compact - remove deleted documents
Returns start document id for the next segment
Returns
Type Description
integer
Throws
Exception Description
\Zend_Search_Lucene_Exception
method public skipTo(\Zend_Search_Lucene_Index_Term $prefix) : void

Skip terms stream up to the specified term prefix.

Prefix contains fully specified field info and portion of searched term
Parameters
Name Type Description
$prefix \Zend_Search_Lucene_Index_Term
Throws
Exception Description
\Zend_Search_Lucene_Exception
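The terms stream methods (resetTermsStream(), skipTo(), currentTerm(), nextTerm()) form a cursor-style protocol. The sketch below shows one way a prefix scan could use that protocol. `ToyTermsStream` is a hypothetical in-memory stand-in that works on plain strings rather than Zend_Search_Lucene_Index_Term objects, so it illustrates the calling pattern only, not the class's implementation.

```php
<?php
/** In-memory stand-in for a terms stream over sorted term strings. */
class ToyTermsStream
{
    private $_terms;
    private $_pos = 0;

    public function __construct(array $terms)
    {
        sort($terms, SORT_STRING); // a term dictionary is read in sorted order
        $this->_terms = $terms;
    }

    public function resetTermsStream()
    {
        $this->_pos = 0;
    }

    /** Term at the current position, or null at the end of the stream. */
    public function currentTerm()
    {
        return isset($this->_terms[$this->_pos]) ? $this->_terms[$this->_pos] : null;
    }

    /** Advance by one term and return it (null at the end of the stream). */
    public function nextTerm()
    {
        $this->_pos++;
        return $this->currentTerm();
    }

    /** Advance to the first term that is not less than $prefix. */
    public function skipTo($prefix)
    {
        while (($term = $this->currentTerm()) !== null && strcmp($term, $prefix) < 0) {
            $this->_pos++;
        }
    }
}

$stream = new ToyTermsStream(array('cat', 'apple', 'car', 'dog'));
$stream->resetTermsStream();
$stream->skipTo('ca');

// Collect every term that starts with the prefix, then stop.
$matches = array();
while (($term = $stream->currentTerm()) !== null && strpos($term, 'ca') === 0) {
    $matches[] = $term;
    $stream->nextTerm();
}
```

With the real class, a scan that stops before the end of the stream would also call closeTermsStream() to release resources, as noted in that method's description.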
method public termDocs(\Zend_Search_Lucene_Index_Term $term, integer $shift = 0, \Zend_Search_Lucene_Index_DocsFilter | null $docsFilter = null) : array

Returns IDs of all the documents containing term.

Parameters
Name Type Description
$term \Zend_Search_Lucene_Index_Term
$shift integer
$docsFilter \Zend_Search_Lucene_Index_DocsFilter | null
Returns
Type Description
array
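The $shift parameter is not described on this page. Assuming it is an offset added to every returned document ID (for example, the position of the segment's first document within the whole index), its effect could be sketched as below; `shiftDocIds()` is a hypothetical helper.

```php
<?php
/**
 * Add a fixed offset to segment-local document IDs.
 * Assumption: this is how $shift is applied; hypothetical helper.
 */
function shiftDocIds(array $docIds, $shift = 0)
{
    $shifted = array();

    foreach ($docIds as $docId) {
        $shifted[] = $docId + $shift;
    }

    return $shifted;
}

// Segment-local IDs 0, 3 and 7 in a segment assumed to start at ID 100.
$indexWideIds = shiftDocIds(array(0, 3, 7), 100);
```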
method public termFreqs(\Zend_Search_Lucene_Index_Term $term, integer $shift = 0, \Zend_Search_Lucene_Index_DocsFilter | null $docsFilter = null) : array

Returns term freqs array.

Result array structure: array(docId => freq, ...)
Parameters
Name Type Description
$term \Zend_Search_Lucene_Index_Term
$shift integer
$docsFilter \Zend_Search_Lucene_Index_DocsFilter | null
Returns
Type Description
array
method public termPositions(\Zend_Search_Lucene_Index_Term $term, integer $shift = 0, \Zend_Search_Lucene_Index_DocsFilter | null $docsFilter = null) : array

Returns term positions array.

Result array structure: array(docId => array(pos1, pos2, ...), ...)
Parameters
Name Type Description
$term \Zend_Search_Lucene_Index_Term
$shift integer
$docsFilter \Zend_Search_Lucene_Index_DocsFilter | null
Returns
Type Description
array
Documentation was generated by phpDocumentor 2.2.0.