public abstract class IndexReader
extends Object
implements Closeable, Cloneable
java.lang.Object
   ↳ org.apache.lucene.index.IndexReader

Class Overview

IndexReader is an abstract class, providing an interface for accessing an index. Search of an index is done entirely through this abstract interface, so that any subclass which implements it is searchable.

Concrete subclasses of IndexReader are usually constructed with a call to one of the static open() methods, e.g. open(Directory, boolean).

For efficiency, in this API documents are often referred to via document numbers, non-negative integers which each name a unique document in the index. These document numbers are ephemeral--they may change as documents are added to and deleted from an index. Clients should thus not rely on a given document having the same number between sessions.

An IndexReader can be opened on a directory for which an IndexWriter is opened already, but it cannot be used to delete documents from the index then.

NOTE: for backwards API compatibility, several methods are not listed as abstract, but have no useful implementations in this base class and instead always throw UnsupportedOperationException. Subclasses are strongly encouraged to override these methods, but in many cases may not need to.

NOTE: as of 2.4, it's possible to open a read-only IndexReader using the static open methods that accept the boolean readOnly parameter. Such a reader has better concurrency as it's not necessary to synchronize on the isDeleted method. You must specify false if you want to make changes with the resulting IndexReader.

NOTE: IndexReader instances are completely thread safe, meaning multiple threads can call any of its methods concurrently. If your application requires external synchronization, you should not synchronize on the IndexReader instance; use your own (non-Lucene) objects instead.
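
The locking advice in the last note can be sketched as follows. This is a minimal illustration, not Lucene code: `ReaderHolder` is a hypothetical application class, and its `String` field merely stands in for an IndexReader reference that the application wants to swap atomically. The only point is that the monitor is a private, application-owned object rather than the reader itself.

```java
// Hypothetical application class; the String stands in for an IndexReader reference.
final class ReaderHolder {
    // Private, application-owned lock. The reader instance is never used as a monitor.
    private final Object lock = new Object();
    private String current;
    private int swaps;

    ReaderHolder(String initial) {
        this.current = initial;
    }

    // Returns the reference that is current at the time of the call.
    String get() {
        synchronized (lock) {
            return current;
        }
    }

    // Replaces the reference and records the swap under the same lock.
    void swap(String replacement) {
        synchronized (lock) {
            current = replacement;
            swaps++;
        }
    }

    int swapCount() {
        synchronized (lock) {
            return swaps;
        }
    }
}
```

Because the lock object is private, no other code (including library code) can contend for the same monitor.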

Summary

Nested Classes
class IndexReader.FieldOption Constants describing field properties, for example used for getFieldNames(FieldOption)
interface IndexReader.ReaderFinishedListener A custom listener that's invoked when the IndexReader is finished. 
Fields
protected boolean hasChanges
protected Collection<IndexReader.ReaderFinishedListener> readerFinishedListeners
Protected Constructors
IndexReader()
Public Methods
void addReaderFinishedListener(IndexReader.ReaderFinishedListener listener)
synchronized IndexReader clone(boolean openReadOnly)
Clones the IndexReader and optionally changes readOnly.
synchronized Object clone()
Efficiently clones the IndexReader (sharing most internal state).
synchronized final void close()
Closes files associated with this index.
synchronized final void commit(Map<String, String> commitUserData)
Commit changes resulting from delete, undeleteAll, or setNorm operations. If an exception is hit, then either no changes or all changes will have been committed to the index (transactional semantics).
void decRef()
Expert: decreases the refCount of this IndexReader instance.
synchronized void deleteDocument(int docNum)
Deletes the document numbered docNum.
int deleteDocuments(Term term)
Deletes all documents that have a given term indexed.
Directory directory()
Returns the directory associated with this index.
abstract int docFreq(Term t)
Returns the number of documents containing the term t.
abstract Document document(int n, FieldSelector fieldSelector)
Get the Document at the nth position.
Document document(int n)
Returns the stored fields of the nth Document in this index.
synchronized final void flush(Map<String, String> commitUserData)
synchronized final void flush()
Map<String, String> getCommitUserData()
Retrieve the String userData optionally passed to IndexWriter#commit.
static Map<String, String> getCommitUserData(Directory directory)
Reads commitUserData, previously passed to commit(Map), from current index segments file.
Object getCoreCacheKey()
Expert
static long getCurrentVersion(Directory directory)
Reads version number from segments files.
Object getDeletesCacheKey()
Expert.
abstract Collection<String> getFieldNames(IndexReader.FieldOption fldOption)
Get a list of unique field names that exist in this index and have the specified field option information.
IndexCommit getIndexCommit()
Expert: return the IndexCommit that this reader has opened.
int getRefCount()
Expert: returns the current refCount for this reader
IndexReader[] getSequentialSubReaders()
Expert: returns the sequential sub readers that this reader is logically composed of.
abstract void getTermFreqVector(int docNumber, String field, TermVectorMapper mapper)
Load the Term Vector into a user-defined data structure instead of relying on the parallel arrays of the TermFreqVector.
abstract TermFreqVector getTermFreqVector(int docNumber, String field)
Return a term frequency vector for the specified document and field.
abstract void getTermFreqVector(int docNumber, TermVectorMapper mapper)
Map all the term vectors for all fields in a Document
abstract TermFreqVector[] getTermFreqVectors(int docNumber)
Return an array of term frequency vectors for the specified document.
int getTermInfosIndexDivisor()
For IndexReader implementations that use TermInfosReader to read terms, this returns the current indexDivisor as specified when the reader was opened.
long getUniqueTermCount()
Returns the number of unique terms (across all fields) in this reader.
long getVersion()
Version number when this IndexReader was opened.
abstract boolean hasDeletions()
Returns true if any documents have been deleted
boolean hasNorms(String field)
Returns true if there are norms stored for this field.
void incRef()
Expert: increments the refCount of this IndexReader instance.
static boolean indexExists(Directory directory)
Returns true if an index exists at the specified directory.
boolean isCurrent()
Check whether any new changes have occurred to the index since this reader was opened.
abstract boolean isDeleted(int n)
Returns true if document n has been deleted
boolean isOptimized()
Checks if the index is optimized (if it has a single segment and no deletions).
static long lastModified(Directory directory2)
Returns the time the index in the named directory was last modified.
static Collection<IndexCommit> listCommits(Directory dir)
Returns all commit points that exist in the Directory.
static void main(String[] args)
Prints the filename and size of each file within a given compound file.
abstract int maxDoc()
Returns one greater than the largest possible document number.
abstract void norms(String field, byte[] bytes, int offset)
Reads the byte-encoded normalization factor for the named field of every document.
abstract byte[] norms(String field)
Returns the byte-encoded normalization factor for the named field of every document.
int numDeletedDocs()
Returns the number of deleted documents.
abstract int numDocs()
Returns the number of documents in this index.
static IndexReader open(Directory directory, boolean readOnly)
Returns an IndexReader reading the index in the given Directory.
static IndexReader open(IndexCommit commit, boolean readOnly)
Expert: returns an IndexReader reading the index in the given IndexCommit.
static IndexReader open(IndexWriter writer, boolean applyAllDeletes)
Open a near real time IndexReader from the IndexWriter.
static IndexReader open(IndexCommit commit, IndexDeletionPolicy deletionPolicy, boolean readOnly, int termInfosIndexDivisor)
Expert: returns an IndexReader reading the index in the given Directory, using a specific commit and with a custom IndexDeletionPolicy.
static IndexReader open(Directory directory)
Returns an IndexReader reading the index in the given Directory, with readOnly=true.
static IndexReader open(Directory directory, IndexDeletionPolicy deletionPolicy, boolean readOnly)
Expert: returns an IndexReader reading the index in the given Directory, with a custom IndexDeletionPolicy.
static IndexReader open(IndexCommit commit, IndexDeletionPolicy deletionPolicy, boolean readOnly)
Expert: returns an IndexReader reading the index in the given Directory, using a specific commit and with a custom IndexDeletionPolicy.
static IndexReader open(Directory directory, IndexDeletionPolicy deletionPolicy, boolean readOnly, int termInfosIndexDivisor)
Expert: returns an IndexReader reading the index in the given Directory, with a custom IndexDeletionPolicy.
void removeReaderFinishedListener(IndexReader.ReaderFinishedListener listener)
Expert: remove a previously added IndexReader.ReaderFinishedListener.
synchronized IndexReader reopen(IndexCommit commit)
Expert: reopen this reader on a specific commit point.
synchronized IndexReader reopen()
Refreshes an IndexReader if the index has changed since this instance was (re)opened.
IndexReader reopen(IndexWriter writer, boolean applyAllDeletes)
Expert: returns a readonly reader, covering all committed as well as un-committed changes to the index.
synchronized IndexReader reopen(boolean openReadOnly)
Just like reopen(), except you can change the readOnly of the original reader.
@Deprecated void setNorm(int doc, String field, float value)
This method is deprecated. Use setNorm(int, String, byte) instead, encoding the float to byte with your Similarity's encodeNormValue(float). This method will be removed in Lucene 4.0
synchronized void setNorm(int doc, String field, byte value)
Expert: Resets the normalization factor for the named field of the named document.
abstract TermDocs termDocs()
Returns an unpositioned TermDocs enumerator.
TermDocs termDocs(Term term)
Returns an enumeration of all the documents which contain term.
TermPositions termPositions(Term term)
Returns an enumeration of all the documents which contain term.
abstract TermPositions termPositions()
Returns an unpositioned TermPositions enumerator.
abstract TermEnum terms(Term t)
Returns an enumeration of all terms starting at a given term.
abstract TermEnum terms()
Returns an enumeration of all the terms in the index.
String toString()
synchronized void undeleteAll()
Undeletes all documents currently marked as deleted in this index.
Protected Methods
synchronized void acquireWriteLock()
Does nothing by default.
synchronized final void commit()
Commit changes resulting from delete, undeleteAll, or setNorm operations. If an exception is hit, then either no changes or all changes will have been committed to the index (transactional semantics).
abstract void doClose()
Implements close.
abstract void doCommit(Map<String, String> commitUserData)
Implements commit.
abstract void doDelete(int docNum)
Implements deletion of the document numbered docNum.
abstract void doSetNorm(int doc, String field, byte value)
Implements setNorm in subclass.
abstract void doUndeleteAll()
Implements actual undeleteAll() in subclass.
final void ensureOpen()
void notifyReaderFinishedListeners()
void readerFinished()
Inherited Methods
From class java.lang.Object
From interface java.io.Closeable

Fields

protected boolean hasChanges

protected Collection<IndexReader.ReaderFinishedListener> readerFinishedListeners

Protected Constructors

protected IndexReader ()

Public Methods

public void addReaderFinishedListener (IndexReader.ReaderFinishedListener listener)

Expert: adds an IndexReader.ReaderFinishedListener. The provided listener is also added to any sub-readers, if this is a composite reader. Any reader reopened or cloned from this one also copies the listeners at the time of reopen.

public synchronized IndexReader clone (boolean openReadOnly)

Clones the IndexReader and optionally changes readOnly. A readOnly reader cannot open a writeable reader.

Throws
CorruptIndexException if the index is corrupt
IOException if there is a low-level IO error

public synchronized Object clone ()

Efficiently clones the IndexReader (sharing most internal state).

On cloning a reader with pending changes (deletions, norms), the original reader transfers its write lock to the cloned reader. This means only the cloned reader may make further changes to the index, and commit the changes to the index on close, but the old reader still reflects all changes made up until it was cloned.

Like reopen(), it's safe to make changes to either the original or the cloned reader: all shared mutable state obeys "copy on write" semantics to ensure the changes are not seen by other readers.
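
The "copy on write" behaviour described above can be illustrated with a toy model. This is a sketch under stated assumptions, not Lucene's implementation: `CowReader` is a hypothetical class whose only mutable state is a `java.util.BitSet` of deleted document numbers. Clones share that bit set until one of them writes, at which point the writer takes a private copy.

```java
import java.util.BitSet;

// Toy model only: a "reader" whose deleted-docs bits are shared with its
// clones until one of them writes. Not Lucene's actual implementation.
final class CowReader implements Cloneable {
    private BitSet deleted;          // may be shared with clones
    private boolean shared = false;  // true while another reader may see 'deleted'

    CowReader(int maxDoc) {
        this.deleted = new BitSet(maxDoc);
    }

    @Override
    public synchronized CowReader clone() {
        try {
            CowReader copy = (CowReader) super.clone(); // shallow: shares the BitSet
            this.shared = true;
            copy.shared = true;
            return copy;
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // cannot happen: class is Cloneable
        }
    }

    synchronized void delete(int docNum) {
        if (shared) {
            // Copy on first write so other readers never observe this change.
            deleted = (BitSet) deleted.clone();
            shared = false;
        }
        deleted.set(docNum);
    }

    synchronized boolean isDeleted(int docNum) {
        return deleted.get(docNum);
    }
}
```

In this model a deletion made before cloning is visible to both readers, while deletions made afterwards stay private to the reader that made them.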

public final synchronized void close ()

Closes files associated with this index. Also saves any new deletions to disk. No other methods should be called after this has been called.

Throws
IOException if there is a low-level IO error

public final synchronized void commit (Map<String, String> commitUserData)

Commit changes resulting from delete, undeleteAll, or setNorm operations. If an exception is hit, then either no changes or all changes will have been committed to the index (transactional semantics).

Throws
IOException if there is a low-level IO error

public void decRef ()

Expert: decreases the refCount of this IndexReader instance. If the refCount drops to 0, then pending changes (if any) are committed to the index and this reader is closed. If an exception is hit, the refCount is unchanged.

Throws
IOException in case an IOException occurs in commit() or doClose()

public synchronized void deleteDocument (int docNum)

Deletes the document numbered docNum. Once a document is deleted it will not appear in TermDocs or TermPositions enumerations. Attempts to read its field with the document(int) method will result in an error. The presence of this document may still be reflected in the docFreq(Term) statistic, though this will be corrected eventually as the index is further modified.

Throws
StaleReaderException if the index has changed since this reader was opened
CorruptIndexException if the index is corrupt
LockObtainFailedException if another writer has this index open (write.lock could not be obtained)
IOException if there is a low-level IO error

public int deleteDocuments (Term term)

Deletes all documents that have a given term indexed. This is useful if one uses a document field to hold a unique ID string for the document. Then to delete such a document, one merely constructs a term with the appropriate field and the unique ID string as its text and passes it to this method. See deleteDocument(int) for information about when this deletion will become effective.

Returns
  • the number of documents deleted
Throws
StaleReaderException if the index has changed since this reader was opened
CorruptIndexException if the index is corrupt
LockObtainFailedException if another writer has this index open (write.lock could not be obtained)
IOException if there is a low-level IO error

public Directory directory ()

Returns the directory associated with this index. The default implementation returns the directory specified by subclasses when delegating to the IndexReader(Directory) constructor, or throws an UnsupportedOperationException if one was not specified.

Throws
UnsupportedOperationException if no directory

public abstract int docFreq (Term t)

Returns the number of documents containing the term t.

Throws
IOException if there is a low-level IO error

public abstract Document document (int n, FieldSelector fieldSelector)

Get the Document at the nth position. The FieldSelector may be used to determine what Fields to load and how they should be loaded. NOTE: If this Reader (more specifically, the underlying FieldsReader) is closed before the lazy Field is loaded, an exception may be thrown. If you want the value of a lazy Field to be available after closing, you must explicitly load it or fetch the Document again with a new loader.

NOTE: for performance reasons, this method does not check if the requested document is deleted, and therefore asking for a deleted document may yield unspecified results. Usually this check is not required; however, you can call isDeleted(int) with the requested document ID to verify the document is not deleted.

Parameters
n Get the document at the nth position
fieldSelector The FieldSelector to use to determine what Fields should be loaded on the Document. May be null, in which case all Fields will be loaded.
Returns
  • The stored fields of the Document at the nth position
Throws
CorruptIndexException if the index is corrupt
IOException if there is a low-level IO error

public Document document (int n)

Returns the stored fields of the nth Document in this index.

NOTE: for performance reasons, this method does not check if the requested document is deleted, and therefore asking for a deleted document may yield unspecified results. Usually this check is not required; however, you can call isDeleted(int) with the requested document ID to verify the document is not deleted.

Throws
CorruptIndexException if the index is corrupt
IOException if there is a low-level IO error

public final synchronized void flush (Map<String, String> commitUserData)

Parameters
commitUserData Opaque Map (String -> String) that's recorded into the segments file in the index, and retrievable by getCommitUserData().
Throws
IOException

public final synchronized void flush ()

Throws
IOException

public Map<String, String> getCommitUserData ()

Retrieve the String userData optionally passed to IndexWriter#commit. This will return null if commit(Map) has never been called for this index.

public static Map<String, String> getCommitUserData (Directory directory)

Reads commitUserData, previously passed to commit(Map), from current index segments file. This will return null if commit(Map) has never been called for this index.

Parameters
directory where the index resides.
Returns
  • commit userData.
Throws
CorruptIndexException if the index is corrupt
IOException if there is a low-level IO error

public Object getCoreCacheKey ()

Expert

public static long getCurrentVersion (Directory directory)

Reads version number from segments files. The version number is initialized with a timestamp and then increased by one for each change of the index.

Parameters
directory where the index resides.
Returns
  • version number.
Throws
CorruptIndexException if the index is corrupt
IOException if there is a low-level IO error

public Object getDeletesCacheKey ()

Expert. Warning: this returns null if the reader has no deletions

public abstract Collection<String> getFieldNames (IndexReader.FieldOption fldOption)

Get a list of unique field names that exist in this index and have the specified field option information.

Parameters
fldOption specifies which field option should be available for the returned fields
Returns
  • Collection of Strings indicating the names of the fields.

public IndexCommit getIndexCommit ()

Expert: return the IndexCommit that this reader has opened. This method is only implemented by those readers that correspond to a Directory with its own segments_N file.

Throws
IOException

public int getRefCount ()

Expert: returns the current refCount for this reader

public IndexReader[] getSequentialSubReaders ()

Expert: returns the sequential sub readers that this reader is logically composed of. For example, IndexSearcher uses this API to drive searching by one sub reader at a time. If this reader is not composed of sequential child readers, it should return null. If this method returns an empty array, that means this reader is a null reader (for example a MultiReader that has no sub readers).

NOTE: You should not try using sub-readers returned by this method to make any changes (setNorm, deleteDocument, etc.). While this might succeed for one composite reader (like MultiReader), it will most likely lead to index corruption for other readers (like DirectoryReader obtained through open(IndexCommit, boolean)). Use the parent reader directly.

public abstract void getTermFreqVector (int docNumber, String field, TermVectorMapper mapper)

Load the Term Vector into a user-defined data structure instead of relying on the parallel arrays of the TermFreqVector.

Parameters
docNumber The number of the document to load the vector for
field The name of the field to load
mapper The TermVectorMapper to process the vector. Must not be null
Throws
IOException if term vectors cannot be accessed or if they do not exist on the field and doc. specified.

public abstract TermFreqVector getTermFreqVector (int docNumber, String field)

Return a term frequency vector for the specified document and field. The returned vector contains terms and frequencies for the terms in the specified field of this document, if the field had the storeTermVector flag set. If term vectors were stored with positions or offsets, a TermPositionVector is returned.

Parameters
docNumber document for which the term frequency vector is returned
field field for which the term frequency vector is returned.
Returns
  • term frequency vector. May be null if the field does not exist in the specified document or the term vector was not stored.
Throws
IOException if index cannot be accessed

public abstract void getTermFreqVector (int docNumber, TermVectorMapper mapper)

Map all the term vectors for all fields in a Document

Parameters
docNumber The number of the document to load the vector for
mapper The TermVectorMapper to process the vector. Must not be null
Throws
IOException if term vectors cannot be accessed or if they do not exist on the field and doc. specified.

public abstract TermFreqVector[] getTermFreqVectors (int docNumber)

Return an array of term frequency vectors for the specified document. The array contains a vector for each vectorized field in the document. Each vector contains terms and frequencies for all terms in a given vectorized field. If no such fields existed, the method returns null. The term vectors that are returned may either be of type TermFreqVector or of type TermPositionVector if positions or offsets have been stored.

Parameters
docNumber document for which term frequency vectors are returned
Returns
  • array of term frequency vectors. May be null if no term vectors have been stored for the specified document.
Throws
IOException if index cannot be accessed

public int getTermInfosIndexDivisor ()

For IndexReader implementations that use TermInfosReader to read terms, this returns the current indexDivisor as specified when the reader was opened.

public long getUniqueTermCount ()

Returns the number of unique terms (across all fields) in this reader. This method returns long, even though internally Lucene cannot handle more than 2^31 unique terms, for a possible future when this limitation is removed.

Throws
UnsupportedOperationException if this count cannot be easily determined (eg Multi*Readers). Instead, you should call getSequentialSubReaders() and ask each sub reader for its unique term count.
IOException

public long getVersion ()

Version number when this IndexReader was opened. Not implemented in the IndexReader base class.

If this reader is based on a Directory (ie, was created by calling open(IndexCommit, boolean), or reopen() on a reader based on a Directory), then this method returns the version recorded in the commit that the reader opened. This version is advanced every time commit() is called.

If instead this reader is a near real-time reader (ie, obtained by a call to getReader(), or by calling reopen() on a near real-time reader), then this method returns the version of the last commit done by the writer. Note that even as further changes are made with the writer, the version will not change until a commit is completed. Thus, you should not rely on this method to determine when a near real-time reader should be opened. Use isCurrent() instead.

Throws
UnsupportedOperationException unless overridden in subclass

public abstract boolean hasDeletions ()

Returns true if any documents have been deleted

public boolean hasNorms (String field)

Returns true if there are norms stored for this field.

Throws
IOException

public void incRef ()

Expert: increments the refCount of this IndexReader instance. RefCounts are used to determine when a reader can be closed safely, i.e. as soon as there are no more references. Be sure to always call a corresponding decRef(), in a finally clause; otherwise the reader may never be closed. Note that close() simply calls decRef(), which means that the IndexReader will not really be closed until decRef() has been called for all outstanding references.
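
The reference-counting contract above can be sketched with a toy class. This is an illustration under stated assumptions, not Lucene code: `RefCounted` and `useSafely` are hypothetical names, and the `closed` flag stands in for releasing the underlying files. It shows that `close()` is just one `decRef()`, and that every borrowed reference is released in a `finally` block.

```java
// Toy reference-counted resource mirroring the incRef()/decRef()/close()
// contract described above. It is not the Lucene class.
final class RefCounted {
    private int refCount = 1;       // the creator holds the first reference
    private boolean closed = false; // stands in for releasing index files

    synchronized void incRef() {
        if (closed) {
            throw new IllegalStateException("already closed");
        }
        refCount++;
    }

    synchronized void decRef() {
        if (refCount <= 0) {
            throw new IllegalStateException("too many decRef calls");
        }
        if (--refCount == 0) {
            closed = true;          // last reference gone: really close
        }
    }

    // close() simply gives up the caller's own reference.
    void close() {
        decRef();
    }

    synchronized int getRefCount() {
        return refCount;
    }

    synchronized boolean isClosed() {
        return closed;
    }

    // Borrow pattern: pair incRef() with decRef() in a finally block.
    static int useSafely(RefCounted r) {
        r.incRef();
        try {
            return r.getRefCount(); // placeholder for real work with the resource
        } finally {
            r.decRef();
        }
    }
}
```

In this model, calling `close()` while another holder still has a reference leaves the resource open until that holder calls `decRef()`.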

public static boolean indexExists (Directory directory)

Returns true if an index exists at the specified directory.

Parameters
directory the directory to check for an index
Returns
  • true if an index exists; false otherwise
Throws
IOException if there is a problem with accessing the index

public boolean isCurrent ()

Check whether any new changes have occurred to the index since this reader was opened.

If this reader is based on a Directory (ie, was created by calling open(IndexCommit, boolean), or reopen() on a reader based on a Directory), then this method checks if any further commits (see commit()) have occurred in that directory.

If instead this reader is a near real-time reader (ie, obtained by a call to getReader(), or by calling reopen() on a near real-time reader), then this method checks if either a new commit has occurred, or any new uncommitted changes have taken place via the writer. Note that even if the writer has only performed merging, this method will still return false.

In any event, if this returns false, you should call reopen() to get a new reader that sees the changes.

Throws
CorruptIndexException if the index is corrupt
IOException if there is a low-level IO error
UnsupportedOperationException unless overridden in subclass

public abstract boolean isDeleted (int n)

Returns true if document n has been deleted

public boolean isOptimized ()

Checks if the index is optimized (if it has a single segment and no deletions). Not implemented in the IndexReader base class.

Returns
  • true if the index is optimized; false otherwise
Throws
UnsupportedOperationException unless overridden in subclass

public static long lastModified (Directory directory2)

Returns the time the index in the named directory was last modified. Do not use this to check whether the reader is still up-to-date, use isCurrent() instead.

Throws
CorruptIndexException if the index is corrupt
IOException if there is a low-level IO error

public static Collection<IndexCommit> listCommits (Directory dir)

Returns all commit points that exist in the Directory. Normally, because the default is KeepOnlyLastCommitDeletionPolicy, there would be only one commit point. But if you're using a custom IndexDeletionPolicy then there could be many commits. Once you have a given commit, you can open a reader on it by calling open(IndexCommit, boolean). There must be at least one commit in the Directory, else this method throws IndexNotFoundException. Note that if a commit is in progress while this method is running, that commit may or may not be returned.

Throws
IOException

public static void main (String[] args)

Prints the filename and size of each file within a given compound file. Add the -extract flag to extract files to the current working directory. In order to make the extracted version of the index work, you have to copy the segments file from the compound index into the directory where the extracted files are stored.

Parameters
args Usage: org.apache.lucene.index.IndexReader [-extract] <cfsfile>

public abstract int maxDoc ()

Returns one greater than the largest possible document number. This may be used to, e.g., determine how big to allocate an array which will have an element for every document number in an index.
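
As a rough sketch of the array-sizing use mentioned above: valid document numbers run from 0 to maxDoc-1, so an array of length maxDoc has one slot per possible document. The class below is a hypothetical stand-in, not part of Lucene; the `maxDoc` argument and the `BitSet` of deletions play the roles that `maxDoc()` and `isDeleted(int)` would play on a real reader.

```java
import java.util.BitSet;

// Hypothetical helper; maxDoc and the BitSet stand in for a reader's
// maxDoc() and isDeleted(int).
final class DocNumberDemo {
    // One slot per possible document number: valid numbers are 0 .. maxDoc-1.
    static boolean[] liveMask(int maxDoc, BitSet deleted) {
        boolean[] live = new boolean[maxDoc];
        for (int doc = 0; doc < maxDoc; doc++) {
            live[doc] = !deleted.get(doc);
        }
        return live;
    }

    // Counts the slots that belong to documents not marked as deleted.
    static int countLive(boolean[] live) {
        int n = 0;
        for (boolean isLive : live) {
            if (isLive) {
                n++;
            }
        }
        return n;
    }
}
```

The live count computed this way corresponds to maxDoc minus the number of deleted documents.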

public abstract void norms (String field, byte[] bytes, int offset)

Reads the byte-encoded normalization factor for the named field of every document. This is used by the search code to score documents.

Throws
IOException

public abstract byte[] norms (String field)

Returns the byte-encoded normalization factor for the named field of every document. This is used by the search code to score documents. Returns null if norms were not indexed for this field.

Throws
IOException

public int numDeletedDocs ()

Returns the number of deleted documents.

public abstract int numDocs ()

Returns the number of documents in this index.

public static IndexReader open (Directory directory, boolean readOnly)

Returns an IndexReader reading the index in the given Directory. You should pass readOnly=true, since it gives much better concurrent performance, unless you intend to do write operations (delete documents or change norms) with the reader.

Parameters
directory the index directory
readOnly true if no changes (deletions, norms) will be made with this IndexReader
Throws
CorruptIndexException if the index is corrupt
IOException if there is a low-level IO error

public static IndexReader open (IndexCommit commit, boolean readOnly)

Expert: returns an IndexReader reading the index in the given IndexCommit. You should pass readOnly=true, since it gives much better concurrent performance, unless you intend to do write operations (delete documents or change norms) with the reader.

Parameters
commit the commit point to open
readOnly true if no changes (deletions, norms) will be made with this IndexReader
Throws
CorruptIndexException if the index is corrupt
IOException if there is a low-level IO error

public static IndexReader open (IndexWriter writer, boolean applyAllDeletes)

Open a near real time IndexReader from the IndexWriter.

Parameters
writer The IndexWriter to open from
applyAllDeletes If true, all buffered deletes will be applied (made visible) in the returned reader. If false, the deletes are not applied but remain buffered (in IndexWriter) so that they will be applied in the future. Applying deletes can be costly, so if your app can tolerate deleted documents being returned you might gain some performance by passing false.
Returns
  • The new IndexReader
Throws
CorruptIndexException if the index is corrupt
IOException if there is a low-level IO error

public static IndexReader open (IndexCommit commit, IndexDeletionPolicy deletionPolicy, boolean readOnly, int termInfosIndexDivisor)

Expert: returns an IndexReader reading the index in the given Directory, using a specific commit and with a custom IndexDeletionPolicy. You should pass readOnly=true, since it gives much better concurrent performance, unless you intend to do write operations (delete documents or change norms) with the reader.

Parameters
commit the specific IndexCommit to open; see listCommits(Directory) to list all commits in a directory
deletionPolicy a custom deletion policy (only used if you use this reader to perform deletes or to set norms); see IndexWriter for details.
readOnly true if no changes (deletions, norms) will be made with this IndexReader
termInfosIndexDivisor Subsamples which indexed terms are loaded into RAM. This has the same effect as setTermIndexInterval(int) except that setting must be done at indexing time while this setting can be set per reader. When set to N, then one in every N*termIndexInterval terms in the index is loaded into memory. By setting this to a value > 1 you can reduce memory usage, at the expense of higher latency when loading a TermInfo. The default value is 1. Set this to -1 to skip loading the terms index entirely. This is only useful in advanced situations when you will only .next() through all terms; attempts to seek will hit an exception.
Throws
CorruptIndexException if the index is corrupt
IOException if there is a low-level IO error
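
The effect of termInfosIndexDivisor on memory can be estimated with simple arithmetic, following the "one in every N*termIndexInterval terms" rule stated above. The helper below is a back-of-the-envelope sketch, not a Lucene API: `TermIndexEstimate` is a hypothetical name, the result is only an approximate count of index terms held in RAM, and the interval of 128 used in the usage note is an assumed value chosen for illustration.

```java
// Hypothetical estimator; not part of Lucene.
final class TermIndexEstimate {
    // Approximate number of index terms loaded into RAM:
    // one for every (termIndexInterval * divisor) terms in the index.
    static long loadedIndexTerms(long totalTerms, int termIndexInterval, int divisor) {
        if (divisor == -1) {
            return 0; // terms index is not loaded at all
        }
        if (divisor < 1 || termIndexInterval < 1 || totalTerms < 0) {
            throw new IllegalArgumentException("invalid arguments");
        }
        long step = (long) termIndexInterval * divisor;
        return (totalTerms + step - 1) / step; // ceiling division
    }
}
```

For example, with one million terms and an assumed interval of 128, a divisor of 1 gives about 7813 in-memory index terms, while a divisor of 4 gives about 1954, at the cost of longer scans when looking up a term.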

public static IndexReader open (Directory directory)

Returns an IndexReader reading the index in the given Directory, with readOnly=true.

Parameters
directory the index directory
Throws
CorruptIndexException if the index is corrupt
IOException if there is a low-level IO error

public static IndexReader open (Directory directory, IndexDeletionPolicy deletionPolicy, boolean readOnly)

Expert: returns an IndexReader reading the index in the given Directory, with a custom IndexDeletionPolicy. You should pass readOnly=true, since it gives much better concurrent performance, unless you intend to do write operations (delete documents or change norms) with the reader.

Parameters
directory the index directory
deletionPolicy a custom deletion policy (only used if you use this reader to perform deletes or to set norms); see IndexWriter for details.
readOnly true if no changes (deletions, norms) will be made with this IndexReader
Throws
CorruptIndexException if the index is corrupt
IOException if there is a low-level IO error

public static IndexReader open (IndexCommit commit, IndexDeletionPolicy deletionPolicy, boolean readOnly)

Expert: returns an IndexReader reading the index in the given Directory, using a specific commit and with a custom IndexDeletionPolicy. You should pass readOnly=true, since it gives much better concurrent performance, unless you intend to do write operations (delete documents or change norms) with the reader.

Parameters
commit the specific IndexCommit to open; see listCommits(Directory) to list all commits in a directory
deletionPolicy a custom deletion policy (only used if you use this reader to perform deletes or to set norms); see IndexWriter for details.
readOnly true if no changes (deletions, norms) will be made with this IndexReader
Throws
CorruptIndexException if the index is corrupt
IOException if there is a low-level IO error

public static IndexReader open (Directory directory, IndexDeletionPolicy deletionPolicy, boolean readOnly, int termInfosIndexDivisor)

Expert: returns an IndexReader reading the index in the given Directory, with a custom IndexDeletionPolicy. You should pass readOnly=true, since it gives much better concurrent performance, unless you intend to do write operations (delete documents or change norms) with the reader.

Parameters
directory the index directory
deletionPolicy a custom deletion policy (only used if you use this reader to perform deletes or to set norms); see IndexWriter for details.
readOnly true if no changes (deletions, norms) will be made with this IndexReader
termInfosIndexDivisor Subsamples which indexed terms are loaded into RAM. This has the same effect as setTermIndexInterval(int) except that setting must be done at indexing time while this setting can be set per reader. When set to N, then one in every N*termIndexInterval terms in the index is loaded into memory. By setting this to a value > 1 you can reduce memory usage, at the expense of higher latency when loading a TermInfo. The default value is 1. Set this to -1 to skip loading the terms index entirely.
Throws
CorruptIndexException if the index is corrupt
IOException if there is a low-level IO error

public void removeReaderFinishedListener (IndexReader.ReaderFinishedListener listener)

Expert: remove a previously added IndexReader.ReaderFinishedListener.

public synchronized IndexReader reopen (IndexCommit commit)

Expert: reopen this reader on a specific commit point. This always returns a readOnly reader. If the specified commit point matches what this reader is already on, and this reader is already readOnly, then this same instance is returned; if it is not already readOnly, a readOnly clone is returned.

public synchronized IndexReader reopen ()

Refreshes an IndexReader if the index has changed since this instance was (re)opened.

Opening an IndexReader is an expensive operation. This method can be used to refresh an existing IndexReader to reduce these costs. This method tries to only load segments that have changed or were created after the IndexReader was (re)opened.

If the index has not changed since this instance was (re)opened, then this call is a NOOP and returns this instance. Otherwise, a new instance is returned. The old instance is not closed and remains usable.

If the reader is reopened, even though they share resources internally, it's safe to make changes (deletions, norms) with the new reader. All shared mutable state obeys "copy on write" semantics to ensure the changes are not seen by other readers.
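
The "copy on write" idea can be pictured with a toy model. The sketch below is not Lucene's implementation; the class CowDeletions and its methods are hypothetical and only show how two instances can share a deletions bit set until one of them writes to it.

```java
import java.util.BitSet;

// Toy model of copy-on-write sharing; not Lucene's actual implementation.
class CowDeletions {
    private BitSet deleted;  // may be shared with another instance
    private boolean shared;  // true while another instance may reference the same BitSet

    CowDeletions(int maxDoc) {
        this.deleted = new BitSet(maxDoc);
        this.shared = false;
    }

    private CowDeletions(BitSet sharedBits) {
        this.deleted = sharedBits;
        this.shared = true;
    }

    /** Models a reopen: the new instance initially shares state with this one. */
    CowDeletions reopenSharing() {
        this.shared = true;
        return new CowDeletions(this.deleted);
    }

    /** Copies the shared state before the first write so the other instance is unaffected. */
    void delete(int docNum) {
        if (shared) {
            deleted = (BitSet) deleted.clone();
            shared = false;
        }
        deleted.set(docNum);
    }

    boolean isDeleted(int docNum) {
        return deleted.get(docNum);
    }

    public static void main(String[] args) {
        CowDeletions oldReader = new CowDeletions(10);
        oldReader.delete(3);
        CowDeletions newReader = oldReader.reopenSharing();
        newReader.delete(7);                         // triggers the private copy
        System.out.println(oldReader.isDeleted(7));  // false: the old instance is unaffected
        System.out.println(newReader.isDeleted(3));  // true: earlier state was inherited
    }
}
```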

You can determine whether a reader was actually reopened by comparing the old instance with the instance returned by this method:

 IndexReader reader = ... 
 ...
 IndexReader newReader = reader.reopen();
 if (newReader != reader) {
 ...     // reader was reopened
   reader.close(); 
 }
 reader = newReader;
 ...
 
Be sure to synchronize that code so that other threads, if present, can never use reader after it has been closed and before it's switched to newReader.
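
One possible way to arrange that synchronization is sketched below. It is a minimal sketch under stated assumptions: Reopenable is a hypothetical stand-in for the reopen()/close() part of IndexReader (the real methods also declare IOException), and ReaderHolder, ReaderAction and DemoReader are illustrative names, not Lucene classes.

```java
// Sketch of a synchronized reader swap using hypothetical stand-in types.
class ReaderHolder {

    /** Stand-in for the reopen()/close() part of IndexReader (assumption, simplified). */
    interface Reopenable {
        Reopenable reopen(); // returns this if nothing changed, else a new instance
        void close();
    }

    /** Work to run against the current reader while the lock is held. */
    interface ReaderAction<T> {
        T apply(Reopenable reader);
    }

    private Reopenable reader;

    ReaderHolder(Reopenable initial) {
        this.reader = initial;
    }

    /** Runs the action under the lock, so the reader cannot be closed mid-use. */
    synchronized <T> T withReader(ReaderAction<T> action) {
        return action.apply(reader);
    }

    /** Reopens and swaps under the same lock; returns true if a new reader was installed. */
    synchronized boolean refresh() {
        Reopenable newReader = reader.reopen();
        if (newReader == reader) {
            return false;      // index unchanged
        }
        reader.close();        // release the old instance
        reader = newReader;    // switch before any other thread can look
        return true;
    }

    /** Demo implementation that simulates index changes with a version counter. */
    static final class DemoReader implements Reopenable {
        static int indexVersion = 0;
        final int version = indexVersion;
        boolean closed = false;

        public Reopenable reopen() {
            return version == indexVersion ? this : new DemoReader();
        }

        public void close() {
            closed = true;
        }
    }

    public static void main(String[] args) {
        DemoReader first = new DemoReader();
        ReaderHolder holder = new ReaderHolder(first);
        System.out.println(holder.refresh()); // false: nothing changed
        DemoReader.indexVersion++;            // simulate a change to the index
        System.out.println(holder.refresh()); // true: a new reader was swapped in
        System.out.println(first.closed);     // true: the old reader was closed
    }
}
```

Running every search inside the lock serializes searches, so this shape favors simplicity over throughput; an application with many concurrent searches would probably prefer a reference-counting scheme instead.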

NOTE: If this reader is a near real-time reader (obtained from getReader()), reopen() will simply call writer.getReader() again for you, though this may change in the future.

Throws
CorruptIndexException if the index is corrupt
IOException if there is a low-level IO error

public IndexReader reopen (IndexWriter writer, boolean applyAllDeletes)

Expert: returns a read-only reader, covering all committed as well as uncommitted changes to the index. This provides "near real-time" searching, in that changes made during an IndexWriter session can be quickly made available for searching without closing the writer or calling commit().

Note that this is functionally equivalent to calling flush (an internal IndexWriter operation) and then using open(IndexCommit, boolean) to open a new reader. But the turnaround time of this method should be faster since it avoids the potentially costly commit().

You must close the IndexReader returned by this method once you are done using it.

It's near real-time because there is no hard guarantee on how quickly you can get a new reader after making changes with IndexWriter. You'll have to experiment in your situation to determine if it's fast enough. As this is a new and experimental feature, please report back on your findings so we can learn, improve and iterate.

The resulting reader supports reopen(), but that call will simply forward back to this method (though this may change in the future).

The very first time this method is called, this writer instance will make every effort to pool the readers that it opens for doing merges, applying deletes, etc. This means additional resources (RAM, file descriptors, CPU time) will be consumed.

For lower latency on reopening a reader, you should call setMergedSegmentWarmer(IndexWriter.IndexReaderWarmer) to pre-warm a newly merged segment before it's committed to the index. This is important for minimizing index-to-search delay after a large merge.

If an addIndexes* call is running in another thread, then this reader will only search those segments from the foreign index that have been successfully copied over so far.

NOTE: Once the writer is closed, any outstanding readers may continue to be used. However, if you attempt to reopen any of those readers, you'll hit an AlreadyClosedException.

Parameters
writer The IndexWriter to open from
applyAllDeletes If true, all buffered deletes will be applied (made visible) in the returned reader. If false, the deletes are not applied but remain buffered (in IndexWriter) so that they will be applied in the future. Applying deletes can be costly, so if your app can tolerate deleted documents being returned you might gain some performance by passing false.
Returns
  • IndexReader that covers entire index plus all changes made so far by this IndexWriter instance
Throws
CorruptIndexException if the index is corrupt
IOException if there is a low-level IO error

public synchronized IndexReader reopen (boolean openReadOnly)

Just like reopen(), except you can change the readOnly setting of the original reader. If the index is unchanged but readOnly differs from the original reader's setting, a new reader is returned.

@Deprecated public void setNorm (int doc, String field, float value)

This method is deprecated.
Use setNorm(int, String, byte) instead, encoding the float to byte with your Similarity's encodeNormValue(float). This method will be removed in Lucene 4.0.

Expert: Resets the normalization factor for the named field of the named document.

Throws
StaleReaderException if the index has changed since this reader was opened
CorruptIndexException if the index is corrupt
LockObtainFailedException if another writer has this index open (write.lock could not be obtained)
IOException if there is a low-level IO error

public synchronized void setNorm (int doc, String field, byte value)

Expert: Resets the normalization factor for the named field of the named document. The norm represents the product of the field's boost and its length normalization. Thus, to preserve the length normalization values when resetting this, one should base the new value upon the old. NOTE: If this field does not store norms, then this method call will silently do nothing.

Throws
StaleReaderException if the index has changed since this reader was opened
CorruptIndexException if the index is corrupt
LockObtainFailedException if another writer has this index open (write.lock could not be obtained)
IOException if there is a low-level IO error
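
The advice to base the new value on the old one comes down to simple arithmetic on the decoded norm. The sketch below works on plain floats only; NormRescale and rescale are hypothetical names, and it assumes you already know the boost that produced the old norm. Encoding the result to a byte (for example with your Similarity's encodeNormValue(float)) is lossy and is left out here.

```java
// Hypothetical helper, not part of Lucene: rescales a decoded norm to a new boost.
class NormRescale {

    /**
     * The norm is the product boost * lengthNorm. Dividing the old norm by the old
     * boost recovers the length normalization, which is then multiplied by the new boost.
     */
    static float rescale(float oldNorm, float oldBoost, float newBoost) {
        if (oldBoost == 0f) {
            throw new IllegalArgumentException("oldBoost must be non-zero");
        }
        float lengthNorm = oldNorm / oldBoost; // preserved factor
        return newBoost * lengthNorm;
    }

    public static void main(String[] args) {
        // Assumed values: a field indexed with boost 1.0 whose decoded norm is 0.5.
        System.out.println(rescale(0.5f, 1.0f, 2.0f)); // 1.0
    }
}
```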

public abstract TermDocs termDocs ()

Returns an unpositioned TermDocs enumerator.

Note: the TermDocs returned is unpositioned. Before using it, ensure that you first position it with seek(Term) or seek(TermEnum).

Throws
IOException if there is a low-level IO error

public TermDocs termDocs (Term term)

Returns an enumeration of all the documents which contain term. For each document, the document number and the frequency of the term in that document are provided, for use in search scoring. If term is null, then all non-deleted docs are returned with freq=1. Thus, this method implements the mapping:

    Term    =>    <docNum, freq>*

The enumeration is ordered by document number. Each document number is greater than all that precede it in the enumeration.

Throws
IOException if there is a low-level IO error
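
To make the Term => <docNum, freq>* mapping concrete, here is a toy in-memory model. It illustrates only the shape of the mapping, not Lucene's index format; ToyPostings and its methods are hypothetical, and whitespace tokenization is an assumption made for brevity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy in-memory postings; illustrates the mapping only, not Lucene's index format.
class ToyPostings {
    // term text -> (docNum -> freq); TreeMap keeps docNums in increasing order
    private final Map<String, TreeMap<Integer, Integer>> postings = new TreeMap<>();

    void addDocument(int docNum, String text) {
        for (String token : text.toLowerCase().split("\\s+")) {
            if (token.isEmpty()) {
                continue;
            }
            postings.computeIfAbsent(token, k -> new TreeMap<>())
                    .merge(docNum, 1, Integer::sum); // count occurrences per document
        }
    }

    /** Returns {docNum, freq} pairs for the term, ordered by document number. */
    List<int[]> termDocs(String term) {
        List<int[]> result = new ArrayList<>();
        TreeMap<Integer, Integer> docs = postings.get(term);
        if (docs != null) {
            for (Map.Entry<Integer, Integer> e : docs.entrySet()) {
                result.add(new int[] {e.getKey(), e.getValue()});
            }
        }
        return result;
    }

    public static void main(String[] args) {
        ToyPostings index = new ToyPostings();
        index.addDocument(0, "to be or not to be");
        index.addDocument(1, "nothing here");
        index.addDocument(2, "be quick");
        for (int[] entry : index.termDocs("be")) {
            System.out.println("doc=" + entry[0] + " freq=" + entry[1]);
        }
        // prints doc=0 freq=2, then doc=2 freq=1
    }
}
```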

public TermPositions termPositions (Term term)

Returns an enumeration of all the documents which contain term. For each document, in addition to the document number and frequency of the term in that document, a list of all of the ordinal positions of the term in the document is available. Thus, this method implements the mapping:

    Term    =>    <docNum, freq, <pos_1, pos_2, ... pos_(freq-1)> >*

This positional information facilitates phrase and proximity searching.

The enumeration is ordered by document number. Each document number is greater than all that precede it in the enumeration.

Throws
IOException if there is a low-level IO error
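
The following toy model shows why per-document positions are enough to answer a two-word phrase query. It is a sketch, not Lucene's phrase matching; ToyPositions and phraseDocs are hypothetical names, positions are counted from 0, and tokens are split on whitespace.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy positional postings; illustrates the idea only, not Lucene's implementation.
class ToyPositions {
    // term text -> (docNum -> ascending token positions)
    private final Map<String, TreeMap<Integer, List<Integer>>> index = new HashMap<>();

    void addDocument(int docNum, String text) {
        String[] tokens = text.toLowerCase().split("\\s+");
        for (int pos = 0; pos < tokens.length; pos++) {
            index.computeIfAbsent(tokens[pos], k -> new TreeMap<>())
                 .computeIfAbsent(docNum, k -> new ArrayList<>())
                 .add(pos);
        }
    }

    /** Document numbers, in increasing order, where second directly follows first. */
    List<Integer> phraseDocs(String first, String second) {
        List<Integer> hits = new ArrayList<>();
        TreeMap<Integer, List<Integer>> a = index.get(first);
        TreeMap<Integer, List<Integer>> b = index.get(second);
        if (a == null || b == null) {
            return hits;
        }
        for (Map.Entry<Integer, List<Integer>> e : a.entrySet()) {
            List<Integer> secondPositions = b.get(e.getKey());
            if (secondPositions == null) {
                continue; // second term is absent from this document
            }
            for (int pos : e.getValue()) {
                if (secondPositions.contains(pos + 1)) {
                    hits.add(e.getKey());
                    break;
                }
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        ToyPositions index = new ToyPositions();
        index.addDocument(0, "new york city");
        index.addDocument(1, "york is new");
        index.addDocument(2, "a new york");
        System.out.println(index.phraseDocs("new", "york")); // [0, 2]
    }
}
```

A proximity query would relax the `pos + 1` test to a window of allowed distances.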

public abstract TermPositions termPositions ()

Returns an unpositioned TermPositions enumerator.

Throws
IOException if there is a low-level IO error

public abstract TermEnum terms (Term t)

Returns an enumeration of all terms starting at a given term. If the given term does not exist, the enumeration is positioned at the first term greater than the supplied term. The enumeration is ordered by Term.compareTo(). Each term is greater than all that precede it in the enumeration.

Throws
IOException if there is a low-level IO error
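
The seek behavior can be pictured with a sorted set. In this sketch FieldTerm is a hypothetical stand-in for Term, assumed to order by field name and then by term text; ToyTermEnum and termsFrom are illustrative names only.

```java
import java.util.SortedSet;
import java.util.TreeSet;

// Toy model of "enumerate all terms starting at a given term"; not Lucene code.
class ToyTermEnum {

    /** Hypothetical stand-in for Term: ordered by field, then by text. */
    static final class FieldTerm implements Comparable<FieldTerm> {
        final String field;
        final String text;

        FieldTerm(String field, String text) {
            this.field = field;
            this.text = text;
        }

        @Override
        public int compareTo(FieldTerm other) {
            int c = field.compareTo(other.field);
            return c != 0 ? c : text.compareTo(other.text);
        }

        @Override
        public String toString() {
            return field + ":" + text;
        }
    }

    /** All terms greater than or equal to start, in sorted order. */
    static SortedSet<FieldTerm> termsFrom(TreeSet<FieldTerm> all, FieldTerm start) {
        return all.tailSet(start, true);
    }

    public static void main(String[] args) {
        TreeSet<FieldTerm> all = new TreeSet<>();
        all.add(new FieldTerm("title", "apple"));
        all.add(new FieldTerm("body", "cherry"));
        all.add(new FieldTerm("body", "apple"));
        // "body:banana" does not exist, so enumeration starts at the next greater term.
        System.out.println(termsFrom(all, new FieldTerm("body", "banana")));
        // prints [body:cherry, title:apple]
    }
}
```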

public abstract TermEnum terms ()

Returns an enumeration of all the terms in the index. The enumeration is ordered by Term.compareTo(). Each term is greater than all that precede it in the enumeration. Note that after calling terms(), next() must be called on the resulting enumeration before calling other methods such as term().

Throws
IOException if there is a low-level IO error

public String toString ()

public synchronized void undeleteAll ()

Undeletes all documents currently marked as deleted in this index.

NOTE: this method can only recover documents marked for deletion but not yet removed from the index; when and how Lucene removes deleted documents is an implementation detail, subject to change from release to release. However, you can use numDeletedDocs() on the current IndexReader instance to see how many documents will be un-deleted.

Throws
StaleReaderException if the index has changed since this reader was opened
LockObtainFailedException if another writer has this index open (write.lock could not be obtained)
CorruptIndexException if the index is corrupt
IOException if there is a low-level IO error

Protected Methods

protected synchronized void acquireWriteLock ()

Does nothing by default. Subclasses that require a write lock for index modifications must implement this method.

Throws
IOException

protected final synchronized void commit ()

Commits changes resulting from delete, undeleteAll, or setNorm operations. If an exception is hit, then either no changes or all changes will have been committed to the index (transactional semantics).

Throws
IOException if there is a low-level IO error

protected abstract void doClose ()

Implements close.

Throws
IOException

protected abstract void doCommit (Map<String, String> commitUserData)

Implements commit.

Throws
IOException

protected abstract void doDelete (int docNum)

Implements deletion of the document numbered docNum. Applications should call deleteDocument(int) or deleteDocuments(Term).

protected abstract void doSetNorm (int doc, String field, byte value)

Implements setNorm in subclass.

protected abstract void doUndeleteAll ()

Implements actual undeleteAll() in subclass.

protected final void ensureOpen ()

Throws
AlreadyClosedException if this IndexReader is closed

protected void notifyReaderFinishedListeners ()

protected void readerFinished ()