Package org.apache.lucene.search.spans

The calculus of spans.

A span is a <doc,startPosition,endPosition> tuple.

The following span query operators are implemented:

  • A SpanTermQuery matches all spans containing a particular Term.
  • A SpanNearQuery matches spans which occur near one another, and can be used to implement things like phrase search (when constructed from SpanTermQueries) and inter-phrase proximity (when constructed from other SpanNearQueries).
  • A SpanOrQuery merges spans from a number of other SpanQueries.
  • A SpanNotQuery removes spans matching one SpanQuery which overlap spans matching another. This can be used, e.g., to implement within-paragraph search.
  • A SpanFirstQuery matches spans matching q whose end position is less than n. This can be used to constrain matches to the first part of the document.

In all cases, output spans are minimally inclusive. In other words, a span formed by matching a span in x and a span in y starts at the lesser of the two starts and ends at the greater of the two ends.
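
The "minimally inclusive" rule can be illustrated with a small stand-alone sketch. It does not use the Lucene API: the Span class and the cover method below are hypothetical names introduced only for illustration, and the sketch assumes both input spans lie in the same document.

```java
// Illustrative sketch only; these types are NOT part of Lucene.
class SpanCoverSketch {

    // Hypothetical value type for a <doc,startPosition,endPosition> tuple.
    static final class Span {
        final int doc;
        final int start;
        final int end;

        Span(int doc, int start, int end) {
            this.doc = doc;
            this.start = start;
            this.end = end;
        }
    }

    // Builds the smallest span that includes both x and y:
    // it starts at the lesser start and ends at the greater end.
    static Span cover(Span x, Span y) {
        if (x.doc != y.doc) {
            throw new IllegalArgumentException("spans must be in the same document");
        }
        return new Span(x.doc, Math.min(x.start, y.start), Math.max(x.end, y.end));
    }
}
```

For instance, covering a span at positions 3 to 5 and a span at positions 9 to 11 of the same document gives a span from 3 to 11.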

For example, a span query which matches "John Kerry" within ten words of "George Bush" within the first 100 words of the document could be constructed with:

SpanQuery john   = new SpanTermQuery(new Term("content", "john"));
SpanQuery kerry  = new SpanTermQuery(new Term("content", "kerry"));
SpanQuery george = new SpanTermQuery(new Term("content", "george"));
SpanQuery bush   = new SpanTermQuery(new Term("content", "bush"));

SpanQuery johnKerry =
   new SpanNearQuery(new SpanQuery[] {john, kerry}, 0, true);

SpanQuery georgeBush =
   new SpanNearQuery(new SpanQuery[] {george, bush}, 0, true);

SpanQuery johnKerryNearGeorgeBush =
   new SpanNearQuery(new SpanQuery[] {johnKerry, georgeBush}, 10, false);

SpanQuery johnKerryNearGeorgeBushAtStart =
   new SpanFirstQuery(johnKerryNearGeorgeBush, 100);

Span queries may be freely intermixed with other Lucene queries. So, for example, the above query can be restricted to documents which also use the word "iraq" with:

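// Both clauses are required (MUST): a document must match the span query and contain "iraq".
BooleanQuery query = new BooleanQuery();
query.add(johnKerryNearGeorgeBushAtStart, BooleanClause.Occur.MUST);
query.add(new TermQuery(new Term("content", "iraq")), BooleanClause.Occur.MUST);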

Classes

FieldMaskingSpanQuery Wrapper to allow SpanQuery objects to participate in composite single-field SpanQueries by 'lying' about their search field. 

NearSpansOrdered A Spans that is formed from the ordered subspans of a SpanNearQuery where the subspans do not overlap and have a maximum slop between them. 
NearSpansUnordered Similar to NearSpansOrdered, but for the unordered case. 
SpanFirstQuery Matches spans near the beginning of a field. 
SpanMultiTermQueryWrapper<Q extends MultiTermQuery> Wraps any MultiTermQuery as a SpanQuery, so it can be nested within other SpanQuery classes. 
SpanMultiTermQueryWrapper.SpanRewriteMethod Abstract class that defines how the query is rewritten. 
SpanMultiTermQueryWrapper.TopTermsSpanBooleanQueryRewrite A rewrite method that first translates each term into a SpanTermQuery in a SHOULD clause in a BooleanQuery, and keeps the scores as computed by the query. 
SpanNearPayloadCheckQuery Only return those matches that have a specific payload at the given position. 
SpanNearQuery Matches spans which are near one another. 
SpanNotQuery Removes matches which overlap with another SpanQuery. 
SpanOrQuery Matches the union of its clauses. 
SpanPayloadCheckQuery Only return those matches that have a specific payload at the given position. 
SpanPositionCheckQuery  
SpanPositionCheckQuery.PositionCheckSpan  
SpanPositionRangeQuery Checks whether the span returned by getMatch() lies between a start and end position. 
SpanQuery Base class for span-based queries. 
Spans Expert: an enumeration of span matches. 
SpanScorer Public for extension only. 
SpanTermQuery Matches spans containing a term. 
SpanWeight Expert-only. 
TermSpans Expert: Public for extension only. 

Enums

SpanPositionCheckQuery.AcceptStatus Return value indicating whether the match should be accepted (YES), rejected (NO), or rejected with the enumeration advancing to the next document (NO_AND_ADVANCE).
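
How the three status values might drive an enumeration can be sketched without the Lucene API. Everything below is a hypothetical stand-in: the enum mirrors only the documented value names, and Span, PositionCheck, filter and endsBefore are illustrative names, not Lucene classes or methods. The sketch assumes the input spans are ordered by document and then by start position.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; these types are NOT part of Lucene.
class AcceptStatusSketch {

    // Stand-in mirroring the three documented status values.
    enum AcceptStatus { YES, NO, NO_AND_ADVANCE }

    // Hypothetical <doc,startPosition,endPosition> tuple.
    static final class Span {
        final int doc;
        final int start;
        final int end;

        Span(int doc, int start, int end) {
            this.doc = doc;
            this.start = start;
            this.end = end;
        }
    }

    // Hypothetical per-span position check.
    interface PositionCheck {
        AcceptStatus accept(Span span);
    }

    // Keeps spans the check accepts. After NO_AND_ADVANCE, the remaining
    // spans of that document are skipped without being checked.
    static List<Span> filter(List<Span> spans, PositionCheck check) {
        List<Span> accepted = new ArrayList<>();
        int skippedDoc = -1; // no document is skipped initially
        for (Span span : spans) {
            if (span.doc == skippedDoc) {
                continue;
            }
            switch (check.accept(span)) {
                case YES:
                    accepted.add(span);
                    break;
                case NO:
                    break; // reject this span, keep scanning the same document
                case NO_AND_ADVANCE:
                    skippedDoc = span.doc; // reject and move to the next document
                    break;
            }
        }
        return accepted;
    }

    // Example check: accept spans whose end position is less than n.
    // Once a span starts at or beyond n, no later span in that document
    // can match (given the assumed ordering), so the document is abandoned.
    static PositionCheck endsBefore(int n) {
        return span -> {
            if (span.start >= n) {
                return AcceptStatus.NO_AND_ADVANCE;
            }
            return span.end < n ? AcceptStatus.YES : AcceptStatus.NO;
        };
    }
}
```

The point of the third value in this sketch is efficiency: NO rejects a single span, while NO_AND_ADVANCE lets the enumeration abandon a document as soon as no later span in it can be accepted.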