Pythonic wrapper around the PyLucene search engine.
Provides high-level interfaces to indexes and documents, abstracting away java lucene primitives.
Wrappers for lucene Index{Read,Search,Writ}ers.
The final Indexer class exposes a high-level Searcher and Writer.
Create an iterable lucene TokenFilter from a TokenStream. In lucene version 2, call on any iterable of tokens. In lucene version 3, subclass and override incrementToken(); attributes are cached as properties to create a Token interface.
Advance to next token and return whether the stream is not empty.
Start and stop character offset.
Payload bytes.
Position relative to the previous token.
Term text.
Lexical type.
Return a lucene Analyzer which chains together a tokenizer and filters.
Return parsed lucene Query.
Return lucene TokenStream from text.
Delegated lucene IndexReader, with a mapping interface of ids to document objects.
Return cache of field values suitable for sorting. Parsing values into an array optimizes for memory; mapping values into a list optimizes for speed.
Copy the index to the destination directory. Optimized to use hard links if the destination is a file system path.
Return number of documents with given term.
reader’s lucene Directory
Generate doc ids which contain given term, optionally with frequency counts.
Return MoreLikeThis query for document.
Return field names, given option description.
Generate decoded numeric term values, optionally with frequency counts.
Generate doc ids and positions which contain given term, optionally only with payloads.
Generate terms and positions for given doc id and field, optionally with character offsets.
Generate docs with occurrence counts for a span query.
Generate a slice of term values, optionally with frequency counts. Supports a range of terms, wildcard terms, or fuzzy terms.
Generate terms for given doc id and field, optionally with frequency counts.
Mixin interface common among searchers.
Closes index.
Mapping of cached filters by field, which are used for facet counts.
Mapping of cached sorters by field and associated parsers.
Mapping of cached spellcheckers by field.
Return IndexReader.comparator() using a cached SortField if available.
Generate potential words ordered by increasing edit distance and decreasing frequency. For optimal performance only iterate the required slice size of corrections.
Return number of hits for given query or term.
Return mapping of document counts for the intersection with each facet.
Return highlighted text fragments which match the query, using internal highlighter().
Return Highlighter specific to the searcher’s analyzer and index.
Return current Searcher, only creating a new one if necessary.
Run query and return Hits.
Return SortField with cached attributes if available.
Return and cache spellchecker for given field.
Return ordered suggested words for prefix.
Bases: engine.indexers.Searcher, IndexSearcher, engine.indexers.IndexReader
Inherited lucene IndexSearcher, with a mixed-in IndexReader.
Bases: engine.indexers.Searcher, MultiSearcher, engine.indexers.IndexReader
Inherited lucene MultiSearcher. All sub searchers will be closed when the MultiSearcher is closed.
Bases: engine.indexers.MultiSearcher, ParallelMultiSearcher
Inherited lucene ParallelMultiSearcher.
Bases: IndexWriter
Inherited lucene IndexWriter. Supports setting fields parameters explicitly, so documents can be represented as dictionaries.
Mapping of assigned fields. May be used directly, instead of set() method, for further customization.
Closes index.
Add directory (or reader, searcher, writer) to index.
Add document to index. A document consists of name: value pairs, where each value may be one or multiple strings.
Remove documents which match given query or term.
segment filenames with document counts
Bases: engine.indexers.IndexWriter
An all-purpose interface to an index. Creates an IndexWriter with a delegated IndexSearcher.
Commit writes and refresh searcher. Not thread-safe.
Wrappers for lucene Fields and Documents.
Delegated lucene Document. Provides mapping interface of field names to values, but duplicate field names are allowed.
Return dict representation of document.
Generate lucene Fields.
Return field value if present, else default.
Return list of all values for given field.
Generate name, value pairs for all fields.
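The mapping interface described for documents, where duplicate field names are allowed, can be modeled in plain Python. This is a toy sketch only: `MultiDict` is a hypothetical name, it does not wrap a lucene Document, and the handling of fields not requested as lists (last value wins) is an assumption.

```python
class MultiDict:
    """Toy mapping of field names to values which allows duplicate names."""

    def __init__(self, pairs=()):
        self.pairs = list(pairs)  # ordered (name, value) pairs

    def items(self):
        """Generate name, value pairs for all fields."""
        return iter(self.pairs)

    def getlist(self, name):
        """Return list of all values for given field."""
        return [value for key, value in self.pairs if key == name]

    def get(self, name, default=None):
        """Return first field value if present, else default."""
        values = self.getlist(name)
        return values[0] if values else default

    def dict(self, *names):
        """Return dict representation; fields listed in names map to lists."""
        result = {}
        for key, value in self.pairs:
            if key in names:
                result.setdefault(key, []).append(value)
            else:
                result[key] = value  # assumption: last value wins
        return result


doc = MultiDict([('name', 'sample'), ('tag', 'python'), ('tag', 'search')])
```

Here `doc.get('tag')` yields the first tag, while `doc.getlist('tag')` and `doc.dict('tag')` expose both values.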
Search results: lazily evaluated and memory efficient. Provides a read-only sequence interface to hit objects.
Generate zipped ids and scores.
Saved parameters which can generate lucene Fields given values.
Generate lucene Fields suitable for adding to a document.
Bases: engine.documents.Field
Field which uses string formatting on its values.
Return formatted value.
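One common reason to format values before indexing is to make numbers sort correctly as strings. The sketch below illustrates that idea with `str.format`; the helper name `format_value` and the choice of format specification are assumptions, not the field's actual signature.

```python
def format_value(value, format_spec='{}'):
    """Return value rendered with the given format specification."""
    return format_spec.format(value)


# Zero padding makes lexicographic order agree with numeric order.
padded = format_value(42, '{:08d}')
ordered = sorted(format_value(n, '{:03d}') for n in (10, 9, 100))
```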
Generate fields with formatted values.
Bases: engine.documents.Field
Field which indexes every prefix of a value into a separate component field. The customizable component field names are expressed as slices. Original value may be stored for convenience.
Return prefix field name for given depth.
Return range of valid depth indices.
Generate indexed component fields. Optimized to handle duplicate values.
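As an illustration of prefix indexing, the following sketch generates one component field per prefix depth, naming each component with slice notation. The function name `prefix_fields`, its parameters, and the exact `name[:depth]` naming format are assumptions made for the example.

```python
def prefix_fields(name, value, start=1, stop=None):
    """Generate (field name, prefix) pairs for each prefix depth of value."""
    stop = len(value) if stop is None else min(stop, len(value))
    for depth in range(start, stop + 1):
        yield '{}[:{}]'.format(name, depth), value[:depth]


pairs = list(prefix_fields('city', 'rome'))
```

A prefix search for `ro` then only needs an exact term match against the depth-2 component field, rather than a scan over all terms.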
Return text from separate words.
Return prefix query of the closest possible prefixed field.
Return range query of the closest possible prefixed field.
Return immutable sequence of words from name or value.
Bases: engine.documents.PrefixField
Field which indexes every component into its own field.
Return component field name for given depth.
Return text from separate words.
Return immutable sequence of words from name or value.
Bases: engine.documents.PrefixField
Field which indexes each datetime component in sortable ISO format: Y-m-d H:M:S. Works with datetimes, dates, and any object whose string form is a prefix of ISO.
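To show why ISO format suits prefix indexing, the sketch below splits the string form of a date or datetime into successively longer components. The helper name `iso_prefixes` is hypothetical, and truncating at fixed character offsets is an assumption about one simple way to do it.

```python
import datetime


def iso_prefixes(value):
    """Return successively longer ISO prefixes of a date, datetime, or string."""
    text = str(value)[:19]  # drop microseconds and offset, if any
    # component boundaries within 'YYYY-mm-dd HH:MM:SS'
    stops = (4, 7, 10, 13, 16, 19)
    return [text[:stop] for stop in stops if stop <= len(text)]


day = iso_prefixes(datetime.date(2010, 4, 5))
moment = iso_prefixes(datetime.datetime(2010, 4, 5, 12, 30, 15))
```

Because every component is zero padded, sorting these strings lexicographically also sorts them chronologically.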
Return date range query within time span of date.
Return component field name for given depth.
Return datetime components in ISO format.
Return prefix query of the datetime.
Return optimal union of date range queries. May produce invalid dates, but the query is still correct.
Return immutable sequence of datetime components.
Return date range query within current time and delta. If the delta is an exact number of days, then dates will be used.
Wrappers for lucene NumericFields. Alternative implementations of spatial and datetime fields.
Bases: engine.documents.Field
Field which indexes numbers in a prefix tree.
Generate lucene NumericFields suitable for adding to a document.
Return lucene NumericRangeQuery.
Bases: engine.spatial.SpatialField, engine.numeric.NumericField
Geospatial points, which create a tiered index of tiles. Points must still be stored if exact distances are required upon retrieval.
Generate tiles from points (lng, lat).
Return range query which is equivalent to the prefix of the tile.
Generate tile values from points (lng, lat).
Return range queries for any tiles which could be within distance of given point.
Bases: engine.numeric.PointField
PointField which implicitly supports polygons (technically linear rings of points). Differs from points in that all necessary tiles are included to match the points’ boundary. As with PointField, the tiered tiles are a search optimization, not a distance calculator.
Generate all covered tiles from polygons.
Bases: engine.numeric.NumericField
Field which indexes datetimes as a NumericField of timestamps. Supports datetimes, dates, and any prefix of time tuples.
Return date range query within time span of date.
Generate lucene NumericFields of timestamps.
Return range query which matches the date prefix.
Return NumericRangeQuery of timestamps.
Return utc timestamp from date or time tuple.
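A conversion of this kind can be sketched with the standard library's `calendar.timegm`. The function name `utc_timestamp`, the padding defaults for partial time tuples, and the treatment of naive datetimes as UTC are all assumptions of this example.

```python
import calendar
import datetime


def utc_timestamp(date):
    """Return UTC timestamp from a date, datetime, or prefix of a time tuple."""
    if isinstance(date, datetime.date):  # also covers datetime instances
        return calendar.timegm(date.timetuple())
    # pad a partial tuple such as (2010,) or (2010, 4) with defaults
    defaults = (1970, 1, 1, 0, 0, 0)
    fields = tuple(date) + defaults[len(date):]
    return calendar.timegm(fields)


new_year = utc_timestamp((2010,))
```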
Return date range query within current time and delta. If the delta is an exact number of days, then dates will be used.
Query wrappers and search utilities.
Inherited lucene Query, with dynamic base class acquisition. Uses class methods and operator overloading for convenient query construction.
<BooleanQuery +self +other>
<BooleanQuery self other>
<BooleanQuery self -other>
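The three boolean forms above correspond to required, optional, and prohibited clauses. The toy class below mimics that operator overloading by building query strings; `Clause` is a hypothetical stand-in which does not construct lucene queries, and the mapping of `&`, `|`, and `-` to the three forms is an assumption of the sketch.

```python
class Clause:
    """Toy query which renders boolean combinations as query strings."""

    def __init__(self, text):
        self.text = text

    def __and__(self, other):  # both clauses required
        return Clause('+{} +{}'.format(self.text, other.text))

    def __or__(self, other):  # either clause may match
        return Clause('{} {}'.format(self.text, other.text))

    def __sub__(self, other):  # second clause prohibited
        return Clause('{} -{}'.format(self.text, other.text))

    def __str__(self):
        return self.text


red, blue = Clause('text:red'), Clause('text:blue')
```

The sketch handles a single level of combination only; nested combinations would need parentheses to render unambiguously.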
Return BooleanQuery (AND) from queries and terms.
Return BooleanQuery (OR) from queries and terms.
Return lucene DisjunctionMaxQuery from queries and terms.
Return lucene CachingWrapperFilter, optionally just QueryWrapperFilter.
Return lucene FuzzyQuery.
Return lucene MultiPhraseQuery. None may be used as a placeholder.
Return SpanNearQuery from terms. Term values which supply another field name will be masked.
Return lucene PhraseQuery. None may be used as a placeholder.
Return lucene PrefixQuery.
Return lucene RangeQuery, by default with a half-open interval.
Return SpanTermQuery.
Return lucene TermQuery.
Generate set of query term items.
Return lucene WildcardQuery.
Inherited lucene SpanQuery with additional span constructors.
<SpanFirstQuery: spanFirst(self, other.stop)>
<SpanNotQuery: spanNot(self, other)>
<SpanOrQuery: spanOr(spans)>
Return lucene FieldMaskingSpanQuery, which allows combining span queries from different fields.
Return lucene SpanNearQuery from SpanQueries.
Inherited lucene Filter with a cached BitSet of ids.
Return cached BitSet, reader is ignored. Deprecated.
Return cached OpenBitSet, reader is ignored.
Return intersection count of the filters.
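Counting the intersection of two cached bit sets is what makes facet counts cheap. The sketch below models bit sets as Python integers; the helper names `bitset` and `overlap` are hypothetical and unrelated to lucene's BitSet classes.

```python
def bitset(ids):
    """Return an integer with one bit set per document id."""
    bits = 0
    for doc_id in ids:
        bits |= 1 << doc_id
    return bits


def overlap(left, right):
    """Return number of ids present in both bit sets."""
    return bin(left & right).count('1')


query_hits = bitset([0, 2, 5])
facet = bitset([2, 5, 7])
count = overlap(query_hits, facet)
```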
Inherited lucene SortField used for caching FieldCache parsers.
Return indexed values from default FieldCache using the given reader.
Inherited lucene Filter with stored analysis options.
Return highlighted text fragments.
Bases: dict
Correct spellings and suggest words for queries. Supply a vocabulary mapping words to (reverse) sort keys, such as document frequencies.
Generate ordered sets of words by increasing edit distance.
Return set of potential words one edit distance away, mapped to valid prefix lengths.
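The idea of generating candidates one edit away can be sketched with the well-known deletes, transposes, replaces, and inserts approach. This is a simplified illustration: the names `edits` and `correct` are assumptions, the alphabet is restricted to lowercase ASCII, and the mapping to valid prefix lengths is omitted.

```python
import string


def edits(word, alphabet=string.ascii_lowercase):
    """Return set of strings exactly one edit away from word."""
    splits = [(word[:i], word[i:]) for i in range(len(word) + 1)]
    deletes = {a + b[1:] for a, b in splits if b}
    transposes = {a + b[1] + b[0] + b[2:] for a, b in splits if len(b) > 1}
    replaces = {a + c + b[1:] for a, b in splits if b for c in alphabet}
    inserts = {a + c + b for a, b in splits for c in alphabet}
    return (deletes | transposes | replaces | inserts) - {word}


def correct(word, vocabulary):
    """Return known candidates one edit away, most frequent first."""
    candidates = edits(word) & set(vocabulary)
    return sorted(candidates, key=lambda w: (-vocabulary[w], w))


vocabulary = {'search': 10, 'starch': 3, 'perch': 1}
corrections = correct('serch', vocabulary)
```

Here `starch` is excluded because it is two edits away from `serch`.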
Return ordered suggested words for prefix.
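Prefix suggestion over a vocabulary that maps words to frequencies can be sketched as a filter followed by a sort. The function name `suggest`, its `count` parameter, and the alphabetical tie-break are assumptions of this example; a linear scan is used for clarity rather than speed.

```python
def suggest(vocabulary, prefix, count=None):
    """Return words starting with prefix, ordered by decreasing frequency."""
    words = [word for word in vocabulary if word.startswith(prefix)]
    words.sort(key=lambda word: (-vocabulary[word], word))
    return words[:count]


frequencies = {'search': 10, 'sea': 4, 'seal': 7, 'tree': 9}
top = suggest(frequencies, 'sea', 2)
```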
Geospatial fields.
Latitude/longitude coordinates are encoded into the quadkeys of MS Virtual Earth, which are also compatible with Google Maps and OSGEO Tile Map Service. See http://www.maptiler.org/google-maps-coordinates-tile-bounds-projection/.
The quadkeys are then indexed using a prefix tree, creating a cartesian tier of tiles.
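The quadkey encoding can be sketched from the published Virtual Earth (Bing Maps) tile system formulas, in which tile coordinates start at the top left. The function names `tile` and `quadkey` are chosen for illustration and are not necessarily how this module computes or orients its tiles.

```python
import math


def tile(lat, lng, level):
    """Return (x, y) tile coordinates at given level, origin at top left."""
    lat = max(min(lat, 85.05112878), -85.05112878)  # Mercator latitude limit
    x = (lng + 180.0) / 360.0
    sin_lat = math.sin(math.radians(lat))
    y = 0.5 - math.log((1 + sin_lat) / (1 - sin_lat)) / (4 * math.pi)
    size = 1 << level  # number of tiles along each axis

    def clip(value):
        return min(max(int(value * size), 0), size - 1)

    return clip(x), clip(y)


def quadkey(x, y, level):
    """Return base-4 key which interleaves the bits of x and y."""
    digits = []
    for bit in range(level - 1, -1, -1):
        mask = 1 << bit
        digits.append(str((1 if x & mask else 0) + (2 if y & mask else 0)))
    return ''.join(digits)


coarse = quadkey(*tile(41.9, 12.5, 5), level=5)
fine = quadkey(*tile(41.9, 12.5, 10), level=10)
```

The key of a coarse tile is always a prefix of the keys of the tiles it contains, which is what allows a prefix tree to act as a tiered spatial index.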
Utilities for transforming lat/lngs, projected coordinates, and tile coordinates.
Return TMS coordinates of tile.
Return lat/lng bounding box (bottom, left, top, right) of tile.
Return tile from latitude, longitude and precision level.
Convert given lat/lon in WGS84 datum to XY in Spherical Mercator EPSG:900913.
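The standard Spherical Mercator projection can be written in a few lines. In this sketch the function name `lat_lng_to_meters` and its argument order are assumptions; the radius is the WGS84 semi-major axis used by EPSG:900913.

```python
import math

RADIUS = 6378137.0  # WGS84 semi-major axis in meters
ORIGIN_SHIFT = math.pi * RADIUS  # half the equatorial circumference


def lat_lng_to_meters(lat, lng):
    """Return (x, y) in meters for a latitude and longitude in degrees."""
    x = lng * ORIGIN_SHIFT / 180.0
    y = math.log(math.tan((90.0 + lat) * math.pi / 360.0)) * RADIUS
    return x, y


east, north = lat_lng_to_meters(85.05112878, 180.0)
```

At the latitude limit of about 85.05 degrees, `y` reaches the same magnitude as `x` does at 180 degrees, which is why the projected world is square.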
Generate tile keys within distance of given point, adjusting precision to limit the number considered.
Generate tile keys which span bounding box of meters.
Return reduced number of tiles, by zooming out where all sub-tiles are present.
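Zooming out where all sub-tiles are present amounts to replacing any complete group of four sibling keys with their shared parent, and repeating until nothing changes. The sketch below assumes tiles are quadkey strings with digits 0 to 3; the name `zip_tiles` is hypothetical.

```python
from collections import Counter


def zip_tiles(tiles):
    """Return set of tiles with complete sibling groups replaced by parents."""
    tiles = set(tiles)
    while True:
        # count how many direct children of each parent are present
        parents = Counter(tile[:-1] for tile in tiles if tile)
        full = {parent for parent, count in parents.items() if count == 4}
        if not full:
            return tiles
        tiles = {tile for tile in tiles if tile[:-1] not in full} | full


reduced = zip_tiles(['00', '01', '02', '03', '10'])
```

Each pass strictly shrinks the set, so the loop terminates; merged parents may in turn complete a group one level higher.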
Bases: engine.documents.Field, engine.spatial.Tiler
Mixin interface for indexing lat/lngs as a prefix tree of tiles. Subclasses should implement items and prefix methods.
Return prefix query for point at given precision.
Generate tile values from points (lng, lat).
Return prefix queries for any tiles which could be within distance of given point.
Bases: engine.spatial.SpatialField, engine.documents.PrefixField
Geospatial points, which create a tiered index of tiles. Points must still be stored if exact distances are required upon retrieval.
Generate tiles from points (lng, lat).
Bases: engine.spatial.PointField
PointField which implicitly supports polygons (technically linear rings of points). Differs from points in that all necessary tiles are included to match the points’ boundary. As with PointField, the tiered tiles are a search optimization, not a distance calculator.
Generate all covered tiles from polygons.