server

Usage: python -m lupyne.server [index_directory ...]

Options:
-h, --help show this help message and exit
-r, --read-only
 expose only read methods; no write lock
-c CONFIG, --config=CONFIG
 optional configuration file or json object of global params
-p FILE, --pidfile=FILE
 store the process id in the given file
-d, --daemonize
 run the server as a daemon
--autoreload=SECONDS
 automatically reload modules; replacement for engine.autoreload
--autoupdate=SECONDS
 automatically update index version

Restful json CherryPy server.

The server script mounts a WebSearcher (read_only) or WebIndexer root. Standard CherryPy configuration applies, and the provided custom tools are also configurable. All request and response bodies are application/json values.
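
As an illustration of the options above, the following sketch assembles a command line for the server script. The `server_command` helper and the index path are hypothetical; `server.socket_port` is a standard CherryPy global setting, passed here as an inline json object to `--config`.

```python
import json
import sys


def server_command(directories, read_only=False, config=None, autoupdate=None):
    """Build an argv list for launching lupyne.server with the documented options."""
    argv = [sys.executable, "-m", "lupyne.server"]
    if read_only:
        argv.append("--read-only")
    if config is not None:
        # --config accepts a file name or an inline json object of global params
        argv.append("--config=" + json.dumps(config))
    if autoupdate is not None:
        argv.append("--autoupdate={}".format(autoupdate))
    return argv + list(directories)


command = server_command(
    ["/tmp/example_index"],                # hypothetical index directory
    read_only=True,
    config={"server.socket_port": 8080},   # standard CherryPy global setting
    autoupdate=60,
)
print(command[1:])
```

The resulting list could be handed to `subprocess.Popen` on a machine where lupyne and Lucene are installed.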

WebSearcher exposes resources for an IndexSearcher. In addition to search requests, it provides access to term and document information in the index. Note Lucene doc ids are ephemeral; they should only be used across requests for the same index version.
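
Because doc ids are only stable within one index version, a client that keeps ids between requests needs a version token to check them against. The sketch below assumes the entity tag returned by the validate tool is used as that token; the `DocIdCache` class and the sample values are hypothetical.

```python
class DocIdCache:
    """Keep doc ids only for as long as the index version stays the same."""

    def __init__(self):
        self.version = None
        self.ids = {}

    def store(self, version, key, doc_ids):
        if version != self.version:
            # a new index version invalidates every id seen so far
            self.version, self.ids = version, {}
        self.ids[key] = list(doc_ids)

    def lookup(self, version, key):
        # ids recorded under another version are treated as missing
        if version != self.version:
            return None
        return self.ids.get(key)


cache = DocIdCache()
cache.store('W/"1"', "text:hello", [0, 3, 7])      # illustrative etag value
same = cache.lookup('W/"1"', "text:hello")         # same version: ids reusable
stale = cache.lookup('W/"2"', "text:hello")        # new version: ids discarded
```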

WebIndexer extends WebSearcher, exposing additional resources and methods for an Indexer. Single documents may be added, deleted, or replaced by a unique indexed field. Multiple documents may also be added at once, or deleted by query. By default changes are not visible until the update resource is called to commit a new index version. If a near real-time Indexer is used (an experimental feature in Lucene), then changes are instantly searchable. In that case no commit has occurred yet, so the index-based validation headers should not be used for caching.

Custom servers should create and mount WebSearchers and WebIndexers as needed. Caches and field settings can then be applied directly before starting the server. WebSearchers and WebIndexers can of course also be subclassed for custom interfaces.

CherryPy and Lucene VM integration issues:
  • Monitors (such as autoreload) are not compatible with the VM unless threads are attached.
  • WorkerThreads must also be attached to the VM.
  • VM initialization must occur after daemonizing.
  • It is recommended that the VM ignore keyboard interrupts (-Xrs) for clean server shutdown.

tools

CherryPy tools enabled by default: tools.{json,allow,time,validate}.on

lupyne.server.json_(indent=None, content_type='application/json', process_body=None)

Handle request bodies and responses in json format.

Parameters:
  • indent – indentation level for pretty printing
  • content_type – request media type and response content-type header
  • process_body – optional function to process body into request.params
lupyne.server.allow(methods=('GET', 'HEAD'))

Only allow specified methods.

lupyne.server.time_()

Return response time in headers.

lupyne.server.validate(methods=('GET', 'HEAD'), etag=True, last_modified=True, max_age=None, expires=None)

Return and validate caching headers for GET requests.

Parameters:
  • methods – only set headers for specified methods
  • etag – return weak entity tag header based on index version and validate if-match headers
  • last_modified – return last-modified header based on index timestamp and validate if-modified headers
  • max_age – return cache-control max-age and age headers based on last update timestamp
  • expires – return expires header offset from last update timestamp
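
On the client side, the caching headers above are typically used for conditional requests. The following sketch builds the request headers from a previously received entity tag and last-modified value. It assumes the server honours the standard HTTP `If-None-Match` and `If-Modified-Since` headers; the `conditional_headers` helper and the sample values are hypothetical.

```python
def conditional_headers(etag=None, last_modified=None):
    """Build request headers that revalidate a cached response."""
    headers = {"Accept": "application/json"}
    if etag:
        # entity tag from an earlier response, tied to the index version
        headers["If-None-Match"] = etag
    if last_modified:
        # timestamp from an earlier response's last-modified header
        headers["If-Modified-Since"] = last_modified
    return headers


def can_reuse_cache(status):
    """A 304 Not Modified status means the cached body is still valid."""
    return status == 304


headers = conditional_headers(
    etag='W/"42"',                                   # illustrative value
    last_modified="Tue, 01 Jan 2013 00:00:00 GMT",   # illustrative value
)
```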

WebSearcher

class lupyne.server.WebSearcher(*directories, **kwargs)

Dispatch root with a delegated Searcher.

docs(*path, **options)

Return ids or documents.

GET /docs

Return array of doc ids.

return:[int,... ]
GET /docs/[int|chars/chars]?

Return document mapping from id or unique name and value. Optionally select stored, multi-valued, and cached indexed fields.

&fields=chars,... &fields.multi=chars,... &fields.indexed=chars[:chars],...

return:{string: string|number|array,... }
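
A minimal sketch of building urls for the docs resource, using only the standard library. The base address assumes CherryPy's default port on the local host, and the `doc_url` helper and field names are hypothetical.

```python
from urllib.parse import quote, urlencode

BASE = "http://localhost:8080"  # assumed server address


def doc_url(doc, fields=None, multi=None, base=BASE):
    """Return the url of a document given an int id or a (name, value) pair."""
    if isinstance(doc, int):
        path = "/docs/{}".format(doc)
    else:
        name, value = doc
        path = "/docs/{}/{}".format(quote(name, safe=""), quote(value, safe=""))
    params = {}
    if fields is not None:
        params["fields"] = ",".join(fields)        # stored fields to select
    if multi is not None:
        params["fields.multi"] = ",".join(multi)   # fields returned as arrays
    return base + path + ("?" + urlencode(params) if params else "")


by_id = doc_url(5)
by_term = doc_url(("isbn", "978 0"), fields=["title"], multi=["author"])
```
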
index()

Return index information.

GET /

Return a mapping of the directory to the document count.

return:{string: int,... }
search(q=None, count=None, start=0, fields=None, sort=None, facets='', group='', hl='', mlt=None, spellcheck=0, timeout=None, **options)

Run query and return documents.

GET /search?

Return array of document objects and total doc count.

&q=chars&q.type=[term|prefix|wildcard]&q.chars=...,
query, optional type to skip parsing, and optional parser settings: q.field, q.op,...
&filter=chars
cached filter applied to the query
if a previously cached filter is not found, the value will be parsed as a query
&count=int&start=0
maximum number of docs to return and offset to start at
&fields=chars,... &fields.multi=chars,... &fields.indexed=chars[:chars],...
only include selected stored fields; multi-valued fields returned in an array; indexed fields with optional type are cached
&sort=[-]chars[:chars],... &sort.scores[=max]
field name, optional type, minus sign indicates descending
optionally score docs, additionally compute maximum score
&facets=chars,... &facets.count=int&facets.min=0
include facet counts for given field names; facets filters are cached
optional maximum number of most populated facet values per field, and minimum count to return
&group=chars[:chars]&group.count=1&group.limit=int
group documents by field value with optional type, up to given maximum count
limit number of groups which return docs
&hl=chars,... &hl.count=1&hl.tag=strong&hl.enable=[fields|terms]
stored fields to return highlighted
optional maximum fragment count and html tag name
optionally enable matching any field or any term
&mlt=int&mlt.fields=chars,... &mlt.chars=...,
doc index (or id without a query) to find MoreLikeThis
optional document fields to match
optional MoreLikeThis settings: mlt.minTermFreq, mlt.minDocFreq,...
&spellcheck=int
maximum number of spelling corrections to return for each query term, grouped by field
original query is still run; use q.spellcheck=true to affect query parsing
&timeout=number
timeout search after elapsed number of seconds
return:
{
"query": string,
"count": int|null,
"maxscore": number|null,
"docs": [{"__id__": int, "__score__": number, "__highlights__": {string: array,... }, string: object,... },... ],
"facets": {string: {string: int,... },... },
"groups": [{"count": int, "value": value, "docs": [object,... ]},... ],
"spellcheck": {string: {string: [string,... ],... },... }
}
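
The sketch below encodes a few of the search parameters into a url and reads a response shaped like the return value above. The base address, field names, and the sample response are made up for illustration; nothing is sent over the network.

```python
import json
from urllib.parse import urlencode

BASE = "http://localhost:8080"  # assumed server address

params = {
    "q": "text:lucene",
    "count": 10,
    "start": 0,
    "fields": "title,date",   # only include these stored fields
    "sort": "-date:int",      # minus sign: descending, with optional type
    "facets": "author",
    "facets.count": 5,        # most populated facet values per field
}
url = BASE + "/search?" + urlencode(params)

# hand-written sample response following the documented structure
sample = json.loads("""
{"query": "text:lucene", "count": 2, "maxscore": 1.5,
 "docs": [{"__id__": 0, "__score__": 1.5, "title": "a"},
          {"__id__": 4, "__score__": 0.5, "title": "b"}],
 "facets": {"author": {"x": 2}}}
""")


def scored_ids(result):
    """Return (doc id, score) pairs in the order the server ranked them."""
    return [(doc["__id__"], doc["__score__"]) for doc in result["docs"]]


hits = scored_ids(sample)
```
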
terms(name='', value=':', *path, **options)

Return data about indexed terms.

GET /terms?

Return field names, with optional selection.

&option=chars

return:[string,... ]
GET /terms/chars[:int|float]?step=0

Return term values for given field name, with optional type and step for numeric encoded values.

return:[string,... ]
GET /terms/chars/chars[*|?|:chars|~number]

Return term values (wildcards, slices, or fuzzy terms) for given field name.

return:[string,... ]
GET /terms/chars/chars[*|~]?count=int

Return spellchecked term values ordered by decreasing document frequency. Prefixes (*) are optimized to be suitable for real-time query suggestions; all terms are cached.

return:[string,... ]
GET /terms/chars/chars

Return document count for given term.

return:int
GET /terms/chars/chars/docs

Return document ids for given term.

return:[int,... ]
GET /terms/chars/chars/docs/counts

Return document ids and frequency counts for given term.

return:[[int, int],... ]
GET /terms/chars/chars/docs/positions

Return document ids and positions for given term.

return:[[int, [int,... ]],... ]
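
Since wildcard patterns may contain `?`, which would otherwise start the query string, term values need percent-encoding when placed in the path. A minimal sketch, assuming a server on CherryPy's default local port; the `terms_url` helper and field names are hypothetical.

```python
from urllib.parse import quote, urlencode

BASE = "http://localhost:8080"  # assumed server address


def terms_url(*segments, base=BASE, **params):
    """Join path segments under /terms, escaping characters such as '?'."""
    parts = ("terms",) + segments
    # ':' and '*' are left readable; '?' is encoded so it stays in the path
    path = "/".join(quote(str(part), safe=":*~") for part in parts)
    query = "?" + urlencode(params) if params else ""
    return "{}/{}{}".format(base, path, query)


field_names = terms_url()
numeric = terms_url("date:int", step=0)         # numeric encoded values
suggest = terms_url("name", "jo*", count=5)     # prefix suggestions
wildcard = terms_url("name", "j?n")             # '?' becomes %3F
positions = terms_url("text", "lucene", "docs", "positions")
```
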
update(**caches)

Refresh index version.

POST /update

Reopen searcher, optionally reloading caches, and return document count.

["filters"|"sorters"|"spellcheckers",... ]

return:int
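
A sketch of preparing the refresh request with the standard library. The request is only constructed, not sent; the base address is an assumption, and whether an empty body is acceptable when no caches are named is also an assumption.

```python
import json
from urllib.request import Request

BASE = "http://localhost:8080"  # assumed server address


def refresh_request(caches=(), base=BASE):
    """Prepare POST /update, optionally naming caches to reload."""
    body = json.dumps(list(caches)).encode("utf-8") if caches else None
    return Request(
        base + "/update",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = refresh_request(["filters", "sorters"])
```

With a running server, passing the request to `urllib.request.urlopen` would return the document count as the json response body.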

WebIndexer

class lupyne.server.WebIndexer(*args, **kwargs)

Bases: lupyne.server.WebSearcher

Dispatch root with a delegated Indexer, exposing write methods.

docs(*path, **options)

Add or return documents. See WebSearcher.docs() for GET method.

POST /docs

Add documents to index.

[{string: string|number|array,... },... ]

PUT, DELETE /docs/chars/chars

Set or delete a document. The unique term should be indexed; it is added to the new document.

{string: string|number|array,... }
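
The write methods above can be sketched as prepared requests. The `json_request` helper, the base address, and the `isbn` and `title` fields are hypothetical, and no request is actually sent.

```python
import json
from urllib.request import Request

BASE = "http://localhost:8080"  # assumed server address


def json_request(method, url, body=None):
    """Prepare a request whose optional body is encoded as json."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    return Request(url, data=data, headers=headers, method=method)


# add several documents at once
add = json_request("POST", BASE + "/docs", [
    {"isbn": "123", "title": "first"},
    {"isbn": "456", "title": "second"},
])
# replace the document identified by the unique term isbn:123;
# per the description above, the term itself is added to the new document
replace = json_request("PUT", BASE + "/docs/isbn/123", {"title": "revised"})
# delete the same document
delete = json_request("DELETE", BASE + "/docs/isbn/123")
```

By default none of these changes would be searchable until the update resource is called.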

fields(name='', **settings)

Return or store a field’s parameters.

GET /fields

Return known field names.

return:[string,... ]
GET, PUT /fields/chars

Set and return parameters for given field name.

{"store"|"index"|"termvector": string|true|false,... }

return:{"store": string, "index": string, "termvector": string}
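
A small sketch of composing the json body for PUT /fields/chars, restricted to the three setting names listed above. The `field_settings` helper is hypothetical, and rejecting other names on the client side is a choice made for this example, not documented server behaviour.

```python
import json

ALLOWED = {"store", "index", "termvector"}  # setting names from the description above


def field_settings(**settings):
    """Encode field parameters as a json object, rejecting unknown names."""
    unknown = set(settings) - ALLOWED
    if unknown:
        raise ValueError("unknown field settings: {}".format(sorted(unknown)))
    return json.dumps(settings, sort_keys=True)


body = field_settings(store=True, index=True, termvector=False)
```
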
index(directories=())

Add indexes. See WebSearcher.index() for GET method.

POST /

Add indexes without optimization.

[string,... ]

search(q=None, **options)

Run or delete a query. See WebSearcher.search() for GET method.

DELETE /search?q=chars
Delete documents which match query.
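
A sketch of preparing a delete-by-query request; the query is percent-encoded into the url and nothing is sent. The base address and the sample query are assumptions.

```python
from urllib.parse import urlencode
from urllib.request import Request

BASE = "http://localhost:8080"  # assumed server address


def delete_by_query(q, base=BASE):
    """Prepare DELETE /search?q=... for the given query string."""
    url = "{}/search?{}".format(base, urlencode({"q": q}))
    return Request(url, method="DELETE")


request = delete_by_query("date:2009*")  # illustrative query
```
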
update(*path, **options)

Commit index changes and refresh index version.

POST /update

Commit write operations and return document count. See WebSearcher.update() for caching options.

["expunge"|"optimize",... ]

return:int
PUT, DELETE /update/snapshot

Snapshot current index commit and return array of referenced filenames, or release previous snapshot.

return:[string,... ]
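
One way the commit and snapshot resources might be combined for a backup is sketched below as an ordered list of steps. The `backup_plan` helper is hypothetical, and the ordering (commit, snapshot, copy, release) is an assumption about typical usage rather than a documented procedure.

```python
import json


def backup_plan(optimize=False):
    """Return (method, path, body) steps for a commit followed by a snapshot."""
    commit_body = json.dumps(["optimize"]) if optimize else None
    return [
        ("POST", "/update", commit_body),        # commit pending writes
        ("PUT", "/update/snapshot", None),       # returns referenced filenames
        # the returned files would be copied here, outside of this sketch
        ("DELETE", "/update/snapshot", None),    # release the snapshot
    ]


steps = backup_plan(optimize=True)
```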

start

lupyne.server.mount(root, path='', config=None, autoupdate=0)

Attach root and subscribe to plugins.

Parameters:
  • root,path,config – see cherrypy.tree.mount
  • autoupdate – see command-line options
lupyne.server.start(root=None, path='', config=None, pidfile='', daemonize=False, autoreload=0, autoupdate=0, callback=None)

Attach root, subscribe to plugins, and start server.

Parameters:
  • root,path,config – see cherrypy.quickstart
  • pidfile,daemonize,autoreload,autoupdate – see command-line options
  • callback – optional callback function scheduled after daemonizing
