Search services are meant to display information contextual to a search query in a very specialized way, in the sense that they can search, retrieve and display data beyond the traditional concept of records. Typical search services could for example include:
Search services are displayed in addition to, and just before, the results returned by the standard Invenio search engine. Each service is queried with the context, and returns:

- a relevance score, an integer from 0 to 100 (see websearch_services.CFG_WEBSEARCH_SERVICE_MAX_SERVICE_ANSWER_RELEVANCE for details), which indicates how much the service thinks it can address the "question" and how much it thinks it was able to answer it. Given the deliberately simplistic (non-scientific) nature of services, it is extremely important to consider the score very carefully, and to compare it with existing services when designing a new service. Failing to do so might hide more relevant answers provided by other services behind non-relevant information.
- an HTML-formatted answer.
Services are designed to provide unobtrusive and useful information to the user. To that end some measures are taken to cut possible verbosity introduced by the services, governed by the following settings:

- websearch_services.CFG_WEBSEARCH_SERVICE_MAX_NB_SERVICE_DISPLAY
- websearch_services.CFG_WEBSEARCH_SERVICE_MIN_RELEVANCE_TO_DISPLAY
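One way to picture how such limits could be applied is sketched below. It assumes, purely from their names, that CFG_WEBSEARCH_SERVICE_MIN_RELEVANCE_TO_DISPLAY acts as a score threshold and CFG_WEBSEARCH_SERVICE_MAX_NB_SERVICE_DISPLAY as a cap on the number of answers shown. The numeric values and the select_answers function are hypothetical and are not taken from Invenio.

```python
# Hypothetical values, chosen for illustration only.
MIN_RELEVANCE_TO_DISPLAY = 21
MAX_NB_SERVICE_DISPLAY = 2


def select_answers(answers):
    """Keep the most relevant answers from a list of (relevance, html) tuples."""
    # Drop answers below the (assumed) relevance threshold.
    relevant = [a for a in answers if a[0] >= MIN_RELEVANCE_TO_DISPLAY]
    # Most relevant first, then keep at most the (assumed) maximum number.
    relevant.sort(key=lambda a: a[0], reverse=True)
    return relevant[:MAX_NB_SERVICE_DISPLAY]


answers = [
    (10, "<p>weak</p>"),
    (80, "<p>strong</p>"),
    (45, "<p>fair</p>"),
    (30, "<p>ok</p>"),
]
selected = select_answers(answers)
```

Under these assumptions, a service that returns an inflated score takes one of the few available slots, which is why scores should be calibrated against those of existing services.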
In order to enable a service, drop its file into the following location:
/opt/invenio/lib/python/invenio/search_services/
To disable a service, remove the file from the above directory.
Services use the Invenio plugin_utils infrastructure, and are self-contained in single Python files that comply with the specifications defined in section 3.1 Search service specifications/requirements.
3.1 Search service specifications/requirements

A search service is a Python file located in /opt/invenio/lib/python/invenio/search_services/, whose name corresponds to a class defined in the file. In order to be valid, your service should inherit from the base SearchService class and implement some functions (see Section 3.1.1 Using SearchService base class). Other, more specialized helper classes exist to help you build services that respond with lists of links (Section 3.1.2 Using ListLinksService class) or answer based on a BibKnowledge knowledge base (Section 3.1.3 Using KnowledgeBaseService class).
3.1.1 Using SearchService base class

Start implementing your service by defining a class that inherits from the SearchService base class. Choose a class name that matches the name of your service file. For example, a spellchecker service could exist in /opt/invenio/lib/python/invenio/search_services/SpellCheckerService.py, with the following content:

    # CFG_SITE_LANG is assumed to come from invenio.config
    from invenio.config import CFG_SITE_LANG
    from invenio.websearch.websearch_services import SearchService

    __plugin_version__ = "Search Service Plugin API 1.0"

    # The class name matches the file name, SpellCheckerService.py
    class SpellCheckerService(SearchService):

        def get_description(self, ln=CFG_SITE_LANG):
            "Return service description"
            return "Spell check user input"

        def answer(self, req, user_info, of, cc, colls_to_search, p, f, search_units, ln):
            """
            Answer question given by context.

            Return (relevance, html_string) where relevance is integer
            from 0 to 100 indicating how relevant to the question the
            answer is (see
            C{CFG_WEBSEARCH_SERVICE_MAX_SERVICE_ANSWER_RELEVANCE} for
            details), and html_string being a formatted answer.
            """
            [...]
            return (score, html_answer)

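The skeleton above leaves the body of answer() open. As a rough illustration only, the self-contained sketch below fills it in for a toy spell checker. It makes several assumptions: SearchService and CFG_SITE_LANG are minimal stand-ins so that the code runs outside Invenio, the KNOWN_WORDS vocabulary is invented, and the scores 0 and 50 are arbitrary choices rather than values recommended by Invenio.

```python
import difflib
from html import escape

CFG_SITE_LANG = "en"  # stand-in for the Invenio configuration value


class SearchService(object):
    """Minimal stand-in for Invenio's base class, so the sketch runs alone."""


__plugin_version__ = "Search Service Plugin API 1.0"

# Hypothetical vocabulary; a real service would build it from its own data.
KNOWN_WORDS = ["higgs", "boson", "neutrino", "thesis"]


class SpellCheckerService(SearchService):

    def get_description(self, ln=CFG_SITE_LANG):
        "Return service description"
        return "Spell check user input"

    def answer(self, req, user_info, of, cc, colls_to_search, p, f,
               search_units, ln):
        "Return (relevance, html_string) for the query pattern p"
        suggestions = []
        for word in p.lower().split():
            if word in KNOWN_WORDS:
                continue
            # Closest known word, if any is similar enough.
            close = difflib.get_close_matches(word, KNOWN_WORDS, n=1)
            if close:
                suggestions.append(close[0])
        if not suggestions:
            # Score 0: the service has nothing useful to say.
            return (0, "")
        html = "Did you mean: <em>%s</em>?" % escape(" ".join(suggestions))
        # Arbitrary mid-range score, picked for illustration only.
        return (50, html)


service = SpellCheckerService()
result = service.answer(None, {}, "hb", "Atlantis", [], "higs bosn", "", [], "en")
```

Returning a score of 0 when there is nothing to suggest keeps the service from competing with other services for display.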
The bare minimum for a search service to be valid is to inherit from the abstract base class "SearchService". The service must:

- define get_description(..) to return a short description of the service.
- define answer(..) to return an answer (int, html_string).
- define __plugin_version__ for the Search Service plugin version this service is compatible with.

If your service must pre-process some data and cache it, it can override the following helper methods:

- prepare_data_cache(..) to prepare some cache needed to answer. The returned structure is cached for you and can be retrieved later via the self.get_data_cache() function. This method is called only when the service is queried and the cache has not yet been prepared, or must be refreshed. The Invenio DataCacher infrastructure is used to store the cache in memory.
- timestamp_verifier(..) to indicate if the cache must be refreshed, by returning the latest modification timestamp of your data.

3.1.2 Using ListLinksService class

If your service is designed to display a list of responses, you can inherit from the ListLinksService class instead of SearchService, in order to benefit from additional helper functions, which are used in the following example and described after it:

    # CFG_SITE_LANG is assumed to come from invenio.config
    from invenio.config import CFG_SITE_LANG
    from invenio.websearch.websearch_services import ListLinksService
    from invenio.miscutil.messages import gettext_set_language

    __plugin_version__ = "Search Service Plugin API 1.0"

    class CollectionNameSearchService(ListLinksService):

        [...]

        def get_description(self, ln=CFG_SITE_LANG):
            "Return service description"
            return "A foo that answers with a list of bars"

        def get_label(self, ln=CFG_SITE_LANG):
            """
            Return label displayed next to the service responses.

            @rtype: string
            """
            _ = gettext_set_language(ln)
            return _("You might be interested in")

        def answer(self, req, user_info, of, cc, colls_to_search, p, f, search_units, ln):
            """
            Answer question given by context.

            Return (relevance, html_string) where relevance is integer
            from 0 to 100 indicating how relevant to the question the
            answer is (see
            C{CFG_WEBSEARCH_SERVICE_MAX_SERVICE_ANSWER_RELEVANCE} for
            details), and html_string being a formatted answer.
            """
            [...]
            # Let the helper format the list of (label, url) tuples
            return (score, self.display_answer_helper(list_of_tuples_labels_urls))

- get_label(..) to return the label displayed to the user for the service next to the list of responses.
- display_answer_helper(..), a function to be used as a help to process a list of tuples (label, url) in order to return a nicely formatted list of items from the answer() function.

The requirements from the base class SearchService are still valid when using ListLinksService:

- get_description(..) to return a short description of the service.
- answer(..) to return an answer (int, html_string).
- __plugin_version__ for the Search Service plugin version this service is compatible with.
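The markup actually produced by display_answer_helper(..) is not described here. To give a rough idea of what such a helper does with a list of (label, url) tuples, the sketch below uses a hypothetical function, format_links, that is not part of Invenio; the HTML list markup is an assumption made for the example.

```python
from html import escape


def format_links(labels_and_urls):
    """Format (label, url) tuples as an HTML unordered list of links."""
    items = [
        '<li><a href="%s">%s</a></li>' % (escape(url, quote=True), escape(label))
        for label, url in labels_and_urls
    ]
    return "<ul>%s</ul>" % "".join(items)


links = [("How to submit a thesis", "http://localhost/help/how-to-submit-thesis")]
html_answer = format_links(links)
```

Escaping both the label and the URL matters in a helper of this kind, since the labels may come from user-editable data.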
3.1.3 Using KnowledgeBaseService class

If you would like to build a service that answers automatically based on the values stored in a BibKnowledge knowledge base, use the KnowledgeBaseService class. In this case you do not need to define the answer(..) function, which is already implemented for you:

    # CFG_SITE_LANG is assumed to come from invenio.config
    from invenio.config import CFG_SITE_LANG
    from invenio.websearch.websearch_services import KnowledgeBaseService
    from invenio.miscutil.messages import gettext_set_language

    __plugin_version__ = "Search Service Plugin API 1.0"

    class MyKBService(KnowledgeBaseService):

        [...]

        def get_description(self, ln=CFG_SITE_LANG):
            "Return service description"
            return "A foo that answers with a list of bars, based on myKB"

        def get_label(self, ln=CFG_SITE_LANG):
            """
            Return label displayed next to the service responses.

            @rtype: string
            """
            _ = gettext_set_language(ln)
            return _("You might be interested in")

        def get_kbname(self):
            "Load knowledge base with this name"
            return "myKB"

With the above code, the knowledge base "myKB" will be loaded and used to reply to the user. The knowledge base is expected to contain, for each mapping, a list of whitespace-separated keywords as key, and a label and a URL (separated with a "|") as value. For example:

    help submit thesis -> How to submit a thesis|http://localhost/help/how-to-submit-thesis
    Atlantis Times -> Atlantis Times journal|http://localhost/journal/AtlantisTimes

When filling in the knowledge base keys, one should carefully select the keywords in order to focus on a very specific subject, and avoid overly generic terms that would unnecessarily match too many user queries. One should also consider possible clashes with other services (the first rule of the above KB example might, for example, hide responses from the "SubmissionNameSearchService" service).
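How a mapping of this shape could be interpreted is sketched below. The matching rule is an assumption: a key is taken to match when all of its keywords occur in the user query, which may differ from what KnowledgeBaseService really does. The functions parse_kb_value and match_kb are hypothetical names introduced for this example, and the knowledge base is represented as a plain dictionary.

```python
def parse_kb_value(value):
    """Split a 'label|url' knowledge base value into (label, url)."""
    label, url = value.split("|", 1)
    return (label.strip(), url.strip())


def match_kb(query, kb):
    """Return (label, url) for each key whose keywords all occur in the query."""
    query_words = set(query.lower().split())
    matches = []
    for key, value in kb.items():
        keywords = set(key.lower().split())
        # Assumed rule: every keyword of the key must appear in the query.
        if keywords and keywords <= query_words:
            matches.append(parse_kb_value(value))
    return matches


kb = {
    "help submit thesis": "How to submit a thesis|http://localhost/help/how-to-submit-thesis",
    "Atlantis Times": "Atlantis Times journal|http://localhost/journal/AtlantisTimes",
}
found = match_kb("atlantis times latest issue", kb)
```

Under this assumed rule, a key made of a single generic keyword would match nearly every query containing that word, which illustrates why specific keyword combinations are preferable.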
A default implementation of a KnowledgeBaseService is provided with Invenio: "FAQKBService". This service makes use of the demo knowledge base named "FAQ".
See also 1. Overview section.