Package org.apache.wiki.search
Class LuceneSearchProvider
java.lang.Object
org.apache.wiki.search.LuceneSearchProvider
- All Implemented Interfaces:
WikiProvider, SearchProvider
- Direct Known Subclasses:
TikaSearchProvider
Search provider that handles searching the Wiki, backed by a Lucene index.
- Since:
- 2.2.21.
-
Field Summary
Fields (modifier and type, field, description):
static final int FLAG_CONTEXTS - Create contexts also.
protected static final org.apache.logging.log4j.Logger LOG
protected static final String LUCENE_ATTACHMENTS
protected static final String LUCENE_AUTHOR
protected static final String LUCENE_ID
protected static final String LUCENE_PAGE_CONTENTS
protected static final String LUCENE_PAGE_KEYWORDS
protected static final String LUCENE_PAGE_NAME
static final int MAX_SEARCH_HITS - The maximum number of hits to return from searches.
static final String PROP_LUCENE_ANALYZER - Which analyzer to use.
static final String[] SEARCHABLE_FILE_SUFFIXES - These attachment file suffixes will be indexed.
Fields inherited from interface org.apache.wiki.api.providers.WikiProvider
LATEST_VERSION
-
Constructor Summary
Constructors
LuceneSearchProvider()
-
Method Summary
Methods (modifier and type, method, description):
protected void doFullLuceneReindex() - Performs a full Lucene reindex, if necessary.
Collection<SearchResult> findPages(String query, int flags, Context wikiContext) - Searches pages using a particular combination of flags.
Collection<SearchResult> findPages(String query, Context wikiContext) - Search for pages matching a search query.
protected String getAttachmentContent(String attachmentName, int version) - Fetches the attachment content from the repository.
protected String getAttachmentContent(Attachment att)
protected Engine getEngine() - Returns the handling engine.
String getProviderInfo() - Return a valid HTML string for information.
void initialize(Engine engine, Properties props) - Initializes the page provider.
protected org.apache.lucene.document.Document luceneIndexPage(Page page, String text, org.apache.lucene.index.IndexWriter writer) - Indexes page using the given IndexWriter.
void pageRemoved(Page page) - Delete a page from the search index.
void reindexPage(Page page) - Adds a page-text pair to the Lucene update queue.
protected void updateLuceneIndex(Page page, String text) - Updates the Lucene index for a single page.
-
Field Details
-
LOG
-
PROP_LUCENE_ANALYZER
Which analyzer to use. Default is StandardAnalyzer.
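As a rough illustration of how an analyzer setting with a default might be resolved, the sketch below reads a class name from a Properties object and falls back to the class name of Lucene's StandardAnalyzer when nothing is configured. The class AnalyzerSettingSketch and the key example.lucene.analyzer are hypothetical. The real key is the value of PROP_LUCENE_ANALYZER, which this page does not show, and the provider's own lookup logic may differ.

```java
import java.util.Properties;

final class AnalyzerSettingSketch {
    // Hypothetical key, for illustration only; the real key is the value of PROP_LUCENE_ANALYZER.
    static final String EXAMPLE_ANALYZER_KEY = "example.lucene.analyzer";

    // Fully qualified name of Lucene's StandardAnalyzer, used here as the fallback.
    static final String DEFAULT_ANALYZER = "org.apache.lucene.analysis.standard.StandardAnalyzer";

    // Returns the configured analyzer class name, or the default when none is set.
    static String resolveAnalyzerClassName(Properties props) {
        String configured = props.getProperty(EXAMPLE_ANALYZER_KEY);
        if (configured == null || configured.trim().isEmpty()) {
            return DEFAULT_ANALYZER;
        }
        return configured.trim();
    }
}
```

Keeping the fallback in one place means a wiki with no analyzer setting behaves the same as one that names StandardAnalyzer explicitly.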
-
SEARCHABLE_FILE_SUFFIXES
These attachment file suffixes will be indexed.
-
LUCENE_ID
-
LUCENE_PAGE_CONTENTS
-
LUCENE_AUTHOR
-
LUCENE_ATTACHMENTS
-
LUCENE_PAGE_NAME
-
LUCENE_PAGE_KEYWORDS
-
m_updates
-
MAX_SEARCH_HITS
The maximum number of hits to return from searches.
-
FLAG_CONTEXTS
Also create contexts for the search results. Generating contexts can be expensive, so they are not on by default.
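Since findPages accepts "a particular combination of flags", FLAG_CONTEXTS is presumably meant to be combined and tested as a bit flag. The sketch below shows that pattern under this assumption. The class SearchFlagSketch and the value 0x01 are hypothetical; real code should refer to LuceneSearchProvider.FLAG_CONTEXTS rather than a literal.

```java
final class SearchFlagSketch {
    // Assumed bit value, for illustration only; use LuceneSearchProvider.FLAG_CONTEXTS in real code.
    static final int ASSUMED_FLAG_CONTEXTS = 0x01;

    // True when the caller asked for contexts to be generated.
    static boolean wantsContexts(int flags) {
        return (flags & ASSUMED_FLAG_CONTEXTS) != 0;
    }
}
```

A caller that does not need contexts can pass 0 and skip the extra cost.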
-
-
Constructor Details
-
LuceneSearchProvider
public LuceneSearchProvider()
-
-
Method Details
-
initialize
public void initialize(Engine engine, Properties props) throws NoRequiredPropertyException, IOException
Initializes the page provider.
- Specified by:
initialize in interface WikiProvider
- Parameters:
engine - Engine to own this provider
props - A set of properties used to initialize this provider
- Throws:
NoRequiredPropertyException - If the provider needs a property which is not found in the property set
IOException - If there is an IO problem
-
getEngine
protected Engine getEngine()
Returns the handling engine.
- Returns:
- Current Engine
-
doFullLuceneReindex
protected void doFullLuceneReindex() throws IOException
Performs a full Lucene reindex, if necessary.
- Throws:
IOException - If there's a problem during indexing
-
getAttachmentContent
protected String getAttachmentContent(String attachmentName, int version)
Fetches the attachment content from the repository. The content is flat text that can be used for indexing, searching, or display.
- Parameters:
attachmentName - Name of the attachment.
version - The version of the attachment.
- Returns:
- the content of the Attachment as a String.
-
getAttachmentContent
protected String getAttachmentContent(Attachment att)
- Parameters:
att - Attachment to get content for. The filename extension is used to determine the type of the attachment.
- Returns:
- String representing the content of the file. FIXME: This is a very simple implementation for some text-based attachments, mainly used for testing. It should be replaced by, or moved to, attachment search providers or some other pluggable way to search attachments.
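To illustrate how a filename extension might decide whether an attachment's content is indexed, here is a minimal sketch of a suffix check. The class AttachmentSuffixSketch and the list EXAMPLE_SUFFIXES are hypothetical. The real list is SEARCHABLE_FILE_SUFFIXES, whose contents this page does not show, and the provider's actual matching rules (for example case handling) may differ.

```java
import java.util.Locale;

final class AttachmentSuffixSketch {
    // Hypothetical suffix list, for illustration only; the real list is SEARCHABLE_FILE_SUFFIXES.
    static final String[] EXAMPLE_SUFFIXES = { ".txt", ".xml" };

    // True when the file name ends with one of the given suffixes, ignoring case.
    static boolean hasSearchableSuffix(String fileName, String[] suffixes) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        for (String suffix : suffixes) {
            if (lower.endsWith(suffix.toLowerCase(Locale.ROOT))) {
                return true;
            }
        }
        return false;
    }
}
```

Attachments that fail such a check would simply contribute no text to the index.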
-
updateLuceneIndex
protected void updateLuceneIndex(Page page, String text)
Updates the Lucene index for a single page.
- Parameters:
page - The WikiPage to check
text - The page text to index.
-
luceneIndexPage
protected org.apache.lucene.document.Document luceneIndexPage(Page page, String text, org.apache.lucene.index.IndexWriter writer) throws IOException
Indexes page using the given IndexWriter.
- Parameters:
page - WikiPage
text - Page text to index
writer - The Lucene IndexWriter to use for indexing
- Returns:
- the created index Document
- Throws:
IOException - If there's an indexing problem
-
pageRemoved
public void pageRemoved(Page page)
Delete a page from the search index.
- Specified by:
pageRemoved in interface SearchProvider
- Parameters:
page - Page to remove from search index.
-
reindexPage
public void reindexPage(Page page)
Adds a page-text pair to the Lucene update queue. Always safe to call.
- Specified by:
reindexPage in interface SearchProvider
- Parameters:
page - WikiPage to add to the update queue.
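The update-queue idea can be sketched generically: callers record a page-text pair cheaply, and a separate consumer later takes the whole batch for indexing. The class UpdateQueueSketch and its members are hypothetical and use plain strings in place of Page objects. This page does not describe how the provider's own queue (m_updates) is typed or drained, so treat this as an illustration of the pattern only.

```java
import java.util.ArrayList;
import java.util.List;

final class UpdateQueueSketch {
    // One queued update: a page name paired with the text to index.
    static final class PendingUpdate {
        final String pageName;
        final String text;

        PendingUpdate(String pageName, String text) {
            this.pageName = pageName;
            this.text = text;
        }
    }

    private final List<PendingUpdate> pending = new ArrayList<>();

    // Producer side: only records the pair, so it stays cheap for callers.
    synchronized void enqueue(String pageName, String text) {
        pending.add(new PendingUpdate(pageName, text));
    }

    // Consumer side: returns everything queued so far and leaves the queue empty.
    synchronized List<PendingUpdate> drain() {
        List<PendingUpdate> batch = new ArrayList<>(pending);
        pending.clear();
        return batch;
    }
}
```

Separating enqueue from the actual index write is what makes a call like this cheap for the caller, since the expensive work happens later in one batch.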
-
findPages
public Collection<SearchResult> findPages(String query, Context wikiContext) throws ProviderException
Search for pages matching a search query.
- Specified by:
findPages in interface SearchProvider
- Parameters:
query - query to search for
wikiContext - the context within which to run the search
- Returns:
- collection of pages that match query
- Throws:
ProviderException - if the search provider failed.
-
findPages
public Collection<SearchResult> findPages(String query, int flags, Context wikiContext) throws ProviderException
Searches pages using a particular combination of flags.
- Parameters:
query - The query to perform in Lucene query language
flags - A set of flags
- Returns:
- A Collection of SearchResult instances
- Throws:
ProviderException - if there is a problem with the backend
-
getProviderInfo
public String getProviderInfo()
Returns a valid HTML string for information. May be anything.
- Specified by:
getProviderInfo in interface WikiProvider
- Returns:
- A string describing the provider.
-