Package org.apache.wiki.search
Class LuceneSearchProvider
- java.lang.Object
  - org.apache.wiki.search.LuceneSearchProvider
- All Implemented Interfaces:
WikiProvider,SearchProvider
- Direct Known Subclasses:
TikaSearchProvider
public class LuceneSearchProvider extends java.lang.Object implements SearchProvider
Search provider that handles searching the Wiki, backed by Lucene.
- Since:
- 2.2.21
-
-
Field Summary
Fields
Modifier and Type | Field | Description
static int | FLAG_CONTEXTS | Create contexts also.
protected static org.apache.logging.log4j.Logger | log |
protected static java.lang.String | LUCENE_ATTACHMENTS |
protected static java.lang.String | LUCENE_AUTHOR |
protected static java.lang.String | LUCENE_ID |
protected static java.lang.String | LUCENE_PAGE_CONTENTS |
protected static java.lang.String | LUCENE_PAGE_KEYWORDS |
protected static java.lang.String | LUCENE_PAGE_NAME |
protected java.util.List<java.lang.Object[]> | m_updates |
static int | MAX_SEARCH_HITS | The maximum number of hits to return from searches.
static java.lang.String | PROP_LUCENE_ANALYZER | Which analyzer to use.
static java.lang.String[] | SEARCHABLE_FILE_SUFFIXES | These attachment file suffixes will be indexed.
-
Fields inherited from interface org.apache.wiki.api.providers.WikiProvider
LATEST_VERSION
-
-
Constructor Summary
Constructors
Constructor | Description
LuceneSearchProvider() |
-
Method Summary
Modifier and Type | Method | Description
protected void | doFullLuceneReindex() | Performs a full Lucene reindex, if necessary.
java.util.Collection<SearchResult> | findPages(java.lang.String query, int flags, Context wikiContext) | Searches pages using a particular combination of flags.
java.util.Collection<SearchResult> | findPages(java.lang.String query, Context wikiContext) | Search for pages matching a search query.
protected java.lang.String | getAttachmentContent(java.lang.String attachmentName, int version) | Fetches the attachment content from the repository.
protected java.lang.String | getAttachmentContent(Attachment att) |
protected Engine | getEngine() | Returns the handling engine.
java.lang.String | getProviderInfo() | Return a valid HTML string for information.
void | initialize(Engine engine, java.util.Properties props) | Initializes the page provider.
protected org.apache.lucene.document.Document | luceneIndexPage(Page page, java.lang.String text, org.apache.lucene.index.IndexWriter writer) | Indexes page using the given IndexWriter.
void | pageRemoved(Page page) | Delete a page from the search index.
void | reindexPage(Page page) | Adds a page-text pair to the Lucene update queue.
protected void | updateLuceneIndex(Page page, java.lang.String text) | Updates the Lucene index for a single page.
-
-
-
Field Detail
-
log
protected static final org.apache.logging.log4j.Logger log
-
PROP_LUCENE_ANALYZER
public static final java.lang.String PROP_LUCENE_ANALYZER
Which analyzer to use. Default is StandardAnalyzer.
- See Also:
- Constant Field Values
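A setting like this is usually resolved by reading a class name from the provider's properties and falling back to a default. The sketch below shows that pattern with the standard library only. The key `jspwiki.lucene.analyzer` and the helper `AnalyzerSetting.resolve` are assumptions for illustration; the real key is whatever value PROP_LUCENE_ANALYZER holds under Constant Field Values.

```java
import java.util.Properties;

// Hypothetical helper, not part of JSPWiki.
final class AnalyzerSetting {
    // Assumed key; the real one is the value of PROP_LUCENE_ANALYZER.
    static final String ASSUMED_KEY = "jspwiki.lucene.analyzer";

    // Fully qualified name of Lucene's StandardAnalyzer, the documented default.
    static final String DEFAULT_ANALYZER =
            "org.apache.lucene.analysis.standard.StandardAnalyzer";

    // Returns the configured analyzer class name, or the default when the
    // property is missing or blank.
    static String resolve(Properties props) {
        String value = props.getProperty(ASSUMED_KEY);
        if (value == null || value.trim().isEmpty()) {
            return DEFAULT_ANALYZER;
        }
        return value.trim();
    }
}
```

The same Properties object is what initialize(Engine, Properties) receives, so a lookup of this kind would naturally happen there.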
-
SEARCHABLE_FILE_SUFFIXES
public static final java.lang.String[] SEARCHABLE_FILE_SUFFIXES
These attachment file suffixes will be indexed.
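To make the idea concrete, here is a minimal sketch of how a suffix list can decide whether an attachment is indexed. The suffix values, the class `SuffixFilter`, and the case-insensitive comparison are all assumptions for illustration; this page does not list the actual contents of SEARCHABLE_FILE_SUFFIXES or say how the provider compares them.

```java
import java.util.Locale;

// Hypothetical helper, not part of JSPWiki.
final class SuffixFilter {
    // Illustrative values only; the real list is SEARCHABLE_FILE_SUFFIXES.
    static final String[] EXAMPLE_SUFFIXES = { ".txt", ".xml", ".html" };

    // Returns true when the file name ends with one of the example suffixes.
    // Comparing in lower case is an assumption, not documented behavior.
    static boolean isSearchable(String fileName) {
        if (fileName == null) {
            return false;
        }
        String lower = fileName.toLowerCase(Locale.ROOT);
        for (String suffix : EXAMPLE_SUFFIXES) {
            if (lower.endsWith(suffix)) {
                return true;
            }
        }
        return false;
    }
}
```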
-
LUCENE_ID
protected static final java.lang.String LUCENE_ID
- See Also:
- Constant Field Values
-
LUCENE_PAGE_CONTENTS
protected static final java.lang.String LUCENE_PAGE_CONTENTS
- See Also:
- Constant Field Values
-
LUCENE_AUTHOR
protected static final java.lang.String LUCENE_AUTHOR
- See Also:
- Constant Field Values
-
LUCENE_ATTACHMENTS
protected static final java.lang.String LUCENE_ATTACHMENTS
- See Also:
- Constant Field Values
-
LUCENE_PAGE_NAME
protected static final java.lang.String LUCENE_PAGE_NAME
- See Also:
- Constant Field Values
-
LUCENE_PAGE_KEYWORDS
protected static final java.lang.String LUCENE_PAGE_KEYWORDS
- See Also:
- Constant Field Values
-
m_updates
protected final java.util.List<java.lang.Object[]> m_updates
-
MAX_SEARCH_HITS
public static final int MAX_SEARCH_HITS
The maximum number of hits to return from searches.
- See Also:
- Constant Field Values
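A hit limit of this kind amounts to truncating a ranked result list. The sketch below illustrates that with a generic helper; `HitLimiter.limit` is a hypothetical name, and the limit is passed in as a parameter because the actual value of MAX_SEARCH_HITS is not shown on this page.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of JSPWiki.
final class HitLimiter {
    // Returns a copy of at most maxHits leading elements of hits.
    // A negative maxHits is treated as zero.
    static <T> List<T> limit(List<T> hits, int maxHits) {
        int end = Math.min(hits.size(), Math.max(0, maxHits));
        return new ArrayList<>(hits.subList(0, end));
    }
}
```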
-
FLAG_CONTEXTS
public static final int FLAG_CONTEXTS
Create contexts also. Generating contexts can be expensive, so they're not on by default.
- See Also:
- Constant Field Values
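Since findPages takes "a set of flags" as an int, a flag such as this one is presumably tested with a bitwise AND. The sketch below shows that check. The bit value `0x01` and the class `SearchFlags` are assumptions for illustration; the real value is listed under Constant Field Values.

```java
// Hypothetical helper, not part of JSPWiki.
final class SearchFlags {
    // Assumed bit value; the real one is the constant value of FLAG_CONTEXTS.
    static final int EXAMPLE_FLAG_CONTEXTS = 0x01;

    // Returns true when the contexts bit is set, regardless of other bits.
    static boolean wantsContexts(int flags) {
        return (flags & EXAMPLE_FLAG_CONTEXTS) != 0;
    }
}
```

Under this reading, a caller that wants contexts passes the flag explicitly to the three-argument findPages, and the expensive context generation is skipped otherwise.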
-
-
Constructor Detail
-
LuceneSearchProvider
public LuceneSearchProvider()
-
-
Method Detail
-
initialize
public void initialize(Engine engine, java.util.Properties props) throws NoRequiredPropertyException, java.io.IOException
Initializes the page provider.
- Specified by:
initialize in interface WikiProvider
- Parameters:
engine - Engine to own this provider
props - A set of properties used to initialize this provider
- Throws:
NoRequiredPropertyException - If the provider needs a property which is not found in the property set
java.io.IOException - If there is an IO problem
-
doFullLuceneReindex
protected void doFullLuceneReindex() throws java.io.IOException
Performs a full Lucene reindex, if necessary.
- Throws:
java.io.IOException - If there's a problem during indexing
-
getAttachmentContent
protected java.lang.String getAttachmentContent(java.lang.String attachmentName, int version)
Fetches the attachment content from the repository. Content is flat text that can be used for indexing/searching or display.
- Parameters:
attachmentName - Name of the attachment.
version - The version of the attachment.
- Returns:
- the content of the Attachment as a String.
-
getAttachmentContent
protected java.lang.String getAttachmentContent(Attachment att)
- Parameters:
att - Attachment to get content for. Filename extension is used to determine the type of the attachment.
- Returns:
- String representing the content of the file.
FIXME: This is a very simple implementation for some text-based attachments, mainly used for testing. It should be replaced by, or moved to, attachment search providers or some other pluggable way to search attachments.
-
updateLuceneIndex
protected void updateLuceneIndex(Page page, java.lang.String text)
Updates the Lucene index for a single page.
- Parameters:
page - The WikiPage to check
text - The page text to index.
-
luceneIndexPage
protected org.apache.lucene.document.Document luceneIndexPage(Page page, java.lang.String text, org.apache.lucene.index.IndexWriter writer) throws java.io.IOException
Indexes page using the given IndexWriter.
- Parameters:
page - WikiPage
text - Page text to index
writer - The Lucene IndexWriter to use for indexing
- Returns:
- the created index Document
- Throws:
java.io.IOException - If there's an indexing problem
-
pageRemoved
public void pageRemoved(Page page)
Delete a page from the search index.
- Specified by:
pageRemoved in interface SearchProvider
- Parameters:
page - Page to remove from search index.
-
reindexPage
public void reindexPage(Page page)
Adds a page-text pair to the Lucene update queue. Always safe to call.
- Specified by:
reindexPage in interface SearchProvider
- Parameters:
page - WikiPage to add to the update queue.
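The queue described here matches the shape of the m_updates field, a List<Object[]> of page-text pairs. The sketch below shows one way such a queue can work: callers append pairs, and a separate step drains them in a batch for indexing. The class `UpdateQueue`, the locking, and the null handling are assumptions for illustration, not the provider's actual implementation.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the update queue, not part of JSPWiki.
final class UpdateQueue {
    // Each entry is a { page, text } pair, mirroring List<Object[]>.
    private final List<Object[]> updates = new ArrayList<>();

    // Appends a pair to the queue. Null pages are ignored so the call
    // never fails (an assumption about what "safe to call" means).
    void enqueue(Object page, String text) {
        if (page == null) {
            return;
        }
        synchronized (updates) {
            updates.add(new Object[] { page, text });
        }
    }

    // Removes and returns all pending pairs in insertion order.
    List<Object[]> drain() {
        synchronized (updates) {
            List<Object[]> batch = new ArrayList<>(updates);
            updates.clear();
            return batch;
        }
    }
}
```

Queuing rather than indexing inline keeps the caller fast, since the Lucene write can then happen later in one batch.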
-
findPages
public java.util.Collection<SearchResult> findPages(java.lang.String query, Context wikiContext) throws ProviderException
Search for pages matching a search query.
- Specified by:
findPages in interface SearchProvider
- Parameters:
query - query to search for
wikiContext - the context within which to run the search
- Returns:
- collection of pages that match query
- Throws:
ProviderException - if the search provider failed.
-
findPages
public java.util.Collection<SearchResult> findPages(java.lang.String query, int flags, Context wikiContext) throws ProviderException
Searches pages using a particular combination of flags.
- Parameters:
query - The query to perform, in Lucene query language
flags - A set of flags
- Returns:
- A Collection of SearchResult instances
- Throws:
ProviderException - if there is a problem with the backend
-
getProviderInfo
public java.lang.String getProviderInfo()
Return a valid HTML string for information. May be anything.
- Specified by:
getProviderInfo in interface WikiProvider
- Returns:
- A string describing the provider.
-
-