Class LuceneSearchProvider

java.lang.Object
org.apache.wiki.search.LuceneSearchProvider
All Implemented Interfaces:
WikiProvider, SearchProvider
Direct Known Subclasses:
TikaSearchProvider

public class LuceneSearchProvider
extends java.lang.Object
implements SearchProvider
Lucene-based implementation of the search provider interface; handles searching the Wiki.
Since:
2.2.21
  • Field Summary

    Fields
    Modifier and Type Field Description
    static int FLAG_CONTEXTS
    Create contexts also.
    protected static org.apache.log4j.Logger log  
    protected static java.lang.String LUCENE_ATTACHMENTS  
    protected static java.lang.String LUCENE_AUTHOR  
    protected static java.lang.String LUCENE_ID  
    protected static java.lang.String LUCENE_PAGE_CONTENTS  
    protected static java.lang.String LUCENE_PAGE_KEYWORDS  
    protected static java.lang.String LUCENE_PAGE_NAME  
    protected java.util.List<java.lang.Object[]> m_updates  
    static int MAX_SEARCH_HITS
    The maximum number of hits to return from searches.
    static java.lang.String PROP_LUCENE_ANALYZER
    Which analyzer to use.
    static java.lang.String[] SEARCHABLE_FILE_SUFFIXES
    These attachment file suffixes will be indexed.

    Fields inherited from interface org.apache.wiki.api.providers.WikiProvider

    LATEST_VERSION
  • Constructor Summary

    Constructors
    Constructor Description
    LuceneSearchProvider()  
  • Method Summary

    Modifier and Type Method Description
    protected void doFullLuceneReindex()
    Performs a full Lucene reindex, if necessary.
    java.util.Collection<SearchResult> findPages(java.lang.String query, int flags, Context wikiContext)
    Searches pages using a particular combination of flags.
    java.util.Collection<SearchResult> findPages(java.lang.String query, Context wikiContext)
    Search for pages matching a search query.
    protected java.lang.String getAttachmentContent(java.lang.String attachmentName, int version)
    Fetches the attachment content from the repository.
    protected java.lang.String getAttachmentContent(Attachment att)
    protected Engine getEngine()
    Returns the handling engine.
    java.lang.String getProviderInfo()
    Return a valid HTML string for information.
    void initialize(Engine engine, java.util.Properties props)
    Initializes the search provider.
    protected org.apache.lucene.document.Document luceneIndexPage(Page page, java.lang.String text, org.apache.lucene.index.IndexWriter writer)
    Indexes page using the given IndexWriter.
    void pageRemoved(Page page)
    Delete a page from the search index.
    void reindexPage(Page page)
    Adds a page-text pair to the Lucene update queue.
    protected void updateLuceneIndex(Page page, java.lang.String text)
    Updates the Lucene index for a single page.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

  • Constructor Details

  • Method Details

    • initialize

      public void initialize(Engine engine, java.util.Properties props) throws NoRequiredPropertyException, java.io.IOException
      Initializes the search provider.
      Specified by:
      initialize in interface WikiProvider
      Parameters:
      engine - Engine to own this provider
      props - A set of properties used to initialize this provider
      Throws:
      NoRequiredPropertyException - If the provider needs a property which is not found in the property set
      java.io.IOException - If there is an IO problem
    • getEngine

      protected Engine getEngine()
      Returns the handling engine.
      Returns:
      Current Engine
    • doFullLuceneReindex

      protected void doFullLuceneReindex() throws java.io.IOException
      Performs a full Lucene reindex, if necessary.
      Throws:
      java.io.IOException - If there's a problem during indexing
    • getAttachmentContent

      protected java.lang.String getAttachmentContent(java.lang.String attachmentName, int version)
      Fetches the attachment content from the repository. Content is flat text that can be used for indexing/searching or display.
      Parameters:
      attachmentName - Name of the attachment.
      version - The version of the attachment.
      Returns:
      the content of the Attachment as a String.
    • getAttachmentContent

      protected java.lang.String getAttachmentContent(Attachment att)
      Parameters:
      att - Attachment to get content for. Filename extension is used to determine the type of the attachment.
      Returns:
      String representing the content of the file.
      FIXME: This is a very simple implementation for some text-based attachments, mainly used for testing. It should be replaced by, or moved to, attachment search providers or some other pluggable way to search attachments.
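The entry above says the filename extension determines the attachment type, and the field summary lists SEARCHABLE_FILE_SUFFIXES as the suffixes that will be indexed. A minimal sketch of such a suffix check follows. The class name, the suffix values, and the case-insensitive comparison are all assumptions made for illustration; the real values and matching rules are not shown on this page.

```java
import java.util.Locale;

class SuffixCheckSketch {
    // Hypothetical suffix list; the actual contents of SEARCHABLE_FILE_SUFFIXES
    // are not given on this page.
    static final String[] SUFFIXES = { ".txt", ".xml", ".html" };

    // Returns true when the file name ends with one of the listed suffixes.
    // Case-insensitive matching is an assumption of this sketch.
    static boolean isSearchable(String fileName) {
        String lower = fileName.toLowerCase(Locale.ROOT);
        for (String suffix : SUFFIXES) {
            if (lower.endsWith(suffix)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(isSearchable("Notes.TXT"));
        System.out.println(isSearchable("photo.png"));
    }
}
```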
    • updateLuceneIndex

      protected void updateLuceneIndex(Page page, java.lang.String text)
      Updates the Lucene index for a single page.
      Parameters:
      page - The WikiPage to check
      text - The page text to index.
    • luceneIndexPage

      protected org.apache.lucene.document.Document luceneIndexPage(Page page, java.lang.String text, org.apache.lucene.index.IndexWriter writer) throws java.io.IOException
      Indexes page using the given IndexWriter.
      Parameters:
      page - WikiPage
      text - Page text to index
      writer - The Lucene IndexWriter to use for indexing
      Returns:
      the created index Document
      Throws:
      java.io.IOException - If there's an indexing problem
    • pageRemoved

      public void pageRemoved(Page page)
      Delete a page from the search index.
      Specified by:
      pageRemoved in interface SearchProvider
      Parameters:
      page - Page to remove from search index.
    • reindexPage

      public void reindexPage(Page page)
      Adds a page-text pair to the Lucene update queue. Always safe to call.
      Specified by:
      reindexPage in interface SearchProvider
      Parameters:
      page - WikiPage to add to the update queue.
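To illustrate the update-queue idea behind reindexPage and the m_updates field (declared as java.util.List<java.lang.Object[]>), here is a self-contained sketch of a queue of page-text pairs. It is not the provider's actual implementation: the class and method names are hypothetical, plain strings stand in for Page objects, and the use of a synchronized list is an assumption.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class UpdateQueueSketch {
    // Same element type as the documented m_updates field; wrapping it in a
    // synchronized list is an assumption of this sketch.
    private final List<Object[]> updates =
            Collections.synchronizedList(new ArrayList<>());

    // Producer side: queue a (page name, text) pair and return immediately.
    void enqueue(String pageName, String text) {
        updates.add(new Object[] { pageName, text });
    }

    // Consumer side: copy everything queued so far, then clear the queue,
    // holding the list's lock so no pair is lost in between.
    List<Object[]> drain() {
        synchronized (updates) {
            List<Object[]> batch = new ArrayList<>(updates);
            updates.clear();
            return batch;
        }
    }

    // Number of pairs still waiting to be indexed.
    int pending() {
        return updates.size();
    }
}
```

Queuing the work and indexing it later in batches is one way a call like reindexPage can stay cheap for the caller; whether the provider drains its queue exactly this way is not stated on this page.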
    • findPages

      public java.util.Collection<SearchResult> findPages(java.lang.String query, Context wikiContext) throws ProviderException
      Search for pages matching a search query.
      Specified by:
      findPages in interface SearchProvider
      Parameters:
      query - query to search for
      wikiContext - the context within which to run the search
      Returns:
      collection of pages that match query
      Throws:
      ProviderException - if the search provider failed.
    • findPages

      public java.util.Collection<SearchResult> findPages(java.lang.String query, int flags, Context wikiContext) throws ProviderException
      Searches pages using a particular combination of flags.
      Parameters:
      query - The query to perform in Lucene query language
      flags - A set of flags
      wikiContext - the context within which to run the search
      Returns:
      A Collection of SearchResult instances
      Throws:
      ProviderException - if there is a problem with the backend
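Because flags is an int described as "a set of flags", it is presumably a bitmask in which constants such as FLAG_CONTEXTS are combined with bitwise OR. The sketch below shows that pattern in isolation. The numeric values, the second flag, and the helper method are hypothetical; this page lists FLAG_CONTEXTS but not its value.

```java
class SearchFlagsSketch {
    // Hypothetical value; the numeric value of FLAG_CONTEXTS is not shown on this page.
    static final int FLAG_CONTEXTS = 0x01;
    // Hypothetical second flag, present only to demonstrate combining flags.
    static final int FLAG_EXAMPLE_OTHER = 0x02;

    // Tests whether the context-creation bit is set in a combined flags value.
    static boolean wantsContexts(int flags) {
        return (flags & FLAG_CONTEXTS) != 0;
    }

    public static void main(String[] args) {
        int flags = FLAG_CONTEXTS | FLAG_EXAMPLE_OTHER;
        System.out.println(wantsContexts(flags));
        System.out.println(wantsContexts(FLAG_EXAMPLE_OTHER));
    }
}
```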
    • getProviderInfo

      public java.lang.String getProviderInfo()
      Return a valid HTML string for information. May be anything.
      Specified by:
      getProviderInfo in interface WikiProvider
      Returns:
      A string describing the provider.