Lucy::Index::Indexer(3)        User Contributed Perl Documentation        Lucy::Index::Indexer(3)



NAME
       Lucy::Index::Indexer - Build inverted indexes.

SYNOPSIS
           my $indexer = Lucy::Index::Indexer->new(
               schema => $schema,
               index  => '/path/to/index',
               create => 1,
           );
           while ( my ( $title, $content ) = each %source_docs ) {
               $indexer->add_doc({
                   title   => $title,
                   content => $content,
               });
           }
           $indexer->commit;

DESCRIPTION
       The Indexer class is Apache Lucy's primary tool for managing the content of inverted
       indexes, which may later be searched using IndexSearcher.

       In general, only one Indexer at a time may write to an index safely.  If a write lock
       cannot be secured, new() will throw an exception.
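
       As a minimal sketch of the locking behavior described above, the following opens two
       Indexers on the same index and traps the exception from the second.  The one-field
       schema, the "id" field name, and the temporary index path are assumptions made for
       illustration only.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );
use Lucy::Plan::Schema;
use Lucy::Plan::StringType;
use Lucy::Index::Indexer;

# Hypothetical one-field schema and a throwaway index location.
my $schema = Lucy::Plan::Schema->new;
$schema->spec_field( name => 'id', type => Lucy::Plan::StringType->new );
my $path = tempdir( CLEANUP => 1 ) . '/index';

# The first Indexer takes the write lock and keeps it until commit().
my $first = Lucy::Index::Indexer->new(
    schema => $schema,
    index  => $path,
    create => 1,
);

# A competing Indexer on the same index is expected to fail inside new(),
# after waiting for the lock to time out.
my $second = eval {
    Lucy::Index::Indexer->new( schema => $schema, index => $path );
};
my $lock_error = $@;
warn "Could not open a second Indexer: $lock_error" unless defined $second;

$first->commit;    # ends the session and releases the write lock
```

       Trapping the exception with eval lets an application retry later or report the
       conflict instead of dying.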

       If an index is located on a shared volume, each writer application must identify itself by
       supplying an IndexManager with a unique "host" id to Indexer's constructor or index
       corruption will occur.  See Lucy::Docs::FileLocking for a detailed discussion.
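
       A sketch of supplying an IndexManager with a "host" id might look like the following.
       Using the machine's hostname as the id is an assumption that suits a setup with one
       writer per machine; the schema and the temporary path (standing in for a location on
       the shared volume) are hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );
use Sys::Hostname qw( hostname );
use Lucy::Plan::Schema;
use Lucy::Plan::StringType;
use Lucy::Index::IndexManager;
use Lucy::Index::Indexer;

# Hypothetical schema; the temp dir stands in for a shared-volume path.
my $schema = Lucy::Plan::Schema->new;
$schema->spec_field( name => 'id', type => Lucy::Plan::StringType->new );
my $path = tempdir( CLEANUP => 1 ) . '/index';

# Identify this writer by hostname (assumed unique among writers).
my $host = hostname() or die "Can't get a hostname to use as host id";
my $manager = Lucy::Index::IndexManager->new( host => $host );

my $indexer = Lucy::Index::Indexer->new(
    schema  => $schema,
    index   => $path,
    create  => 1,
    manager => $manager,
);
$indexer->add_doc( { id => 'doc-1' } );
$indexer->commit;
```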

       Note: at present, delete_by_term() and delete_by_query() affect only documents which were
       previously committed to the index.  Documents added during the current indexing session
       but not yet committed are not affected.  This may change in a future update.
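
       One consequence of the note above is that a document can be replaced within a single
       session by adding the new copy and deleting by a shared identifier.  The sketch below
       assumes hypothetical "id" and "body" fields and a temporary index; it illustrates the
       documented behavior rather than a guaranteed idiom.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );
use Lucy::Plan::Schema;
use Lucy::Plan::StringType;
use Lucy::Index::Indexer;
use Lucy::Search::IndexSearcher;
use Lucy::Search::TermQuery;

# Hypothetical schema with two unanalyzed string fields.
my $schema = Lucy::Plan::Schema->new;
my $string = Lucy::Plan::StringType->new;
$schema->spec_field( name => $_, type => $string ) for qw( id body );
my $path = tempdir( CLEANUP => 1 ) . '/index';

# Session 1: commit the original copy.
my $indexer = Lucy::Index::Indexer->new(
    schema => $schema,
    index  => $path,
    create => 1,
);
$indexer->add_doc( { id => 'a1', body => 'committed copy' } );
$indexer->commit;

# Session 2: add a replacement, then delete by the shared id.  Only the
# previously committed copy is marked as deleted.
$indexer = Lucy::Index::Indexer->new( index => $path );
$indexer->add_doc( { id => 'a1', body => 'uncommitted copy' } );
$indexer->delete_by_term( field => 'id', term => 'a1' );
$indexer->commit;

# Collect the bodies of all documents that still carry the id.
my $searcher = Lucy::Search::IndexSearcher->new( index => $path );
my $hits     = $searcher->hits(
    query => Lucy::Search::TermQuery->new( field => 'id', term => 'a1' ),
);
my @bodies;
while ( my $hit = $hits->next ) {
    push @bodies, $hit->{body};
}
print "$_\n" for @bodies;
```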

CONSTRUCTORS
   new( [labeled params] )
           my $indexer = Lucy::Index::Indexer->new(
               schema   => $schema,             # required at index creation
               index    => '/path/to/index',    # required
               create   => 1,                   # default: 0
               truncate => 1,                   # default: 0
               manager  => $manager             # default: created internally
           );

       ·   schema - A Schema.  Required when index is being created; if not supplied, will be
           extracted from the index folder.

       ·   index - Either a filepath to an index or a Folder.

       ·   create - If true and the index directory does not exist, attempt to create it.

       ·   truncate - If true, proceed with the intention of discarding all previous indexing
           data.  The old data will remain intact and visible until commit() succeeds.

       ·   manager - An IndexManager.
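
       To illustrate "truncate", the sketch below rebuilds a small index and checks what a
       searcher sees before and after commit().  The schema, the "id" field, and the
       temporary path are assumptions for the example.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );
use Lucy::Plan::Schema;
use Lucy::Plan::StringType;
use Lucy::Index::Indexer;
use Lucy::Search::IndexSearcher;

# Hypothetical schema and throwaway index location.
my $schema = Lucy::Plan::Schema->new;
$schema->spec_field( name => 'id', type => Lucy::Plan::StringType->new );
my $path = tempdir( CLEANUP => 1 ) . '/index';

# Initial build: two documents.
my $indexer = Lucy::Index::Indexer->new(
    schema => $schema,
    index  => $path,
    create => 1,
);
$indexer->add_doc( { id => $_ } ) for qw( old-1 old-2 );
$indexer->commit;

# Rebuild from scratch with truncate => 1.
my $rebuild = Lucy::Index::Indexer->new(
    schema   => $schema,
    index    => $path,
    truncate => 1,
);
$rebuild->add_doc( { id => 'new-1' } );

# Until commit() succeeds, searchers still see the old data.
my $before = Lucy::Search::IndexSearcher->new( index => $path )->doc_max;
$rebuild->commit;
my $after = Lucy::Search::IndexSearcher->new( index => $path )->doc_max;

print "docs visible before commit: $before, after commit: $after\n";
```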

METHODS
   add_doc(...)
           $indexer->add_doc($doc);
           $indexer->add_doc( { field_name => $field_value } );
           $indexer->add_doc(
               doc   => { field_name => $field_value },
               boost => 2.5,         # default: 1.0
           );

       Add a document to the index.  Accepts either a single argument or labeled params.

       ·   doc - Either a Lucy::Document::Doc object, or a hashref (which will be attached to a
           Lucy::Document::Doc object internally).

       ·   boost - A floating point weight which affects how this document scores.
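
       The three calling conventions can be exercised side by side, as in this sketch.  The
       "title" field, the English EasyAnalyzer, and the temporary index are assumptions
       chosen for illustration.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );
use Lucy::Analysis::EasyAnalyzer;
use Lucy::Document::Doc;
use Lucy::Index::Indexer;
use Lucy::Plan::FullTextType;
use Lucy::Plan::Schema;
use Lucy::Search::IndexSearcher;

# Hypothetical schema: one full-text "title" field analyzed as English.
my $schema = Lucy::Plan::Schema->new;
my $type   = Lucy::Plan::FullTextType->new(
    analyzer => Lucy::Analysis::EasyAnalyzer->new( language => 'en' ),
);
$schema->spec_field( name => 'title', type => $type );
my $path = tempdir( CLEANUP => 1 ) . '/index';

my $indexer = Lucy::Index::Indexer->new(
    schema => $schema,
    index  => $path,
    create => 1,
);

# Form 1: a Lucy::Document::Doc object.
my $doc = Lucy::Document::Doc->new( fields => { title => 'Plain document' } );
$indexer->add_doc($doc);

# Form 2: a bare hashref.
$indexer->add_doc( { title => 'Hashref document' } );

# Form 3: labeled params, which allow a per-document boost.
$indexer->add_doc(
    doc   => { title => 'Boosted document' },
    boost => 2.5,
);
$indexer->commit;

# List the matches with their scores.
my $searcher = Lucy::Search::IndexSearcher->new( index => $path );
my $hits     = $searcher->hits( query => 'document' );
my $total    = $hits->total_hits;
while ( my $hit = $hits->next ) {
    printf "%.3f  %s\n", $hit->get_score, $hit->{title};
}
```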

   add_index(index)
       Absorb an existing index into this one.  The two indexes must have matching Schemas.

       ·   index - Either an index path name or a Folder.
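
       A sketch of absorbing one index into another follows.  Both indexes are built from
       the same Schema object so that they match; the build_index() helper, the "id" field,
       and the paths are hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );
use Lucy::Plan::Schema;
use Lucy::Plan::StringType;
use Lucy::Index::Indexer;
use Lucy::Search::IndexSearcher;

# Hypothetical schema shared by both indexes.
my $schema = Lucy::Plan::Schema->new;
$schema->spec_field( name => 'id', type => Lucy::Plan::StringType->new );

# Hypothetical helper: create an index at $path holding one doc per id.
sub build_index {
    my ( $path, @ids ) = @_;
    my $indexer = Lucy::Index::Indexer->new(
        schema => $schema,
        index  => $path,
        create => 1,
    );
    $indexer->add_doc( { id => $_ } ) for @ids;
    $indexer->commit;
}

my $dir   = tempdir( CLEANUP => 1 );
my $main  = "$dir/main";
my $other = "$dir/other";
build_index( $main,  'a', 'b' );
build_index( $other, 'c' );

# Absorb the second index into the first.
my $indexer = Lucy::Index::Indexer->new( index => $main );
$indexer->add_index($other);
$indexer->commit;

my $doc_count = Lucy::Search::IndexSearcher->new( index => $main )->doc_max;
print "documents in merged index: $doc_count\n";
```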

   optimize()
       Optimize the index for search-time performance.  This may take a while, as it can involve
       rewriting large amounts of data.

       Every Indexer session which changes index content and ends in a commit() creates a new
       segment.  Once written, segments are never modified.  However, they are periodically
       recycled by feeding their content into the segment currently being written.

       The optimize() method causes all existing index content to be fed back into the Indexer.
       When commit() completes after an optimize(), the index will consist of one segment.
       optimize() must therefore be called before commit().  Also, optimizing a fresh index
       created from scratch has no effect.

       Historically, there was a significant search-time performance benefit to collapsing down
       to a single segment versus even two segments.  Now the effect of collapsing is much less
       significant, and calling optimize() is rarely justified.
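
       For cases where optimize() is still wanted, a sketch of the call order is shown
       below.  It runs several small sessions, optimizes, and then counts segments through
       IndexReader's seg_readers().  The schema and temporary path are assumptions.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );
use Lucy::Plan::Schema;
use Lucy::Plan::StringType;
use Lucy::Index::Indexer;
use Lucy::Index::IndexReader;

# Hypothetical schema and throwaway index location.
my $schema = Lucy::Plan::Schema->new;
$schema->spec_field( name => 'id', type => Lucy::Plan::StringType->new );
my $path = tempdir( CLEANUP => 1 ) . '/index';

# Several sessions, each ending in a commit().
for my $id (qw( a b c )) {
    my $indexer = Lucy::Index::Indexer->new(
        schema => $schema,
        index  => $path,
        create => 1,
    );
    $indexer->add_doc( { id => $id } );
    $indexer->commit;
}

# optimize() comes before commit() in the same session.
my $indexer = Lucy::Index::Indexer->new( index => $path );
$indexer->optimize;
$indexer->commit;

# Count the segments in the committed index.
my $reader   = Lucy::Index::IndexReader->open( index => $path );
my $segments = scalar @{ $reader->seg_readers };
print "segments after optimize: $segments\n";
```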

   commit()
       Commit any changes made to the index.  Until this is called, none of the changes made
       during an indexing session are permanent.

       Calling commit() invalidates the Indexer, so if you want to make more changes you'll need
       a new one.

   prepare_commit()
       Perform the expensive setup for commit() in advance, so that commit() completes quickly.
       (If prepare_commit() is not called explicitly by the user, commit() will call it
       internally.)
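
       The two-step commit, and the need for a fresh Indexer afterwards, can be sketched as
       follows.  The schema and path are hypothetical, and what an application does between
       the two calls is left as a comment.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );
use Lucy::Plan::Schema;
use Lucy::Plan::StringType;
use Lucy::Index::Indexer;

# Hypothetical schema and throwaway index location.
my $schema = Lucy::Plan::Schema->new;
$schema->spec_field( name => 'id', type => Lucy::Plan::StringType->new );
my $path = tempdir( CLEANUP => 1 ) . '/index';

my $indexer = Lucy::Index::Indexer->new(
    schema => $schema,
    index  => $path,
    create => 1,
);
$indexer->add_doc( { id => 'first' } );

# Do the expensive work up front ...
$indexer->prepare_commit;
# ... an application could coordinate with other resources here ...
# ... then finalize quickly.
$indexer->commit;

# The committed Indexer is spent, so further changes need a new one.
my $next = Lucy::Index::Indexer->new( index => $path );
$next->add_doc( { id => 'second' } );
$next->commit;
```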

   delete_by_term( [labeled params] )
       Mark documents which contain the supplied term as deleted, so that they will be excluded
       from search results and eventually removed altogether.  The change is not apparent to
       search apps until after commit() succeeds.

       ·   field - The name of an indexed field. (If it is not spec'd as "indexed", an error will
           occur.)

       ·   term - The term which identifies docs to be marked as deleted.  If "field" is
           associated with an Analyzer, "term" will be processed automatically, so do not
           pre-process it yourself.

   delete_by_query(query)
       Mark documents which match the supplied Query as deleted.

       ·   query - A Query.

   delete_by_doc_id(doc_id)
       Mark the document identified by the supplied document ID as deleted.

       ·   doc_id - A document id.
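
       The three deletion methods can be compared in one session, as in the sketch below.
       The "id" and "category" fields, their values, and the use of a searcher to look up a
       document id are assumptions made for the example.

```perl
use strict;
use warnings;
use File::Temp qw( tempdir );
use Lucy::Plan::Schema;
use Lucy::Plan::StringType;
use Lucy::Index::Indexer;
use Lucy::Search::IndexSearcher;
use Lucy::Search::MatchAllQuery;
use Lucy::Search::TermQuery;

# Hypothetical schema with two unanalyzed string fields.
my $schema = Lucy::Plan::Schema->new;
my $string = Lucy::Plan::StringType->new;
$schema->spec_field( name => $_, type => $string ) for qw( id category );
my $path = tempdir( CLEANUP => 1 ) . '/index';

# Session 1: commit four documents.
my $indexer = Lucy::Index::Indexer->new(
    schema => $schema,
    index  => $path,
    create => 1,
);
$indexer->add_doc( { id => 'a', category => 'news' } );
$indexer->add_doc( { id => 'b', category => 'spam' } );
$indexer->add_doc( { id => 'c', category => 'news' } );
$indexer->add_doc( { id => 'd', category => 'news' } );
$indexer->commit;

# Session 2: delete three of them, each by a different route.
$indexer = Lucy::Index::Indexer->new( index => $path );

# By term: every doc whose "id" field holds 'a'.
$indexer->delete_by_term( field => 'id', term => 'a' );

# By query: every doc matching the Query object.
$indexer->delete_by_query(
    Lucy::Search::TermQuery->new( field => 'category', term => 'spam' ) );

# By document id: look the id up with a searcher first.
my $lookup = Lucy::Search::IndexSearcher->new( index => $path );
my $hit    = $lookup->hits(
    query => Lucy::Search::TermQuery->new( field => 'id', term => 'c' ),
)->next;
$indexer->delete_by_doc_id( $hit->get_doc_id );

$indexer->commit;    # the deletions become visible to new searchers

my $searcher  = Lucy::Search::IndexSearcher->new( index => $path );
my $remaining = $searcher->hits(
    query => Lucy::Search::MatchAllQuery->new )->total_hits;
print "documents remaining: $remaining\n";
```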

   get_schema()
       Accessor for the Schema used by this Indexer.

INHERITANCE
       Lucy::Index::Indexer isa Clownfish::Obj.



perl v5.20.2                                2015-12-01                    Lucy::Index::Indexer(3)

