Oracle ConText Cartridge Administrator's Guide
Release 2.0

A54628_01


6
Setting Up and Managing Text

This chapter provides details on how to use the command line to set up and maintain text in ConText.

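The process of administering text in a ConText system comprises the following tasks, each of which is described in a section of this chapter: loading text, managing preferences, managing policies, managing indexes, and managing thesauri.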

Loading Text

This section provides instructions for loading text into database columns from the command-line:

Using ctxload

Use ctxload to load text from a load file or from separate text files into the database.

For example:

ctxload -user jsmith/welcome -name MY_DOCS -file docs.txt -log docload.log

In this example, the Oracle user's username/password is jsmith/welcome. Because the -thes argument for ctxload isn't specified, by default, ctxload loads text, rather than a thesaurus, into the specified database table. The table to which the documents are loaded is my_docs and the load file being used is docs.txt.

In addition, this example generates a log file named docload.log.

See Also:

For a complete description of ctxload requirements and options, as well as the structure and syntax of the text load file, see "ctxload Utility" in Chapter 9, "Executables and Utilities".  

Using ConText Servers for Automated Text Loading

ConText servers can be used to automatically load text from files in an operating system directory as the directory is populated with files.

ConText uses ConText servers with the Loader personality to scan a specified directory for files and call ctxload to load all existing files in the directory into a specified column. The column and the directory to be scanned are specified in a text loading source created by the user.

To set up ConText servers for automated text loading, perform the following tasks:

Note:

This example assumes that ConText is installed in a UNIX-based environment and the files to be loaded are stored in local directories on the operating system.  

  1. Use the SET_ATTRIBUTE and CREATE_PREFERENCE procedures in CTX_DDL to create a Reader preference. The Reader preference identifies the directory where the files are (or will be) located.

    For example:

    begin
      ctx_ddl.set_attribute('DIRECTORIES', '/product/docs');
      ctx_ddl.create_preference('PRODUCT_DOCS_READER',
                                'Directory scanner for /product/docs',
                                'DIRECTORY READER');
    end;

    In this example, the name of the preference created is product_docs_reader. The directories attribute for the DIRECTORY READER Tile specifies the path of the directory to be scanned (/product/docs).

  2. Use SET_ATTRIBUTE and CREATE_PREFERENCE to (optionally) create a Translator preference. The Translator preference converts incoming files into the format required by ctxload.
    Note:

    If the incoming files do not need to be converted, this step can be skipped.  

    For example:

    begin
      ctx_ddl.set_attribute('COMMAND', '/bin/convert.sh');
      ctx_ddl.create_preference('PRODUCT_DOCS_TRANSLATOR',
                                'script that converts files to ctxload format',
                                'USER TRANSLATOR');
    end;

    In this example, the name of the preference created is product_docs_translator. The command attribute for the USER TRANSLATOR Tile specifies the location and name of the translation program (a shell script named convert.sh).

  3. Use the CREATE_SOURCE procedure in CTX_DDL to create a source for the column into which you want to load text.

    When you create a source, you specify the name of the source and the column to be loaded. You also specify the Reader and Translator preferences that you created.

    For example:

    begin
      ctx_ddl.create_source('DOCS_SOURCE', 'DOCS.TEXT',
                            'basic source for documents in /product/docs',
                            reader_pref     => 'PRODUCT_DOCS_READER',
                            translator_pref => 'PRODUCT_DOCS_TRANSLATOR');
    end;
    
    

    In this example, the source name is docs_source and the column to be loaded is text in a table named docs.

    Note:

    If the files to be loaded are in the required ctxload format and do not require any translation, the default Translator preference can be used, in which case you do not need to specify a Translator preference for your source, because ConText uses the default NULL Translator preference.

    In addition, it is not necessary to specify a Loader Engine preference for your source. ConText uses the default Loader Engine preference.  

  4. Start a ConText server with the Loader (R) personality.

    For example:

    ctxsrvx -user ctxsys/passwd -personality R &
    
    
    See Also:

    For a complete description of ctxsrvx, see "ctxsrv/ctxsrvx Executables" in Chapter 9, "Executables and Utilities".

Generating Document Textkeys

Each document loaded into a table must be assigned a value in the primary key column of the table. This value serves as the textkey for the document.

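Textkeys can be assigned either by embedding them in the load file or by generating them with a sequence and a trigger, as described in the following sections.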

Embedding Textkeys in Load File

To manually embed textkey values for documents in the load file, in each document header, create an entry which specifies the name of the primary key column in the table and the textkey value to be assigned to the document.

For example:

. . .
<TEXTSTART: PK=1000, TITLE='DOC 1000'>
doc1000.txt
<TEXTEND>
<TEXTSTART: PK=1001, TITLE='DOC 1001'>
doc1001.txt
<TEXTEND>
. . .

In this example, the load file contains pointers to separate text files (doc1000.txt and doc1001.txt), rather than the text for each document. The primary key column for the table is pk and the values specified are loaded into pk when ctxload is run.

See Also:

For a complete description of the structure of the load file, see "ctxload Utility" in Chapter 9, "Executables and Utilities".  

Generating Textkeys Using Sequences and Triggers

To automatically generate textkey values for each document loaded into a table, use the SQL commands CREATE SEQUENCE and CREATE TRIGGER to create a sequence and a trigger for the table.

The sequence generates unique values for each document. The trigger calls the sequence each time a row (document) is loaded into the table and stores the value in the primary key column for the table.

For example:

create sequence doc_seq;
create trigger doc_trigger
before insert on docs
for each row
begin
  select doc_seq.nextval into :new.pk from dual;
end;

In this example, a sequence named doc_seq and a trigger named doc_trigger are created for a table named docs, in which the primary key column is pk.

doc_trigger specifies that, before each new row is inserted into the docs table, the next available value generated by doc_seq is assigned to the pk column of the row.

See Also:

For more information about creating sequences and triggers, see Oracle8 Server SQL Reference.

Managing Preferences

This section provides details for using the CTX_DDL PL/SQL package to perform the following administration tasks for preferences:

In addition, this section describes the following tasks for specific types of preferences:

Creating a Preference

To create a preference in the ConText data dictionary, use the SET_ATTRIBUTE and CREATE_PREFERENCE procedures in CTX_DDL.

For example:

begin
  ctx_ddl.set_attribute ('PATH',
                         '/public/doc1:/public/doc2');
  ctx_ddl.create_preference ('PUB_DOCS',
                             'Docs stored in files',
                             'OSFILE');
end;

Note:

CREATE_PREFERENCE must be called immediately after SET_ATTRIBUTE to assign the specified attribute(s) to the preference that you are creating.  

In this example, a Datastore preference named pub_docs is created for text stored externally in operating system files in a UNIX-based environment.

The PATH attribute for the OSFILE Tile specifies the directory paths (/public/doc1 and /public/doc2) where the files are stored. A colon separates the directory paths.

Specifying Multiple Values for Attributes

If you want to assign more than one value to the same Tile attribute, you must call SET_ATTRIBUTE separately for each value that you want to set before calling CREATE_PREFERENCE.

Currently, only Stoplist preferences and Filter preferences for external filters require multiple values for attributes.

See Also:

For more information about specifying multiple attribute values for preferences, see "Creating a Stoplist Preference" and "Creating Filter Preferences" in this chapter.

Creating an Engine Preference

One of the most important preferences you create is an Engine preference. In the Engine preference for a policy, you specify the amount of indexing memory allocated for the column in the policy, as well as the STORAGE clauses used for the automatically-generated tables and Oracle indexes that comprise a ConText index.

Because ConText index strings for indexed tokens are stored in memory before they are saved to the ConText index tables, it is vital that you allocate as much indexing memory as possible to avoid excessive index fragmentation.

When you create an Engine preference, you use the index_memory attribute for the GENERIC ENGINE Tile to allocate indexing memory.

If you plan to use parallel indexing, the memory specified for the Engine preference should be the amount of real memory available divided evenly among the number of ConText servers that will perform the indexing in parallel.

For example, if you are going to use three ConText servers in parallel to create an index for a column and you have 100 Mb of memory available on the machine on which the servers will be running, you should create an Engine preference with index_memory set to 33 Mb, then specify the preference in the policy for the column.
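
As a hedged sketch of the three-server example above, the following block creates an Engine preference that gives each server roughly one third of the 100 Mb. The preference name DOC_ENGINE and the description are illustrative only. The value also assumes that index_memory is specified in bytes, which this chapter does not state; check the description of the GENERIC ENGINE Tile for the actual unit before using it.

```sql
begin
  -- 33 Mb per server = 33 * 1024 * 1024 = 34603008
  -- (assumption: index_memory takes a value in bytes)
  ctx_ddl.set_attribute('INDEX_MEMORY', '34603008');
  ctx_ddl.create_preference('DOC_ENGINE',
                            'Engine pref for 3 parallel servers',
                            'GENERIC ENGINE');
end;
```

The preference would then be named in the engine_pref argument of the policy for the column.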

Suggestion:

To ensure the best results for indexing, calculate the total amount of real memory (not virtual memory) available on the machine which will be used to create the index, then specify this amount when you create an Engine preference.  

See Also:

For more information about creating policies, see "Creating a Column Policy" in this chapter.  

Creating a Theme Lexer Preference

If you have English-language documents in a column and you want to create a theme index for the column to enable theme queries, you need to first create a Lexer preference that calls the theme lexer, then include the preference in the policy definition for the column.

To create your own theme lexer preference, call CTX_DDL.CREATE_PREFERENCE and specify the THEME LEXER Tile.

For example:

begin
  ctx_ddl.create_preference ('MY_THEME_PREF',
                             'Pref for theme indexes',
                             'THEME LEXER');
end;

Note:

Because the THEME LEXER does not require any attributes to be set, you do not generally need to create Theme Lexer preferences; instead, you can use the THEME_LEXER predefined preference provided by ConText.  

See Also:

For more information about creating a policy that uses the theme lexer preference, see "Creating a Theme Indexing Policy" in this chapter.  

Creating a Stoplist Preference

A Stoplist preference is created by calling CTX_DDL.SET_ATTRIBUTE separately for each stopword in the list.

To define a Stoplist:

  1. Set the stop_word attribute (GENERIC STOP LIST Tile) for each word that you do not want ConText to index. In addition, for each call to SET_ATTRIBUTE for stop_word, specify a sequence from 1 to 4095. The sequence is used in the text index to record the stopwords that precede and follow each indexed term. This enables queries for phrases that contain stopwords.
    Note:

    The maximum number of terms (stopwords) that a Stoplist preference can contain is 4095.  

  2. Call CTX_DDL.CREATE_PREFERENCE and specify the GENERIC STOP LIST Tile to create the preference.

    For example:

    begin
      ctx_ddl.set_attribute('STOP_WORD', 'OF', 1);
      ctx_ddl.set_attribute('STOP_WORD', 'TO', 2);
      ctx_ddl.set_attribute('STOP_WORD', 'A', 3);
      . . .
      . . .
      . . .
      ctx_ddl.set_attribute('STOP_WORD', 'NO', 90);
      ctx_ddl.set_attribute('STOP_WORD', 'ONLY', 91);
      ctx_ddl.set_attribute('STOP_WORD', 'SO', 92);
      ctx_ddl.set_attribute('STOP_WORD', 'MOST', 93);
      ctx_ddl.set_attribute('STOP_WORD', 'BANK', 94);
      ctx_ddl.set_attribute('STOP_WORD', 'MAY', 95);
      ctx_ddl.set_attribute('STOP_WORD', 'INTO', 96);
      ctx_ddl.set_attribute('STOP_WORD', 'ANY', 97);
      ctx_ddl.set_attribute('STOP_WORD', 'GOVERNMENT', 98);
      ctx_ddl.create_preference('MY_STOPLIST',
                                'My list of stop words',
                                'GENERIC STOP LIST');
    end;
    

Creating Filter Preferences

When creating Filter preferences, the following considerations will determine which Tiles and attributes you use, as well as the values that you specify for each attribute:

Creating a Filter Preference Using Internal Filters

Single-Format Columns:

For a single-format column using one of the internal filters, create a Filter preference that sets the format attribute (BLASTER FILTER Tile) to the format used in your column.

The following example illustrates creating a Filter preference for a column that contains documents only in MS Word for Windows format:

begin
  ctx_ddl.set_attribute('FORMAT','11');
  ctx_ddl.create_preference('WP6_FILT',
                            'WP6 filter',
                            'BLASTER FILTER');
end;

Mixed-Format Columns:

For mixed-format columns using internal filters, create a Filter preference that sets the format attribute (BLASTER FILTER Tile) for the Autorecognize filter:

begin
  ctx_ddl.set_attribute('FORMAT','997');
  ctx_ddl.create_preference('MULTI_FILT',
                            'multiple internal filters',
                            'BLASTER FILTER');
end;

Creating a Filter Preference Using External Filters

Note:

Before a Filter preference that uses external filters can be created, one or more filter executables must be created and stored in the appropriate directory in your Oracle home directory.

You can choose to create your own external filter executables or use the executables provided with ConText.

For the location of the directory for the external filter executables, see the Oracle8 installation documentation specific to your operating system.  

Single-Format Columns:

For a single-format column that uses external filters, create a Filter preference that uses the command attribute (USER FILTER Tile) to specify the filter executable for the format used in your column.

The following example illustrates creating a Filter preference for a column that contains documents in AmiPro format and uses a filter executable named amipro.exe:

begin
  ctx_ddl.set_attribute('COMMAND','amipro.exe');
  ctx_ddl.create_preference('AMIPRO_FILT',
                            'amipro external filter',
                            'USER FILTER');
end;

Mixed-Format Columns:

For a mixed-format column that uses external filters only or external and internal filters, create a Filter preference that sets the executable attribute (BLASTER FILTER Tile) once for each of the external filters you want to use in your column.

Note:

The EXECUTABLE attribute requires that you specify a format ID which identifies the document format supported by the filter executable.

For a complete list of format IDs for document formats, see "Supported Formats for Mixed-Format Columns" in Chapter 10, "ConText Data Dictionary".  

The following example illustrates creating a Filter preference for a column that contains documents in AmiPro, PDF (Adobe Acrobat), and WordPerfect 6.0 formats:

begin
  ctx_ddl.set_attribute('EXECUTABLE', 19,'amipro.sh', 1);
  ctx_ddl.set_attribute('EXECUTABLE', 57,'acrobat.sh', 2);
  ctx_ddl.create_preference('MULT_FILT',
                            'multiple ext/int filters',
                            'BLASTER FILTER');
end;

Note:

It is not necessary to explicitly specify the filter executable for WordPerfect 6.0, because ConText provides an internal filter for WordPerfect 6.0.

When the Filter preference for a column uses the executable attribute (BLASTER FILTER Tile), ConText uses internal filters for all supported formats, unless an external filter is explicitly specified in the preference.  

Deleting a Preference

To delete a preference from the ConText data dictionary, use the PL/SQL procedure CTX_DDL.DROP_PREFERENCE.

For example:

execute ctx_ddl.drop_preference('PUB_DOCS')

To use DROP_PREFERENCE, you need to specify only the name (in this example, pub_docs) of the preference that you want to drop.

Note:

If a preference is used in a policy, the policy must be deleted from the ConText data dictionary before the preference can be deleted.  
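
Combined with the rule that a policy with an index cannot be dropped until its index is dropped, removing a preference that is in use comes down to dropping objects in dependency order. The following SQL*Plus sketch assumes that pub_docs is used only by the doc_pol policy and that an index has been created for doc_pol:

```sql
-- 1. drop the index created for the policy
execute ctx_ddl.drop_index('DOC_POL')
-- 2. drop the policy that uses the preference
execute ctx_ddl.drop_policy('DOC_POL')
-- 3. the preference is no longer referenced and can be dropped
execute ctx_ddl.drop_preference('PUB_DOCS')
```

If other policies also use the preference, each of them would need the same treatment before the last step succeeds.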

Managing Policies

This section provides details for using the CTX_DDL PL/SQL package to perform the following policy administration tasks:

In addition, this section describes the following tasks for specific types of policies:

Creating a Column Policy

To create a column policy for text indexing, use the PL/SQL procedure CTX_DDL.CREATE_POLICY.

For example:

begin
  ctx_ddl.create_policy (policy_name   => 'DOC_POL',
                         colspec       => 'DOCS.ARTICLES',
                         textkey       => 'PK',
                         dstore_pref   => 'PUB_DOCS',
                         engine_pref   => 'DOC_ENGINE',
                         filter_pref   => 'WORD6',
                         lexer_pref    => 'DOC_LINK',
                         wordlist_pref => 'CTXSYS.SOUNDEX',
                         stoplist_pref => 'MINI_STOP_LIST');
end;

In this example, the name of the policy is doc_pol. The policy does not have a description, nor does it use a source policy. The text column (specified by colspec) is articles in a table named docs. The textkey for the table is pk.

The preferences used in the policy are all user-owned preferences, except for the Wordlist preference, which uses the predefined preference named SOUNDEX.

Note:

The column name in colspec must include the table name, using the following syntax: table_name.column_name.

Also, it is not necessary to specify a preference for the Compressor category. ConText uses the predefined NULL Compressor preference as the default.  

Preferences and Policies in Other Schemas

In a policy, you can use preferences owned by other users; however, you must specify the fully qualified name of the preference. For example, to specify a preference owned by CTXSYS, such as the SOUNDEX preference, use the following syntax: CTXSYS.pref_name (e.g. CTXSYS.soundex).

In addition, if you use a source policy in a policy, you can specify either your template policies or the CTXSYS-owned template policies; however, you must specify the fully-qualified name of the template policy.

Creating a Theme Indexing Policy

To create a theme indexing policy, use the PL/SQL procedure CTX_DDL.CREATE_POLICY and specify either your own theme lexer preference or the predefined preference (THEME_LEXER) provided by ConText.

The following example illustrates how to create a policy identical to the previous policy example, except that my_theme_pref, the theme lexer preference created in the theme lexer example, is used in place of doc_link, which is a preference that calls the basic (indexing) lexer:

begin
  ctx_ddl.create_policy (policy_name   => 'DOC_POL',
                         colspec       => 'DOCS.ARTICLES',
                         textkey       => 'PK',
                         dstore_pref   => 'PUB_DOCS',
                         engine_pref   => 'DOC_ENGINE',
                         filter_pref   => 'WORD6',
                         lexer_pref    => 'MY_THEME_PREF',
                         wordlist_pref => 'CTXSYS.SOUNDEX',
                         stoplist_pref => 'MINI_STOP_LIST');
end;

When index creation is requested for this theme indexing policy, the theme lexer generates a theme index that can be used to perform theme queries.
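
If you do not need your own theme lexer preference, the same policy can reference the predefined preference instead. The sketch below assumes that THEME_LEXER is owned by CTXSYS, like the predefined SOUNDEX preference, and therefore uses its fully qualified name. The policy name DOC_THEME_POL is hypothetical.

```sql
begin
  ctx_ddl.create_policy (policy_name   => 'DOC_THEME_POL',
                         colspec       => 'DOCS.ARTICLES',
                         textkey       => 'PK',
                         dstore_pref   => 'PUB_DOCS',
                         engine_pref   => 'DOC_ENGINE',
                         filter_pref   => 'WORD6',
                         -- predefined preference; CTXSYS ownership is an assumption
                         lexer_pref    => 'CTXSYS.THEME_LEXER',
                         wordlist_pref => 'CTXSYS.SOUNDEX',
                         stoplist_pref => 'MINI_STOP_LIST');
end;
```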

See Also:

For more information about theme lexer preferences, see "Creating a Theme Lexer Preference" in this chapter.

For more information about theme indexes and queries, see "Theme Indexes" in Chapter 4, "Text Concepts".  

Using Composite Textkeys in a Policy

To create a policy that uses a composite textkey, use the PL/SQL procedure CTX_DDL.CREATE_POLICY; however, when you specify the textkey for the column, reference each of the primary or unique key columns (up to 16) that constitute the composite textkey for the column.

For example:

begin
  ctx_ddl.create_policy (policy_name   => 'DOC_POL',
                         colspec       => 'DOCS.ARTICLES',
                         textkey       => 'AUTH,TITLE',
                         dstore_pref   => 'PUB_DOCS',
                         engine_pref   => 'DOC_ENGINE',
                         filter_pref   => 'WORD6',
                         lexer_pref    => 'DOC_LINK',
                         wordlist_pref => 'CTXSYS.SOUNDEX',
                         stoplist_pref => 'MINI_STOP_LIST');
end;

In this example, the textkey for the ARTICLES column is a composite textkey consisting of the columns AUTH and TITLE in the DOCS table. The names of the textkey columns are separated by commas and are registered in the ConText data dictionary in the order in which they are specified.

Note:

There is a 256-character limit, including the comma separators, on the combined length of the column names in a composite textkey.

Also, there is a 256-character limit on the combined length of the columns themselves (as opposed to their names) in a composite textkey.

For more information about these limits, see "Composite Textkeys" in Chapter 4, "Text Concepts".  
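
Both limits can be checked before the policy is created. The following query is a rough sketch that uses the standard USER_TAB_COLUMNS dictionary view with the DOCS table and the AUTH and TITLE columns from the example above. It assumes that ConText measures the second limit against the declared column lengths (DATA_LENGTH), which this chapter does not state.

```sql
-- name_length:   length of the textkey string, including the comma separator
-- column_length: sum of the declared lengths of the textkey columns
select length('AUTH,TITLE') name_length,
       sum(data_length)     column_length
  from user_tab_columns
 where table_name  = 'DOCS'
   and column_name in ('AUTH', 'TITLE');
```

Under that assumption, neither value should exceed 256.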

Creating a Template Policy

To create a template policy, use the PL/SQL procedure CTX_DDL.CREATE_TEMPLATE_POLICY.

For example:

begin
  ctx_ddl.create_template_policy (policy_name   => 'TEMPLATE_POL',
                                  dstore_pref   => 'PUB_DOCS',
                                  engine_pref   => 'DOC_ENGINE',
                                  filter_pref   => 'WORD6',
                                  lexer_pref    => 'DOC_LINK',
                                  wordlist_pref => 'SOUNDEX_YES',
                                  stoplist_pref => 'MINI_STOP_LIST');
end;

In this example, the name of the policy is TEMPLATE_POL. The preferences for the policy are as specified above. If TEMPLATE_POL is specified as a source policy when creating a new policy, the preferences for TEMPLATE_POL are copied to the new policy.

Note:

You can also use CTX_DDL.CREATE_POLICY to create a template policy. When you call CREATE_POLICY, do not specify a value for colspec.  
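
For comparison, here is a sketch of a template policy created through CREATE_POLICY, following the note above. It repeats the preferences from the CREATE_TEMPLATE_POLICY example and leaves out colspec. The policy name TEMPLATE_POL2 is hypothetical, and omitting textkey as well is an assumption, on the grounds that a template policy is not attached to a table.

```sql
begin
  -- no colspec (and no textkey): the policy is not attached to a column
  ctx_ddl.create_policy (policy_name   => 'TEMPLATE_POL2',
                         dstore_pref   => 'PUB_DOCS',
                         engine_pref   => 'DOC_ENGINE',
                         filter_pref   => 'WORD6',
                         lexer_pref    => 'DOC_LINK',
                         wordlist_pref => 'SOUNDEX_YES',
                         stoplist_pref => 'MINI_STOP_LIST');
end;
```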

Modifying a Policy

To modify a policy, use the PL/SQL procedure CTX_DDL.UPDATE_POLICY.

Note:

If a policy has been used to create an index for the text column in the policy, the index must be dropped before the policy can be updated.

In addition, you cannot modify the attributes for a policy; you can only modify the description and preferences for a policy.  

For example:

begin
  ctx_ddl.update_policy (policy_name   => 'DOC_POL',
                         filter_pref   => 'HTML_DOC',
                         wordlist_pref => 'CTXSYS.NO_SOUNDEX');
end;

In this example, a Filter preference named html_doc replaces the existing preference for the Filter category, while the CTXSYS-owned Wordlist preference named no_soundex replaces the existing preference for the Wordlist category.

Deleting a Policy

To delete a policy from the ConText data dictionary, use the PL/SQL procedure CTX_DDL.DROP_POLICY.

For example:

execute ctx_ddl.drop_policy ('DOC_POL')

To use DROP_POLICY, you only need to specify the name (in this example, doc_pol) of the policy that you want to drop.

Note:

If a policy has been used to create an index for the text column in the policy, the index must be dropped before the policy can be deleted.

Managing Indexes

This section provides details for using the CTX_DDL PL/SQL package to perform the following indexing tasks:

In addition, this section discusses the following topics:

Creating an Index

To create a ConText index (theme or text), use the CTX_DDL.CREATE_INDEX procedure.

The only argument required for CREATE_INDEX is the name of the policy for the text column to be indexed.

For example:

execute ctx_ddl.create_index('DOC_POL')

In this example, CREATE_INDEX creates an index for the text column defined in a policy named doc_pol.

Note:

During indexing, ConText creates Oracle indexes for the index tables using the temporary tablespace for CTXSYS. To ensure successful creation of the Oracle indexes, the temporary tablespace for CTXSYS must have enough space to store the temporary segments used in creating the Oracle indexes.

The temporary tablespace for CTXSYS is defined during installation of ConText.

For more information about defining the temporary tablespace for CTXSYS, see the Oracle8 installation documentation specific to your operating system.  

ConText Indexing in Parallel

You can optionally include a numeric value in the argument string for CTX_DDL.CREATE_INDEX to specify the number of ConText servers used for parallel indexing.

For example:

execute ctx_ddl.create_index('DOC_POL', 4)

In this example, CREATE_INDEX uses the first four available ConText servers with the DDL personality to create an index in parallel for the text column defined in the doc_pol policy.

Note:

The value you specify for parallel creation of ConText indexes cannot exceed the number of ConText servers currently running with the DDL personality. If you specify more ConText servers than the number of servers running, CREATE_INDEX will not execute.  

See Also:

For more information about ConText indexing in parallel, see "Indexing in Parallel" in Chapter 4, "Text Concepts".

For more information about the benefits of parallel indexing, see "Parallel Processing" in Chapter 8, "Tuning ConText".  

Parallel Creation of Oracle Indexes

ConText indexing in parallel does not automatically cause the Oracle indexes on the ConText index tables to be created in parallel.

To have Oracle8 create Oracle indexes in parallel, the parallel query option for Oracle8 must be installed. In addition, a value must be specified for the PARALLEL clause used in the CREATE INDEX command.

To specify a value for the PARALLEL clause used in the CREATE INDEX command for the Oracle index created on the token table in the ConText index, use the i1i_other_parms attribute (GENERIC ENGINE Tile) in the Engine preference for the column policy.

To set the PARALLEL clause for the Oracle indexes created on the other tables in the ConText index, use the kid_other_parms, kik_other_parms, lix_other_parms, sri_other_parms, and sei_other_parms attributes.

For example:

begin
  ctx_ddl.set_attribute('I1I_OTHER_PARMS', ' PARALLEL 4');
  ctx_ddl.set_attribute('KID_OTHER_PARMS', ' PARALLEL 4');
  ctx_ddl.set_attribute('KIK_OTHER_PARMS', ' PARALLEL 4');
  ctx_ddl.create_preference('PAR_INDEX',
                            'Parallel indexing x 4',
                            'GENERIC ENGINE');
end;

In this example, an Engine preference named par_index is created with a PARALLEL value of 4 for the Oracle indexes created on the token and document mapping tables in ConText indexes.

If the par_index preference is used in a column policy, when a ConText index is created for the policy, four Oracle8 server processes create the indexes in parallel for the token and document mapping tables.

Note:

If you do not set the other_parms attributes for the indexes on a particular ConText index table, the value for PARALLEL is derived from the PARALLEL value specified for the CREATE TABLE command used to create the ConText index table.

If no PARALLEL value is specified for the ConText index table, the default is 1.  

See Also:

For more information about the PARALLEL clause in the CREATE INDEX and CREATE TABLE commands, see Oracle8 Server SQL Reference.

For more information about the parallel query option, see Oracle8 Server Tuning.  

Indexing Existing Columns (Hot Upgrade)

ConText does not require you to create new tables or modify existing tables to create indexes for text already stored in a database. If you already have text stored in a column in an existing database, you can use ConText to index the text in the column without changing the structure of the table itself. Once the column has an index, queries can be submitted against the column.

The only requirements are:

The procedure for indexing an existing text column is identical to the procedure for indexing a new text column:

  1. Create preferences (optional)
  2. Create a policy for the column
  3. Create an index using the policy for the column
    See Also:

    For more information about creating preferences and policies, see "Creating a Preference" and "Creating a Column Policy" in this chapter.

    For more information about creating ConText indexes, see "Creating an Index" in this chapter.  
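
The three steps can be sketched as follows for an existing table. The table (ARCHIVE), its text column (BODY), its primary key column (ID), and the preference and policy names are all hypothetical. The sketch also assumes that the preference categories left unspecified in the policy fall back to default preferences.

```sql
begin
  -- 1. (optional) create a preference; here, a two-word Stoplist
  ctx_ddl.set_attribute('STOP_WORD', 'OF', 1);
  ctx_ddl.set_attribute('STOP_WORD', 'TO', 2);
  ctx_ddl.create_preference('ARCHIVE_STOPLIST',
                            'Short stoplist for archive text',
                            'GENERIC STOP LIST');

  -- 2. create a policy for the existing column; the table is not altered
  ctx_ddl.create_policy (policy_name   => 'ARCHIVE_POL',
                         colspec       => 'ARCHIVE.BODY',
                         textkey       => 'ID',
                         stoplist_pref => 'ARCHIVE_STOPLIST');

  -- 3. create the ConText index using the policy
  ctx_ddl.create_index('ARCHIVE_POL');
end;
```

As with any index creation, step 3 relies on a ConText server with the DDL personality being available.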

Updating an Index

Once an index is created for a text column, ConText automatically updates the index each time a document (row) is added, deleted, or modified in the table.

In addition, the index can be manually updated for a single document using CTX_DML.REINDEX.

Dropping an Index

To drop an existing index from the data dictionary, use the PL/SQL procedure CTX_DDL.DROP_INDEX.

For example:

execute ctx_ddl.drop_index ('DOC_POL')

In this example, the index and associated tables for doc_pol are deleted from the database. To perform subsequent text queries against the text column for doc_pol, you must re-create the index using CTX_DDL.CREATE_INDEX.

Optimizing an Index

Index optimization can be used to help reduce the size of ConText indexes, as well as update the indexes to reflect deleted/modified documents.

To optimize an index in the data dictionary, use the PL/SQL procedure CTX_DDL.OPTIMIZE_INDEX.

For example:

execute ctx_ddl.optimize_index('DOC_POL', ctx_ddl.defragment_to_new_table);

In this example, the optimization method used for the ConText index for doc_pol is defragment_to_new_table. This method uses a second, mirror ConText index table to compact the index fragments for all indexed terms with multiple fragments and remove references from the index strings for all deleted/modified documents.

See Also:

For more information about ConText index optimization, see "Index Optimization" in Chapter 4, "Text Concepts".

Parallel Optimization

Similar to index creation, index optimization can be performed in parallel. To perform parallel index optimization, specify a degree of parallelism when calling the OPTIMIZE_INDEX procedure.

For example:

begin
  ctx_ddl.optimize_index(policy_name => 'DOC_POL',
                         optyp       => ctx_ddl.defragment_to_new_table,
                         parallel    => 4);
end;

In this example, OPTIMIZE_INDEX is called for doc_pol with an optimization method of defragment_to_new_table and degree of parallelism of 4.

Note:

The parallel issues for Oracle index creation on ConText index tables apply to ConText index optimization as well.

For more information about the issues regarding parallel index creation, see "ConText Indexing in Parallel" in this chapter.  

Resuming Index Creation/Optimization

If an index operation (creation or optimization) fails, you can use the PL/SQL procedure CTX_DDL.RESUME_FAILED_INDEX to resume the operation once the reason for the failure has been determined and corrected/removed.

For example:

execute ctx_ddl.resume_failed_index('DOC_POL')

In this example, no operation is specified, so the default operation (index creation) is resumed for the text column in the DOC_POL policy.

You can also choose to start the index operation over from the beginning using CTX_DDL.CREATE_INDEX or CTX_DDL.OPTIMIZE_INDEX.

You can view the index log in the System Administration tool or through the CTX_INDEX_LOG view to determine when and where the index operation failed.

The log also can be used to determine whether to resume the operation or simply start the operation over, based on the stage at which the operation failed and/or the percentage of the operation completed before failure.

Managing Thesauri

This section provides details for using the CTX_THES PL/SQL package and/or ctxload to perform the following thesaurus administration tasks:

Creating a Thesaurus

To create a thesaurus, use the PL/SQL function CTX_THES.CREATE_THESAURUS or use the ctxload command-line utility.

Note:

CREATE_THESAURUS creates a thesaurus with no entries.

ctxload creates a thesaurus using a thesaurus import file. The file can contain thesaurus entries or can be empty.

To add entries to a thesaurus, you must use CTX_THES.CREATE_PHRASE.  

The following SQL*Plus example creates an empty thesaurus named tech_thes using CREATE_THESAURUS:

variable thesid number
execute :thesid := ctx_thes.create_thesaurus('tech_thes')

The following command-line example creates a thesaurus named science_thes using ctxload:

ctxload -user ctxdev/passwd -thes -name science_thes -file sci_terms.txt

In this example, the owner of the thesaurus is an Oracle user named ctxdev. The -thes argument specifies that ctxload is used to create/import a thesaurus. The name of the thesaurus import file is sci_terms.txt.

See Also:

For a complete description of ctxload requirements and options, as well as the structure and syntax of the thesaurus import file, see "ctxload Utility" in Chapter 9, "Executables and Utilities".  

Creating/Updating a Thesaurus Entry

To create an entry in an existing thesaurus or update an existing entry, use the PL/SQL function CTX_THES.CREATE_PHRASE. The only update allowed for an existing entry is the definition of a new relationship between the phrase in the entry and another phrase in the thesaurus.

Suggestion:

Because the relationships between terms in a thesaurus entry can be complex, updating entries using CREATE_PHRASE is not recommended.

If possible, use the System Administration tool, which provides a graphical representation of thesaurus entries and relationships, to update entries.  

The following SQL*Plus example creates two new phrases (intranet and world wide web) in a thesaurus named tech_thes:

variable phraseid number
execute :phraseid := ctx_thes.create_phrase('tech_thes','intranet')
execute :phraseid := ctx_thes.create_phrase('tech_thes','world wide web')

The following SQL*Plus example establishes the phrase intranet as a narrower partitive term for world wide web in tech_thes:

variable phraseid number
execute :phraseid := ctx_thes.create_phrase('tech_thes','intranet','NTP','world wide web')

Deleting a Thesaurus

To delete an existing thesaurus, use the PL/SQL procedure CTX_THES.DROP_THESAURUS.

For example:

execute ctx_thes.drop_thesaurus('science_thes')

In this example, a thesaurus named science_thes and all of its entries are deleted from the thesaurus tables.

Creating a Thesaurus Output File

To create an output file containing all the entries for an existing thesaurus, use the ctxload command-line utility.

For example:

ctxload -user ctxdev/passwd -thesdump -name tech_thes -file tech_terms.out

In this example, the owner of the thesaurus is an Oracle user named ctxdev. The -thesdump argument specifies that ctxload is used to create/export a thesaurus output file. The name of the thesaurus output file is tech_terms.out and it is created in the directory from which ctxload is run.

See Also:

For a complete description of ctxload requirements and options, as well as the structure and syntax of the thesaurus import file, see "ctxload Utility" in Chapter 9, "Executables and Utilities".  




Copyright © 1997 Oracle Corporation.

All Rights Reserved.
