Oracle8i interMedia Text Migration
Release 8.1.5

A67845-01


7
Document Presentation

This chapter describes how to migrate document presentation. The following topics are covered: highlighting, and obtaining lists of themes, Gists, and theme summaries.

Highlighting

In interMedia Text query applications, you can present selected documents with query terms highlighted for text queries or with themes highlighted for ABOUT queries.

You can generate three types of output associated with highlighting: a marked-up version of the document, a plain text version of the document (filtered output), and highlight offset information for the document.

In pre-8.1.5, you used the single procedure CTX_QUERY.HIGHLIGHT to generate all three types of output.

In interMedia Text 8.1.5, these three types of output are generated by three different procedures in the CTX_DOC (document services) package. In addition, you can get plain text and HTML versions for each type of output.

The result tables you use to store this output in 8.1.5 are also different from pre-8.1.5 result tables.

In interMedia Text 8.1.5, the output for theme highlighting is different from what it was in pre-8.1.5. In pre-8.1.5, the system highlighted paragraphs in the document that best represented the query. In interMedia Text 8.1.5, individual themes, which can be words or phrases, are highlighted.

Pre-8.1.5 Method

Use CTX_QUERY.HIGHLIGHT to obtain highlight information, marked-up documents, and filtered documents.

For example, to highlight all the occurrences of the term dog in a document identified by textkey 14, issue the following statement:

ctx_query.highlight (
       cspec=> 'text_policy',
       textkey => '14', 
       query => 'dog', 
       id=> 14, 
       hightab => 'highlight_ascii', 
       mutab   => 'mu_ascii' );

This example stores the offset information in the HIGHLIGHT_ASCII table (named by the hightab parameter) and the highlighted marked-up document in the MU_ASCII table (named by the mutab parameter).

New 8.1.5 Solutions

Text Highlighting

For text highlighting, the behavior is the same as in pre-8.1.5. You supply the query, and Oracle highlights the words in the document that satisfy the query. You can obtain plain-text or HTML highlighting.

Theme Highlighting

For theme queries, interMedia Text 8.1.5 procedures highlight and mark up words or phrases that best represent the theme query. This behavior is different from pre-8.1.5, where paragraphs were highlighted for theme queries.

Highlight Procedure

Highlight offset information is useful when you write your own custom routines for displaying documents.

To obtain highlight offset information, use the CTX_DOC.HIGHLIGHT procedure. This procedure takes a query and a document, and returns highlight offset information for either plaintext or HTML formats.

With offset information, you have the freedom to highlight with different font types or colors rather than using the standard plain text markup obtained from CTX_DOC.MARKUP.

See Also:

For more information about using CTX_DOC.HIGHLIGHT, see its specification in the Oracle8i interMedia Text Reference.  
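
As a minimal sketch of obtaining highlight offsets with CTX_DOC.HIGHLIGHT, the following statements create a result table, populate it for the query dog against document 34, and read back the offsets. The table name HIGHLIGHT_TAB and the index name newsindex are assumptions for illustration. The column layout and the argument order are assumed to match the CTX_DOC.HIGHLIGHT specification, so verify both in the Reference before use.

```sql
-- Hypothetical result table; columns assumed to follow the highlight table layout.
create table highlight_tab (
    query_id  number,
    offset    number,
    length    number);

begin
  -- Arguments (assumed order): index, textkey, query, result table, query_id.
  -- plaintext => TRUE requests offsets into the plain-text version.
  ctx_doc.highlight('newsindex', '34', 'dog', 'HIGHLIGHT_TAB', 1,
                    plaintext => TRUE);
end;
/

-- Each row marks one highlighted span in the document.
select offset, length
  from highlight_tab
 where query_id = 1
 order by offset;
```

A custom display routine can then walk these rows in offset order and wrap each span in whatever font or color markup the application uses.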

Markup Procedure

The CTX_DOC.MARKUP procedure takes a document reference and a query, and returns a marked-up version of the document. The output can be either marked-up plaintext or marked-up HTML.

In 8.1.5, you can customize the markup sequence for HTML navigation.

See Also:

For more information about CTX_DOC.MARKUP, see its specification in the Oracle8i interMedia Text Reference.  
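
The following sketch shows one way a CTX_DOC.MARKUP call might look when HTML output with navigation tags is wanted. The table name MARKUP_TAB and the index name newsindex are assumptions for illustration. The tagset value HTML_NAVIGATE and the argument order are assumed from the CTX_DOC.MARKUP specification, so confirm them in the Reference.

```sql
-- Hypothetical result table; columns assumed to follow the markup table layout.
create table markup_tab (
    query_id  number,
    document  clob);

begin
  -- Arguments (assumed order): index, textkey, query, result table, query_id.
  -- plaintext => FALSE requests marked-up HTML rather than plain text.
  ctx_doc.markup('newsindex', '34', 'dog', 'MARKUP_TAB', 1,
                 plaintext => FALSE,
                 tagset    => 'HTML_NAVIGATE');
end;
/

-- Retrieve the marked-up document for this request.
select document
  from markup_tab
 where query_id = 1;
```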

Filter Procedure

When documents are stored in their native formats such as Microsoft Word, you can use the filter procedure CTX_DOC.FILTER to obtain either a plain text or HTML version of the document.

See Also:

For more information about CTX_DOC.FILTER, see its specification in the Oracle8i interMedia Text Reference.  
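
A call to CTX_DOC.FILTER might look like the following sketch, which requests a plain text version of document 34. The table name FILTER_TAB and the index name newsindex are assumptions for illustration, and the argument order is assumed from the CTX_DOC.FILTER specification in the Reference.

```sql
-- Hypothetical result table; columns assumed to follow the filter table layout.
create table filter_tab (
    query_id  number,
    document  clob);

begin
  -- Arguments (assumed order): index, textkey, result table, query_id.
  -- plaintext => TRUE requests plain text; FALSE requests HTML.
  ctx_doc.filter('newsindex', '34', 'FILTER_TAB', 1, plaintext => TRUE);
end;
/

-- Retrieve the filtered document for this request.
select document
  from filter_tab
 where query_id = 1;
```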

Obtaining List of Themes, Gists, and Theme Summaries

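In 8.1.5, the procedures that generate lists of themes, Gists, and theme summaries have moved from the CTX_LING package to the CTX_DOC package, requests are processed synchronously, and you can specify the Gist size in the procedure call.

The following table describes lists of themes, Gists, and theme summaries. Their definitions have not changed in 8.1.5: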

Table 7-1

Output Type      Description
---------------  -----------------------------------------------------------
List of Themes   A list of the main concepts of a document. You can generate
                 a list of themes where each theme is a single word or
                 phrase, or where each theme is a hierarchical list of
                 parent themes.
Gist             Text in a document that best represents what the document
                 is about as a whole.
Theme Summary    Text in a document that best represents a given theme in
                 the document.

Pre-8.1.5 Method

Creating Output Tables

Before you generate lists of themes, theme summaries, or Gists, you must create result tables to store the CTX_LING output.

To create a theme table called CTX_THEMES to store the list of themes from REQUEST_THEMES, issue the following SQL statement:

    create table ctx_themes (
        cid        number,
        pk         varchar2(64),
        theme      varchar2(2000),
        weight     number);

To create a Gist table called CTX_GIST to store the Gist or theme summaries from REQUEST_GIST, issue the following SQL statement:

    create table ctx_gist (
        cid        number,
        pk         varchar2(64),
        pov        varchar2(80),
        gist       long); 

List of Themes

Use CTX_LING.REQUEST_THEMES to generate themes.

Example

The following anonymous PL/SQL block generates a list of themes for document 20 by calling CTX_LING.REQUEST_THEMES and then CTX_LING.SUBMIT.

declare handle number;
begin
ctx_ling.request_themes('CTXSYS.DOC_POLICY','20','CTX_THEMES');
handle := ctx_ling.submit; 
end;

Theme Summaries and Gists

Use CTX_LING.REQUEST_GIST to generate theme summaries and gists.

Example

The following anonymous PL/SQL block generates a theme summary for document 20 about the theme of insects. The theme summary is generated by calling CTX_LING.REQUEST_GIST and then CTX_LING.SUBMIT.

declare handle number;
begin
ctx_ling.request_gist('CTXSYS.DOC_POLICY','20','CTX_GIST',
                      'PARAGRAPH', 'insects');
handle := ctx_ling.submit; 
end;

Full Theme Output

You can obtain a list of themes where each element in the list is a hierarchical list of parent themes. To do so, issue the following statements:

SQL> exec ctx_ling.set_full_themes(TRUE)  
SQL> exec ctx_ling.request_themes('ctx_thidx', pk, 'ctx_themes')  
SQL> exec ctx_ling.submit(200)  

Changing Gist Size

You change the default size of Gists using the ConText Workbench administration tool.


New 8.1.5 Solution

The CTX_LING package is no longer supported. The Gist and theme generation procedures are now in the CTX_DOC package. Because document services requests are synchronous, you do not need to submit them explicitly, and no servers need to be running.

List of Themes

A list of themes is a list of the main concepts in a document.

Use the CTX_DOC.THEMES procedure to generate lists of themes.

See Also:

To learn about the command syntax for CTX_DOC.THEMES, see Oracle8i interMedia Text Reference.  

Theme Table

To create a theme table:

create table ctx_themes (query_id number, 
                         theme varchar2(2000), 
                         weight number);

Single Themes

To obtain a list of themes where each element in the list is a single theme, issue:

begin
ctx_doc.themes('newsindex',34,'CTX_THEMES',1,full_themes => FALSE);
end;

Full Themes

To obtain a list of themes where each element in the list is a hierarchical list of parent themes, issue:

begin
ctx_doc.themes('newsindex',34,'CTX_THEMES',1,full_themes => TRUE);
end;
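
Once either call above has run, the themes can be read back from the result table. The following query is a small sketch that assumes the CTX_THEMES table created earlier and the query_id value 1 used in the examples. Ordering by weight is an illustrative choice, not a requirement.

```sql
-- List the themes stored for request 1, strongest first.
select theme, weight
  from ctx_themes
 where query_id = 1
 order by weight desc;
```

Because rows from separate calls that share a query_id accumulate in the same table, it may help to delete old rows or use a distinct query_id for each request.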

Gist and Theme Summary

The definitions of Gist and theme summary have not changed for 8.1.5. A Gist is the text of a document that best represents what the document is about as a whole. A theme summary is the text of a document that best represents a single theme in the document.

In 8.1.5, you can specify the size of the Gist or theme summary when you call the procedure.

Use the procedure CTX_DOC.GIST to generate Gists and theme summaries.

See Also:

To learn about the command syntax for CTX_DOC.GIST, see Oracle8i interMedia Text Reference.  

Gist Table

To create a gist table:

create table ctx_gist (query_id  number,
                       pov       varchar2(80), 
                       gist      CLOB);

Gists

The following example returns a default-sized, paragraph-level Gist for document 34:

begin
ctx_doc.gist('newsindex',34,'CTX_GIST',1,'PARAGRAPH', pov =>'GENERIC');
end;

The following example generates a non-default size Gist of ten paragraphs:

begin
ctx_doc.gist('newsindex',34,'CTX_GIST',1,'PARAGRAPH', pov =>'GENERIC',        
numParagraphs => 10);
end;

The following example generates a Gist whose number of paragraphs is ten percent of the total paragraphs in the document:

begin 
ctx_doc.gist('newsindex',34,'CTX_GIST',1, 'PARAGRAPH', pov =>'GENERIC', 
maxPercent => 10);
end;

Theme Summary

The following example returns a theme summary on the theme of insects for the document with textkey 34. The default Gist size is returned.

begin
ctx_doc.gist('newsindex',34,'CTX_GIST',1, 'PARAGRAPH', pov => 'insects');
end;
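
To inspect what the calls above produced, the result table can be queried directly. This sketch assumes the CTX_GIST table created earlier and the query_id value 1 used in the examples. The pov column distinguishes a document-level Gist from a theme summary.

```sql
-- Show each Gist or theme summary stored for request 1,
-- along with the point of view it was generated for.
select pov, gist
  from ctx_gist
 where query_id = 1;
```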




Copyright © 1999 Oracle Corporation.

All Rights Reserved.
