Oracle8i Spatial User's Guide and Reference
Release 8.1.5

A67295-01

Library

Product

Contents

Index

Prev Next

11
Loading Spatial Data

This chapter describes how to load spatial data into a database, including storing the data in a table and creating a spatial index for it. This chapter refers to the relational Spatial model only.

11.1 Load Model

There are two steps involved in loading raw data into a spatial database such that it can be queried efficiently:

  1. Loading the data into spatial tables

  2. Creating or updating the index on the spatial tables

Table 11-1 through Table 11-4 show the format of the tables or views needed to store and index spatial data. Note that these tables show the relational schema.

Table 11-1 <layername>_SDOLAYER
SDO_ORDCNT   SDO_LEVEL   SDO_NUMTILES   SDO_MAXLEVEL   SDO_COORDSYS  
<number>   <number>   <number>   <number>   <varchar>  
Table
Table 11-2 <layername>_SDODIM Table or View
SDO_DIMNUM   SDO_LB   SDO_UB   SDO_TOLERANCE   SDO_DIMNAME  
<number>   <number>   <number>   <number>   <varchar>  
Table 11-3 <layername>_SDOGEOM Table or View
SDO_GID   SDO_ESEQ   SDO_ETYPE   SDO_SEQ   SDO_X1   SDO_Y1   ...   SDO_Xn   SDO_Yn  
<number>   <number>   <number>   <number>   <number>   <number>   ...   <number>   <number>  
Table 11-4 <layername>_SDOINDEX Table
SDO_GID   SDO_CODE   SDO_MAXCODE   SDO_GROUPCODE   SDO_META  
<number>   <raw>   <raw>   <raw>   <raw>  

11.2 Load Process

The process of loading data can be classified into two categories:

11.2.1 Bulk Loading

Bulk loading can be used to import large amounts of legacy or ASCII data into a spatial database. Bulk loading is accomplished using SQL*Loader1.

Example 11-1 shows the format of the raw data and control file that would be required to load the data into the SDOGEOM table with the layer name ROADS. You can choose any format of ASCII data as long you can write a SQL*Loader control file to load that data into the tables.

Assume that the ASCII data consists of a file with delimited columns, and separate rows fixed by the limits of the table with the format shown in Example 11-1:

Example 11-1 Raw Data Format

geometry rows:    GID, ESEQ, ETYPE, SEQ, LON1, LAT1, LON2, LAT2

The coordinates in the geometry rows represent the end points of line segments, which taken together, represent a polygon. Example 11-2 shows the control file for loading the data into the geometry table.

Example 11-2 Control File to Load Data into the Geometry Table

LOAD DATA INFILE *                                          
INTO TABLE ROADS_SDOGEOM                                    
FIELDS TERMINATED BY WHITESPACE TRAILING NULLCOLS           
(SDO_GID INTEGER EXTERNAL,                                  
SDO_ESEQ INTEGER EXTERNAL,                                  
SDO_ETYPE INTEGER EXTERNAL,                                 
SDO_SEQ INTEGER EXTERNAL,                                   
SDO_X1 FLOAT EXTERNAL,                                      
SDO_Y1 FLOAT EXTERNAL,                                      
SDO_X2 FLOAT EXTERNAL,                                      
SDO_Y2 FLOAT EXTERNAL)                                      

BEGINDATA  
1 0 3 0 -122.401200   37.805200 -122.401900   37.805200 
1 0 3 1 -122.401900   37.805200 -122.402400   37.805500 
1 0 3 2 -122.402400   37.805500 -122.403100   37.806000 
1 0 3 3 -122.403100   37.806000 -122.404400   37.806800 
1 0 3 4 -122.404400   37.806800 -122.401200   37.805200 
1 1 3 0 -122.405900   37.806600 -122.407549   37.806394 
1 1 3 1 -122.407549   37.806394 -122.408300   37.806300 
1 1 3 2 -122.408300   37.806300 -122.409100   37.806200 
1 1 3 3 -122.409100   37.806200 -122.405900   37.806600 
2 0 2 0 -122.410800   37.806000 -122.412300   37.805800 
2 0 2 1 -122.412300   37.805800 -122.414100   37.805600 
2 0 2 2 -122.414100   37.805600 -122.412300   37.805800 
2 0 2 3 -122.412300   37.805800 -122.410800   37.806000 
3 0 1 0 -122.567474   38.643564 
3 0 1 1 -126.345345   39.345345 


Note that table ROADS_SDOGEOM exists in the schema before attempting the load.

In Example 11-3, the data resides in a single flat file and the data set consists of point, line string, and polygon data. The data uses fixed-position columns and overloaded table rows.

Example 11-3 Raw Data Format

SDO_GID  SDO_ESEQ  SDO_ETYPE  SDO_SEQ  SDO_X1  SDO_Y1  SDO_X2  SDO_Y2

The corresponding control file for this format of input data is shown in Example 11-4

Example 11-4 Control File to Load from a Single Flat File

LOAD DATA INFILE *                                          
INTO TABLE NEW_SDOGEOM  	                                    
(SDO_GID POSITION (1:5) INTEGER EXTERNAL,                       
SDO_ESEQ POSITION (7:10) INTEGER EXTERNAL,                      
SDO_ETYPE POSITION (12:15) INTEGER EXTERNAL,
SDO_SEQ POSITION (17:21) INTEGER EXTERNAL,                      
SDO_X1 POSITION (23:35) FLOAT EXTERNAL,                         
SDO_Y1 POSITION (37:48) FLOAT EXTERNAL,                         
SDO_X2 POSITION (50:62) FLOAT EXTERNAL,                         
SDO_Y2 POSITION (64:75) FLOAT EXTERNAL)  

BEGINDATA      	                                                         
1     0    3    0    -122.401200   37.805200   -122.401900   37.805200 
1     0    3    1    -122.401900   37.805200   -122.402400   37.805500 
1     0    3    2    -122.402400   37.805500   -122.403100   37.806000 
1     0    3    3    -122.403100   37.806000   -122.404400   37.806800 
1     0    3    4    -122.404400   37.806800   -122.401200   37.805200 
1     1    3    0    -122.405900   37.806600   -122.407549   37.806394 
1     1    3    1    -122.407549   37.806394   -122.408300   37.806300 
1     1    3    2    -122.408300   37.806300   -122.409100   37.806200 
1     1    3    3    -122.409100   37.806200   -122.405900   37.806600 
2     0    2    0    -122.410800   37.806000   -122.412300   37.805800 
2     0    2    1    -122.412300   37.805800   -122.414100   37.805600 
2     0    2    2    -122.414100   37.805600   -122.412300   37.805800 
2     0    2    3    -122.412300   37.805800   -122.410800   37.806000 
3     0    1    0    -122.567474   38.643564  
3     0    1    1    -126.345345   39.345345  

11.2.2 Transactional Insert Using SQL

Spatial uses standard Oracle8i tables that can be accessed or loaded with standard SQL syntax. Example 11-5 loads data for a geometry (GID 17) consisting of a polygon with four sides that contains both a hole and a point. Notice that the first coordinate of the polygon (5, 20) is repeated at the end to close the polygon.

Example 11-5 Transactional Insert

INSERT INTO SAMPLE_SDOGEOM (SDO_GID, SDO_ESEQ, SDO_ETYPE, SDO_SEQ, 
                              SDO_X1, SDO_Y1, SDO_X2, SDO_Y2, SDO_X3, 
                              SDO_Y3, SDO_X4, SDO_Y4, SDO_X5, SDO_Y5)
  VALUES (17, 0, 3, 0, 5, 20, 5, 30, 10, 30, 10, 20, 5, 20);

   -- hole 
INSERT INTO SAMPLE_SDOGEOM (SDO_GID, SDO_ESEQ, SDO_ETYPE, SDO_SEQ, 
                              SDO_X1, SDO_Y1, SDO_X2, SDO_Y2, SDO_X3, 
                              SDO_Y3, SDO_X4, SDO_Y4, SDO_X5, SDO_Y5)
  VALUES (17, 1, 3, 0, 8, 21, 8, 24, 9, 24, 9, 21, 8, 21);

   -- point
INSERT INTO SAMPLE_SDOGEOM (SDO_GID, SDO_ESEQ, SDO_ETYPE, SDO_SEQ, 
                              SDO_X1, SDO_Y1)
  VALUES (17, 2, 1, 0, 9, 29);
    

The SQL INSERT statement inserts one row of data per call. In Example 11-5, the table had enough columns to store the polygon in a single row. However, if your table had fewer columns (or your polygon had more points), you would have to perform multiple inserts in order to match the table structure; the data would not wrap automatically to the next row. To load a large geometry, repeat the SDO_GID, SDO_ESEQ, and SDO_ETYPE, and increment the SDO_SEQ for each line as shown in Example 11-6.

Example 11-6 Transactional Insert for a Large Geometry

INSERT INTO SAMPLE2_SDOGEOM (SDO_GID, SDO_ESEQ, SDO_ETYPE, SDO_SEQ, 
                              SDO_X1, SDO_Y1, SDO_X2, SDO_Y2, SDO_X3,
                              SDO_Y3, SDO_X4, SDO_Y4, SDO_X5, SDO_Y5)
  VALUES (18, 0, 3, 0, 1, 15, 1, 16, 2, 17, 3, 17, 4, 18);

INSERT INTO SAMPLE2_SDOGEOM (SDO_GID, SDO_ESEQ, SDO_ETYPE, SDO_SEQ, 
                              SDO_X1, SDO_Y1, SDO_X2, SDO_Y2, SDO_X3,
                              SDO_Y3, SDO_X4, SDO_Y4, SDO_X5, SDO_Y5)
  VALUES (18, 0, 3, 1, 4, 18, 5, 18, 6, 19, 7, 18, 6, 17);

INSERT INTO SAMPLE2_SDOGEOM (SDO_GID, SDO_ESEQ, SDO_ETYPE, SDO_SEQ, 
                              SDO_X1, SDO_Y1, SDO_X2, SDO_Y2, SDO_X3,
                              SDO_Y3, SDO_X4, SDO_Y4, SDO_X5, SDO_Y5)
  VALUES (18, 0, 3, 2, 6, 17, 7, 16, 7, 15, 6, 14, 7, 13);

INSERT INTO SAMPLE2_SDOGEOM (SDO_GID, SDO_ESEQ, SDO_ETYPE, SDO_SEQ, 
                              SDO_X1, SDO_Y1, SDO_X2, SDO_Y2, SDO_X3,
                              SDO_Y3, SDO_X4, SDO_Y4, SDO_X5, SDO_Y5)
  VALUES (18, 0, 3, 3, 7, 13, 6, 12, 5, 13, 4, 13, 3, 14);

INSERT INTO SAMPLE2_SDOGEOM (SDO_GID, SDO_ESEQ, SDO_ETYPE, SDO_SEQ, 
                              SDO_X1, SDO_Y1, SDO_X2, SDO_Y2, SDO_X3, SDO_
Y3)
  VALUES (18, 0, 3, 4, 3, 14, 2, 14, 1, 15);

11.3 Index Creation

Once data has been loaded into the spatial tables through either bulk or transactional loading, a spatial index needs to be created on the tables for efficient access to the data.

Create an Oracle table called <layername>_SDOINDEX as follows:

SQL>  create table <layername>_SDOINDEX 
   2  (
   3    SDO_GID number, 
   4    SDO_CODE raw(255)
   5    );

For a bulk load, you can call the SDO_ADMIN.POPULATE_INDEX() procedure once to tessellate the geometry table and add the generated tiles to the spatial index table. The argument to this procedure is simply the name of the layer. The level to which the geometry should be tessellated and whether to use the fixed or the hybrid indexing technique is determined by values in the <layername>_SDOLAYER table.

If data is updated in or deleted from a specific geometry table, you can call SDO_ADMIN.UPDATE_INDEX() to update the index for one SDO_GID. The arguments to this procedure are the name of the layer and the SDO_GID of the designated geometry.

See Chapter 13, "Administrative Functions and Procedures" for a complete description of the SDO_ADMIN.POPULATE_INDEX() and SDO_ADMIN.UPDATE_INDEX() procedures.

11.3.1 Choosing a Tessellation Algorithm

Spatial provides two methods for spatial indexing, fixed and hybrid. Fixed indexing is recommended for the relational Spatial model.

Which tessellation algorithm is used by the SDO_ADMIN.POPULATE_INDEX() and SDO_ADMIN.UPDATE_INDEX() procedures is determined by the values of the SDO_LEVEL and SDO_NUMTILES columns in the <layername>_SDOLAYER table as shown inTable 11-5.

Table 11-5 Choosing a Tessellation Algorithm
SDO_LEVEL  SDO_NUMTILES  Action 

NULL  

NULL  

Error.  

>= 1  

NULL  

Fixed indexing with fixed-size tiles (recommended).  

>= 1  

>= 1  

Hybrid indexing with fixed-size and variable-sized tiles. The SDO_LEVEL column defines the fixed tile size. The SDO_NUMTILES column defines the number of tiles to generate per geometry.  

NULL  

>= 1  

Not supported.  

11.3.2 Spatial Indexing with Fixed-Size Tiles

Oracle recommends using fixed-size cover tiles for indexing a geometry stored using the relational model.

The fixed-size tile algorithm is expressed as a level referring to the number of tessellations performed. To use fixed-size tile indexing, set the SDO_NUMTILES column in the <layername>_SDOLAYER table to NULL and the SDO_LEVEL column to the desired tiling level. The relationship between the tiling level and the resulting size of the tiles is dependent on the domain of the layer.

The domain used for indexing is defined by the upper and lower boundaries of each dimension stored in the <layername>_SDODIM table. A typical domain in a GIS application could be -90 to 90 degrees for latitude, and -180 to 180 degrees for longitude2, as represented in Figure 11-1.

Figure 11-1 Sample GIS Domain

If the SDO_LEVEL column is set to 1, then the tiles created by the indexing mechanism are the same size as tiles at the first level of tessellation. Each tile would be 180 degrees by 90 degrees as shown in Figure 11-2.

Figure 11-2 Fixed-Size Tiling at Level 1

The formula for the number of fixed-size tiles is 4n where n is the number of tessellations, stored in the SDO_LEVEL column. Figure 11-3 shows fixed-size tiling at level 2. In this figure, each tile is 90 degrees by 45 degrees.

Figure 11-3 Fixed-Size Tiling at Level 2

The size of a tile can be determined by applying the following formula to each dimension:

length = (upper_bound - lower_bound) / 2 ^ sdo_level

The length refers to the length of the tile along the specified dimension. Applying this formula to the tiling shown in Figure 11-3 yields the following sizes:

length for dimension X = (180 - (-180) ) / 2^2
                       = (360)  / 4
                       = 90
length for dimension Y = (90 - (-90) ) / 2^2
                       = (180) / 4
                       = 45

Thus, at level 2 the tiles are 90x45 degrees in size. As the number of levels increases, the tiles become smaller and smaller. Smaller tiles provide a more precise fit of the tiles over the geometry being indexed. However, because the number of tiles generated is unbounded, you must take into account the performance implications of using higher levels. The SDO_TUNE.ESTIMATE_TILING_LEVEL() function can be used to determine an appropriate level for indexing with fixed-size tiles. See Chapter 14 for a description of this procedure.

Besides the performance aspects related to selecting a fixed-size tile, tessellating the geometry into fixed-size tiles might have benefits related to the type of data being stored, such as using tiles sized to represent 1-acre farm plots, city blocks, or individual pixels on a display. Data modeling is an important part any database design, and is essential in a spatial database where the data often represents actual physical locations.

In the following example, assume that data has been loaded into a layer called ROADS, and you want to create a spatial index on that data. This is accomplished by first creating a table ROADS_SDOINDEX and invoking the following procedure:

sdo_admin.populate_index('ROADS');

The value in the SDO_LEVEL column of the ROADS_SDOLAYER table can be used as a tuning parameter while tessellating objects. Increasing the level increases the number of tiles to provide a more precise fit of the tiles over the object. See the description of the ESTIMATE_TILING_LEVEL()function in Chapter 14 for information on estimating the tiling level in several different ways.

After the SDO_ADMIN.POPULATE_INDEX() procedure has been called to fill the spatial index, you should also create a concatenated index using the SDO_CODE and SDO_GID columns. The concatenated index helps the join to the <layername>_SDOGEOM table during a query. The SDO_GID values from the primary filter will come from the index instead of from the table.

If a geometry with an SDO_GID 5944 has been added to the spatial tables, update the index with the following procedure:

sdo_admin.update_index('ROADS', 5944);

Like the CREATE INDEX statement in SQL, the SDO_ADMIN.POPULATE_INDEX() procedure performs an implicit commit. The SDO_ADMIN.UPDATE_INDEX() procedure, however, does not. Therefore, SDO_ADMIN.UPDATE_INDEX() transactions can be rolled back.

The SDO_ADMIN.POPULATE_INDEX() procedure operates as a single transaction. To reduce the amount of rollback space required to execute this procedure, you can write a routine that loops and calls SDO_ADMIN.UPDATE_INDEX(). See Section A.3.1, "cr_spatial_index.sql Script" for more information.

11.3.3 Hybrid Spatial Indexing with Fixed-Size and Variable-Sized Tiles

This section describes a variation on the linear quadtree (Morton encoding) scheme that uses both fixed-size and variable-sized tiles as a spatial indexing mechanism. the terms hybrid indexing, hybrid tiling, and hybrid tessellation will be used interchangeably in this section. Spatial indexing with purely variable-sized tiles is not recommended for production systems and is not supported in this release.

To use hybrid tiling, the SDO_LEVEL and SDO_NUMTILES columns in the <layername>_SDOLAYER table must contain valid values. That is, both SDO_LEVEL and SDO_NUMTILES must be greater than one.

The SDO_NUMTILES column determines the number of tiles that will be used to cover a geometry being indexed. Typically, this value is small, such as 4 or 8 tiles. However, the larger the number of tiles, the better the tiles will fit the geometry being covered. This increases the selectivity of the primary filter, but also increases the number of index entries per geometry (See Section 12.3.2 and Section 12.3.3 for a discussion of primary and secondary filters.) The SDO_NUMTILES value should be larger for long linear spatial entities, such as major highways or rivers, than for area-based spatial entities such as county or state boundaries.

The SDO_LEVEL column determines the size of the fixed tiles used in hybrid indexing. Setting the proper SDO_LEVEL value may appear more like art than science. Performing some simple data analysis and testing, however, puts the process back in the realm of science. One approach would be use the SDO_TUNE.ESTIMATE_TILING_LEVEL() function to determine an appropriate starting SDO_LEVEL value, and then compare the performance with slightly higher or lower values. This, and other techniques, are described in Appendix A.

Assume that the ROADS layer has already been loaded. Furthermore, assume that the there is one row with valid values for the ROADS_SDOLAYER.SDO_LEVEL and ROADS_SDOLAYER.SDO_NUMTILES columns. To create the spatial index on ROADS, first create a table ROADS_SDOINDEX with appropriate columns:

SQL>  create table <layername>_SDOINDEX 
   2  (
   3    SDO_GID number, 
   4    SDO_CODE raw(255),
   5    SDO_GROUPCODE raw(255),
   6    SDO_MAXCODE raw(20),
   7    SDO_META raw(255),
   8    );

Then, invoke SDO_ADMIN.POPULATE_INDEX('ROADS') to build the spatial index.

After the SDO_ADMIN.POPULATE_INDEX() procedure has been called to fill the spatial index, you should also create a concatenated index on the SDO_CODE and SDO_GID columns. The concatenated index helps the join to the
<layername>_SDOGEOM table during a query. The SDO_GID values from the primary filter will come from the index instead of from the table.

If a geometry with an SDO_GID 5944 has been added to the spatial tables, update the index with the following procedure:

sdo_admin.update_index('ROADS', 5944);

Like the CREATE INDEX statement in SQL, the SDO_ADMIN.POPULATE_INDEX() procedure performs an implicit commit. The SDO_ADMIN.UPDATE_INDEX() procedure, however, does not. Therefore, SDO_ADMIN.UPDATE_INDEX() transactions can be rolled back.

The SDO_ADMIN.POPULATE_INDEX() procedure operates as a single transaction. To reduce the amount of rollback space required to execute this procedure, you can write a routine that loops and calls SDO_ADMIN.UPDATE_INDEX(). See Section A.3.1, "cr_spatial_index.sql Script" for more information.


1 See the Oracle8i Utilities User's Guide for information on SQL*Loader.
2 The transference of the domain onto a sphere or Mercator projection is left to GIS (or other) application programmers. Spatial treats the domain as a conventional X by Y rectangle.



Prev

Next
Oracle
Copyright © 1999 Oracle Corporation.

All Rights Reserved.

Library

Product

Contents

Index