1. Introduction.
At the SAGIS conference on Geographic Information Systems held in
Pietermaritzburg in July 1989, it was made clear that the majority of
time, effort and expense in Geographic Information Systems (GIS) goes
into the data capture phase. Remotely sensed data provides, relatively
cheaply and rapidly, an already digitised form of landcover
information. However, remotely sensed data is usually in raster (grid)
form, whereas most GISs work with vector polygon data. Thus remotely
sensed data must be converted to GIS format before it can be used.
Figure 1 displays a very broad overview of the
production cycle of an image. First, the scene is captured by a
satellite (e.g. LANDSAT 5) and converted into a grid of numbers
called an image. The image is then classified into classes of
interest to the user. The classified image is in raster form and must
be vectorized by finding a polygon that surrounds each region. These
strings of vectors are then smoothed and exported to a GIS.
A GIS, such as ARC/INFO, provides a raster to vector conversion
routine. So why write another?
There are several motivations :-
- The image processing site is a service organization whose clients
use GISs. Thus to fulfil our role we must provide data suitable for
immediate consumption, not data requiring many hours of preprocessing
to render it digestible to the GIS.
- Raster classifications can require very large amounts of storage
space, especially if stored in a form easily transportable between
different systems.
- The algorithm developed here is a local one and
thus suitable for processing very large raster images, of the order of
8000 by 7000 pixels.
- The output of most raster to vector conversion routines is a set
of vectors that follows the edges of the pixels, resulting in jagged
lines. Consider the process of rasterizing a large vector triangle, as
shown in Figure 2, and then vectorizing the resulting raster image
(Figure 3f). The end result may look quite like a triangle but may
contain hundreds of tiny vectors instead of the original three.
Figure 2. Rasterizing a triangle.
- The original scene.
- Original scene with grid overlay.
- Only those grid cells with 50% of their area within the triangle are kept.
- The result.
- Smoothing or "thinning" algorithms, which are standard on most
GIS packages, can be used to remove the "jaggies", but suffer from the
following problems :-
- Either they are not exact. A raster to vector conversion
routine is exact if and only if rasterizing the
resulting vector image, using the same grid, reproduces exactly the
same raster image.
- Or they are not optimal. A raster to vector conversion routine
is optimal if it produces an exact vector
representation containing no more vectors than any other exact
vector representation.
Figure 3
- Raster triangle.
- There are 12 unsmoothed vectors.
- Smoothing.
- Result is NOT the same as original. Right and bottom sides
now match raster
- The vector end points are often constrained to lie only on pixel
corners.
- The smoothing algorithm is general and is divorced from the
vectorizing algorithm, so it cannot make use of the considerable
information that can be obtained from the known structure of the
output of vectorizing procedures.
Figure 4 Using thinning to smooth a triangle.
- Raster scene.
- Outline.
- 2nd approximation not exact.
- 3rd approx. but more vectors.
- Exact Approximation.
- Uses 6 vectors instead of 3
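The definition of exactness given above can be checked mechanically. The following is a minimal Python sketch, not the routine developed in this paper. It assumes a single polygon on a binary raster, approximates the area covered in each cell by supersampling, and marks a cell when more than half of its sample points fall inside the polygon. The names point_in_polygon, rasterize and is_exact, and the samples parameter, are hypothetical.

```python
def point_in_polygon(x, y, poly):
    """Even-odd ray casting test; poly is a list of (x, y) vertices."""
    inside = False
    n = len(poly)
    for i in range(n):
        x1, y1 = poly[i]
        x2, y2 = poly[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize(poly, width, height, samples=8):
    """Assumed rule: mark a cell when more than half of its
    samples x samples sub-points lie inside the polygon."""
    grid = []
    for row in range(height):
        cells = []
        for col in range(width):
            hits = 0
            for i in range(samples):
                for j in range(samples):
                    px = col + (i + 0.5) / samples
                    py = row + (j + 0.5) / samples
                    if point_in_polygon(px, py, poly):
                        hits += 1
            cells.append(1 if 2 * hits > samples * samples else 0)
        grid.append(cells)
    return grid


def is_exact(poly, raster, samples=8):
    """A vector outline is exact if rasterizing it on the same grid
    reproduces the original raster."""
    height, width = len(raster), len(raster[0])
    return rasterize(poly, width, height, samples) == raster


# A 4 by 4 block of pixels in an 8 by 8 grid, and two candidate outlines.
raster = rasterize([(0, 0), (4, 0), (4, 4), (0, 4)], 8, 8)
slightly_larger = [(0, 0), (4.2, 0), (4.2, 4.2), (0, 4.2)]
much_larger = [(0, 0), (4.6, 0), (4.6, 4.6), (0, 4.6)]
print(is_exact(slightly_larger, raster), is_exact(much_larger, raster))
```

Under these assumptions the slightly larger square still rasterizes to the same block of pixels, which illustrates why an exact outline need not have its end points on pixel corners.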
Two assumptions need to be made before continuing :-
- The features that make up our classes partition the earth into
polygonal regions. This polygonal earth model,
although it is only a poor approximation, is so universally used
(without mention) in mapping that I will unabashedly adopt it
henceforth.
- The rasterization / classification process works in the
Westminster style, i.e. each pixel is assigned the
value of the class on the ground that occupies more of the
pixel area than any other single class.
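The second assumption is a plurality rule rather than a majority rule: the winning class need not cover half the pixel. A minimal Python sketch follows, assuming the area fractions of each class within a pixel are already known. The function name westminster_class and the example class names are hypothetical, and ties are broken arbitrarily in favour of the first class listed.

```python
def westminster_class(coverage):
    """Return the class occupying more of the pixel than any other.

    coverage maps a class name to the fraction of the pixel area it covers.
    Ties go to the first class encountered (an assumption of this sketch).
    """
    return max(coverage, key=coverage.get)


# Forest wins the pixel although it covers less than half of it.
pixel = {"forest": 0.40, "water": 0.35, "urban": 0.25}
print(westminster_class(pixel))
```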
The raster to vector conversion process can be divided into three
stages. The first, called the Spaghetti
Machine, produces an outline of the pixels of each
region. The second parses these outlines into line segments, and the
third smooths these in an exact manner. But first, a chapter on the
programming environment.