Introduction: Raster to Vector conversion in a Local, Exact and Near optimal manner


1. Introduction.

At the SAGIS conference on Geographic Information Systems held in Pietermaritzburg in July 1989, it was made clear that the majority of the time, effort and expense in Geographic Information Systems (GIS) goes into the data capture phase. Remotely sensed data provides, relatively cheaply and rapidly, an already digitised form of landcover information. However, remotely sensed data is usually in raster (grid) form, whereas most GISs work with vector polygon data. Thus remotely sensed data must be converted to GIS format before it can be used.

Figure 1 displays a very broad overview of the production cycle of an image. Firstly the scene is captured by one of the satellites (e.g. LANDSAT 5) and converted into a grid of numbers called an image. The image is then classified into classes of interest to the user. The classified image is in raster form and must be vectorized by finding a polygon that surrounds each region. These strings of vectors are then smoothed and exported to a GIS.

A GIS, such as ARC/INFO, provides a raster to vector conversion routine. So why write another? There are several motivations :-

- The image processing site is a service organization providing a service to clients using GISs. Thus to fulfil our role we must provide data suitable for immediate consumption, not data requiring many hours of preprocessing to render it digestible to the GIS.
- Raster classifications can require very large amounts of storage space, especially if stored in a form easily transportable between different systems.
- The algorithm developed here is a local one and thus suitable for processing very large raster images, of the order of 8000 by 7000 pixels.
- The output of most raster to vector conversion routines is a set of vectors that outlines each pixel, resulting in jagged lines. Consider the process of rasterizing a large vector triangle, as shown in Figure 2, and then vectorizing the resulting raster image (Figure 3f). The end result may look quite like a triangle but may contain hundreds of tiny vectors instead of the original three.

Figure 2. Rasterizing a triangle.
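The jaggedness described above can be illustrated with a minimal sketch. It assumes a simple pixel-centre sampling rule for rasterization (a stand-in chosen for brevity, not the area-based rule adopted later in this chapter), and the function names are hypothetical. It rasterizes a triangle and counts the unit pixel edges that a naive pixel-outlining vectorizer would have to trace.

```python
def point_in_triangle(p, a, b, c):
    """True if point p lies inside or on triangle abc (cross-product sign test)."""
    def cross(o, u, v):
        return (u[0] - o[0]) * (v[1] - o[1]) - (u[1] - o[1]) * (v[0] - o[0])

    d1, d2, d3 = cross(a, b, p), cross(b, c, p), cross(c, a, p)
    has_neg = d1 < 0 or d2 < 0 or d3 < 0
    has_pos = d1 > 0 or d2 > 0 or d3 > 0
    return not (has_neg and has_pos)


def rasterize_triangle(a, b, c, width, height):
    """Assumed rule: a pixel is set when its centre falls in the triangle."""
    return [
        [1 if point_in_triangle((x + 0.5, y + 0.5), a, b, c) else 0
         for x in range(width)]
        for y in range(height)
    ]


def count_boundary_edges(grid):
    """Count unit pixel edges separating a set pixel from an unset one
    (or from the outside of the grid)."""
    h, w = len(grid), len(grid[0])

    def cell(x, y):
        return grid[y][x] if 0 <= x < w and 0 <= y < h else 0

    edges = 0
    for y in range(h):
        for x in range(w):
            if grid[y][x]:
                for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    if not cell(x + dx, y + dy):
                        edges += 1
    return edges


# A three-vector triangle becomes a staircase with many unit edges.
grid = rasterize_triangle((0, 0), (20, 0), (0, 20), 20, 20)
print(count_boundary_edges(grid))
```

Even after collinear unit edges along the two straight sides are merged, every step of the staircase along the hypotenuse still contributes two vectors, so the outline remains far longer than the original three vectors.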
Smoothing or "thinning" algorithms, which are standard on most GIS packages, can be used to remove the "jaggies", but suffer from the following problems :-
1. Either they are not exact. A raster to vector conversion routine is exact if and only if rasterizing the resulting vector image, using the same grid, results in exactly the same raster image.
2. Or they are not optimal. A raster to vector conversion routine is optimal if it results in an exact vector representation containing no more vectors than any other exact vector representation.
  Figure 3. Panels: raster triangle; the 12 unsmoothed vectors; smoothing; result, which is NOT the same as the original (the right and bottom sides now match the raster).
3. The vector end points are often constrained to lie only on pixel corners.
4. The smoothing algorithm is general and is divorced from the vectorizing algorithm. It therefore cannot make use of the considerable information obtainable from contemplating the nature of the output of vectorizing procedures.
Figure 4. Using thinning to smooth a triangle.
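The definitions of "exact" and "optimal" given above can be made concrete with a small sketch. It assumes pixel-centre sampling with an even-odd point-in-polygon test as the rasterization rule, which is only a stand-in for whatever rule produced the raster, and the names `rasterize` and `is_exact` are hypothetical. A vector polygon is exact for a raster if rasterizing it on the same grid reproduces that raster.

```python
def point_in_polygon(px, py, polygon):
    """Even-odd ray casting test for a point against a closed polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through py can cross.
        if (y1 > py) != (y2 > py):
            x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if px < x_cross:
                inside = not inside
    return inside


def rasterize(polygon, width, height):
    """Assumed rule: a pixel is set when its centre lies in the polygon."""
    return [
        [1 if point_in_polygon(x + 0.5, y + 0.5, polygon) else 0
         for x in range(width)]
        for y in range(height)
    ]


def is_exact(polygon, raster):
    """Exact: re-rasterizing on the same grid gives the same raster."""
    height, width = len(raster), len(raster[0])
    return rasterize(polygon, width, height) == raster


raster = [
    [1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 0, 0],
    [1, 0, 0, 0, 0, 0],
]
staircase = [(0, 0), (5, 0), (5, 1), (3, 1), (3, 2), (1, 2), (1, 3), (0, 3)]
triangle = [(0, 0), (6, 0), (0, 3)]
over_smoothed = [(0, 0), (5, 0), (0, 3)]

for name, poly in [("staircase", staircase), ("triangle", triangle),
                   ("over_smoothed", over_smoothed)]:
    print(name, len(poly), "vectors, exact:", is_exact(poly, raster))
```

In this toy case the staircase outline and the three-vector triangle are both exact, so the staircase cannot be optimal, while the over-smoothed triangle has few vectors but fails the exactness check.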
Two assumptions need to be made before continuing :-
- The features that make up our classes partition the earth into polygonal regions. This polygonal earth model, although only a poor approximation, is so universally used in mapping (usually without mention) that I will unabashedly adopt it henceforth.
- The rasterization / classification process works in the Westminster style, i.e. each pixel is assigned the class value of the class on the ground which occupies more of the pixel area than any other single class.
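The second assumption amounts to a plurality rule, sketched below. The function name and the dictionary of per-class area fractions are hypothetical inputs chosen for illustration. The text does not say how ties are broken, so this sketch simply keeps whichever tied class appears first.

```python
def classify_pixel(area_by_class):
    """Return the class covering more of the pixel than any other class.

    area_by_class maps a class name to the fraction of the pixel's area
    that the class occupies on the ground (hypothetical input format).
    Tie-breaking is an assumption: the first class listed wins.
    """
    return max(area_by_class, key=area_by_class.get)


# The winner needs only a plurality, not a majority, of the pixel area.
print(classify_pixel({"forest": 0.40, "water": 0.35, "urban": 0.25}))
```

Note that the winning class may cover less than half of the pixel, which is what distinguishes this rule from a strict majority rule.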

The raster to vector conversion process can be divided into three sections, the first of which, called the Spaghetti Machine, produces an outline of the pixels of each region. The second parses these outlines into line segments, and the third smooths these in an exact manner. But first a chapter on the programming environment.
