Chapter IV 
Processing large hyperspectral data sets for urban area mapping

Chapter IV:1 Introduction


The traditional workflow for airborne hyperspectral data includes the sequential processing steps of (1) system correction by the provider, (2) geometrical rectification, (3) radiometric correction, i.e. the removal of atmospheric influence and normalization of reflectance anisotropy, and (4) derivation of application products (e.g. Richter and Schläpfer, 2002; Schläpfer and Richter, 2002; Schläpfer et al., 2007). The radiometric correction of airborne hyperspectral data has been discussed in Chapter II of this work and a new empirical approach for normalizing reflectance anisotropy in urban imagery was introduced.


In the context of an optimized processing workflow the geometrical rectification, also referred to as geocoding or orthorectification, is of special interest. On the one hand, geocoding is an essential image processing step in remote sensing to provide consistent data sets (Goshtasby, 1988; Toutin, 2004). Geocoded imagery can be combined with other Earth observation data or additional spatial information. This way, results like land cover maps can be assimilated into environmental models (e.g. Wilson et al., 2003; Nichol and Wong, 2005). Moreover, results from image analysis itself can be improved by using additional geocoded data during processing. This might include data from different sensors (e.g. Waske and Benediktsson, 2007) or census data (e.g. Lu and Weng, 2006). On the other hand, geocoding has two negative side effects. First, not all pixels of the image with map coordinates can be directly mapped from the projected original data. In order to generate spatially continuous data sets, spectral information has to be resampled from adjacent pixels in the original data. Second, the physical size of an image file might be significantly increased, especially in the case of airborne line scanner data: whenever the flight direction is not parallel to one of the axes of the map coordinate system, the image is rotated in the output grid and a high number of no-data pixels exist. In addition, it is useful to increase the resolution of the output grid to preserve information during the rotation of the image (Schläpfer and Richter, 2002).
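The effect of the flight direction on the number of no-data pixels can be illustrated with a minimal sketch. It assumes an idealized, perfectly rectangular flight strip and an output grid that is the tight, axis-parallel bounding box of that strip; the function name `no_data_fraction` and the example dimensions are hypothetical. A real flight line is distorted by aircraft motion, so the value for an actual scene will generally differ from this idealized estimate.

```python
import math

def no_data_fraction(length_m, width_m, heading_deg):
    """Share of an axis-parallel bounding box NOT covered by a rectangular strip.

    heading_deg is the flight direction, measured clockwise from north.
    The strip is idealized as a perfect rectangle (an assumption).
    """
    theta = math.radians(heading_deg)
    # Extent of the rotated rectangle along the easting and northing axes.
    box_e = length_m * abs(math.sin(theta)) + width_m * abs(math.cos(theta))
    box_n = length_m * abs(math.cos(theta)) + width_m * abs(math.sin(theta))
    return 1.0 - (length_m * width_m) / (box_e * box_n)

# A strip flown exactly along a map axis fills its bounding box completely,
# while a strip flown obliquely leaves most of the box empty.
parallel = no_data_fraction(30000.0, 2000.0, 90.0)
oblique = no_data_fraction(30000.0, 2000.0, 256.0)
```

The long, narrow shape of a flight line is what makes the effect so strong: even a moderate deviation from the map axes leaves the larger part of the bounding box without data.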

Airborne hyperspectral data is characterized by high spectral information content at relatively high spatial resolution and, hence, large file sizes. Thus, the processing workflow should be optimized with respect to radiometric accuracy and processing times. In this context, Schläpfer et al. (2007) propose shifting the spatial resampling to the very end of the workflow and working in raw scan geometry. Performing the atmospheric correction before geocoding to optimize radiometric consistency has also been suggested by Hill and Mehl (2003). Such alternative workflows appear especially useful to optimize the processing of very large hyperspectral data sets.

In Chapter III of this work a 7277 by 512 pixel image from the Hyperspectral Mapper (HyMap) acquired over Berlin, Germany, is classified using support vector machines (SVM) without previous geocoding. This way, no spectral resampling is performed before classification and, more importantly, memory allocation problems during the classification process are avoided. Results will be used to map impervious surface coverage and therefore need to be geocoded.


In addition to the regular pixel-based approach, the SVM classification in Chapter III is applied to segmented image data. For this purpose, results from image segmentation have been stored in two separate files following the method suggested by Schiefer et al. (2005). Overall accuracy of the segment-based classification before geocoding is generally lower than that of the pixel-based approach, e.g. 83.2% at an average segment size of 13.1 pixels compared to 88.7% at pixel level based on 1253 reference pixels (see Chapter III). However, this accuracy needs to be considered against the background of reducing the file size by a factor of 13.1 compared to the image without geocoding. Processing time of the SVM classification decreases linearly with file size. In comparison to the traditional approach, this factor of 13.1 can be multiplied by the efficiency factor achieved when performing the geocoding towards the end of the processing workflow.
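The idea of storing segmentation results in two separate files can be sketched as follows. This is not the file format of Schiefer et al. (2005); it only assumes that the spatial information is kept as a map of segment indices and the spectral information as one representative spectrum per segment. The names `classify_segments`, `labels_to_map` and the threshold rule standing in for the SVM are hypothetical.

```python
def classify_segments(segment_spectra, classify):
    """Classify one spectrum per segment instead of one spectrum per pixel."""
    return {seg_id: classify(spectrum) for seg_id, spectrum in segment_spectra.items()}

def labels_to_map(index_map, segment_labels):
    """Write the segment labels back into image geometry via the index map."""
    return [[segment_labels[seg_id] for seg_id in row] for row in index_map]

# Spatial part: which segment each pixel belongs to (raw scan geometry).
index_map = [
    [0, 0, 1],
    [0, 2, 1],
]
# Spectral part: one spectrum per segment (assumed here to be a segment mean).
segment_spectra = {0: [0.05, 0.40], 1: [0.20, 0.22], 2: [0.03, 0.02]}

def toy_classifier(spectrum):
    # Stand-in for the trained SVM, for illustration only.
    return "vegetation" if spectrum[1] > 2 * spectrum[0] and spectrum[1] > 0.1 else "other"

labels = classify_segments(segment_spectra, toy_classifier)
label_map = labels_to_map(index_map, labels)
```

In this toy case three spectra are classified instead of six pixels. The classifier only ever sees the small spectral table, which is why the data reduction scales with the average segment size.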

Both the alternative pixel-based and the segment-based approaches are limited to nearest neighbor (NN) resampling during geocoding, since other interpolation methods cannot be applied to the discrete values of land cover classes or segment indices. The traditional workflow, on the other hand, offers a variety of interpolation methods. Some of these lead to better results in terms of geometric representation. This is especially the case for line and block-wise features, which dominate urban areas. Differences in the accuracy of maps derived from the different approaches can hardly be predicted, though. In order to assess the usefulness of the alternative workflows, the influence of the resampling methods needs to be quantified and discussed with regard to the decrease in processing times by data reduction.

In the present chapter the geocoding of the pixel-based result and one segment-based classification result from Chapter III is described (Fig. IV-1). The resulting geocoded maps from these alternative and segment-compressed workflows are then compared to maps from the traditional workflow, where a geocoding with bilinear interpolation is performed before the SVM classification. Differences caused by the two interpolation methods are quantified. In addition, maps are compared to results from a field survey to assess the accuracy in map geometry. In this context the following research questions will be discussed to evaluate a possible decision for one of the approaches:


  1. Are there relevant differences in the land cover classification of interpolated pixels between the traditional and the alternative workflow?
  2. Is the lower classification accuracy of the segment-based classification further decreased by geocoding in the segment-compressed workflow?

Figure IV-1: Three different workflows for mapping land cover from hyperspectral data. During the pixel-based alternative workflow (top) the geocoding constitutes the last processing step and the increase in physical file size is moved to the end of the workflow. In the traditional workflow (bottom) the SVM classification is performed on the large geocoded data set. The segment-compressed workflow (middle) further decreases the amount of data by separating spatial and spectral information and independently performing geocoding and SVM classification.

Chapter IV:2 Material and methods

Chapter IV:2.1 Image data, preprocessing and classification

To answer the research questions of this chapter, the original HyMap data and classification results from Chapter III are geocoded. The HyMap data was acquired over the city of Berlin on 20 June 2005. The original spatial resolution of the 7277 by 512 pixel image was 3.9 by 4.5 m at nadir in across- and along-track direction, respectively. Prior to segmentation and classification the data was corrected for atmospheric effects following the approach by Richter and Schläpfer (2002) and normalized to nadir reflectance following the approach introduced in Chapter II. The proposed alternative and segment-compressed workflows require the relief information for the atmospheric processing to be projected into raw scan geometry (Hill and Mehl, 2003). However, the available software did not allow processing without geocoding. Therefore, the image data was geocoded as described in Section 2.4 for the atmospheric correction and then re-projected into its original dimensions for the segmentation and classification. The few pixels that were not directly mapped during the first forward projection cannot be re-projected. This has no relevance, however, since pixels that were interpolated during re-projection for the image segmentation and classification will again be lost during the second forward projection.


Image segmentation was performed using the approach by Baatz and Schaepe (2000). For this work, segmented data with an average segment size of 13.1 pixels is used. Both the image with segmentation and the image without segmentation were classified into the five land cover classes vegetation, built-up impervious areas, non built-up impervious areas, pervious areas, and water using SVM classification (see Chapter III).

Chapter IV:2.2 Digital elevation model

A digital elevation model (DEM) that was derived from isolines in the official digital map was available. The DEM's original resolution of 25 m and 0.1 m in horizontal and vertical direction, respectively, was bilinearly resampled to 3.5 m spatial resolution in Universal Transverse Mercator Projection. A digital surface model (DSM) with information on building height was not available.

Chapter IV:2.3 Field data

Parallel to the acquisition of the HyMap data, intensive ground mappings were performed. Using 0.25 m aerial photographs as base images, 9 rectangular plots of approximately 220 by 220 m were continuously mapped on two levels. The first level included the present land cover and surface material at ground. Altogether 21 land use related surface categories and more than 40 materials were differentiated. The second level showed the extent of tree canopy above ground. For this work the ground mapping was overlaid with the tree canopy and generalized to the five land cover classes in order to assess the difference between maps that result from the three different workflows.

Chapter IV:2.4 Parametric geocoding


The geocoding of airborne line scanner data differs from that of image data from satellite platforms. For traditional moderate resolution satellite data, an affine transformation based on a first degree polynomial is sufficient to correct most geometric effects, since the platform is expected to be stable during the acquisition of one scene (Goshtasby, 1988). Information from a DEM might be included to correct elevation induced shifts on rough terrain (Itten et al., 1992; Toutin, 2004). In the case of airborne line scanner data, the platform is usually operated at altitudes below 4000 m above ground and thus exposed to turbulence in the lower troposphere. The outer orientation of the platform is not stable and the acquired image does not represent a regular equidistant grid of pixels in across- and along-track direction. Geometric correction based on a polynomial transformation of the entire image would require an enormous number of ground control points (GCP) and is not feasible (McGwire, 1996).

Therefore, a parametric approach is needed to resolve the high frequency distortions in images from airborne line scanners. During acquisition, physical measurements from a differential global positioning system (DGPS) and an inertial navigation system (INS) are recorded. Based on this information and the sensor model, the six classical orientation parameters roll, pitch, yaw, easting, northing, and height for attitude and position of the platform are reconstructed for each scan line. Based on these parameters and a terrain model the recorded pixels are then individually projected onto a grid with the desired map projection and resolution, the so-called mapping array. The approach bears the potential of complete automation and sub-pixel accuracy when the auxiliary data can be provided with high precision and absolutely calibrated in space (Schläpfer and Richter, 2002).
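The projection of a single pixel from the six orientation parameters can be sketched as follows. This is a simplified illustration and not the PARGE implementation: it assumes flat terrain at a constant height, a body frame with x forward, y to the right and z down, a roll-pitch-yaw rotation sequence, and yaw measured clockwise from north. The function name `project_pixel` and all sign conventions are assumptions made for this sketch.

```python
import math

def project_pixel(easting, northing, height, roll, pitch, yaw,
                  scan_angle, terrain_height=0.0):
    """Project one pixel onto flat terrain; all angles in radians.

    scan_angle is the across-track view angle (positive to the right).
    Returns the (easting, northing) of the ground point.
    """
    # View vector in the body frame (x forward, y right, z down).
    x, y, z = 0.0, math.sin(scan_angle), math.cos(scan_angle)
    # Rotate by roll (about x), then pitch (about y), then yaw (about z).
    y, z = (y * math.cos(roll) - z * math.sin(roll),
            y * math.sin(roll) + z * math.cos(roll))
    x, z = (x * math.cos(pitch) + z * math.sin(pitch),
            -x * math.sin(pitch) + z * math.cos(pitch))
    north = x * math.cos(yaw) - y * math.sin(yaw)
    east = x * math.sin(yaw) + y * math.cos(yaw)
    if z <= 0.0:
        raise ValueError("view vector does not point towards the ground")
    # Stretch the view vector until it reaches the terrain plane.
    scale = (height - terrain_height) / z
    return easting + scale * east, northing + scale * north

# Platform 2000 m above ground, heading north, pixel 10 degrees to the right.
e, n = project_pixel(1000.0, 2000.0, 2000.0, 0.0, 0.0, 0.0, math.radians(10.0))
```

Because roll, pitch and yaw change from scan line to scan line, this projection has to be repeated with new parameters for every line, which is what distinguishes the parametric approach from a single polynomial transformation of the whole image. With a real terrain model the intersection with a plane would be replaced by an intersection with the DEM surface.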

According to Schläpfer (2005), critical parameters in the process are:


  1. the synchronization uncertainty: the orientation parameters are independently measured, matched with the oscillation frequency of the line scanner by interpolation and used for all pixels of the corresponding scan line;
  2. offsets in the orientation/position measurements caused by misalignments between the sensor model and the INS or inaccuracies in the DGPS estimates of altitude and true heading of the airplane; and
  3. the quality of the DEM.

The parametric geocoding in this work was performed using the software package for Parametric Geocoding (PARGE), version 2.2 (Schläpfer, 2005). PARGE uses a statistical approach based on a number of GCPs to correct possible misalignment of sensor geometry and the INS. For every GCP, the difference between the real and the estimated GCP position is calculated and iteratively minimized by determining individual offsets for roll, pitch, heading, or the aircraft position. In this work, 35 GCPs were identified in the HyMap image and referenced using 0.25 m color aerial photographs. An additional 15 reference GCPs were selected in the same manner for an accuracy assessment of the geocoding.
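The use of GCP residuals can be illustrated with a minimal sketch. It covers only the simplest case, a constant offset in aircraft position estimated as the mean residual, and the per-axis root mean squared error of independent check points. The iterative estimation of roll, pitch and heading offsets performed by PARGE is not reproduced here, and the function names `mean_offset` and `rmse_per_axis` are hypothetical.

```python
import math

def mean_offset(predicted, surveyed):
    """Constant (easting, northing) shift that best aligns predicted with surveyed GCPs."""
    n = len(predicted)
    d_e = sum(s[0] - p[0] for p, s in zip(predicted, surveyed)) / n
    d_n = sum(s[1] - p[1] for p, s in zip(predicted, surveyed)) / n
    return d_e, d_n

def rmse_per_axis(predicted, surveyed):
    """Root mean squared error in easting and northing, evaluated separately."""
    n = len(predicted)
    rmse_e = math.sqrt(sum((s[0] - p[0]) ** 2 for p, s in zip(predicted, surveyed)) / n)
    rmse_n = math.sqrt(sum((s[1] - p[1]) ** 2 for p, s in zip(predicted, surveyed)) / n)
    return rmse_e, rmse_n

# Toy coordinates in metres (hypothetical values, not the GCPs of this study).
predicted = [(0.0, 0.0), (10.0, 10.0)]
surveyed = [(3.0, 4.0), (13.0, 14.0)]
d_e, d_n = mean_offset(predicted, surveyed)
corrected = [(e + d_e, n + d_n) for e, n in predicted]
```

Keeping the points used for the offset estimation separate from those used for the error measure, as done with the 35 and 15 GCPs in this work, avoids an overly optimistic accuracy estimate.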

In the PARGE approach, the grid coordinates and spatial resolution of the output image are defined by the incorporated DEM, i.e. 3.5 m in this work. By choosing an output resolution 10-20% higher than that of the raw image data, a higher portion of spectral information is preserved (Schläpfer, 2005). If the original pixels were mapped onto the grid with map coordinates at the original resolution, 20-30% of all image information would be lost due to double mapping. Even at the higher output resolution, some image data will still be lost due to aircraft motion.
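The trade-off between double mapping and gaps can be made concrete with a small sketch. It assumes that every raw pixel is assigned to the output cell that contains its projected center and that the first pixel to reach a cell is kept; gaps are counted inside the bounding box of the occupied cells, which is a simplification. The function names `forward_map` and `count_gaps` and the toy positions are hypothetical.

```python
import math

def forward_map(positions, resolution, origin=(0.0, 0.0)):
    """Assign projected pixel centers to grid cells.

    Returns the occupied cells (cell -> index of the raw pixel kept)
    and the number of raw pixels lost because their cell was already taken.
    """
    occupied = {}
    lost = 0
    for idx, (easting, northing) in enumerate(positions):
        cell = (math.floor((easting - origin[0]) / resolution),
                math.floor((northing - origin[1]) / resolution))
        if cell in occupied:
            lost += 1  # double mapping: only one pixel per cell survives
        else:
            occupied[cell] = idx
    return occupied, lost

def count_gaps(occupied):
    """Empty cells inside the bounding box of all occupied cells."""
    cols = [c for c, _ in occupied]
    rows = [r for _, r in occupied]
    n_cells = (max(cols) - min(cols) + 1) * (max(rows) - min(rows) + 1)
    return n_cells - len(occupied)

# Irregularly spaced pixel centers along one line (metres, toy values).
positions = [(0.0, 0.0), (3.0, 0.0), (4.5, 0.0), (9.0, 0.0)]
coarse_cells, coarse_lost = forward_map(positions, 4.0)
fine_cells, fine_lost = forward_map(positions, 2.0)
```

On the coarse grid one of the four pixels is lost and no gap occurs, whereas on the fine grid all pixels survive but one cell remains empty and has to be filled by interpolation.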


At the same time, gaps in the higher resolved mapping array need to be filled by interpolation based on surrounding pixels. Various interpolation methods exist, each of which is a trade-off between spectral and spatial quality of the output. NN interpolation, for example, will lead to an output image which contains only original spectral information while showing unnatural block-wise spatial structures. Such spatial structures appear smoother and hence more realistic when gaps in the output grid are filled by bilinear interpolation. In this case, however, interpolation causes artificial spectral mixtures. The decision between different interpolation methods should be based on the objectives of subsequent data processing (Schläpfer and Richter, 2002; Schläpfer, 2005).
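The difference between the two ways of filling a gap can be shown in one dimension. The sketch below is a simplification of the two-dimensional, triangulation-based interpolation: it assumes a gap located between exactly two directly mapped pixels, with `t` denoting the relative position of the gap between them. The function names and the two-band toy spectra are hypothetical.

```python
def fill_gap_nearest(left, right, t):
    """Copy the spectrum of the closer neighbor (t = 0 at left, t = 1 at right)."""
    return list(left) if t <= 0.5 else list(right)

def fill_gap_linear(left, right, t):
    """Distance-weighted mixture of both neighbors, band by band."""
    return [(1.0 - t) * a + t * b for a, b in zip(left, right)]

# Two directly mapped pixels of different roofing materials (toy reflectances).
material_a = [0.10, 0.40]
material_b = [0.30, 0.20]

nn_value = fill_gap_nearest(material_a, material_b, 0.25)
mixed_value = fill_gap_linear(material_a, material_b, 0.25)
```

The nearest neighbor result is an unchanged original spectrum, while the linear result is a spectrum that was never measured. The same reasoning explains why class labels or segment indices can only be resampled by nearest neighbor: a weighted mean of two labels has no meaning.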

The mapping array was derived by first geocoding the original HyMap data. This mapping array was then used for the geocoding of data in the alternative and segment-compressed workflows with NN resampling of gaps based on the triangulation of surrounding directly mapped pixels. For the traditional workflow, the same mapping array was used, but gaps were filled by bilinear interpolation. Bilinearly interpolated gaps can be considered a "good compromise" between spatial and spectral quality (Schläpfer, 2005). The classification in the traditional workflow was performed using the SVM classifier trained for the pixel-based classification in the alternative workflow. Thus, pixels that are directly mapped during geocoding will be identical for the traditional and pixel-based alternative workflow. However, the classification of interpolated gaps will differ whenever NN and bilinear resampling lead to significant differences in the spectral values of interpolated pixels.

Chapter IV:3 Results and discussion

Chapter IV:3.1 Accuracy assessment of the mapping array

The position of 15 reference GCPs after geocoding was used to assess the quality of the parametric geocoding alone. This assessment yields a root mean squared error of 2.9 m and 3.1 m for easting and northing, respectively. Hence, sub-pixel accuracy is achieved. However, GCPs on bridges or in areas that have recently undergone construction show offsets of up to 5.5 m. The quality of the DEM needs to be questioned for such areas. More importantly, GCPs were not selected from areas covered by buildings. Without the vertical information on buildings from a DSM, displaced roof tops cannot be corrected. To reconstruct surfaces occluded by displaced buildings, information from at least one additional image is needed in addition to the DSM (Zhou et al., 2005). The impact of this phenomenon is directly driven by building height and view angle. A building of 20 m height, for example, exhibits about 3.5 m offset at 10° off-nadir and 11.5 m offset at 30° (Schläpfer, 2005). Thus the final map accuracy of built-up areas will decrease with nadir distance. To quantify the impact of displaced buildings, accurate and reliable information on buildings in the image area is required, e.g. from a cadastre.
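The quoted offsets follow from simple viewing geometry: a roof at height h seen at an off-nadir angle θ is displaced horizontally by h · tan(θ). The sketch below assumes flat ground and a vertical building wall; the function name `roof_displacement` is hypothetical.

```python
import math

def roof_displacement(building_height_m, off_nadir_deg):
    """Horizontal offset of a roof top that is projected onto ground level."""
    return building_height_m * math.tan(math.radians(off_nadir_deg))

offset_10 = roof_displacement(20.0, 10.0)
offset_30 = roof_displacement(20.0, 30.0)
```

For the 20 m building the two values round to 3.5 m and 11.5 m, i.e. about one and about three output pixels of 3.5 m, respectively.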

Chapter IV:3.2 Geocoding of HyMap images


The number of pixels of the HyMap image is increased by a factor of 5.4 during geocoding (Table IV-1). This increase is caused by the flight direction of 256° and the higher spatial resolution (Fig. IV-2). When the flight line is displayed in a rectangular grid, 70.6% of all pixels are no-data pixels outside the covered area. Regardless of their missing information content, the no-data pixels increase the physical file size. Processes like principal component transformation are critical on this physically large data set; the image segmentation algorithm applied in Chapter III is not feasible.

An analysis of the mapping array shows that only 0.4% of the original pixels are lost during geocoding and almost all information is preserved by the higher spatial resolution. As a consequence, however, 37.1% of all pixels in the flight line are gaps in the mapping array and need to be resampled. This illustrates the trade-off between preserving information and creating additional sources of inaccuracy during the geocoding process. An analysis of the influence of different resampling methods is therefore important.
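The reported shares can be cross-checked with a short back-of-the-envelope calculation. The sketch assumes that every directly mapped output pixel holds exactly one raw pixel and uses the rounded figures from the text (factor 5.4, 70.6% no-data pixels, 37.1% gaps), so it can only reproduce the reported values up to rounding. The function name `geocoding_budget` is hypothetical.

```python
def geocoding_budget(raw_pixels, growth_factor, no_data_share, gap_share):
    """Split the geocoded image into no-data, resampled and directly mapped pixels."""
    total = raw_pixels * growth_factor
    in_flight_line = total * (1.0 - no_data_share)
    resampled = in_flight_line * gap_share
    directly_mapped = in_flight_line - resampled
    return {
        "total": total,
        "in_flight_line": in_flight_line,
        "resampled": resampled,
        "directly_mapped": directly_mapped,
        # Share of raw pixels that found a place in the output grid.
        "preserved_share": directly_mapped / raw_pixels,
    }

raw_pixels = 7277 * 512
budget = geocoding_budget(raw_pixels, 5.4, 0.706, 0.371)
```

With these rounded inputs the preserved share comes out close to 100%, which is compatible with the reported loss of 0.4% given the rounding of the growth factor.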

Table IV-1: Comparison of spatial properties and physical file size of HyMap image before and after geocoding. The physical file size relates to 114 spectral bands in 16 bit.

                                | Raw image | Geocoded image
No. of samples/lines            |           |
Overall no. of pixels           |           |
No. of pixels in flight line    |           |
No. of directly mapped pixels   |           |
No. of resampled pixels         |           |
Pixel size [m]                  | 3.9 x 4.6 | 3.5 x 3.5
Physical file size [megabyte]   |           |

Figure IV-2: HyMap image from Berlin after geocoding (R = 829 nm; G = 1648 nm; B = 662 nm). Flight direction was 256° from East to West. Black pixels indicate no-data values outside the flight line.

Typical differences between interpolated pixels from the two approaches can be identified in the image data (Fig. IV-3). Along the edges of objects like buildings or streets, the images with bilinearly interpolated gaps appear smoother than those where gaps were filled by nearest neighbor resampling. Considering the high number of resampled pixels, however, the differences appear marginal. The actual impact of the spectral differences is best investigated based on results from the land cover classification.

Figure IV-3: Subsets from HyMap data after geocoding with bilinear (top) and NN interpolation (middle) (R = 829 nm; G = 1648 nm; B = 662 nm). The mapping array (bottom) is shown for comparison. White pixels were resampled; black areas indicate directly mapped pixels.

Chapter IV:3.3 Accuracy of geocoded land cover maps


As expected, the pixel-based map with NN resampling differs from that derived from the image data after bilinear interpolation. 5.2% of all pixels in the flight line are not assigned to the same land cover class in the two pixel-based workflows. All directly mapped pixels are spectrally identical and assigned to the same class. Thus, all ambiguously classified pixels are resampled pixels. They account for 14.1% of all resampled pixels, and further assessment of the influence of the different workflows is required.
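The two percentages are linked through the share of resampled pixels: 5.2% of the flight line divided by the 37.1% of resampled pixels gives roughly 14%. A minimal sketch of such a map comparison is given below. It assumes both maps are flattened to lists of class labels with `None` marking no-data pixels, plus a mask that flags resampled pixels; the function name `disagreement_shares` and the toy maps are hypothetical.

```python
def disagreement_shares(map_a, map_b, resampled_mask):
    """Share of differing labels among all valid pixels and among resampled pixels."""
    in_line = n_resampled = differing = differing_resampled = 0
    for a, b, is_resampled in zip(map_a, map_b, resampled_mask):
        if a is None or b is None:
            continue  # no-data pixel outside the flight line
        in_line += 1
        if is_resampled:
            n_resampled += 1
        if a != b:
            differing += 1
            if is_resampled:
                differing_resampled += 1
    return differing / in_line, differing_resampled / n_resampled

# Toy example with ten pixels, one of them outside the flight line.
map_nn = [1, 1, 2, 3, None, 2, 2, 1, 4, 4]
map_bilinear = [1, 2, 2, 3, None, 2, 1, 1, 4, 4]
mask = [False, True, False, True, False, False, True, False, True, False]
share_all, share_resampled = disagreement_shares(map_nn, map_bilinear, mask)
```

Relating the differences to the resampled pixels only is the more informative measure here, since directly mapped pixels cannot differ between the two pixel-based workflows.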

A statistical evaluation of the accumulated distribution of interpolated pixels by classes does not suggest class specific trends (Fig. IV-4). A comparison of subsets from the geocoded land cover maps reveals some differences between the two pixel-based approaches, although most of the results are very similar (Fig. IV-5). Straight edges appear more fringed in the NN resampled map. Especially interesting is the misclassification of pixels from the roof of a shopping center (Fig. IV-5, third column). The bilinear interpolation of two different roofing materials (compare Fig. IV-3) leads to a line of pixels with mixed spectral information. The mixture of the two materials is assigned to the class pervious in either approach. Thus, a high number of erroneously classified pixels exist along the edge between the two materials.

Figure IV-4: Number of interpolated pixels per land cover class for workflows with nearest neighbor resampling and bilinear interpolation of gaps in the mapping array.


Figure IV-5: Subsets from the geocoded land cover maps from the traditional workflow (top), the alternative workflow (middle) and the segment-compressed workflow (bottom). The impact of image segmentation on land cover classification at different levels of aggregation before geocoding is discussed in Chapter III.

To quantify the differences between the two pixel-based maps, they are compared to results from the field survey. The land cover polygons of the field survey are overlaid with the maps in raster format. The overall accuracy and the producer's accuracy, i.e. percentage of area that was correctly classified within the polygons of each class, are evaluated (Fig. IV-6). Reference pixels used for the accuracy assessment before geocoding in Chapter III cannot be used in this context, since they relate to pixels in raw scan geometry which are directly mapped and not interpolated.
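Both accuracy measures can be derived from a confusion matrix. The sketch below assumes a matrix whose rows are the reference classes from the field survey and whose columns are the mapped classes, with entries given as pixel counts or area units; the function names and the toy matrix are hypothetical.

```python
def overall_accuracy(confusion):
    """Correctly mapped amount divided by the total reference amount."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total

def producers_accuracy(confusion):
    """Per reference class: correctly mapped amount divided by the class total."""
    return [confusion[i][i] / sum(confusion[i]) for i in range(len(confusion))]

# Toy matrix for two classes: rows = reference, columns = map.
confusion = [
    [8, 2],
    [1, 9],
]
oa = overall_accuracy(confusion)
pa = producers_accuracy(confusion)
```

Because the overall accuracy weights each class by its reference total, it depends on how strongly the individual classes are represented in the surveyed plots, whereas the producer's accuracies are independent of these proportions.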

Figure IV-6: Producer's accuracies for five land cover classes based on reference data from field survey for classification results from three different processing workflows.


The overall accuracies of the maps derived by the three different workflows are 68.5%, 69.5%, and 70.2% for the alternative, segment-compressed and traditional workflow, respectively. For several reasons, these accuracies are below those documented in Chapter III. The intersection of 4 m pixels with polygons in vector format will always cause inaccuracies along the edges of the polygons. Given the high-frequency patterns and small object sizes of urban areas, this influence is relatively high and it is further increased by the general inaccuracy of the geocoding (see Section 3.1). The missing correction of displaced buildings has already been mentioned (see Section 3.1) and is expected to contribute heavily to low map accuracies. In addition, the land use related surface categories of the field survey sometimes do not correspond to the reference pixels used in Chapter III. During the field survey, information was generalized by mapped parcels, whereas the labeling of the reference pixels relates only to the pixel and its direct neighborhood. In the case of derelict sites or of industrial grounds with heaps of sand, for example, this might cause differences. More importantly, the areas from the field survey are not representative of the entire image. The classes impervious and pervious are overrepresented. They account for 50% of the mapped areas instead of the value of below 30% expected for the entire image area. Since these are two of the spectrally most critical classes, a negative bias is introduced.

The accuracies of the two pixel-based approaches show that the map that results from the traditional workflow is always 1% to 2% better. This workflow appears more accurate than the alternative workflow. Apparently, the spectral resampling during geocoding with bilinear interpolation has no negative influence on the results. This can be explained by the high number of spectrally mixed pixels in the training data from the image with 4 m spatial resolution before geocoding. The supervised SVM classifier is thus well suited for mixed signatures. The less accurate representation of lines and block-wise objects by NN resampling, on the other hand, appears to have a negative impact on map accuracies which are based on polygons from the field survey. The first research question addressed the relevance of differences between the traditional and the alternative workflow. With regard to this question, the share of 14.1% ambiguous pixels among all interpolated pixels is small. However, the lower accuracies for all classes but water do not favor the alternative approach. Unfortunately, the general value and the significance of this difference cannot be evaluated based on the selection of surveyed areas. Even if the slightly lower accuracy of the alternative workflow were confirmed by additional reference data, results would need to be considered in the context of an optimized workflow and of the data reduction by a factor of 5.4. The SVM classification in the traditional workflow requires several hours on a very powerful computer. Processes like image segmentation are not feasible on such a data set of ~6 gigabytes.

The issue of workflow optimization is even more important when discussing results from the segment-compressed workflow, wherein the data size during SVM classification is decreased by a factor of ~70. Here, mapping results differ for both directly mapped and resampled pixels. Before geocoding, a 4.5% difference in overall accuracy between pixel- and segment-based results at this aggregation level has been reported (see Chapter III). The accuracy assessment based on the field survey yields an overall accuracy that lies between those of the alternative and traditional workflows. Thus no relevant difference from the pixel-based approaches can be reported. However, an assessment of producer's accuracies for the five classes (Fig. IV-6) shows that results from the segment-compressed workflow are always lowest except for the overrepresented class impervious. In addition, all surveyed areas include many wide open spaces and easily accessible grounds. Such large structures lead to the H-resolution case according to Strahler et al. (1986) for all classes. This is favorable for the spatially generalized segment-based analysis. The second research question asks whether the lower accuracy of the segment-based results achieved in Chapter III is further decreased by geocoding in the segment-compressed workflow. Due to the non-representative selection of survey areas, this question cannot be conclusively answered. However, there are no indications of a further decrease in mapping accuracy.


In addition to the two original research questions, the general accuracy after geocoding has to be discussed at this point. The 18-20% decrease in overall accuracy of the pixel-based results during geocoding is far more striking than the 1-2% difference between the two pixel-based workflows. Despite the mentioned non-representative selection of survey areas, it must be assumed that the various factors that have a negative influence on the final mapping accuracy after geocoding add up to a relevant decrease. At a mere 70% overall accuracy, results from the land cover mapping with airborne hyperspectral data are not satisfactory. Thus an additional study has to be performed that thoroughly investigates the potential error sources within a single processing workflow and links individual error sources to data characteristics such as spectral detail, spatial resolution or sensor geometry. For a study of this kind, a variety of different reference data sets that cover the entire flight line is needed.

Chapter IV:4 Conclusions

Two alternative workflows for the processing of hyperspectral data are presented. Both aim at optimizing processing times while making good use of the high spectral information content of the airborne hyperspectral data. Moving the geocoding of the image data to the end of the processing reduces the amount of data for image classification by a factor of 5.4 for the pixel-based alternative workflow. The mandatory NN resampling during geocoding influences the accuracy of the final map. Despite a slightly lower overall accuracy compared to the traditional workflow, this approach is favorable for studies with very large data sets, since advanced image processing steps are either critical or not feasible in terms of processing times and memory allocation. For studies outside urban areas, i.e. areas with less frequent changes of spectrally varying materials, differences between the traditional and the alternative workflow are expected to be lower.

In the same way, the slightly lower accuracy of the segment-compressed workflow compared to the traditional workflow appears not relevant given the accumulated decrease in data size by a factor of ~70, i.e. 5.4 by geocoding and 13.1 by image segmentation. The issue of data compression will become more important with regard to increasing amounts of hyperspectral data to be processed in the future. With the increasing availability of data from airborne hyperspectral sensors like the Airborne Reflective Emissive Spectrometer (ARES) (Müller et al., 2005) and the Airborne Prism Experiment (APEX) (Nieke et al., 2006), but also from spaceborne missions like the Environmental Mapping and Analysis Project (EnMAP) (Kaufmann et al., 2005), the optimization of workflows for hyperspectral data will play an increasing role.


Against the background of the slight differences in map accuracies observed in this study, the decision on an optimal workflow should be based on processing capacities. If these are not the limiting factor, highest accuracies will be achieved with the traditional workflow. If processing times and memory allocation do not allow the traditional approach, the alternative workflow will be a reliable solution. Depending on the heterogeneity of the observed environment and with regard to findings from Chapter III, the segment-compressed workflow appears to be a very time efficient approach.

In this work, the presumably great influence of building displacement was not considered. The occlusion of surfaces behind displaced buildings negatively impacts subsequent urban environmental analyses. Similar to the phenomenon of tree crowns obscuring the surface underneath, this effect cannot be corrected, regardless of whether a DSM is available or not. Thus, a detailed assessment and quantification of the influence of these phenomena at different stages within the workflow is required to evaluate the final map accuracy of classification results from airborne hyperspectral data. Such an assessment will help to better understand the decrease of overall accuracy during geocoding and to evaluate the reliability of results achieved with airborne hyperspectral data.
