Accuracy assessment of lidar-derived digital elevation models


The Photogrammetric Record 23(122): 148–169 (June 2008)

ACCURACY ASSESSMENT OF LIDAR-DERIVED DIGITAL ELEVATION MODELS

Fernando J. Aguilar (faguilar@ual.es), University of Almeria, Spain
Jon P. Mills (j.p.mills@newcastle.ac.uk), Newcastle University

(Based on a contribution to the Annual Conference of the Remote Sensing and Photogrammetry Society at Newcastle upon Tyne, 14th September 2007)

Abstract

Despite the relatively high cost of airborne lidar-derived digital elevation models (DEMs), such products are usually presented without a satisfactory associated estimate of accuracy. For the most part, DEM accuracy estimates are provided by comparing lidar heights against a finite sample of check point coordinates from an independent source of higher accuracy, supposing a normal distribution of the derived height differences or errors. This paper proposes a new methodology to assess the vertical accuracy of lidar DEMs using confidence intervals constructed from a finite sample of errors computed at check points. A non-parametric approach has been tested where no particular error distribution is assumed, making the proposed methodology especially applicable to non-normal error distributions of the type usually found in DEMs derived from lidar. The performance of the proposed model was experimentally validated using Monte Carlo simulation on 18 vertical error data-sets. Fifteen of these data-sets were computed from original lidar data provided by the International Society for Photogrammetry and Remote Sensing Working Group III/3, using their respective filtered reference data as ground truth. The three remaining data-sets were provided by the Natural Environment Research Council's Airborne Research and Survey Facility lidar system, together with check points acquired using high precision kinematic GPS.
The results proved promising: the proposed models reproduced the statistical behaviour of lidar vertical errors using a favourable number of check points, even for data-sets with non-normally distributed residuals. This research can therefore be considered a potentially important step towards improving the quality control of lidar-derived DEMs.

Keywords: accuracy assessment, confidence intervals, DEM, lidar

© 2008 The Authors. Journal Compilation © 2008 The Remote Sensing and Photogrammetry Society and Blackwell Publishing Ltd. Blackwell Publishing Ltd. 9600 Garsington Road, Oxford OX4 2DQ, UK and 350 Main Street Malden, MA 02148, USA.

Introduction

Accurate digital elevation models (DEMs) of high spatial resolution from airborne lidar data are in increasing demand for a growing number of mapping and GIS tasks related to a wide variety of applications including forest management, urban planning, bird population modelling, ice sheet mapping, flood control and road design (Lim et al., 2003). Lidar DEMs are also being increasingly used for new applications relating to change detection and geopositioning (James et al., 2006; Rodarmel et al., 2006; Miller et al., 2007). However, despite the growing range of applications, and in spite of the relatively high cost of this new type of digital product, such DEMs are usually presented without an appropriate associated estimate of their accuracy. Indeed, with a popular estimate being that 80% of all data used by managers and decision-makers is spatially referenced (van Oort and Bregt, 2005), many researchers have stressed the need to deal with issues of spatial quality. A recent, comprehensive review regarding the importance of assessing error in DEMs can be found in Fisher and Tate (2006). In most cases, users of lidar-derived DEMs are content with some nominal specifications about raw data accuracy, as supplied by the instrument manufacturer or the data provider.
Such statistics are often inappropriate given that the major component of the total error could well be the error introduced during the processing of the raw data (data filtering, gridding, segmentation and/or object reconstruction), rather than the error induced during the data capture process (which is related to vertical and horizontal error in the positioning of the laser platform, laser scan angle, surface reflectivity and terrain slope). In fact, gridding error can comprise a very important, and often neglected, source of error which should be taken into account (Smith et al., 2005; Aguilar et al., 2006). For example, it may be especially relevant for canopy height model estimation in forestry applications where there are fewer ground-return samples for effective DEM surface interpolation (Lim et al., 2003; Clark et al., 2004; Hopkinson et al., 2004; Su and Bork, 2006). Although there have been a number of studies dealing with lidar-derived DEM error, most can be classified as empirical work in which the influence of different variables on DEM error was analysed (Hopkinson et al., 2004; Hyyppä et al., 2005; Goodwin et al., 2006; Göpfert and Heipke, 2006; Su and Bork, 2006).

In practice, statistical tests are the only way to ensure that requirements are met at a moderate cost by using an inference process based on sampling theory. At present, the great majority of DEM accuracy standards are based on computing the vertical accuracy of a finite sample data-set (check points) from the differences (residuals) between data-set heights and height values from an independent source of higher accuracy, usually obtained by differential GPS techniques, and supposing those residuals follow a normal distribution. This is the case, for example, for the "National Standard for Spatial Data Accuracy" (NSSDA) document published by the US Federal Geographic Data Committee (FGDC, 1998).
However, it should be recognised that vertical errors in lidar DEMs often follow a non-normal distribution when they are captured over non-open terrain (for instance, over vegetated or built-up areas), where both filtering and gridding can introduce a large quantity of non-random noise (Flood, 2004; Hopkinson et al., 2004; Göpfert and Heipke, 2006). Such error distributions present high kurtosis and skewness due to the presence of systematic errors and outliers. The presence of systematic errors is relatively frequent in lidar-derived DEMs of forest areas where non-ground samples (such as low vegetation and logs) are included in the DEM interpolation process and the resulting mean error tends to be positive. In such cases a lidar-derived DEM tends to overestimate the reference elevation (Clark et al., 2004).

Concerned about the likely non-normal nature of lidar-derived DEM error under non-open terrain, the American Society for Photogrammetry and Remote Sensing (ASPRS), through the work of the ASPRS Lidar Committee, recently approved a new set of guidelines (Flood, 2004). The guidelines recommend that for open terrain, where error distribution is thought to be close to a normal distribution, the methodology proposed by the NSSDA should be adopted. That is, for assessing vertical accuracy at 95% confidence level:

$$\text{Vertical accuracy} = 1.96 \times \mathrm{rmse}_z \tag{1}$$

where rmse_z is the root mean square error of the vertical (z) differences. The resulting figure is known as the "fundamental" vertical accuracy. Away from open terrain, the fundamental vertical accuracy is replaced by the so-called "supplemental" vertical accuracy, tested using the 95th percentile method.
The 95th percentile method may be used regardless of whether or not the errors follow a normal distribution and whether or not errors qualify as outliers, indicating that 95% of the errors in the data-set will have absolute values of equal or lesser value and 5% of the errors will be of larger value. Regarding the number of necessary check points, the ASPRS recommends a minimum of 20 check points (30 preferred) in each of the major land cover categories represented in the surveyed area. This gives rise to the question of exactly how many check points should be used in order to guarantee that a certain confidence level is achieved. Conversely, it may be necessary to derive the reliability of fundamental and supplemental vertical accuracy estimates when a particular number of check points is used. Deriving definitive values is not easy, and the ASPRS guidelines do not offer advice with respect to these issues.

Addressing the aforementioned issues, the main objective of this paper is the development of a theoretical methodology to assess lidar DEM vertical accuracy. Vertical accuracy is estimated by means of the construction of confidence intervals from a finite sample of residuals computed at check points evenly distributed across the whole survey area. One of the goals is to derive the minimum number of check points needed to guarantee a certain confidence level. To this end, a non-parametric approach based on the theory of estimating functions has been adopted and tested. This approach means that no particular distribution of residuals at check points is assumed and so it is applicable to the non-normal error distributions which are usually found in lidar-derived DEMs.
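The two ASPRS figures described above (fundamental accuracy from equation (1) and supplemental accuracy from the 95th percentile of absolute errors) can be sketched as follows. This is a minimal illustration, not the guidelines' reference implementation: the function names are hypothetical, the sample errors are made up, and the linear interpolation between ranks used for the percentile is an assumption, since the text does not state which percentile rule is applied.

```python
import math


def rmse(errors):
    """Root mean square error of a sequence of vertical errors (metres)."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def fundamental_accuracy(errors):
    """Equation (1): vertical accuracy at 95% confidence, assuming normal errors."""
    return 1.96 * rmse(errors)


def supplemental_accuracy(errors, q=0.95):
    """q-th percentile of the absolute errors.

    Assumption: linear interpolation between the two nearest ranks.
    """
    ranked = sorted(abs(e) for e in errors)
    pos = q * (len(ranked) - 1)            # fractional rank
    lower = math.floor(pos)
    upper = min(lower + 1, len(ranked) - 1)
    return ranked[lower] + (pos - lower) * (ranked[upper] - ranked[lower])


# Made-up check-point errors in metres, for illustration only.
errors = [0.05, -0.02, 0.10, 0.01, -0.04, 0.45, 0.03, 0.00, 0.07, -0.01]
print(round(fundamental_accuracy(errors), 3), round(supplemental_accuracy(errors), 3))
```

With a single large error in a small sample, the two figures can differ noticeably, which is the situation the guidelines address by switching to the percentile method away from open terrain.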
Development of the Proposed Models

The models tested in the work reported here concern two different scenarios:

(1) Construction of confidence intervals relating to the distribution of lidar vertical errors from a finite sample of residuals coming from a population not necessarily expected to be normally distributed.
(2) Construction of confidence intervals for the assessment of an average error statistic for which the meaning will easily be understood, both on the part of the user and the producer of lidar-derived DEMs. In this case rmse, as a widely used average error statistic, was chosen.

Both of the contemplated scenarios avoid the very common assumption of normally distributed errors, as adopted in the majority of current vertical accuracy standards.

Model for Estimating Lidar Vertical Accuracies

Suppose that the difference between a lidar-derived DEM and ground truth is a random variable X. This being the case, a sample of N height differences (X = {x1, x2, …, xN}) may be used to estimate the mathematical expectation (mean value, μ) and dispersion (standard deviation, σ) of the said height differences for the whole surface. This is the procedure adopted by the NSSDA standards and the ASPRS Lidar Committee, as was previously discussed.

The first step is to obtain a confidence interval for μ by means of statistical inference, taking into account the likely non-normal nature of the errors. It is therefore necessary to determine the confidence interval using a non-parametric approach where no particular distribution is assumed. This issue can be resolved using the theory of estimating functions, as demonstrated by Aguilar et al. (2007a). An outline of the main steps in deriving equations (2) and (3) can also be found in the Appendix located at the end of this paper.
$$\mu_{upper} = \bar{x} + \left[-\frac{\gamma_{2m}+2}{\gamma_{1m}} + \sqrt{\left(\frac{\gamma_{2m}+2}{\gamma_{1m}}\right)^{2} + 4\left(\frac{t_{\alpha}\sqrt{(\gamma_{2m}+2)(\gamma_{2m}+2-\gamma_{1m}^{2})}}{|\gamma_{1m}|} + 1\right)}\,\right]\frac{\sigma}{2\sqrt{N}} \tag{2}$$

$$\mu_{lower} = \bar{x} + \left[-\frac{\gamma_{2m}+2}{\gamma_{1m}} - \sqrt{\left(\frac{\gamma_{2m}+2}{\gamma_{1m}}\right)^{2} + 4\left(\frac{t_{\alpha}\sqrt{(\gamma_{2m}+2)(\gamma_{2m}+2-\gamma_{1m}^{2})}}{|\gamma_{1m}|} + 1\right)}\,\right]\frac{\sigma}{2\sqrt{N}} \tag{3}$$

where x̄ is the sample mean, N is the sample size, σ is the value of the population standard deviation (which will be estimated by the sample standard deviation Sd), and finally γ1m and γ2m are the skewness and standardised kurtosis of the sampling distribution of the sample mean, respectively. Skewness and standardised kurtosis can be estimated from a finite sample by means of the following expressions:

$$\gamma_{1} = \frac{N\sum_{i=1}^{N}(x_i-\bar{x})^{3}}{(N-1)(N-2)\,Sd^{3}}; \qquad \gamma_{1m} = \frac{\gamma_{1}}{\sqrt{N}} \tag{4}$$

$$\gamma_{2} = \frac{N(N+1)\sum_{i=1}^{N}(x_i-\bar{x})^{4}}{(N-1)(N-2)(N-3)\,Sd^{4}} - \frac{3(N-1)^{2}}{(N-2)(N-3)}; \qquad \gamma_{2m} = \frac{\gamma_{2}}{N}. \tag{5}$$

The use of tα, a one-tailed critical value corresponding to a confidence level of 1 − α, instead of zα should be noted in equations (2) and (3). This change is recommended when estimating confidence intervals using the sample standard deviation instead of σ, especially when working with small samples (fewer than 100 observations). In this case it is more appropriate to employ Student's t-distribution, calculating tα values for N − 1 degrees of freedom, N being the sample size (Steel and Torrie, 1980). Once the confidence interval for the mean is determined, which therefore includes the disturbance effects of error bias, skewness and kurtosis, the second step involves treating the data as if they came from a normal population.
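The sample estimators in equations (4) and (5) can be sketched in a few lines of Python. This is an illustrative sketch only: the function names are hypothetical, and it assumes that Sd is the sample standard deviation computed with an N − 1 denominator, which the text does not state explicitly.

```python
import math


def sample_moments(x):
    """Return (mean, Sd, g1, g2): skewness g1 from equation (4) and
    standardised kurtosis g2 from equation (5).

    Assumption: Sd uses the N - 1 denominator.
    """
    n = len(x)
    if n < 4:
        raise ValueError("equations (4) and (5) need at least 4 values")
    mean = sum(x) / n
    dev = [xi - mean for xi in x]
    sd = math.sqrt(sum(d ** 2 for d in dev) / (n - 1))
    g1 = n * sum(d ** 3 for d in dev) / ((n - 1) * (n - 2) * sd ** 3)
    g2 = (n * (n + 1) * sum(d ** 4 for d in dev)
          / ((n - 1) * (n - 2) * (n - 3) * sd ** 4)
          - 3 * (n - 1) ** 2 / ((n - 2) * (n - 3)))
    return mean, sd, g1, g2


def moments_of_mean(x):
    """Skewness and kurtosis of the sampling distribution of the mean:
    g1m = g1 / sqrt(N) and g2m = g2 / N."""
    n = len(x)
    _, _, g1, g2 = sample_moments(x)
    return g1 / math.sqrt(n), g2 / n


# Made-up residuals (metres) with one large positive error.
residuals = [0.02, 0.05, -0.01, 0.03, 0.60, 0.04, 0.00, 0.01]
print(sample_moments(residuals))
print(moments_of_mean(residuals))
```

Dividing by √N and N shows why the influence of skewness and kurtosis on the interval for the mean fades as more check points are used.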
Under this hypothesis the following expressions can be written to estimate the maximum and minimum expected vertical error (x_upper and x_lower, respectively) for a 1 − α confidence level:

$$x_{upper} = \mu_{upper} + t_{\alpha}\,Sd \tag{6}$$

$$x_{lower} = \mu_{lower} - t_{\alpha}\,Sd. \tag{7}$$

Systematic errors are not assumed to have been removed and so the mean is not necessarily expected to be zero. This phenomenon is frequently encountered when working with lidar data in afforested areas where it is usual to find a positive bias due to dense low lying vegetation beneath the tree canopy (Kraus and Pfeifer, 1998; Goodwin et al., 2006; Göpfert and Heipke, 2006; Su and Bork, 2006).

Model for Estimating Lidar-Derived DEM Accuracy Using Rmse

In this case it is necessary to construct a confidence interval to estimate uncertainty when rmse is employed to evaluate lidar-derived DEM vertical accuracy, again supposing a non-normal error distribution. The approach proposed by Aguilar et al.
(2007a), based on estimating functions theory, was again adopted to obtain the following equations:

$$\mathrm{rmse}_{upper} = \sqrt{\mathrm{mse} + \frac{\sigma_{mse}}{2}\left[-\frac{\gamma_{2mse}+2}{\gamma_{1mse}} + \sqrt{\left(\frac{\gamma_{2mse}+2}{\gamma_{1mse}}\right)^{2} + 4\left(\frac{t_{\alpha}\sqrt{(\gamma_{2mse}+2)(\gamma_{2mse}+2-\gamma_{1mse}^{2})}}{|\gamma_{1mse}|} + 1\right)}\,\right]} \tag{8}$$

$$\mathrm{rmse}_{lower} = \sqrt{\mathrm{mse} + \frac{\sigma_{mse}}{2}\left[-\frac{\gamma_{2mse}+2}{\gamma_{1mse}} - \sqrt{\left(\frac{\gamma_{2mse}+2}{\gamma_{1mse}}\right)^{2} + 4\left(\frac{t_{\alpha}\sqrt{(\gamma_{2mse}+2)(\gamma_{2mse}+2-\gamma_{1mse}^{2})}}{|\gamma_{1mse}|} + 1\right)}\,\right]} \tag{9}$$

where mse is the mean square error, σ_mse is the standard deviation of the mse sampling distribution, and finally γ1mse and γ2mse are the mse skewness and standardised kurtosis, respectively. All described central moments for square errors may be estimated from a sample of N check points by employing the following expressions:

$$\mathrm{mse} = \frac{\sum_{i=1}^{N} x_i^{2}}{N} \tag{10}$$

$$Sd_{x^2} = \sqrt{\frac{\sum_{i=1}^{N}\left(x_i^{2}-\mathrm{mse}\right)^{2}}{N}}; \qquad \sigma_{mse} = \frac{\sigma_{x^2}}{\sqrt{N}} \approx \frac{Sd_{x^2}}{\sqrt{N}} \tag{11}$$

$$\gamma_{1x^2} = \frac{N\sum_{i=1}^{N}\left(x_i^{2}-\mathrm{mse}\right)^{3}}{(N-1)(N-2)\,Sd_{x^2}^{3}}; \qquad \gamma_{1mse} = \frac{\gamma_{1x^2}}{\sqrt{N}} \tag{12}$$

$$\gamma_{2x^2} = \frac{N(N+1)\sum_{i=1}^{N}\left(x_i^{2}-\mathrm{mse}\right)^{4}}{(N-1)(N-2)(N-3)\,Sd_{x^2}^{4}} - \frac{3(N-1)^{2}}{(N-2)(N-3)}; \qquad \gamma_{2mse} = \frac{\gamma_{2x^2}}{N}. \tag{13}$$

Methodology

Test Data

The study sites concern lidar surveys of two different areas. The first area comprises the data-sets from the second phase of the European Organisation for Experimental Photogrammetric Research (OEEPE, now European Spatial Data Research, EuroSDR) project on laser scanning.
Data for the second area was acquired in a series of lidar surveys carried out by the Natural Environment Research Council's Airborne Research and Survey Facility (NERC ARSF).

Data-sets from the EuroSDR Project on Laser Scanning

As part of the second phase of the EuroSDR project on laser scanning, different lidar data-sets, first and second pulse return, were acquired over the Vaihingen/Enz and Stuttgart city centre (Germany) test sites using an Optech ALTM laser scanner. International Society for Photogrammetry and Remote Sensing (ISPRS) Working Group III/3 has previously used these data-sets for testing numerous filtering algorithms (Sithole and Vosselman, 2004) and data has been made available through the working group's website (http://www.commission3.isprs.org/wg3). A total of seven study sites (four urban and three rural) were selected for testing the aforementioned models, with sites chosen for their varied characteristics and diverse feature content (open fields, dense buildings, large buildings, quarries, bridges, dense vegetation and steep slopes, among others). The selected sites were appropriate to enable validation of the proposed models against a very wide range of operational conditions. Fifteen sub-sites were extracted from the original data (Table I). For a better understanding of the nature of the data, a planimetric view of Site 2 and corresponding reference sub-sites are depicted in Fig. 1. Reference data (ground truth) was generated by manual filtering of the height data-set to segment bare earth data, the process assisted by knowledge of the landscape and aerial photographs (see Sithole and Vosselman (2003) for further details).

Table I. Characteristics of the selected EuroSDR data-sets.

| Site | Point spacing | Reference data (sub-site) | Number of points | Characteristics |
|---|---|---|---|---|
| Site 1 | 1 m | Sample 11 | 17 626 | Steep slopes, mixture of vegetation and buildings on hillside, buildings on hillside |
| | 1 m | Sample 12 | 25 344 | |
| Site 2 | 1 m | Sample 21 | 10 055 | Large and irregularly shaped buildings, small tunnel, road and bridge |
| | 1 m | Sample 22 | 21 570 | |
| | 1 m | Sample 23 | 11 025 | |
| | 1 m | Sample 24 | 3755 | |
| Site 3 | 1 m | Sample 31 | 15 544 | Vegetation between a high density of buildings, building with eccentric roof, open terrain with mixture of low and high features |
| Site 4 | 1 m | Sample 41 | 1653 | Railway station with trains (low density of ground points) |
| | 1 m | Sample 42 | 11 989 | |
| Site 5 | 2 m | Sample 51 | 13 860 | Steep vegetated slopes, quarry (sharp break lines), vegetated river bank |
| | 2 m | Sample 52 | 17 678 | |
| | 2 m | Sample 53 | 25 113 | |
| | 2 m | Sample 54 | 3948 | |
| Site 6 | 2 m | Sample 61 | 31 531 | Large buildings, road including embankments |
| Site 7 | 2 m | Sample 71 | 12 786 | Road, underpass, road including embankments |

Fig. 1. Site 2 with corresponding reference sub-sites (Samples 21, 22, 23 and 24) shown as perspective views.

Data-sets from the NERC ARSF

Filey Bay is located approximately 11 kilometres south of the coastal town of Scarborough in North Yorkshire, UK. Three lidar data-sets of Filey Bay were acquired using an Optech ALTM 3033 scanner (operating at 1000 m flight altitude) in April 2005 (data-set 0405), August 2005 (0805) and May 2006 (0506); Miller et al. (2007) contains further details of the surveys. All issues related to georeferencing were carried out by the NERC ARSF. As independent quality control, delivered lidar data was compared against a testfield based on check points acquired during a previous survey in 2001 using kinematic GPS (anticipated vertical accuracy of around 1 to 2 cm). The test area consisted of a hard surface, an asphalt road circuit, located at Filey Brigg Country Park (Fig.
2) that enabled an open terrain vertical accuracy evaluation of the lidar data to be undertaken. In this case no error from filtering or gridding is expected, and so all error can be assumed to arise from the data capture process. Table II gives statistical parameters associated with the vertical errors computed for the Filey Bay data-sets. It should be highlighted that the error distribution generally fitted a normal distribution except in the case of the 0506 lidar data, where the "3-sigma" rule (Daniel and Tennant, 2001) had to be applied to remove three likely outliers. Notice that the Kolmogorov–Smirnov (K-S) test (Royston, 1982), with a confidence level of 95%, demonstrated that the error distribution for each data-set matched a normal probability distribution quite well because the K-S statistic was always lower than the 95% critical value. Despite the 0805 lidar data-set presenting an absolute value of skewness higher than the critical value of 0.5, proposed by some authors as a reasonable limit to consider data to come from a normal distribution (Daniel and Tennant, 2001), no outlier correction was applied because of the results yielded by the K-S test. Therefore, these data-sets were subsequently used for testing the proposed model to construct rmse confidence intervals from normally distributed lidar error data.

Fig. 2. GPS check point testfield located at Filey Brigg Country Park.

Table II. Statistical characteristics of Filey Bay data-sets.

| Data-sets | 0405 lidar | 0805 lidar | 0506 lidar (raw) | 0506 lidar (3-sigma rule) |
|---|---|---|---|---|
| Number of check points | 840 | 833 | 834 | 831 |
| Minimum error (m) | −0.16 | −0.29 | −0.03 | −0.03 |
| Maximum error (m) | 0.03 | 0.23 | 1.31 | 0.20 |
| Mean error (m) | −0.06 | −0.16 | 0.09 | 0.08 |
| Standard deviation (m) | 0.03 | 0.05 | 0.06 | 0.04 |
| Skewness | −0.04 | 1.24 | 10.75 | 0.19 |
| Standardised kurtosis | 0.12 | 9.07 | 179.45 | 0.00 |
| K-S statistic | 0.026 | 0.038 | 0.151 | 0.029 |
| K-S critical value (95%) | 0.047 | 0.047 | 0.047 | 0.047 |

Raw Data Processing of EuroSDR Data

Last-return raw EuroSDR lidar data was filtered to segment the ground surface from vegetation, buildings and any gross errors embedded in the general point cloud (Axelsson, 1999). The filtering algorithm developed by Axelsson (2000) was used to segment ground points by means of a progressive triangular irregular network (TIN) densification method where the surface was allowed to fluctuate within certain values. Briefly, a sparse TIN is derived from neighbourhood minima, and then progressively densified through the laser point cloud. The algorithm has been implemented in TerraScan (integrated in Terrasolid, version 007.004), the software used to carry out the filtering process. As the first step in the filtering process, points regarded as too low or very high were removed as likely outliers. Subsequently, the ground point class was filtered and saved as points belonging to the DEM. Fig. 3 depicts an example of the results from the filtering process for Site 5.

Fig. 3. Filtering process over Site 5. Top: last-return raw data. Bottom: last-return ground-filtered data.

The interpolation method used to refill any gaps created in the filtered ground model (gaps resulting from the removal of points on buildings, vegetation and the like) was the
Multiquadric Radial Basis Function working with the local support of the eight neighbours closest to the interpolated point, since this has previously been found generally to yield better results (Aguilar et al., 2005). The final grid spacing of the interpolated DEM was the same as presented by the raw lidar data. Lidar error was computed as the difference between the derived ground DEM heights (obtained after applying Axelsson's filtering) and the heights from the manually filtered reference data for each of the 15 EuroSDR samples (as listed in Table I). It is important to note that large DEM data gaps caused by the filtering process (for example, the removal of points on large buildings) were not included in the error calculation in order to avoid the uncertainty in areas where no reference data (ground truth) was available. Hence the error data-sets obtained (Table III) can be considered very close to the errors found in real-world lidar surveys.

Furthermore, the 3-sigma rule was applied to remove outliers which can corrupt the true statistical distribution of the errors. In fact, this technique is generally applied under operational conditions for removing outliers in the DEM quality assessment of lidar data (Daniel and Tennant, 2001). The percentage of data removed ranged between 0.5 and 3.5%, with a mean value of 1.8%. These results are fairly consistent with those published by Torlegård et al. (1986) referring to errors from photogrammetric DEMs. Nevertheless, despite applying the 3-sigma rule, corrected error data-sets still followed a non-normal distribution according to the K-S 95% confidence test.

Table III. Statistics for 3-sigma rule corrected error from EuroSDR data-sets.

| Sample | Points | Mean (m) | Sd (m) | γ1 | γ2 | K-S | Critical K-S | % data outliers |
|---|---|---|---|---|---|---|---|---|
| 11 | 16 995 | 0.16 | 0.44 | 3.30 | 11.82 | 0.322 | 0.010 | 3.5 |
| 12 | 25 203 | 0.04 | 0.16 | 6.26 | 54.41 | 0.291 | 0.010 | 0.5 |
| 21 | 9742 | 0.02 | 0.05 | 1.83 | 4.77 | 0.153 | 0.014 | 3.1 |
| 22 | 21 193 | 0.04 | 0.13 | 5.12 | 32.39 | 0.272 | 0.009 | 1.7 |
| 23 | 10 871 | 0.05 | 0.20 | 5.06 | 31.26 | 0.299 | 0.013 | 1.4 |
| 24 | 3695 | 0.04 | 0.17 | 5.31 | 45.81 | 0.273 | 0.022 | 1.6 |
| 31 | 15 315 | 0.01 | 0.04 | 1.29 | 4.16 | 0.108 | 0.011 | 1.4 |
| 41 | 1626 | 0.25 | 1.11 | 5.05 | 24.41 | 0.424 | 0.034 | 1.6 |
| 42 | 11 743 | 0.02 | 0.07 | 2.98 | 13.22 | 0.201 | 0.013 | 2.0 |
| 51 | 13 701 | 0.00 | 0.06 | 0.26 | 5.32 | 0.107 | 0.012 | 1.1 |
| 52 | 17 368 | 0.08 | 0.30 | 2.52 | 8.41 | 0.270 | 0.010 | 1.7 |
| 53 | 24 702 | 0.16 | 0.71 | 6.64 | 57.08 | 0.360 | 0.009 | 1.6 |
| 54 | 3863 | 0.00 | 0.08 | 2.79 | 19.27 | 0.179 | 0.022 | 2.1 |
| 61 | 31 057 | 0.01 | 0.10 | 3.84 | 24.07 | 0.252 | 0.008 | 1.5 |
| 71 | 12 517 | 0.02 | 0.11 | 2.33 | 10.80 | 0.208 | 0.012 | 2.1 |

Experimental Validation of the Proposed Models

The Monte Carlo method was employed for model validation. Monte Carlo numerical simulation is based on stochastic techniques, meaning it is based on the use of random numbers and probability statistics, to investigate complex problems without relying on analytical equations (Weir, 2002). For the first model evaluated, described through equations (2) to (7), the simulation started by extracting a sample of residuals (sample X of size N) from each of the EuroSDR data-sets by means of random sampling. In this case N took the values of 20, 40, 60, 80 and 100 check points. The model parameters were then computed and the bounds of the interval, x_upper and x_lower, were obtained for a 1 − α confidence level. The number of errors from the whole data-set (population) which were within the bounds estimated by the model was then calculated. It must be noted that the percentage of errors included within the bounds should be close to 95% if, for instance, a significance level of α = 0.05 was chosen.
This procedure was repeated 1000 times to estimate the mean value of the percentage of errors from the population falling within the bounds and its associated reliability measure (coefficient of variation). Over the same simulated samples, and purely for comparison purposes, fundamental and supplemental vertical accuracies were also computed according to the ASPRS guidelines for lidar data (Flood, 2004), their corresponding mean values and associated reliabilities also being obtained for 1000 runs.

For the evaluation of the second model, described through equations (8) to (13), the simulation procedure was quite similar to the first, with the exception that the estimated bounds were rmse values and the sample size increased in increments of 20, from 20 to 200 check points. The population rmse for every data-set was calculated from all residual values available in each. It was then necessary to determine whether the corresponding population rmse fell within the limits of the calculated confidence intervals and whether the model had correctly predicted the expected uncertainty for data-set estimation. In this case the procedure was run 2000 times because it was computationally faster than for the first model.

Note that because samples were extracted from finite populations, a correcting coefficient must be applied to calculate sample variance in order to correct the reduced variability expected on consecutive random samplings extracted from the same finite residual population (Steel and Torrie, 1980). Therefore, the following equation was applied:

$$Sd_{m} = \frac{Sd}{\sqrt{N}}\sqrt{\frac{M-N}{M}} \tag{14}$$

where M is the number of residuals in the population, N is the sample size, Sd is the standard deviation of the sample of residuals and Sdm is the standard deviation of the sample mean.
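The simulation loop described above can be sketched as follows. This is a hedged illustration rather than the authors' code: the function names, the synthetic Gaussian population and the fixed seeds are all assumptions, and the example interval uses the 95th percentile of absolute sample errors (the supplemental accuracy) as a symmetric bound instead of the proposed model, purely to keep the sketch short.

```python
import math
import random


def fpc_sd_of_mean(sd, n, m):
    """Equation (14): standard deviation of the sample mean, corrected for
    sampling n residuals from a finite population of m residuals."""
    return sd / math.sqrt(n) * math.sqrt((m - n) / m)


def symmetric_p95_bounds(sample):
    """Illustrative bounds: +/- the 95th percentile of absolute errors
    (linear interpolation between ranks is an assumption)."""
    ranked = sorted(abs(e) for e in sample)
    pos = 0.95 * (len(ranked) - 1)
    lower = math.floor(pos)
    upper = min(lower + 1, len(ranked) - 1)
    p95 = ranked[lower] + (pos - lower) * (ranked[upper] - ranked[lower])
    return -p95, p95


def coverage_experiment(population, n, runs, bound_fn, seed=0):
    """Draw `runs` random samples of size n (without replacement), compute
    interval bounds from each sample, and measure the percentage of
    population errors inside the bounds.

    Returns (mean coverage in %, coefficient of variation in %).
    """
    rng = random.Random(seed)
    coverage = []
    for _ in range(runs):
        sample = rng.sample(population, n)
        low, high = bound_fn(sample)
        inside = sum(low <= e <= high for e in population)
        coverage.append(100.0 * inside / len(population))
    mean_cov = sum(coverage) / runs
    variance = sum((c - mean_cov) ** 2 for c in coverage) / (runs - 1)
    return mean_cov, 100.0 * math.sqrt(variance) / mean_cov


# Synthetic population standing in for a data-set of residuals (assumption).
gen = random.Random(1)
population = [gen.gauss(0.0, 0.1) for _ in range(2000)]
for n in (20, 40, 60, 80, 100):
    print(n, coverage_experiment(population, n, 1000, symmetric_p95_bounds))
```

Any interval model can be evaluated in the same way by passing a different `bound_fn` that returns a lower and an upper bound for a sample.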
Results and Discussion

Confidence Intervals for Lidar Vertical Accuracy Assessment

Fig. 4 shows the performance of the proposed model compared against results achieved using the ASPRS guidelines (Flood, 2004). Only results for samples 11, 52 and 61 are given, although results are similar for all 15 samples.

[Figure 4: plots not reproduced. Left column: fundamental vertical accuracy, supplemental vertical accuracy and 95% population percentile (accuracy, m), together with the reliability (%) of the fundamental and supplemental accuracy calculations, against the number of check points. Right column: mean percentage of errors within the computed interval (95% confidence level) for the proposed model, together with the reliability (%) of the computed intervals, against the number of check points.]

Fig. 4. Evaluation results for samples 11 (top graphs), 52 (middle) and 61 (bottom) from the EuroSDR data-set. Left column: accuracy following ASPRS guidelines. Right column: proposed model.
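The two ASPRS quantities compared in Fig. 4 can be sketched as follows. This is a hedged illustration, assuming that the fundamental vertical accuracy of equation (1) is the usual 1.96 × rmse expression and that the supplemental accuracy is the 95th percentile of the absolute errors; the percentile interpolation rule, the function names and the synthetic error population are assumptions made for the example only.

```python
import math
import random


def rmse(errors):
    """Root mean square error of a list of vertical errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def fundamental_accuracy(errors):
    """1.96 * rmse: the 95% accuracy if errors are normal with zero mean."""
    return 1.96 * rmse(errors)


def percentile_95_abs(errors):
    """95th percentile of absolute errors (linear interpolation between ranks)."""
    a = sorted(abs(e) for e in errors)
    pos = 0.95 * (len(a) - 1)
    lo = math.floor(pos)
    hi = min(lo + 1, len(a) - 1)
    return a[lo] + (pos - lo) * (a[hi] - a[lo])


if __name__ == "__main__":
    rng = random.Random(7)
    # Hypothetical right-skewed error population (metres), for illustration only.
    population = [rng.gauss(0.0, 0.05) + rng.expovariate(8.0) for _ in range(10000)]
    sample = rng.sample(population, 60)
    print("fundamental (1.96 x rmse): ", round(fundamental_accuracy(sample), 3))
    print("supplemental (95th pct.):  ", round(percentile_95_abs(sample), 3))
    print("population 95th percentile:", round(percentile_95_abs(population), 3))
```

Repeating the sampling step many times and recording the spread of each estimate gives the kind of reliability figure plotted alongside the accuracies.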
Results calculated using the fundamental accuracy assessment, as defined in equation (1) supposing a normal distribution of the error, were usually far from the true vertical accuracy computed for the error population. This is to be expected, given the non-normally distributed nature of lidar error data-sets in non-open terrain, as previously confirmed by the results of the K-S test given in Table III. Therefore, following the recommendations of the ASPRS guidelines, the supplemental vertical accuracy should be computed using the 95th percentile method, as explained in the introduction. In this case the supplemental accuracy always overestimated the true value, but approached it as the sample size increased. The problem with both fundamental and supplemental vertical accuracy is that the variability of the estimates was very noticeable, especially in the case of the latter (always above ±20%, even when using up to 100 check points). Hence many more than 100 check points would have to be employed to ensure a confidence level close to the target value of 95%.

However, the 95% confidence intervals computed by means of the proposed model, developed through equations (2) to (7), achieved a percentage of population errors within the bounds very close to the target value, even when working with only 20 check points. Better still, the variability of the estimates was very low, as can be seen in Fig. 4. In summary, only around 60 check points would be needed to reach a confidence level of 95% with an estimation error below ±2.5%.

Fig. 5 shows confidence interval limits (the mean of 1000 runs) computed from the proposed model for the three samples referred to in Fig. 4, randomly varying the initial finite sample of check points. The stability of the confidence interval length along the horizontal axis (sample size) should be highlighted.
The confidence interval length basically depends on the sample standard deviation, skewness and standardised kurtosis, rather than on the number of check points used. Therefore, precise and narrow confidence intervals can be calculated when working with as few as 60 check points. Notice that, unlike the case for a normal distribution, the confidence intervals are not symmetrical around the mean value but are biased in the direction of skewness.

[Figure 5: plots not reproduced. Vertical axis: error value (m); horizontal axis: number of check points; series: upper limit, lower limit and population mean.]

Fig. 5. Confidence intervals (95% confidence level) as a function of sample size, computed using the proposed model, for samples 11 (top graph), 52 (middle) and 61 (bottom) from the EuroSDR data-set.

The importance of developing this simple and understandable model for accuracy estimation is supported by the fact that users have traditionally failed to quantify risks owing to a lack of tools and theory and to poor documentation on spatial data quality (van Oort and Bregt, 2005). The presentation of clear upper and lower limits, rather than a single accuracy value, may substantially improve the understanding of accuracy test results by users of lidar data and enhance communication with data providers. Furthermore, it is worth noting that the proposed model, unlike the ASPRS methodology, does not assume any prior lidar error distribution, since all the required information is extracted from the sample by means of a non-parametric approach. Thus, the so-called "strong assumption" of distribution normality accepted in many previous works, and in the majority of the current standards for positional accuracy, is avoided.

Confidence Intervals for Rmse Estimate

Figs. 6 and 7 show the performance of the model tested to compute rmse 95% confidence intervals, as well as that of another model based on the Student's t asymptotic approach. The Student's t model has been formulated under the hypothesis of an almost-normal distribution of lidar vertical error and is given by the following expressions:

$$rmse_{upper} = \sqrt{mse + t_{\alpha/2}\,\sigma_{mse}} \qquad (15)$$

$$rmse_{lower} = \sqrt{mse - t_{\alpha/2}\,\sigma_{mse}} \qquad (16)$$

where $mse$ is the mean square error given by equation (10), $\sigma_{mse}$ is the standard deviation of the mse sampling distribution (equation (11)) and $t_{\alpha/2}$ is the two-tailed critical value corresponding to significance level $\alpha$ for $N - 1$ degrees of freedom, $N$ being the sample size (see Aguilar et al. (2007a) for further details).

In the case of error data coming from non-open terrain, such as those from the EuroSDR reference data-sets (Fig. 6), it seems clear that the Student's t approach failed to reach the expected target value of 95% confidence, even when using sample sizes of up to 200 check points (Fig. 6, right). Conversely, the model tested here can be considered successful once a minimum of between 60 and 160 check points, depending on the data-set, is used, 100 being the average for all 15 EuroSDR data-sets (see Fig. 9).

[Figure 6: plots not reproduced. Left column: rmse (m) against number of check points; series: upper and lower limits of the proposed model, upper and lower limits of the Student's t model, and population rmse. Right column: P (%) against number of check points; series: proposed model and Student's t model.]

Fig. 6. Rmse evaluation results for samples 11 (top graphs), 52 (middle) and 61 (bottom) from the EuroSDR data-set. Left column: rmse confidence intervals (95% level) computed from the proposed model and the Student's t approach. Right column: percentage of times population rmse falls within intervals computed for the proposed model and the Student's t approach.

Notice that the confidence intervals (Fig. 6, left) are narrower in the case of the Student's t model because the residual population has been assumed to be normal, and so standardised kurtosis and skewness have been set to zero (no outliers or systematic errors present). This is an unrealistic scenario, which is overcome by means of the assumption-free non-parametric model.
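Equations (15) and (16) can be illustrated with a short Python sketch. Two assumptions are made here because equations (10) and (11) are not reproduced in this part of the text: mse is taken as the mean of the squared residuals, and the standard deviation of the mse sampling distribution is taken as the sample standard deviation of the squared residuals divided by the square root of N. The critical value is passed in by the caller (for example, 2.093 for N = 20 and a two-tailed significance level of 0.05), and the function name is hypothetical.

```python
import math
import random
import statistics


def rmse_t_interval(residuals, t_crit):
    """Student's t style bounds for rmse, following equations (15) and (16).

    Assumed forms (not confirmed by the text shown here):
      mse       = mean of squared residuals
      sigma_mse = stdev of squared residuals / sqrt(N)
    """
    squared = [e * e for e in residuals]
    n = len(squared)
    mse = statistics.fmean(squared)
    sigma_mse = statistics.stdev(squared) / math.sqrt(n)
    upper = math.sqrt(mse + t_crit * sigma_mse)
    # Guard against a negative radicand, which can occur for small, heavy-tailed samples.
    lower = math.sqrt(max(mse - t_crit * sigma_mse, 0.0))
    return lower, upper


if __name__ == "__main__":
    rng = random.Random(11)
    # Hypothetical near-normal residuals (metres), as might be found over open terrain.
    residuals = [rng.gauss(0.0, 0.1) for _ in range(20)]
    lower, upper = rmse_t_interval(residuals, t_crit=2.093)  # 19 degrees of freedom
    print(f"rmse bounds: [{lower:.3f}, {upper:.3f}] m")
```

Because only two low-order quantities are estimated, this interval needs no skewness or kurtosis input, which is also why it cannot widen to accommodate outliers.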
It is recognised that whilst non-normal but fairly symmetrical populations converge to normal-looking sampling distributions with relatively small samples, strongly skewed distributions require much larger samples for the normal approximation to be good. Thus the Student's t approach is expected to require a large number of check points when working with a population that exhibits high skewness.

[Figure 7: plots not reproduced. Left column: rmse (m) against number of check points; series: upper and lower limits of the proposed model, upper and lower limits of the Student's t model, and population rmse. Right column: P (%) against number of check points; series: proposed model and Student's t model.]

Fig. 7. Rmse evaluation results for 0405 lidar (top graphs), 0805 lidar (middle) and 3-sigma rule corrected 0506 lidar (bottom) from the NERC ARSF data-set. Left column: rmse confidence intervals (95% level) computed from the proposed model and Student's t approach. Right column: percentage of times population rmse falls within intervals computed for the proposed model and Student's t approach.

Results from the NERC ARSF test data (Fig. 7), however, demonstrated that the Student's t approach may be deemed appropriate when dealing with normal lidar vertical error distributions, which are usually found over open terrain (no filtering error). It should be noted that the three NERC ARSF error data-sets clearly followed a normal distribution after being filtered using the 3-sigma rule, as can be observed from the K-S test results in Table II. In this case, results indicate that working with only a few check points, around 20 to 40, would be sufficient. In this instance the proposed model computed confidence intervals that were too wide and showed somewhat erratic behaviour when working with a small number of check points. Indeed, despite the wide mean confidence interval that was calculated, the percentage of times the population rmse fell within the computed interval limits was very low until a sample size of between 60 and 80 check points was reached. This was mainly due to the high level of uncertainty in estimating the kurtosis value from a small sample size (Aguilar et al., 2007b). The Student's t model estimates only the mean square error and the standard deviation of the mean square error sampling distribution, whose values are more reliable than kurtosis and skewness when estimated from small sample sizes.

[Figure 8: scatter plot not reproduced. Horizontal axis: standardised kurtosis; vertical axis: check points needed to obtain the required confidence level. The two fitted curves are annotated R² = 0.9515 and R² = 0.9526.]

Fig. 8. Relationship between check points needed to reach a determined confidence level for the computed rmse bounds (circles 95%; triangles 90%) and the residual population kurtosis. Data from the 15 EuroSDR samples.
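The difficulty of estimating skewness and kurtosis from few check points, noted above, can be explored with a small Python sketch. The moment-based definitions used here (third and fourth central moments divided by powers of the second, with 3 subtracted for excess kurtosis) are an assumption; the exact standardisation used for the kurtosis values in the paper is not stated in this part of the text. The function names and the synthetic population are likewise hypothetical.

```python
import random
import statistics


def central_moment(x, k):
    """k-th central moment of the values in x (divisor N)."""
    m = statistics.fmean(x)
    return sum((v - m) ** k for v in x) / len(x)


def skewness(x):
    """Moment-based skewness: m3 / m2**1.5."""
    return central_moment(x, 3) / central_moment(x, 2) ** 1.5


def excess_kurtosis(x):
    """Moment-based excess kurtosis: m4 / m2**2 - 3 (zero for a normal distribution)."""
    return central_moment(x, 4) / central_moment(x, 2) ** 2 - 3.0


if __name__ == "__main__":
    rng = random.Random(3)
    # Hypothetical skewed error population, for illustration only.
    population = [rng.expovariate(5.0) for _ in range(20000)]
    print("population excess kurtosis:", round(excess_kurtosis(population), 2))
    for n in (20, 80, 200):
        estimates = [excess_kurtosis(rng.sample(population, n)) for _ in range(500)]
        print(f"N={n:3d}  mean estimate={statistics.fmean(estimates):6.2f}"
              f"  sd of estimates={statistics.stdev(estimates):5.2f}")
```

Comparing the mean estimate at each N with the population value gives a rough indication of how far small-sample kurtosis estimates can lie from the quantity that the interval model needs.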
Determining the Minimum Required Number of Check Points

The nature of the relationship between the number of check points needed to reach a determined confidence level for the computed rmse bounds and the residual population kurtosis is depicted in Fig. 8. A second-order polynomial has been fitted to the experimental data to provide a better understanding of the non-linear increase in the number of required check points when errors come from a more leptokurtic data-set. The finding is logical, since high kurtosis implies that more errors are likely to be outliers, and so more check points are required for the construction of a reliable confidence interval. It should be stressed that when less stringent confidence levels are required (for instance, reducing confidence from 95 to 90%, as illustrated in Fig. 8), a smaller number of check points will be needed to construct a reliable interval according to the chosen confidence level. Kurtosis therefore seems to be the key to determining the number of check points required to achieve a certain confidence level in the uncertainty of the accuracy estimate. From these results, further work should be undertaken to determine some morphological terrain and landscape indicators (terrain roughness, landscape spatial distribution and arrangement of features such as buildings and vegetation) to help predict the expected degree of kurtosis in the error budget after applying lidar data filtering and gridding processes.

Currently, the ASPRS guidelines specify that accuracies should be reported at the 95% confidence level. However, some researchers report applications related to geopositioning tasks which typically demand a 90% or even lower confidence level (Rodarmel et al., 2006). Therefore, rather than requiring 95%, the proposed method allows the user to determine the confidence level depending on the application of the data. Changing the confidence level is as simple as changing the significance level in the developed equations. Fig. 9 clearly shows that fewer check points are required when the degree of confidence is relaxed, although the relationship is not linear. In fact, in the worst case (maximum values in Fig. 9), the number of check points required for computing rmse bounds at the 95% confidence level would be 160, whilst it would be only 110 for an 80% confidence level.

[Figure 9: plot not reproduced. Horizontal axis: confidence level (75% to 100%); vertical axis: check points needed to obtain the required confidence level; series: maximum, minimum and mean values.]

Fig. 9. Relationship between the number of check points needed to reach a determined confidence level and the desired confidence level. Maximum, minimum and mean values extracted from the 15 EuroSDR samples and for every confidence level tested.

Conclusions

From the results obtained through this research the following conclusions can be drawn:

(1) The model proposed to compute confidence intervals for lidar vertical error has proved to be advantageous compared to the methodology recommended by the ASPRS Lidar Committee for non-open terrain (non-normal error distribution). The proposed methodology works quite well under open and non-open terrain conditions because input parameters such as skewness and kurtosis allow the model to capture the statistical nature of the error data-set. Furthermore, reliable and narrow confidence intervals can be calculated when working from as few as 60 check points. This is because interval length depends basically on the sample standard deviation, skewness and standardised kurtosis, rather than on the number of check points used.
(2) The proposed methodology not only estimates confidence intervals for lidar vertical error, but also computes uncertainty in rmse estimates, which is usually better understood by DEM users and data providers. In this case the model developed by Aguilar et al. (2007a) is recommended when lidar error in non-open terrain is evaluated. Depending on the kurtosis of the error population, between 60 and 160 check points (around 100 on average) could be required. However, lidar accuracy assessment via rmse in open terrain (normal error distribution) would be more successfully achieved by means of the Student's t approach because it needs fewer check points than the proposed model. In fact, only 20 to 40 check points would be required in this case.

(3) Bearing in mind the variability of the data-sets tested, the results have proved to be very promising, the proposed models reproducing the statistical behaviour of lidar vertical errors using a reasonably low number of check points, even in the case of the more non-normally distributed residual data-sets. This research can therefore be considered an important step towards improving and understanding the quality control of lidar-derived DEMs.

Acknowledgments

The authors are grateful to the Spanish Government for financing the first author's fellowship at Newcastle University through the Spanish Research Mobility Programme "Estancias de profesores e investigadores españoles en centros de enseñanza superior e investigación extranjeros". Thanks are due to the NERC ARSF and Pauline Miller, Newcastle University, for assisting with the data from Filey Bay and also to ISPRS WG III/3 "3D Reconstruction from Airborne Laser Scanner and InSAR Data" for facilitating access to the EuroSDR lidar data.

References

Aguilar, F. J., Agüera, F., Aguilar, M. A. and Carvajal, F., 2005. Effects of terrain morphology, sampling density, and interpolation methods on grid DEM accuracy. Photogrammetric Engineering & Remote Sensing, 71(7): 805–816.
Aguilar, F. J., Aguilar, M. A., Agüera, F. and Sánchez, J., 2006. The accuracy of grid digital elevation models linearly constructed from scattered sample data. International Journal of Geographical Information Science, 20(2): 169–192.
Aguilar, F. J., Aguilar, M. A. and Agüera, F., 2007a. Accuracy assessment of digital elevation models using a non-parametric approach. International Journal of Geographical Information Science, 21(6): 667–686.
Aguilar, F. J., Agüera, F. and Aguilar, M. A., 2007b. A theoretical approach to modeling the accuracy assessment of digital elevation models. Photogrammetric Engineering & Remote Sensing, 73(12): 1367–1380.
Axelsson, P., 1999. Processing of laser scanner data—algorithms and applications. ISPRS Journal of Photogrammetry and Remote Sensing, 54(2/3): 138–147.
Axelsson, P., 2000. DEM generation from laser scanner data using adaptive TIN models. International Archives of Photogrammetry and Remote Sensing, 33(B4): 110–117.
Clark, M. L., Clark, D. B. and Roberts, D. A., 2004. Small-footprint lidar estimation of sub-canopy elevation and tree height in a tropical rain forest landscape. Remote Sensing of Environment, 91(1): 68–89.
Daniel, C. and Tennant, K., 2001. DEM quality assessment. Digital Elevation Model Technologies and Applications: The DEM Users Manual (Ed. D. F. Maune). American Society for Photogrammetry and Remote Sensing, Bethesda, Maryland. 539 pages: 395–440.
FGDC, 1998. Geospatial positioning accuracy standards. National standards for spatial data accuracy. http://www.fgdc.gov/standards/projects/FGDC-standards-projects/accuracy/part3/chapter3 [Accessed: 23rd August 2007].
Fisher, P. F. and Tate, N. J., 2006. Causes and consequences of error in digital elevation models. Progress in Physical Geography, 30(4): 467–489.
Flood, M., 2004. ASPRS guidelines. Vertical accuracy reporting for lidar data. http://www.asprs.org/society/divisions/ppd/standards/Lidar%20guidelines.pdf [Accessed: 23rd August 2007].
Godambe, V. P. (Ed.), 1991. Estimating Functions. Oxford University Press, Oxford. 356 pages.
Godambe, V. P. and Thompson, M. E., 1989. An extension of quasi-likelihood estimation. Journal of Statistical Planning and Inference, 22(2): 137–152.
Goodwin, N. R., Coops, N. C. and Culvenor, D. S., 2006. Assessment of forest structure with airborne LiDAR and the effects of platform altitude. Remote Sensing of Environment, 103(2): 140–152.
Göpfert, J. and Heipke, C., 2006. Assessment of LiDAR DTM accuracy in coastal vegetated areas. International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences, 36(3): 79–85.
Hopkinson, C., Chasmer, L. E., Zsigovics, G., Creed, I. F., Sitar, M., Treitz, P. and Maher, R. V., 2004. Errors in LiDAR ground elevation and wetland vegetation height estimates. International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences, 36(8/W2): 108–113.
Hyyppä, H., Yu, X., Hyyppä, J., Kaartinen, H., Kaasalainen, S., Honkavaara, E. and Rönnholm, P., 2005. Factors affecting the quality of DTM generation in forested areas. International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences, 36(3/W19): 85–90.
James, T. D., Murray, T., Barrand, N. E. and Barr, S. L., 2006. Extracting photogrammetric ground control from lidar DEMs for change detection. Photogrammetric Record, 21(116): 312–328.
Kraus, K. and Pfeifer, N., 1998. Determination of terrain models in wooded areas with airborne laser scanner data. ISPRS Journal of Photogrammetry and Remote Sensing, 53(4): 193–203.
Lim, K., Treitz, P., Wulder, M., St-Onge, B. and Flood, M., 2003. LiDAR remote sensing of forest structure. Progress in Physical Geography, 27(1): 88–106.
Miller, P., Mills, J., Edwards, S., Bryan, P., Marsh, S., Hobbs, P. and Mitchell, H., 2007. A robust surface matching technique for integrated monitoring of coastal geohazards. Marine Geodesy, 30(1/2): 109–123.
Oort, P. A. J. van and Bregt, A. K., 2005. Do users ignore spatial data quality? A decision-theoretic perspective. Risk Analysis, 25(6): 1599–1610.
Rodarmel, C., Theiss, H., Johanesen, T. and Samberg, A., 2006. A review of the ASPRS guidelines for the reporting of horizontal and vertical accuracy in lidar data. International Archives of Photogrammetry, Remote Sensing and Spatial Information Sciences, 36(1): 6 pages (on CD-ROM).
Royston, J. P., 1982. Expected normal order statistics (exact and approximate). Applied Statistics, 31(2): 161–165.
Sithole, G. and Vosselman, G., 2003. Report: ISPRS comparison of filters. http://www.itc.nl/isprswgIII-3/filtertest/Report05082003.pdf [Accessed: 23rd August 2007].
Sithole, G. and Vosselman, G., 2004. Experimental comparison of filter algorithms for bare-earth extraction from airborne laser scanning points clouds. ISPRS Journal of Photogrammetry and Remote Sensing, 59(1–2): 85–101.
Smith, S. L., Holland, D. A. and Longley, P. A., 2005. Quantifying interpolation errors in urban airborne laser scanning models. Geographical Analysis, 37(2): 200–224.
Steel, R. G. D. and Torrie, J. H., 1980. Principles and Procedures of Statistics: A Biometrical Approach. Second edition. McGraw-Hill, New York. 631 pages.
Su, J. and Bork, E., 2006. Influence of vegetation, slope, and lidar sampling angle on DEM accuracy. Photogrammetric Engineering & Remote Sensing, 72(11): 1265–1274.
Torlegård, K., Östman, A. and Lindgren, R., 1986. A comparative test of photogrammetrically sampled digital elevation models. Photogrammetria, 41(1): 1–16.
Weir, M. J. C., 2002. Monte Carlo simulation of long-term spatial error propagation in forestry databases. Spatial Data Quality (Eds. W. Shi, P. F. Fisher and M. F. Goodchild). Taylor & Francis, London. 336 pages: 294–303.

Appendix

An estimating equation for parameter $\theta$ is an equation $g(X, \theta) = 0$ which has a unique solution $\theta = \theta(X)$ for every value of $X$, $X$ being a random variable. In this case, $X$ is a set of $N$ independent residuals computed at corresponding check points for evaluating lidar-derived DEM accuracy. The function $g$ is named the estimating function. An optimal estimation of $\theta$ can be achieved by considering the class of estimating functions with zero expectation given by the following linear combination:

$$g = a h_1 + b h_2 \qquad (A1)$$

where $h_1$ and $h_2$ are real functions which would be computed depending on the underlying nature of the statistical problem, and $a$ and $b$ are real functions that can be estimated by the locally optimal solution found by Godambe and Thompson (1989) and shown in equation (A2). For the correct application of this optimal solution, $E(h_1 h_2)$ should be equal to zero ($E$ expressing the mathematical expectation of the operation in parentheses). That is to say, $h_1$ and $h_2$ must be orthogonal. A more detailed discussion on the theory of estimating functions, together with numerous applications, can be found in Godambe (1991).

$$a^{*} = \frac{E\left(\partial h_1 / \partial \theta\right)}{E(h_1^2)}, \qquad b^{*} = \frac{E\left(\partial h_2 / \partial \theta\right)}{E(h_2^2)} \qquad (A2)$$

In this case, the parameter $\theta$ would be the expected or mean value of the sample mean $\bar{x}$, namely $\mu$. The basic estimating functions, supposing the third and fourth central moments are known (skewness and kurtosis, respectively), would be (Godambe and Thompson, 1989)

$$h_1 = \bar{x} - \mu, \qquad h_2 = (\bar{x} - \mu)^2 - \sigma_m^2 - \gamma_{1m}\,\sigma_m\,(\bar{x} - \mu) \qquad (A3)$$

where $\sigma_m$ and $\gamma_{1m}$ are the standard deviation and the skewness of the sampling distribution of the sample mean, respectively. It can be easily proved that $h_1$ and $h_2$ are orthogonal functions, as stated before.
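The orthogonality of the two estimating functions in equation (A3) can be checked in a few lines. The sketch below assumes the usual definitions for the sampling distribution of the sample mean, namely $E(\bar{x} - \mu) = 0$, $E[(\bar{x} - \mu)^2] = \sigma_m^2$ and $\gamma_{1m} = E[(\bar{x} - \mu)^3] / \sigma_m^3$; these definitions are standard but are not written out in the text.

```latex
\begin{aligned}
E(h_1 h_2)
  &= E\!\left[(\bar{x}-\mu)\left\{(\bar{x}-\mu)^2 - \sigma_m^2
       - \gamma_{1m}\,\sigma_m\,(\bar{x}-\mu)\right\}\right] \\
  &= E\!\left[(\bar{x}-\mu)^3\right] - \sigma_m^2\,E(\bar{x}-\mu)
       - \gamma_{1m}\,\sigma_m\,E\!\left[(\bar{x}-\mu)^2\right] \\
  &= \gamma_{1m}\,\sigma_m^3 - 0 - \gamma_{1m}\,\sigma_m^3 \\
  &= 0 .
\end{aligned}
```

The skewness term in $h_2$ is therefore exactly what is needed to cancel the third central moment, which is why it appears in the definition.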
Therefore, the estimates of the real functions $a$ and $b$ can be obtained by means of equation (A2):

$$a^{*} = -\frac{1}{\sigma_m^2}, \qquad b^{*} = \frac{\gamma_{1m}\,\sigma_m}{\sigma_m^4\left(\gamma_{2m} + 2 - \gamma_{1m}^2\right)}, \qquad \sigma_m = \frac{\sigma}{\sqrt{N}} \qquad (A4)$$

Hence the estimating function $g$ for $\mu$ would be estimated by $g^{*} = a^{*} h_1 + b^{*} h_2$. Given that the distribution of the statistic $g^{*}/\sigma_{g^{*}}$ can be approximated by a standardised normal distribution, the following expression can be written:

$$P\left(\frac{g^{*}}{\sigma_{g^{*}}} \le Z_{\alpha}\right) = 1 - \alpha \qquad (A5)$$

where $Z_{\alpha}$ is the one-tailed critical value corresponding to a confidence level of $1 - \alpha$. Hence $Z_{\alpha} = 1.64$ for $\alpha = 0.05$. Equations (2) and (3) in the main text can be obtained from equation (A5) by solving a quadratic equation for the variable $\bar{x} - \mu$. The reader is referred to Godambe and Thompson (1989) for further details.

Résumé (translated from French)

Despite the relatively high cost of digital elevation models (DEMs) obtained from airborne lidar, these products are generally supplied without a suitable estimate of their accuracy. In most cases, the DEM accuracy estimate is based on a comparison between the lidar-derived elevations and those at a limited number of check points whose coordinates come from an independent source of higher accuracy; it is also assumed that the distribution of the elevation errors and differences follows a normal law. This paper proposes a new methodology for assessing the vertical accuracy of lidar-derived DEMs using confidence intervals constructed from a finite sample of errors computed at the check points. A non-parametric solution was tested in which no particular error distribution is assumed, which allows the proposed methodology to be applied when the error distributions are not normal, as is generally the case for lidar-derived DEMs. The quality of the proposed modelling was tested using a Monte Carlo experimental simulation on 18 vertical error data-sets. Fifteen of these data-sets were generated from the original lidar data provided by Working Group III/3 of the International Society for Photogrammetry and Remote Sensing, using their manually filtered reference data as ground truth. The last three data-sets were provided by the lidar system of the Natural Environment Research Council and its Airborne Research and Survey Facility, with check points determined by a high-precision kinematic GPS system. The results proved promising, the proposed models reproducing well the statistical behaviour of the lidar vertical errors for a suitable number of check points, including in cases where the data-sets do not have normally distributed residuals. This study can therefore be considered an important advance towards improving the quality control of lidar-derived DEMs.

Zusammenfassung (translated from German)

Despite the relatively high cost of digital elevation models (DEMs) from airborne lidar, these products are delivered without a satisfactory accuracy estimate. In most cases the accuracy of the elevation model is derived by comparing a selection of check point heights from independent measurements of higher accuracy with the heights from the lidar data. A normal distribution of the derived height differences, or height errors, is assumed. This paper proposes a new method for estimating the vertical accuracy of a lidar elevation model that uses confidence intervals derived from a sample of errors at check points. A non-parametric approach was tested in which no particular error distribution was assumed. The proposed method thus becomes particularly applicable to the non-normally distributed errors that typically occur in elevation models from lidar. The behaviour of the proposed model was examined empirically with a Monte Carlo simulation on 18 vertical error data-sets. Fifteen of these data-sets were computed from the original lidar data of Working Group III/3 of the International Society for Photogrammetry and Remote Sensing, with the associated filtered reference data taken as the true values. The remaining three data-sets were provided by the lidar system of the Natural Environment Research Council's Airborne Research and Survey Facility, together with check points determined with high-precision kinematic GPS. The results were very promising. The proposed method reproduced the statistical behaviour of the vertical errors of the lidar data with a favourable number of check points, even in the data-sets with non-normally distributed residuals. These research results can therefore be regarded as an important step towards improving the quality control of elevation models from lidar data.

Resumen (translated from Spanish)

Despite the relatively high cost of digital elevation models (DEMs) obtained by airborne lidar, these products are usually supplied without an adequate estimate of their accuracy. In most cases the accuracy of a DEM is estimated by comparing the lidar elevations with a finite sample of check points from an independent reference source of higher accuracy. In these cases it is usually assumed that the statistical distribution of the height differences, or vertical error, is normal. This paper proposes a new methodology for estimating the vertical accuracy of a lidar DEM by means of confidence intervals constructed from a finite sample of errors computed at the check points. To this end, a non-parametric perspective was adopted, without assuming any specific error distribution, so that the proposed methodology is particularly applicable to non-normal distributions such as those usually found in lidar DEMs. The goodness of the proposed model was validated experimentally on 18 lidar data-sets by means of a numerical simulation based on the Monte Carlo method. Fifteen of the 18 error populations were obtained from raw lidar data supplied by Working Group III/3 of the International Society for Photogrammetry and Remote Sensing, using their respective manually filtered reference data for validation. The remaining three error populations were provided by the lidar system of the Airborne Research and Survey Facility of the Natural Environment Research Council, together with check points acquired by high-precision kinematic GPS. The results obtained proved very promising: the proposed models accurately reproduced the statistical behaviour of the lidar vertical errors from a relatively small sample of check points, even in cases where the data had non-normally distributed residuals. This research can therefore be considered a very important advance in improving the quality control process for lidar DEMs.

The Photogrammetric Record. © 2008 The Authors. Journal Compilation © 2008 The Remote Sensing and Photogrammetry Society and Blackwell Publishing Ltd.
