Scaling Field Data to Calibrate and Validate Moderate Spatial Resolution Remote Sensing Models

A. Baccini, M.A. Friedl, C.E. Woodcock, and Z. Zhu

Abstract

Validation and calibration are essential components of nearly all remote sensing-based studies. In both cases, ground measurements are collected and then related to the remote sensing observations or model results. In many situations, and particularly in studies that use moderate resolution remote sensing, a mismatch exists between the sensor's field of view and the scale at which in situ measurements are collected. The use of in situ measurements for model calibration and validation, therefore, requires a robust and defensible method to spatially aggregate ground measurements to the scale at which the remotely sensed data are acquired. This paper examines this challenge and specifically considers two different approaches for aggregating field measurements to match the spatial resolution of moderate spatial resolution remote sensing data: (a) landscape stratification; and (b) averaging of fine spatial resolution maps. The results show that an empirically estimated stratification based on a regression tree method provides a statistically defensible and operational basis for performing this type of procedure.

Introduction

Remote sensing is used routinely to map vegetation properties over large areas. To calibrate remote sensing-based models and validate model results, field data are required (Ardo, 1992; Gemmell, 1995; Wulder, 1998; Franklin, 1986; Cohen and Spies, 1992; Danson and Curran, 1993; Puhr and Donoghue, 2000; Cohen et al., 2001; Cohen et al., 2003). For practical reasons, however, the number of data and the area sampled on the ground through fieldwork tends to be quite small. Indeed, the need to limit the cost of fieldwork over large areas often results in relatively small areas being sampled using field plots that are smaller than 1 ha. Thus, a central challenge in using field data in remote sensing-based studies is ensuring that the in situ measurements provide an appropriate and representative sample in support of the research or mapping goals.

The above problem is particularly acute in studies that encompass large areas using moderate spatial resolution remotely sensed data. Because moderate spatial resolution sensors provide data with spatial resolutions on the order of 100s of meters, the ground area covered by a pixel typically consists of a mosaic of land-cover types. Thus, the radiance measured by the sensor reflects the combined properties of this mosaic. In this situation, it is important to consider how the area sampled on the ground is related to the instantaneous-field-of-view measured by remote sensing. Specifically, to use plot-scale data to calibrate or validate moderate spatial resolution remote sensing models, it is necessary to upscale the field data to match the spatial resolution of the remote sensing observations.

Despite its importance, the above issue has received relatively little attention. Recently, however, a variety of papers have identified it as a key problem, particularly in regards to validation activities. For example, to minimize the effect of mixed pixels in validation efforts for products generated from the Moderate Resolution Imaging Spectroradiometer (MODIS), Milne and Cohen (1999) suggested locating field plots in areas with low spatial variance in Landsat data. In a related study conducted in Botswana, Tian et al. (2002) addressed the problem of validating estimates of leaf area index (LAI) derived from MODIS using field plots ranging from 34 to 2,756 m². Because the scale at which the field data were collected was not consistent with the spatial resolution of MODIS data, a direct comparison was not feasible. As a solution, Tian et al. (2002) used Landsat data to identify homogeneous areas on the basis of spectral similarity and adjacency using a segmentation algorithm. These regions were then used to aggregate the field data to the scale of the MODIS pixels.

The research reported in this paper extends these efforts and specifically considers the following question: how can data collected at the scale of field plots be aggregated and related to remote sensing measurements collected at much coarser spatial resolutions such that fine-scale spatial variation in surface properties does not corrupt or reduce the validity of the analysis? To answer this question, we examine several methods to spatially aggregate biophysical measurements collected in field surveys to the scale of moderate spatial resolution remotely sensed data. Here, we examine the specific problem of mapping biomass. However, the general issues and

PHOTOGRAMMETRIC ENGINEERING

Schroeder et al., 1997) and provides an accurate approximation to timber volume which is normally calculated with a similar allometric equation that accounts for tree shape.

Remote Sensing Data

We used two main sources of remote sensing data. First, Landsat Enhanced Thematic Mapper (ETM+) data were used to characterize landscape-scale spatial variation in forest composition and structure. Second, moderate spatial resolution (1 km²) data were provided by MODIS.

A set of three Landsat ETM+ images acquired on 06 July 1999, 30 January 2000, and 08 May 2001 were used. Preprocessing of these images was performed as part of the Multi-Resolution Land Characteristics Consortium (MRLC) at the USGS EROS Data Center (Irish, 2000). To exploit information related to seasonal vegetation dynamics, separate images from spring, summer, and winter were used. These images were radiometrically and geometrically corrected according to methods developed at the USGS EROS Data Center (Irish, 2000). Geometric rectification was performed (including terrain correction) using the USGS 1-arc-second National Elevation Dataset.

For each Landsat ETM+ scene, the MRLC provides images based on the Tasselled Cap transformation (brightness, greenness, and wetness), in addition to the raw spectral data. To reduce the dimensionality of the data set, principal components were computed using spectral data from all three dates using all of the Landsat ETM+ bands, excluding the thermal band. The first six components, which explained 96 percent of the variance in the data, were used in the analysis presented below.

The MODIS data used for this work consisted of 1 km nadir bidirectional reflectance distribution function (BRDF)-adjusted surface reflectance (NBAR). Each NBAR image provides surface reflectance data that have been normalized to a consistent nadir view geometry and are atmospherically corrected, cloud-cleared, and representative of 16-day periods. Complete details regarding the MODIS NBAR product are provided in Schaaf et al. (2002).

Analysis

The conceptual approach that was used in this study is presented in Figure 1. The basic challenge that we address is how to most accurately scale data collected in the field to match remotely sensed measurements made at much coarser spatial resolutions: in this case, 1 km MODIS data. As we

946 August 2007 PHOTOGRAMMETRIC ENGINEERING

Box, 1981; Walter, 1979; Woodward, 1987; Prentice et al., 1992). Similarly, climate and topographic variables have been used widely in association with remotely sensed data to increase the accuracy of maps of vegetation composition and structure (Strahler, 1980). For example, topographic data (elevation, slope, and transformed aspect) have been used as a surrogate for microclimate and combined with Landsat data to map forest vegetation in California (Franklin et al., 1986; Woodcock et al., 1994; Franklin et al., 2000). Having said this, the relationship between forest properties and climate, elevation, and spectral information is complex and often non-linear (Baccini et al., 2004). As a result, parametric statistical methods often fail to characterize the magnitude and form of relationships among these variables.

For this work, we used the approach described by Michaelsen et al. (1994), where regression tree analysis is used to produce a landscape stratification using remote sensing, climate, and terrain variables. Specifically, we used biomass as the response variable and ETM+, climate and topographic data as predictors. Because all of the predictor variables were available in a spatially continuous fashion (i.e., everywhere in the study region), the resulting regression tree was used as a basis for landscape stratification following the method described by Michaelsen et al. (1994). The results from this analysis were then used to aggregate the FIA plot data to the scale of 1 km² MODIS pixels following the area-weighted averaging procedure previously described.

Results

Regression Model Results

To assess the OLS model approach, we split the available FIA data into two sets, where roughly 23 percent of the data (101 sites) were selected at random and used as independent test data. Results from the univariate regression analysis using NDVI derived from ETM+ data (spring date) show that NDVI explained only 24 percent of the variance in biomass with a root-mean-squared-error (RMSE) = 81 tonnes ha⁻¹ for the independent data set. A multiple linear regression explained 47 percent of the variance (RMSE = 71.9 tonnes ha⁻¹) when applied to the independent data set. However, four out of the nine predictor variables possessed a VIF larger than 10, which indicates a high degree of multicollinearity in the model. It is therefore likely that the estimated R² is inflated. Figure 2 presents scatterplots of observed versus predicted biomass for both regression models for the independent data set and clearly illustrates the level of uncertainty in these models. Perhaps more importantly, inspection of diagnostic plots (Figure 3) for the estimated models reveals non-constant variance in the model residuals, a serious violation of model assumptions that makes prediction on unseen cases using this model problematic. We, therefore, do not consider OLS methods further.

Regression Tree Analysis

The regression tree analysis was performed using ETM+, elevation, and climate data to predict biomass. To determine the optimal tree size, two different analyses were performed. First, a 10-fold cross-validation procedure was used in which large regression trees were grown and successively pruned back. The optimal tree was selected based on the tree size that explained the most variance in unseen data. Second, we visually inspected stratifications produced using trees of different sizes.
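The tree-size selection described above can be illustrated with a small, self-contained sketch. It is not the authors' implementation. As a simplification, it grows each tree best-first up to a target number of terminal nodes instead of growing a large tree and pruning it back, and it uses synthetic "plots" with two made-up predictors. All function names, the `min_leaf` setting, and the data are hypothetical.

```python
import random

def sse(ys):
    """Sum of squared deviations from the mean (0 for an empty list)."""
    if not ys:
        return 0.0
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys)

def best_split(rows, min_leaf):
    """Best binary split of rows [(x_tuple, y), ...] as (gain, feature, threshold)."""
    best = None
    parent = sse([y for _, y in rows])
    for f in range(len(rows[0][0])):
        values = sorted({x[f] for x, _ in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for x, y in rows if x[f] <= t]
            right = [y for x, y in rows if x[f] > t]
            if len(left) < min_leaf or len(right) < min_leaf:
                continue
            gain = parent - sse(left) - sse(right)
            if best is None or gain > best[0]:
                best = (gain, f, t)
    return best

def make_leaf(rows):
    return {"rows": rows, "pred": sum(y for _, y in rows) / len(rows)}

def grow_tree(rows, n_leaves, min_leaf=5):
    """Best-first growth: always split the leaf with the largest SSE reduction."""
    root = make_leaf(rows)
    leaves = [root]
    while len(leaves) < n_leaves:
        scored = [(best_split(leaf["rows"], min_leaf), leaf) for leaf in leaves]
        scored = [(s, leaf) for s, leaf in scored if s is not None]
        if not scored:
            break  # no leaf can be split further
        (_, f, t), leaf = max(scored, key=lambda pair: pair[0][0])
        leaf["split"] = (f, t)
        leaf["left"] = make_leaf([r for r in leaf["rows"] if r[0][f] <= t])
        leaf["right"] = make_leaf([r for r in leaf["rows"] if r[0][f] > t])
        leaves = [l for l in leaves if l is not leaf] + [leaf["left"], leaf["right"]]
    return root

def predict(node, x):
    while "split" in node:
        f, t = node["split"]
        node = node["left"] if x[f] <= t else node["right"]
    return node["pred"]

def cv_explained_variance(rows, n_leaves, k=10, seed=0):
    """Fraction of variance explained on held-out folds (k-fold cross-validation)."""
    idx = list(range(len(rows)))
    random.Random(seed).shuffle(idx)
    total = sse([y for _, y in rows])
    press = 0.0
    for fold in (idx[i::k] for i in range(k)):
        held = set(fold)
        tree = grow_tree([rows[i] for i in idx if i not in held], n_leaves)
        press += sum((rows[i][1] - predict(tree, rows[i][0])) ** 2 for i in fold)
    return 1 - press / total

# Synthetic plots: biomass depends on two hypothetical predictors plus noise.
rng = random.Random(42)
plots = []
for _ in range(120):
    x = (rng.random(), rng.random())
    base = 50.0 if x[0] < 0.5 else (120.0 if x[1] < 0.5 else 200.0)
    plots.append((x, base + rng.gauss(0, 10)))

scores = {n: cv_explained_variance(plots, n) for n in (1, 2, 3, 5)}
best = max(scores, key=scores.get)
print(scores, "selected size:", best)
```

The selected size is the one with the highest held-out explained variance, which mirrors the criterion stated in the text. A pruning-based implementation would compare the same kind of cross-validated scores across the sequence of pruned subtrees.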

In both cases, the results were tested using a leave-one-out cross-validation approach. Table 1 presents results from this analysis for tree sizes of 2, 4, 8, 12, and 17 terminal nodes. For comparison, we include results from a one-way analysis of variance where the data were stratified using the vegetation map. The resulting regression trees explained 29 to 40 percent of the variance in the unseen data. According to these results, the optimal tree size was eight strata. It is interesting to notice that the regression tree approach, even with a smaller number of strata compared to the vegetation stratification (for example: 8 strata versus 10 strata), explains more variance than the vegetation map approach. A visual inspection of the strata map derived from the regression tree stratification suggested that a stratification with 17 strata captured important additional spatial variability, with relatively little loss in explanatory power.

Comparison of Stratification Methods

To evaluate the effectiveness of the regression tree approach versus using the existing LANDFIRE vegetation map as a basis for stratification, we computed the area-weighted mean, the RMSE, and the area-weighted variance of the mean following the approach used by Davis et al. (1992). Results from this comparison are presented in Table 2. For the regression tree analysis we present results for two sets of data: one that was used for estimating the regression tree, and a second data set that was held out and used as an independent basis for evaluating the effectiveness of the regression tree approach. This was not required for the stratification based on the LANDFIRE map. To allow comparison of results across methods we, therefore, present these results for three distinct data sets: (a) the entire data set, (b) the training set, and (c) the test set.

The results presented in Table 2 show that the estimated variance in the mean is 14 tonnes ha⁻¹ for the regression tree stratification, and about 30 tonnes ha⁻¹ for the stratification based on the vegetation map. These results, in combination with those presented in Table 2 suggest that the regression tree approach provides a significantly more accurate stratification with respect to biomass. Note that two strata are missing in the test set for the regression tree stratification, which likely explains the difference in the area-weighted average between the training and testing sets. In particular, the high biomass stratum is not present in the test set, which biases the estimated average. Note also that the area-weighted means are systematically lower when estimated using the regression tree stratification. The key result, however, is that the uncertainty in the mean based on the regression tree stratification is reduced relative to that estimated using the vegetation map.

Table 3 reports the mean and standard deviation in biomass along with the vegetation classes included and the number of observations (field plots) for each biomass stratum identified by the regression tree.
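To make the area-weighted mean and the variance of that mean concrete, here is a minimal sketch using the textbook stratified-sampling estimators (mean = Σ W_h·ȳ_h, variance = Σ W_h²·s_h²/n_h, with W_h the area fraction of stratum h). Whether this matches the exact formulation of Davis et al. (1992) is an assumption. The function name, strata, area fractions, and biomass values below are all hypothetical.

```python
from statistics import mean, variance

def stratified_mean_and_variance(strata):
    """Area-weighted mean and its estimated variance.

    strata: list of (area, plot_values) pairs, where area is the area
    (or area fraction) of the stratum and plot_values are the field-plot
    measurements that fall in it (at least two plots per stratum).
    """
    total_area = sum(area for area, _ in strata)
    est_mean = 0.0
    est_var = 0.0
    for area, plots in strata:
        w = area / total_area                      # stratum weight W_h
        est_mean += w * mean(plots)                # W_h * stratum mean
        est_var += w ** 2 * variance(plots) / len(plots)  # W_h^2 * s_h^2 / n_h
    return est_mean, est_var

# Hypothetical strata: (area fraction, plot biomass in tonnes per ha)
strata = [
    (0.5, [40, 60, 50, 50]),
    (0.3, [140, 160, 150, 150]),
    (0.2, [290, 310, 300, 300]),
]
aw_mean, aw_var = stratified_mean_and_variance(strata)

# Same plots treated as one unstratified sample, for comparison
pooled = [v for _, plots in strata for v in plots]
pooled_mean, pooled_var = stratified_mean_and_variance([(1.0, pooled)])

print(aw_mean, aw_var)
print(pooled_mean, pooled_var)
```

In this toy case the strata are internally homogeneous, so the variance of the stratified mean is far smaller than that of the pooled sample. This is the same effect the comparison in the text attributes to a stratification that explains more of the variance in biomass. The pooled mean also differs from the area-weighted mean because plot counts are not proportional to stratum area.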

Table 4 reports the percent area occupied, the average biomass, and the number of plots within each vegetation class. The fact that more than one vegetation class is embedded in all except one stratum identified by the regression tree demonstrates that different vegetation classes have approximately the same biomass (and vice versa), which reduces the utility of the vegetation map as a basis for stratifying landscapes with regards to biomass. In contrast, the regression tree-based stratification is specifically optimized to explain variance in biomass and, therefore, should provide a

TABLE 2. This table shows for the training and independent test data: the number of field plots, analysis method, the number of biomass strata, the estimate of error variance for the mean, the area weighted average, the percentage of error calculated on the total mean, and the estimate of the variance of the area weighted mean. The unweighted mean for the FIA data set is 146 (tonnes ha⁻¹) with a standard deviation of 112 (tonnes ha⁻¹).

Data           Number     Stratification   Number      RMSE             Area Weighted   % Error   Variance of
               of Plots   Method (a)       of Strata   (tonnes ha⁻¹)    Average                   A.W. Mean
Training set   336        RT               17          60.5             140             44        13.5
Test set       101        RT               15          81.8             126             55        14.6
Training set   336        VM               10          97.9             163             61        32.4
Test set       101        VM               9           129.5            131             66        24.4
Train + Test   437        RT               17          61.0             136             50        16.2
Train + Test   437        VM               10          95.38            157             61        30.45

(a) RT indicates stratification based on the regression tree approach; VM indicates stratification derived from the vegetation map.
