An approximate point-based alternative for the estimation of variance under big BAF sampling

2021-10-12 08:11:24ThomasLynchJeffreyGoveTimothyGregoireandMarkDucey

Forest Ecosystems 2021年3期

Thomas B.Lynch ,Jeffrey H.Gove ,Timothy G.Gregoire and Mark J.Ducey

Abstract Background:A new variance estimator is derived and tested for big BAF(Basal Area Factor)sampling which is a forest inventory system that utilizes Bitterlich sampling(point sampling)with two BAF sizes,a small BAF for tree counts and a larger BAF on which tree measurements are made usually including DBHs and heights needed for volume estimation.Methods:The new estimator is derived using the Delta method from an existing formulation of the big BAF estimator as consisting of three sample means.The new formula is compared to existing big BAF estimators including a popular estimator based on Bruce’s formula.Results:Several computer simulation studies were conducted comparing the new variance estimator to all known variance estimators for big BAF currently in the forest inventory literature.In simulations the new estimator performed well and comparably to existing variance formulas.Conclusions:A possible advantage of the new estimator is that it does not require the assumption of negligible correlation between basal area counts on the small BAF factor and volume-basal area ratios based on the large BAF factor selection trees,an assumption required by all previous big BAF variance estimation formulas.Although this correlation was negligible on the simulation stands used in this study,it is conceivable that the correlation could be significant in some forest types,such as those in which the DBH-height relationship can be affected substantially by density perhaps through competition.We derived a formula that can be used to estimate the covariance between estimates of mean basal area and the ratio of estimates of mean volume and mean basal area.We also mathematically derived expressions for bias in the big BAF estimator that can be used to show the bias approaches zero in large samples on the order of where n is the number of sample points.

Keywords:Bitterlich sampling,Delta method,Double sampling,Estimator bias,Forest inventory,Horizontal point sampling,Variance of a product,Volume basal area ratio,Covariance estimation

Background

The big BAF (Basal Area Factor) estimator for forest inventory is based on horizontal point sampling (HPS)(also called Bitterlich sampling)using angle gauges having two different BAFs at each sample point in the field—the smaller BAF angle gauge BAFcis used to obtain a count of qualifying trees at each sample point and the larger BAF angle gauge BAFvis used to select sample trees on which careful measurements are made.Usually these careful measurements include measurement of DBH and height from which tree volume,weight or biomass can be estimated(Bell et al.1983;Bruce 1961;Oderwald and Jones 1992).Although big BAF sampling is a form of double sampling,as Marshall et al.(2004)indicated,it differs from some common forms of double sampling in forest inventory such as double sampling with regression estimators because both samples are taken at each point,so that the small sample is not simply a subset of the large sample point locations (discussions of double sampling in forest inventory include Gregoire and Valentine(2008,p.262)and de Vries(1986,p.164)among others).

As indicated by Iles(2012)the history of big BAF sampling may go back to Grosenbaugh(1952,p.53).An early proposal for using two prism factors in big BAF sampling was given by Bell et al.(1983,p.702) and later a detailed description of big BAF sampling was given by Marshall et al.(2004).A detailed review of the history of big BAF sampling was given a recent treatment by Gove et al.(2020)who compared variance estimation methods which have been proposed for the method.Recent texts of forest sampling and mensuration which include descriptions of big BAF sampling include Gregoire and Valentine(2008,p.268);Kershaw et al.(2016,p.377).

Big BAF methods have been used operationally to inventory forests in the USA and Canada in both western and eastern forest types (Corrin 1998;Desmarais 2002).Brooks (2006) compared combinations of 13“big”BAFs and 6“small”BAFs to an inventory using fixed-radius plots.Rice et al.(2014)compared a number of forest sampling methods including big BAF,HPS with various BAFs,horizontal line sampling and fixed-radius plot sampling.These studies were conducted in partial harvests in mixed species Acadian forests of northern Maine.A comparison of the results of these forest inventories showed that only the smallest BAF for HPS had a standard error smaller than the big BAF inventory.Methods for determination of optimal sampling plans for big BAF were described by Yang et al.(2017).These methods allow for choice of optimal combinations of BAFs and sample sizes for big BAF according to economic criteria.Chen et al.(2019)used these results to devise practical cost-efficient plans for estimation of forest carbon content using big BAF sampling for forest populations in the northeastern USA.Yang and Burkhart(2019)compared big BAF sampling to two other methods of subsampling count trees on point samples using simulated loblolly pine (Pinus taedaL.)plantations and found all three methods were satisfactory for estimating stand volume.

Despite the successes and evident utility of big BAF sampling variance estimation remains challenging.Moreover,the basic estimator associated with big BAF sampling is not,itself,design-unbiased.Gove et al.(2020) showed that one of the traditional methods for estimation of the variance in big BAF sampling could be derived using the Delta method if the covariance terms are assumed to be negligible.As described by Gove et al.(2020)the history of the Delta method,which is based on using a Taylor series approximation for nonlinear functions of random variables,has been traced by Ver Hoef (2012).Wolter(2007,p.231)states that utilization of the method with a first-order approximation as done in this article has often provided satisfactory variance estimates for large complex surveys.The primary objective of this study is use of the Delta method to derive a new variance estimation formula for big BAF sampling that takes correlations among important sampling variables into account.As indicated below the traditional approaches to variance estimation for big BAF sampling ignore possible covariances between count basal area obtained by using BAFcand the volume per square unit of basal area on trees sampled with BAFv.An additional objective of this study is to test this newly derived estimator using Monte Carlo simulations to compare it with the variance estimators that have been previously proposed for big BAF sampling.We also derive expressions for the bias associated with the big BAF estimator.

Basic big BAF estimation formulas

Two BAF factors are needed for big BAF,a small BAF factorFcused to select trees which are counted but not measured and a larger BAF factorFvused to select trees on which measurements are made usually including dbh and height for volume,weight or biomass estimation.

To obtain the big BAF estimator we first express the volume to basal area ratio(VBAR)for each treeiselected by the larger BAF,Fv:

whereviis the volume of treeiandbiis the basal area of treeiwhere there arei=1,2,...,mvstrees on each of a sample ofs=1,2,...,npoints(Kershaw et al.2016,p.377).The average VBAR is then(Gregoire and Valentine 2008,p.258):

The big BAF volume estimator can then be obtained by multiplying the sample mean VBAR by the count-based basal area as:

Gove et al.(2020) and Iles (2012) have indicated that the variance for the big BAF volume estimator cannot be obtained by using the traditional formula for double sampling from survey sampling theory and practice.This is because the smaller sample in big BAF sampling is not a smaller selection of the total number of sample points but instead the smaller sample actually occurs at each sample point,thus the point-wise sample sizes utilized by the sample survey double sampling formula are equal,which is contrary to the requirements of that formula.However,because the estimator above is a product of two random variables one may employ standard methods for expressing the variance of a product.By applying the formula of Goodman(1960)for the variance of a product as noted by Gove et al.(2020)one obtains:

This formula assumes that the covariance betweenandis zero or negligible for practical purposes.This may be expected to be true if VBARs are not greatly affected by density in the forest population being sampled.The variance estimators ofandare given by Gregoire and Valentine(2008,p.256–257)as:

Equivalent variance expressions are found in Kershaw et al.(2016,p.380).Standard error estimates are:

A simplified estimator of varcan be obatined by using the formula of Bruce (1961).Written in percent standard error form the equation of Bruce(1961)gives the following estimator:

As indicated by Gove et al.(2020);Marshall et al.(2004);Gregoire and Valentine(2008,p.259);Bell and Alexander(1957) were the first to present the version of the product variance for standard error computation presented above.Gove et al.(2020) discussed the historical background of this formula and show how it can be derived using the Delta method (Ver Hoef 2012)which is based on a Taylor’s series approximation and has often been used to approximate the variance of a function of one or more random variables Kendall and Stuart(1977,p.247).It has been noted by Gove et al.(2020);Gregoire and Valentine(2008,p.259);Marshall et al.(2004);Iles(2012)that there is close agreement between the variance derived from(11)and the variance derived from Goodman’s formula in Eq.6 because the third term in the latter equation is typically small and dominated by the other two terms.

Gregoire and Valentine (2008) (equation 8.33) have noted that the big BAF volume per hectare estimator can be formulated as follows:

This formulation of the Big BAF estimator is based on three sample means.Gregoire and Valentine (2008) discuss alternative estimators for big BAF sampling including estimators based on Bruce’s traditional formula(Bell and Alexander 1957)and the Goodman(1962)formula for the variance of products of random variables.

Eq.7(same as equation 6 of Gove et al.(2020))computes the variance of the mean volume basal area ratio as a mean of ratios.However the number of individual tree ratiosmvis a random variable.In the classic formulations of the mean of ratios estimator in the context of design-based sample survey sampling(Schreuder et al.1993,p.89)the number of sample ratios is fixed rather than random.As recognized by Gregoire and Valentine(2008,p.258–259)Eq.12 provides the opportunity to formulate the average volume basal area as the ratio of means becauseis the mean volume per sample point andis the mean basal area per sample point when the large basal area factorFvis used for tree selection

A classical estimate for the variance of this estimated ratio according to equation 6.13 in Cochran(1977,p.155)is:

As indicated by Sukhatme et al.(1984,p.99) this variance estimator is algebraically equivalent to the estimator suggested by Gregoire and Valentine (2008,p.259).The equation above could then be used as alternative to estimate the variance of the average VBAR in Goodman’s variance formula (6) instead of the more traditional formula(7)for the estimated variance of the average VBAR.Thus this formula is equivalent to equation (12) in Gove et al.(2020) and was included in the simulations presented there.Those simulations are replicated here in order to compare them to the results from a new equation described below and termed the point-wise Delta method.

Methods

Bias in the big BAF estimator

The big BAF estimator has been described above and in Gregoire and Valentine (2008) as based on three sample means.Two of the sample meansandprovide design-based estimates of basal area per acre while the thirdprovides a design-based estimate of volume per hectare,albeit typically with a high variance.However their combination forms the big BAF estimator with a much lower variance but which is not design-unbiased.Here we provide equations giving a simple approximation for the bias as well as an exact upper bound to the bias associated with big BAF sampling.

Approximate bias

In Appendix Eq.A.11 we use a bias approximation formula from Seber (1982,p.7) to derive the following approximation for the bias of the estimatorin big BAF sampling:

Exact bias

In the Appendix an expression for the exact bias in the big BAF sampling estimator,Eq.(A.15),is derived based on methods used by Hartley and Ross (1954) to find the exact bias of the standard ratio estimator(also see Cochran(1977,p.162))

This formula also seems to indicate that the bias will tend to be smaller in stands having higher basal area.Again Eq.A.21 was derived in the Appendix following the methods of Hartley and Ross(1954)resulting in the following upper bound on the absolute relative bias in the big BAF estimator(also see Cochran(1977,p.162)):

This formula indicates that the bias relative to the standard error of the big BAF estimator approaches zero as sample sizenbecomes large,on the order ofThis is also the case for the standard ratio estimator according to Cochran(1977,p.160).

The Delta method for big BAF variance based on three sample means

Previous approaches to variance estimation for big BAF sampling view the estimator as the product of two random variables,the count basal area per hectare and the mean volume basal area ratio.As indicated above,these approaches have assumed that the covariance between count basal area per hectare and the mean volume basal area ratio VBAR is negligible.However,if we do not wish to make that assumption,an alternative is to use the Delta method (Kendall and Stuart 1977,p.247),to approximate the variance of the big BAF estimator (12)in the form presented by Gregoire and Valentine (2008,equation 8.33)indicated above as a function of three sample means.On the basis of a Taylor series,the Delta method approximates the variance of a function of estimators of parameterswhich estimatesg(θ)whereθ=(θ1,θ2,...,θn).Now,since the population parameters are generally unknown,the unbiased estimators,whereare substituted here in the formula for the Delta method presented by Kendall and Stuart (1977,p.247)viz.,

or,assuming independence...

In addition in typical applications it is necessary to estimate the variance and covariance terms.In this section we will assume without loss of generality thatA=1.Let us define the functiongin the formula for the Delta method withas follows:

The Delta method requires the following three partial derivatives:

Applying the Delta method and substituting estimates for variances,covariances and means we then have:

The estimated variance of the volume per a unit area based on the large BAF angle gauge alone is

which is the total volume at sample pointsand

The estimated variance for the basal area per hectare based on the large angle gaugeFvis

Now since Eq.24 utilizes covariance terms,we present the computational formulas for these.Recall the relationship between the sample covariance and the estimated covariance between sample means based onnindependent samples is:

Using this relationship the estimated covariance betweenis:

the estimated covariance betweenis:

and the estimated covariance betweenis:

As is shown in Supplementary Materials equations S.3–S.7 variance estimator (24) can also be derived as a special case of an estimator presented by Hansen et al.(1953,p.512–514)for the variance of a ratio between the product ofkrandom variables and the product ofp-krandom variables(Wolter 2007,p.233–234).

The variance equation can be simplified by noting that the two basal area estimatorshave the same expected valueBand the variance ofis likely to be smaller because it is based on the smaller BAFcwhich selects more trees per point.In the original true variance approximation the coefficients multiplied by variances and covariances are functions of parameters which we must estimate when obtaining the approximate variance estimator.This justifies substitution ofin the variance formula above because they have the same expectation.Making this substitution and factoring out sample sizen,the variance formula can be simplified to:

We have derived this variance estimation formula under the assumption thatA=1 so the variance estimate for an entire tract of areaAcan be obtained by multiplying byA2or alternatively expressingin total tract units rather than as per hectare.Note that because we have factored out a quantity ofthe estimator above is a function of the sample variances and covariances among sample point HPS estimates.

The variance estimator above is very similar to equation(12)of Palley and Horwitz(1961)which they obtained for the Bell and Alexander(1957)estimator which was essentially double sampling with a ratio estimator.However an important difference is that the Bell and Alexander(1957)estimator consists of a large sample of points on which basal area counts are made and a subsample of points on which tree volumes are also determined.By contrast for big BAF sampling the volume subsample is made on every point so there is no smaller point-wise sample.Therefore some of the variances and covariances for the Palley and Horwitz (1961) variance estimator of the Bell and Alexander (1957) volume estimator must be determined on the subsample which is smaller thann,but for big BAF sampling all the variances and covariances have the same point-wise sample size ofn.A consequence is that for the big BAF variance estimator we cannot further simplify the variance estimator above by utilizing the ratio of large-tosmall point-wise sample size as was done by Palley and Horwitz(1961).

Simulation trials

We used two simulated forest populations that were previously employed by Gove et al.(2020)to compare traditional and previously proposed big BAF sampling variance estimators.The sampling simulation programsampSurfGove(2012)which was written in R(R Core Team 2021)was used to conduct the simulations.The concept of“sampling surface”(Williams 2001a,2001b) was used to construct thesampSurfsimulator in which a raster tract of areaAis tessellated into square grid cells.Trees are located on the tract and inclusion zones are established for each tree based on the sampling procedure(horizontal point sampling for these simulations).A sample point is considered to be located in the center of each grid cell.Totals for each grid cell are based on the attributes of trees whose inclusion zones contain the sample point at the grid cell center.The sampling surface is developed based on the total attributes values over all the grid cells.For our simulations square tracts were used with grid cells 1 m2in size.

Nine sets of simulations were conducted using every combination of BAF pairs (FvandFc) whereFc∈{3,4,5} andFv∈ {10,20,30} for both forest populations.For each sampling simulation sampling surfaces were developed for total basal area and total volume using every combination ofFcandFvresulting in 36 simulation surfaces.A Monte Carlo experiment was conducted for each of the 9 BAF factor combinations in which random samples ofn=10,25,50 and 100 were drawn with 1,000 replications.Summary statistics were computed for HPS and big BAF sampling for each sample on each sampling surface.The statistics available for each BAF combination made it possible to compare the big BAF results to an HPS sample usingFcin which every sample tree was measured for volume(e.g.,DBH and height measured).

Mixed northern hardwood population

The mixed northern hardwood population is the same one used by Gove et al.(2020).The population is artificially constructed but resembles what could typically be found in a mixed northern hardwoods forest.It is established on a tract having an area ofA=3.17 ha and containing 31,684 grid cells.The tract is bounded by an external buffer 18 m wide so that the portion of the tract containing the tree population internal to the buffer has an area of 2 ha.A population ofm=667 trees with a total basal area of 48.4 m2was established within the tract boundaries.This is approximately equivalent to 333 trees·ha-1and a basal area of 24.2 m2·ha-1with a stand quadratic mean diameter ofAccording to northern hardwoods stocking guides by Leak et al.(2014) the stand would be in fully stocked condition.A three-parameter Weibull distribution (Bailey and Dell 1973) was used to assign tree DBHs,with location,scale and shape parameters respectively beingα=10 cm,γ=2 andζ=30 cm.Total heights for each tree in the simulated northern hardwoods stand were assigned using the all-species DBH-height equation by Fast and Ducey(2011)for northern hardwoods in New Hampshire converted to metric units.A normal random error term with mean zero and standard deviation 2.5 m was added to each height prediction.A spatial inhibition process (Venables and Ripley 2002,p.434) with an inhibition distance of 3 m was used to assign trees to spatial locations within the simulated northern hardwoods forest tract.The method of Masuyama(1953)for boundary overlap correction was used in which tree inclusion zones were allowed to overlap into the buffer region(Gregoire and Valentine 2008,p.224).Because random sample points can fall anywhere in the tract which includes the buffer region,each tree has a complete inclusion zone.

The following taper function is used within thesamp-Surfsimulation(Van Deusen 1990):

whereDuis the top diameter at tree stem heighth,Dbis the tree stem butt diameter and 0 ≤h≤His tree height.The value of the taper parameterrwas randomly selected for each tree from the ranger∈[1.5,3].With the taper function above a neiloidal form results if 02.The taper function for each tree was used to compute individual tree volume according to the procedures of Gove(2011a,p.8).There was a correlation coefficientρ(V,b)=0.62 between individual tree VBAR and basal area in the simulated northern hardwoods population.Figure S.2 in the Supplementary Material for Gove et al.(2020)displays histograms of the DBH and height distributions for the simulated northern hardwoods forests.

Eastern white pine population

The eastern white pine(Pinus strobusL.)population used by Gove et al.(2020) was also used in this study.Gove et al.(2000) describes data collection for the eastern white pine based on Barr &Stroud FP-12 dendrometry over a 20-year period.These data were obtained from pure even-aged white pine forest stands in southern New Hampshire.Data processing utilized the RDendrometrypackage(Gove 2011a).The white pine population used for simulations consists ofm=316 white pine trees with multiple measurements on some during the period.Trees were located within a 1 ha tract having an 18 m wide buffer and having a total area ofA=1.85 ha in size with 18,496 grid cells.The population has a basal area of 47.2 m2and a quadratic mean DBH ofAccording to the Leak and Lamson (1999) white pine stocking guide,the tract is solidly in the full stocking range.The trees were originally measured in several different stands without location information.To assign trees spatial locations for the simulation stand,a spatial inhibition process having an inhibition distance of 3 m was employed similarly to the northern hardwoods stand discussed above.As with the northern hardwoods stand,Mayasuma’s method was used to correct for boundary overlap in point sampling,so that randomly located sample points were permitted to fall into the buffer strip surrounding the 1 ha white pine tract.No taper function was required for the white pine stand because dendrometry measurements were available for upper-stem taper on each tree.As described by Gove(2011b),a cubic spline was fitted to tree dendrometry measurements.Smalian’s formula (Kershaw et al.(2016,p.241)) was used to calculate individual tree volumes.Figure S.5 in the Supplementary Material of Gove et al.(2020) displays histograms of DBH and total height distributions for the white pine population.

Results

Big BAF estimator bias

We derived two expressions for the bias in big BAF sampling,Eq.15 which is an approximation to the bias and(16) which is an exact expression of the bias.An indication of the bias in big BAF sampling is shown in Fig.1 which was prepared using Eq.15 with variance and covariance values computed from all the lattice points on the sampling surfaces for the northern hardwoods and white pine populations.Use of all the lattice points in these computations should provide a very close approximation to true population values.Only sampling plans withFc=3 are presented because results from the other values of BAFcused in this study are very similar and the bias withFc=3 is larger than other values of BAFcused in this study by a very small amount.Figure 1 shows that the bias is quite small even forn=10 which is 0.17%for the white pine population.Bias percentages decline steeply with increasing sample sizes and are essentially negligible for all sample sizes equal to or greater than 10 for both the white pine and northern hardwoods population.As expected bias also declines as BAFvapproaches the value ofFc=3.

Fig.1 Approximate bias in the big BAF estimator for the northern hardwoods and white pine populations with BAFv=30(dotted,?),BAFv=20(dash,△)and BAFv=10(solid,+)

Population sampling surfaces

The results concerning the sampling surfaces for the Northern Hardwoods and the White Pine populations were given by Gove et al.(2020)in Table 1 of that paper.As expected the results for basal area and volume from the sampling surfaces were quite close to the actual population values.The white pine population had higher stocking and volume per hectare than the northern hardwoods population as would be typical for fully-stocked stands in the New England,USA region.Gove et al.(2020)noted that the northern hardwoods DBH distribution was more positively skewed than the white pine DBH distribution.Tree heights in the white pine populations were generally taller than the northern hardwoods population which was likely the primary reason that the volume per tree in the white pine was considerably greater than that for the northern hardwoods population.Figure S.1 in Supplementary Material shows sampling surfaces for the northern hardwoods population for(a)total BAFcwithFc=3 basal area and (b) total BAFcvolume withFv=30.A realization of a Monte Carlo sample consisting ofn=100 is also indicated with each point denoted by a red“×”.

Figure S.2 in Supplementary Materials indicates the population correlationsρfor the northern hardwoods population between important variables such as the basal area and volume on the count and volume points and the volume for all 6 combinations of point sampling BAFs used in the simulations.As might be expected the correlation between basal area and volume when using the same BAF factor is close to 1 and fairly constant over variation in the range of BAFvfor big BAF sampling.As also might be expected the correlation between two variables when one is sampled on BAFcand the other is sampled on BAFvdeclines with increasing count BAFcand ranges from 0.83 to 0.52.Covariances between these variables are displayed on Figure S.3 in Supplementary Materials.

Figure S.4 in Supplementary Materials illustrates the sampling surfaces for basal area and volume forFc=3 andFv=30 for the eastern white pine population.A red“×”is indicated on the volume surface illustration for each point in a realization of one Monte Carlo sample ofn=100.

Population correlations between important variables for the eastern white pine population are given in Figure S.5.The patterns are similar to those for the northern hardwoods population.However the correlations between pairs of variables,in which one is from the large BAFvand the other is from the small BAFc,are higher.Correlations range from 0.91 to 0.67,declining with increasing values of BAFv.Population covariances for this population are given in Figure S.6.

Monte Carlo simulations

Standard error comparisons

Figure 2 displays the standard error results from Monte Carlo simulations comparing Goodman’s Method(Eq.6),the“traditional”Delta method (Eq.11),the new pointbased Delta method (Eq.24) and simplified point-based Delta method (Eq.33) for the northern hardwoods population.For total sample size ofn=100,the standard errors of the four methods are virtually indistinguishable for all three count BAFs,Fc=3,Fc=4 andFc=5.As expected standard errors decline for all four variance approximation methods as the BAFvdeclines and more closely approaches the value of BAFc.For reference purposes the standard errors for HPS using all trees selected on the count BAFcare displayed.In Supplementary Material it is indicated in Figure S.7 that this HPS standard error is extremely close to the standard error among big BAF simulation estimates for all values of BAFcand BAFvutilized in this study.Because these were so extremely close we present the HPS standard errors only for comparison to the results for big BAF sampling in Figs.2 and 3.It is a remarkable fact that HPS with measurement of all trees for volume determination provides no appreciable reduction in variance in volume estimation compared with big BAF in which only a much reduced subsample of trees are measured at each point.

As the total point-wise sample sizendecrease fromn=100 ton=10 separations between the standard errors given by the four approximations become more evident with Goodman’s method being slightly below the Delta method and the point-based Delta method standard error being lower than either of the other three approximations,particularly forn=10 and the largest value of BAFv.The simplified point-based Delta method is consistently lower than the other three methods.This makes it closer to the reference line indicated for HPS except forn=10 where simplified point-based Delta method underestimates compared to the reference line for the two smaller values of BAFv.However it should be noted that even in the latter case the difference between the point-based Delta method and simplified point-based Delta method compared to the traditional Delta method is only in the range of 5% even forn=10 and the largest value of BAFv.In this latter case both point-based Delta method and simplified point-based Delta method are within 2%of the reference line for HPS with point-based Delta method overestimating and simplified point-based Delta method underestimating for the two largest values of BAFv.For samples sizes equal to or larger thann=25 which are more likely to be representative of typical big BAF sampling plans all four variance estimators are consistently within 2%of each other becoming closer as samples sizenbecomes larger.

Figure 3 contains the white pine population Monte Carlo simulation standard error results.As was the case for the northern hardwoods population,the traditional Delta method (Eq.11),Goodman’s method (Eq.(6)),the point-based Delta method (Eq.24) and the simplified point-based Delta method (Eq.(33)) are displayed in the figure.However,the maximum difference between the standard errors of the three methods was about 8% forn=10 with differences atn=100 being only about 0.5%.In the case ofn=10,both point-based Delta method and simplified point-based Delta method were within approximately 2%of the HPS reference line but point-based Delta method was an overestimate while simplified point-based Delta method was an underestimate.Consistently the lowest estimate of standard error was provided by simplified point-based Delta method which made it closer to the HPS reference line than the other methods except in the case ofn=10 where it was somewhat lower than the HPS reference line.As expected,standard errors for each given level ofnand BAFcmostly decline with decreasing levels of BAFvthough there are some slight exceptions in the case of point-based Delta method forn=10 These trends are similar to those for the northern hardwoods population.However,the standard errors from the northern hardwoods population ranged from about 32% forn=100 to 100% forn=10 (Fig.2) while the standard errors associated with the white pine population were substantially greater ranging from approximately 50%forn=100 to 160%forn=10(Fig.3).

Fig.2 The northern hardwood Monte Carlo standard error simulation results as the average over 1,000 replications for each BAF pair and sample size with the Delta Method(dashed,◇),Goodman’s(dot-dashed,+),the point-based Delta method(dotted,▽)and the simplified point-based Delta method(long-dash,×).The reference line(solid,?)is the average Monte Carlo standard error for the BAFc HPS results

Fig.3 The white pine Monte Carlo standard error simulation results as the average over 1,000 replications for each BAF pair and sample size with the Delta Method(dashed,◇),Goodman’s(dot-dashed,+),the point-based Delta method(dotted,▽)and the simplified point-based Delta method(long-dash,×).The reference line(solid,?)is the average Monte Carlo standard error for the BAFc HPS results

Confidence interval captures

Figure 4 depicts the confidence interval capture rates on the northern hardwoods population for the 1,000 replications of Monte Carlo simulation for big BAF standard error estimates obtained using the traditional Delta method (11),Goodman’s method (6),the point-based Delta method (24) and the simplified point-based Delta method (33).The results in this figure are based on the percentage of simulation trials in which a 95% confidence interval for total volume from the big BAF trial contains the true mean total volume of the simulated population.Thus a capture rate of 95% would be ideal.Forn=100 andn=50 the capture rates for the traditional Delta method,Goodman’s method,the point-based Delta method and the simplified point-based Delta method are very close for all values of BAFcand BAFv,all ranging between 94.1%and 95.4%.In some cases such asn=100 withFc=3 andFc=4,the big BAF capture rates are even closer to 95%than the capture rate for HPS with the given value of BAFc.Forn=25,Fc=3 andFv=10 the point-based Delta method and the simplified point-based Delta method have a capture rate slightly lower than the other two big BAF standard error estimators but all capture rates are between 93.5% and 95%.All capture rates were between 93.5%and 95.3%forn=10,with the pointbased Delta method and the simplified point-based Delta method being lower than the other two big BAF methods forFv=30 and the simplified point-based Delta method being somewhat lower for all values of BAFv,otherwise the three methods were extremely close.In summary the four big BAF standard error estimates produced confidence interval capture rates that were very similar as might be expected from the fact that they produced very similar variance estimates as indicated by Fig.2.

Figure 5 displays the confidence interval capture rates for the white pine population using confidence intervals constructed with standard errors based on the traditional Delta method (Eq.11) Goodman’s method (Eq.(6)) the point-based Delta method (Eq.24) and the simplified point-based Delta method (Eq.(24)).Similarly to the northern hardwoods population,simulations with the white pine population produced confidence interval capture rates between 91.6%and 96%for all combinations ofFc=3,4 and 5,Fv=10,20 and 30,and sample sizesn=10,25,50,and 100.As well,the capture rates for a conventional HPS with BAFcare very similar to the capture rates with the big BAF approaches.Forn=100 the three big BAF capture rates were all very close to each other and between 93.7%and 94.8%.It is perhaps surprising that forn=100 the largest deviation from the ideal capture rate of 95%were the results for the conventional HPS with BAFcwhich was lowest among all 5 methods ranging from 93.2%to 94.5%for the case of BAFc=5.Forn=100,the capture rates for the point-based Delta method and the simplified point-based Delta method were very slightly lower than those for the other two big BAF methods.For then=50 samples size,the capture rates for the big BAF methods as well as the conventional HPS with BAFcwere all between 94.7%and 96%.Again the capture rates for the point-based Delta method and the simplified point-based Delta method were slightly lower than for the other two big BAF methods which in this case made them slightly closer to 95%.Capture rates for the conventional HPS with BAFcwere extremely close to 95%and sightly lower than the big BAF methods forFc=3 andFc=4 but forFc=5,the HPS capture rates were slightly higher than those for the point-based Delta method and the simplified point-based Delta method and slightly lower than those associated with the other two big BAF methods.Similar results were obtained forn=25 with the pointbased Delta method and the simplified point-based Delta method having nearly equal or slightly lower capture rates than the other two big BAF methods,and all capture rates being between 94.7%and 96%.Capture rates for conventional HPS with BAFcwere slightly lower or nearly equal to the three big BAF methods forn=25.In the case ofn=10,all capture rates ranged between 91.6%and 94.6%with the point-based Delta method and the simplified point-based Delta method once again trending somewhat lower than the other two big BAF methods.

Correlations

Point-wise estimated correlations between estimates of basal area and volume for BAFv,basal area from BAFcand volume fromand basal area from BAFcand BAFvare given for the northern hardwoods Monte Carlo simulations in Figure S.8 and the white pine Monte Carlo simulations in Figure S.10 for each combination ofn,BAFcand BAFv.Estimated correlations between volume and basal area estimatesfor both the northern hardwoods in Figure S.8 and white pine in Figure S.10 were extremely close to one for all sample sizes and combinations of BAFcand BAFv.

Estimated correlations between basal area obtained from BAFcand BAFvand correlations between BAFcand volume estimateswere very close and,within each sample size,declined with increasing values of BAFv.These estimated correlations are generally somewhat higher for the white pine populations for a given value ofn,BAFcand BAFv.For the northern hardwoods population,estimated correlations range between 0.83 and 0.52 and decline with increasing BAFv.As indicated,estimated correlations in the white pine simulations tended to be higher than for the northern hardwoods population and ranged between 0.91 to 0.67.

Discussion

Results from inspection of Fig.1 indicated that the bias in the big BAF estimator is quite low for both the northern hardwoods and the white pine populations used in this study.For larger values ofnthe bias approaches zero and is especially low for the northern hardwoods population.The great majority of big BAF forest sampling plans in practice would be expected to have 10 or more sample points.Perhaps an exception might be a small stratum in a stratified sample design.While it would be possible to estimate bias from sample statistics using Eq.15,it should not be necessary given the very low values of bias obtained in this example,especially for the larger values ofnthat are commonly used in practical applications of big BAF sampling.Furthermore,we are not aware of any instances reported in the literature of big BAF sampling in which bias has presented problems for practical applications.

It should be noted that compared to the traditional Delta method (11),Goodman’s method (6) contains a negative term which causes it to be smaller than the traditional Delta method,although the difference is quite modest according to Figs.2 and 3.Both the traditional Delta method and Goodman’s method omit covariance terms between variables used in the estimation process while the point-based Delta method (24) and the simplified pointbased Delta method(33)do account for covariances.This may be the reason that the point-based Delta method and the simplified point-based Delta method standard error estimates are somewhat smaller for the smaller sample sizes(especiallyn=10)in Figs.2 and 3.

Gove et al.(2020) have previously compared the traditional Delta method Eq.(11) to Goodman’s Eq.6 with the same results presented here for these two variance estimation methods.However they did not include the point-based Delta method Eq.24 or the simplified pointbased Delta method(33)in their simulations,simulations of the traditional Delta method and Goodman’s equation were included in the present study so that the performance of the point-based Delta method and the simplified point-based Delta method could be compared to those previously-developed variance estimators.

The impact of the correlations discussed above which is included in the point-based Delta method and the simplified point-based Delta method but neglected in the two traditional variance estimation methods was apparently small for the simulated populations tested here.However,it is conceivable that these correlations could have a larger effect in some natural populations.Because of the way the artificial tree populations were constructed for this article,the possible effects of local variations in stand density on tree dimensions and the tree DBH-height relationship were minimized.For some species,density variations may affect the DBH-height relationship thus inducing some degree of correlation between volume per tree and basal area per hectare.For example,suppose an even-aged natural pine stand has extreme density variations so that DBHs tend to be smaller in relation to height in locally dense areas but larger in relation to height in areas where density is substantially less.This could induce a negative correlation between basal area and volume per tree because higher basal area regions would have less volume per tree than lower basal area regions.From this point of view the point-based Delta method and the simplified point-based Delta method may be a more conservative approaches because they do account for covariances of the kind just discussed.As well,the correlation terms present in the point-based Delta method (24) and the simplified point-based Delta method (33) may provide additional opportunities to investigate the effects of forest stand structure on big BAF variance.

By looking at the estimation problem as point-wise selection of random sample points an estimator of the covariance between the basal area estimate and ratio of mean volume to mean basal area estimates was derived in the Appendix(Eq.A.34):

The derivation uses an equation derived by Taylor’s series methods as was the case for the Delta method (Kendall and Stuart 1977,p.247).This equation could be used to estimate a correlation coefficient between the basal area estimate and the ratio of sample mean volume to sample mean basal area when expressed as a ratio of means after the manner of the form of the big BAF estimator(12)proposed by Gregoire and Valentine (2008,equation 8.33).Equation 14 would be used to estimate the standard error of the volume basal area ratio in the denominator of the correlation formula.In addition the formula above might be used to incorporate covariance information into past approaches to big BAF variance estimator that were based on the variance of a product when Eq.14 is also used to estimate the variance of the ratio of mean volume to mean basal area.

The point-based Delta method and the simplified pointbased Delta method result in a computational Eqs.24 and(33)which are longer and more complex than Goodman’s equation (6) or the traditional Delta method (11).However the formulas for the point-based Delta method or the simplified point-based Delta method can easily be coded in programming languages such as R(R Core Team 2021)as was done for the simulations reported here or in a spreadsheet.It is perhaps becoming rare to rely on“hand”calculations with data entered in calculators to perform the computations required for a forest inventory.Once the point-based Delta method or the simplified point-based Delta method has been coded in a programming language or a spreadsheet template,computations required for the method should not be a barrier to its use.

Instances in which the variances of the point-based Delta method and the simplified point-based Delta method were associated with lower confidence interval capture rates than Goodman’s or the traditional Delta method are associated with simulation parameters for which the estimated standard error for the point-wise Delta method were lower than standard errors from the other two methods.As indicated above,this may possibly be due to negative terms associated with covariances in the point-based Delta method Eq.(24)and the simplified point-based Delta method (33) which are not present in Goodman’s formula Eq.6 or the traditional Delta method Eq.(11)as they have traditionally been applied to the big BAF variance estimation problem.Inspection of Eq.24 for the point-based Delta method and(33)for the simplified point-based Delta method indicate that the last two terms in the equations will likely be negative because they contain covariances expected to be positive but which are multiplied by negative coefficients.These negative coefficients result from taking the partial derivative of a ratio with respect to the denominator in the ratio (Eq.(23))as required by the Delta method.Intuitively,when the denominator in a ratio is positively correlated to a term in the numerator of the ratio,this tends to stabilize the variability in the ratio because large values in the numerator then tend to be matched by large values in the denominator.This intuition accords with negative terms in the variance formula associated with covariances between terms in the numerator and the denominator of the ratio for the big BAF estimator.

Acceptable results from confidence interval captures tends to confirm that the Delta method based on a first-order Taylor series approximation can provide good variance estimates for big BAF sampling.As indicated previously,Wolter (2007,p.231) states that the first order approximation has frequently been found to be acceptable in practice.The point-based Delta method,the simplified point-based Delta method and the traditional Delta method tested here are based on first-order Taylor series expansions,but the traditional Delta method assumes that covariance terms are negligible.

Another technical advantage of the point-based Delta method and the simplified point-based Delta method is that these equations do not depend on the variation among trees within points.Thus there is no dependence on variance terms that implicitly assume that trees sampled within the same point are independent as Eq.7 does.Actually the only independent random samples in big BAF sampling are the sample points themselves.Similarly Palley and Horwitz (1961,p.60) noted when considering a traditional variance estimator based on Bell and Alexander(1957)for two-stage sampling in which a subset of the count basal area points are selected for volume measurements,“There is some confusion here,since in point sampling the measurement we are concerned with attaches to points...rather than to trees.”Technically,trees sampled at the same point are correlated,although apparently this fact did not prevent accurate evaluations of variance with(7)in simulations.However,use of the traditional big BAF variance estimation methods with the ratio variance estimate Eq.14 instead of Eq.7 also avoids the problem of estimating variance using possibly nonindependent sample trees within the same points.

In comparing the point-based Delta method to the simplified point-based Delta method we recommend the use of the simplified point-based Delta method (33) for applications.Inspection of Figs.2 and 3 show that the simplified point-based Delta method was slightly closer to the HPS reference line than the other three variance estimation methods for sample sizes ofn=25 or greater.In the case ofn=10,the simplified point-based Delta method was generally about equally distant from the HPS reference line as point-based Delta method but tended to be a slight underestimate instead of an overestimate.In practical applications the majority of big BAF sampling plans will likely have samples sizes ofn>10.

From a theoretical point of view the point-based Delta method should be preferred to Bruce’s method because Bruce’s method does not take into account possible correlations between the basal area estimate obtained from BAFcand the estimate of mean volume to basal area ratio.The point-based Delta method implicitly does this by accounting for the correlations between all basic basal area and volume estimates.In a very similar way Palley and Horwitz(1961,p.60)used the Delta method to derive a“conceptually sounder”variance estimator that they recommend as an alternative to the Bell and Alexander(1957,p.17)variance estimator for double sampling with a ratio estimate in the context of point sampling.The simplified point-based Delta method should be preferred to the point-based Delta method because the estimate of total basal area using BAFcis more precise than the estimate of basal area using BAFvand therefore a better estimate of the total basal areaBfor use in the variance approximation formula for the simplified point-based Delta method.

A final thought concerning the efficacy of the pointbased Delta method and the simplified point-based Delta method relates to the current practice of estimating covariances and correlations to assess the independence assumption for big BAF sampling.As noted in Gove et al.(2020),past attempts at calculating these quantities have all been ad hoc due to the nature of differing‘sample support’ between the VBAR estimates,which are tree-based,and the basal area estimates,which are point-based.The point-based Delta method and the simplified point-based Delta method solve this dilemma by determining covariances and correlations completely on a point-wise basis,yielding true estimates in each case rather than aggregating tree-wise attributes for comparison on a point-wise manner as in the traditional Delta method application.

Conclusions

New variance formulas for big BAF sampling have been derived and tested.They have been termed the pointbased Delta method and the simplified point-based Delta method because they have been derived using the Delta method based on sources of variation among sample points.This approach takes the covariances among the variables in the big BAF sampling estimator into account.More traditional methods of estimating the big BAF sampling variance are based on the variances of variables comprising the big BAF estimator but do not take the covariances between these variables into account.Monte Carlo simulation experiments conducted on a northern hardwoods forest population and a white pine population indicated that the point-based Delta method and the simplified point-based Delta method performed comparably to two existing big BAF estimators.Estimates from the point-based Delta method and the simplified point-based Delta method were sometimes slightly lower than estimates from the other two methods in smaller sample sizes(numbers of sample points).This might be partially due to negative terms of modest magnitude associated covariances among variables which are not considered with the more traditional estimators.We have also shown mathematically that the bias in big BAF sampling approaches zero as sample size becomes large on the order ofbehavior that is similar to the standard ratio of means estimator commonly used in survey sampling.

Appendix

On the bias in the big BAF estimator and the covariance between the basal area and volume basal area ratio estimates

Approximate bias

According to Seber (1982,p.7) a second-order approximation to the bias in a functiongof the means of random variablesxibased on Talyor series is:

Begining with the equations for the first partial derivatives(21),(22),and(23)we obtain the following required second partial and cross-partial derivatives.Note that the second partials ofgwith respect toare zero so they are not given below.We assume without loss of generality in this section that tract areaA=1 so the results are on a per-unit area basis.

The true population bias is a function of expected values and population variances.Noting the expected values for the HPS sample means EVthe relevant second partial and cross partial derivatives evaluated at the mean vectorθ=(B,B,V)are:

Substituting into(A.1)with algebraic rearrangement we obtain:

The expression above is essentially the same as bias derived from Equation 11 in Palley and Horwitz (1961)for the Bell and Alexander(1957)estimate which is essentially the same as double sampling with a ratio estimator.However,in their case two different point-wise sample sizes were involved,the total sample size and a subsample size,which is not the case for the big BAF estimator.That means some of the formulas for the variances and covariances within Equation 11 of Palley and Horwitz (1961)would not be the same as for big BAF sampling because they would depend on the sub-sample size rather than the total sample size.Note thatare means of independent identical HPS samples so that

We note that this expression approaches zero asnbecomes large on the order ofwhich is similar to the behavior of the standard ratio estimator according to Cochran(1977,p.160).

Exact bias

We can mathematically investigate the bias in the big BAF estimator by a method similar to that give by Cochran(1977,p.162) and originally developed by Hartley and Ross (1954).In the derivations to simplify notation we assume without loss of generality areaA=1 because the final results are ratios that do not involve area.Consider the covariance between the big BAF estimatorand the basal area estimate from the small BAF factor angle gauge

Then by the definition of the covariance and the fact thatare design-unbiased HPS estimators(Palley and Horwitz 1961)of total volumeVand basal areaBrespectively we have+BVresulting in:

Now we may quantify the exact bias as:

Using the definition of the correlation coefficientρthe absolute value of the bias is then

For forest populations we generally expect that estimates of basal area and volume from common samples on the same populations would be positively correlated so thatso that the maximum value of the correlations is one.If the correlations are positive the maximum possible difference in the absolute value on right hand side of the equation above occurs when one of the terms is zero and the other is greater than zero we have:

BecauseBand var(Bc)are constant population values,asnapproaches infinity the right-hand side of the equation above goes to zero,with the result that the relative bias for the big BAF estimator approaches zero on the order of(using“bigO”notation

Once again becauseBand var(Bv)are population constants the relative bias in the big BAF estimator approaches zero asnapproaches infinity.

In the unusual case were one of the correlations between basal area and volume estimates may be negative,we can posit

In this case as well the relative bias in the big BAF estimator approaches zero asnapproaches infinity.Thus for all three possible cases the relative bias in the big BAF estimator approaches zero as the sample size approaches infinity on the order ofAccording to Cochran(1977,p.160)this is similar to the behavior of the standard ratio estimator often used in sample surveys.

Covariance between basal area and volume basal area ratios

The covariance between two possibly nonlinear functionscan be approximated using Taylor’s series methods in a way similar to the derivation of the Delta method.The following approximation formula is based on Kendall and Stuart(1977,p.247)

To find the approximate covariance between estimated basal area and the ratio of the estimates of mean volume to mean basal area we define:

and the ratio of the estimates of mean volume and mean basal area:

We then obtain the following partial derivatives needed for the approximation formula:

Finally inserting the function definitions and partial derivatives in to the approximation formula we find the following approximate estimator for the covariance between the basal area estimate and the volume basal area estimate:

We may factor a quantity offrom the parentheses:

Abbreviations

BAF:Basal area factor;BAFc:Count sample basal area factor;BAFv:Volume sample basal area factor;HPS:Horizontal point sampling;VBAR:Volume to basal area ratio

Supplementary Information

The online version contains supplementary material available at https://doi.org/10.1186/s40663-021-00304-0.

Additional file 1:Supplementary Material:An Approximate Point-Based Alternative for the Estimation of Variance under Big BAF Sampling

Acknowledgements

This is the second in a series of two papers dedicated to a pioneer of research and education in point sampling.In recognition of his many contributions and insights in areas relating to forest inventory,and his influential teaching and mentoring of many students,professionals and inventory scientists,we dedicate this paper in memory of our esteemed colleague Dr.John F.Bell.

Authors’contributions

TBL initial concept,writing main manuscript,deriving bias and variance equations;JHG conducting computer simulations,deriving equations,assisted in writing manuscript,writing Supplemental Materials;TGG and MJD assisted in study design and contributed ideas,text and comments to the manuscript.All authors read and approved the final manuscript.The findings and conclusions in this publication are those of the authors and should not be construed to represent any official USDA or U.S.Government determination or policy.

Funding

MJD:Support was provided by Research Joint Venture Agreement 17-JV-11242306045,“Old Growth Forest Dynamics and Structure,”between the USDA Forest Service and the University of New Hampshire.Additional support to MJD was provided by the USDA National Institute of Food and Agriculture McIntire-Stennis Project Accession Number 1020142,“Forest Structure,Volume,and Biomass in the Northeastern United States.”TBL:This work was supported by the USDA National Institute of Food and Agriculture,McIntire-Stennis project OKL0 2834 and the Division of Agricultural Sciences and Natural Resources at Oklahoma State University.

Availability of data and materials

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

Declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Author details

1Professor Emeritus,Department of Natural Resource Ecology and Management,Oklahoma State University,Ag Hall Room 008C,Stillwater 74078,OK,USA.2USDA Forest Service,Northern Research Station,271 Mast Road,Durham 03824,NH,USA.3Yale School of Environment,Yale University,360 Prospect St,New Haven 06511,CT,USA.4Department of Natural Resources and the Environment,University of New Hampshire,114 James Hall,Durham 03824,NH,USA.

Received:17 December 2020Accepted:13 April 2021

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Forest Ecosystems2021年3期

Forest Ecosystems的其它文章: Soil carbon and nutrient stocks under Scots pine plantations in comparison to European beech forests:a paired-plot study across forests with different management history and precipitation regimes; Deforestation and fragmentation trends of seasonal dry tropical forest in Ecuador:impact on conservation; Zonal aspects of the influence of forest cover change on runoff in northern river basins of Central Siberia; Early detection of pine wilt disease in Pinus tabuliformis in North China using a field portable spectrometer and UAV-based hyperspectral imagery; Effects of stand features and soil enzyme activity on spontaneous pedunculate oak regeneration in Scots pine dominated stands–implication for forest management; Wood substitution potential in greenhouse gas emission reduction–review on current state and application of displacement factors