Estimation of Similarity Indices via Two-Sample Jackknife Procedure
Chia-Jui Chuang
Applied Mathematics, National Chung Hsing University,
Taichung, Taiwan 402, R.O.C.
Abstract
The similarity indices are often used to assess the biodiversity of two communities. Because some
species are not observed in the samples, the common naive estimators may be unsatisfactory. Based on
the quadrat sampling, a series of two-sample jackknife estimators for the Jaccard and Sørensen indices
are developed. The sequence of estimators is able to reduce bias by increasing the jackknife order;
however, it may result in a large variation with the increasing order. To compensate for the bias-variance
trade-off, we consider a sequential testing procedure to select the jackknife order. A simulation study
based on two real forest plots is used to evaluate the performance of the proposed method.
Key Words: Jaccard Index, Sørensen Index, Two-Sample Jackknife, Quadrat Sampling
1. Introduction
The similarity index provides a quantitative measure for comparing two populations.
For example, DNA fingerprinting is an application of similarity to DNA profiles [1],
and is commonly used in parental testing and criminal investigation. In computer
science and data mining research, similarity indices are used to find similar pairs
among objects [2]. The similarity index has also been widely used in ecology [3,4]
and is, therefore, the focus of this paper.
The advantages of classic similarity indices, such as
the Jaccard index, Sørensen index, Bray-Curtis index,
and the Morisita-Horn index, have been discussed [5,6].
Among these similarity indices, the Jaccard index [7]
and Sørensen index [8] are commonly applied in eco-
logy. These indices are often used to measure the spe-
cies-diversity for the optimum size for natural protection
[9,10]. Boyce and Ellison [11] indicated that these indi-
ces have a consistently high performance. Therefore, this
study focuses on the Jaccard and Sørensen indices. The
Jaccard index is defined as the number of shared species
divided by the number of total distinct species in two
communities. The Sørensen index is the ratio of the num-
ber of shared species to the average number of total spe-
cies in two communities.
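The two definitions above can be illustrated with a minimal sketch, assuming each community is given as a complete set of species names. The function names and the toy species lists are hypothetical and not part of the paper.

```python
def jaccard(species_a, species_b):
    """Shared species divided by the total number of distinct species."""
    a, b = set(species_a), set(species_b)
    return len(a & b) / len(a | b)


def sorensen(species_a, species_b):
    """Shared species divided by the average species count of the two communities."""
    a, b = set(species_a), set(species_b)
    return 2 * len(a & b) / (len(a) + len(b))


# Hypothetical toy communities: 2 shared species, 5 distinct species in total.
community_1 = {"oak", "pine", "fig", "palm"}
community_2 = {"fig", "palm", "cedar"}
print(jaccard(community_1, community_2))   # 2 / 5
print(sorensen(community_1, community_2))  # 4 / 7
```

Both functions raise a `ZeroDivisionError` for two empty communities; a real analysis would need to decide how to treat that case.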
The definitions of the Jaccard and Sørensen indices are based on the numbers of
species in two populations. However, this information is usually obtained from
samples collected from the populations. Hence, in practice, similarity indices are
usually estimated by replacing the numbers of species with the numbers of species
observed in the samples. These estimators are referred to as naive estimators. In
general, the naive estimators are biased because some species are not observed in
the sample; thus, there is still room to improve them.
The jackknife procedure can reduce the bias of estimators in various applications
[12,13]. Heltshe and Forrester [14] applied the jackknife procedure to one
community, and Burnham and Overton [15] extended the jackknife procedure in order
to reduce more bias. In this paper, a new two-sample jackknife procedure for the
Jaccard and Sørensen indices is developed. We start with the naive estimators and
derive a series of jackknife estimators for these indices. When the order of the
jackknife estimator increases, the resulting bias decreases; however, a higher
variation may occur. Therefore, a sequential test for selecting a proper order of
estimator is also proposed.
Journal of Applied Science and Engineering, Vol. 15, No. 3, pp. 301–310 (2012)
*Corresponding author. E-mail: cjchuang0302@gmail.com
Prior studies have discussed the estimation of similarity indices. Heltshe [16]
suggested a jackknifed simple matching coefficient (SMC) to estimate the Jaccard
index. Kaitala [17] used the hypergeometric distribution with a simple rank
approach to estimate the similarity index. More recently, Yue and Clayton [18]
estimated the Jaccard index by nonparametric maximum likelihood estimators.
Severiano et al. [19] used the jackknife and bootstrap methods to find confidence
intervals for similarity indices. In this paper, we find that the jackknifed SMC
estimator may be inappropriate in certain situations. The estimator is evaluated in
section 4.2.
In section 2, we develop the two-sample jackknife algorithm and the sequential
testing procedure for the Jaccard and Sørensen indices under abundance datasets.
Section 3 considers the two-sample jackknife procedure for quadrat datasets. In
section 4, the performance of the jackknife estimators for the Jaccard and Sørensen
indices is discussed using simulations based on artificial datasets and on two
tropical forests in Panama. Section 5 offers concluding remarks and discussion.
2. Methodology and Procedures
2.1 Jackknife Estimates for the Jaccard Index
Let $M_1$ and $M_2$ be the numbers of species present in communities I and II,
respectively. These species can be decomposed as follows: $M_{12}$ species are
shared by both communities, while $M_1 - M_{12}$ and $M_2 - M_{12}$ species are
unique to communities I and II, respectively. Let $X = (X_1, \ldots, X_{M_1})$ and
$Y = (Y_1, \ldots, Y_{M_2})$ be the sampling frequencies with sample sizes $n_1$
and $n_2$ from communities I and II, respectively. We assume that $X$ follows the
multinomial distribution with total size $n_1$ and cell probabilities
$(p_1, \ldots, p_{M_1})$, and that $Y$ follows the multinomial distribution with
total size $n_2$ and cell probabilities $(q_1, \ldots, q_{M_2})$.
The Jaccard index is defined as $\theta_J = M_{12}/M$, where
$M = M_1 + M_2 - M_{12}$ denotes the total number of species in the two
communities. Let $D_1 = \sum_{i=1}^{M_1} I(X_i > 0)$ be the number of observed
species in sample I, where $I(\cdot)$ denotes the usual indicator function, and let
$D_2$ be the same for sample II. Let
$D_{12} = \sum_{i=1}^{M} I(X_i > 0, Y_i > 0)$ be the number of species observed in
both samples. We denote $f_{jk} = \sum_{i=1}^{M} I(X_i = j, Y_i = k)$ as the number
of species represented by exactly $j$ individuals in sample I and $k$ individuals
in sample II. The total number of observed species in both samples is
$D = D_1 + D_2 - D_{12}$. Because $M_{12}$ and $M$ are unknown, the naive method
uses $D_{12}/D$ to estimate the Jaccard index. The number of observed species is
generally smaller than the actual number; consequently, the naive estimator is
biased.
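As a minimal sketch of the quantities just defined, the snippet below computes $D_1$, $D_2$, $D_{12}$, $D$, the joint frequency counts $f_{jk}$ and the naive Jaccard estimate. It assumes that both samples are stored as count vectors aligned on one common species list (zero when a species is not observed); the function names and the toy counts are hypothetical.

```python
from collections import Counter


def summarize(x, y):
    """Return (D1, D2, D12, D, f) for two aligned count vectors."""
    d1 = sum(1 for xi in x if xi > 0)
    d2 = sum(1 for yi in y if yi > 0)
    d12 = sum(1 for xi, yi in zip(x, y) if xi > 0 and yi > 0)
    d = d1 + d2 - d12
    # f[(j, k)]: number of observed species with j individuals in I and k in II.
    f = Counter((xi, yi) for xi, yi in zip(x, y) if xi > 0 or yi > 0)
    return d1, d2, d12, d, f


def naive_jaccard(x, y):
    """Naive estimator D12 / D."""
    _, _, d12, d, _ = summarize(x, y)
    return d12 / d


# Hypothetical counts for six species; the fifth species is unseen in both samples.
x = [3, 1, 0, 2, 0, 1]
y = [1, 0, 2, 1, 0, 4]
d1, d2, d12, d, f = summarize(x, y)
print(d1, d2, d12, d)       # 4 4 3 5
print(naive_jaccard(x, y))  # 3 / 5
```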
The two-sample jackknife procedure is applied to reduce the bias of the naive
estimator [20]. Let the naive estimator be the initial estimator, denoted as
$\hat\theta_{J0}$, for $\theta_J$ based on all $n_1 + n_2$ observations. Define
$\hat\theta_{J0}^{(-\ell,\cdot)}$ as the estimator obtained by evaluating
$\hat\theta_{J0}$ when the $\ell$th observation in X has been deleted, where
$\ell = 1, \ldots, n_1$. The $\ell$th pseudo-value is defined as
$n_1\hat\theta_{J0} - (n_1 - 1)\hat\theta_{J0}^{(-\ell,\cdot)}$. Then the jackknife
estimator is defined as the average of these pseudo-values,
$$\hat\theta_{J0,X} = n_1\hat\theta_{J0} - \frac{n_1 - 1}{n_1}\sum_{\ell=1}^{n_1}\hat\theta_{J0}^{(-\ell,\cdot)},$$
whose explicit form involves $f_{1+} = \sum_{i=1}^{M} I(X_i = 1, Y_i > 0)$, the
number of observed shared species represented by exactly one individual in sample
I. For Y, the jackknife estimator $\hat\theta_{J0,Y}$, obtained by deleting one
individual in Y at a time, is defined analogously and involves
$f_{+1} = \sum_{i=1}^{M} I(X_i > 0, Y_i = 1)$. The weighted average of
$\hat\theta_{J0,X}$ and $\hat\theta_{J0,Y}$ is the resulting first-order jackknife
estimator:
$$\hat\theta_{J1} = \frac{n_1\hat\theta_{J0,X} + n_2\hat\theta_{J0,Y}}{n_1 + n_2}.$$
Following the two-sample jackknife [21], we continue by removing one individual in
Y at a time for the estimator $\hat\theta_{J0,X}$. Let
$\hat\theta_{J0,X}^{(\cdot,-m)}$ be the estimator that evaluates
$\hat\theta_{J0,X}$ when the $m$th individual is removed from Y. The $m$th
pseudo-value is $n_2\hat\theta_{J0,X} - (n_2 - 1)\hat\theta_{J0,X}^{(\cdot,-m)}$,
and the second-order jackknife estimator is the average of these pseudo-values,
$$\hat\theta_{J2} = n_2\hat\theta_{J0,X} - \frac{n_2 - 1}{n_2}\sum_{m=1}^{n_2}\hat\theta_{J0,X}^{(\cdot,-m)},$$
which can be expanded in terms of $\hat\theta_{J0}^{(-\ell,-m)}$, the estimator
obtained by evaluating $\hat\theta_{J0}$ when the $\ell$th observation in X and the
$m$th observation in Y have been deleted.
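The delete-one construction described above can be checked by brute force. The sketch below is an illustration under stated assumptions, not the paper's closed-form estimator: counts are stored as aligned vectors, the function names are hypothetical, and the first-order estimate is taken as the $n_1$, $n_2$ weighted average of the two delete-one estimators. Because all individuals of one species give the same leave-one-out value, the loop runs over species and weights each value by the species count.

```python
def naive_jaccard(x, y):
    """Naive estimator D12 / D from aligned count vectors."""
    d12 = sum(1 for xi, yi in zip(x, y) if xi > 0 and yi > 0)
    d = sum(1 for xi, yi in zip(x, y) if xi > 0 or yi > 0)
    return d12 / d


def delete_one_jackknife(x, y, sample):
    """Average of the pseudo-values when one individual is removed from `sample`."""
    counts = x if sample == "x" else y
    n = sum(counts)
    full = naive_jaccard(x, y)
    total = 0.0
    for i, c in enumerate(counts):
        if c == 0:
            continue
        reduced = list(counts)
        reduced[i] -= 1  # remove one individual of species i
        if sample == "x":
            est = naive_jaccard(reduced, y)
        else:
            est = naive_jaccard(x, reduced)
        total += c * est  # c individuals share this leave-one-out value
    return n * full - (n - 1) / n * total


def first_order_jackknife(x, y):
    """Weighted average of the delete-one estimators for X and Y."""
    n1, n2 = sum(x), sum(y)
    jx = delete_one_jackknife(x, y, "x")
    jy = delete_one_jackknife(x, y, "y")
    return (n1 * jx + n2 * jy) / (n1 + n2)


# Hypothetical counts for six species (n1 = 7, n2 = 8).
x = [3, 1, 0, 2, 0, 1]
y = [1, 0, 2, 1, 0, 4]
print(naive_jaccard(x, y), first_order_jackknife(x, y))
```

The brute-force loop costs one re-evaluation per observed species, which is cheap for a single order but grows quickly when the deletion is nested for higher orders; that is one practical reason to prefer closed-form expressions in the frequency counts.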
We may continue the jackknife procedure to further reduce the bias. However, the
number of terms in the jackknife estimator of order $k$ is $3^k$ for
$k = 1, 2, \ldots$, and the form of the estimator becomes complicated when $k$ is
large. Therefore, we suggest ignoring the terms in $1/D^2$ and $1/D^3$ when
developing the jackknife estimator. This yields approximated estimators of
$\hat\theta_{J1}$ and $\hat\theta_{J2}$, denoted $\tilde\theta_{J1}$ and
$\tilde\theta_{J2}$, which are close to $\hat\theta_{J1}$ and $\hat\theta_{J2}$
when $D$ is large enough. In section 4.1, we consider several scenarios of
communities to confirm the proximity between the approximated and original
estimators.
Note that $\tilde\theta_{J1}$ and $\tilde\theta_{J2}$ can be represented as
$\tilde\theta_{J_i} = \sum_{j>0}\sum_{k>0} c_{jk}^{(i)} f_{jk}/D$, where
$c_{jk}^{(i)}$ is the coefficient corresponding to $f_{jk}$ and $i = 1, 2$.
According to the two-sample jackknife procedure above, and ignoring the
$1/D^{\ell}$ terms for $\ell \ge 2$, our jackknife algorithm is summarized as:
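Step 0. Set $v = 0$ initially.
Step 1. Let $i = 2v + 1$. Define
$$\hat\theta_{J_{2v},X} = \frac{n_1}{v+1}\hat\theta_{J_{2v}} - \frac{n_1 - v - 1}{(v+1)\,n_1}\sum_{\ell=1}^{n_1}\hat\theta_{J_{2v}}^{(-\ell,\cdot)}$$
and
$$\hat\theta_{J_{2v},Y} = \frac{n_2}{v+1}\hat\theta_{J_{2v}} - \frac{n_2 - v - 1}{(v+1)\,n_2}\sum_{m=1}^{n_2}\hat\theta_{J_{2v}}^{(\cdot,-m)}.$$
The $i$th order jackknife estimator is
$\hat\theta_{J_i} = (n_1\hat\theta_{J_{2v},X} + n_2\hat\theta_{J_{2v},Y})/(n_1 + n_2)$.
An approximated estimator for the $i$th order is
$\tilde\theta_{J_i} = \sum_{j>0}\sum_{k>0} c_{jk}^{(i)} f_{jk}/D$.
Step 2. The $(i+1)$th order jackknife estimator $\hat\theta_{J_{i+1}}$ is obtained,
as for the second-order estimator above, by deleting one individual in Y at a time
from $\hat\theta_{J_{2v},X}$. An approximated estimator for the $(i+1)$th order is
$\tilde\theta_{J_{i+1}} = \sum_{j>0}\sum_{k>0} c_{jk}^{(i+1)} f_{jk}/D$.
Step 3. Increase the integer $v$ to $v + 1$ and return to Step 1.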
The explicit formulae for $\tilde\theta_{J1}$ to $\tilde\theta_{J6}$ are provided
in Appendix A.
The estimated variances of $\tilde\theta_{J_i}$ for $i = 1, \ldots, 6$ are derived
by the delta method. For a given $D$, the estimated variance of
$\tilde\theta_{J_i}$ is
(1)
Note that $f = (f_{11}, \ldots, f_{jk}, \ldots)$, conditional on $D$, approximately
follows the multinomial distribution with size $D$ and cell probabilities
$\pi_{jk} = f_{jk}/D$ for all $j, k > 0$. Hence, the estimated covariance in
Eq. (1) is
Furthermore, by replacing the covariance in Eq. (1) with the estimated covariance
above, Eq. (1) can be simplified as
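The delta-method calculation described in this subsection can be sketched numerically. The snippet assumes only what the text states: the estimator is a linear combination $\sum c_{jk} f_{jk}/D$, and, given $D$, the frequency counts are treated as multinomial with size $D$ and plug-in cell probabilities $f_{jk}/D$. The closed form used below is the standard variance of a linear function of multinomial counts; it is not copied from the paper's Eq. (1), and the function names and coefficients are hypothetical.

```python
def linear_estimate(coef, f, d):
    """Evaluate sum_jk c_jk * f_jk / D."""
    return sum(c * f.get(jk, 0) for jk, c in coef.items()) / d


def multinomial_delta_variance(coef, f, d):
    """Variance of sum_jk c_jk * f_jk / D when f | D is multinomial(D, f / D).

    Uses Var(sum c f) = sum c^2 f - (sum c f)^2 / D, then divides by D^2.
    Cells without a coefficient are treated as having coefficient zero.
    """
    s1 = sum(coef.get(jk, 0.0) ** 2 * n for jk, n in f.items())
    s2 = sum(coef.get(jk, 0.0) * n for jk, n in f.items())
    return (s1 - s2 ** 2 / d) / d ** 2


# Hypothetical data: 3 shared species in cell (1, 1), 2 unshared in cell (1, 0).
f = {(1, 1): 3, (1, 0): 2}
d = 5
naive_coef = {(1, 1): 1.0}  # naive estimator D12 / D as a linear combination
print(linear_estimate(naive_coef, f, d))             # 3 / 5
print(multinomial_delta_variance(naive_coef, f, d))  # 0.6 * 0.4 / 5
```

For the naive coefficients the result reduces to the binomial form $p(1-p)/D$ with $p = D_{12}/D$, which gives a quick sanity check on the formula.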
2.2 Order Selection
Although a higher-order jackknife estimator might
reduce bias further, it is usually accompanied by a higher
variance. The bias-variance trade-off of estimators, there-
fore, is crucial to select the jackknife order. We follow
Burnham and Overton [15] to develop a sequential test
for selecting an ideal order of jackknife estimator. The
test is based on the hypothesis
(2)
for i = 1, …, 6. The difference of two adjacent estima-
tors can also be expressed:
Using the delta method described in section 2.1, the con-
ditional variance of the difference can be estimated by
We assume, under the null hypothesis H0i, that the test-
ing statistics asymptotically follow the standard normal
distribution (Burnham and Overton [15]);
Given a significance level $\alpha$, the testing of hypothesis (2) begins at
$i = 1$ to determine whether $\tilde\theta_{J1}$ and $\hat\theta_{J0}$ are
significantly different. If the p-value of $T_i$ is less than $\alpha$, we continue
by testing $H_{0,i+1}$, until the p-value of $T_{i+1}$ is greater than $\alpha$.
If the procedure stops at $i = i^*$, the estimator $\tilde\theta_{J_{i^*-1}}$ is
taken as our proposed estimator for the Jaccard index, denoted
$\tilde\theta_J = \tilde\theta_{J_{i^*-1}}$. A variance estimator for
$\tilde\theta_J$ is suggested in [15]. However, that estimated variance is usually
biased downward because it does not account for the variability of the order
selection. Instead, we suggest the non-parametric bootstrap method [22] to obtain
the variance of $\tilde\theta_{J_{i^*-1}}$.
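The stopping rule can be sketched as follows. This is an illustration under several assumptions that are not taken from the paper: each estimator is supplied as a dictionary of coefficients on the frequency counts, the variance of a difference uses the standard multinomial plug-in form, the test is two-sided, and the coefficient values in the example are invented purely to exercise the code.

```python
import math


def diff_variance(diff, f, d):
    """Plug-in multinomial variance of sum_jk diff_jk * f_jk / D, given D."""
    s1 = sum(diff.get(jk, 0.0) ** 2 * n for jk, n in f.items())
    s2 = sum(diff.get(jk, 0.0) * n for jk, n in f.items())
    return (s1 - s2 ** 2 / d) / d ** 2


def two_sided_p(t):
    """Two-sided p-value of t under the standard normal distribution."""
    return math.erfc(abs(t) / math.sqrt(2.0))


def select_order(coefs, f, d, alpha=0.10):
    """Return the selected order; coefs[i] holds the order-i coefficients."""
    for i in range(1, len(coefs)):
        cells = set(coefs[i]) | set(coefs[i - 1])
        diff = {jk: coefs[i].get(jk, 0.0) - coefs[i - 1].get(jk, 0.0)
                for jk in cells}
        est = sum(c * f.get(jk, 0) for jk, c in diff.items()) / d
        var = diff_variance(diff, f, d)
        if var <= 0.0:
            return i - 1  # adjacent estimators coincide: stop here
        if two_sided_p(est / math.sqrt(var)) > alpha:
            return i - 1  # no significant change: keep the lower order
    return len(coefs) - 1


# Hypothetical frequency counts (D = 200) and hypothetical coefficients.
f = {(1, 1): 40, (2, 2): 60, (1, 0): 50, (0, 1): 50}
d = 200
coefs = [
    {(1, 1): 1.0, (2, 2): 1.0},    # order 0: naive estimator
    {(1, 1): 2.0, (2, 2): 1.0},    # order 1 (invented values)
    {(1, 1): 2.05, (2, 2): 0.97},  # order 2 (invented values)
]
print(select_order(coefs, f, d, alpha=0.10))  # 1
```

In this toy input the step from order 0 to order 1 is highly significant, while the step from order 1 to order 2 is not, so the procedure stops at $i^* = 2$ and returns order 1.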
2.3 The Sørensen Index
In addition to the Jaccard index, the Sørensen index is also commonly used. The
Sørensen similarity index is defined as $\theta_S = 2M_{12}/(M_1 + M_2)$. Let the
naive estimator of the Sørensen index be $\hat\theta_{S0} = 2D_{12}/(D_1 + D_2)$,
and regard it as the initial estimator for the two-sample jackknife procedure.
Following the steps in section 2.1, the order 1 and 2 jackknife estimates, which
ignore the terms in $1/(D_1 + D_2)^2$ and $1/(D_1 + D_2)^3$, are
(3)
In Appendix A, we also provide the explicit formulae of
order 1 to 6.
Rewrite Eq. (3) as
$\tilde\theta_{S_i} = \hat\theta_{S0} + \sum_{j>0}\sum_{k>0} d_{jk}^{(i)} f_{jk}/(D_1 + D_2 - 1)$,
where $d_{jk}^{(i)}$ is the coefficient corresponding to $f_{jk}$. Given $D$, the
estimated variance of $\tilde\theta_{S_i}$ can be approximated by the delta method
mentioned in section 2.1. The difference between two adjacent estimators is
$\tilde\theta_{S_i} - \tilde\theta_{S_{i-1}} = \sum_{j>0}\sum_{k>0} (d_{jk}^{(i)} - d_{jk}^{(i-1)}) f_{jk}/(D_1 + D_2 - 1)$.
Following the sequential hypothesis testing in section 2.2, we can determine an
ideal order and use the corresponding estimator as the final Sørensen estimator.
3. Similarity Indices for Incidence Data
It is common to collect data by a quadrat sampling design in a field survey,
especially in a forest community. Instead of counting the exact abundance of each
species in a quadrat, only the presence (1) or absence (0) of each species is
recorded at each sampled quadrat. For example, if the $i$th species is observed in
the $j$th quadrat, we set $z_{ij} = 1$, and otherwise $z_{ij} = 0$. Let $t_1$ be
the number of sampled quadrats in community I, and let $X_i^*$ be the number of
quadrats in which the $i$th species was present (i.e.
$X_i^* = \sum_{j=1}^{t_1} z_{ij}$). Let $X^* = (X_1^*, \ldots, X_{M_1}^*)$ be the
collected data. Let $n_1^* = \sum_{i=1}^{M_1} X_i^*$; then $X^*$ follows the
multinomial distribution with cell probabilities
$(E(X_1^*)/n_1^*, \ldots, E(X_{M_1}^*)/n_1^*)$ when $n_1^*$ is given. Similarly,
let $t_2$ be the number of sampled quadrats and
$Y^* = (Y_1^*, \ldots, Y_{M_2}^*)$ be the sampling data of community II. Let
$n_2^* = \sum_{i=1}^{M_2} Y_i^*$; then $Y^*$ also follows the multinomial
distribution with size $n_2^*$ and cell probabilities
$(E(Y_1^*)/n_2^*, \ldots, E(Y_{M_2}^*)/n_2^*)$.
Let $D$ be the total number of observed species in the two samples, and let
$f_{jk}$ represent the number of shared species detected in exactly $j$ quadrats in
sample I and exactly $k$ quadrats in sample II. Following the same jackknife
algorithm as in section 2.1, we find that the jackknife estimators derived from the
quadrat sampling design are the same as those shown in Appendix A. Hence, the
sequential testing criterion in section 2.2 is also recommended for selecting the
jackknife order for the Jaccard and Sørensen indices under the quadrat sampling
design.
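A minimal sketch of how quadrat records turn into the incidence counts used above, assuming each community's data are stored as a species-by-quadrat 0/1 matrix whose rows are aligned on one common species list. The function names and the toy matrices are hypothetical.

```python
from collections import Counter


def incidence_counts(z):
    """X*_i: number of quadrats in which species i was present."""
    return [sum(row) for row in z]


def shared_frequencies(x_star, y_star):
    """f[(j, k)]: shared species seen in exactly j quadrats of I and k of II."""
    return Counter((xs, ys) for xs, ys in zip(x_star, y_star)
                   if xs > 0 and ys > 0)


# Hypothetical records: 4 species, t1 = 3 quadrats in I, t2 = 2 quadrats in II.
z1 = [[1, 0, 1], [0, 0, 0], [1, 1, 1], [0, 1, 0]]
z2 = [[1, 1], [0, 1], [0, 0], [1, 0]]
x_star = incidence_counts(z1)  # [2, 0, 3, 1]
y_star = incidence_counts(z2)  # [2, 1, 0, 1]
n1_star = sum(x_star)          # 6
print(x_star, y_star, n1_star)
print(shared_frequencies(x_star, y_star))
```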
4. Simulation Study
4.1 Artificial Populations
A simulation study is conducted to examine the difference between the proposed
jackknife estimators and the original estimators under different scenarios, and to
investigate the performance of the selected jackknife estimators
($\tilde\theta_J$ and $\tilde\theta_S$). In order to test the performance of the
two-sample jackknife procedure under various data structures, we consider three
communities, each consisting of 500 species, with the discovery probabilities as
follows:
Community 1: the $P_i$'s are independently generated from the uniform distribution
on the range from 0 to 0.5.
Community 2: $P_i = 0.01$, $i = 1, \ldots, 100$; $P_i = 0.02$,
$i = 101, \ldots, 200$; $P_i = 0.15$, $i = 201, \ldots, 300$; $P_i = 0.56$,
$i = 301, \ldots, 400$; $P_i = 0.98$, $i = 401, \ldots, 500$.
Community 3: $P_i = 4/(511 - i)$, $i = 1, \ldots, 500$.
The coefficient of variation (CV) for these three
communities is 0.58, 1.09 and 1.45, respectively. We
consider six possible combinations from the three com-
munities, namely, 1 vs. 1, 1 vs. 2, …, 3 vs. 3. The num-
bers of sampling quadrats are set as t1 = t2 = 50. For each
combination, we assume that the first 100, 250, and 400
species are the shared species, and generate 1000 data-
sets for each scenario.
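The three communities can be generated as in the sketch below. Two points are assumptions rather than statements from the paper: the CV is computed with the population standard deviation, and a species is taken to be detected independently in each quadrat with probability $P_i$, so that its incidence count is binomial. The function names and the seed are hypothetical.

```python
import random
import statistics

M = 500  # species per community, as in the text


def community_probabilities(kind, rng):
    """Discovery probabilities P_1, ..., P_500 for community 1, 2 or 3."""
    if kind == 1:
        return [rng.uniform(0.0, 0.5) for _ in range(M)]
    if kind == 2:
        levels = [0.01, 0.02, 0.15, 0.56, 0.98]
        return [levels[i // 100] for i in range(M)]
    if kind == 3:
        return [4.0 / (511 - i) for i in range(1, M + 1)]
    raise ValueError("kind must be 1, 2 or 3")


def coefficient_of_variation(p):
    """Population standard deviation divided by the mean."""
    return statistics.pstdev(p) / statistics.mean(p)


def sample_incidence(p, t, rng):
    """Incidence counts over t quadrats (assumed independent detections)."""
    return [sum(rng.random() < pi for _ in range(t)) for pi in p]


rng = random.Random(0)
for kind in (1, 2, 3):
    p = community_probabilities(kind, rng)
    print(kind, round(coefficient_of_variation(p), 2))

counts = sample_incidence(community_probabilities(3, rng), 50, rng)
```

Communities 2 and 3 are deterministic, so their CVs can be compared directly with the reported 1.09 and 1.45; the CV of community 1 varies slightly around its theoretical value of about 0.58 from one random draw to the next.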
The averaged differences between the proposed jackknife estimators
($\tilde\theta_{J1}$, $\tilde\theta_{J2}$, $\tilde\theta_{S1}$,
$\tilde\theta_{S2}$) and the original jackknife estimators ($\hat\theta_{J1}$,
$\hat\theta_{J2}$, $\hat\theta_{S1}$, $\hat\theta_{S2}$) are summarized in Table 1.
All the proposed jackknife estimators are higher than the original estimators. The
excess is quite small in all of the cases considered. A noticeable difference
occurs in case 3 vs. 3, which has a higher CV than the other five cases. Since the
difference increases as the number of shared species increases, we suggest using a
small number of shared species to assess the performance of the selected jackknife
estimators.
In the simulation, numbers of shared species from 120 to 400 have been considered.
Moreover, the number of sampled quadrats ranges from 10 to 100. Due to the word
count limit, we only report the results for 120 shared species with sampling size
50, because the conclusions are similar. Figure 1 presents the means of the naive
estimators, the first-order to fourth-order jackknife estimators, and the selected
estimators for the Jaccard and Sørensen indices. In Figure 1, the x-axis
corresponds to the sequence of six combinations. The horizontal dotted lines in the
figure indicate the true values, 0.1364 and 0.24, for the Jaccard and Sørensen
indices, respectively. Furthermore, the root mean square error (RMSE) of the
selected jackknife estimator does not vary much for significance levels from 5% to
10%. Therefore, we only report the means of the selected estimators in Figure 1 at
the 10% significance level.
In Figure 1, the naive estimators significantly underestimate the true values. The
jackknife estimators of the first three orders also underestimate; however, the
bias decreases as the order increases. The fourth-order jackknife estimators
($\tilde\theta_{J4}$ and $\tilde\theta_{S4}$) overestimate; however, their absolute
biases are still smaller than those of the naive estimators. We then find that the
selected jackknife estimator lies between the third and fourth jackknife estimators
in all cases. Although the selected jackknife estimators have the smallest bias,
they are not the ideal choice in terms of RMSE, as shown in Figure 2, because the
selection procedure introduces extra variation. The third-order jackknife
estimators ($\tilde\theta_{J3}$ and $\tilde\theta_{S3}$) have the smallest RMSE in
almost all the cases considered. Hence, the third-order jackknife estimators are
also recommended as a simplified choice for application in ecological studies.
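The true values quoted for the simulation, and the bias and RMSE criteria used to compare the estimators, can be written out as a short sketch. The function names are hypothetical; the input of 500 species per community with 120 shared species comes from the reported scenario.

```python
import math


def true_indices(m1, m2, m12):
    """True Jaccard and Sørensen indices from the species numbers."""
    jaccard = m12 / (m1 + m2 - m12)
    sorensen = 2 * m12 / (m1 + m2)
    return jaccard, sorensen


def bias(estimates, truth):
    """Mean of the simulated estimates minus the true value."""
    return sum(estimates) / len(estimates) - truth


def rmse(estimates, truth):
    """Root mean square error of the simulated estimates."""
    return math.sqrt(sum((e - truth) ** 2 for e in estimates) / len(estimates))


theta_j, theta_s = true_indices(500, 500, 120)
print(round(theta_j, 4), theta_s)  # 120 / 880 and 240 / 1000
```

Since RMSE squared equals bias squared plus variance, an estimator with the smallest bias can still lose on RMSE when its variance is larger, which is the pattern the text reports for the selected estimator.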
4.2 Real Populations
There are two protected forest plots in Panama considered here: the Sherman plot
and the Cocoli plot. General survey data for these two forests can be found on the
website of the Center for Tropical Forest Science. Various studies have been
conducted on their distinguishing characteristics [23]. The Sherman forest is
located in the San Lorenzo National Park, in the tropical moist forest along the
Caribbean coast. It is L-shaped and covers a surface area of 5.96 ha (a 400 m ×
100 m rectangle and a 140 m × 140 m square). A census of species was taken three
times, in January 1996, December 1997, and February 1999. The Cocoli plot is
located on the Pacific Ocean side of the Panama Canal. It covers a surface area of
4 ha and is also L-shaped (a 300 m × 100 m rectangle and a 100 m × 100 m square).
Its census of species was taken in November of 1994, 1997 and 1998. The distance
between these two forests is 58.8 km, and the 1997 census data are available for
both. There are 50 shared species in these forests. The dataset selected for this
study includes tree species with a diameter at breast height greater than 10 mm.
The basic characteristics of the two forests are summarized in Table 2.
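The plot areas, the quadrat counts and the index values reported in Table 2 follow from the plot dimensions and species counts by simple arithmetic, as the sketch below shows. The function names are hypothetical; the species counts (224 and 170, with 50 shared) and the 5 m quadrat side are those reported in Table 2 and in the text.

```python
def plot_area_m2(rectangles):
    """Total area in square metres of an L-shaped plot given as (width, height) pairs."""
    return sum(w * h for w, h in rectangles)


def quadrat_count(area_m2, side_m=5):
    """Number of side_m x side_m quadrats that tile the plot."""
    return area_m2 // (side_m * side_m)


sherman = plot_area_m2([(400, 100), (140, 140)])  # 59600 m^2 = 5.96 ha
cocoli = plot_area_m2([(300, 100), (100, 100)])   # 40000 m^2 = 4 ha
print(sherman / 10_000, quadrat_count(sherman))   # 5.96 2384
print(cocoli / 10_000, quadrat_count(cocoli))     # 4.0 1600

m1, m2, m12 = 224, 170, 50
print(round(m12 / (m1 + m2 - m12), 4))  # Jaccard: 0.1453
print(round(2 * m12 / (m1 + m2), 4))    # Sørensen: 0.2538
```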
Figure 1. The averaged values of the estimators for the Jaccard and Sørensen
indices. The plotted series are $\hat\theta_{J0}$, $\tilde\theta_{J1}$,
$\tilde\theta_{J2}$, $\tilde\theta_{J3}$, $\tilde\theta_{J4}$, and
$\tilde\theta_J$; the same symbols are used for the Sørensen index.
Figure 2. The RMSE of the estimators for the Jaccard and Sørensen indices. The
plotted series are $\hat\theta_{J0}$, $\tilde\theta_{J1}$, $\tilde\theta_{J2}$,
$\tilde\theta_{J3}$, $\tilde\theta_{J4}$, and $\tilde\theta_J$; the same symbols
are used for the Sørensen index.
As the surface area covered by the two forests is very small, a 5 m × 5 m quadrat
size with five sampling proportions (2%, 4%, 6%, 10%, and 20%) is examined. For
each sampling proportion, 1000 random repetitions are carried out, and 100
bootstrap replications are used to calculate the standard error of the selected
jackknife estimator in each repetition. Due to the word count limit, we only report
the sampling proportions 2%, 4%, and 10%. The jackknife estimators of the first six
orders for the Jaccard and Sørensen indices, and the selected estimators at the 5%
and 10% significance levels, are assessed in this study. However, we find that most
testing procedures stop before the fourth jackknife estimator. Hence, the first
four estimators, and the selected estimator at the 10% significance level, are
reported in Tables 3 and 4, where $\sigma$ denotes the mean of the standard error
and $\hat\sigma$ denotes the mean of the estimated standard error.
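The bootstrap standard error mentioned above can be sketched as follows. The paper does not spell out the resampling unit here, so this sketch assumes that sampled quadrats are resampled with replacement within each community; the estimator plugged in is the naive Jaccard estimator purely for illustration, and all names and toy data are hypothetical.

```python
import random
import statistics


def naive_jaccard_from_quadrats(quadrats_1, quadrats_2):
    """Naive Jaccard estimate from two lists of per-quadrat species sets."""
    seen_1 = set().union(*quadrats_1)
    seen_2 = set().union(*quadrats_2)
    return len(seen_1 & seen_2) / len(seen_1 | seen_2)


def bootstrap_se(quadrats_1, quadrats_2, estimator, n_boot=100, rng=None):
    """Standard deviation of the estimator over quadrat-level bootstrap samples."""
    rng = rng or random.Random()
    replicates = []
    for _ in range(n_boot):
        # Resample quadrats with replacement, separately for each community.
        boot_1 = [rng.choice(quadrats_1) for _ in quadrats_1]
        boot_2 = [rng.choice(quadrats_2) for _ in quadrats_2]
        replicates.append(estimator(boot_1, boot_2))
    return statistics.stdev(replicates)


# Hypothetical quadrat records: each set lists the species found in one quadrat.
q1 = [{"a", "b"}, {"b"}, {"a", "c"}, {"d"}]
q2 = [{"b", "e"}, {"c"}, {"e", "f"}, {"b"}]
se = bootstrap_se(q1, q2, naive_jaccard_from_quadrats,
                  n_boot=100, rng=random.Random(2011))
print(se)
```

To obtain the standard error of the selected jackknife estimator, the whole procedure, including the order selection, would be passed as `estimator`, so that the selection variability is reflected in the replicates.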
In general, sampling without replacement is more suitable than sampling with
replacement in the case of sedentary species such as plants. As the maximum
sampling proportion is 20% in our setting, the data sampled with and without
replacement are similar. As observed in Tables 3 and 4, the number of shared
species is low for small sampling sizes (less than 10%). Therefore, the means of
the naive estimators significantly underestimate the Jaccard and Sørensen indices.
The proposed jackknife estimators and the selected estimators are effective in
reducing bias. The jackknife estimators perform better as the sampling size becomes
larger, and perform best at the 10% sampling size. The estimated variance of the
selected estimates is slightly underestimated, which lowers the reliability of
$\tilde\theta_J$ and $\tilde\theta_S$ in terms of the 95% coverage rate.

Table 2. Several characteristics of the Sherman and Cocoli forests

                              Sherman              Cocoli
Location                      9°21' N, 79°57' W    8°58' N, 79°35' W
Size of plot                  5.96 ha              4 ha
No. of species                224                  170
No. of individuals            21799                8288
No. of 5 m × 5 m quadrats     2384                 1600
No. of shared species         50
Jaccard index                 0.1453
Sørensen index                0.2538
The averages of the selected estimators always lie between the third-order and the
fourth-order jackknife estimators. For $\tilde\theta_J$, the mean is close to the
mean of the third order, and the RMSE is close to that of the fourth order. In
terms of RMSE, $\tilde\theta_{J3}$ has the smallest value in almost all cases.
Thus, $\tilde\theta_{J3}$ is also recommended for the Jaccard index because of its
simple application. Although $\tilde\theta_{S2}$ has the smallest RMSE in Table 4,
its bias tends to be higher than that of $\tilde\theta_{S3}$. Therefore, for
simplicity, we recommend $\tilde\theta_{S3}$ for the Sørensen index.
5. Discussion
This study presented a new procedure based on the two-sample jackknife to estimate
the Jaccard and Sørensen indices for both abundance datasets and quadrat datasets.
A sequential testing criterion for selecting a proper order among the jackknife
estimators is also suggested. Heltshe [16] proposed the two-sample jackknife for
estimating the Jaccard index by using the SMC estimator; however, our findings
reveal that $\hat\theta_{SMC}$ is unsuitable for this dataset.
Heltshe and Forrester [14] pointed out that the jackknife estimators are sensitive
to the sampling size. The two-sample jackknife has a similar problem. We find that
the jackknife estimators always underestimate when the sampling size is small. As
the sampling size increases, the performance of the jackknife estimators improves.
In addition, the relative abundance distribution (Mouillot and Lepretre [24]) also
affects the performance of the jackknife estimators. For the six combinations
listed in section 4.1, we consider abundant shared species in one community vs.
rare shared species in the other community. In most cases, the selected jackknife
estimator still performs very well, except for the case 3 vs. 3, which has a higher
CV than the other cases. Although the selected jackknife estimator overestimates in
the case of 3 vs. 3, it has smaller bias and RMSE than the naive estimator.
The criterion of the sequential hypothesis test for identifying the most suitable
order of the estimator is based on the work of Burnham and Overton [15], but it
introduces extra variation through the selection procedure. Hence, the bootstrap
method is appropriate for estimating the variance of the selected estimator.
However, the drawback of the bootstrap is that it is time-consuming. Therefore,
further investigation is required into estimating the true variation.
A number of similarity indices have not been covered in this paper, including the
Kulczynski index, the Morisita-Horn index, and the Bray-Curtis index. Comparisons
between similarity indices could provide insights into which index is the most
appropriate and which is inappropriate. In addition, future studies should be
conducted on how to extend the jackknife procedure to similarity indices for
multiple communities.
Appendix A. Jackknife Estimators $\tilde\theta_{J_i}$ and $\tilde\theta_{S_i}$ for i = 1, …, 6
Define a coefficient matrix, which depends on the number of sampled individuals n,
as
Furthermore, the frequencies $f_{jk}$, for j, k = 1, 2, 3, are reorganized into the
frequency matrix
The formulae of the jackknife estimators are summarized as follows:
References
[1] Lynch, M., “The Similarity Index and DNA Fingerprinting,” Molecular Biology and Evolution, Vol. 7, pp. 478–484 (1990).
[2] Tan, P.-N., Steinbach, M. and Kumar, V., Introduction to Data Mining, Addison-Wesley (2005).
[3] Hubalek, Z., “Coefficients of Association and Similarity, Based on Binary (Presence-Absence) Data: An Evaluation,” Biological Reviews, Vol. 57, pp. 669–689 (1982).
[4] Chao, A., Chazdon, R. L., Colwell, R. K. and Shen, T.-J., “Abundance-Based Similarity Indices and Their Estimation When There are Unseen Species in Samples,” Biometrics, Vol. 62, pp. 361–371 (2006).
[5] Magurran, A. E., Ecological Diversity and Its Measurement, Princeton University Press (1988).
[6] Magurran, A. E., Measuring Biological Diversity, Wiley-Blackwell (2004).
[7] Jaccard, P., “Lois De Distribution Florale Dans La Zone Alpine,” Bulletin Societe Vaudoise Sciences Naturelles, Vol. 38, pp. 67–130 (1902).
[8] Sørensen, T., “A Method of Establishing Groups of Equal Amplitude in Plant Sociology Based on Similarity of Species and Its Application to Analyses of the Vegetation on Danish Commons,” Biologiske Skrifter / Kongelige Danske Videnskabernes Selskab, Vol. 5, pp. 1–34 (1957).
[9] Higgs, A. J. and Usher, M. B., “Should Nature Reserves Be Large or Small?” Nature, Vol. 285, pp. 568–569 (1980).
[10] Legendre, P. and Legendre, L., Numerical Ecology, 2nd ed., Elsevier Science (1998).
[11] Boyce, R. L. and Ellison, P. C., “Choosing the Best Similarity Index When Performing Fuzzy Set Ordination on Binary Data,” Journal of Vegetation Science, Vol. 12, pp. 711–720 (2001).
[12] Quenouille, M. H., “Notes on Bias in Estimation,” Biometrika, Vol. 43, pp. 353–360 (1956).
[13] Schucany, W. R., Gray, H. L. and Owen, D. B., “On Bias Reduction in Estimation,” Journal of the American Statistical Association, Vol. 66, pp. 524–533 (1971).
[14] Heltshe, J. F. and Forrester, N. E., “Estimating Species Richness Using the Jackknife Procedure,” Biometrics, Vol. 39, pp. 1–11 (1983).
[15] Burnham, K. P. and Overton, W. S., “Estimation of the Size of a Closed Population When Capture Probabilities Vary Among Animals,” Biometrika, Vol. 65, pp. 625–633 (1978).
[16] Heltshe, J. F., “Jackknife Estimate of the Matching Coefficient of Similarity,” Biometrics, Vol. 44, pp. 447–460 (1988).
[17] Kaitala, S., Maximov, V. N. and Niemi, A., “A Simple Approach to Estimate Similarity in Ecosystem Analysis,” Plant Ecology, Vol. 92, pp. 101–112 (1991).
[18] Yue, J.-C. and Clayton, M. K., “A Similarity Measure Based on Species Proportions,” Communications in Statistics - Theory and Methods, Vol. 34, pp. 2123–2131 (2005).
[19] Severiano, A., Carrico, J. A., Robinson, D. A., Ramirez, M. and Pinto, F. R., “Evaluation of Jackknife and Bootstrap for Defining Confidence Intervals for Pairwise Agreement Measures,” PLoS ONE, Vol. 6, pp. 1–11 (2011).
[20] Arvesen, J. N., “Jackknifing U-Statistics,” The Annals of Mathematical Statistics, Vol. 40, pp. 2076–2100 (1969).
[21] Schechtman, E. and Wang, S., “Jackknifing Two-Sample Statistics,” Journal of Statistical Planning and Inference, Vol. 119, pp. 329–340 (2004).
[22] Chao, A., Hwang, W.-H., Chen, Y.-C. and Kuo, C.-Y., “Estimating the Number of Shared Species in Two Communities,” Statistica Sinica, Vol. 10, pp. 227–246 (2000).
[23] Condit, R., Watts, K., Bohlman, S. A., Perez, R., Hubbell, S. P. and Foster, R. B., “Quantifying the Deciduousness of Tropical Forest Canopies under Varying Climates,” Journal of Vegetation Science, Vol. 11, pp. 649–658 (2000).
[24] Mouillot, D. and Lepretre, A., “Introduction of Relative Abundance Distribution (RAD) Indices, Estimated from the Rank-Frequency Diagrams (RFD), to Assess Changes in Community Diversity,” Environmental Monitoring and Assessment, Vol. 63, pp. 279–295 (2000).
Manuscript Received: Aug. 8, 2011
Accepted: Nov. 14, 2011