Time vicariance by geology (Density covariation) and water boundary (State Space).

Multi-Model Panbiogeography

Nov 25, 2015


Brad McFall

Some notes to develop multiple model inference in panbiogeography.


Borchers and Burnham remark, with respect to a particular use of maximum likelihood estimation (page 12): "The fundamental idea behind maximum likelihood estimation is very straightforward: given an equation quantifying the probability of what was observed (the data) as a function of some unknown parameters, the maximum likelihood estimates are those parameter values at which this function is a maximum."

1) The simplest exemplar of this technique is to imagine the likelihood of obtaining, say, 6 heads out of 12 flips of a coin. For a coin with probability $p$ of heads, that likelihood is $L(p) = \binom{12}{6} p^{6} (1-p)^{6}$.

2) In mark-recapture studies the probabilities are associated with each data/observation event, which consists of whether a particular marked individual was seen or not at different times and/or in different places. Given the events recorded, the maximum likelihood estimates are those parameter values that maximize the probability of obtaining those records.

3) In panbiogeography the data are the collection localities of individuals of different species, subject to change in time that results in a particular placement of records in current time. One wants to find out how the distributions reflect changes in divisions through past time and translations across the entire space. The divisions maximally represent phylogenetic trajectories and minimally may present detectable pre-speciation events that may or may not be an object of vicariance. The generalized track may be viewed as a graphical model of these past changes.
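The coin exemplar can be worked through numerically. What follows is a minimal sketch, assuming a plain grid search over the unknown probability of heads; the function name `coin_likelihood` and the grid resolution are illustrative choices and do not come from the notes.

```python
from math import comb

def coin_likelihood(p, heads=6, flips=12):
    """Binomial probability of `heads` heads in `flips` flips when P(heads) = p."""
    return comb(flips, heads) * p**heads * (1 - p)**(flips - heads)

# Candidate parameter values; the step of 0.001 is an assumption for illustration.
grid = [i / 1000 for i in range(1001)]

# The maximum likelihood estimate is the candidate at which the likelihood peaks.
p_hat = max(grid, key=coin_likelihood)

print("grid MLE of p:", p_hat)
print("likelihood at p = 0.5:", coin_likelihood(0.5))
```

The same pattern (write the probability of the observed data as a function of the parameters, then search for the maximizing values) is what the mark-recapture and panbiogeographic cases would need, only with a more elaborate likelihood.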

Space vicariance by compounded tracks in community.

How the individual tracks compose the generalized track provides parameters to the model of how the generalized track fits the observed data. Inference by model averaging provides a way to use biogeography to suggest phylogenetic divisions and thus synthesize space, time and form.

[Figure: panels Model B, Model A, Model C]

The number of forms per track offers output useful in the study of metacommunity, landscape density effects, and the meaning of ecosystem connectivity.

So the parameters of the generalized track (two tracks and one node vs. one track vs. four nodes and three tracks, say) permit parametric inference about possible clade node seriality. One has a generalized track state space, which may or may not be correlated to geology, and a search encounter composite track made of track parts, nodes and masses.

Thus one needs to write an equation that gives the probability of observing collection localities given the different models, which are made up of different generalized track parts. The probability of finding a collection locality around each line track part is like distance sampling, with the probability of detection falling off with distance from the track. The distance function is modified by the track part the locality lies around (line or node) or in (mass). The mass effect is equivalent to having the probability of detection decrease more slowly with distance from the search encounter composite track formation.

The density of individual tracks (data) per generalized track can correlate with geological variation to provide the divisions for a given number of forms per track, which implies an ecosystem-level approximation effective through metacommunities in space.

So the ontology of the past biogeography in the model gets the divisions geologically and the space changes biotically, and through this multimodel inference offers a specific synthesis of space, time and form.

Here we show how to create the probabilities by separating the density covariation from the fixed vs. random number of N per generalized track, and from the spatial metacommunity connectivity that links multiple generalized tracks independent of the temporal divisions.

Least squares is one way to regress a set of points to a model line. Here the line is the generalized track, but since this track is not simply a straight line, the regression onto the model track is not homogeneous as in the least-squares sense but depends on the number of nodes and the angles the nodes make to the component individual tracks. Each track part regresses the points around it to the line or circle, but the data in between are also regressed relative to the adjacent generalized track part (node to line, line to node). Since the metacommunity binds multiple generalized tracks as output, the mass parameter further modifies the regression to effect spatial difference independent of the covariation that geology provides through the density of individual tracks (regardless of nodes).
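The distance-sampling analogy for a single line track part can be sketched as follows. This is only an illustration under stated assumptions: the half-normal shape of the detection function, the way `mass` multiplies the scale `sigma`, and all names and coordinates are hypothetical and are not specified in the notes.

```python
from math import exp, hypot

def distance_to_segment(point, a, b):
    """Shortest planar distance from `point` to the segment a-b (a line track part)."""
    (px, py), (ax, ay), (bx, by) = point, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        # Degenerate segment: both ends coincide, so treat it as a single node.
        return hypot(px - ax, py - ay)
    # Project the point onto the line, then clamp the projection to the segment.
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return hypot(px - (ax + t * dx), py - (ay + t * dy))

def detection_probability(distance, sigma=1.0, mass=1.0):
    """Half-normal detection function (an assumed form).

    A `mass` above 1 widens the scale, so detection falls off more slowly.
    """
    scale = sigma * mass
    return exp(-distance**2 / (2 * scale**2))

# Hypothetical track part and collection locality, in arbitrary map units.
track_part = ((0.0, 0.0), (10.0, 0.0))
locality = (5.0, 2.0)

d = distance_to_segment(locality, *track_part)
print("distance to track:", d)
print("detection without mass effect:", detection_probability(d))
print("detection with mass effect:", detection_probability(d, mass=2.0))
```

A node could be handled with the same detection function by using the distance to a point instead of to a segment; how neighbouring parts should modify one another is left open here, as it is in the text.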

In linear regression, data are modeled using linear predictor functions, and unknown model parameters are estimated from the data. Such models are called linear models.

F(plethodon) = b0 + b(pleth) + error
F(eurycea) = b0 + b(eury) + error

Suppose there are n data points {(xi, yi), i = 1, ..., n}. The goal is to find the equation of the straight line

$y = b_0 + b x$

which would provide a "best" fit for the data points. Here the "best" will be understood as in the least-squares approach: a line that minimizes the sum of squared residuals of the linear regression model. But here (where the model is a single straight line track) we do not simply find the line which is the best through the geographic collections per species; we supply the line and determine what error is required to fit the particular species data to that line.
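Supplying the line and measuring the error needed to fit each species to it can be sketched as below. The coordinates, the two track lines, and the use of vertical residuals are assumptions made purely for illustration; none of these values come from the notes.

```python
def residual_sum_of_squares(points, intercept, slope):
    """Sum of squared vertical residuals of `points` about a supplied line."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in points)

# Hypothetical collection localities (x, y) for two genera.
eurycea = [(1.0, 2.1), (2.0, 2.9), (3.0, 4.2), (4.0, 4.8)]
plethodon = [(1.0, 0.2), (2.0, 2.5), (3.0, 1.1), (4.0, 3.9)]

# Supplied (not fitted) single-track model lines; both are made-up examples.
upper_track = {"intercept": 1.0, "slope": 1.0}
lower_track = {"intercept": -1.0, "slope": 1.0}

rss_eurycea = residual_sum_of_squares(eurycea, **upper_track)
rss_plethodon = residual_sum_of_squares(plethodon, **lower_track)

print("error needed for Eurycea on the upper track:", rss_eurycea)
print("error needed for Plethodon on the lower track:", rss_plethodon)
```

With these invented points, the smaller residual sum marks the track that describes its species more closely; real localities would need geographic distances rather than vertical offsets.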

Thus the upper track line is a better model for Eurycea than the lower is for Plethodon (given these data). Maximum likelihood determines where this line should go given the data, since for this model the probability of a point dropping off away from the line is the same regardless of where the data are relative to the model (in other node-line-mass models this may not be the case). We use AIC to select the best model for the data, and model averaging to enable inference from the panbiogeographic parts to phylogenetic divisions.
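The AIC selection and model-averaging step can be illustrated with Akaike weights. This is a sketch under assumptions: it uses the least-squares form of AIC (valid for Gaussian errors, up to an additive constant), and the residual sums, parameter counts, and sample size for Models A, B and C are invented for the example.

```python
from math import exp, log

def aic_least_squares(rss, n, k):
    """AIC for a least-squares fit with Gaussian errors, up to an additive constant.

    `k` counts the estimated parameters, including the error variance.
    """
    return n * log(rss / n) + 2 * k

def akaike_weights(aic_values):
    """Relative support for each model, normalized so the weights sum to one."""
    best = min(aic_values)
    relative = [exp(-(a - best) / 2) for a in aic_values]
    total = sum(relative)
    return [r / total for r in relative]

# Hypothetical (residual sum of squares, parameter count) per generalized-track model.
candidates = {"A": (3.2, 3), "B": (2.1, 5), "C": (2.0, 8)}
n_localities = 20  # assumed number of collection localities

aics = {name: aic_least_squares(rss, n_localities, k)
        for name, (rss, k) in candidates.items()}
weights = dict(zip(aics, akaike_weights(list(aics.values()))))
best_model = min(aics, key=aics.get)

print("AIC per model:", aics)
print("Akaike weights:", weights)
print("lowest-AIC model:", best_model)
```

Model-averaged inference would then weight each model's estimate of a shared quantity by its Akaike weight, instead of relying on the single lowest-AIC model.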

So there is a three-way regression analysis of variance (line-node-line) or (node-line-node) per mass (four-way regression) that is subject to density covariation. This is one way to do axiomatic panbiogeography.

A maximum likelihood estimator coincides with the most probable Bayesian estimator given a uniform prior distribution on the parameters. Indeed, the maximum a posteriori estimate is the parameter θ that maximizes the probability of θ given the data, given by Bayes' theorem.

Thus for single-track-part models the Bayesian and maximum likelihood estimates are equivalent, provided there is no influence of the state space geometry on the distributions. Because the notion of track width is undefined, it is possible to see both statistical points of view as the same in the most fundamental example of Croizat's method. The common intercept disappears in this more general modeling environment.

Maximum Likelihood Principle

The method of maximum likelihood chooses as estimates those values of the parameters that are most consistent with the sample data. Thus the maximum likelihood estimates for the single-track models are the values of beta (intercept, slope, and error) that give rise to the best fit to the data.
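The claim that the maximum likelihood and maximum a posteriori estimates coincide under a uniform prior follows in a few lines from Bayes' theorem. The symbols below ($\theta$ for the parameters, $x$ for the data, $c$ for the constant prior density) are notation introduced here for the sketch.

```latex
\begin{aligned}
\hat{\theta}_{\mathrm{MAP}}
  &= \arg\max_{\theta}\; p(\theta \mid x)
   = \arg\max_{\theta}\; \frac{p(x \mid \theta)\, p(\theta)}{p(x)} \\
  &= \arg\max_{\theta}\; p(x \mid \theta)\, p(\theta)
     && \text{since } p(x) \text{ does not depend on } \theta \\
  &= \arg\max_{\theta}\; p(x \mid \theta)
     && \text{if } p(\theta) = c \text{ is constant (uniform prior)} \\
  &= \hat{\theta}_{\mathrm{MLE}} .
\end{aligned}
```

The equivalence holds only while the prior is flat over the parameter range considered; any prior that depends on the state space geometry would break the last step.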

DATA: (X1, Y2,5); (X2, Y1,3,4,6); (X9, Y5); (X10, Y1,3,4,6); (X11, Y2,5); (X12, Y5); (X13, Y5)

Questions to Ask

- Is the relationship really linear?
- What is the distribution of the "errors"?
- Is the fit good?
- How much of the variability of the response is accounted for by including the predictor variable?
- Is the chosen predictor variable the best one?

Borchers and Burnham (2007). General formulation for distance sampling. In Buckland et al. (editors), Advanced Distance Sampling.