In this workshop students will be introduced to multivariate phylogenetic comparative methods with the mvMORPH R package.
The mvMORPH package contains tools for modelling the evolution of correlated continuous traits (e.g. morphometric measurements, geometric morphometric datasets, life history traits, gene expression data) on phylogenetic trees with fossil species, extant species, or both. It also provides statistical tools such as multivariate generalized least squares (GLS) linear models (e.g. multivariate regression, MANOVA, MANCOVA) for studying comparative datasets.
In this course, students will first be introduced to some theory with illustrative examples (from both simulated data and students’ own datasets) and will learn how to interpret the models and their parameters, as well as how to assess their reliability.
Participants should have a graduate or postgraduate degree in Biomedical, Life or Earth Sciences, basic knowledge of statistics, and user-level knowledge of R. Participants must have a personal computer (Windows, Mac, Linux). A webcam, headphones, and a good internet connection are strongly recommended.
We encourage participants to bring their own dataset with a matching phylogenetic tree (or sample of trees).
- Introduction to phylogenies, trait evolution and comparative methods
  - Short introduction to the modelling rationale and theoretical basis of (multivariate) trait evolution models
  - Illustration with simulation examples in R (e.g. 3D plot of bivariate processes)
- Modelling the evolution of traits on trees (simulations, model fit and comparison)
  - Review of some multivariate models (BM, OU, EB, Shift…), their assumptions and limits
  - Step-by-step procedure for model comparison and interpretation of parameters using simulated and empirical datasets
  - Hypothesis testing and constrained parameter estimation
- Working with high-dimensional datasets
  - Introduction to the challenges that high-dimensional data pose for comparative methods, i.e. when the number of traits approaches or exceeds the number of species, as in geometric morphometric and gene expression data (e.g. comparison of likelihood, penalized likelihood, and alternative techniques)
  - Model fit on high-dimensional datasets (model comparison, estimation of parameters, reconstruction of evolutionary trajectories)
- Fitting linear models (MANOVA, MANCOVA, multivariate regression) to comparative data
  - Introduction to phylogenetic linear models and their multivariate counterparts, their assumptions, and the available tests
  - Linear hypothesis testing
  - Illustration on both empirical and simulated datasets
- Using diagnostic plots, simulations, and Monte Carlo techniques to assess the reliability of model fit and parameters
  - Introduction to bootstrap and parametric bootstrap techniques, estimation of uncertainties, and assessment of relative and absolute fit to the data
- Imputing missing values and estimating ancestral states
  - Introduction to estimating missing values and ancestral states, formatting the data, etc.
- Inferring dependencies and causal links between traits evolving on trees
  - Introduction to inferring graphs of dependencies between evolving traits (e.g. partial correlations, graphical LASSO)
  - Using multivariate models to infer “causal” links (e.g. a comparative study of sexual dimorphism)
- Working on non-ultrametric trees (e.g. fossil data, virus strains)
  - Introduction to identifiability issues and the strengths and weaknesses of working with non-ultrametric trees
  - Illustration with worked examples (and students’ datasets)
- Transformations and data pre-treatments
  - Discussion of the use of pre-transformations (e.g. log-transformation) and data reduction techniques (PCA, phylogenetic PCA) in comparative datasets; measurement error and intraspecific variance
- (Digression) Modelling multivariate time series
  - Illustration of how to model multivariate traits along time series rather than phylogenetic trees in mvMORPH (inferring trends, causal links, etc.)
- Summary of the strengths and weaknesses of the various techniques, model assumptions, and alternative tools currently available in R, with some worked examples
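
For orientation, the multivariate Brownian motion (BM) model that underlies several of the topics above is often written as below. This is the standard textbook formulation, not material taken from the course itself, and the notation is an assumption made here: $Y$ is the $n \times p$ matrix of trait values for $n$ species and $p$ traits, $C$ is the $n \times n$ phylogenetic covariance matrix (shared branch lengths), $R$ is the $p \times p$ evolutionary rate matrix, and $\theta$ is the $p$-vector of ancestral states.

```latex
% Distribution of the stacked trait values under multivariate BM
\operatorname{vec}(Y) \sim \mathcal{N}\!\left(\theta \otimes \mathbf{1}_n,\; R \otimes C\right)

% Maximum-likelihood estimates of the ancestral states and the rate matrix
\hat{\theta}^{\top} = \left(\mathbf{1}_n^{\top} C^{-1} \mathbf{1}_n\right)^{-1} \mathbf{1}_n^{\top} C^{-1} Y,
\qquad
\hat{R} = \frac{1}{n}\left(Y - \mathbf{1}_n \hat{\theta}^{\top}\right)^{\top} C^{-1}\left(Y - \mathbf{1}_n \hat{\theta}^{\top}\right)
```

Because the residual matrix $Y - \mathbf{1}_n \hat{\theta}^{\top}$ has rank at most $n - 1$, $\hat{R}$ is singular whenever $p \ge n$. This is one way to see why datasets with as many traits as species, or more, call for penalized likelihood or alternative techniques.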