Preface

The material in this online book is designed to support the Primer on Phylogenetic Comparative Methods for the Biological Sciences by Natalie Cooper and Rob P Freckleton. These materials were primarily written by me (Natalie), so don’t blame Rob for any errors!

All practical exercises use R (R Core Team 2023), so some knowledge of R is required. I have provided the basics in the first chapter. The online book focuses on practical implementations of methods for the most part. For information on the theoretical underpinnings of the topics covered here please refer to the Primer.

Datasets and scripts


All datasets, trees, R scripts (as R Markdown/.Rmd files), and an R Project (.Rproj) file for each exercise are available for download as a ZIP from here. When you click this link it will take you to a website and the download should start automatically. Don’t forget to unzip this before starting. The data, trees and scripts for each practical exercise are in that exercise’s own folder.
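If you would rather unzip the download from within R, something like the sketch below should work. The file and folder names are placeholders I have made up for illustration, so change them to match whatever your download is actually called.

```r
# Hypothetical file and folder names; change these to match your download.
zip_path <- "pcm-materials.zip"
out_dir  <- "pcm-materials"

if (file.exists(zip_path)) {
  # Extract everything in the ZIP into out_dir (created if it does not exist)
  unzip(zip_path, exdir = out_dir)
  # Show what was extracted at the top level
  list.files(out_dir)
} else {
  message("Cannot find ", zip_path, " in ", getwd())
}
```

Unzipping through your operating system's file browser works just as well; this is only a convenience.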

How to use these materials

It is possible to use these materials in a number of ways.

Follow the online workbook in a web browser, copy-pasting code into an R script and then running it in the R console.
Open the .Rmd (R Markdown) file for each exercise in RStudio and use it to run the code. This allows you to run chunks of code in the script and for the results to appear below the code. You need to open the .Rmd file in the folder that contains the correct datasets for that exercise.

To run the code, you just click the little green triangles to the far right of each code chunk (grey boxes with R code in them) to run the code as shown below.

What a code chunk looks like in an RMarkdown file.

The outputs, graphs and results will all appear in the .Rmd file underneath the code, as shown below.

After you click the green triangle in the top right hand corner of the code chunk, the code runs and the outputs appear under the code chunk within the RMarkdown file.
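If a chunk fails because R cannot find a dataset or tree, it is worth checking which folder R is working in. The lines below are a minimal sketch of that check. The file extensions in the pattern are assumptions on my part, so adjust them to match the files in your exercise folder.

```r
# Print the folder R is currently working in
getwd()

# List the data and tree files R can see in that folder.
# The extensions here are assumptions; adjust them to match your files.
data_files <- list.files(pattern = "\\.(csv|nex|tre)$", ignore.case = TRUE)
data_files

if (length(data_files) == 0) {
  message("No data or tree files found here. Check that you opened the .Rmd ",
          "file from inside the exercise folder.")
}
```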

This is probably the best way to learn these methods. Note, however, that some of the formatting will look a bit weird. For example, to get RMarkdown to output the Greek letter λ (lambda) we type $\lambda$. If you want to use these files I’d recommend also opening the online workbook in a web browser so if anything looks odd you can check what it is meant to look like!

All the code and outputs are shown in the online workbook, so you can also use this as a reference and then use the code to complete the practical exercises at the end of each chapter, or to apply them to your own data. This might be a good solution if you’re using these materials to learn something specific and don’t need practice with R or PCMs.

Boxes

Throughout the book are boxes of text highlighting particularly important issues:

Information boxes. These boxes highlight important details. These boxes may also show you how to solve problems that may not affect every user.

Extra details boxes. These boxes contain detailed explanations of things for those who like to fully understand the complexities of what they are doing, for example technical details of the code that I have not explained in detail in the text.

Caveats boxes. These boxes highlight important points that need to be considered when working through your own analyses. They reveal areas where it is important to be careful and think about what you are doing and why. The image is a Jurassic Park era velociraptor to remind you of the “Jurassic Park caveat”, i.e. that just because you can perform an analysis in R doesn’t mean that you should (thanks to Dr Ian Malcolm and Dr Michael Crichton)! Always consider the question at hand, your study group, and the quality of the data you are using before embarking on a new comparative analysis.

Example datasets

I’ve tried to keep my examples to a minimum so that you have a chance to get familiar with the trees and data. As such there are just three main example datasets in this book. In each case I’ve removed a few species and a few variables to make things a bit more straightforward. If you want to use these datasets for your own work you should download the data from the publications listed to get the complete datasets.

Apologies in advance to the non-vertebrate, non-animal fans out there. To make up for it I’ve added several plant and invertebrate examples to the practical exercises at the end of each chapter. If it helps just replace the word frog with fly, snake with sponge, and marsupial with grass. It won’t alter the R code.

Frog eye size evolution

Frog eyes are really variable: two tree frogs with large eyes, and a burrowing frog and an aquatic frog with tiny eyes. Image credits: hehaden/Flickr CC BY-NC 2.0; Brian Gratwicke/Flickr CC BY 2.0; Rushen/Flickr CC BY-SA 2.0; Sue Cro/Flickr CC BY-NC 2.0

Who doesn’t love frogs? Frogs are cool. One of the coolest things about them is that they have weird bulgy eyes…or do they? Some species have teeny tiny eyes, while others have massive eyes. In fact frogs have some of the biggest eyes relative to their body size across all vertebrates. K. N. Thomas et al. (2020) predicted that this variation might be due to where they live, their mating habits, the time of day they are active, and their body size. In our examples we’ll test some of these hypotheses using phylogenetic comparative methods.

The data and modified tree for this example come from K. N. Thomas et al. (2020), and the original tree comes from Feng et al. (2017). If you want to see the full results check out K. N. Thomas et al. (2020)! And there’s a nice summary of the paper here.

Natricine snake head shape evolution

Snakes have different head shapes in different habitats. These are (clockwise starting in the top left) terrestrial/semiaquatic, aquatic, aquatic burrowing, and burrowing natricines. Image credits: see V. Deepak, Gower, and Cooper (2023)

Snakes are also cool, especially natricines which are the group that contains both the delightful European grass snake (Natrix natrix) and the ubiquitous garter snakes (genus Thamnophis) of North America. Natricine snakes are found across the globe, and have a range of interesting ecologies and more morphological variation than you might expect, especially in their head shape. V. Deepak, Gower, and Cooper (2023) predicted that these variations in head shape would be more closely related to the ecomorph they belonged to (i.e. whether the snake was terrestrial, aquatic, burrowing or aquatic burrowing) than their evolutionary history. They expected that head shape might be an example of convergent evolution. In our examples we’ll test some of these hypotheses using phylogenetic comparative methods.

The data for this example come from V. Deepak, Gower, and Cooper (2023), and the tree comes from V. Deepak et al. (2021). If you want to see the full results check out V. Deepak, Gower, and Cooper (2023)!

Diversification in dragonflies

Dragonflies are amazing; this is Aeshna juncea. Image credits: Wikimedia CC BY-SA 3.0

You’ve probably guessed that yes, dragonflies are also cool. They’re incredible predators and extremely agile fliers. My favourite fact about dragonflies is that one species, the globe skimmer (Pantala flavescens), makes an annual multi-generational migration of around 18,000 km (!) with individual insects flying more than 6,000 km (thanks to Dr Jessica Ware for that fact and this dataset!). Dragonflies today are generally found near water, with some preferring lotic habitats with fast flowing waters and others lentic habitats with slow moving waters. The clade has been around for over 300 million years, and currently has over 3000 species. But how quickly did they diversify? Do different clades have different rates of evolution? Do their habitat preferences influence their diversification rates? In our examples we’ll test some of these questions using phylogenetic comparative methods.

The 522 species tree for this example comes from Letsch, Gottsberger, and Ware (2016b) and is available to download from Letsch, Gottsberger, and Ware (2016a). This paper looked across dragonflies to investigate whether species from lotic habitats with fast flowing waters diversify more rapidly than species from lentic habitats with slow moving waters. If you want to see the full results check out Letsch, Gottsberger, and Ware (2016b)!

Citing R and R packages

Lots of people work on R and R packages for free. They’re the reason that R is so great! The best way to thank them for this selfless work is to cite R, and any R packages that you use, whenever you write a report, article, thesis chapter or paper. This means that R developers can show their funders, bosses, supervisors and potential employers that people are using their work.

The citation for R will usually look something like this

All analyses used R version 4.2.0 (R Core Team 2022).

Your version number might be different (4.2.0 is the current version at the time of writing this book). You only need to do this once, usually in the methods section. The full citation for the bibliography is usually something like:

R Core Team (2022). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.

If you don’t remember this, or can’t work out what version of R you are using, the R folk have you covered. To get the citation you can use:

citation()

## 
## To cite R in publications use:
## 
##   R Core Team (2022). R: A language and environment for statistical computing. R
##   Foundation for Statistical Computing, Vienna, Austria. URL
##   https://www.R-project.org/.
## 
## A BibTeX entry for LaTeX users is
## 
##   @Manual{,
##     title = {R: A Language and Environment for Statistical Computing},
##     author = {{R Core Team}},
##     organization = {R Foundation for Statistical Computing},
##     address = {Vienna, Austria},
##     year = {2022},
##     url = {https://www.R-project.org/},
##   }
## 
## We have invested a lot of time and effort in creating R, please cite it when using
## it for data analysis. See also 'citation("pkgname")' for citing R packages.
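If you use a reference manager or LaTeX, you may prefer to save the BibTeX entry straight to a file rather than copy-pasting it. A minimal sketch is below. It uses toBibtex() from the utils package that ships with R. The output file name is just an example, and I write it to a temporary folder here so nothing in your project is overwritten.

```r
# Get the citation for R itself and convert it to a BibTeX entry
r_bib <- toBibtex(citation())
r_bib

# Save the entry to a .bib file.
# The file name is an assumption; a temporary folder is used for safety.
bib_file <- file.path(tempdir(), "r-citation.bib")
writeLines(r_bib, bib_file)
```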

To get the version of R you can use:

R.Version()$version.string

## [1] "R version 4.2.0 (2022-04-22)"

You can also look at more version information by running:

R.Version()

I’ve suppressed the output here as it will be different for every user. The version number is near the bottom of the output. You’ll also see one of the fun things about R here, which is that each version has a nickname, all of which are references to Peanuts comics! For more info see this Stack Overflow discussion.
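If you only want the nickname or the bare version number, you can pull them out directly. This is a small sketch using the built-in R.version object and getRversion(). Your output will depend on the version of R you have installed, so I have not shown it.

```r
# R.version is a built-in list holding the same information as R.Version()
R.version$nickname

# The version number on its own, handy for pasting into a methods section
as.character(getRversion())
```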

What about R packages? You should cite these at the relevant points in your methods section. For example, for caper (we’ll return to what this does later in the book) we might write

We fitted phylogenetic generalised least squares (PGLS) models using the R package caper version 1.0.1 (Orme et al. 2018).

To find out what the citation is for an R package we also use the function citation but this time specify the package

citation(package = "caper")

## 
## To cite package 'caper' in publications use:
## 
##   Orme D, Freckleton R, Thomas G, Petzoldt T, Fritz S, Isaac N, Pearse W (2018).
##   _caper: Comparative Analyses of Phylogenetics and Evolution in R_. R package
##   version 1.0.1, <https://CRAN.R-project.org/package=caper>.
## 
## A BibTeX entry for LaTeX users is
## 
##   @Manual{,
##     title = {caper: Comparative Analyses of Phylogenetics and Evolution in R},
##     author = {David Orme and Rob Freckleton and Gavin Thomas and Thomas Petzoldt and Susanne Fritz and Nick Isaac and Will Pearse},
##     year = {2018},
##     note = {R package version 1.0.1},
##     url = {https://CRAN.R-project.org/package=caper},
##   }
## 
## ATTENTION: This citation information has been auto-generated from the package
## DESCRIPTION file and may need manual editing, see 'help("citation")'.

Usually package citations contain the version number, but if not you can get the version using

packageVersion("caper")

## [1] '1.0.1'
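If you have used several packages, you can collect all the version numbers and citations in one go. The sketch below uses stats and utils as stand-ins because they ship with every R installation. Replace them with the names of the packages you actually used (for example "caper"), which need to be installed for this to work.

```r
# Packages to report. "stats" and "utils" ship with R and stand in here
# for whichever packages you actually used (e.g. "caper").
pkgs <- c("stats", "utils")

# Look up the installed version of each package
versions <- vapply(pkgs, function(p) as.character(packageVersion(p)),
                   character(1))
versions

# Print the suggested citation for each package
for (p in pkgs) {
  print(citation(package = p))
}
```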

An additional benefit to citing R packages is that it helps people understand exactly what you did. It’s possible there are multiple ways to run a PGLS model, but if your report says you used caper, it’s easy for a reader to check how caper does it and to know exactly what you did. This helps people reproduce your analysis, and can also help you prove to anyone assessing your work that you know what you are doing!
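One simple way to keep a record of exactly what you used is sessionInfo(), which reports your R version, operating system, and the packages loaded in the current session along with their versions. The sketch below also saves that record to a text file. The file name is an assumption, and I write it to a temporary folder here purely for illustration.

```r
# Record the R version, operating system and loaded packages (with versions)
info <- sessionInfo()
info

# Optionally save the record alongside your results.
# The file name is an assumption; a temporary folder is used for safety.
info_file <- file.path(tempdir(), "session-info.txt")
writeLines(capture.output(print(info)), info_file)
```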

Acknowledgements

Thanks to the many generations of postdocs and students who have taken courses with me and helped me to hone these materials. And to the many others out there teaching PCMs and writing tutorials that helped me learn these methods in the first place, especially Luke Harmon, Brian O’Meara, David Orme, Sam Price, Dan Rabosky, Liam Revell and Graham Slater. Also thanks to everyone for test running these materials.

Particular thanks to the authors of the R packages used and cited in this book. None of this would be possible without them. Do not forget to cite the packages you use in your own work. And if you meet one of them in person, buy them a beer/cake/coffee to say thank you!

Best of luck to you all, and happy PCM-ing!

References

Deepak, V, Natalie Cooper, Nikolay A Poyarkov, Fred Kraus, Gustavo Burin, Abhijit Das, Surya Narayanan, Jeffrey W Streicher, Sarah-Jane Smith, and David J Gower. 2021. “Multilocus phylogeny, natural history traits and classification of natricine snakes (Serpentes: Natricinae).” Zoological Journal of the Linnean Society 195 (1): 279–98. https://doi.org/10.1093/zoolinnean/zlab099.

Deepak, V., D. J. Gower, and N. Cooper. 2023. “Diet and Habit Explain Head-Shape Convergences in Natricine Snakes.” Journal of Evolutionary Biology 36 (2): 399–411. https://doi.org/10.1111/jeb.14139.

Feng, Yan-Jie, David C Blackburn, Dan Liang, David M Hillis, David B Wake, David C Cannatella, and Peng Zhang. 2017. “Phylogenomics Reveals Rapid, Simultaneous Diversification of Three Major Clades of Gondwanan Frogs at the Cretaceous–Paleogene Boundary.” Proceedings of the National Academy of Sciences 114 (29): E5864–70.

Letsch, H., B. Gottsberger, and J. L. Ware. 2016a. “Data from: Not Going with the Flow: A Comprehensive Time-Calibrated Phylogeny of Dragonflies (Anisoptera: Odonata: Insecta) Provides Evidence for the Role of Lentic Habitats on Diversification,” Dryad, Dataset https://doi.org/10.5061/dryad.f3d4f.

———. 2016b. “Not Going with the Flow: A Comprehensive Time-Calibrated Phylogeny of Dragonflies (Anisoptera: Odonata: Insecta) Provides Evidence for the Role of Lentic Habitats on Diversification.” Molecular Ecology 25 (6): 1340–53. https://doi.org/10.1111/mec.13562.

R Core Team. 2023. R: A Language and Environment for Statistical Computing. Vienna, Austria: R Foundation for Statistical Computing. http://www.R-project.org/.

Thomas, Kate N, David J Gower, Rayna C Bell, Matthew K Fujita, Ryan K Schott, and Jeffrey W Streicher. 2020. “Eye Size and Investment in Frogs and Toads Correlate with Adult Habitat, Activity Pattern and Breeding Ecology.” Proceedings of the Royal Society B 287 (1935): 20201393.
