--- title: "Introduction to rTPC" author: "Daniel Padfield" output: rmarkdown::html_vignette date: "`r Sys.Date()`" description: > vignette: > %\VignetteIndexEntry{Introduction to rTPC} %\VignetteEngine{knitr::rmarkdown} %\VignetteEncoding{UTF-8} --- #### An introduction to rTPC and how it can be used to fit thermal performance curves using nls.multstart. ------------------------------------------------------------------------ ```{r, include = FALSE} knitr::opts_chunk$set( collapse = TRUE, comment = "#>", #tidy.opts=list(width.cutoff=60), #tidy=TRUE, fig.align = 'center' ) ``` ```{r setup, message=FALSE} # load packages library(rTPC) library(nls.multstart) library(broom) library(tidyverse) ``` **rTPC** provides a suite of functions to help fit thermal performance curves to empirical data. After searching the literature, **rTPC** contains `r length(rTPC::get_model_names())` different model formulations that have been used previously. These functions can be easily applied to methods in R that use non-linear least squares regression to estimate thermal performance curves. The available model formulations can be accessed using **get_model_names()**. ```{r get_mod_names} get_model_names() ``` They are generally named after the author of the paper (and hence the name of the model within the literature) and the year at which I found the model to be first used, separated by a **"\_"**. Some original model formulations have been altered so that all models take temperature in degrees centigrade and raw rate values as input. We can demonstrate the fitting procedure by taking a single curve from the example dataset **rTPC** - a dataset of 60 TPCs of respiration and photosynthesis of the aquatic algae, *Chlorella vulgaris*. We can plot the data using **ggplot2** ```{r first_plot, fig.width=6, fig.height = 4} # load in data data("chlorella_tpc") # keep just a single curve d <- filter(chlorella_tpc, curve_id == 1) # show the data ggplot(d, aes(temp, rate)) + geom_point() + theme_bw(base_size = 12) + labs(x = 'Temperature (ºC)', y = 'Metabolic rate', title = 'Respiration across temperatures') ``` For each model, **rTPC** has helper functions that estimate sensible start values (**get_start_vals()**), lower (**get_lower_lims()**) and upper (**get_upper_lims()**) limits. To demonstrate this, we shall use the sharpe-schoolfield model for high temperature inactivation only. ```{r} # choose model mod = 'sharpschoolhigh_1981' # get start vals start_vals <- get_start_vals(d$temp, d$rate, model_name = 'sharpeschoolhigh_1981') # get limits low_lims <- get_lower_lims(d$temp, d$rate, model_name = 'sharpeschoolhigh_1981') upper_lims <- get_upper_lims(d$temp, d$rate, model_name = 'sharpeschoolhigh_1981') start_vals low_lims upper_lims ``` One problem with most methods of fitting models in R using non-linear least squares regression is that they are sensitive to the choice of starting parameters. This problem also occurs in previous specialist R packages that help fit thermal performance curves, such as [devRate](https://github.com/frareb/devRate/) and [temperatureresponse](https://github.com/low-decarie/temperatureresponse). These methods can fail entirely or give different parameter estimates between multiple runs of the same code. To overcome this, we recommend using the R package [**nls.multstart**](https://github.com/padpadpadpad/nls.multstart), which uses **minpackLM::nlsLM()**, but allows for multiple sets of starting parameters. It iterates through multiple starting values, attempting a fit with each set of start parameters. The best model is then picked using AIC scores. Using **nls_multstart**, we will use a random-search/shotgun approach to fit the curve. Random start parameter values are picked from a uniform distribution between `start_lower` and `start_upper` for each parameter. If the best model is not improved upon (in terms of AIC score) for 100 new start parameter combinations, the function will return that model fit. This is controlled by convergence_count, if this is set to `FALSE`, **nls_multstart()** will try and fit all iterations. ```{r fit_model} # fit model fit <- nls_multstart(rate~sharpeschoolhigh_1981(temp = temp, r_tref,e,eh,th, tref = 15), data = d, iter = 500, start_lower = start_vals - 10, start_upper = start_vals + 10, lower = low_lims, upper = upper_lims, supp_errors = 'Y') fit ``` To calculate additional parameters of interest, we can use **rTPC::calc_params()**. This function uses high resolution predictions of the fitted model to estimate traits associated with a thermal performance curve. The currently available methods can be viewed by running `?calc_params`. For example, we may be interested in variation in the optimum temperature, $T_{opt}$, given that we adapted algae to different temperatures. ```{r} # calculate additional traits calc_params(fit) %>% # round for easy viewing mutate_all(round, 2) ``` Finally for this introduction, we can get predictions of our model using **broom::augment()**, which is similar to **predict()**. These are then plotted over our original data. ```{r pred_and_plot, fig.width=6, fig.height = 4} # predict new data new_data <- data.frame(temp = seq(min(d$temp), max(d$temp), 0.5)) preds <- augment(fit, newdata = new_data) # plot data and model fit ggplot(d, aes(temp, rate)) + geom_point() + geom_line(aes(temp, .fitted), preds, col = 'blue') + theme_bw(base_size = 12) + labs(x = 'Temperature (ºC)', y = 'Metabolic rate', title = 'Respiration across temperatures') ```