library(rmarkdown)
paged_table(feather)
More ANOVAs
Dealing with multiple group membership and interactions
In the last chapter we introduced the idea of comparing means among populations (one-way ANOVAs, our first linear models). However, the units that we measure may belong to multiple groups. We will extend our analysis of variance to consider multiple group membership and interactions in this chapter. As a starting point, consider that group membership may be an inherent property of the unit we measure or we may assign it.
Example: Back to the birds
One of the last chapters practice problems focused bird feathers. While studying feather color in Northern flickers (Colaptes auratus), Wiebe and Bortolotti (2002) noted that ~25% of birds had one or more “odd” tail feathers. They decided to compare the color of these odd and “typical” feathers.
Example and data provided by McDonald (2014).
How do we analyze this data?
We may first note that we have a continuous measurement (feather color, measured using color hues from a digital camera and another statistical technique that we will not go into here) and a categorical variable (feather type, with levels “typical” and “odd”). This hopefully reminds you of an ANOVA/t-test!
We could plot the data
library(ggplot2)
ggplot(feather, aes(x=Feather_type, y= Color_index, color=Feather_type))+
geom_jitter()+
labs(y= "Color index",
x= "Feather type",
title="Comparing odd and typical feathers in Northern flickers")+
guides(color=F)
Warning: The `<scale>` argument of `guides()` cannot be `FALSE`. Use "none" instead as
of ggplot2 3.3.4.
Develop a set of hypotheses:
\[ \begin{split} H_O: \mu_{\textrm{odd feather color}} = \mu_{\textrm{typical feather color}}\\ H_A: \mu_{\textrm{odd feather color}} \neq \mu_{\textrm{typical feather color}}\\ \end{split} \]
and test them using a t-test:
t.test(Color_index ~ Feather_type, data=feather)
Welch Two Sample t-test
data: Color_index by Feather_type
t = -3.56, df = 29.971, p-value = 0.00126
alternative hypothesis: true difference in means between group Odd and group Typical is not equal to 0
95 percent confidence interval:
-0.21579254 -0.05845746
sample estimates:
mean in group Odd mean in group Typical
-0.176125 -0.039000
or, using more generalizable functions, a linear model:
library(car)
Anova(lm(Color_index ~ Feather_type, data=feather), type = "III")
Anova Table (Type III tests)
Response: Color_index
Sum Sq Df F value Pr(>F)
(Intercept) 0.49632 1 41.816 3.814e-07 ***
Feather_type 0.15043 1 12.674 0.001259 **
Residuals 0.35607 30
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
We find a significant p value, but we did not check assumptions. For linear models (remember, $i.i.d. N(,)$, we could use our visual checks
plot(lm(Color_index ~ Feather_type, data=feather))
Which appears ok, but there is a problem.
Our data are not independent!
Lack of Independence
Odd and typical feathers were measured on a single bird (note the Bird column) in the dataset. We might assume feathers on a given bird are more closely related in color than feathers on different birds. This could be due to diet or other factors making all feathers on a given bird brighter or darker than those on another. Regardless of reason (and “good” p value), we know the measurements are linked in some way. Note we could “connect” individual observations.
ggplot(feather, aes(x=Feather_type, y= Color_index, color=Feather_type, group=Bird))+
geom_line(position = position_dodge(0.4), color="black") +
geom_point(position = position_dodge(0.4)) +
labs(y= "Color index",
x= "Feather type",
title="Comparing odd and typical feathers in Northern flickers")+
guides(color=F)
This may also occur if we measure outcomes with-in a single unit (e.g., a study of fertilizer impacts using multiple fields) or over time (e.g., before/after studies). Regardless of the reason, when our experimental design has led to measurements being connected/not independent, we need to consider these connections in order to properly note (and sometimes even observe) impacts of focal variables.
Blocking, two-way ANOVAs,and paired t-tests
In this case, the connections may be considered artifacts of the data. We didn’t assign birds. We also made a choice to compare odd and typical feathers from the same bird - why? In general, accounting for extra variation in the data will give you a better answer about how a given variable influences outcomes. This may be called blocking. Although the motivation might therefore be to get a “better” p value, it should be driven by experimental design (and thus we started with an example where we didn’t “need” to account for it to achieve significance).
In order to consider how color differs by bird and feather type, we need to add both variables to our linear model. For each variable we add, we also add a null (and corresponding alternative) hypothesis. So we retain our focus on feather type:
\[ \begin{split} H_O: \mu_{\textrm{odd feather color}} = \mu_{\textrm{typical feather color}}\\ H_A: \mu_{\textrm{odd feather color}} \neq \mu_{\textrm{typical feather color}}\\ \end{split} \]
but also add a set of hypotheses focused on birds:
\[ \begin{split} H_O: \mu_{\textrm{color of bird A}} = \mu_{\textrm{color of bird B}}....\textrm{for all birds}\\ H_A: \mu_{\textrm{color of bird A}} \neq \mu_{\textrm{color of bird B}}....\textrm{for all birds}\\ \end{split} \]
We can analyze this using our linear model approach. This is possible because, as we noted earlier, we can subdivide variance among multiple levels. Under the hood, the linear model approach build a model matrix that considers the impact of feather type and bird on outcomes. Since both variables are categorical, this is often called a two-way ANOVA. First, let’s make the object
<- lm(Color_index ~ Feather_type + Bird, data=feather) two_way_anova_example
You can see the new model matrix and coefficients if you want
Note the model matrix now includes columns for feather type and bird (lots of dummy variables, and now intercept is Bird A’s odd feather!). The \(\beta\) matrix of coefficients has corresponding estimates.
library(rmarkdown)
paged_table(data.frame(model.matrix(two_way_anova_example)))
matrix(as.numeric(two_way_anova_example$coefficients), ncol=1)
[,1]
[1,] -0.3580625
[2,] 0.1371250
[3,] 0.0905000
[4,] 0.0450000
[5,] 0.1250000
[6,] 0.2535000
[7,] 0.2575000
[8,] 0.1500000
[9,] 0.2525000
[10,] 0.2885000
[11,] 0.2150000
[12,] 0.2530000
[13,] 0.1445000
[14,] 0.1365000
[15,] 0.2190000
[16,] 0.2530000
[17,] 0.2275000
So for our first observation, which is
1,] feather[
Bird Feather_type Color_index
1 A Typical -0.255
Our estimate is the intercept (since it’s bird A) and the typical feather:
model.matrix(two_way_anova_example)[1,] %*% matrix(as.numeric(two_way_anova_example$coefficients), ncol=1)
[,1]
[1,] -0.2209375
and thus our residual is
$residuals[1] two_way_anova_example
1
-0.0340625
which is the same as
1,]$Color_index-model.matrix(two_way_anova_example)[1,] %*% matrix(as.numeric(two_way_anova_example$coefficients), ncol=1) feather[
[,1]
[1,] -0.0340625
Then check the assumptions
plot(two_way_anova_example)
Note, visually speaking, the residuals do appear to be closer to normal now. Since assumptions look ok, we can analyze the outcome
summary(two_way_anova_example)
Call:
lm(formula = Color_index ~ Feather_type + Bird, data = feather)
Residuals:
Min 1Q Median 3Q Max
-0.12444 -0.05209 0.00000 0.05209 0.12444
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) -0.35806 0.06955 -5.148 0.000119 ***
Feather_typeTypical 0.13712 0.03374 4.065 0.001017 **
BirdB 0.09050 0.09542 0.948 0.357936
BirdC 0.04500 0.09542 0.472 0.643998
BirdD 0.12500 0.09542 1.310 0.209903
BirdE 0.25350 0.09542 2.657 0.017950 *
BirdF 0.25750 0.09542 2.699 0.016505 *
BirdG 0.15000 0.09542 1.572 0.136802
BirdH 0.25250 0.09542 2.646 0.018330 *
BirdI 0.28850 0.09542 3.023 0.008554 **
BirdJ 0.21500 0.09542 2.253 0.039643 *
BirdK 0.25300 0.09542 2.651 0.018139 *
BirdL 0.14450 0.09542 1.514 0.150719
BirdM 0.13650 0.09542 1.431 0.173069
BirdN 0.21900 0.09542 2.295 0.036567 *
BirdO 0.25300 0.09542 2.651 0.018139 *
BirdP 0.22750 0.09542 2.384 0.030759 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 0.09542 on 15 degrees of freedom
Multiple R-squared: 0.7304, Adjusted R-squared: 0.4427
F-statistic: 2.539 on 16 and 15 DF, p-value: 0.03923
Anova(two_way_anova_example, type= "III")
Anova Table (Type III tests)
Response: Color_index
Sum Sq Df F value Pr(>F)
(Intercept) 0.24133 1 26.5059 0.000119 ***
Feather_type 0.15043 1 16.5214 0.001017 **
Bird 0.21950 15 1.6072 0.184180
Residuals 0.13657 15
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Note we see a significant difference in color among birds and feather type. Although we may be tempted to (and could) use post-hoc tests to consider which birds are different than which others, this is typically not done for blocked variables. We did not assign these pairings and it is not the focus of our efforts.
Since we only had 2 types of feathers, we also don’t need post-hoc tests. A significant p value means they differ from each other, and the estimates provided by the summary command indicate the typical feathers have a higher color index.
t-test connections
When we have only two measurements per group (e.g., odd and typical feathers from each bird), we can use a t-test approach to achieve similar goals. This approach is known as a paired t-test. Instead of focusing on the difference in means (like a 2-sample t-test), the test focuses on the mean difference between paired measurements (which would be 0 under the null hypothesis!). In this way, it is effectively a one-sample test that is pairing the data to reduce variation (blocking). We can carry out the test, but (as of early 2024) we need to use a wide dataset.
library(reshape2)
<- dcast(feather, Bird~Feather_type, value.var="Color_index")
feather_wide paged_table(feather_wide)
Now we have wide data…
t.test(feather_wide$Odd, feather_wide$Typical, data=feather, paired=TRUE)
Paired t-test
data: feather_wide$Odd and feather_wide$Typical
t = -4.0647, df = 15, p-value = 0.001017
alternative hypothesis: true mean difference is not equal to 0
95 percent confidence interval:
-0.20903152 -0.06521848
sample estimates:
mean difference
-0.137125
and get the same results as above (note we don’t even have to consider corrections like the Welch approach since this a one-sample test). Common examples of paired t-tests include before-after and twin studies.
In an earlier chapters we considered options for one- and two-sample tests when t-tests assumptions were not met. For two-sample tests, one of these approaches, the sign or binary test, is only valid for paired data. The differences in paired observations are compared to a set value (typically 0). Under the null hypothesis, half should be below the proposed median and half should be above. Differences matching the proposed value are ignored, thus reducing the sample size and making it harder to reject the null hypothesis; this is actually an odd way of accounting for them. The proportion of values below the proposed median is then evaluated using a binomial test. For two sample, the SIGN.test function in the BSDA package requires 2 columns of data and assumes the order of the column represents paired data.
library(BSDA)
SIGN.test(feather[feather$Feather_type == "Odd", "Color_index"],
$Feather_type == "Typical", "Color_index"],
feather[feathermd = 0)
Dependent-samples Sign-Test
data: feather[feather$Feather_type == "Odd", "Color_index"] and feather[feather$Feather_type == "Typical", "Color_index"]
S = 3, p-value = 0.02127
alternative hypothesis: true median difference is not equal to 0
95 percent confidence interval:
-0.24048275 -0.02331055
sample estimates:
median of x-y
-0.114
Achieved and Interpolated Confidence Intervals:
Conf.Level L.E.pt U.E.pt
Lower Achieved CI 0.9232 -0.2400 -0.0320
Interpolated CI 0.9500 -0.2405 -0.0233
Upper Achieved CI 0.9787 -0.2410 -0.0140
More than 2 measurements? Back to the linear model
We can also block for variation when we take more than 2 measurements per unit. For example, imagine if these birds also had a special, long tail feather.
set.seed(25)
<- data.frame(Bird = LETTERS[1:16], Feather_type = "Special",
special Color_index= feather[feather$Feather_type == "Typical", "Color_index"] +
3 +runif(16,1,1)*.01)
.<- merge(feather, special, all = T)
feather_extra $Feather_type <- factor(feather_extra$Feather_type) feather_extra
We could still block for variation using the linear model/ANOVA, but not the t-test, approach. As another review, we create the model
<-lm(Color_index ~ Feather_type + Bird, data=feather_extra) more_blocks
Check assumptions
plot(more_blocks)
Check outcome (this time focusing on Anova output)
Anova(more_blocks, type="III")
Anova Table (Type III tests)
Response: Color_index
Sum Sq Df F value Pr(>F)
(Intercept) 0.36392 1 59.9538 1.224e-08 ***
Feather_type 1.67906 2 138.3093 7.208e-16 ***
Bird 0.34649 15 3.8055 0.0008969 ***
Residuals 0.18210 30
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
We still see feather type has a significant impact on color, but since we have more than 2 groups we need to follow up this finding with a post-hoc test.
library(multcomp)
<- glht(more_blocks, linfct = mcp(Feather_type = "Tukey"))
compare summary(compare)
Simultaneous Tests for General Linear Hypotheses
Multiple Comparisons of Means: Tukey Contrasts
Fit: lm(formula = Color_index ~ Feather_type + Bird, data = feather_extra)
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
Special - Odd == 0 0.44712 0.02755 16.232 <1e-04 ***
Typical - Odd == 0 0.13712 0.02755 4.978 <1e-04 ***
Typical - Special == 0 -0.31000 0.02755 -11.254 <1e-04 ***
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
Other ways to be in multiple groups
In the bird example, one of our categories (bird) was un-intential. We chose to measure odd and typical feathers, and accounting for variation among birds was an appropriate step given lack of independence in measurements. However, we can also assign units to multiple groups (instead of making multiple measurements within one unit).
Consider if we ran an experiment focused on the impact of factors A and B on some trait. We can fully cross the factors in an experiment. Doing so can let us consider the main effects of multiple variables and potential interactions among them in what is often called a factorial ANOVA. For starters, let each factor have only 2 levels, and let the levels be the absence or presence of the factor.
Factor A | ||
Factor B | Absent | Present |
Absent | Control | Impact of A only |
Present | Impact of B only | Combined impact of A+B |
Experimental design notes
For a factorial ANOVA, we need to assign each unit randomly to a level of factor A. Then each level of factor B is randomly assigned to subjects at each level of factor A. This is different than randomly assigning treatments of A and B, as that could lead to outcomes where some level of factor B is not represented in some level of factor A.
We also need multiple units (3+) assigned to each combination.
When both are absent we have a classic control outcome. When one is present and the other absent we see main effects impacts of only one factor. Note we previously analyzed experiments that considered only one factor using ANOVAs or t-tests (linear models), but now we have multiple factors. We should not analyze the main effects of each using 2 one-way ANOVAs. Doing so cuts our data in half, meaning our estimates of variances are less precise and we increase our chance of making a type 1 error. More importantly, we wouldn’t be able to properly consider the combined impacts of A + B. What could these be?
<- data.frame(Treatment = c(rep("Control",5),
example_interaction rep("Impact of A only",5),
rep("Impact of B only",5),
rep("A+B Additive",5),
rep("A+B Synergistic", 5),
rep("A+B Antagonistic", 5)),
Cause= rep(c("Control","Factor A","Factor B", "Synergistic", "Antagonistic"), 6),
Impact = c(5,0,0,0,0,
5,2,0,0,0,
5,0,3,0,0,
5,2,3,0,0,
0,0,0,20,0,
0,0,0,0,6))
$Treatment <- factor(example_interaction$Treatment, levels=c("Control","Impact of A only","Impact of B only", "A+B Additive", "A+B Synergistic", "A+B Antagonistic"))
example_interaction$Cause <- factor(example_interaction$Cause, levels=c("Control","Factor A","Factor B", "Synergistic", "Antagonistic"))
example_interactionggplot(example_interaction, aes(x=Treatment, y= Impact, fill= Cause))+
geom_col(position = position_stack(reverse = TRUE))+
theme(axis.text.x = element_text(angle = -45))
As shown in the graph (Inspired by (Fong, Bittick, and Fong 2017)), A and B could have additive effects (where they simply stack), synergistic effects (the combined impact is more than the sum of the two), or antagonistic effects (the combined impacts is less than the sum of the two). Synergistic and antagonistic impacts are both examples of interactions. Interactions occur when the impact of one variable depends on the level of another.
Example: Impacts of grazing and fertilization
We can extend this example to consider more than 2 levels for one or more factors. For example, Valdez et al. (2023) wanted to consider the impact of top-down (snail grazing) and bottom- up (nutrient availability) on marsh plant (Spartina alterniflora) growth. To do this, they assigned plots to one of 3 grazer treatments and one of 2 nitrogen treatments.
This design is different from the bird example. No two measurements for a given trait were taken on the same plot. In this case, we likely care about the main effects, or impacts, of both variables. However, we may also need to consider interactions among the variables. Interactions occur when the impact of one variable depends on the level of another. For example, snail removal might have major impacts on nitrogen-enriched plots while having no impact on ambient plots. Due to this, we now have even more hypotheses:
\[ \begin{split} H_O: \mu_\textrm{plant growth, no fertilizer} = \mu_\textrm{plant growth, fertilizer}\\ H_O: \mu_\textrm{plant growth, snails removed} = \mu_\textrm{plant growth, control snails}= \mu_\textrm{plant growth, snails added}\\ H_O: \textrm{impact of snail grazing does not depend on nitrogen level}\\ \end{split} \]
Fortunately, these are easy to consider in our linear model framework. While not shown here, the model matrix adds columns to note our new interaction terms, and the coefficient matrix estimates them. From an R standpoint, we can include the interaction between two variables using the “:” notation. We’ll focus on below-ground biomass (standardized to m2) for this example (the paper measured 9 response variables!)
<- read.csv("data/Spartina_alterniflora_traits.csv", stringsAsFactors = T)
valdez_2023 <-lm(below.biomass.g.meter.sq..m2..~Snail.Level + Nitrogen.level + Snail.Level:Nitrogen.level, valdez_2023) bgb_model
For shorthand, note that if we put main effect * main effect in a model, it automatically adds the interaction term. You can see the model summary is the same.
<-lm(below.biomass.g.meter.sq..m2..~Snail.Level * Nitrogen.level, valdez_2023)
bgb_model_shorthand summary(bgb_model)
Call:
lm(formula = below.biomass.g.meter.sq..m2.. ~ Snail.Level + Nitrogen.level +
Snail.Level:Nitrogen.level, data = valdez_2023)
Residuals:
Min 1Q Median 3Q Max
-103.003 -50.933 1.507 28.297 153.937
Coefficients: (1 not defined because of singularities)
Estimate Std. Error t value
(Intercept) 542.16 40.72 13.315
Snail.Levelremoval 33.58 57.58 0.583
Snail.Levelsnail addition -88.85 57.58 -1.543
Snail.Leveluncaged 45.09 57.58 0.783
Nitrogen.levelwithout -111.87 57.58 -1.943
Snail.Levelremoval:Nitrogen.levelwithout 43.39 81.44 0.533
Snail.Levelsnail addition:Nitrogen.levelwithout 60.18 81.44 0.739
Snail.Leveluncaged:Nitrogen.levelwithout NA NA NA
Pr(>|t|)
(Intercept) 2.44e-09 ***
Snail.Levelremoval 0.5691
Snail.Levelsnail addition 0.1451
Snail.Leveluncaged 0.4467
Nitrogen.levelwithout 0.0724 .
Snail.Levelremoval:Nitrogen.levelwithout 0.6025
Snail.Levelsnail addition:Nitrogen.levelwithout 0.4721
Snail.Leveluncaged:Nitrogen.levelwithout NA
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 70.53 on 14 degrees of freedom
Multiple R-squared: 0.498, Adjusted R-squared: 0.2829
F-statistic: 2.315 on 6 and 14 DF, p-value: 0.09183
summary(bgb_model_shorthand)
Call:
lm(formula = below.biomass.g.meter.sq..m2.. ~ Snail.Level * Nitrogen.level,
data = valdez_2023)
Residuals:
Min 1Q Median 3Q Max
-103.003 -50.933 1.507 28.297 153.937
Coefficients: (1 not defined because of singularities)
Estimate Std. Error t value
(Intercept) 542.16 40.72 13.315
Snail.Levelremoval 33.58 57.58 0.583
Snail.Levelsnail addition -88.85 57.58 -1.543
Snail.Leveluncaged 45.09 57.58 0.783
Nitrogen.levelwithout -111.87 57.58 -1.943
Snail.Levelremoval:Nitrogen.levelwithout 43.39 81.44 0.533
Snail.Levelsnail addition:Nitrogen.levelwithout 60.18 81.44 0.739
Snail.Leveluncaged:Nitrogen.levelwithout NA NA NA
Pr(>|t|)
(Intercept) 2.44e-09 ***
Snail.Levelremoval 0.5691
Snail.Levelsnail addition 0.1451
Snail.Leveluncaged 0.4467
Nitrogen.levelwithout 0.0724 .
Snail.Levelremoval:Nitrogen.levelwithout 0.6025
Snail.Levelsnail addition:Nitrogen.levelwithout 0.4721
Snail.Leveluncaged:Nitrogen.levelwithout NA
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 70.53 on 14 degrees of freedom
Multiple R-squared: 0.498, Adjusted R-squared: 0.2829
F-statistic: 2.315 on 6 and 14 DF, p-value: 0.09183
You may note a weird NA here (we’ll come back to it), but remember we should really check model assumptions before looking at output.
plot(bgb_model)
These look ok. There may be a slight increase in variance with fitted values, but we can work with this. Let’s build an ANOVA table.
Anova(bgb_model, type="III")
Error in Anova.III.lm(mod, error, singular.ok = singular.ok, ...): there are aliased coefficients in the model
But we got an error! What happened? Let’s look at the data
paged_table(valdez_2023)
A summary may help more. Note we can summarize across multiple factors.
library(Rmisc)
paged_table(summarySE(valdez_2023, measurevar = "below.biomass.g.meter.sq..m2..", groupvars = c("Snail.Level", "Nitrogen.level")))
Note the uncaged treatment only has without for the nitrogen impact. It was a control! While we often need these in experiments, they can create analysis problems. This is because we can’t consider how nutrient level depends on snail treatment for the control level! In other words, interactions can not be calculated for some levels.
This is also why we saw the NA and warnings in our model summary
summary(bgb_model)
Call:
lm(formula = below.biomass.g.meter.sq..m2.. ~ Snail.Level + Nitrogen.level +
Snail.Level:Nitrogen.level, data = valdez_2023)
Residuals:
Min 1Q Median 3Q Max
-103.003 -50.933 1.507 28.297 153.937
Coefficients: (1 not defined because of singularities)
Estimate Std. Error t value
(Intercept) 542.16 40.72 13.315
Snail.Levelremoval 33.58 57.58 0.583
Snail.Levelsnail addition -88.85 57.58 -1.543
Snail.Leveluncaged 45.09 57.58 0.783
Nitrogen.levelwithout -111.87 57.58 -1.943
Snail.Levelremoval:Nitrogen.levelwithout 43.39 81.44 0.533
Snail.Levelsnail addition:Nitrogen.levelwithout 60.18 81.44 0.739
Snail.Leveluncaged:Nitrogen.levelwithout NA NA NA
Pr(>|t|)
(Intercept) 2.44e-09 ***
Snail.Levelremoval 0.5691
Snail.Levelsnail addition 0.1451
Snail.Leveluncaged 0.4467
Nitrogen.levelwithout 0.0724 .
Snail.Levelremoval:Nitrogen.levelwithout 0.6025
Snail.Levelsnail addition:Nitrogen.levelwithout 0.4721
Snail.Leveluncaged:Nitrogen.levelwithout NA
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Residual standard error: 70.53 on 14 degrees of freedom
Multiple R-squared: 0.498, Adjusted R-squared: 0.2829
F-statistic: 2.315 on 6 and 14 DF, p-value: 0.09183
You could note we have the same issue for our initial bird analysis:
<- lm(Color_index ~ Feather_type * Bird, data=feather)
two_way_anova_example_int Anova(two_way_anova_example_int, type="III")
Error in Anova.lm(two_way_anova_example_int, type = "III"): residual df = 0
On a positive note, this means R will typically not consider interactions when it shouldn’t, but you need to know why in order to fix it.
Dealing with controls and missing interactions
To fix this (and deal with controls), we need to consider the data. Valdez et al. (2023) used t-tests (why?) to consider differences between cage and cage control plots (note %in% and the fact they did not focus on above-ground biomass (maybe because uncaged plots had little..). %in% allows you to subset data by matching items to list. Remember you can always get help on functions using something like (we need the quotations for operators!)
'%in%' ?
t.test(below.biomass.g.meter.sq..m2..~Snail.Level, valdez_2023[valdez_2023$Snail.Level %in% c("uncaged","control snails") & valdez_2023$Nitrogen.level == "without",])
Welch Two Sample t-test
data: below.biomass.g.meter.sq..m2.. by Snail.Level
t = -0.82961, df = 3.9666, p-value = 0.4538
alternative hypothesis: true difference in means between group control snails and group uncaged is not equal to 0
95 percent confidence interval:
-196.4785 106.3052
sample estimates:
mean in group control snails mean in group uncaged
430.2967 475.3833
If interactions are missing for other reasons (e.g., a set of units failed/died/data was lost), we can either ignore interactions or combine factor levels into a single new treatment variable and analyze using one-way ANOVAs.
Considering interactions
To consider interactions, we can remove the controls
<-lm(below.biomass.g.meter.sq..m2..~Snail.Level + Nitrogen.level + Snail.Level:Nitrogen.level, valdez_2023[valdez_2023$Snail.Level != "uncaged",]) bgb_model_cont_removed
We can consider the assumptions
plot(bgb_model_cont_removed)
and now note …
Anova(bgb_model_cont_removed, type="III")
Anova Table (Type III tests)
Response: below.biomass.g.meter.sq..m2..
Sum Sq Df F value Pr(>F)
(Intercept) 881823 1 176.4820 1.545e-08 ***
Snail.Level 24012 2 2.4028 0.13254
Nitrogen.level 18771 1 3.7567 0.07648 .
Snail.Level:Nitrogen.level 2893 2 0.2895 0.75371
Residuals 59960 12
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
…that the ANOVA table works.
Before we can interpret this, we need to return to our discussion about different sums of squares.
Back to type = III
Much of this follows several excellent posts (“ANOVA,” n.d.-; Scholer, n.d.) . A paper by Venables (n.d.) may also be helpful, as might .
We introduced the various types of sums of squares in the previous chapter, but they didn’t matter much there. However, here they do. This returns to the idea that we can partition the variance of a given point among various factors.In general, we can think about determining the impact by comparing sums of squares from models with and without a factor of interest (using incremental sums of squares). So, for a model
lm(response ~ A + B + A:B)
We could consider the impact of all factors and interactions by comparing:
SS(AB | A, B) = SS(A, B, A:B) – SS(A, B)
SS(A | B, AB) = SS(A, B, A:B) – SS(B, A:B)
SS(B | A, AB) = SS(A, B, A:B) – SS(A, A:B)
SS(A | B) = SS(A, B) – SS(B)
SS(B | A) = SS(A, B) – SS(A)
However, the variance we attribute to each factor can be calculated in multiple ways. First, we can consider type 1 sums of squares. Using this approach, the order factors are entered into the model matters. Following above,
SS(A) for factor A.
SS(B | A) for factor B.
SS(A:B | B, A) for interaction A:B
This approach leads to different main effects (overall, general impacts of a factor) for factors A and B depending on the order. Note
summary(aov(bgb_model_cont_removed))
Df Sum Sq Mean Sq F value Pr(>F)
Snail.Level 2 39024 19512 3.905 0.0494 *
Nitrogen.level 1 26919 26919 5.387 0.0387 *
Snail.Level:Nitrogen.level 2 2893 1447 0.290 0.7537
Residuals 12 59960 4997
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
is different from
summary(aov(lm(below.biomass.g.meter.sq..m2..~Nitrogen.level + Snail.Level + Snail.Level:Nitrogen.level, valdez_2023[valdez_2023$Snail.Level != "uncaged",])))
Df Sum Sq Mean Sq F value Pr(>F)
Nitrogen.level 1 26919 26919 5.387 0.0387 *
Snail.Level 2 39024 19512 3.905 0.0494 *
Nitrogen.level:Snail.Level 2 2893 1447 0.290 0.7537
Residuals 12 59960 4997
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
..Wait, it’s the same. That’s because differences only arise when the data are unbalanced! ANOVAs were designed for use in perfectly balanced datasets. We’ll come back to this.
Another way is to consider the impact of main effects, such as
SS(A | B) for factor A.
SS(B | A) for factor B.
This approach is known as type II. It does not consider interactions (or assumes they are equal to 0)
Anova(bgb_model_cont_removed, type="II")
Anova Table (Type II tests)
Response: below.biomass.g.meter.sq..m2..
Sum Sq Df F value Pr(>F)
Snail.Level 39024 2 3.9050 0.04941 *
Nitrogen.level 26919 1 5.3874 0.03869 *
Snail.Level:Nitrogen.level 2893 2 0.2895 0.75371
Residuals 59960 12
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
so it works ok here since the interaction is not significant, but it shouldn’t be used when interactions are significant/important. Compare it to
Anova(bgb_model_cont_removed, type="III")
Anova Table (Type III tests)
Response: below.biomass.g.meter.sq..m2..
Sum Sq Df F value Pr(>F)
(Intercept) 881823 1 176.4820 1.545e-08 ***
Snail.Level 24012 2 2.4028 0.13254
Nitrogen.level 18771 1 3.7567 0.07648 .
Snail.Level:Nitrogen.level 2893 2 0.2895 0.75371
Residuals 59960 12
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
to see the difference. Type III considers interactions.
To make things, more complicated, when looking at type III output, we need to consider how R computes main effects. The default contrast in R looks at the main effect for factor B using only the first level of factor A (specified by the treatment contrasts). If there is an interaction, this obviously doesn’t make sense - B may matter for some levels of A but not others. Given this, we can (and maybe should) instead look at the average impact of B. We can do this by updating how contrasts are calculated. We can specify overall changes to how R does this, or we can include in the model as shown here:
Anova(lm(below.biomass.g.meter.sq..m2..~Snail.Level + Nitrogen.level + Snail.Level:Nitrogen.level, valdez_2023[valdez_2023$Snail.Level != "uncaged",], contrasts = list(Snail.Level = "contr.sum", Nitrogen.level = "contr.sum")), type = "III")
Anova Table (Type III tests)
Response: below.biomass.g.meter.sq..m2..
Sum Sq Df F value Pr(>F)
(Intercept) 4235224 1 847.6086 1.68e-12 ***
Snail.Level 39024 2 3.9050 0.04941 *
Nitrogen.level 26919 1 5.3874 0.03869 *
Snail.Level:Nitrogen.level 2893 2 0.2895 0.75371
Residuals 59960 12
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
This time we get the same results for the main effects as we did for Type 1 and Type 2 approaches. To be clear, this is because we used the same type of contrasts and had a balanced design. Note this matches the above,
Anova(lm(below.biomass.g.meter.sq..m2..~Snail.Level + Nitrogen.level + Snail.Level:Nitrogen.level, valdez_2023[valdez_2023$Snail.Level != "uncaged",], contrasts = list(Snail.Level = "contr.sum", Nitrogen.level = "contr.sum")), type = "II")
Anova Table (Type II tests)
Response: below.biomass.g.meter.sq..m2..
Sum Sq Df F value Pr(>F)
Snail.Level 39024 2 3.9050 0.04941 *
Nitrogen.level 26919 1 5.3874 0.03869 *
Snail.Level:Nitrogen.level 2893 2 0.2895 0.75371
Residuals 59960 12
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
and this does not.
Anova(lm(below.biomass.g.meter.sq..m2..~Snail.Level + Nitrogen.level + Snail.Level:Nitrogen.level, valdez_2023[valdez_2023$Snail.Level != "uncaged",], contrasts = list(Snail.Level = "contr.treatment", Nitrogen.level = "contr.treatment")), type = "III")
Anova Table (Type III tests)
Response: below.biomass.g.meter.sq..m2..
Sum Sq Df F value Pr(>F)
(Intercept) 881823 1 176.4820 1.545e-08 ***
Snail.Level 24012 2 2.4028 0.13254
Nitrogen.level 18771 1 3.7567 0.07648 .
Snail.Level:Nitrogen.level 2893 2 0.2895 0.75371
Residuals 59960 12
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
In doing this, however, note our coefficients also change. Note
data.frame(coef(lm(below.biomass.g.meter.sq..m2..~Snail.Level + Nitrogen.level + Snail.Level:Nitrogen.level, valdez_2023[valdez_2023$Snail.Level != "uncaged",], contrasts = list(Snail.Level = "contr.sum", Nitrogen.level = "contr.sum"))))
coef.lm.below.biomass.g.meter.sq..m2.....Snail.Level...Nitrogen.level...
(Intercept) 485.067222
Snail.Level1 1.162778
Snail.Level2 56.436111
Nitrogen.level1 38.671667
Snail.Level1:Nitrogen.level1 17.261667
Snail.Level2:Nitrogen.level1 -4.431667
is different than
data.frame(coef(lm(below.biomass.g.meter.sq..m2..~Snail.Level + Nitrogen.level + Snail.Level:Nitrogen.level, valdez_2023[valdez_2023$Snail.Level != "uncaged",])))
coef.lm.below.biomass.g.meter.sq..m2.....Snail.Level...Nitrogen.level...
(Intercept) 542.16333
Snail.Levelremoval 33.58000
Snail.Levelsnail addition -88.85333
Nitrogen.levelwithout -111.86667
Snail.Levelremoval:Nitrogen.levelwithout 43.38667
Snail.Levelsnail addition:Nitrogen.levelwithout 60.18333
The contrasts sum approach sets the intercept as the overall average and then puts all other levels relative to that (meaning it includes all levels). Our default treatment approach lets the first factor level (typically alphabetical, here technically a grouped level (Snail Level (ambient): Nitrogen level (with) ) be an intercept for the linear model, and then considers the other factor levels as deviations from that.
Finally, note than unbalanced designs (when you have a different number of means for one grouping than others), the different approaches lead to different outcomes. We can consider this (and reinforce the above points) using a built-in dataset, ToothGrowth. If you run this code
?ToothGrowth
in R, you’ll see it notes the dataset contains “the length of odontoblasts (cells responsible for tooth growth) in 60 guinea pigs. Each animal received one of three dose levels of vitamin C (0.5, 1, and 2 mg/day) by one of two delivery methods, orange juice or ascorbic acid (a form of vitamin C and coded as VC).”
You can view the data using:
head(ToothGrowth)
len supp dose
1 4.2 VC 0.5
2 11.5 VC 0.5
3 7.3 VC 0.5
4 5.8 VC 0.5
5 6.4 VC 0.5
6 10.0 VC 0.5
and we can see we have a balanced design here:
xtabs(~supp+dose, ToothGrowth)
dose
supp 0.5 1 2
OJ 10 10 10
VC 10 10 10
Given that, our type 1, 2, 3 approaches are equivalent (we need to make does a factor for ANOVA purposes)
$dose <- factor(ToothGrowth$dose)
ToothGrowthsummary(aov(lm(len~supp*dose, ToothGrowth)))
Df Sum Sq Mean Sq F value Pr(>F)
supp 1 205.4 205.4 15.572 0.000231 ***
dose 2 2426.4 1213.2 92.000 < 2e-16 ***
supp:dose 2 108.3 54.2 4.107 0.021860 *
Residuals 54 712.1 13.2
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Anova(lm(len~supp*dose, ToothGrowth), type = "II")
Anova Table (Type II tests)
Response: len
Sum Sq Df F value Pr(>F)
supp 205.35 1 15.572 0.0002312 ***
dose 2426.43 2 92.000 < 2.2e-16 ***
supp:dose 108.32 2 4.107 0.0218603 *
Residuals 712.11 54
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Anova(lm(len~supp*dose, ToothGrowth, contrasts = list(supp = "contr.sum", dose = "contr.sum")), type = "III")
Anova Table (Type III tests)
Response: len
Sum Sq Df F value Pr(>F)
(Intercept) 21236.5 1 1610.393 < 2.2e-16 ***
supp 205.4 1 15.572 0.0002312 ***
dose 2426.4 2 92.000 < 2.2e-16 ***
supp:dose 108.3 2 4.107 0.0218603 *
Residuals 712.1 54
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
but not if we use a different contrast (treatment is default used here)
Anova(lm(len~supp*dose, ToothGrowth), type = "III")
Anova Table (Type III tests)
Response: len
Sum Sq Df F value Pr(>F)
(Intercept) 1750.33 1 132.730 3.603e-16 ***
supp 137.81 1 10.450 0.002092 **
dose 885.26 2 33.565 3.363e-10 ***
supp:dose 108.32 2 4.107 0.021860 *
Residuals 712.11 54
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Also, if we have an unbalanced design
library(purrr)
library(tidyr)
set.seed(19)
<- ToothGrowth %>% group_by(supp, dose) %>% nest() %>% ungroup() %>% mutate(n = c(10,8,7,5,10,3)) %>% mutate(samp = map2(data, n, sample_n)) %>% dplyr::select(-data) %>% unnest(samp)
ToothGrowth_unbalanced <- lm(len~supp*dose, ToothGrowth_unbalanced, contrasts = list(supp = "contr.sum", dose = "contr.sum")) ToothGrowth_unbalanced_anova
Then we get different answers from each approach:
summary(aov(ToothGrowth_unbalanced_anova))
Df Sum Sq Mean Sq F value Pr(>F)
supp 1 194.6 194.6 16.636 0.000231 ***
dose 2 1696.3 848.1 72.489 1.59e-13 ***
supp:dose 2 103.7 51.9 4.432 0.018818 *
Residuals 37 432.9 11.7
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Anova(ToothGrowth_unbalanced_anova, type= "II")
Anova Table (Type II tests)
Response: len
Sum Sq Df F value Pr(>F)
supp 153.04 1 13.0798 0.0008853 ***
dose 1696.26 2 72.4885 1.59e-13 ***
supp:dose 103.71 2 4.4318 0.0188183 *
Residuals 432.91 37
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Anova(ToothGrowth_unbalanced_anova, type= "III")
Anova Table (Type III tests)
Response: len
Sum Sq Df F value Pr(>F)
(Intercept) 12583.8 1 1075.5215 < 2.2e-16 ***
supp 79.2 1 6.7654 0.01328 *
dose 1250.9 2 53.4579 1.221e-11 ***
supp:dose 103.7 2 4.4318 0.01882 *
Residuals 432.9 37
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Now we see the different ways of calculating sums of squares leads to different outcomes when the sample sizes are unequal. In general, this supports the use of the type III approach (but see Langsrud (2003)) and for considering contrasts as needed.
Interpreting interactions
An important note is that all approaches above found the same values (p, F, etc.) for the highest order interaction. In general, interactions make interpreting main effects difficult. So, we have a few ways of handling them.
When not significant
If interactions are not significant, they can be handled in 2 ways.
We can leave the interaction in the model and interpret main effects immediately
- If doing this, we should make sure we set the contrasts to use an average approach (usually using the sum contrasts as shown above)
Anova(lm(below.biomass.g.meter.sq..m2..~Snail.Level + Nitrogen.level + Snail.Level:Nitrogen.level, valdez_2023[valdez_2023$Snail.Level != "uncaged",], contrasts = list(Snail.Level = "contr.sum", Nitrogen.level = "contr.sum")), type = "III")
Anova Table (Type III tests) Response: below.biomass.g.meter.sq..m2.. Sum Sq Df F value Pr(>F) (Intercept) 4235224 1 847.6086 1.68e-12 *** Snail.Level 39024 2 3.9050 0.04941 * Nitrogen.level 26919 1 5.3874 0.03869 * Snail.Level:Nitrogen.level 2893 2 0.2895 0.75371 Residuals 59960 12 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
We can remove the interaction from the model, re-run it, and interpret main effects of the factors.
<- update(bgb_model_cont_removed, .~.-Snail.Level:Nitrogen.level) bgb_model_cont_removed_int_removed plot(bgb_model_cont_removed_int_removed)
Anova(bgb_model_cont_removed_int_removed, type="III")
Anova Table (Type III tests) Response: below.biomass.g.meter.sq..m2.. Sum Sq Df F value Pr(>F) (Intercept) 1239848 1 276.1645 1.303e-10 *** Snail.Level 39024 2 4.3461 0.03402 * Nitrogen.level 26919 1 5.9959 0.02812 * Residuals 62853 14 --- Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Which is the same as…
Anova(bgb_model_cont_removed_int_removed, type="II")
Anova Table (Type II tests)
Response: below.biomass.g.meter.sq..m2..
Sum Sq Df F value Pr(>F)
Snail.Level 39024 2 4.3461 0.03402 *
Nitrogen.level 26919 1 5.9959 0.02812 *
Residuals 62853 14
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
When we do this, the contrasts methods matter less,
<- update(bgb_model_cont_removed, .~.-Snail.Level:Nitrogen.level, contrasts = list(Snail.Level = "contr.sum", Nitrogen.level = "contr.sum"))
bgb_model_cont_removed_int_removed Anova(bgb_model_cont_removed_int_removed, type="III")
Anova Table (Type III tests)
Response: below.biomass.g.meter.sq..m2..
Sum Sq Df F value Pr(>F)
(Intercept) 4235224 1 943.3564 3.015e-14 ***
Snail.Level 39024 2 4.3461 0.03402 *
Nitrogen.level 26919 1 5.9959 0.02812 *
Residuals 62853 14
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
but you shouldn’t report main effects from the model with the interaction.
Compare to these to what we actually see (always a good idea!)
Plotting outcomes
Results from two-way ANOVAs are often plotted similarly to one-way ANOVAs, but with colors or other aesthetics representing the additional group.
<- summarySE(valdez_2023[valdez_2023$Snail.Level != "uncaged",], measurevar = "below.biomass.g.meter.sq..m2..", groupvars = c("Snail.Level", "Nitrogen.level")) sdm_summary
$Snail.Level <- relevel(sdm_summary$Snail.Level, "removal")
sdm_summary$Nitrogen.level <- revalue(sdm_summary$Nitrogen.level, c("Fertilized" = "Yes", "without"= "No"))
sdm_summaryggplot(sdm_summary, aes(x=Snail.Level,
y=below.biomass.g.meter.sq..m2..,
fill=Nitrogen.level)) +
geom_col(color="black", position=position_dodge()) +
geom_errorbar(aes(ymin=below.biomass.g.meter.sq..m2.., ymax=below.biomass.g.meter.sq..m2..+ci),
position = position_dodge()) +
labs(title="Grazing impacts depend on nitrogen levels",
x= "Grazing level",
y= expression(paste("Standing dry mass (" , g^{-1}, m^{-2}, ")")),
fill = "Fertilized?")
In general, an interaction is not apparent, fertilized plots tend to do better than unfertilized, and there might be less dry mass as snails increase.
Regardless, we can interpret main effects (though with possibly different outcomes). The benefit of approach 2 (removing the interaction) is we “increase” the degrees of freedom associated with the residuals, which ends up reducing the the MST. This is because in 2-way ANOVAs we allocate degrees of freedom to calculating main effects and interactions. This approach was likely used in the original manuscript.
Simply using the the provided output (approach 1) and not performing another series of tests, however, reduces the chances for a Type 1 error. We will return to this discussion when we get to model selection.
If we see significant effects of a factor that has more than 2 levels (like we do when using the approach that drops insignificant interactions), we can consider the general impact of grazing levels using post-hoc tests:
summary(glht(bgb_model_cont_removed_int_removed, linfct = mcp(Snail.Level = "Tukey")))
Simultaneous Tests for General Linear Hypotheses
Multiple Comparisons of Means: Tukey Contrasts
Fit: lm(formula = below.biomass.g.meter.sq..m2.. ~ Snail.Level + Nitrogen.level,
data = valdez_2023[valdez_2023$Snail.Level != "uncaged",
], contrasts = list(Snail.Level = "contr.sum", Nitrogen.level = "contr.sum"))
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
removal - control snails == 0 55.27 38.68 1.429 0.3534
snail addition - control snails == 0 -58.76 38.68 -1.519 0.3124
snail addition - removal == 0 -114.03 38.68 -2.948 0.0272 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
Note the contrast approach has minimal impact here
summary(glht(lm(below.biomass.g.meter.sq..m2..~Snail.Level + Nitrogen.level
$Snail.Level != "uncaged",], contrasts = list(Snail.Level = "contr.sum", Nitrogen.level = "contr.sum")), linfct = mcp(Snail.Level = "Tukey"))) , valdez_2023[valdez_2023
Simultaneous Tests for General Linear Hypotheses
Multiple Comparisons of Means: Tukey Contrasts
Fit: lm(formula = below.biomass.g.meter.sq..m2.. ~ Snail.Level + Nitrogen.level,
data = valdez_2023[valdez_2023$Snail.Level != "uncaged",
], contrasts = list(Snail.Level = "contr.sum", Nitrogen.level = "contr.sum"))
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
removal - control snails == 0 55.27 38.68 1.429 0.354
snail addition - control snails == 0 -58.76 38.68 -1.519 0.312
snail addition - removal == 0 -114.03 38.68 -2.948 0.027 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
When significant
If the interaction term is significant, it means the main effects can not be interpreted. This is because the impact of a given variable depends on the level of another. When this happens, the data is typically divided into subset and analyzed using one-way ANOVAs.
For example, when Valdez et al. (2023) analyzed standing dead mass, they found a significant interaction term:
<-lm(Standing.Dead..dry..m2.~Snail.Level * Nitrogen.level, valdez_2023[valdez_2023$Snail.Level != "uncaged",])
sdm_model plot(sdm_model)
Anova(sdm_model, type="III")
Anova Table (Type III tests)
Response: Standing.Dead..dry..m2.
Sum Sq Df F value Pr(>F)
(Intercept) 1050.19 1 328.463 4.392e-10 ***
Snail.Level 300.82 2 47.042 2.095e-06 ***
Nitrogen.level 348.39 1 108.963 2.248e-07 ***
Snail.Level:Nitrogen.level 200.90 2 31.417 1.700e-05 ***
Residuals 38.37 12
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Following this, you could investigate impacts in plots with nitrogen, where you find snail manipulation had a significant impact (and considered post-hoc which ones!)
<-lm(Standing.Dead..dry..m2.~Snail.Level, valdez_2023[valdez_2023$Snail.Level != "uncaged" & valdez_2023$Nitrogen.level == "Fertilized",])
sdm_model_fertilized plot(sdm_model_fertilized)
Anova(sdm_model_fertilized, type= "III")
Anova Table (Type III tests)
Response: Standing.Dead..dry..m2.
Sum Sq Df F value Pr(>F)
(Intercept) 1050.19 1 206.027 7.158e-06 ***
Snail.Level 300.82 2 29.507 0.000786 ***
Residuals 30.58 6
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
summary(glht(sdm_model_fertilized, linfct = mcp(Snail.Level= "Tukey")))
Simultaneous Tests for General Linear Hypotheses
Multiple Comparisons of Means: Tukey Contrasts
Fit: lm(formula = Standing.Dead..dry..m2. ~ Snail.Level, data = valdez_2023[valdez_2023$Snail.Level !=
"uncaged" & valdez_2023$Nitrogen.level == "Fertilized", ])
Linear Hypotheses:
Estimate Std. Error t value Pr(>|t|)
removal - control snails == 0 -14.150 1.843 -7.676 <0.001 ***
snail addition - control snails == 0 -7.567 1.843 -4.105 0.0150 *
snail addition - removal == 0 6.583 1.843 3.571 0.0272 *
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
(Adjusted p values reported -- single-step method)
and plots without added nutrients, where you find snail addition did not
<-lm(Standing.Dead..dry..m2.~Snail.Level, valdez_2023[valdez_2023$Snail.Level != "uncaged" & valdez_2023$Nitrogen.level == "without",])
sdm_model_not_fertilized plot(sdm_model_not_fertilized)
Anova(sdm_model_not_fertilized, type= "III")
Anova Table (Type III tests)
Response: Standing.Dead..dry..m2.
Sum Sq Df F value Pr(>F)
(Intercept) 36.123 1 27.8457 0.001871 **
Snail.Level 12.416 2 4.7856 0.057211 .
Residuals 7.783 6
---
Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
Other approaches for dealing with significant interactions include directly interpreting interaction terms (which we can do given our understanding of linear model coefficients), but this is rarely used. They are somewhat messy
coef(sdm_model)
(Intercept)
18.710000
Snail.Levelremoval
-14.150000
Snail.Levelsnail addition
-7.566667
Nitrogen.levelwithout
-15.240000
Snail.Levelremoval:Nitrogen.levelwithout
16.153333
Snail.Levelsnail addition:Nitrogen.levelwithout
10.356667
Another approach when interactions are significant is to compare all group means (somewhat like a Tukey design for combined treatment levels). The emmeans package offers this approach.
library(emmeans)
emmeans(sdm_model, pairwise ~ Snail.Level*Nitrogen.level)
$emmeans
Snail.Level Nitrogen.level emmean SE df lower.CL upper.CL
control snails Fertilized 18.71 1.03 12 16.46 20.96
removal Fertilized 4.56 1.03 12 2.31 6.81
snail addition Fertilized 11.14 1.03 12 8.89 13.39
control snails without 3.47 1.03 12 1.22 5.72
removal without 5.47 1.03 12 3.22 7.72
snail addition without 6.26 1.03 12 4.01 8.51
Confidence level used: 0.95
$contrasts
contrast estimate SE df t.ratio
control snails Fertilized - removal Fertilized 14.150 1.46 12 9.692
control snails Fertilized - snail addition Fertilized 7.567 1.46 12 5.183
control snails Fertilized - control snails without 15.240 1.46 12 10.439
control snails Fertilized - removal without 13.237 1.46 12 9.066
control snails Fertilized - snail addition without 12.450 1.46 12 8.528
removal Fertilized - snail addition Fertilized -6.583 1.46 12 -4.509
removal Fertilized - control snails without 1.090 1.46 12 0.747
removal Fertilized - removal without -0.913 1.46 12 -0.626
removal Fertilized - snail addition without -1.700 1.46 12 -1.164
snail addition Fertilized - control snails without 7.673 1.46 12 5.256
snail addition Fertilized - removal without 5.670 1.46 12 3.884
snail addition Fertilized - snail addition without 4.883 1.46 12 3.345
control snails without - removal without -2.003 1.46 12 -1.372
control snails without - snail addition without -2.790 1.46 12 -1.911
removal without - snail addition without -0.787 1.46 12 -0.539
p.value
<.0001
0.0024
<.0001
<.0001
<.0001
0.0072
0.9716
0.9868
0.8450
0.0021
0.0206
0.0512
0.7419
0.4406
0.9932
P value adjustment: tukey method for comparing a family of 6 estimates
This was likely the approach used in the Valdez et al. 2023 paper.
paged_table(data.frame(emmeans(sdm_model, pairwise ~ Snail.Level*Nitrogen.level)$contrasts))
paged_table(data.frame(emmeans(sdm_model, pairwise ~ Snail.Level)$contrasts))
NOTE: Results may be misleading due to involvement in interactions
paged_table(data.frame(emmeans(sdm_model, pairwise ~ Nitrogen.level)$contrasts))
NOTE: Results may be misleading due to involvement in interactions
Note the warning; if interactions are significant comparing main effects may be inappropriate (which is why other approaches include subsetting the data).
Other options
Bootstrapping and permutation tests options may also be used for two-way ANOVAs when assumptions are not met, though there is implementation is more complicated than single-sample designs due to the need to randomize/permute interaction impacts.
Plotting outcomes
Results from two-way ANOVAs are often plotted similarly to one-way ANOVAs, but with colors or other aesthetics representing the additional group.
<- summarySE(valdez_2023[valdez_2023$Snail.Level != "uncaged",], measurevar = "Standing.Dead..dry..m2.", groupvars = c("Snail.Level", "Nitrogen.level"))
sdm_summary sdm_summary
Snail.Level Nitrogen.level N Standing.Dead..dry..m2. sd se
1 control snails Fertilized 3 18.710000 3.6626220 2.1146158
2 control snails without 3 3.470000 1.0013491 0.5781292
3 removal Fertilized 3 4.560000 0.4250882 0.2454248
4 removal without 3 5.473333 1.4150029 0.8169523
5 snail addition Fertilized 3 11.143333 1.3025104 0.7520047
6 snail addition without 3 6.260000 0.9417006 0.5436911
ci
1 9.098457
2 2.487489
3 1.055978
4 3.515062
5 3.235615
6 2.339314
$Snail.Level <- relevel(sdm_summary$Snail.Level, "removal")
sdm_summary
$Nitrogen.level <- revalue(sdm_summary$Nitrogen.level, c("Fertilized" = "Yes",
sdm_summary"without"= "No"))
ggplot(sdm_summary, aes(x=Snail.Level,
y=Standing.Dead..dry..m2.,
fill=Nitrogen.level)) +
geom_col(color="black", position=position_dodge()) +
geom_errorbar(aes(ymin=Standing.Dead..dry..m2., ymax=Standing.Dead..dry..m2.+ci), position = position_dodge()) +
labs(title="Grazing impacts depend on nitrogen levels",
x= "Grazing level",
y= expression(paste("Standing dry mass (" , g^{-1}, m^{-2}, ")")),
fill = "Fertilized?")
Next steps
In the next chapters we will carry our linear model approach to consider the relationship between continuous outcomes and continuous predictor variables.