A Lazy Function

I have already written 2 posts about writing functions, and I will try to diversify my content. That said, I won’t refrain from sharing something that has been helpful to me. The function(s) I describe in this post is an artefact left over from before I started using R Markdown. It is a product of its time but may still be of use to people who haven’t switched to R Markdown yet. It is lazy (and quite imperfect) solution to a tedious task.

The Problem

At the time I wrote this function I was using R for my statistics and Libreoffice for writing. I would run a test in R and then write it up in Libreoffice. Each value that needed reporting had to be transferred from my R output to Libreoffice – and for each test there are a number of values that need reporting. Writing up these tests is pretty formulaic. There’s a set structure to the sentence, for example writing up a t-test with a significant result nearly always looks something like this:

An independent samples t-test revealed a significant difference in X between the Y sample, (M = [ ], SD = [ ]), and the Z sample, (M = [ ], SD = [ ]), t([df]) = [ ], p = [ ].

And the write up of a non-significant result looks something like this:

An independent samples t-test revealed no significant difference in X between the Y sample, (M = [ ], SD = [ ]), and the Z sample, (M = [ ], SD = [ ]), t([df]) = [ ], p = [ ].

Seven values (the square [ ] brackets) need to be reported for this single test. Whether you copy and paste or type each value, the reporting of such tests can be very tedious, and leave you prone to errors in reporting.

The Solution

In order to make reporting values easier (and more accurate) I wrote the t_paragraph() function (and the related t_paired_paragraph() function). This provided an output that I could copy and paste into a Word (Libreoffice) document. This function is part of the desnum package (McHugh, 2017).

The t_parapgraph() Function

The t_parapgraph() function runs a t-test and generates an output that can be copied and pasted into a word document. The code for the function is as follows:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41





t_paragraph <- function (x, y, measure){
 
 t <- t.test(x ~ y)
 
 labels <- levels(y)
 
 
 tsl <- as.vector(t$statistic)
 ts <- round(tsl, digits = 3)
 tpl <- as.vector(t$p.value)
 tp <- round(tpl, digits = 3)
 d_fl <- as.vector(t$parameter)
 d_f <- round(d_fl, digits = 2)
 ml <- as.vector(tapply(x, y, mean))
 m <- round(ml, digits = 2)
 sdl <- as.vector(tapply(x, y, sd))
 sd <- round(sdl, digits = 2)
 
 
 if (tp < 0.05) 
 print(paste0("An independent samples t-test revealed a significant difference in ", 
 measure, " between the ", labels[1], " sample, (M = ", 
 m[1], ", SD = ", sd[1], "), and the ", labels[2], 
 " sample, (M =", m[2], ", SD =", sd[2], "), t(", 
 d_f, ") = ", ts, ", p = ", tp, "."), quote = FALSE, 
 digits = 2)
 
 
 if (tp > 0.05) 
 print(paste0("An independent samples t-test revealed no difference in ", 
 measure, " between the ", labels[1], " sample, (M = ", 
 m[1], ", SD = ", sd[1], "), and the ", labels[2], 
 " sample, (M = ", m[2], ", SD =", sd[2], "), t(", 
 d_f, ") = ", ts, ", p = ", tp, "."), quote = FALSE, 
 digits = 2)
}

When using t_paragraph()x is your DV, y is your grouping variable while measure is a string value that the name of the dependent variable. To illustrate the function I’ll use the mtcars dataset.

Applications of the t_parapgraph() Function

The mtcars dataset is comes with R. For information on it simply type help(mtcars). The variables of interest here are am(transmission; 0 = automatic, 1 = manual), mpg (miles per gallon), qsec (1/4 mile time). The two questions I’m going to look at are:

  1. Is there a difference in miles per gallon depending on transmission?

  2. Is there a difference in 1/4 mile time depending on transmission?

Before running the test it is a good idea to look at the data. Because we’re going to look at differences between groups we want to run descriptives for each group separately. To do this I’m going to combine the the descriptives() function which I previously covered here (also part of the desnum package) and the tapply() function.

The tapply() function allows you to run a function on subsets of a dataset using a grouping variable (or index). The arguments are as follows tapply(vector, index, function)vector is the variable you want to pass through function; and index is the grouping variable. The examples below will make this clearer.

We want to run descriptives on mtcars$mpg and on mtcars$qsec and for each we want to group by transmission (mtcars$am). This can be done using tapply() and descriptives() together as follows:

1
tapply(mtcars$mpg, mtcars$am, descriptives)
1
2
3
4
5
6
7
## $`0`
## mean sd min max len
## 1 17.14737 3.833966 10.4 24.4 19
## 
## $`1`
## mean sd min max len
## 1 24.39231 6.166504 15 33.9 13

Recall that 0 = automatic, and 1 = manual. Replace mpg with qsec and run again:

1
tapply(mtcars$qsec, mtcars$am, descriptives)
1
2
3
4
5
6
7
## $`0`
## mean sd min max len
## 1 18.18316 1.751308 15.41 22.9 19
## 
## $`1`
## mean sd min max len
## 1 17.36 1.792359 14.5 19.9 13

Running t_paragraph()

Now that we know the values for automatic vs manual cars we can run our t-tests using t_paragraph(). Our first question:

Is there a difference in miles per gallon depeding on transmission?

1
t_paragraph(mtcars$mpg, mtcars$am, "miles per gallon")
1
## [1] An independent samples t-test revealed a significant difference in miles per gallon between the sample, (M = 17.15, SD = 3.83), and the sample, (M =24.39, SD =6.17), t(18.33) = -3.767, p = 0.001.

There is a difference, and the output above can be copied and pasted into a word document with minimal changes required.

Our second question was:

Is there a difference in 1/4 mile time depending on transmission?

1
t_paragraph(mtcars$qsec, mtcars$am, "quarter-mile time")
1
## [1] An independent samples t-test revealed no difference in quarter-mile time between the sample, (M = 18.18, SD = 1.75), and the sample, (M = 17.36, SD =1.79), t(25.53) = 1.288, p = 0.209.

This time there was no significant difference, and again the output can be copied and pasted into word with minimal changes.

Limitations

The function described was written a long time ago, and could be updated. However I no longer copy and paste into word (having switched to R markdown instead). The reporting of the p value is not always to APA standards. If p is < .001 this is what should be reported. The code for t_paragraph() could be updated to include the p_report function (described here) which would address this. Another limitation is that the formatting of the text isn’t perfect, the letters (N,M,SD,t,p) should all be italicised, but having to manually fix this formatting is still easier than manually transferring individual values.

Conclusion

Despite the limitations the functions t_paragraph() and t_paired_paragraph() have made my life easier. I still use them occasionally. I hope they can be of use to anyone who is using R but has not switched to R Markdown yet.

References

McHugh, C. (2017). Desnum: Creates some useful functions.