I am wondering how to use apply/lapply/sapply, or a for loop, to run many t-tests.

I have a set of two-level grouping variables and a set of numeric variables within the same dataframe.

For example the dataframe would look something like this:

Var1  Var2  …  Group1  Group2
Na    6     …  TRUE    TRUE
1     9     …  FALSE   FALSE
3     5     …  TRUE    TRUE
Na    Na    …  FALSE   FALSE
3     2     …  TRUE    TRUE
4     5     …  FALSE   FALSE
Na    1     …  FALSE   TRUE
21    2     …  TRUE    FALSE
19    3     …  FALSE   FALSE
12    7     …  TRUE    TRUE

with 10 numeric variables and 10 grouping variables. The main problem is the NAs in the data, which I can't figure out how to ignore.

for (i in 1:10){
    for (j in 11:20){
        print(t.test(df[i],df[j])$p.value, na.rm = TRUE)
    }
}

where the i's are my numeric variables and the j's are my grouping variables.

The idea is that I want to compare all numeric variables against all grouping variables but this code results in the error:

Error in if (stderr < 10 * .Machine$double.eps * max(abs(mx), abs(my)))
stop("data are essentially constant") : 
missing value where TRUE/FALSE needed
In addition: Warning messages:
1: In mean.default(y) : argument is not numeric or logical: returning NA
2: In var(y) : NAs introduced by coercion
3: In mean.default(y) : argument is not numeric or logical: returning NA
4: In var(y) : NAs introduced by coercion

I basically just need to ignore the NAs. I am not familiar with lapply or apply, but I know that in R they tend to be much simpler than running loops.
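As a small check on the NA concern: for two plain numeric vectors, t.test() drops missing values on its own, so no na.rm argument is needed (and na.rm is not something print() acts on). The sketch below uses made-up vectors x and y, not the real data, and compares the p-value with and without removing NAs by hand.

```r
# Toy vectors containing missing values (made-up data, not the asker's).
x <- c(NA, 1, 3, NA, 3, 4, NA, 21, 19, 12)
y <- c(6, 9, 5, NA, 2, 5, 1, 2, 3, 7)

# Run the test on the raw vectors and on copies with NAs removed by hand.
p_raw   <- t.test(x, y)$p.value
p_clean <- t.test(x[!is.na(x)], y[!is.na(y)])$p.value

# The two p-values agree, because t.test() already discards NAs.
print(all.equal(p_raw, p_clean))
```

If this holds, the error in the question is more likely caused by what is passed as the second argument than by the NAs themselves.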

Please let me know if there is an easy solution for this.

Ultimately I don't want to do tons of t.tests one by one...

Thanks,

for (i in 1:5) for (j in 6:10) print(t.test(mtcars[i], mtcars[j])$p.value) works, so the issue must be your data. Since you didn't post it, we can't be much more help. – rawr Dec 1, 2016 at 22:01

If you have a grouping variable group, and want to test whether elements of a vector vec that correspond to one group are different in mean from those that correspond to another group, the syntax is t.test(vec, by=group). If you write t.test(vec, group) it will compare the mean value of vec to the mean value of group. – gfgm Dec 1, 2016 at 22:09

@GabrielFGeislerMesevage t.test doesn't have a by argument, so that gets ignored: identical(t.test(sleep$extra, by = sleep$group), t.test(sleep$extra)) is true. – rawr Dec 1, 2016 at 22:44

You're right, what I wrote was incorrect. I won't delete it so your comment remains coherent. The correct syntax would be t.test(vec[group==0], vec[group==1]), or whatever the levels are called. But including the grouping variable as the y vector in the argument looks like what is generating the problem. – gfgm Dec 1, 2016 at 22:52

Hey all, thanks for the comments so far. I am still running into the problem with the NAs, but I have edited my question to include an example dataset and a clearer question. – Ash Dec 2, 2016 at 17:53
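The corrected syntax from the comments, t.test(vec[group==0], vec[group==1]), could be applied to every numeric/grouping pair with nested sapply() calls. This is only a sketch under stated assumptions: the data frame reuses the visible columns of the example table, reads "Na" as a missing value (NA), leaves out the elided columns, and the names num_vars, group_vars and p_values are made up for illustration.

```r
# Assumed example data: the visible columns of the table, with "Na" read as NA.
df <- data.frame(
  Var1   = c(NA, 1, 3, NA, 3, 4, NA, 21, 19, 12),
  Var2   = c(6, 9, 5, NA, 2, 5, 1, 2, 3, 7),
  Group1 = c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE, FALSE, TRUE, FALSE, TRUE),
  Group2 = c(TRUE, FALSE, TRUE, FALSE, TRUE, FALSE, TRUE, FALSE, FALSE, TRUE)
)

num_vars   <- c("Var1", "Var2")      # numeric columns to test
group_vars <- c("Group1", "Group2")  # two-level (logical) grouping columns

# For each pair, split the numeric column by the grouping column and
# compare the two halves. t.test() drops the NAs in each half itself.
p_values <- sapply(group_vars, function(g) {
  sapply(num_vars, function(v) {
    t.test(df[[v]][df[[g]]], df[[v]][!df[[g]]])$p.value
  })
})

# Rows are numeric variables, columns are grouping variables.
print(p_values)
```

With 10 numeric and 10 grouping columns the same call would return a 10 x 10 matrix of p-values. Running that many tests usually calls for a multiple-comparison correction such as p.adjust().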
