r - ggplot2 - One faceted plot does not show stat_compare_means Kruskal

Hello, when I try to plot with this code:
# Each `+` must end a line (not start the next one), otherwise R stops parsing after the first line
ggplot(subset(tabcourt, !is.na(Score) & !is.na(`PSA level (ng/ml)`))) +
  facet_wrap(. ~ Method, scales = 'free') +
  aes(x = Score, y = `PSA level (ng/ml)`, color = Method) +
  stat_compare_means(show.legend = FALSE, label.x.npc = 0.5, label.y.npc = 0.93, color = "black", size = 4) +
  geom_boxplot() + theme_bw()
It doesn't show the Kruskal-Wallis result on the middle plot. I have tried everything I could think of but can't find a solution. Any ideas on how to fix this?
Edit: using free_y instead of free fixes the bug, but then the x axis is bad (1 to 30 in every facet).
Here are the head and the str of the data:
                Your bracketing is weird, and aes usually goes inside a geom layer, not outside (I think). Please include a sample of your data to make this reproducible.
– morgan121
                Nov 7, 2019 at 23:25
I think your error could come either from how you passed your data to ggplot or from the data itself.
I don't have a sample of your data, so I used the sample dataset ToothGrowth with your stat_compare_means call, and I get the display you are looking for.
Here is my code:
library(ggpubr)
data("ToothGrowth")
# Box plot faceted by "dose"
p <- ggboxplot(ToothGrowth, x = "supp", y = "len",
               color = "supp", palette = "jco",
               add = "jitter",
               facet.by = "dose", short.panel.labs = FALSE)
# Adding stat_compare_means
p + stat_compare_means(show.legend=FALSE, label.x.npc = 0.5, 
                       label.y.npc = 0.93, color = "black", size = 4) + theme_bw()
Here is the plot:
If you use this instead, you get a better-looking plot:
p + stat_compare_means() + theme_bw()
UPDATE: TRICK TO GET THE FINAL PLOT DISPLAYED
So, I tried to reproduce your data in order to reproduce the plotting error you get, and I succeeded in plotting the p-values using a trick described in this post: R: ggplot2 - Kruskal-Wallis test per facet
Here is the code that I used to mimic your data:
set.seed(1)
# defining the sample dataset AJCC
PSA_levels <- rnorm(100,mean = 2, sd = 2)
AJCC_data <- data.frame(cbind(PSA_levels))
x <- NULL
for(i in 1:100) {x <- c(x,sample(1:4,1))}
AJCC_data$score <- x
AJCC_data$Method <- 'AJCC'
# defining the sample dataset Gleason
PSA_levels <- rnorm(100,mean = 2.5, sd = 1)
Gleason_data <- data.frame(cbind(PSA_levels))
x <- NULL
for(i in 1:100) {x <- c(x,sample(5:10,1))}
Gleason_data$score <- x
Gleason_data$Method <- 'Gleason'
# defining the sample dataset TNM
PSA_levels <- rnorm(100,mean = 2.5, sd = 1)
TNM_data <- data.frame(cbind(PSA_levels))
x <- NULL
for(i in 1:100) {x <- c(x,sample(1:30,1))}
TNM_data$score <- x
TNM_data$Method <- 'TNM'
df <- rbind(AJCC_data, Gleason_data, TNM_data)
df$score <- as.factor(df$score)
Here is the structure of df, which looks similar to your data tabcourt:
> str(df)
'data.frame':   300 obs. of  3 variables:
 $ PSA_levels: num  0.747 2.367 0.329 5.191 2.659 ...
 $ score     : Factor w/ 30 levels "1","2","3","4",..: 2 1 2 2 2 3 1 2 3 3 ...
 $ Method    : chr  "AJCC" "AJCC" "AJCC" "AJCC" ...
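As a side note, the three simulation blocks can probably be collapsed with a small helper and a vectorised sample() call instead of the for loops. This is only a sketch: simulate_method and df_sim are names I made up for illustration, and because the random draws happen in a different order, the values will not match the df shown above.

```r
set.seed(1)

# Hypothetical helper: simulate n observations for one scoring method.
simulate_method <- function(method, scores, mean, sd, n = 100) {
  data.frame(
    PSA_levels = rnorm(n, mean = mean, sd = sd),
    # Draw all n scores at once instead of growing a vector in a loop
    score      = sample(scores, n, replace = TRUE),
    Method     = method
  )
}

df_sim <- rbind(
  simulate_method("AJCC",    1:4,  mean = 2,   sd = 2),
  simulate_method("Gleason", 5:10, mean = 2.5, sd = 1),
  simulate_method("TNM",     1:30, mean = 2.5, sd = 1)
)
# Treat the score as a categorical variable, as in the code above
df_sim$score <- as.factor(df_sim$score)
str(df_sim)
```

The structure (300 rows, a factor score, a character Method) is the same as df, so the plotting code that follows should work on either one.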
Then, I tried to reproduce your faceted boxplot:
library(ggplot2)
library(ggpubr)
g <- ggplot(df, aes(x = score, y = PSA_levels, color = Method))
p <- g + facet_wrap(.~Method, scales = 'free_x')
p <- p + geom_boxplot()
p <- p + theme_bw()
When I tried to add p-values to the graph using the stat_compare_means function, I got the same plotting error as you. So, following the post cited above, I used the dplyr package to compute the p-value of the Kruskal-Wallis test for each group.
library(dplyr)
ptest <- df %>% group_by(Method) %>% summarize(p.value = kruskal.test(PSA_levels ~score)$p.value)
Here is the output of ptest:
> ptest
# A tibble: 3 x 2
  Method  p.value
  <chr>     <dbl>
1 AJCC      0.575
2 Gleason   0.216
3 TNM       0.226
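If you would rather not depend on dplyr, the same per-group test can be sketched in base R with split() and sapply(). I use the built-in ToothGrowth data here (one test per dose) so the snippet runs on its own; p_by_dose is just an illustrative name, and you would swap in your own data frame and columns.

```r
# Split the data by the faceting variable, then run one
# Kruskal-Wallis test per subset and keep only the p-value.
p_by_dose <- sapply(
  split(ToothGrowth, ToothGrowth$dose),
  function(d) kruskal.test(len ~ supp, data = d)$p.value
)

# Named numeric vector: one p-value per dose level
print(p_by_dose)
```

The result is a named vector rather than a tibble, so it needs to be turned into a data frame (with the faceting column included) before it can be passed to geom_text.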
Now, I can add that to the boxplot by doing:
p + geom_text(data = ptest, aes(x =  c(2,3,10), y = c(6,6,6), label = paste0("Kruskal-Wallis\n p=",round(p.value,3))))
And here is what you get:
So, I think it is because stat_compare_means did not understand which groups to compare and how to represent all the statistical comparisons on the graph. Doing the test outside of ggplot and then adding the result with geom_text solves the problem.
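One possible refinement, offered as a sketch rather than a tested fix for your data: instead of hard-coding x and y for each label, you can map them to Inf so the text is pinned to a corner of every panel, whatever its axis range. The example uses ToothGrowth so it is self-contained; labels_df and plot_kw are illustrative names.

```r
library(ggplot2)

# Per-facet Kruskal-Wallis p-values, computed outside ggplot
p_by_dose <- sapply(
  split(ToothGrowth, ToothGrowth$dose),
  function(d) kruskal.test(len ~ supp, data = d)$p.value
)

# One label per facet; the `dose` column lets ggplot match labels to panels
labels_df <- data.frame(
  dose  = as.numeric(names(p_by_dose)),
  label = paste0("Kruskal-Wallis\np = ", signif(p_by_dose, 3))
)

plot_kw <- ggplot(ToothGrowth, aes(x = supp, y = len, color = supp)) +
  geom_boxplot() +
  facet_wrap(~ dose, scales = "free_x") +
  # x = Inf, y = Inf puts the label in the top-right corner of each panel;
  # hjust / vjust pull it back inside the panel border.
  geom_text(
    data = labels_df,
    aes(x = Inf, y = Inf, label = label),
    hjust = 1.1, vjust = 1.5, color = "black",
    inherit.aes = FALSE
  ) +
  theme_bw()

plot_kw
```

Because geom_boxplot is added first, the x scale is already discrete when the text layer is added, which is also why no extra scale_x_discrete() call should be needed in this ordering.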
Hope it works with your real data!
                Hello, I tried your code and it worked, but only without free_x: when I added scales="free_x", your code didn't work. So I tried on mine, and when I remove the free scale it works... but then it's such a bad plot. Also, compare_means(!is.na(`mtDNA copy number`) ~ !is.na(Score), data = tabcourt, group.by = "Method") strangely gives me Error: Strings must match column names. Unknown columns: !is.na(Score). I verified that Score is there. How can I give you my data?
– MrIce
                Nov 8, 2019 at 7:06
                Sorry, I made a mistake; the correct code is compare_means(!is.na("mtDNA copy number") ~ !is.na("Score"), data = tabcourt, group.by = "Method"). (I forgot the quotes around Score.) Try that, it should work. By the way, there is no free_x in my code; where did you see that?
– dc37
                Nov 8, 2019 at 7:12
                I tried it because there is one in mine; without it I get 1 to 30 on the x axis of every facet. Here's the head: https://i.ibb.co/frX7Tm8/Capture.png Same error though: > compare_means(!is.na("mtDNA copy number") ~ !is.na("Score"), data = tabcourt, group.by = "Method") Error: Strings must match column names. Unknown columns: !is.na("Score") Call `rlang::last_error()` to see a backtrace
– MrIce
                Nov 8, 2019 at 7:20
                Sorry, I forgot str(tabcourt); here it is: https://i.ibb.co/Wp5fxrW/Capture2.png Thank you for your kind help!
– MrIce
                Nov 8, 2019 at 7:21
                Just a quick comment: in your code you have aes(x =Score, y =PSA level(ng/ml), color = Method). Isn't it supposed to be mtDNA copy number?
– dc37
                Nov 8, 2019 at 8:02
Thank you for this workaround! It did work, but I had to add + scale_x_discrete(); otherwise I'd get Error: Discrete value supplied to continuous scale
Here's the code I used, in case this happens to others:
library(dplyr)
library(ggplot2)
library(plotly)  # provides ggplotly()
# Column names containing spaces must be wrapped in backticks, not quotes
ptest = tabcourt %>% group_by(Method) %>% summarize(p.value = kruskal.test(`mtDNA copy number` ~ Score)$p.value)
p2 = ggplotly(ggplot(subset(tabcourt, !is.na(Score) & !is.na(`mtDNA copy number`)), aes(x = Score, y = `mtDNA copy number`, color = Method))
+ scale_x_discrete()
+ geom_text(data = ptest, aes(x = c(2,3,10), y = c(1.5,1.5,1.5), label = paste0("Kruskal-Wallis\n p=", round(p.value,3))))
+ facet_grid(. ~ Method, scales = 'free')
+ geom_boxplot()
+ theme_bw())
Weird, though, that stat_compare_means has a hard time doing its job!
                Great that it is working for you! Apparently, stat_compare_means is more limited than compare_means and, for example, cannot take the output of compare_means as an argument. Maybe in the near future, new versions of this function will solve this issue.
– dc37
                Nov 9, 2019 at 13:53