I'm trying to understand the use of median_hilow in ggplot. I was hoping to find a way to plot the upper and lower quartiles (the interquartile range), but I can't find a full explanation of 'fun.data=median_hilow' anywhere, even though I assume it is doing the correct thing. Is there any full documentation for this function, so I can check how it is plotting IQRs?
library(ggplot2)
ggplot(DF,aes(x=cnd,y=bias,colour=cnd)) +
stat_summary(fun.data=median_hilow)
median_hilow is just a wrapper around smedian.hilow, which comes from the Hmisc package. As per @BondedDust's comment below, Hmisc needs to be installed beforehand.
From the documentation of the smean / smedian group of functions in Hmisc (type ?smedian.hilow and ?median_hilow):
A number of statistical summary functions is provided for use with summary.formula and summarize (as well as tapply and by themselves). smean.cl.normal computes 3 summary variables: the sample mean and lower and upper Gaussian confidence limits based on the t-distribution. smean.sd computes the mean and standard deviation. smean.sdl computes the mean plus or minus a constant times the standard deviation. smean.cl.boot is a very fast implementation of the basic nonparametric bootstrap for obtaining confidence limits for the population mean without assuming normality. These functions all delete NAs automatically. smedian.hilow computes the sample median and a selected pair of outer quantiles having equal tail areas.
smedian.hilow calculates the median together with a lower and an upper quantile determined by conf.int, the fraction of the data the interval should cover. As an example:
> library(Hmisc)
> x <- rnorm(100)
> smedian.hilow(x, conf.int=.5) # 25th and 75th percentiles
Median Lower Upper
0.02036472 -0.76198947 0.71190404
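To make the "equal tail areas" wording concrete, here is a minimal base-R sketch of how conf.int maps to quantile probabilities. The helper name hilow_sketch is made up for illustration and is not part of Hmisc; it only mirrors the behaviour the help page describes, assuming the default conf.int of 0.95.

```r
# Hypothetical helper (not part of Hmisc): the median plus two outer
# quantiles that leave equal tail areas on either side.
hilow_sketch <- function(x, conf.int = 0.95, na.rm = TRUE) {
  tail <- (1 - conf.int) / 2   # area left in each tail
  q <- quantile(x, probs = c(0.5, tail, 1 - tail), na.rm = na.rm)
  names(q) <- c("Median", "Lower", "Upper")
  q
}

hilow_sketch(1:100)                  # 2.5th and 97.5th percentiles
hilow_sketch(1:100, conf.int = 0.5)  # 25th and 75th percentiles
```

With conf.int = 0.5, each tail holds 25% of the data, so the lower and upper values are the quartiles.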
See @BondedDust's answer for exactly how to use this with the ggplot2 function.
If you want the IQR then you do not want median_hilow, at least with its defaults, because it delivers the low value as the 2.5th percentile and the high value as the 97.5th percentile. (IQR would be 25th and 75th.)
> smedian.hilow(1:100)
Median Lower Upper
50.500 3.475 97.525
You can pass the conf.int parameter to Hmisc::smedian.hilow in this manner, using a conf.int of 0.5, which will give you the quartiles that bound the interquartile range because (as the Hmisc help page says) "smedian.hilow computes the sample median and a selected pair of outer quantiles having equal tail areas":
ggplot(DF,aes(x=cnd,y=bias,colour=cnd)) +
stat_summary(fun.data=median_hilow, conf.int=.5)
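In recent versions of ggplot2 (from 2.0.0 onward, I believe), extra arguments for the summary function are no longer picked up from the stat_summary call itself and have to go through fun.args instead. A sketch of the same plot in that style follows; the DF here is made-up data standing in for the asker's data frame, and Hmisc still needs to be installed.

```r
library(ggplot2)  # median_hilow() also needs the Hmisc package installed

# Made-up data standing in for the asker's DF (columns cnd and bias)
set.seed(1)
DF <- data.frame(cnd = rep(c("a", "b", "c"), each = 50), bias = rnorm(150))

# conf.int = 0.5 asks for the 25th and 75th percentiles around the median
p <- ggplot(DF, aes(x = cnd, y = bias, colour = cnd)) +
  stat_summary(fun.data = median_hilow, fun.args = list(conf.int = 0.5))
p
```

If the plotted ranges look far wider than the quartiles, it is worth checking whether conf.int was silently ignored and the default of 0.95 used.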
I think I can reproduce the results this way, which confirms it is doing what we expect:
library(plyr)
# Lower and upper quartiles of x, ignoring NAs
iqr <- function(x, ...) {
  qs <- quantile(as.numeric(x), probs = c(0.25, 0.75), na.rm = TRUE)
  names(qs) <- c("ymin", "ymax")
  qs
}
ddply(DF, .(cnd), summarise, new = iqr(bias))
This example also highlights how crucial the conf.int argument is.