I'm trying to understand the use of median_hilow in ggplot. I was hoping to find a way to plot the upper and lower quartiles (the interquartile range), but I can't find a full explanation of 'fun.data=median_hilow' anywhere, even though I assume it is doing the correct thing. Is there any full documentation for this function, so I can check how it is plotting IQRs?

library(ggplot2)
ggplot(DF,aes(x=cnd,y=bias,colour=cnd)) + 
  stat_summary(fun.data=median_hilow)

median_hilow is just a wrapper around smedian.hilow, which comes from the Hmisc package.

As per @BondedDust's comment below, the Hmisc package needs to be installed beforehand.

From the documentation of the smean / smedian group of functions in Hmisc (type ?smedian.hilow and ?median_hilow):

A number of statistical summary functions is provided for use with summary.formula and summarize (as well as tapply and by themselves). smean.cl.normal computes 3 summary variables: the sample mean and lower and upper Gaussian confidence limits based on the t-distribution. smean.sd computes the mean and standard deviation. smean.sdl computes the mean plus or minus a constant times the standard deviation. smean.cl.boot is a very fast implementation of the basic nonparametric bootstrap for obtaining confidence limits for the population mean without assuming normality. These functions all delete NAs automatically. smedian.hilow computes the sample median and a selected pair of outer quantiles having equal tail areas.

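smedian.hilow calculates the median together with a lower and an upper quantile, chosen so that the interval between them covers the fraction conf.int of the data with equal tails on either side. With conf.int = 0.5 these are the lower and upper quartiles. As an example: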

> library(Hmisc)
> x <- rnorm(100)
> smedian.hilow(x, conf.int=.5)  # 25th and 75th percentiles
     Median       Lower       Upper 
 0.02036472 -0.76198947  0.71190404 

And you can have a look at @BondedDust's answer on exactly how this should be implemented with the ggplot2 function.

Does this mean one would have needed to execute library(Hmisc) in order to have that function in the workspace? I load it in my .Rprofile, but not everyone considers it as essential as I do. Actually it doesn't, since the first line of median_hilow does the loading for you, but it does mean that Hmisc would need to be installed first.
– IRTFM Feb 10, 2015 at 16:55

@BondedDust It is on the suggested imports in the ggplot2 vignette, which makes me believe that it would get automatically imported (or throw an error if it isn't installed) when the median_hilow function is being used.
– LyzandeR Feb 10, 2015 at 17:05

@BondedDust Yes, it is automatically imported. No need to explicitly import it previously. ggplot(mtcars,aes(x=cyl,y=wt)) + stat_summary(fun.data=median_hilow)
– LyzandeR Feb 10, 2015 at 17:09

As I said: it would be loaded by the function call. But it does need to be installed first, since it is not on the list of dependencies. See the package DESCRIPTION file. (After looking at the help page, one can say it is not delivering what the questioner expected.)
– IRTFM Feb 10, 2015 at 17:15

@BondedDust Sorry BondedDust, I missed the second part of your comment. Yes, as you said, it needs to be installed beforehand.
– LyzandeR Feb 10, 2015 at 17:20

If you want the IQR then you do not want median_hilow, at least with its defaults, because it delivers the low value as the 2.5th percentile and the high value as the 97.5th percentile. (IQR would be 25th and 75th.)

> smedian.hilow(1:100)
Median  Lower  Upper 
50.500  3.475 97.525 
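To see where these numbers come from, here is a minimal base-R sketch of the equal-tail calculation that the Hmisc help page describes. The name hilow_sketch is hypothetical and not part of Hmisc or ggplot2, and the sketch assumes R's default quantile type; it illustrates the idea rather than reproducing the Hmisc source.

```r
# Base-R sketch of "median plus a pair of outer quantiles with equal tail areas".
# hilow_sketch is a hypothetical helper name used only for illustration.
hilow_sketch <- function(x, conf.int = 0.95, na.rm = TRUE) {
  # conf.int is the central fraction kept; the rest is split evenly between tails
  probs <- c(0.5, (1 - conf.int) / 2, (1 + conf.int) / 2)
  q <- quantile(x, probs = probs, na.rm = na.rm, names = FALSE)
  setNames(q, c("Median", "Lower", "Upper"))
}

hilow_sketch(1:100)                  # default: 2.5th and 97.5th percentiles
hilow_sketch(1:100, conf.int = 0.5)  # 25th and 75th percentiles
```

With the default conf.int the first call should give the same three values as the smedian.hilow(1:100) output above, while conf.int = 0.5 moves the limits to the quartiles.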

You can pass the conf.int parameter through to Hmisc::smedian.hilow. A conf.int of 0.5 gives the 25th and 75th percentiles, i.e. the bounds of the interquartile range, because (as the Hmisc help page says) "smedian.hilow computes the sample median and a selected pair of outer quantiles having equal tail areas":

ggplot(DF,aes(x=cnd,y=bias,colour=cnd)) + 
   stat_summary(fun.data=median_hilow, conf.int=.5)

When applying conf.int=.5, I get a Warning: Ignoring unknown parameters: conf.int. How can this be overcome?
– user08041991 Aug 31, 2017 at 12:51
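Regarding the warning in the comment above: in more recent ggplot2 releases (2.0.0 and later, as far as I know), extra arguments for the summary function are passed through the fun.args argument of stat_summary rather than directly. The sketch below assumes ggplot2 and Hmisc are both installed and uses mtcars as stand-in data, since DF from the question is not available.

```r
library(ggplot2)  # median_hilow also needs Hmisc to be installed

# mtcars stands in for the questioner's DF (an assumption for illustration)
p <- ggplot(mtcars, aes(x = factor(cyl), y = wt, colour = factor(cyl))) +
  # conf.int goes inside fun.args, which forwards it to the summary function
  stat_summary(fun.data = median_hilow, fun.args = list(conf.int = 0.5))

print(p)
```

Here conf.int = 0.5 should make the plotted range run from the 25th to the 75th percentile of each group.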

I think that by doing this I can match the results, proving it's doing what we think:

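library(plyr)
# Lower and upper quartiles of x, named the way ggplot2 expects range limits
iqr <- function(x, ...) {
  qs <- quantile(as.numeric(x), probs = c(0.25, 0.75), na.rm = TRUE)
  names(qs) <- c("ymin", "ymax")
  qs
}
# Two rows per level of cnd: the 25th and then the 75th percentile of bias
ddply(DF, .(cnd), summarise, new = iqr(bias))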

But this example highlights how crucial the conf.int argument is.
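If you would rather not depend on Hmisc at all, another option is to hand stat_summary your own summary function. This is a sketch, not the approach taken by either answer: median_iqr is a hypothetical name, and mtcars again stands in for DF. As far as I know, fun.data expects a function that takes a numeric vector and returns a one-row data frame with columns y, ymin and ymax.

```r
library(ggplot2)

# Hypothetical helper: median with the 25th and 75th percentiles as limits
median_iqr <- function(x) {
  data.frame(
    y    = median(x, na.rm = TRUE),
    ymin = quantile(x, 0.25, na.rm = TRUE, names = FALSE),
    ymax = quantile(x, 0.75, na.rm = TRUE, names = FALSE)
  )
}

# mtcars is used here only because DF from the question is not available
p_iqr <- ggplot(mtcars, aes(x = factor(cyl), y = wt)) +
  stat_summary(fun.data = median_iqr)

print(p_iqr)
```

Because the quantiles are spelled out in the function, there is no conf.int default to get wrong.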