broom 0.7.0

We’re excited to announce the release of broom 0.7.0 on CRAN!

broom is a package for summarizing statistical model objects in tidy tibbles. While several compatibility updates have been released in recent months, this is the first major update to broom in almost two years. This update includes many new tidier methods, bug fixes, improvements to existing tidier methods and their documentation, and improvements to maintainability and internal consistency. The full list of changes is available in the package release notes.

This release was made possible in part by the RStudio internship program, which has allowed one of us (Simon Couch) to work on broom full-time for the last month.

You can install the most recent broom update with the following code:

install.packages("broom")

Then attach it for use with:

library(broom)

We’ll outline some of the more notable changes below!

For one, this release includes support for several new model objects—many of these additions came from first-time contributors to broom!


- `anova` objects from the `car` package
- `pam` objects from the `cluster` package
- `drm` objects from the `drc` package
- `summary_emm` objects from the `emmeans` package
- `epi.2by2` objects from the `epiR` package
- `fixest` objects from the `fixest` package
- `regsubsets` objects from the `leaps` package
- `lm.beta` objects from the `lm.beta` package
- `rma` objects from the `metafor` package
- `mfx`, `logitmfx`, `negbinmfx`, `poissonmfx`, `probitmfx`, and `betamfx` objects from the `mfx` package
- `lmrob` and `glmrob` objects from the `robustbase` package
- `sarlm` objects from the `spatialreg` package
- `speedglm` objects from the `speedglm` package
- `svyglm` objects from the `survey` package

We have restored a simplified version of `glance.aov()`.
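As a rough illustration of one of the newly supported classes listed above, the sketch below tidies a `pam` object. It assumes the cluster package is installed alongside broom; the built-in `iris` data and the choice of three clusters are arbitrary and only serve the demonstration.

```r
library(broom)
library(cluster)

# Partition the four numeric iris measurements around 3 medoids
# (dataset and k are illustrative choices, not taken from the release notes)
fit <- pam(iris[, 1:4], k = 3)

# tidy(): one row per cluster, with cluster-level summaries
cluster_summary <- tidy(fit)

# glance(): a one-row summary of the clustering as a whole
overall <- glance(fit)

print(cluster_summary)
print(overall)
```

The same `tidy()` and `glance()` calls should apply to the other new classes, provided the package that defines each model is installed.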

This update also features many bug fixes and improvements to existing tidiers. Some of the more notable ones:

- Many improvements to the consistency of `augment.*()` methods:
  - If you pass a dataset to `augment()` via the `data` or `newdata` arguments, you are now guaranteed that the augmented dataset will have exactly the same number of rows as the original dataset. This differs from previous behavior primarily when there are missing values. Previously, `augment()` would drop rows containing `NA`. This should no longer be the case. As a result, `augment.*()` methods no longer accept an `na.action` argument.
  - In previous versions, several `augment.*()` methods inherited the `augment.lm()` method, but required additions to the `augment.lm()` method itself. We have shifted away from this approach in favor of re-implementing many `augment.*()` methods as standalone methods making use of internal helper functions. As a result, `augment.lm()` and some related methods have deprecated (previously unused) arguments.
  - The `.resid` column in the output of `augment.*()` methods is now consistently defined as `y - y_hat`.
  - `augment()` tries to give an informative error when `data` isn’t the original training data.

- Several `glance.*()` methods have been refactored in order to return a one-row tibble even when the model matrix is rank-deficient.
- `glance()` methods now return a `nobs` column, which contains the number of data points used to fit the model!
- Various warnings resulting from changes to the tidyr API in v1.0.0 have been fixed.

- Added options to provide additional columns in the outputs of `glance.biglm()`, `tidy.felm()`, `tidy.lmsobj()`, `tidy.lmodel2()`, `tidy.polr()`, `tidy.prcomp()`, `tidy.zoo()`, and `tidy_optim()`.
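A few of the `augment()` and `glance()` changes above can be checked on a small linear model. This is a minimal sketch, assuming broom 0.7.0 or later; the `mtcars` data and the two artificially blanked values are illustrative choices, not part of the release.

```r
library(broom)

# Illustrative data: blank out two predictor values in mtcars
d <- mtcars
d$wt[c(3, 7)] <- NA

# lm() drops the two incomplete rows when fitting (default na.action)
fit <- lm(mpg ~ wt, data = d)

# Passing the dataset via `newdata` keeps one output row per input row;
# rows with a missing predictor get NA in .fitted instead of being dropped
aug <- augment(fit, newdata = d)
same_rows <- nrow(aug) == nrow(d)
n_missing_fitted <- sum(is.na(aug$.fitted))

# On a complete dataset, .resid is observed minus fitted (y - y_hat)
fit_full <- lm(mpg ~ wt, data = mtcars)
aug_full <- augment(fit_full)
resid_ok <- isTRUE(all.equal(unname(aug_full$.resid),
                             aug_full$mpg - aug_full$.fitted))

# glance() returns a single row; nobs counts the rows actually used to fit
gl <- glance(fit)
print(gl$nobs)
```

Here `nobs` reflects the 30 complete rows used for fitting, while the augmented output still has all 32 rows of `d`.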

This release also contains a number of breaking changes and deprecations meant to improve maintainability and internal consistency.

We have changed how we report degrees of freedom for `lm` objects. This is especially important for instructors in statistics courses. Previously, the `df` column in `glance.lm()` reported the rank of the design matrix. Now it reports the degrees of freedom of the numerator for the overall F-statistic. This is equal to the rank of the model matrix minus one (unless you omit an intercept column), so the new `df` should be the old `df` minus one.
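To make the new definition concrete, the following sketch compares the rank of the design matrix with the value now reported by `glance.lm()`. It assumes broom 0.7.0 or later; the model formula is an arbitrary example.

```r
library(broom)

# Intercept plus two slopes, so the design matrix has rank 3
fit <- lm(mpg ~ wt + hp, data = mtcars)

rank_of_design <- fit$rank

# Numerator degrees of freedom of the overall F-statistic
f_numerator_df <- unname(summary(fit)$fstatistic["numdf"])

# The df column reported by glance() under the new definition
glance_df <- glance(fit)$df

print(c(rank = rank_of_design, f_numdf = f_numerator_df, glance_df = glance_df))
```

With an intercept in the model, the last two values agree and are one less than the rank.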

We are moving away from supporting `summary.*()` objects. In particular, we have removed `tidy.summary.lm()` as part of a major overhaul of internals. Instead of calling `tidy()` on `summary`-like objects, please call `tidy()` directly on model objects moving forward.

We have removed all support for the `quick` argument in `tidy()` methods. This simplifies internals and improves maintainability. We anticipate this will not influence many users, as few people seemed to use it. If this majorly cramps your style, let us know, as we are considering a new verb to return only model parameters. In the meantime, `stats::coef()` together with `tibble::enframe()` provides most of the functionality of `tidy(..., quick = TRUE)`.
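The `stats::coef()` plus `tibble::enframe()` workaround might look like the sketch below. The column names `term` and `estimate` are chosen here to mirror the usual `tidy()` output; they are an assumption of this example rather than something the workaround produces by default.

```r
library(tibble)

fit <- lm(mpg ~ wt + hp, data = mtcars)

# coef() returns a named numeric vector; enframe() turns the names and
# values into a two-column tibble without computing standard errors
quick_coefs <- enframe(stats::coef(fit), name = "term", value = "estimate")

print(quick_coefs)
```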

All `conf.int` arguments now default to `FALSE`, and all `conf.level` arguments now default to `0.95`. This should primarily affect `tidy.survreg()`, which previously always returned confidence intervals, although there are some others.
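In practice, confidence intervals now have to be requested explicitly. A minimal sketch with a linear model, assuming broom 0.7.0 or later (the 90% level is an arbitrary choice for illustration):

```r
library(broom)

fit <- lm(mpg ~ wt, data = mtcars)

# Default call: no confidence interval columns
default_out <- tidy(fit)

# Opt in to intervals, optionally overriding the 0.95 default level
with_ci <- tidy(fit, conf.int = TRUE, conf.level = 0.90)

print(names(default_out))
print(names(with_ci))
```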

Tidiers for `emmeans`-objects use the arguments `conf.int` and `conf.level` instead of relying on the argument names native to the `emmeans::summary()`-methods (i.e., `infer` and `level`). Similarly, `multcomp`-tidiers now include a call to `summary()`, as previous behavior was akin to setting the now removed argument `quick = TRUE`. Both families of tidiers now use the `adj.p.value` column name when appropriate. Finally, `emmeans`-, `multcomp`-, and `TukeyHSD`-tidiers now consistently use the column names `contrast` and `null.value` instead of `comparison`, `level1` and `level2`, or `lhs` and `rhs`.
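The harmonized column names can be inspected with a `TukeyHSD` object, which needs no packages beyond broom and base R. This is a sketch assuming broom 0.7.0 or later; the `warpbreaks` model is an illustrative choice.

```r
library(broom)

# One-way ANOVA on a built-in dataset, followed by Tukey's HSD
fit <- aov(breaks ~ tension, data = warpbreaks)
tukey <- tidy(TukeyHSD(fit))

# Expect `contrast`, `null.value`, and `adj.p.value` among the columns
print(names(tukey))
print(tukey)
```

The three levels of `tension` give three pairwise contrasts, so the result has three rows.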

This release of broom also deprecates several helper functions as well as tidier methods for a number of non-model objects, each in favor of more principled approaches from other packages (outlined in the NEWS file). Notably, though, tidiers have been deprecated for data frames, rowwise data frames, vectors, and matrices. Further, we have moved forward with the planned transfer of tidiers for mixed models to broom.mixed.

Almost all unit testing for the package is now supported by the modeltests package!

Also, we have revised several vignettes and moved them to the tidymodels website. For backward compatibility, the existing vignettes will now simply link to the revised versions.

Finally, the package’s website has moved from its previous tidyverse domain to broom.tidymodels.org.

Most notably, the broom dev team is changing the process for adding new tidying methods to the package. We now ask that issues/PRs requesting support for new model objects be directed to the model-owning package (i.e., the package that the model is exported from) rather than to broom. If the maintainers of those packages are unable or unwilling to provide tidying methods in the model-owning package, it might be possible to add the new tidier to broom. broom is near its limit of tidiers; adding more may make the package unsustainable.

For developers exporting tidying methods directly from model-owning packages, we are actively working to provide resources to both ease the process of writing new tidier methods and reduce the dependency burden of taking on broom generics and helpers. As for the first point, we recently posted an article on the tidymodels website providing notes on best practices for writing tidiers. This article will be kept up to date as we develop new resources for easing the process of writing new tidier methods. As for the latter, the r-lib/generics package provides lightweight dependencies for the main broom generics. We hope to soon provide a coherent suite of helper functions for use in external broom methods.
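As a sketch of what exporting tidiers from a model-owning package could look like, the example below defines `tidy()` and `glance()` methods against the generics from the generics package, without loading broom. The `fit_mean()` function and the `mean_fit` class are hypothetical and exist only for this illustration; in a real package the methods would also need to be registered in the NAMESPACE.

```r
library(generics)
library(tibble)

# Hypothetical model-fitting function and class, for illustration only
fit_mean <- function(x) {
  structure(
    list(
      estimate = mean(x),
      std.error = sd(x) / sqrt(length(x)),
      n = length(x)
    ),
    class = "mean_fit"
  )
}

# tidy(): one row per model component
tidy.mean_fit <- function(x, ...) {
  tibble(term = "mean", estimate = x$estimate, std.error = x$std.error)
}

# glance(): exactly one row of model-level summaries
glance.mean_fit <- function(x, ...) {
  tibble(nobs = x$n)
}

m <- fit_mean(c(2, 4, 6, 8))
tidy_out <- tidy(m)
glance_out <- glance(m)

print(tidy_out)
print(glance_out)
```

Depending only on generics keeps the dependency footprint small while still letting users call the familiar verbs on the new class.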

We anticipate that the most active development on the broom package, looking forward, will center on improving `augment()` methods. We are also hoping to change our CRAN release cycle and to provide incremental updates every several months rather than major changes every couple of years.

This release features work and input from over 140 contributors (over 50 of them for the first time) since the last major release. See the package release notes for more specific notes on contributions. Thank you all for your thoughtful comments, patience, and hard work!

@abbylsmith, @acoppock, @ajb5d, @aloy, @AndrewKostandy, @angusmoore, @anniew, @aperaltasantos, @asbates, @asondhi, @asreece, @atyre2, @bachmeil, @batpigandme, @bbolker, @benjbuch, @bfgray3, @BibeFiu, @billdenney, @BrianOB, @briatte, @bruc, @brunaw, @brunolucian, @bschneidr, @carlislerainey, @CGMossa, @CharlesNaylor, @ChuliangXiao, @cimentadaj, @crsh, @cwang23, @DavisVaughan, @dchiu911, @ddsjoberg, @dgrtwo, @dmenne, @dylanjm, @ecohen13, @economer, @EDiLD, @ekatko1, @ellessenne, @ethchr, @florencevdubois, @GegznaV, @gershomtripp, @grantmcdermott, @gregmacfarlane, @hadley, @haozhu233, @hasenbratan, @HenrikBengtsson, @hermandr, @hideaki, @hughjonesd, @iago-pssjd, @ifellows, @IndrajeetPatil, @Inferrator, @istvan60, @jamesmartherus, @JanLauGe, @jasonyang5, @jaspercooper, @jcfisher, @jennybc, @jessecambon, @jkylearmstrongibx, @jmuhlenkamp, @JulianMutz, @Jungpin, @jwilber, @jyuu, @karissawhiting, @karldw, @khailper, @krauskae, @kuriwaki, @kyusque, @KZARCA, @Laura-O, @ldlpdx, @ldmahoney, @lilymedina, @llendway, @lrose1, @ltobalina, @LukasWallrich, @lukesonnet, @lwjohnst86, @malcolmbarrett, @margarethannum, @mariusbarth, @MatthieuStigler, @mattle24, @mattpollock, @mattwarkentin, @mine-cetinkaya-rundel, @mkirzon, @mlaviolet, @Move87, @namarkus, @nlubock, @nmjakobsen, @ns-1m, @nt-williams, @oij11, @petrhrobar, @PirateGrunt, @pjpaulpj, @pkq, @poppymiller, @QuLogic, @randomgambit, @riinuots, @RobertoMuriel, @Roisin-White, @romainfrancois, @rsbivand, @serina-robinson, @shabbybanks, @Silver-Fang, @Sim19, @simonpcouch, @sjackson1236, @softloud, @stefvanbuuren, @strengejacke, @sushmitavgopalan16, @tcuongd, @thisisnic, @topepo, @tyluRp, @vincentarelbundock, @vjcitn, @vnijs, @weiyangtham, @william3031, @x249wang, @xieguagua, @yrosseel, and @zoews
