Summary analysis for categorical variables

mcatstat() returns a new data frame containing counts and percentages of a categorical analysis variable, computed within the grouping and treatment variables passed to mentry().

Usage

mcatstat(
  datain = NULL,
  a_subset = NA_character_,
  denom_subset = NA_character_,
  uniqid = "USUBJID",
  dptvar = NULL,
  pctdisp = "TRT",
  miss_catyn = "N",
  miss_catlabel = "Missing",
  cum_ctyn = "N",
  total_catyn = "N",
  total_catlabel = "Total",
  dptvarn = 1,
  pctsyn = "Y",
  sigdec = 2,
  denomyn = "N",
  sparseyn = "N",
  sparsebyvalyn = "N",
  return_zero = "N"
)

Arguments

datain: Input data, as returned by mentry(), from which counts are computed for each category.
a_subset: Analysis Subset condition specific to categorical analysis.
denom_subset: Subset condition to be applied to data set for calculating denominator.
uniqid: Variable(s) whose unique values are counted, e.g. "USUBJID", "SITEID", "ALLCT"
dptvar: Categorical analysis variable, optionally followed by its ordering variable, separated by /, e.g. "SEX", "SEX/SEXN", "AEDECOD", "ISTPT/ISTPTN"
pctdisp: Method used to calculate the denominator (for %). Possible values: "TRT", "VAR", "COL", "SUBGRP", "SGRPN", "CAT", "NONE", "NO", "DPTVAR", "BYVARxyN"
miss_catyn: Whether to include empty/blank values of the dptvar variable as a category labelled miss_catlabel. Values: "Y"/"N"
miss_catlabel: Label for missing values
cum_ctyn: To return cumulative frequency instead of individual frequencies for each category. Values: "Y"/"N"
total_catyn: To return a 'Total' row for the categories of dptvar variable or not. Possible values: "Y"/"N"
total_catlabel: Label for the total category row, e.g. "All"/"Total"
dptvarn: Number to assign as DPTVARN, useful for block sorting when multiple mcatstat() outputs are created to be combined.
pctsyn: Display Percentage Sign in table or not. Values: "Y"/"N"
sigdec: Number of decimal places for % displayed in output
denomyn: Display denominator in output or not. Values: "Y"/"N"
sparseyn: Whether to sparse missing categories/treatments. Values: "Y"/"N"
sparsebyvalyn: Whether to sparse missing categories within by groups. Values: "Y"/"N"
return_zero: Whether to return rows with zero counts when the analysis subset or non-missing values do not exist in the data. Values: "Y"/"N"

Value

A data.frame with counts and/or percentages, to be passed to tbl_processor() or graph functions.

Details

The object passed to datain is the value returned by mentry().
The a_subset condition is applied to the data being analysed; it is not applied when computing the denominator.
The denom_subset condition, if given, is applied to the denominator data only. It is usually used with pctdisp = "SUBGRP" or "DPTVAR".
uniqid is the name of the variable whose unique values are counted. If given as "ALLCT", all observations in the given category are counted. If "USUBJID", the number of unique subjects per category is calculated.
Set cum_ctyn to "Y" to return cumulative frequencies instead of individual frequencies. In that case, total_catyn is reset to "N".
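The difference between counting all observations and counting unique subjects, and the idea of a cumulative frequency, can be sketched in base R. This is an illustration on a made-up data frame (toy and its AESEV values are hypothetical); it does not call mcatstat() and is not a description of how the function is implemented. In particular, the order in which mcatstat() accumulates categories is not stated here, so alphabetical order is assumed.

```r
# Hypothetical toy data: one row per event, so a subject may appear more than once.
toy <- data.frame(
  USUBJID = c("01", "01", "02", "03", "03", "03"),
  AESEV   = c("MILD", "MILD", "MILD", "MODERATE", "SEVERE", "SEVERE"),
  stringsAsFactors = FALSE
)

# Analogue of uniqid = "ALLCT": every observation in a category is counted.
all_ct <- tapply(toy$USUBJID, toy$AESEV, length)

# Analogue of uniqid = "USUBJID": distinct subjects per category.
subj_ct <- tapply(toy$USUBJID, toy$AESEV, function(x) length(unique(x)))

# Analogue of cum_ctyn = "Y": running total over the categories
# (alphabetical order here, which is an assumption of this sketch).
cum_ct <- cumsum(all_ct)

all_ct   # MILD 3, MODERATE 1, SEVERE 2
subj_ct  # MILD 2, MODERATE 1, SEVERE 1
cum_ct   # MILD 3, MODERATE 4, SEVERE 6
```

With a running total, the last category already holds the overall count, which is consistent with total_catyn being reset to "N" when cum_ctyn is "Y".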
pctdisp sets the method used to obtain the denominator for the percentage calculation. Possible values:
- NONE/NO: No percent calculation
- TRT: Treatment total count acts as the denominator
- VAR: Variable total across all treatments/groups acts as the denominator
- COL: Column-wise denominator - percentage within each Treatment-Subgroup(s) combination
- CAT: Row-wise denominator - percentage within each By group(s)-dptvar combination
- SUBGRP: Percentage within each Treatment-By group(s)-Subgroup(s) combination
- DPTVAR: Percentage within each Treatment-By group(s)-Subgroup(s)-dptvar combination
- BYVARxyN: Percentage using the Treatment-By group combination as the denominator, e.g. BYVAR12N uses the TRT-BYVAR1-BYVAR2 combination and BYVAR1N uses only TRT-BYVAR1
- SGRPN: Percentage using Subgroup total as denominator
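
How the choice of denominator changes the percentage can be sketched in base R for two of the options above. The data frame toy and the column names DENOM_TRT, DENOM_VAR, PCT_TRT and PCT_VAR are hypothetical, and reading "VAR" as "all subjects across treatments" is an assumption of this sketch, not a statement about the internals of mcatstat().

```r
# Hypothetical toy data: one row per subject.
toy <- data.frame(
  USUBJID = sprintf("S%02d", 1:8),
  TRTVAR  = c("A", "A", "A", "A", "A", "B", "B", "B"),
  DPTVAL  = c("F", "F", "M", "M", "M", "F", "M", "M"),
  stringsAsFactors = FALSE
)

# Numerator: unique subjects in each treatment-by-category cell.
freq <- aggregate(USUBJID ~ TRTVAR + DPTVAL, data = toy,
                  FUN = function(x) length(unique(x)))
names(freq)[3] <- "FREQ"

# "TRT"-style denominator: unique subjects in each treatment.
trt_n <- tapply(toy$USUBJID, toy$TRTVAR, function(x) length(unique(x)))
freq$DENOM_TRT <- as.vector(trt_n[freq$TRTVAR])

# "VAR"-style denominator (assumed reading): all subjects across treatments.
freq$DENOM_VAR <- length(unique(toy$USUBJID))

# Percentages rounded to 2 decimals, mirroring the default sigdec = 2.
freq$PCT_TRT <- round(100 * freq$FREQ / freq$DENOM_TRT, 2)
freq$PCT_VAR <- round(100 * freq$FREQ / freq$DENOM_VAR, 2)

freq
```

For treatment A and category F, the same count of 2 gives 40% against the treatment total of 5 and 25% against the overall total of 8.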

Examples

data("adsl")

df_mentry <-
  adsl |> mentry(
    subset = "EFFFL=='Y'",
    byvar = "AGEGR1",
    trtvar = "TRT01A",
    trtsort = "TRT01AN",
    subgrpvar = "SEX/SEXN",
    trttotalyn = "N",
    add_grpmiss = "N",
    sgtotalyn = "N",
    pop_fil = "Overall Population"
  )

df_mentry |>
  mcatstat(
    a_subset = "SUBGRPVAR1 == 'F'",
    uniqid = "USUBJID",
    dptvar = "RACE/RACEN",
    pctdisp = "TRT"
  )
#> mcatstat success
#> # A tibble: 18 × 16
#>    BYVAR1 TRTVAR    SUBGRPVAR1 DPTVAR DPTVAL CVALUE DENOMN  FREQ DPTVALN BYVAR1N
#>    <chr>  <ord>     <chr>      <chr>  <chr>  <chr>   <int> <int>   <dbl>   <dbl>
#>  1 65-80  Placebo   F          RACE   BLACK… 2 ( 2…     79     2       2       2
#>  2 65-80  Placebo   F          RACE   WHITE  18 (2…     79    18       1       2
#>  3 <65    Placebo   F          RACE   BLACK… 1 ( 1…     79     1       2       1
#>  4 <65    Placebo   F          RACE   WHITE  7 ( 8…     79     7       1       1
#>  5 >80    Placebo   F          RACE   BLACK… 2 ( 2…     79     2       2       3
#>  6 >80    Placebo   F          RACE   WHITE  16 (2…     79    16       1       3
#>  7 65-80  Xanomeli… F          RACE   BLACK… 2 ( 2…     81     2       2       2
#>  8 65-80  Xanomeli… F          RACE   WHITE  24 (2…     81    24       1       2
#>  9 <65    Xanomeli… F          RACE   BLACK… 2 ( 2…     81     2       2       1
#> 10 <65    Xanomeli… F          RACE   WHITE  2 ( 2…     81     2       1       1
#> 11 >80    Xanomeli… F          RACE   BLACK… 2 ( 2…     81     2       2       3
#> 12 >80    Xanomeli… F          RACE   WHITE  15 (1…     81    15       1       3
#> 13 65-80  Xanomeli… F          RACE   BLACK… 4 ( 5…     74     4       2       2
#> 14 65-80  Xanomeli… F          RACE   WHITE  22 (2…     74    22       1       2
#> 15 <65    Xanomeli… F          RACE   WHITE  4 ( 5…     74     4       1       1
#> 16 >80    Xanomeli… F          RACE   WHITE  5 ( 6…     74     5       1       3
#> 17 <65    Xanomeli… F          RACE   BLACK… 0          74     0       2       1
#> 18 >80    Xanomeli… F          RACE   BLACK… 0          74     0       2       3
#> # ℹ 6 more variables: SUBGRPVAR1N <dbl>, PCT <dbl>, CPCT <chr>, XVAR <chr>,
#> #   DPTVARN <dbl>, CN <chr>
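
As a rough check of the output above, the percentages for the first two rows can be recomputed by hand from the FREQ and DENOMN columns. This sketch only redoes the arithmetic in base R; the vector names are hypothetical. DENOMN is 79 for Placebo even though a_subset restricts the analysis to females, which matches the note that a_subset is not applied when computing the denominator.

```r
# FREQ and DENOMN read from rows 1 and 2 of the output above (Placebo, 65-80).
freq   <- c(row1 = 2, row2 = 18)
denomn <- 79

# Percentage rounded to the default sigdec = 2.
pct <- round(100 * freq / denomn, 2)
pct
#>  row1  row2
#>  2.53 22.78
```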