Utility functions for creating new variables from logicals describing the levels
named logical "rules" defining the levels.
a logical indicating whether the resulting factor should be ordered.
Ignored if .asFactor is FALSE.
one of "unique", "first", and "last".
If "unique", exactly one rule must be TRUE for each position.
If "first", the first TRUE rule defines the level.
If "last", the last TRUE rule defines the level.
one of "default", "always", and "never", indicating
whether debugging information should be printed. If "default", debugging
information is printed only when multiple rules give conflicting definitions
for some positions.
One of "given" (the default) or "alpha", or
a vector of integers the same length as the number of levels indicating the
order in which the levels should appear in the resulting factor.
Ignored if .asFactor is FALSE.
A character vector of length 1 giving the name of the default level, or
NULL for no default.
A logical indicating whether the returned value should be a factor.
Each logical "rule" corresponds to a level in the resulting variable.
If .default is defined, an implicit rule is added that is TRUE
whenever all other rules are FALSE.
When there are multiple TRUE rules for a slot, the first or last such rule is used
or an error is generated, depending on the value of .method.
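The rule resolution described above can be sketched in base R. This is only an illustration of the stated semantics, not mosaic's implementation: resolve_rules() is a hypothetical helper, it returns plain level names rather than a factor, and it assumes the rules contain no missing values.

```r
# Hypothetical helper (not part of mosaic): turns named logical rules
# into level names following the "unique" / "first" / "last" semantics.
# Assumes the rules contain no NA values.
resolve_rules <- function(rules, method = c("unique", "first", "last"),
                          default = NULL) {
  method <- match.arg(method)
  m <- do.call(cbind, rules)              # logical matrix, one column per rule
  if (!is.null(default)) {
    # Implicit rule: TRUE wherever every explicit rule is FALSE
    m <- cbind(m, rowSums(m) == 0)
    colnames(m)[ncol(m)] <- default
  }
  hits <- rowSums(m)                      # number of TRUE rules per position
  if (method == "unique" && any(hits != 1)) {
    stop("exactly one rule must be TRUE for each position")
  }
  # max.col() on the 0/1 matrix finds the first or last TRUE column
  ties <- if (method == "last") "last" else "first"
  pick <- max.col(m + 0, ties.method = ties)
  out <- colnames(m)[pick]
  out[hits == 0] <- NA                    # no rule matched and no default given
  out
}

x <- c(0, 1, 2, 5)
rules <- list(zero = x == 0, small = x <= 2)   # both rules are TRUE for x == 0

resolve_rules(rules, method = "first", default = "large")
# "zero" "small" "small" "large"
resolve_rules(rules, method = "last", default = "large")
# "small" "small" "small" "large"
```

With method = "unique" the same rules raise an error, because two rules are TRUE at the first position.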
derivedVariable() is designed to be used with transform() or
dplyr::mutate() to add new
variables to a data frame. derivedFactor() is the same except that the
default value for .asFactor is TRUE. See the examples.
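Since the examples use dplyr::mutate(), here is a minimal sketch of the same idea with base transform(). The data frame d and the names x and size are made up for illustration, and the sketch assumes the mosaic package is installed.

```r
library(mosaic)

# Made-up data, for illustration only
d <- data.frame(x = c(1, 5, 10))

# transform() evaluates the rules inside the data frame, so x refers to d$x.
# The two rules are mutually exclusive, so the default method applies.
d <- transform(d, size = derivedFactor(small = x < 3, large = x >= 3))

is.factor(d$size)   # derivedFactor() defaults to .asFactor = TRUE
levels(d$size)
```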
library(mosaic)  # also attaches dplyr (mutate) and mosaicData (KidsFeet, HELPrct)
Kf <- mutate(KidsFeet, biggerfoot2 = derivedFactor(
dom = biggerfoot == domhand,
nondom = biggerfoot != domhand)
)
tally( ~ biggerfoot + biggerfoot2, data = Kf)
#>           biggerfoot2
#> biggerfoot dom nondom
#>          L   2     20
#>          R  11      6
tally( ~ biggerfoot + domhand, data = Kf)
#>           domhand
#> biggerfoot  L  R
#>          L  2 20
#>          R  6 11
# Three equivalent ways to define a new variable
# Method 1: explicitly define all levels
modHELP <- mutate(HELPrct, drink_status = derivedFactor(
abstinent = i1 == 0,
moderate = (i1>0 & i1<=1 & i2<=3 & sex=='female') |
(i1>0 & i1<=2 & i2<=4 & sex=='male'),
highrisk = ((i1>1 | i2>3) & sex=='female') |
((i1>2 | i2>4) & sex=='male'),
.ordered = TRUE)
)
tally( ~ drink_status, data = modHELP)
#> drink_status
#> abstinent  moderate  highrisk
#>        68        28       357
# Method 2: Use .default for last level
modHELP <- mutate(HELPrct, drink_status = derivedFactor(
abstinent = i1 == 0,
moderate = (i1<=1 & i2<=3 & sex=='female') |
(i1<=2 & i2<=4 & sex=='male'),
.ordered = TRUE,
.method = "first",
.default = "highrisk")
)
tally( ~ drink_status, data = modHELP)
#> drink_status
#> abstinent  moderate  highrisk
#>        68        28       357
# Method 3: use TRUE to catch any fall-through slots
modHELP <- mutate(HELPrct, drink_status = derivedFactor(
abstinent = i1 == 0,
moderate = (i1<=1 & i2<=3 & sex=='female') |
(i1<=2 & i2<=4 & sex=='male'),
highrisk=TRUE,
.ordered = TRUE,
.method = "first"
)
)
tally( ~ drink_status, data = modHELP)
#> drink_status
#> abstinent  moderate  highrisk
#>        68        28       357
is.factor(modHELP$drink_status)
#> [1] TRUE
# derivedVariable() uses .asFactor = FALSE by default, so the result is not a factor
modHELP <- mutate(HELPrct, drink_status = derivedVariable(
abstinent = i1 == 0,
moderate = (i1<=1 & i2<=3 & sex=='female') |
(i1<=2 & i2<=4 & sex=='male'),
highrisk=TRUE,
.ordered = TRUE,
.method = "first"
)
)
is.factor(modHELP$drink_status)
#> [1] FALSE