Determine how many units to take from each stratum when some strata are deficient, that is, when they contain too few controls to reach the desired ratio. The result should be passed as an input to optimize_controls().

generate_qs(
  z,
  st,
  ratio,
  treated = 1,
  max_ratio = NULL,
  max_extra_s = 5,
  strata_dist = NULL
)

Arguments

z

a factor with the ith entry equal to the treatment of unit i.

st

a stratum vector with the ith entry equal to the stratum of unit i. This should have the same order of units and length as z.

ratio

a numeric or vector specifying the desired ratio of controls to `treated` in each stratum. If there is one control group and all treated units should be included, this can be a numeric. Otherwise, this should be a vector with one entry per treatment group, in the same order as the levels of z, including the treated level. If NULL, q_s should be specified.
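As a rough illustration of what a deficient stratum looks like, the base R sketch below counts treated and control units per stratum and compares the available controls with the number the ratio asks for. The toy data and the names counts, desired, and shortfall are made up for this example; this is not the function's internal code.

```r
# Toy inputs (hypothetical): z is the treatment factor, st the stratum labels.
z  <- factor(c(1, 1, 0, 0, 0, 0, 1, 0, 1, 1, 0, 0))
st <- factor(c("a", "a", "a", "a", "a", "a", "b", "b", "c", "c", "c", "c"))
ratio   <- 2   # desired controls per treated unit
treated <- 1   # value of z marking the treated units

# Cross-tabulate treatment by stratum.
counts     <- table(z, st)
n_treated  <- counts[as.character(treated), ]
n_controls <- counts["0", ]

# Controls the ratio asks for in each stratum, and how many are missing.
desired   <- ratio * n_treated
shortfall <- pmax(desired - n_controls, 0)

print(rbind(n_treated, n_controls, desired, shortfall))
```

In this toy data, strata b and c hold fewer controls than the ratio requires, so those missing controls would have to come from another stratum.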

treated

the treatment value that identifies the treated units. This must be one of the values of z.

max_ratio

a numeric or vector specifying the maximum ratio to allow in a stratum to achieve the overall ratio specified. If NULL, it is set by default to 1.1 times the desired ratio. To have no maximum ratio, set this to Inf.

max_extra_s

a single numeric, a named vector, or a matrix with values corresponding to the maximum desired number of extra controls to be chosen from each stratum to achieve the overall ratio specified. If this is a vector, the names should correspond to the stratum values from st. If there are more than two treatment levels, this should be a matrix with one row per treatment level, in the same order as the levels of z. The default is 5 for each stratum in each treatment group. To have no maximum, set this to Inf. If both max_ratio and max_extra_s are specified, the maximum of the two will be used for each stratum.
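One way to read the interaction between max_ratio and max_extra_s is sketched below in base R. It assumes that each limit is converted to a per-stratum cap on the number of controls and that the larger cap applies. The rounding with floor() and the names cap_ratio, cap_extra, and cap are assumptions made for this illustration, not the function's documented internals.

```r
# Hypothetical treated counts per stratum.
n_treated   <- c(a = 100, b = 5, c = 10)
ratio       <- 2
max_ratio   <- 1.1 * ratio   # documented default when max_ratio is NULL
max_extra_s <- 5             # documented default

# Controls the ratio asks for in each stratum.
desired <- ratio * n_treated

# Cap implied by the maximum ratio (rounding down is an assumption).
cap_ratio <- floor(max_ratio * n_treated)

# Cap implied by the maximum number of extra controls.
cap_extra <- desired + max_extra_s

# Assumed reading of "the maximum of the two": the looser cap applies.
cap <- pmax(cap_ratio, cap_extra)

print(rbind(desired, cap_ratio, cap_extra, cap))
```

Under this reading, the ratio-based cap governs large strata such as a, while the extra-controls cap governs small strata such as b and c.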

strata_dist

a matrix whose row and column names both correspond to the stratum values from st. Each entry is the distance associated with taking a control from the stratum of its row when the desired stratum is that of its column. Lower distance values indicate more desirable replacements. Typically the diagonal should be 0, meaning there is no penalty for choosing a unit from the correct stratum.
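A minimal base R sketch of such a matrix follows, assuming three ordered age strata whose labels are invented for the example. The distance is simply the number of steps between categories, which is one possible choice rather than a requirement.

```r
# Hypothetical ordered strata; the labels must match the values used in st.
strata <- c("< 40 years", "40 - 50 years", "> 50 years")

# Distance = number of steps between categories, so the diagonal is 0.
idx <- seq_along(strata)
strata_dist <- abs(outer(idx, idx, "-"))
dimnames(strata_dist) <- list(strata, strata)

# Rows: stratum a control is taken from. Columns: desired stratum.
print(strata_dist)

# Replacement strata for "> 50 years", ordered from most to least desirable.
replacement_order <- names(sort(strata_dist[, "> 50 years"]))
print(replacement_order)
```

With this choice, a control missing from "> 50 years" would preferably be replaced by one from "40 - 50 years" before one from "< 40 years".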

Value

A named vector stating how many controls to take from each stratum.