This function tests contrasts from a sccomp result.
sccomp_test(
  .data,
  contrasts = NULL,
  percent_false_positive = 5,
  test_composition_above_logit_fold_change = 0.1,
  pass_fit = TRUE
)
.data - A tibble. The result of sccomp_estimate.
contrasts - A vector of character strings. For example, if your formula is ~ 0 + treatment and the factor treatment has values yes and no, your contrast could be contrasts = c("treatmentyes - treatmentno").
percent_false_positive - A real number between 0 and 100, exclusive. This is used to identify outliers with a specific false positive rate.
test_composition_above_logit_fold_change - A positive real number. It is the effect threshold used for the hypothesis test, expressed on the logit scale. A value of 0.2 corresponds to a relative change in cell proportion of about 10% for a cell type with a baseline proportion of 50%; for example, a cell type going from 45% to 50%. When the baseline proportion is closer to 0 or 1, this effect threshold keeps a consistent value on the unconstrained logit scale.
pass_fit - A boolean. Whether to pass the Stan fit as an attribute of the output. Because the Stan fit can be very large, setting this to FALSE reduces the memory footprint when saving the output.
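The effect threshold test_composition_above_logit_fold_change is expressed on the logit scale, so the same threshold implies a different absolute change in proportion depending on the baseline. The following is a minimal base-R sketch of that conversion, not code from sccomp; the helper name shift_proportion is hypothetical.

```r
# Hypothetical helper (not part of sccomp): apply a shift on the logit
# scale to a baseline proportion and return the resulting proportion.
shift_proportion <- function(baseline, logit_change) {
  plogis(qlogis(baseline) + logit_change)
}

# Logit distance between 45% and 50%: approximately 0.2.
logit_gap <- qlogis(0.50) - qlogis(0.45)

# The same 0.2 logit shift moves proportions by different absolute
# amounts depending on the baseline.
baselines <- c(0.01, 0.10, 0.45, 0.90)
shifted <- shift_proportion(baselines, 0.2)

print(round(logit_gap, 3))
print(data.frame(baseline = baselines, shifted = round(shifted, 4)))
```

Near 0 or 1 the absolute change in proportion is small, while the change on the logit scale stays the same, which is why the threshold is defined there.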
A tibble (tbl), with the following columns:
cell_group - The cell groups being tested.
parameter - The parameter being estimated from the design matrix described by the input formula_composition and formula_variability.
factor - The covariate factor in the formula, if applicable (e.g., not present for Intercept or contrasts).
c_lower - Lower (2.5%) quantile of the posterior distribution for a composition (c) parameter.
c_effect - Mean of the posterior distribution for a composition (c) parameter.
c_upper - Upper (97.5%) quantile of the posterior distribution for a composition (c) parameter.
c_pH0 - Probability of the c_effect being smaller or bigger than the test_composition_above_logit_fold_change argument.
c_FDR - False discovery rate of the c_effect being smaller or bigger than the test_composition_above_logit_fold_change argument. False discovery rate for Bayesian models is calculated differently from frequentist models, as detailed in Mangiola et al., PNAS 2023.
c_n_eff - Effective sample size, the number of independent draws in the sample. The higher, the better.
c_R_k_hat - R statistic, a measure of chain equilibrium, should be within 0.05 of 1.0.
v_lower - Lower (2.5%) quantile of the posterior distribution for a variability (v) parameter.
v_effect - Mean of the posterior distribution for a variability (v) parameter.
v_upper - Upper (97.5%) quantile of the posterior distribution for a variability (v) parameter.
v_pH0 - Probability of the v_effect being smaller or bigger than the test_composition_above_logit_fold_change argument.
v_FDR - False discovery rate of the v_effect being smaller or bigger than the test_composition_above_logit_fold_change argument. False discovery rate for Bayesian models is calculated differently from frequentist models, as detailed in Mangiola et al., PNAS 2023.
v_n_eff - Effective sample size for a variability (v) parameter.
v_R_k_hat - R statistic for a variability (v) parameter, a measure of chain equilibrium.
count_data - Nested input count data.
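As an illustration of how the columns above might be used, the sketch below filters a result table for cell groups whose composition effect passes an FDR cutoff. It makes several assumptions: the values are invented, the 0.05 cutoff is an arbitrary choice, and a plain data.frame stands in for the tibble returned by sccomp_test.

```r
# Toy stand-in for the output of sccomp_test(); all values are invented
# for illustration and only a few of the documented columns are included.
toy_result <- data.frame(
  cell_group = c("B1", "CD4 1", "NK"),
  parameter  = "typecancer - typebenign",
  c_lower    = c(0.35, -0.20, -1.10),
  c_effect   = c(0.80, 0.05, -0.70),
  c_upper    = c(1.25, 0.30, -0.30),
  c_FDR      = c(0.001, 0.450, 0.010)
)

# Assumed cutoff; choose one appropriate for your analysis.
fdr_cutoff <- 0.05

# Keep rows whose composition effect passes the cutoff.
significant <- toy_result[toy_result$c_FDR < fdr_cutoff, ]
print(significant[, c("cell_group", "c_effect", "c_FDR")])
```

The same subsetting applies to the variability columns by swapping c_FDR for v_FDR.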
message("Use the following example after having installed install.packages(\"cmdstanr\", repos = c(\"https://stan-dev.r-universe.dev/\", getOption(\"repos\")))")
#> Use the following example after having installed install.packages("cmdstanr", repos = c("https://stan-dev.r-universe.dev/", getOption("repos")))
# \donttest{
if (instantiate::stan_cmdstan_exists()) {
  data("counts_obj")
  estimates = sccomp_estimate(
    counts_obj,
    ~ 0 + type, ~1, sample, cell_group, count,
    cores = 1
  ) |>
    sccomp_test("typecancer - typebenign")
}
# }