Given a vector of MiniMax statistic values under the null hypothesis, estimate the parameters of the Beta Distribution that best fits these values.

MiniMax_estBetaParams(
  MiniMaxNull_num,
  nPlatforms,
  orderStat = 2L,
  method = c("parametric", "MLE", "MoM")
)

Arguments

MiniMaxNull_num

A numeric vector of MiniMax statistics under the null hypothesis.

nPlatforms

An integer stating how many data platforms are in the original data.

orderStat

How many platforms should show a biological signal for a pathway / gene set to have multi-omic "enrichment"? Defaults to 2. See "Details" for more information.

method

Which estimation method will be used to find the parameters of the Beta Distribution? Options are "parametric" (no estimation from the data), "MLE" (Maximum Likelihood Estimates), or "MoM" (Method of Moments estimates). See "Details" for more information.

Value

A list of 3 components with class "MiniMaxParams": "alpha" and "beta" hold the parameter estimates of the Beta Distribution, and "method" returns a character string denoting which estimation method was used ("Parametric", "Maximum Likelihood", or "Method of Moments").

Details

Concerning Parameter Estimation Methods: We currently support 3 options to estimate the parameters of the Beta Distribution. The "parametric" option does not use the data. Instead, it assumes that the MiniMax statistics will have a Beta \((k, n + 1 - k)\) distribution, where \(k\) is the value of orderStat and \(n\) has the value nPlatforms. See https://en.wikipedia.org/wiki/Order_statistic.
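
As a rough check of the "parametric" option, the sketch below simulates p-values and compares the mean of the \(k\)-th smallest value with the mean of a Beta \((k, n + 1 - k)\) distribution. It assumes that the null p-values are independent and Uniform(0, 1) across platforms; the object names are illustrative only and are not part of the package.

```r
# Sketch only: simulate the null and compare with the Beta(k, n + 1 - k) mean.
set.seed(1)
n <- 3L  # plays the role of nPlatforms
k <- 2L  # plays the role of orderStat

# Each row is one pathway: n independent Uniform(0, 1) p-values (an assumption).
pVals_mat <- matrix(runif(10000 * n), ncol = n)

# k-th smallest p-value in each row
kth_num <- apply(pVals_mat, 1, function(p) sort(p)[k])

# Parametric shape parameters and the mean they imply
alpha <- k
beta  <- n + 1 - k
c(simulated = mean(kth_num), theoretical = alpha / (alpha + beta))
```

With nPlatforms = 3 and orderStat = 2, the shape parameters are both 2, which agrees with the "parametric" output in the Examples.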

The next two estimation options make use of the MiniMaxNull_num vector. This vector should be calculated from the significance levels of the same statistical tests used on the real data (for each pathway and data platform), computed with a random permutation of the outcome of interest in place of the real values; more permutations are better. The "MLE" option uses the beta.mle function to find the Maximum Likelihood Estimates of \(\alpha\) and \(\beta\). The "MoM" option uses the closed-form Method of Moments estimators of \(\alpha\) and \(\beta\) as shown in https://en.wikipedia.org/wiki/Beta_distribution#Method_of_moments.
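
For orientation, the closed-form Method of Moments estimators can be sketched as follows. The helper betaMoM is hypothetical and is not the package's internal code; it assumes the sample variance (with an \(n - 1\) denominator) and requires that this variance be smaller than \(\bar{x}(1 - \bar{x})\).

```r
# Hypothetical helper, not part of the package: Method of Moments for a Beta.
betaMoM <- function(x) {
  m <- mean(x)
  v <- var(x)  # sample variance (n - 1 denominator); an assumption of this sketch
  # The estimators are only valid when v < m * (1 - m)
  stopifnot(v < m * (1 - m))
  common <- m * (1 - m) / v - 1
  list(alpha = m * common, beta = (1 - m) * common)
}

# Usage on simulated Beta(2, 2) draws; estimates should land near 2 and 2.
set.seed(1)
est <- betaMoM(rbeta(5000, shape1 = 2, shape2 = 2))
est
```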

Concerning Appropriate Order Statistics: With the default orderStat = 2, the MiniMax operation is equivalent to sorting the p-values and taking the second smallest. In our experience, setting this "order statistic" cutoff to 2 is appropriate for 5 or fewer data platforms. Biologically, this is equivalent to saying "if this pathway is dysregulated in at least two data types for this disease / condition, it is worthy of additional consideration". In situations where more than 5 data platforms are available for the disease of interest, we recommend increasing the orderStat value to 3.
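
The equivalence with the second-smallest p-value can be illustrated with a small sketch. It assumes that "MiniMax" is read as the minimum, over all pairs of platforms, of the larger p-value in each pair; the platform names and p-values below are made up for illustration.

```r
# Made-up p-values for one pathway across three hypothetical platforms
pVals_num <- c(CNV = 0.31, RNAseq = 0.04, Methylation = 0.12)

# Minimum over all pairs of the larger p-value in the pair
miniMax <- min(combn(pVals_num, 2, FUN = max))

# Second-smallest p-value after sorting
secondSmallest <- sort(pVals_num)[[2]]

c(miniMax = miniMax, secondSmallest = secondSmallest)
```

Both quantities pick out the p-value of the second most significant platform, which is why a small MiniMax value indicates signal in at least two data types.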

Examples

 miniMax_num <- nullMiniMaxResults_df$MiniMax

 MiniMax_estBetaParams(miniMax_num, nPlatforms = 3L)
#> $alpha
#> [1] 2
#> 
#> $beta
#> [1] 2
#> 
#> $method
#> [1] "Parametric"
#> 
#> attr(,"class")
#> [1] "MiniMaxParams" "list"         
 MiniMax_estBetaParams(miniMax_num, nPlatforms = 3L, method = "MoM")
#> $alpha
#> [1] 1.908401
#> 
#> $beta
#> [1] 2.056421
#> 
#> $method
#> [1] "Method of Moments"
#> 
#> attr(,"class")
#> [1] "MiniMaxParams" "list"         
 MiniMax_estBetaParams(miniMax_num, nPlatforms = 3L, method = "MLE")
#> $alpha
#> [1] 1.787027
#> 
#> $beta
#> [1] 2.03405
#> 
#> $method
#> [1] "Maximum Likelihood"
#> 
#> attr(,"class")
#> [1] "MiniMaxParams" "list"