Fit mixed model to methylation values in one genomic region

lmmTest(
  betaOne_df,
  pheno_df,
  contPheno_char,
  covariates_char,
  modelType = c("randCoef", "simple"),
  genome = c("hg19", "hg38"),
  arrayType = c("450k", "EPIC"),
  manifest_gr = NULL,
  ignoreStrand = TRUE,
  outLogFile = NULL
)

Arguments

betaOne_df

matrix of beta values for one genomic region, with row names = CpG IDs and column names = sample IDs

pheno_df

a data frame with phenotype and covariates, with variable Sample indicating sample IDs.

contPheno_char

character string of the main effect (a continuous phenotype) to be tested for association with methylation values in the region

covariates_char

character vector for names of the covariate variables

modelType

type of mixed model: can be randCoef for random coefficient mixed model or simple for simple linear mixed model.

genome

Human genome of reference: hg19 or hg38

arrayType

Type of array: "450k" or "EPIC"

manifest_gr

A GRanges object with the genome manifest (as returned by ExperimentHub or by ImportSesameData). This function by default ignores this argument in favour of the genome and arrayType arguments.

ignoreStrand

Whether strand can be ignored, default is TRUE

outLogFile

Name of log file for messages of mixed model analysis

Value

A dataframe with one row for association result of one region and the following columns: Estimate, StdErr, and pvalue showing the association of methylation values in the genomic region tested with the continuous phenotype supplied in contPheno_char

Details

This function implements a mixed model to test association between methylation M values in a genomic region with a continuous phenotype. In our simulation studies, we found both models shown below are conservative, so p-values are estimated from normal distributions instead of Student's t distributions.

When modelType = "randCoef", the model is:

M ~ contPheno_char + covariates_char + (1|Sample) + (contPheno_char|CpG).

The last term specifies random intercept and slope for each CpG. When modelType = "simple", the model is

M ~ contPheno_char + covariates_char + (1|Sample).

Examples

  data(betasChr22_df)

  CpGsChr22_char <- c(
    "cg02953382", "cg12419862", "cg24565820", "cg04234412", "cg04824771",
    "cg09033563", "cg10150615", "cg18538332", "cg20007245", "cg23131131",
    "cg25703541"
  )

  coMethCpGs <- CoMethSingleRegion(CpGsChr22_char, betasChr22_df)
#> snapshotDate(): 2021-05-18
#> see ?sesameData and browseVignettes('sesameData') for documentation
#> snapshotDate(): 2021-05-18
#> see ?sesameData and browseVignettes('sesameData') for documentation
#> snapshotDate(): 2021-05-18
#> see ?sesameData and browseVignettes('sesameData') for documentation

  # test only the first co-methylated region
  coMethBeta_df <- betasChr22_df[coMethCpGs$CpGsSubregions[[1]], ]

  data(pheno_df)

  res <- lmmTest(
    betaOne_df = coMethBeta_df,
    pheno_df,
    contPheno_char = "stage",
    covariates_char = c("age.brain", "sex"),
    modelType = "randCoef",
    arrayType = "450k", 
    ignoreStrand = TRUE
  )
#> snapshotDate(): 2021-05-18
#> see ?sesameData and browseVignettes('sesameData') for documentation
#> Analyzing region chr22:24372913-24372926.