Wrapper function to find contiguous and co-methylated sub-regions within a pre-defined genomic region

CoMethSingleRegion(
  CpGs_char,
  dnam,
  betaToM = TRUE,
  rDropThresh_num = 0.4,
  method = c("pearson", "spearman"),
  minCpGs = 3,
  genome = c("hg19", "hg38"),
  arrayType = c("450k", "EPIC"),
  manifest_gr = NULL,
  returnAllCpGs = FALSE
)

Arguments

CpGs_char

vector of CpGs in the input pre-defined genomic region.

dnam

matrix (or data frame) of beta values, with row names = CpG ids, column names = sample ids. This should include the CpGs in CpGs_char, as well as additional CpGs.

betaToM

indicates whether to convert methylation beta values to M-values
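
For orientation, the usual logit transform from beta values to M-values is M = log2(beta / (1 - beta)). The sketch below applies that transform to a toy matrix. The helper name betaToM_sketch and the toy data are hypothetical, and this is not claimed to be the package's internal code.

```r
# Hypothetical helper (not part of the package): standard beta -> M-value
# transform, M = log2(beta / (1 - beta)).
betaToM_sketch <- function(beta_mat) {
  log2(beta_mat / (1 - beta_mat))
}

# Toy beta values laid out like dnam: rows = CpGs, columns = samples
toy_beta <- matrix(
  c(0.5, 0.8, 0.2, 0.25),
  nrow = 2,
  dimnames = list(c("cpgA", "cpgB"), c("s1", "s2"))
)

toy_M <- betaToM_sketch(toy_beta)
toy_M
```

Beta values of exactly 0 or 1 map to -Inf or Inf under this transform, so such values would need handling before any correlation is computed.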

rDropThresh_num

threshold for the minimum correlation between a CpG and the sum of the rest of the CpGs
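
As an illustration of this leave-one-out correlation, the sketch below computes, for each CpG, the correlation between its values and the sum of the remaining CpGs, then flags the CpGs that exceed the threshold. The helper rDrop_sketch, the toy data, and the use of a strict ">" comparison are assumptions made for illustration, not the package's confirmed implementation.

```r
# Hypothetical helper: correlation of each CpG with the sum of the other CpGs.
# Assumes `vals` has rows = CpGs and columns = samples, like dnam.
rDrop_sketch <- function(vals, method = "pearson") {
  total <- colSums(vals)  # per-sample sum over all CpGs
  vapply(
    rownames(vals),
    function(cpg) {
      rest <- total - vals[cpg, ]  # per-sample sum of the other CpGs
      cor(vals[cpg, ], rest, method = method)
    },
    numeric(1)
  )
}

# Toy data: three CpGs share a trend across six samples, one does not
toy <- rbind(
  cpgA = 1:6,
  cpgB = 2 * (1:6),
  cpgC = 3 * (1:6),
  cpgD = c(1, -1, 1, -1, 1, -1)
)

r_drop <- rDrop_sketch(toy)
# Assumption: a CpG is kept when r_drop is strictly above the threshold
keep <- as.integer(r_drop > 0.4)
```

In this toy example cpgA, cpgB and cpgC are flagged as kept, while cpgD, which does not follow the shared trend, falls below the threshold.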

method

method for computing correlation, can be "pearson" or "spearman"

minCpGs

minimum number of CpGs to be considered a "region". Only regions with at least minCpGs CpGs will be returned.

genome

Human reference genome, either "hg19" or "hg38"

arrayType

Type of array, can be "450k" or "EPIC"

manifest_gr

A GRanges object with the genome manifest (as returned by ExperimentHub or by ImportSesameData). This function by default ignores this argument in favour of the genome and arrayType arguments.

returnAllCpGs

When there is no contiguous co-methylated region in the input pre-defined region, returnAllCpGs = TRUE returns all the CpGs in the input region, while returnAllCpGs = FALSE returns no CpGs.

Value

A list with two components:

  • contiguousRegions : a data frame with Region (genomic range spanned by the input CpGs), CpG (CpG ID), Chr (chromosome), MAPINFO (genomic position), r_drop (correlation between the CpG and the sum of the rest of the CpGs), keep (indicator for co-methylated CpG), keep_contiguous (index of the contiguous co-methylated subregion, or 0 if the CpG belongs to none)

  • CpGsSubregions : a list of the CpGs in each contiguous co-methylated subregion
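
To make the relationship between keep and keep_contiguous concrete, the sketch below labels runs of consecutive kept CpGs with a running index. The helper labelContiguous_sketch is hypothetical. It assumes the CpGs are already ordered by genomic position and that a run qualifies when it has at least minCpGs members, which matches the example output on this page but is not confirmed as the package's internal logic.

```r
# Hypothetical helper: index runs of consecutive keep == 1 CpGs.
# Runs shorter than `minCpGs` (and all keep == 0 CpGs) get index 0.
labelContiguous_sketch <- function(keep, minCpGs = 3) {
  runs <- rle(keep)
  valid <- runs$values == 1 & runs$lengths >= minCpGs
  run_ids <- ifelse(valid, cumsum(valid), 0L)
  rep(run_ids, times = runs$lengths)
}

# keep column from the first example on this page
keep <- c(1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 1)
keep_contiguous <- labelContiguous_sketch(keep)
keep_contiguous
```

Applied to the keep column of the first example, this reproduces the keep_contiguous column shown there: three CpGs in subregion 1, three excluded CpGs, and five CpGs in subregion 2.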

Examples

   library(coMethDMR)
   data(betasChr22_df)

   CpGsChr22_char <- c(
     "cg02953382", "cg12419862", "cg24565820", "cg04234412", "cg04824771",
     "cg09033563", "cg10150615", "cg18538332", "cg20007245", "cg23131131",
     "cg25703541"
   )
   CoMethSingleRegion(
     CpGs_char = CpGsChr22_char,
     dnam = betasChr22_df
   )
#> snapshotDate(): 2021-05-18
#> see ?sesameData and browseVignettes('sesameData') for documentation
#> snapshotDate(): 2021-05-18
#> see ?sesameData and browseVignettes('sesameData') for documentation
#> snapshotDate(): 2021-05-18
#> see ?sesameData and browseVignettes('sesameData') for documentation
#> $contiguousRegions
#>                     Region        CpG   Chr  MAPINFO     r_drop keep
#> 1  chr22:24372913-24373618 cg20007245 chr22 24372913 0.86131041    1
#> 2  chr22:24372913-24373618 cg04824771 chr22 24372921 0.97650367    1
#> 3  chr22:24372913-24373618 cg24565820 chr22 24372926 0.93149530    1
#> 4  chr22:24372913-24373618 cg10150615 chr22 24372951 0.02632093    0
#> 5  chr22:24372913-24373618 cg18538332 chr22 24372958 0.39083913    0
#> 6  chr22:24372913-24373618 cg23131131 chr22 24373011 0.28001379    0
#> 7  chr22:24372913-24373618 cg25703541 chr22 24373054 0.96330673    1
#> 8  chr22:24372913-24373618 cg02953382 chr22 24373134 0.78575626    1
#> 9  chr22:24372913-24373618 cg04234412 chr22 24373322 0.97587636    1
#> 10 chr22:24372913-24373618 cg12419862 chr22 24373484 0.96604153    1
#> 11 chr22:24372913-24373618 cg09033563 chr22 24373618 0.93697621    1
#>    keep_contiguous
#> 1                1
#> 2                1
#> 3                1
#> 4                0
#> 5                0
#> 6                0
#> 7                2
#> 8                2
#> 9                2
#> 10               2
#> 11               2
#> 
#> $CpGsSubregions
#> $CpGsSubregions$`chr22:24372913-24372926`
#> [1] "cg20007245" "cg04824771" "cg24565820"
#> 
#> $CpGsSubregions$`chr22:24373054-24373618`
#> [1] "cg25703541" "cg02953382" "cg04234412" "cg12419862" "cg09033563"
#> 
#> 

   data(betaMatrix_ex3)
   CpGsEx3_char <- c(
     "cg14221598", "cg02433884", "cg07372974", "cg13419809", "cg26856676",
     "cg25246745"
   )
   CoMethSingleRegion(
     CpGs_char = CpGsEx3_char,
     dnam = t(betaMatrix_ex3),
     returnAllCpGs = TRUE
   )
#> snapshotDate(): 2021-05-18
#> see ?sesameData and browseVignettes('sesameData') for documentation
#> snapshotDate(): 2021-05-18
#> see ?sesameData and browseVignettes('sesameData') for documentation
#> $contiguousRegions
#>                      Region        CpG   Chr   MAPINFO      r_drop keep
#> 1 chr10:102790849-102791028 cg14221598 chr10 102790849 -0.02101030    0
#> 2 chr10:102790849-102791028 cg02433884 chr10 102790876  0.01711019    0
#> 3 chr10:102790849-102791028 cg07372974 chr10 102790954  0.22974605    0
#> 4 chr10:102790849-102791028 cg13419809 chr10 102790978 -0.06854353    0
#> 5 chr10:102790849-102791028 cg26856676 chr10 102791011  0.27220476    0
#> 6 chr10:102790849-102791028 cg25246745 chr10 102791028 -0.02519665    0
#>   keep_contiguous
#> 1               0
#> 2               0
#> 3               0
#> 4               0
#> 5               0
#> 6               0
#> 
#> $CpGsSubregions
#> $CpGsSubregions$`chr10:102790849-102791028`
#> [1] "cg14221598" "cg02433884" "cg07372974" "cg13419809" "cg26856676"
#> [6] "cg25246745"
#> 
#>