Extract clusters of CpGs located closely in a genomic region.

CloseBySingleRegion(
  CpGs_char,
  genome = c("hg19", "hg38"),
  arrayType = c("450k", "EPIC"),
  manifest_gr = NULL,
  maxGap = 200,
  minCpGs = 3
)

Arguments

CpGs_char: a character vector of CpG IDs
genome: human reference genome, hg19 or hg38
arrayType: type of array, 450k or EPIC
manifest_gr: A GRanges object with the genome manifest (as returned by ExperimentHub or by ImportSesameData). This function by default ignores this argument in favour of the genome and arrayType arguments.
maxGap: an integer; genomic locations within maxGap of each other are placed into the same cluster
minCpGs: an integer; the minimum number of CpGs required for a resulting CpG cluster

Value

A list in which each item is a character vector of CpG IDs located close together (i.e. in the same cluster).
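
As a rough illustration of working with a return value of this shape, the sketch below builds a hypothetical stand-in list (the IDs are placeholders, not real output of CloseBySingleRegion) and summarises it with base R only.

```r
# Hypothetical stand-in for a list of clusters; the IDs are placeholders
# and do not come from CloseBySingleRegion().
example_ls <- list(
  c("cpg_a", "cpg_b", "cpg_c"),
  c("cpg_d", "cpg_e", "cpg_f", "cpg_g")
)

# Number of clusters in the list
n_clusters <- length(example_ls)

# Number of CpGs in each cluster
cluster_sizes <- lengths(example_ls)

# All CpG IDs that ended up in any cluster, as one character vector
all_ids <- unlist(example_ls, use.names = FALSE)
```

The same three calls should apply to any list of character vectors, including the cluster_ls object created in the Examples section.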

Details

Note that this function depends only on CpG locations, and not on any methylation data. The algorithm is based on the clusterMaker function in the bumphunter R package. Each cluster is essentially a group of CpG locations such that two consecutive locations in the cluster are separated by less than maxGap.
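
The grouping idea described above can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: cluster_by_gap is a hypothetical helper, the positions and IDs are made up, all positions are assumed to lie on one chromosome, and the treatment of a gap exactly equal to maxGap (kept in the same cluster here) is an assumption.

```r
# Hypothetical helper, not part of coMethDMR or bumphunter.
# Assumes all positions are on the same chromosome.
cluster_by_gap <- function(pos, ids, maxGap = 200, minCpGs = 3) {
  # Sort CpGs by genomic position
  ord <- order(pos)
  pos <- pos[ord]
  ids <- ids[ord]

  # Start a new cluster wherever the gap to the previous position
  # exceeds maxGap (assumption: a gap equal to maxGap does not split)
  cluster_id <- cumsum(c(1, diff(pos) > maxGap))

  # Group IDs by cluster and keep clusters with at least minCpGs members
  clusters <- split(ids, cluster_id)
  clusters[lengths(clusters) >= minCpGs]
}

# Made-up positions and placeholder IDs
pos <- c(100, 150, 220, 900, 950, 2000)
ids <- paste0("cpg", seq_along(pos))

toy_clusters <- cluster_by_gap(pos, ids, maxGap = 100, minCpGs = 3)
```

With these toy inputs, the first three positions form one cluster, while the remaining groups are dropped because they contain fewer than minCpGs members.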

Examples


   CpGs_char <- c(
     "cg02505293", "cg03618257", "cg04421269", "cg17885402", "cg19890033",
     "cg20566587", "cg27505880"
   )

   cluster_ls <- CloseBySingleRegion(
     CpGs_char,
     genome = "hg19",
     arrayType = "450k",
     maxGap = 100,
     minCpGs = 3
   )
#> snapshotDate(): 2021-05-18
#> see ?sesameData and browseVignettes('sesameData') for documentation