Given a data frame of genomic regions, add gene symbols, UCSC reference gene accessions, UCSC reference gene groups, and the relation to CpG island for each region.

AnnotateResults(lmmRes_df, arrayType = c("450k", "EPIC"), nCores_int = 1L, ...)

Arguments

lmmRes_df

A data frame returned by lmmTestAllRegions. This data frame must contain the following columns:

  • chrom : the chromosome the region is on, e.g. "chr22"

  • start : the region start point

  • end : the region end point
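As a minimal sketch, the required columns listed above can be checked with base R before annotation. The helper name checkRegionColumns is hypothetical and is not part of the package; it only illustrates the column requirement.

```r
# Hypothetical helper (not part of the package): return the names of any
# required columns that are missing from a results data frame.
checkRegionColumns <- function(df, required = c("chrom", "start", "end")) {
  setdiff(required, colnames(df))
}

# A one-row data frame using the first region from the Examples section.
regions_df <- data.frame(
  chrom = "chr22",
  start = 39377790,
  end   = 39377930,
  stringsAsFactors = FALSE
)

# Stop early with a readable message if anything is missing.
missing_char <- checkRegionColumns(regions_df)
if (length(missing_char) > 0) {
  stop("Missing required columns: ", paste(missing_char, collapse = ", "))
}
```

An empty result means all three columns are present and the data frame has the shape this argument expects.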

arrayType

Type of array: either "450k" or "EPIC".

nCores_int

Number of computing cores to be used when executing code in parallel. Defaults to 1 (serial computing).

...

Dots for additional arguments passed to the cluster constructor. See CreateParallelWorkers for more information.

Value

A data frame with

  • the location of the genomic region's chromosome (chrom), start (start), and end (end);

  • UCSC annotation information (UCSC_RefGene_Group, UCSC_RefGene_Accession, and UCSC_RefGene_Name) and the relation to CpG island (Relation_to_Island); and

  • a list of all of the probes in that region (probes).

Details

The region types include "NSHORE", "NSHELF", "SSHORE", "SSHELF", "TSS1500", "TSS200", "UTR5", "EXON1", "GENEBODY", "UTR3", and "ISLAND".
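The labels above can be compared against a regionType column such as the one used in the example. The following is a rough sketch in base R; the object names are hypothetical, and whether AnnotateResults itself validates these labels is not stated here.

```r
# Region type labels as listed in the Details section.
regionTypes_char <- c(
  "NSHORE", "NSHELF", "SSHORE", "SSHELF", "TSS1500", "TSS200",
  "UTR5", "EXON1", "GENEBODY", "UTR3", "ISLAND"
)

# Example labels; "PROMOTER" is deliberately not a listed region type.
observed_char <- c("TSS1500", "EXON1", "ISLAND", "PROMOTER")

# TRUE for labels that appear in the list, FALSE otherwise.
isKnown_lgl <- observed_char %in% regionTypes_char

# Labels that do not match any listed region type.
observed_char[!isKnown_lgl]
```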

Examples

   lmmResults_df <- data.frame(
     chrom = c("chr22", "chr22", "chr22", "chr22", "chr22"),
     start = c("39377790", "50987294", "19746156", "42470063", "43817258"),
     end   = c("39377930", "50987527", "19746368", "42470223", "43817384"),
     regionType = c("TSS1500", "EXON1", "ISLAND", "TSS200", "ISLAND"),
     stringsAsFactors = FALSE
   )

   AnnotateResults(
     lmmRes_df = lmmResults_df,
     arrayType = "450k"
   )
#> Setting options('download.file.method.GEOquery'='auto')
#> Setting options('GEOquery.inmemory.gpl'=FALSE)
#>   chrom    start      end regionType UCSC_RefGene_Group
#> 1 chr22 39377790 39377930    TSS1500            TSS1500
#> 2 chr22 50987294 50987527      EXON1            1stExon
#> 3 chr22 19746156 19746368     ISLAND              5'UTR
#> 4 chr22 42470063 42470223     TSS200             TSS200
#> 5 chr22 43817258 43817384     ISLAND              5'UTR
#>          UCSC_RefGene_Accession UCSC_RefGene_Name Relation_to_Island
#> 1                     NM_004900          APOBEC3B            OpenSea
#> 2                     NM_138433           KLHDC7B             Island
#> 3 NM_005992;NM_080646;NM_080647              TBX1             Island
#> 4                  NM_001002034           FAM109B             Island
#> 5                  NM_001044370            MPPED1             Island
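In the output above, a region associated with several transcripts has its accessions joined by semicolons in UCSC_RefGene_Accession. One possible way to separate them, sketched with base R and using values copied from that output (the object names are hypothetical):

```r
# Accession strings copied from the first three rows of the example output.
accession_char <- c(
  "NM_004900",
  "NM_138433",
  "NM_005992;NM_080646;NM_080647"
)

# Split each entry on ";" to get one character vector per region.
accession_ls <- strsplit(accession_char, split = ";", fixed = TRUE)

# Number of accessions listed for each region.
lengths(accession_ls)
```

The same split could be applied to the column directly if per-transcript rows are needed downstream.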