Title: | Estimate and Account for Tumor Purity in Cancer Methylation Data Analysis |
---|---|
Description: | The proportion of cancer cells in a solid tumor sample, known as tumor purity, adversely affects a variety of data analyses if not properly accounted for. 'InfiniumPurify' is a comprehensive R package for estimating and accounting for tumor purity based on DNA methylation Infinium 450k array data. It provides functionality for tumor purity estimation, and it can also perform differential methylation detection and tumor sample clustering while accounting for tumor purity. |
Authors: | Yufang Qin |
Maintainer: | Yufang Qin <[email protected]> |
License: | GPL-2 |
Version: | 1.3.1 |
Built: | 2024-11-26 03:05:29 UTC |
Source: | https://github.com/cran/InfiniumPurify |
This data set lists abbreviations for all TCGA cancer types.
abbr
A dataframe containing names and abbreviations for all TCGA cancer types.
X. Zheng, N. Zhang, H.J. Wu and H. Wu, Estimating and accounting for tumor purity in the analysis of DNA methylation data from cancer studies. Genome biology, accepted.
An example data set for InfiniumClust and InfiniumPurify.
beta.emp
A data frame containing methylation beta values for 62 tumor and normal samples.
X. Zheng, N. Zhang, H.J. Wu and H. Wu, Estimating and accounting for tumor purity in the analysis of DNA methylation data from cancer studies. Genome biology, accepted.
Print tumor types and their abbreviations with known informative DMCs.
CancerTypeAbbr()
None.
Xiaoqi Zheng [email protected].
X. Zheng, N. Zhang, H.J. Wu and H. Wu, Estimating and accounting for tumor purity in the analysis of DNA methylation data from cancer studies. Genome biology, in revision.
data(abbr)
CancerTypeAbbr()
Estimate the percentage of tumor cells in cancer samples, which are mixtures of cancer and normal cells.
getPurity(tumor.data,normal.data = NULL,tumor.type = NULL)
tumor.data |
numeric vector/matrix of beta values for tumor samples. The names/rownames of tumor.data should be probe names of the Infinium 450k array, and colnames should be names of tumor samples. |
normal.data |
numeric matrix of beta values for normal samples. The rownames of normal.data should be probe names of the Infinium 450k array, and colnames should be names of normal samples. |
tumor.type |
cancer type (in abbreviation) of tumor and normal samples. Options are "LUAD", "BRCA" and so on. See |
Arguments normal.data and tumor.type may be NULL. If either the number of tumor samples or the number of normal samples is less than 20, the tumor.type argument should be specified according to CancerTypeAbbr. If the numbers of tumor and normal samples are both more than 20, tumor.type may be NULL. In that case, getPurity first identifies 1000 iDMCs by the Wilcoxon rank-sum test; the tumor purity of each sample is then estimated as the density mode of the adjusted methylation levels of the iDMCs.
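The two steps described above (rank-sum selection of iDMCs, then a density mode per sample) can be illustrated on simulated data with base R alone. This is a minimal sketch and not the package's implementation: the simulated mixing model, the choice of 200 instead of 1000 iDMCs, the helper `density_mode`, and the definition of the "adjusted" level (beta for probes hypermethylated in tumor, 1 - beta for hypomethylated probes) are all assumptions made for illustration.

```r
set.seed(1)
n_probe <- 1000; n_tumor <- 25; n_normal <- 25; n_dmc <- 200
true_purity <- runif(n_tumor, 0.4, 0.9)

# Per-probe mean methylation in normal tissue and in pure tumor (simulated).
# Only the first n_dmc probes differ between the two.
normal_mean <- c(rep(0.02, n_dmc / 2), rep(0.98, n_dmc / 2),
                 runif(n_probe - n_dmc, 0.3, 0.7))
tumor_mean <- normal_mean
tumor_mean[seq_len(n_dmc)] <- 1 - normal_mean[seq_len(n_dmc)]

clip <- function(x) pmin(pmax(x, 0), 1)
noise <- function(n_col) matrix(rnorm(n_probe * n_col, 0, 0.03), n_probe)

# Assumed mixing model: observed = purity * tumor + (1 - purity) * normal.
normal_beta <- clip(normal_mean + noise(n_normal))
tumor_beta <- clip(outer(tumor_mean, true_purity) +
                   outer(normal_mean, 1 - true_purity) + noise(n_tumor))

# Step 1: rank probes by Wilcoxon rank-sum p-value and keep the top n_dmc.
p_values <- vapply(seq_len(n_probe), function(i) {
  wilcox.test(tumor_beta[i, ], normal_beta[i, ], exact = FALSE)$p.value
}, numeric(1))
idmc <- order(p_values)[seq_len(n_dmc)]

# Step 2 (assumed adjustment): flip hypomethylated probes so that every
# selected probe increases with purity.
hyper <- rowMeans(tumor_beta[idmc, ]) > rowMeans(normal_beta[idmc, ])
adjusted <- tumor_beta[idmc, ]
adjusted[!hyper, ] <- 1 - adjusted[!hyper, ]

# Step 3: purity per sample = mode of the kernel density of adjusted levels.
density_mode <- function(x) {
  d <- density(x)
  d$x[which.max(d$y)]
}
estimated_purity <- apply(adjusted, 2, density_mode)

print(round(cbind(true = true_purity, estimated = estimated_purity)[1:5, ], 2))
```

On real data the selected probes are rarely fully unmethylated in one tissue and fully methylated in the other, so the mode is only an approximation of purity; the simulation makes the relationship nearly exact on purpose.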
A vector of tumor purities for each tumor sample.
Xiaoqi Zheng [email protected].
N. Zhang, H.J. Wu, W. Zhang, J. Wang, H. Wu and X. Zheng (2015) Predicting tumor purity from methylation microarray data. Bioinformatics 31(21), 3401-3405.
X. Zheng, N. Zhang, H.J. Wu and H. Wu, Estimating and accounting for tumor purity in the analysis of DNA methylation data from cancer studies. Genome biology, accepted.
## load example data
data(beta.emp)
normal.data <- beta.emp[,1:21]
tumor.data <- beta.emp[,22:61]

## call purity for single tumor sample
purity <- getPurity(tumor.data = tumor.data[,1],normal.data = NULL,tumor.type= "LUAD")

## call purity for less than 20 tumor samples
purity <- getPurity(tumor.data = tumor.data[,1:10],normal.data = NULL,tumor.type= "LUAD")

## call purity for more than 20 tumor samples with matched normal samples
purity <- getPurity(tumor.data = tumor.data[,1:40],normal.data = normal.data)
This data set lists pre-selected iDMCs for all TCGA cancer types.
iDMC
A list containing informative differentially methylated CpG sites (iDMCs) and their average methylation levels in tumor and normal samples.
X. Zheng, N. Zhang, H.J. Wu and H. Wu, Estimating and accounting for tumor purity in the analysis of DNA methylation data from cancer studies. Genome biology, accepted.
Clustering of tumor samples into subtypes accounting for tumor purity.
InfiniumClust(tumor.data, purity, K, maxiter = 100, tol = 0.001)
tumor.data |
numeric matrix of beta values for tumor samples. The rownames of tumor.data should be probe names of the Infinium 450k array, and colnames should be names of tumor samples. |
purity |
purities for tumor samples. Could be estimated by |
K |
the number of clusters. |
maxiter |
the maximum number of iterations allowed. Default is 100. |
tol |
tolerance for convergence of EM iterations. Default is 0.001. |
An EM based statistical method for subtype classification based on DNA methylation data, while adjusting for tumor purity.
InfiniumClust returns a list consisting of the total log-likelihood tol.ll and the membership matrix Z.
tol.ll |
total log-likelihood of converged EM algorithm. |
Z |
the membership matrix, where rows correspond to tumor samples and columns correspond to the K clusters. |
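A membership matrix of this shape is usually reduced to one cluster label per sample. The sketch below does that with base R on a small hand-made matrix; the values, the sample and cluster names, and the assumption that each row holds membership probabilities summing to 1 are illustrative and not taken from the package.

```r
# Hypothetical membership matrix: 4 tumor samples (rows), K = 3 clusters (columns).
Z <- matrix(c(0.90, 0.05, 0.05,
              0.10, 0.80, 0.10,
              0.20, 0.20, 0.60,
              0.70, 0.20, 0.10),
            ncol = 3, byrow = TRUE,
            dimnames = list(paste0("sample", 1:4), paste0("cluster", 1:3)))

# Hard assignment: the column with the largest membership in each row.
hard_cluster <- max.col(Z, ties.method = "first")
names(hard_cluster) <- rownames(Z)

# The largest membership per sample, as a rough indicator of how clear-cut
# each assignment is.
top_membership <- apply(Z, 1, max)

print(hard_cluster)
print(table(hard_cluster))
```

With real output, the same two lines would be applied to the `Z` element of the list returned by InfiniumClust.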
Xiaoqi Zheng [email protected] and Hao Wu [email protected]
W. Zhang, H. Feng, H. Wu and X. Zheng (2016). Tumor purity improves cancer subtype classification from DNA methylation data. Submitted.
## load example data
data(beta.emp)
normal.data <- beta.emp[,1:21]
tumor.data <- beta.emp[,22:31]

## estimate tumor purity
purity <- getPurity(tumor.data = tumor.data,tumor.type= "LUAD")

## cluster tumor samples accounting for tumor purity
out <- InfiniumClust(tumor.data,purity,K=3, maxiter=5, tol=0.001)
Infer differentially methylated CpG sites with the consideration of tumor purities.
InfiniumDMC(tumor.data,normal.data,purity,threshold)
tumor.data |
numeric matrix of beta values for tumor samples. The rownames of tumor.data should be probe names of the Infinium 450k array, and colnames should be names of tumor samples. |
normal.data |
numeric matrix of beta values for normal samples. The rownames of normal.data should be probe names of the Infinium 450k array, and colnames should be names of normal samples. |
purity |
purities for tumor samples. Can be estimated by getPurity, or supplied by the user from other tools. |
threshold |
probability threshold in control-free DM calling. Default is 0.1. |
If normal.data is provided, the function tests each CpG site for differential methylation between tumor and normal samples with the consideration of tumor purities by a generalized linear regression. If normal.data is not provided, the function computes posterior probability to rank CpG sites.
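To make the idea of a purity-aware test concrete, the sketch below fits an ordinary linear model for a single simulated CpG site. It is an illustration under stated assumptions rather than the model used by InfiniumDMC: normal samples are treated as having purity 0, the observed beta value is assumed to shift linearly with purity, and the slope on purity is read as the tumor-versus-normal difference. All variable names and simulated values are hypothetical.

```r
set.seed(2)
n_normal <- 20; n_tumor <- 30
tumor_purity <- runif(n_tumor, 0.3, 0.9)

# Simulated CpG: normal level 0.3, pure tumor level 0.7 (difference 0.4).
true_delta <- 0.4
beta_normal <- rnorm(n_normal, 0.3, 0.03)
beta_tumor <- 0.3 + true_delta * tumor_purity + rnorm(n_tumor, 0, 0.03)

# Assumption: normal samples enter the model with purity 0.
cpg <- data.frame(
  beta = c(beta_normal, beta_tumor),
  purity = c(rep(0, n_normal), tumor_purity)
)

# The slope on purity estimates the pure-tumor vs. normal difference.
fit <- lm(beta ~ purity, data = cpg)
purity_row <- coef(summary(fit))["purity", ]

# For comparison: the raw group difference ignores purity and is attenuated.
naive_diff <- mean(beta_tumor) - mean(beta_normal)

print(round(purity_row, 4))
print(round(naive_diff, 3))
```

The naive difference in means is smaller than the simulated difference of 0.4 because every tumor sample is diluted by normal cells, which is the bias a purity-adjusted test is meant to remove.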
A data frame of statistics, p-values and q-values for all CpG sites.
Xiaoqi Zheng [email protected].
X. Zheng, N. Zhang, H.J. Wu and H. Wu, Estimating and accounting for tumor purity in the analysis of DNA methylation data from cancer studies. Genome biology, in revision.
dmpFinder in the minfi package.
## load example data
data(beta.emp)
normal.data <- beta.emp[,1:21]
tumor.data <- beta.emp[,22:61]

## estimate tumor purity
purity <- getPurity(tumor.data = tumor.data,normal.data = normal.data)

## DM calling with normal controls
DMC <- InfiniumDMC(tumor.data = tumor.data,normal.data = normal.data,purity = purity)

## DM calling without normal control
DMC_ctlFree <- InfiniumDMC(tumor.data = tumor.data,purity = purity)
Deconvolute purified tumor methylomes accounting for tumor purity.
InfiniumPurify(tumor.data,normal.data,purity)
tumor.data |
numeric matrix of beta values for tumor samples. The rownames of tumor.data should be probe names of the Infinium 450k array, and colnames should be names of tumor samples. |
normal.data |
numeric matrix of beta values for normal samples. The rownames of normal.data should be probe names of the Infinium 450k array, and colnames should be names of normal samples. |
purity |
purities for tumor samples. Can be estimated by getPurity, or supplied by the user from other tools. |
The function deconvolutes purified tumor methylomes by a linear regression model.
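The intuition behind purification can be shown with a simple algebraic inversion of a two-component mixture. This is a deliberately simplified stand-in, not the regression model the function uses: it assumes observed = purity * tumor + (1 - purity) * normal, takes the per-probe mean of the normal samples as the normal component, and solves for the tumor component. The function name `purify_sketch` and the toy data are hypothetical.

```r
# Invert the assumed mixture model probe by probe and sample by sample.
purify_sketch <- function(tumor_beta, normal_beta, purity) {
  stopifnot(ncol(tumor_beta) == length(purity), all(purity > 0))
  normal_mean <- rowMeans(normal_beta)
  # Expected contribution of normal cells to each tumor sample.
  normal_part <- outer(normal_mean, 1 - purity)
  # Remove it, then rescale each column by that sample's purity.
  pure <- sweep(tumor_beta - normal_part, 2, purity, "/")
  # Keep the result in the valid beta-value range.
  pmin(pmax(pure, 0), 1)
}

# Toy data: 3 probes, 2 normal samples, 2 tumor samples, no noise.
normal_beta <- matrix(c(0.2, 0.5, 0.8), nrow = 3, ncol = 2)
pure_tumor <- c(0.9, 0.5, 0.1)
purity <- c(0.5, 0.8)
tumor_beta <- outer(pure_tumor, purity) + outer(rowMeans(normal_beta), 1 - purity)

purified <- purify_sketch(tumor_beta, normal_beta, purity)
print(purified)
```

Dividing by purity amplifies noise for low-purity samples, which is one reason a model fitted across samples can be preferable to this direct inversion.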
A matrix of purified beta values for all CpG sites (row) and tumor samples (column).
Xiaoqi Zheng [email protected].
X. Zheng, N. Zhang, H.J. Wu and H. Wu, Estimating and accounting for tumor purity in the analysis of DNA methylation data from cancer studies. Genome biology, accepted.
## load example data
data(beta.emp)
normal.data <- beta.emp[,1:21]
tumor.data <- beta.emp[,22:61]

## estimate tumor purity
purity <- getPurity(tumor.data = tumor.data,normal.data = NULL,tumor.type= "LUAD")

## correct tumor methylome by tumor purity
tumor.purified <- InfiniumPurify(tumor.data = tumor.data[1:100,], normal.data = normal.data[1:100,], purity = purity)