DoSurvive is a database and tool for survival analysis of data for 33 cancer types and a number of
covariates including the expression levels of proteins, mRNA, miRNA, lncRNA and methylation levels of CpG sites.
Additionally, DoSurvive provides "Explore Combined Effect", whereby the user can
perform real-time survival analysis for a combination of genes, including mRNAs, miRNAs, and lncRNAs. DoSurvive aims to
offer biologists and cancer researchers a platform for quickly and easily exploring potential prognostic biomarkers in
cancer.
2. What data are in DoSurvive?
There are two kinds of data: clinical data and quantitative features. The quantitative features include expression
levels of mRNAs, miRNAs, lncRNAs and proteins, and methylation levels of CpG sites in the genome.
All the expression values for mRNA, miRNA and protein, as well as the methylation data, for
33 cancer types in TCGA were downloaded from Broad GDAC Firehose. The
expression data for lncRNA were downloaded from the TANRIC
database (Li J, et al.). The clinical data and the survival information, including overall survival (OS),
disease-specific survival (DSS), disease-free interval (DFI), and progression-free survival (PFS), were downloaded from
the Pan-Cancer Atlas.
3. How to use DoSurvive?
There are two main entry points: browsing the database, and real-time survival analysis. A user who is interested in the survival analysis results for a specific single gene or a specific cancer type can click "Browse Database" to browse the precalculated database. A user who is interested in the combined effect of multiple genes can click "Inquiry into Combined Effects" to perform a Log-rank test, or run a Cox proportional hazards model or Accelerated failure-time (AFT) model.
4. What are the data preprocessing steps in DoSurvive?
Only genes/proteins with non-zero values in more than 75% of the samples were included in DoSurvive. Also, patients with incomplete clinical data were excluded from Cox regression and Accelerated failure-time model analyses.
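The two filters described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the DoSurvive pipeline itself: the function names, the dictionary layout, and the clinical field names ("age", "stage") are hypothetical, and a missing clinical value is assumed to be stored as None.

```python
def filter_features(expression, min_nonzero_fraction=0.75):
    """Keep features that are non-zero in MORE than the given fraction of samples."""
    kept = {}
    for feature, values in expression.items():
        if not values:
            continue  # no samples at all: nothing to evaluate
        nonzero = sum(1 for v in values if v != 0)
        if nonzero / len(values) > min_nonzero_fraction:
            kept[feature] = values
    return kept


def complete_cases(patients, required_fields):
    """Keep patients that have a value for every required clinical field."""
    return [p for p in patients
            if all(p.get(field) is not None for field in required_fields)]


# Toy data (hypothetical gene names and clinical fields).
expression = {
    "GENE_A": [5.1, 0.0, 3.2, 4.4],  # exactly 75% non-zero, so not "more than 75%"
    "GENE_B": [1.0, 2.0, 3.0, 4.0],  # 100% non-zero
    "GENE_C": [0.0, 0.0, 1.5, 0.0],  # 25% non-zero
}
patients = [
    {"id": "P1", "age": 60, "stage": "II"},
    {"id": "P2", "age": None, "stage": "III"},  # missing age
    {"id": "P3", "age": 55},                    # missing stage
]

kept_features = filter_features(expression)
kept_patients = complete_cases(patients, ["age", "stage"])
print(list(kept_features))
print([p["id"] for p in kept_patients])
```

Note that the sketch reads "more than 75%" strictly, so a feature that is non-zero in exactly 75% of the samples is dropped.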
5. Can I use the figures or analysis results from DoSurvive in my publications?
Yes. If you use the result tables and/or figures in your research, please cite DoSurvive.
6. Has the raw data been normalized in the survival analysis?
Yes. The raw expression levels and methylation levels were normalized by rank-based inverse normal transformation for the Cox regression and Accelerated failure-time model analyses.
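For readers unfamiliar with the rank-based inverse normal transformation, a minimal Python sketch is given here. It assumes the common Blom offset (c = 3/8) and average ranks for ties; the exact offset and tie handling used by DoSurvive are not stated, so both are assumptions, and the function name is hypothetical.

```python
from statistics import NormalDist


def rank_inverse_normal(values, c=3.0 / 8.0):
    """Map values to normal scores: Phi^-1((rank - c) / (n - 2c + 1))."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at sorted position i.
        j = i
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()  # standard normal: mean 0, sd 1
    return [std_normal.inv_cdf((r - c) / (n - 2.0 * c + 1.0)) for r in ranks]


# Toy expression values with one extreme outlier.
raw = [2.0, 10.0, 0.5, 1000.0, 3.0]
normalized = rank_inverse_normal(raw)
print(normalized)
```

Because only the ranks enter the transformation, the outlier (1000.0) ends up no further from the centre than the smallest value does on the other side, which is the main reason this kind of normalization is used before regression modelling.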
7. What is "Two genes" analysis?
In "Two genes" analysis, the user can investigate the combined effect of two genes on prognosis using Log-rank tests, Cox regression and Accelerated failure-time models. Both the normalized expression levels and relevant clinical features are used in building the models.
For the non-parametric method, the Log-rank test, patients
are split into high and low groups for each gene based on the median of that gene. All of the patients
are thus classified into four groups: high in both genes, high in the first one and low in
the second one, low in the first one and high in the second one, and low in both
genes. Based on these four groups, the user can divide all the patients into
two groups and perform a Log-rank test for the difference between these two groups.
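The grouping and test described above can be sketched as follows. This is a simplified illustration, not DoSurvive's implementation: the function names and toy data are hypothetical, values strictly above the median are assumed to count as "high", and the p-value uses the chi-square approximation with one degree of freedom.

```python
import math
from statistics import median


def four_groups(expr1, expr2):
    """Label each patient high/low for each gene relative to that gene's median."""
    m1, m2 = median(expr1), median(expr2)
    return [("high" if a > m1 else "low") + "/" + ("high" if b > m2 else "low")
            for a, b in zip(expr1, expr2)]


def logrank_test(times, events, in_group1):
    """Two-group log-rank test; returns (chi-square statistic, p-value)."""
    event_times = sorted({t for t, e in zip(times, events) if e})
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        n = sum(1 for x in times if x >= t)                             # at risk, all
        n1 = sum(1 for x, g in zip(times, in_group1) if x >= t and g)   # at risk, group 1
        d = sum(1 for x, e in zip(times, events) if x == t and e)       # events, all
        d1 = sum(1 for x, e, g in zip(times, events, in_group1)
                 if x == t and e and g)                                 # events, group 1
        observed_minus_expected += d1 - d * n1 / n
        if n > 1:
            variance += d * (n1 / n) * (1 - n1 / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0  # no information to separate the groups
    chi2 = observed_minus_expected ** 2 / variance
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # upper tail of chi-square, 1 df
    return chi2, p_value


# Toy data: expression of two genes, follow-up time, and event flag (1 = event).
expr_gene1 = [1, 2, 3, 4, 5, 6, 7, 8]
expr_gene2 = [8, 1, 7, 2, 6, 3, 5, 4]
times = [5, 8, 12, 3, 20, 15, 25, 9]
events = [1, 1, 0, 1, 0, 1, 1, 1]

labels = four_groups(expr_gene1, expr_gene2)
# One possible two-group split: "high in both genes" versus everyone else.
in_high_high = [label == "high/high" for label in labels]
chi2, p_value = logrank_test(times, events, in_high_high)
print(labels)
print(chi2, p_value)
```

Any other partition of the four labels into two sets (for example, "low/low" versus the rest) can be tested the same way by changing how the boolean list is built.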
8. What is ">2 genes" analysis?
In ">2 genes" analysis, the user can investigate the combined effect of multiple genes on prognosis, using
Cox regression and Accelerated failure-time model analyses involving both the normalized expression levels and relevant clinical features.
9. Can I download the tables from DoSurvive?
Yes. Both the precalculated survival analysis results and the raw data used for the statistical analysis
can be downloaded. The user can use the "Download" button above each table to download the precalculated results, or use
the download hyperlink below each survival plot to download the data used for survival analysis.
10. Which web browsers should I use with DoSurvive?
HTML5-compliant browsers, such as Safari, Chrome and Firefox, are recommended.