A wrapper function that reads RNA-seq related datasets from TCGA and GTEx.
initialize_RNAseq_data()
Its side effect is the global variable TCGA_GTEX_RNAseq_sampletype, which
is merged from two internal data frames:
(1) .TCGA_GTEX_RNAseq: the recomputed RNAseq data from both TCGA and GTEx,
generated by .get_TCGA_GTEX_RNAseq(), which imports the dataset
TcgaTargetGtex_RSEM_Hugo_norm_count.
(2) .TCGA_GTEX_sampletype: annotates the features of each sample from
TCGA and GTEx. This data frame is built by importing the
TcgaTargetGTEX_phenotype.txt dataset and applying basic data cleaning steps,
including removal of duplicates and NAs.
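The merge of the two data frames can be illustrated with a small sketch. It is not the package's implementation: the toy values, the gene columns, the assumption that sample IDs are stored as row names, and the "sample" column name are all hypothetical, chosen only to show how an inner join keeps the samples present in both tables.

```r
# Toy stand-ins for the two internal data frames (all values are made up).
# Assumption: rows are samples and row names hold the sample IDs.
rnaseq <- data.frame(
  EIF4E = c(10.2, 9.8, 11.1),
  EIF4G1 = c(12.5, 12.1, 13.0),
  row.names = c("S1", "S2", "S3")
)
sampletype <- data.frame(
  sample.type = c("Primary Tumor", "Normal Tissue"),
  study = c("TCGA", "GTEx"),
  row.names = c("S1", "S2")
)

# Inner join on row names: only samples found in both data frames are kept.
# merge() returns the join key as a first column called "Row.names".
merged <- merge(rnaseq, sampletype, by = "row.names")
names(merged)[1] <- "sample" # hypothetical column name for the sample ID
print(merged)
```

In this sketch, sample S3 has expression values but no annotation, so it is dropped from the merged result.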
To reduce the data size, only the following four relevant columns of
TcgaTargetGTEX_phenotype.txt are selected to construct .TCGA_GTEX_sampletype.
sample.type: column that annotates malignant or normal tissues
primary.disease: column that annotates cancer types for each sample
primary.site: column that annotates the tissue types
study: column that annotates the cohort “TCGA” or “GTEx”
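The column selection and cleaning steps described above might look roughly like the following base R sketch. The toy table, its "sample" and "gender" columns, and the order of the steps are assumptions for illustration; the actual code inside the package may differ.

```r
# Toy phenotype table; the values and the extra "gender" column are made up.
phenotype <- data.frame(
  sample = c("S1", "S2", "S2", "S3", "S4"),
  sample.type = c("Primary Tumor", "Normal Tissue", "Normal Tissue", NA, "Metastatic"),
  primary.disease = c("Lung Adenocarcinoma", "Lung", "Lung", "Breast Invasive Carcinoma", "Skin Cutaneous Melanoma"),
  primary.site = c("Lung", "Lung", "Lung", "Breast", "Skin"),
  study = c("TCGA", "GTEx", "GTEx", "TCGA", "TCGA"),
  gender = c("Female", "Male", "Male", "Female", "Male"),
  stringsAsFactors = FALSE
)

# The four columns named in the documentation.
keep <- c("sample.type", "primary.disease", "primary.site", "study")

# Drop duplicated sample IDs, then use the IDs as row names.
cleaned <- phenotype[!duplicated(phenotype$sample), ]
rownames(cleaned) <- cleaned$sample

# Keep only the relevant columns and remove rows containing NA.
cleaned <- na.omit(cleaned[, keep])
print(cleaned)
```

Here the second S2 row is removed as a duplicate and S3 is removed because its sample.type is NA.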
TCGA_GTEX_RNAseq_sampletype is stored as TCGA_GTEX_RNAseq_sampletype.csv
in the ~/Documents/EIF_output/ProcessedData folder.
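As a rough illustration of storing and reloading such a file, the sketch below writes a toy data frame to a CSV and reads it back. It uses a temporary directory in place of ~/Documents/EIF_output/ProcessedData so that it can be run safely, and it assumes base R's write.csv() and read.csv(); the documentation does not state which functions the package itself uses.

```r
# Assumption: a temporary directory stands in for the documented output folder.
out_dir <- file.path(tempdir(), "ProcessedData")
dir.create(out_dir, recursive = TRUE, showWarnings = FALSE)

# Toy merged table (values are made up); row names hold the sample IDs.
toy <- data.frame(
  EIF4E = c(10.2, 9.8),
  study = c("TCGA", "GTEx"),
  row.names = c("S1", "S2")
)

# Write the table, then read it back with the first column as row names.
out_file <- file.path(out_dir, "TCGA_GTEX_RNAseq_sampletype.csv")
write.csv(toy, out_file)
restored <- read.csv(out_file, row.names = 1)
print(restored)
```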
Other wrapper functions for data initialization:
initialize_cnv_data(),
initialize_data(),
initialize_phosphoproteomics_data(),
initialize_proteomics_data(),
initialize_survival_data()
if (FALSE) {
initialize_RNAseq_data()
}