Biocomputing 2013 - Proceedings Of The Pacific Symposium.
- 1 online resource (471 pages)
Intro -- Modeling cell heterogeneity: from single-cell variations to mixed cells populations -- Computational Challenges of Mass Phenotyping -- The Future of Genome-Based Medicine -- 0session-intro-cdr.pdf -- 1cheng -- 1. Introduction -- 2. Methods -- 2.1. Data sources and data processing -- 2.2. Pair-wise similarity scores -- 2.3. Method nomenclature -- 2.4. AUCs and p-values -- 2.5. Expression signal strength -- 3. Results -- 4. Discussion -- 5. Acknowledgments -- 2felciano -- 3phatak -- 4shi -- 5wang -- 0intro-epigenomics.pdf -- 1ahn -- 2luo -- 3sahu -- 1gabr -- 2gevaert -- 3kim -- 1. Introduction -- 2. Methods -- 2.1. Introduction of the Module Cover Problem -- 2.2. Integrated Module Cover -- 2.3. Two-Step Module Cover -- 3. Results -- 3.1. Analysis of Glioblastoma Multiforme Data from GMDI -- 3.1.1. Comparison of the Module Cover approaches. -- We applied the integrated greedy module cover algorithm with k = 300 and = 1, allowing 5 samples (3%) to be covered fewer than k times to exclude outliers. We discuss parameter selection in more detail in online Appendix Section 2. In particular, we found that the number of non-trivial modules (i.e. ≥ 3 genes) starts to level off at k = 300, prompting us to choose this parameter value for our main analysis. We obtained 249 modules that contained a total of 513 genes including 41 non-singleton. We also computed the entropy of association profiles for each module. Since entropy measures the uncertainty of data, a good quality module (with only a few strong associations) is expected to have low entropy, while entropy increases as the data become more uniformly distributed. Formally, for each module M, we partitioned the range from 0 to strength(M) into 10 bins of equal size and assigned loci to bins according to their significance. 
In each bin, we computed the percentage of loci and defined the entr -- For an association to be specific in a given module, only a few regulatory associations should have highly significant p-values while the remaining loci are expected to have insignificant p-values. Thus, we defined the specificity of a module M as the area of a cumulative histogram of association significance values. Specifically, we partitioned the range from 0 to strength(M) into 10 bins of equal size and defined c_j to be the cumulative percentage of the j-th bin. Then the specificity is defined -- 3.1.2. Analysis of GBM data -- 3.1.3. Analysis of Ovarian Cancer Data -- 4. Discussion -- Uncovering modules that are associated with genomic alterations in a disease is a challenging task as well as an important step toward understanding complex diseases. To address this challenge we introduced a novel technique - module cover - that extends the concept of set cover to network modules. We provided a mathematical formalization of the problem and developed two heuristic solutions: the Integrated Module Cover approach, which greedily selects genes to cover disease cases while simultaneously d. In general, the module cover approach is especially helpful in analyzing and classifying heterogeneous disease cases by exploring the way different combinations of dysregulated modules relate to a particular disease subcategory. Indeed, our analysis indicated that the gene set selected by the module cover approach may be used for classification. Equally important, the selected module covers may help to interpret classifications that were obtained with other methods. -- 5. Materials -- 5.1 Data Treatment for Glioblastoma Multiforme Data from GMDI -- Differentially Expressed Genes: Briefly, all samples were profiled using HG-U133 Plus 2.0 arrays that were normalized at the probe level with dChip (16, 19). 
Among the probesets representing each gene, we chose the one with the highest mean intensity in the tumor and control samples. We determined genes that are differentially expressed in each disease case compared to the non-tumor control cases with a Z-test. For a gene g and case c, we define cover(c, g) to be 1 if nominal p-value < 0.01 and 0 -- eQTL Profiles: To detect copy number alterations, samples were hybridized on the Genechip Human Mapping 100K arrays, and copy numbers were calculated using Affymetrix Copy Number Analysis Tool (CNAT 4). After probe-level normalization and summarization, calculated log2-transformed ratios were used to estimate raw copy numbers. Using a Gaussian approach, raw SNP profiles were smoothed (> 500 kb window by default) and segmented with a Hidden Markov Model approach (20-22). We first performed local c -- 5.2 Data Treatment for Ovarian Cancer Data from TCGA -- 4pendergrass -- 5perez-rathke -- 0intro-pm-rev.pdf -- 1biswas -- 2crawford -- 3flores -- 4huang -- 5li. Alzheimer's disease (AD) is one of the leading causes of death for older people in the US, with rapidly increasing incidence. AD irreversibly and progressively damages the brain, but there are treatments in clinical trials to potentially slow the developme... -- 1. Introduction -- 2.1 Utilizing VARiant Informing MEDicine (VARIMED) -- 3 Result -- 5 Acknowledgments -- 6province -- 0intro-ppg.pdf -- 1bayzid -- 2degnan -- 3kopelman -- 4lin -- 5roch -- 1brown -- 2ding -- 3moore -- 4schrider -- 5singh -- 0intro-text.pdf -- 1bush -- 2holzinger -- 3hu -- 4kolchinsky -- 5seedorff -- 6verspoor -- 1modeling.pdf -- 3ccmp -- 4pm.
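Illustrative note on the Section 3.1.1 excerpt above: the binned entropy it outlines (ten equal-width bins over the range 0 to strength(M), then the share of loci in each bin) could be sketched roughly as follows. The excerpt is cut off before the formula, so the use of Shannon entropy in bits, the function name binned_entropy, and the sample values are assumptions made for illustration, not the authors' stated method.

```python
import math


def binned_entropy(significances, strength, n_bins=10):
    """Shannon entropy (in bits) of loci spread over equal-width bins on [0, strength].

    Assumption: `significances` holds one non-negative association score per
    locus and `strength` is the largest score in the module.
    """
    if strength <= 0 or not significances:
        raise ValueError("need a positive strength and at least one locus")

    counts = [0] * n_bins
    for s in significances:
        # Map the score to a bin index; clamp so s == strength lands in the last bin.
        idx = int(s / strength * n_bins)
        idx = max(0, min(idx, n_bins - 1))
        counts[idx] += 1

    total = len(significances)
    # Empty bins contribute nothing to the entropy sum.
    return -sum((c / total) * math.log2(c / total) for c in counts if c)


# Hypothetical profiles: one dominant association vs. evenly spread scores.
peaked = [0.5] * 9 + [10.0]
uniform = [i + 0.5 for i in range(10)]
print(binned_entropy(peaked, 10.0), binned_entropy(uniform, 10.0))
```

Under these assumptions, a module whose loci are almost all weak with one strong association scores lower than a module whose scores are spread evenly across the bins, which matches the excerpt's expectation that good quality modules have low entropy.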