Department of Mathematics

Publications [#225821] of John Harer

Papers Submitted

  1. J. Perea, A. Deckard, S. Haase and J. Harer, Sliding Windows and 1-Persistence Scoring; Discovering Periodicity in Gene Expression Time Series Data, BMC Bioinformatics (2014)
    (last updated on 2014/12/15)

    ∗Department of Mathematics, Duke University, USA and Institute for Mathematics and its Applications, University of Minnesota, USA.
    †Program in Computational Biology and Bioinformatics, Duke University, USA.
    ‡Center for Systems Biology, Institute for Genome Sciences & Policy, Duke University, USA.
    §Departments of Mathematics, Computer Science and Electrical and Computer Engineering, Duke University, USA.

    Motivation: Identifying periodically expressed genes across different processes such as the cell cycle, circadian rhythms, and metabolic cycles, is a central problem in computational biology. Biological time series data may contain (multiple) unknown signal shapes, have imperfections such as noise, damping, and trending, or have limited sampling density. While many methods exist for detecting periodicity, their design biases can limit their applicability in one or more of these situations.

    Methods: We present in this paper a novel method, SW1PerS, for quantifying periodicity in time series data. The measurement is performed directly, without presupposing a particular shape or pattern, by evaluating the circularity of a high-dimensional representation of the signal. SW1PerS is compared to other algorithms using synthetic data and performance is quantified under varying noise levels, sampling densities, and signal shapes. Results on biological data are also analyzed and compared; this data includes different periodic processes from various organisms: the cell and metabolic cycles in S. cerevisiae, and the circadian rhythms in M. musculus.

    Results: On the task of periodic/not-periodic classification, using synthetic data, SW1PerS performs on par with successful methods in periodicity detection. Moreover, it outperforms Lomb-Scargle and JTK_CYCLE in the high-noise/low-sampling range. SW1PerS is shown to be the most shape-agnostic of the evaluated methods, and the only one to consistently classify damped signals as highly periodic. On biological data, and for several experiments, the lists of top 10% genes ranked with SW1PerS recover up to 67% of those generated with other popular algorithms. Moreover, lists of genes which are highly ranked only by SW1PerS contain non-cosine patterns (e.g. ECM33, CDC9, SAM1,2 and MSH6 in the Yeast metabolic cycle data of Tu et al. (2005)) which are highly periodic. In the Yeast cell cycle data SW1PerS identifies genes not preferred by other algorithms, not previously reported in Orlando et al. (2008); Spellman et al. (1998), but found in other experiments such as the universal growth rate response of Slavov and Botstein (2011). These genes are BOP3, CDC10, YIL108W, YER034W, MLP1, PAC2 and RTT101.
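
    The "high-dimensional representation of the signal" described under Methods can be illustrated with a sliding-window embedding, as the paper's title suggests. The following is only a minimal sketch under assumptions, not the authors' implementation: the function names, the window dimension `dim`, and the delay `tau` are hypothetical choices, and the 1-persistence scoring step itself is not implemented. It shows only that the window vectors of a periodic signal return to their starting point after one period, so the resulting point cloud traces a closed loop.

```python
import math


def sliding_window(signal, dim, tau):
    """Map a 1-D signal to window vectors (f(t), f(t+tau), ..., f(t+dim*tau)).

    `dim` and `tau` are illustrative parameters, not values from the paper.
    """
    span = dim * tau
    return [
        [signal[t + k * tau] for k in range(dim + 1)]
        for t in range(len(signal) - span)
    ]


def distance(u, v):
    """Euclidean distance between two window vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


# Synthetic periodic signal: a cosine with a period of 20 samples.
period = 20
signal = [math.cos(2 * math.pi * t / period) for t in range(100)]

# Each point of the cloud lives in R^(dim + 1).
cloud = sliding_window(signal, dim=4, tau=2)

# After one full period the window vector returns to where it started
# (the curve closes up); after half a period it is far away.
closure_gap = distance(cloud[0], cloud[period])
half_period_gap = distance(cloud[0], cloud[period // 2])
```

    For this cosine, `closure_gap` is zero up to floating-point error while `half_period_gap` is large, which is the loop-like geometry that a circularity measure could then quantify.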