Publications [#341907] of Sayan Mukherjee
Papers Published
Washburne, AD; Silverman, JD; Morton, JT; Becker, DJ; Crowley, D; Mukherjee, S; David, LA; Plowright, RK, Phylofactorization: a graph partitioning algorithm to identify phylogenetic scales of ecological data, Ecological Monographs (January, 2019)
(last updated on 2019/03/27)
Abstract: © 2019 by the Ecological Society of America. The problem of pattern and scale is a central challenge in ecology. In community ecology, an important scale is that at which we aggregate species to define our units of study, such as aggregation of “nitrogen-fixing trees” to understand patterns in carbon sequestration. With the emergence of massive community ecological data sets, there is a need to objectively identify the scales for aggregating species to capture well-defined patterns in community ecological data. The phylogeny is a scaffold for identifying scales of species aggregation associated with macroscopic patterns. Phylofactorization was developed to identify phylogenetic scales underlying patterns in relative abundance data, but many ecological data, such as presence-absences and counts, are not relative abundances yet may still have phylogenetic scales capturing patterns of interest. Here, we broaden phylofactorization to a graph-partitioning algorithm identifying phylogenetic scales in community ecological data. As a graph-partitioning algorithm, phylofactorization connects many tools from data analysis to phylogenetically informed analyses of community ecological data. Two-sample tests identify five phylogenetic factors of mammalian body mass which arose during the K-Pg extinction event, consistent with other analyses of mammalian body mass evolution. Projection of data onto coordinates connecting the phylogeny and graph-partitioning algorithm yields a phylogenetic principal components analysis which refines our understanding of the major sources of variation in the human gut microbiome. These same coordinates allow generalized additive modeling of microbes in Central Park soils, confirming that a large clade of Acidobacteria thrive in neutral soils. The graph-partitioning algorithm extends to generalized linear and additive modeling of exponential family random variables by phylogenetically constrained reduced-rank regression or stepwise factor contrasts.
All of these tools can be implemented with the R package phylofactor.
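
The graph-partitioning idea in the abstract can be illustrated with a small sketch. The Python below is not the phylofactor package and does not reproduce its API; it is a minimal illustration under stated assumptions: the phylogeny is a rooted tree written as nested tuples of tip names, each tip has one numeric trait value, and the objective is the absolute Welch two-sample t statistic. The names `collect_clades`, `welch_t`, `best_split`, and the toy data are hypothetical. The sketch finds only a single split (one candidate "factor"); an iterative procedure would repeat the search within the resulting groups.

```python
from math import sqrt
from statistics import mean, variance


def collect_clades(tree, out):
    """Append the tip set below every node to `out`; return this node's tip set."""
    if isinstance(tree, str):
        tips = frozenset([tree])
    else:
        tips = frozenset().union(*(collect_clades(child, out) for child in tree))
    out.append(tips)
    return tips


def welch_t(a, b):
    """Absolute Welch two-sample t statistic (assumed objective for this sketch)."""
    se = sqrt(variance(a) / len(a) + variance(b) / len(b))
    if se == 0:
        return 0.0 if mean(a) == mean(b) else float("inf")
    return abs(mean(a) - mean(b)) / se


def best_split(tree, trait):
    """Return (score, clade) for the edge whose removal best separates the trait."""
    all_clades = []
    all_tips = collect_clades(tree, all_clades)
    best = None
    for clade in all_clades:
        rest = all_tips - clade
        # A sample variance needs at least two tips on each side of the cut.
        if len(clade) < 2 or len(rest) < 2:
            continue
        score = welch_t([trait[t] for t in clade], [trait[t] for t in rest])
        if best is None or score > best[0]:
            best = (score, clade)
    return best


# Toy data (made up for illustration): tips "a" and "b" form a clade with large values.
tree = ((("a", "b"), "c"), (("d", "e"), "f"))
trait = {"a": 10.0, "b": 11.0, "c": 1.0, "d": 1.2, "e": 0.8, "f": 1.1}

score, clade = best_split(tree, trait)
print(sorted(clade), round(score, 2))
```

Each edge of the tree corresponds to one way of cutting the tips into two groups, so scanning the edges and keeping the best-scoring cut is one simple way to pick a phylogenetic scale for a trait. Swapping `welch_t` for another objective is how such a search could be adapted to counts or presence-absence data.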



