Department of Mathematics


Publications of Sayan Mukherjee


  1. Mukherjee, SP; Sinha, BK; Chattopadhyay, AK, Statistical methods in social science research (October, 2018), pp. 1-152, ISBN 9789811321450 [doi]  [abs]

Papers Published

  1. Berchuck, S; Jammal, A; Mukherjee, S; Somers, T; Medeiros, FA, Impact of anxiety and depression on progression to glaucoma among glaucoma suspects., The British Journal of Ophthalmology, vol. 105 no. 9 (September, 2021), pp. 1244-1249 [doi]  [abs]
  2. Silverman, JD; Bloom, RJ; Jiang, S; Durand, HK; Dallow, E; Mukherjee, S; David, LA, Measuring and mitigating PCR bias in microbiota datasets., PLoS Computational Biology, vol. 17 no. 7 (July, 2021), pp. e1009113 [doi]  [abs]
  3. Wang, B; Sudijono, T; Kirveslahti, H; Gao, T; Boyer, DM; Mukherjee, S; Crawford, L, A statistical pipeline for identifying physical features that differentiate classes of 3D shapes, The Annals of Applied Statistics, vol. 15 no. 2 (June, 2021), pp. 638-661 [doi]  [abs]
  4. Zhang, X; Bashizade, R; Wang, Y; Mukherjee, S; Lebeck, AR, Statistical robustness of Markov chain Monte Carlo accelerators, International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS) (April, 2021), pp. 959-974, ISBN 9781450383172 [doi]  [abs]
  5. Johnston, RA; Vullioud, P; Thorley, J; Kirveslahti, H; Shen, L; Mukherjee, S; Karner, CM; Clutton-Brock, T; Tung, J, Morphological and genomic shifts in mole-rat 'queens' increase fecundity but reduce skeletal integrity., eLife, vol. 10 (April, 2021) [doi]  [abs]
  6. Bryan, J; Mandan, A; Kamat, G; Gottschalk, WK; Badea, A; Adams, KJ; Thompson, JW; Colton, CA; Mukherjee, S; Lutz, MW; Alzheimer's Disease Neuroimaging Initiative, Likelihood ratio statistics for gene set enrichment in Alzheimer's disease pathways., Alzheimers Dement, vol. 17 no. 4 (April, 2021), pp. 561-573 [doi]  [abs]
  7. Li, W; Hannig, J; Mukherjee, S, Subspace clustering through sub-clusters, Journal of Machine Learning Research, vol. 22 (January, 2021)  [abs]
  8. Silverman, JD; Roche, K; Mukherjee, S; David, LA, Naught all zeros in sequence count data are the same., Computational and Structural Biotechnology Journal, vol. 18 (2020), pp. 2789-2798 [doi]  [abs]
  9. Berchuck, SI; Mukherjee, S; Medeiros, FA, Estimating Rates of Progression and Predicting Future Visual Fields in Glaucoma Using a Deep Variational Autoencoder., Scientific Reports, vol. 9 no. 1 (December, 2019), pp. 18113 [doi]  [abs]
  10. Cakir, M; Mukherjee, S; Wood, KC, Label propagation defines signaling networks associated with recurrently mutated cancer genes., Scientific Reports, vol. 9 no. 1 (June, 2019), pp. 9401 [doi]  [abs]
  11. Gao, T; Brodzki, J; Mukherjee, S, The Geometry of Synchronization Problems and Learning Group Actions, Discrete & Computational Geometry (January, 2019) [doi]  [abs]
  12. Washburne, AD; Silverman, JD; Morton, JT; Becker, DJ; Crowley, D; Mukherjee, S; David, LA; Plowright, RK, Phylofactorization: a graph partitioning algorithm to identify phylogenetic scales of ecological data, Ecological Monographs (January, 2019) [doi]  [abs]
  13. Crawford, L; Monod, A; Chen, AX; Mukherjee, S; Rabadán, R, Predicting Clinical Outcomes in Glioblastoma: An Application of Topological and Functional Data Analysis, Journal of the American Statistical Association (January, 2019) [doi]  [abs]
  14. Silverman, JD; Durand, HK; Bloom, RJ; Mukherjee, S; David, LA, Correction to: Dynamic linear models guide design and analysis of microbiota studies within artificial human guts., Microbiome, vol. 6 no. 1 (November, 2018), pp. 212 [doi]  [abs]
  15. Silverman, JD; Durand, HK; Bloom, RJ; Mukherjee, S; David, LA, Dynamic linear models guide design and analysis of microbiota studies within artificial human guts., Microbiome, vol. 6 no. 1 (November, 2018), pp. 202 [doi]  [abs]
  16. Barish, S; Nuss, S; Strunilin, I; Bao, S; Mukherjee, S; Jones, CD; Volkan, PC, Combinations of DIPs and Dprs control organization of olfactory receptor neuron terminals in Drosophila., PLoS Genetics, vol. 14 no. 8 (August, 2018), pp. e1007560 [doi]  [abs]
  17. Tan, Z; Roche, K; Zhou, X; Mukherjee, S, Scalable algorithms for learning high-dimensional linear mixed models, 34th Conference on Uncertainty in Artificial Intelligence 2018, UAI 2018, vol. 1 (January, 2018), pp. 259-268, ISBN 9781510871601  [abs]
  18. Singleton, KR; Crawford, L; Tsui, E; Manchester, HE; Maertens, O; Liu, X; Liberti, MV; Magpusao, AN; Stein, EM; Tingley, JP; Frederick, DT; Boland, GM; Flaherty, KT; McCall, SJ; Krepler, C; Sproesser, K; Herlyn, M; Adams, DJ; Locasale, JW; Cichowski, K; Mukherjee, S; Wood, KC, Melanoma Therapeutic Strategies that Select against Resistance by Exploiting MYC-Driven Evolutionary Convergence., Cell Reports, vol. 21 no. 10 (December, 2017), pp. 2796-2812 [doi]  [abs]
  19. Darnell, G; Georgiev, S; Mukherjee, S; Engelhardt, BE, Adaptive randomized dimension reduction on massive data, Journal of Machine Learning Research, vol. 18 (November, 2017)  [abs]
  20. Gao, T; Yapuncich, GS; Daubechies, I; Mukherjee, S; Boyer, DM, Development and Assessment of Fully Automated and Globally Transitive Geometric Morphometric Methods, With Application to a Biological Comparative Dataset With High Interspecific Variation., The Anatomical Record : Advances in Integrative Anatomy and Evolutionary Biology (October, 2017) [doi]  [abs]
  21. Crawford, L; Wood, KC; Zhou, X; Mukherjee, S, Bayesian Approximate Kernel Regression With Variable Selection, Journal of the American Statistical Association (August, 2017), pp. 1-12 [doi]
  22. Bobrowski, O; Mukherjee, S; Taylor, JE, Topological consistency via kernel estimation, Bernoulli : official journal of the Bernoulli Society for Mathematical Statistics and Probability, vol. 23 no. 1 (February, 2017), pp. 288-328 [doi]
  23. Tan, Z; Mukherjee, S, Partitioned tensor factorizations for learning mixed membership models, 34th International Conference on Machine Learning, ICML 2017, vol. 7 (January, 2017), pp. 5156-5165, ISBN 9781510855144  [abs]
  24. Snyder-Mackler, N; Majoros, WH; Yuan, ML; Shaver, AO; Gordon, JB; Kopp, GH; Schlebusch, SA; Wall, JD; Alberts, SC; Mukherjee, S; Zhou, X; Tung, J, Efficient Genome-Wide Sequencing and Low-Coverage Pedigree Analysis from Noninvasively Collected Samples., Genetics, vol. 203 no. 2 (June, 2016), pp. 699-714 [doi]  [abs]
  25. Zhao, S; Gao, C; Mukherjee, S; Engelhardt, BE, Bayesian group factor analysis with structured sparsity, Journal of Machine Learning Research, vol. 17 (April, 2016), pp. 1-47  [abs]
  26. Galinsky, KJ; Bhatia, G; Loh, P-R; Georgiev, S; Mukherjee, S; Patterson, NJ; Price, AL, Fast Principal-Component Analysis Reveals Convergent Evolution of ADH1B in Europe and East Asia., The American Journal of Human Genetics, vol. 98 no. 3 (March, 2016), pp. 456-472 [doi]  [abs]
  27. Munch, E; Turner, K; Bendich, P; Mukherjee, S; Mattingly, J; Harer, J, Probabilistic Fréchet means for time varying persistence diagrams, Electronic Journal of Statistics, vol. 9 no. 1 (January, 2015), pp. 1173-1204 [repository], [doi]  [abs]
  28. Raskutti, G; Mukherjee, S, The information geometry of mirror descent, Lecture Notes in Computer Science, vol. 9389 (January, 2015), pp. 359-368, ISBN 9783319250397 [doi]  [abs]
  29. Stewart, L; MacLean, EL; Ivy, D; Woods, V; Cohen, E; Rodriguez, K; McIntyre, M; Mukherjee, S; Call, J; Kaminski, J; Miklósi, Á; Wrangham, RW; Hare, B, Citizen Science as a New Tool in Dog Cognition Research., PLoS ONE, vol. 10 no. 9 (January, 2015), pp. e0135176 [doi]  [abs]
  30. Turner, K; Mukherjee, S; Boyer, DM, Persistent homology transform for modeling shapes and surfaces, Information and Inference, vol. 3 no. 4 (January, 2014), pp. 310-344 [doi]  [abs]
  31. Bonnefoi H, Potti A, Delorenzi M, Mauriac L, Campone M, Tubiana-Hulin M, Petit T, Rouanet P, Jassem J, Blot E, Becette V, Farmer P, André S, Acharya CR, Mukherjee S, Cameron D, Bergh J, Nevins JR, Iggo RD., Validation of gene signatures that predict the response of breast cancer to neoadjuvant chemotherapy: a substudy of the EORTC 10994/BIG 00-01 clinical trial., Lancet Oncology, vol. 8 no. 12 (December, 2007), pp. 1071-1078 [entrez]
  32. F. Liang, S. Mukherjee, M. West, Understanding the use of unlabelled data in predictive modelling, Statistical Science, vol. 22 no. 2 (Fall, 2007), pp. 189-205
  33. Jen-Tsan Chi, Edwin H. Rodriguez, Zhen Wang, Dimitry S. A. Nuyten, Sayan Mukherjee, Matt van de Rijn, Marc J. van de Vijver, Trevor Hastie, Patrick O. Brown, Gene Expression Programs of Human Smooth Muscle Cells: Tissue-Specific Differentiation and Prognostic Significance in Breast Cancers, PLoS Genet, vol. 3 no. 9 (September, 2007), pp. 1770-1784 [pdf]
  34. Natesh Pillai, Qiang Wu, Feng Liang, Sayan Mukherjee, Robert L. Wolpert, Characterizing the function space for Bayesian kernel models, Journal of Machine Learning Research, vol. 8 (August, 2007), pp. 1769-1797 [html]  [abs]
  35. Liang Goh, Susan K. Murphy, Sayan Mukherjee, and Terrence S. Furey, Genomic sweeping for hypermethylated genes, Bioinformatics, vol. 23 no. 3 (February, 2007), pp. 281-288 [btl620v1]
  36. Zhong Wang, Huntington F. Willard, Sayan Mukherjee, Terrence S. Furey, Evidence of Influence of Genomic DNA Sequence on Human X Chromosome Inactivation, Public Library of Science Computational Biology, vol. 2 no. 9 (Winter, 2006), pp. 979-988 [available here]
  37. S. Mukherjee and Q. Wu, Estimation of Gradients and Coordinate Covariation in Classification, Journal of Machine Learning Research, vol. 7 (November, 2006), pp. 2481-2514 [html]
  38. S. Mukherjee, DX. Zhou, Learning Coordinate Covariances via Gradients, Journal of Machine Learning Research, vol. 7 (March, 2006), pp. 519-549 [html]
  39. Elena Edelman, Alessandro Porrello, Justin Guinney, Bala Balakumaran, Andrea Bild, Phillip G. Febbo, and Sayan Mukherjee, Analysis of Sample Set Enrichment Scores: assaying the enrichment of sets of genes for individual samples in genome-wide expression profiles, Bioinformatics, vol. 22 no. 14 (2006), pp. e101-e116 [e108]
  40. Daniela Tropea, Gabriel Kreiman, Alvin Lyckman, Sayan Mukherjee, Hongbo Yu, Sam Horng and Mriganka Sur, Gene expression changes and molecular pathways mediating activity-dependent plasticity in visual cortex, Nature Neuroscience, vol. 9 (2006), pp. 660-668 [html]
  41. A. Potti, S. Mukherjee, R. Petersen, HK. Dressman, A. Bild, J. Koontz, R. Kratzke, MA. Watson, M. Kelley, A Genomic Strategy to Refine Prognosis in Early Stage Non-Small Cell Lung Carcinoma, New England Journal of Medicine, vol. 355 no. 6 (2006), pp. 570-580 [pdf]
  42. A. Subramanian, P. Tamayo, VK. Mootha, S. Mukherjee, BL. Ebert, MA. Gillette, A. Paulovich, SL. Pomeroy, TR. Golub, ES. Lander, JP. Mesirov, Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles, PNAS, vol. 102 no. 43 (October, 2005), pp. 15545-15550 [15545]
  43. A. Rakhlin, D. Panchenko, S. Mukherjee, Risk Bounds for Mixture Density Estimation, ESAIM: Probability and Statistics, vol. 9 (June, 2005), pp. 220-229
  44. Sweet-Cordero, A., Mukherjee, S., You, H., Subramanian, S., Ladd, C., Roix, J., Mesirov, J.P., Golub, T.R., Jacks, T, An oncogenic KRAS2 expression signature identified by cross-species gene-expression analysis, Nature Genetics, vol. 37 no. 1 (January, 2005), pp. 48-55 [html]
  45. Wolf, L; Shashua, A; Mukherjee, S, Gene selection via a spectral approach, IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops, vol. 2005-September (January, 2005), ISBN 0769526608 [doi]  [abs]
  46. A. Rakhlin, S. Mukherjee, T. Poggio, Stability Results In Learning Theory, Analysis and Applications, vol. 3 no. 4 (2005), pp. 397-417 [html]
  47. P. Golland, F. Liang, S. Mukherjee, D. Panchenko, Permutation Tests for Classification, in Proceedings of Computational Learning Theory 2005, edited by P. Auer and R. Meir (2005), pp. 501-515, Springer-Verlag
  48. S. Mukherjee, P. Niyogi, T. Poggio, R. Rifkin, Statistical Learning: Stability is Sufficient for Generalization and Necessary and Sufficient for Consistency of Empirical Risk Minimization, Advances in Computational Mathematics, vol. 25 no. 1-3 (2005), pp. 161-193 [contribution.asp]
  49. R. Berger, PG. Febbo, PK. Majumder, JJ. Zhao, S. Mukherjee, T. Campbell, WR. Sellers, TM. Roberts, M. Loda, TR. Golub, WC. Hahn, Androgen-Induced Differentiation and Tumorigenicity of Human Prostate Epithelial Cells, Cancer Research, vol. 64 (December, 2004), pp. 8867-8875 [8867]
  50. T. Poggio, R. Rifkin, S. Mukherjee, P. Niyogi, Learning Theory: general conditions for predictivity, Nature, vol. 428 (March, 2004), pp. 419-422 [html]
  51. R. Rifkin, S. Mukherjee, P. Tamayo, S. Ramaswamy, CH. Yeang, M. Reich, T. Poggio, ES. Lander, TR. Golub, JP. Mesirov, An Analytical Method for Multi-Class Cancer Classification, SIAM Review, vol. 45 no. 4 (Winter, 2003), pp. 706-723 [html]
  52. S. Mukherjee, P. Tamayo, S. Rogers, R. Rifkin, A. Engle, C. Campbell, TR. Golub, JP. Mesirov, Estimating Dataset Size Requirements for Classifying DNA Microarray Data, Journal of Computational Biology, vol. 10 no. 2 (April, 2003), pp. 119-142 [10.1089/106652703321825928]
  53. LD. Miller, PM. Long, L. Wong, S. Mukherjee, LM. McShane, ET. Liu, Optimal gene expression analysis by microarrays, Cancer Cell, vol. 2 (November, 2002), pp. 353-361 [abstract]
  54. S. Pomeroy, P. Tamayo, M. Gaasenbeek, L. Sturla, M. Angelo, M. McLaughlin, J. Kim, L. Goumnerova, P. Black, C. Lau, J. Allen, D. Zagzag, J. Olson, T. Curran, C. Wetmore, J. Biegel, T. Poggio, S. Mukherjee, R. Rifkin, A. Califano, G. Stolovitzky, D. Louis, Prediction of central nervous system embryonal tumour outcome based on gene expression, Nature, vol. 415 no. 24 (January, 2002), pp. 436-442 [html]
  55. Mukherjee, N; Mukherjee, S, Predicting signal peptides with support vector machines, Lecture Notes in Computer Science, vol. 2388 (January, 2002), pp. 1-7, ISBN 354044016X [doi]  [abs]
  56. S. Ramaswamy, P. Tamayo, R. Rifkin, S. Mukherjee, CH. Yeang, M. Angelo, C. Ladd, M. Reich, E. Latulippe, JP. Mesirov, T. Poggio, W. Gerald, M. Loda, ES. Lander, TR. Golub, Multiclass cancer diagnosis using tumor gene expression signatures, PNAS, vol. 98 no. 26 (December, 2001), pp. 15149-15154 [15149]
  57. O. Chapelle, V. Vapnik, O. Bousquet, S. Mukherjee, Choosing Multiple Parameters for Support Vector Machines, Machine Learning, vol. 46 no. 1-3 (March, 2001), pp. 131-159 [contribution.asp]
  58. Peshkin, L; Mukherjee, S, Bounds on sample size for policy evaluation in Markov environments, Lecture Notes in Computer Science, vol. 2111 (January, 2001), pp. 616-629, ISSN 0302-9743  [abs]
  59. J. Weston, S. Mukherjee, O. Chapelle, M. Pontil, T. Poggio, V. Vapnik, Feature Selection for SVMs, in Proceedings of Advances in Neural Information Processing Systems, vol. 14 (2001), pp. 668-674
  60. CH. Yeang, S. Ramaswamy, P. Tamayo, S. Mukherjee, R. Rifkin, M. Angelo, M. Reich, E. Lander, J. Mesirov, T. Golub, Molecular classification of multiple tumor types, Bioinformatics, vol. 1 no. 1 (2001), pp. 1-7 [S316]
  61. Pontil, M; Mukherjee, S; Girosi, F, On the noise model of support vector machines regression, in Proceedings of Algorithmic Learning Theory 11th Conference, Lecture Notes in Computer Science, vol. 1968 (2000), pp. 316-324, Springer, Berlin, ISBN 9783540412373  [abs]
  62. V. Vapnik, S. Mukherjee, Support vector method for multivariate density estimation, in Proceedings of Advances in Neural Information Processing Systems, edited by S. A. Solla, T. K. Leen, and K.-R. Müller, vol. 12 (2000), pp. 659-665

Papers Accepted

  1. E. Edelman, J. Guinney, J-T. Chi, P.G. Febbo, and S. Mukherjee, Modeling Cancer Progression via Pathway Dependencies, Public Library of Science Computational Biology (2007) [html]

Papers Submitted

  1. Q. Wu, S. Mukherjee, F. Liang, Regularized sliced inverse regression for kernel models., Biometrika (2007) [html]
  2. F. Liang, K. Mao, M. Liao, S. Mukherjee and M. West, Non-parametric Bayesian kernel models, Biometrika (2007) [html]
  3. J. Guinney, Q. Wu, and S. Mukherjee, Estimating variable structure and dependence in Multi-task learning via gradients, Journal of Machine Learning Research (2007) [html]
  4. Q. Wu, J. Guinney, M. Maggioni, and S. Mukherjee, Learning gradients: predictive models that infer geometry and dependence, Journal of Machine Learning Research (2007) [html]
  5. S. Mukherjee, Q. Wu, D-X. Zhou, Learning Gradients and Feature Selection on Manifolds, Annals of Statistics (2007) [html]


  1. Huang, B; Jarrett, NWD; Babu, S; Mukherjee, S; Yang, J, Cümülön: Matrix-based data analytics in the cloud with spot instances, in Proceedings of the VLDB Endowment, vol. 9 (January, 2016), pp. 156-167  [abs]