Publications of Sayan Mukherjee


  1. Mukherjee, SP; Sinha, BK; Chattopadhyay, AK, Statistical methods in social science research (October, 2018), pp. 1-152, ISBN 9789811321450

Papers Published

  1. Berchuck, SI; Mukherjee, S; Medeiros, FA, Estimating Rates of Progression and Predicting Future Visual Fields in Glaucoma Using a Deep Variational Autoencoder., Scientific Reports, vol. 9 no. 1 (December, 2019), pp. 18113
  2. Cakir, M; Mukherjee, S; Wood, KC, Label propagation defines signaling networks associated with recurrently mutated cancer genes., Scientific Reports, vol. 9 no. 1 (June, 2019), pp. 9401
  3. Gao, T; Brodzki, J; Mukherjee, S, The Geometry of Synchronization Problems and Learning Group Actions, Discrete & Computational Geometry (January, 2019)
  4. Washburne, AD; Silverman, JD; Morton, JT; Becker, DJ; Crowley, D; Mukherjee, S; David, LA; Plowright, RK, Phylofactorization: a graph partitioning algorithm to identify phylogenetic scales of ecological data, Ecological Monographs (January, 2019)
  5. Crawford, L; Monod, A; Chen, AX; Mukherjee, S; Rabadán, R, Predicting Clinical Outcomes in Glioblastoma: An Application of Topological and Functional Data Analysis, Journal of the American Statistical Association (January, 2019)
  6. Silverman, JD; Durand, HK; Bloom, RJ; Mukherjee, S; David, LA, Correction to: Dynamic linear models guide design and analysis of microbiota studies within artificial human guts., Microbiome, vol. 6 no. 1 (November, 2018), pp. 212
  7. Silverman, JD; Durand, HK; Bloom, RJ; Mukherjee, S; David, LA, Dynamic linear models guide design and analysis of microbiota studies within artificial human guts., Microbiome, vol. 6 no. 1 (November, 2018), pp. 202
  8. Barish, S; Nuss, S; Strunilin, I; Bao, S; Mukherjee, S; Jones, CD; Volkan, PC, Combinations of DIPs and Dprs control organization of olfactory receptor neuron terminals in Drosophila., PLoS Genetics, vol. 14 no. 8 (August, 2018), pp. e1007560
  9. Tan, Z; Roche, K; Zhou, X; Mukherjee, S, Scalable algorithms for learning high-dimensional linear mixed models, 34th Conference on Uncertainty in Artificial Intelligence 2018, UAI 2018, vol. 1 (January, 2018), pp. 259-268, ISBN 9781510871601
  10. Singleton, KR; Crawford, L; Tsui, E; Manchester, HE; Maertens, O; Liu, X; Liberti, MV; Magpusao, AN; Stein, EM; Tingley, JP; Frederick, DT; Boland, GM; Flaherty, KT; McCall, SJ; Krepler, C; Sproesser, K; Herlyn, M; Adams, DJ; Locasale, JW; Cichowski, K; Mukherjee, S; Wood, KC, Melanoma Therapeutic Strategies that Select against Resistance by Exploiting MYC-Driven Evolutionary Convergence., Cell Reports, vol. 21 no. 10 (December, 2017), pp. 2796-2812
  11. Darnell, G; Georgiev, S; Mukherjee, S; Engelhardt, BE, Adaptive randomized dimension reduction on massive data, Journal of machine learning research : JMLR, vol. 18 (November, 2017)
  12. Gao, T; Yapuncich, GS; Daubechies, I; Mukherjee, S; Boyer, DM, Development and Assessment of Fully Automated and Globally Transitive Geometric Morphometric Methods, With Application to a Biological Comparative Dataset With High Interspecific Variation., The Anatomical Record : Advances in Integrative Anatomy and Evolutionary Biology (October, 2017)
  13. Crawford, L; Wood, KC; Zhou, X; Mukherjee, S, Bayesian Approximate Kernel Regression With Variable Selection, Journal of the American Statistical Association (August, 2017), pp. 1-12
  14. Bobrowski, O; Mukherjee, S; Taylor, JE, Topological consistency via kernel estimation, Bernoulli : official journal of the Bernoulli Society for Mathematical Statistics and Probability, vol. 23 no. 1 (February, 2017), pp. 288-328
  15. Tan, Z; Mukherjee, S, Partitioned tensor factorizations for learning mixed membership models, 34th International Conference on Machine Learning, ICML 2017, vol. 7 (January, 2017), pp. 5156-5165, ISBN 9781510855144
  16. Snyder-Mackler, N; Majoros, WH; Yuan, ML; Shaver, AO; Gordon, JB; Kopp, GH; Schlebusch, SA; Wall, JD; Alberts, SC; Mukherjee, S; Zhou, X; Tung, J, Efficient Genome-Wide Sequencing and Low-Coverage Pedigree Analysis from Noninvasively Collected Samples., Genetics, vol. 203 no. 2 (June, 2016), pp. 699-714
  17. Zhao, S; Gao, C; Mukherjee, S; Engelhardt, BE, Bayesian group factor analysis with structured sparsity, Journal of machine learning research : JMLR, vol. 17 (April, 2016), pp. 1-47
  18. Galinsky, KJ; Bhatia, G; Loh, P-R; Georgiev, S; Mukherjee, S; Patterson, NJ; Price, AL, Fast Principal-Component Analysis Reveals Convergent Evolution of ADH1B in Europe and East Asia., The American Journal of Human Genetics, vol. 98 no. 3 (March, 2016), pp. 456-472
  19. Munch, E; Turner, K; Bendich, P; Mukherjee, S; Mattingly, J; Harer, J, Probabilistic Fréchet means for time varying persistence diagrams, Electronic Journal of Statistics, vol. 9 no. 1 (January, 2015), pp. 1173-1204
  20. Raskutti, G; Mukherjee, S, The information geometry of mirror descent, Lecture notes in computer science, vol. 9389 (January, 2015), pp. 359-368, ISBN 9783319250397
  21. Stewart, L; MacLean, EL; Ivy, D; Woods, V; Cohen, E; Rodriguez, K; McIntyre, M; Mukherjee, S; Call, J; Kaminski, J; Miklósi, Á; Wrangham, RW; Hare, B, Citizen Science as a New Tool in Dog Cognition Research., PLoS ONE, vol. 10 no. 9 (January, 2015), pp. e0135176
  22. Turner, K; Mukherjee, S; Boyer, DM, Persistent homology transform for modeling shapes and surfaces, Information and Inference, vol. 3 no. 4 (January, 2014), pp. 310-344
  23. Bonnefoi H, Potti A, Delorenzi M, Mauriac L, Campone M, Tubiana-Hulin M, Petit T, Rouanet P, Jassem J, Blot E, Becette V, Farmer P, André S, Acharya CR, Mukherjee S, Cameron D, Bergh J, Nevins JR, Iggo RD., Validation of gene signatures that predict the response of breast cancer to neoadjuvant chemotherapy: a substudy of the EORTC 10994/BIG 00-01 clinical trial., Lancet Oncology, vol. 8 no. 12 (December, 2007), pp. 1071-1078
  24. F. Liang, S. Mukherjee, M. West, Understanding the use of unlabelled data in predictive modelling, Statistical Science, vol. 22 no. 2 (Fall, 2007), pp. 189-205
  25. Jen-Tsan Chi, Edwin H. Rodriguez, Zhen Wang, Dimitry S. A. Nuyten, Sayan Mukherjee, Matt van de Rijn, Marc J. van de Vijver, Trevor Hastie, Patrick O. Brown, Gene Expression Programs of Human Smooth Muscle Cells: Tissue-Specific Differentiation and Prognostic Significance in Breast Cancers, PLoS Genet, vol. 3 no. 9 (September, 2007), pp. 1770-1784
  26. Natesh Pillai, Qiang Wu, Feng Liang, Sayan Mukherjee, Robert L. Wolpert, Characterizing the function space for Bayesian kernel models, Journal of Machine Learning Research, vol. 8 (August, 2007), pp. 1769-1797
  27. Liang Goh, Susan K. Murphy, Sayan Mukherjee, and Terrence S. Furey, Genomic sweeping for hypermethylated genes, Bioinformatics, vol. 23 no. 3 (February, 2007), pp. 281-288
  28. Zhong Wang, Huntington F. Willard, Sayan Mukherjee, Terrence S. Furey, Evidence of Influence of Genomic DNA Sequence on Human X Chromosome Inactivation, Public Library of Science Computational Biology, vol. 2 no. 9 (Winter, 2006), pp. 979-988
  29. S. Mukherjee and Q. Wu, Estimation of Gradients and Coordinate Covariation in Classification, Journal of Machine Learning Research, vol. 7 (November, 2006), pp. 2481-2514
  30. S. Mukherjee, DX. Zhou, Learning Coordinate Covariances via Gradients, Journal of Machine Learning Research, vol. 7 (March, 2006), pp. 519-549
  31. Elena Edelman, Alessandro Porrello, Justin Guinney, Bala Balakumaran, Andrea Bild, Phillip G. Febbo, and Sayan Mukherjee, Analysis of Sample Set Enrichment Scores: assaying the enrichment of sets of genes for individual samples in genome-wide expression profiles, Bioinformatics, vol. 22 no. 14 (2006), pp. e101-e116
  32. Daniela Tropea, Gabriel Kreiman, Alvin Lyckman, Sayan Mukherjee, Hongbo Yu, Sam Horng and Mriganka Sur, Gene expression changes and molecular pathways mediating activity-dependent plasticity in visual cortex, Nature Neuroscience, vol. 9 (2006), pp. 660-668
  33. A. Potti, S. Mukherjee, R. Petersen, HK. Dressman, A. Bild, J. Koontz, R. Kratzke, MA. Watson, M. Kelley, A Genomic Strategy to Refine Prognosis in Early Stage Non-Small Cell Lung Carcinoma, New England Journal of Medicine, vol. 355 no. 6 (2006), pp. 570-580
  34. A. Subramanian, P. Tamayo, VK. Mootha, S. Mukherjee, BL. Ebert, MA. Gillette, A. Paulovich, SL. Pomeroy, TR. Golub, ES. Lander, JP. Mesirov, Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles, PNAS, vol. 102 no. 43 (October, 2005), pp. 15278-9
  35. A. Rakhlin, D. Panchenko, S. Mukherjee, Risk Bounds for Mixture Density Estimation, ESAIM: Probability and Statistics, vol. 9 (June, 2005), pp. 220-229
  36. Sweet-Cordero, A., Mukherjee, S., You, H., Subramanian, S., Ladd, C., Roix, J., Mesirov, J.P., Golub, T.R., Jacks, T., An oncogenic KRAS2 expression signature identified by cross-species gene-expression analysis, Nature Genetics, vol. 37 no. 1 (January, 2005), pp. 48-55
  37. A. Rakhlin, S. Mukherjee, T. Poggio, Stability Results In Learning Theory, Analysis and Applications, vol. 3 no. 4 (2005), pp. 397-417
  38. P. Golland, F. Liang, S. Mukherjee, D. Panchenko, Permutation Tests for Classification, in Proceedings of Computational Learning Theory 2005, edited by P. Auer and R. Meir (2005), pp. 501-515, Springer-Verlag
  39. S. Mukherjee, P. Niyogi, T. Poggio, R. Rifkin, Statistical Learning: Stability is Sufficient for Generalization and Necessary and Sufficient for Consistency of Empirical Risk Minimization, Advances in Computational Mathematics, vol. 25 no. 1-3 (2005), pp. 161-193
  40. R. Berger, PG. Febbo, PK. Majumder, JJ. Zhao, S. Mukherjee, T. Campbell, WR. Sellers, TM. Roberts, M. Loda, TR. Golub, WC. Hahn, Androgen-Induced Differentiation and Tumorigenicity of Human Prostate Epithelial Cells, Cancer Research, vol. 64 (December, 2004), pp. 8867-8875
  41. T. Poggio, R. Rifkin, S. Mukherjee, P. Niyogi, Learning Theory: general conditions for predictivity, Nature, vol. 428 (March, 2004), pp. 419-422
  42. R. Rifkin, S. Mukherjee, P. Tamayo, S. Ramaswamy, CH. Yeang, M. Reich, T. Poggio, ES. Lander, TR. Golub, JP. Mesirov, An Analytical Method for Multi-Class Cancer Classification, SIAM Review, vol. 45 no. 4 (Winter, 2003), pp. 706-723
  43. S. Mukherjee, P. Tamayo, S. Rogers, R. Rifkin, A. Engle, C. Campbell, TR. Golub, JP. Mesirov, Estimating Dataset Size Requirements for Classifying DNA Microarray Data, Journal of Computational Biology, vol. 10 no. 2 (April, 2003), pp. 119-142
  44. LD. Miller, PM. Long, L. Wong, S. Mukherjee, LM. McShane, ET. Liu, Optimal gene expression analysis by microarrays, Cancer Cell, vol. 2 (November, 2002), pp. 353-361
  45. S. Pomeroy, P. Tamayo, M. Gaasenbeek, L. Sturla, M. Angelo, M. McLaughlin, J. Kim, L. Goumnerova, P. Black, C. Lau, J. Allen, D. Zagzag, J. Olson, T. Curran, C. Wetmore, J. Biegel, T. Poggio, S. Mukherjee, R. Rifkin, A. Califano, G. Stolovitzky, D. Louis, Prediction of central nervous system embryonal tumour outcome based on gene expression, Nature, vol. 415 no. 24 (January, 2002), pp. 436-442
  46. Mukherjee, N; Mukherjee, S, Predicting signal peptides with support vector machines, Lecture notes in computer science, vol. 2388 (January, 2002), pp. 1-7, ISBN 354044016X
  47. S. Ramaswamy, P. Tamayo, R. Rifkin, S. Mukherjee, CH. Yeang, M. Angelo, C. Ladd, M. Reich, E. Latulippe, JP. Mesirov, T. Poggio, W. Gerald, M. Loda, ES. Lander, TR. Golub, Multiclass cancer diagnosis using tumor gene expression signatures, PNAS, vol. 98 no. 26 (December, 2001), pp. 15149-15154
  48. O. Chapelle, V. Vapnik, O. Bousquet, S. Mukherjee, Choosing Multiple Parameters for Support Vector Machines, Machine Learning, vol. 46 no. 1-3 (March, 2001), pp. 131-159
  49. Peshkin, L; Mukherjee, S, Bounds on sample size for policy evaluation in Markov environments, Lecture notes in computer science, vol. 2111 (January, 2001), pp. 616-629, ISSN 0302-9743
  50. J. Weston, S. Mukherjee, O. Chapelle, M. Pontil, T. Poggio, V. Vapnik, Feature Selection for SVMs, in Proceedings of Advances in Neural Information Processing Systems, vol. 14 (2001), pp. 668-674
  51. CH. Yeang, S. Ramaswamy, P. Tamayo, S. Mukherjee, R. Rifkin, M. Angelo, M. Reich, E. Lander, J. Mesirov, T. Golub, Molecular classification of multiple tumor types, Bioinformatics, vol. 1 no. 1 (2001), pp. 1-7
  52. Pontil, M; Mukherjee, S; Girosi, F, On the noise model of support vector machines regression, in Proceedings of Algorithmic Learning Theory 11th Conference, Lecture notes in computer science, vol. 1968 (2000), pp. 316-324, Springer, Berlin, ISBN 9783540412373
  53. V. Vapnik, S. Mukherjee, Support vector method for multivariate density estimation, in Proceedings of Advances in Neural Information Processing Systems, edited by S. A. Solla, T. K. Leen, and K.R. Muller, vol. 12 (2000), pp. 659-665

Papers Accepted

  1. E. Edelman, J. Guinney, J-T. Chi, P.G. Febbo, and S. Mukherjee, Modeling Cancer Progression via Pathway Dependencies, Public Library of Science Computational Biology (2007)

Papers Submitted

  1. Q. Wu, S. Mukherjee, F. Liang, Regularized sliced inverse regression for kernel models., Biometrika (2007)
  2. F. Liang, K. Mao, M. Liao, S. Mukherjee and M. West, Non-parametric Bayesian kernel models, Biometrika (2007)
  3. J. Guinney, Q. Wu, and S. Mukherjee, Estimating variable structure and dependence in Multi-task learning via gradients, Journal of Machine Learning Research (2007)
  4. Q. Wu, J. Guinney, M. Maggioni, and S. Mukherjee, Learning gradients: predictive models that infer geometry and dependence, Journal of Machine Learning Research (2007)
  5. S. Mukherjee, Q. Wu, D-X. Zhou, Learning Gradients and Feature Selection on Manifolds, Annals of Statistics (2007)


  1. Huang, B; Jarrett, NWD; Babu, S; Mukherjee, S; Yang, J, Cümülön: Matrix-Based data analytics in the cloud with spot instances, in Proceedings of the VLDB Endowment, vol. 9 (January, 2016), pp. 156-167