Guido Montúfar's Homepage

Guido Montúfar

UCLA Mathematics and Statistics & Data Science
Los Angeles, CA 90095, USA

Phone:	+1 (310) 206-2671
Fax:	+1 (310) 206-6673
Office:	7620E Math Sciences
Email:	montufar at math.ucla.edu

Short CV

Professor, Departments of Mathematics and of Statistics & Data Science, UCLA (since 2024)

Associate Professor, Departments of Mathematics and Statistics & Data Science, UCLA (2022-2024)

Research Group Leader, Mathematical Machine Learning Group, Max Planck Institute for Mathematics in the Sciences (since 2018)

Assistant Professor, Departments of Mathematics and Statistics, UCLA (2017-2022)

Postdoc, Max Planck Institute for Mathematics in the Sciences, Information Theory of Cognitive Systems Group (2013-2017)

Research Associate, Department of Mathematics, Pennsylvania State University (2012-2013)

Dr. rer. nat. in Mathematics, MPI MIS/Leipzig University (2012)

Diplom Physiker, TU Berlin (2009)

Diplom Mathematiker, TU Berlin (2007)

Research Interests

Mathematical Machine Learning

Deep Learning Theory

Optimization and Implicit Biases

Verification and Reliable AI

Algebraic and Geometric Methods in AI

Activities

Math Machine Learning Seminar MPI MIS + UCLA

SECAI School of Embedded Composite Artificial Intelligence

Conference on Mathematics of Machine Learning 2025

MFO Workshop Modern and Emerging Phenomena in Machine Learning 2026

ICML Workshop CBT 2026

IPAM Workshop Algebraic Geometry: A Window to Machine Learning 2027

Grants and awards

RTG: Statistics and Data Theory - Engaging the Future of Data Science

DARPA AIQ: Constraints for Provable Extrapolation in AI

NSF Collaborative Research: AIMing: Synergistic Advancement of AI and Mathematics

ERC Starting Grant: Deep Learning Theory 2018-2023

DFG SPP 2298 Theoretical Foundations of Deep Learning: Combinatorial and Implicit Approaches to Deep Learning

NSF CAREER: Neural Networks in the Practical Regime

2022 Sloan Research Fellowship

NSF Collaborative Research: RI: Medium: MoDL: Occam's Razor in Deep and Physical Learning

Publications

2026

TriSearch: Learning to Optimize Triangulations via Bistellar Flips.
Yiran Wang, Guido Montufar. Preprint [arXiv:2605.30220].

Stress-Testing Neural Network Verifiers with Provably Robust Instances.
David Troxell, Yulia Alexandr, Sofia Hunt, Stephanie Lei, Guido Montufar. Preprint [arXiv:2605.17153]. Workshop version at ICML 2026 Workshop Combining Theory and Benchmarks. Repo [GitHub].

Implicit Bias of Mirror Flow in Homogeneous Neural Networks: Sparse and Dense Feature Learning.
Tom Jacobs, Guido Montufar. Preprint [arXiv:2605.19458].

The Symmetries of Three-Layer ReLU Networks.
Johanna Marie Gegenfurtner, Moritz Grillo, Guido Montufar. Preprint [arXiv:2605.18319].

Most ReLU Networks Admit Identifiable Parameters.
Moritz Grillo, Guido Montufar. Preprint [arXiv:2605.03601].

The Value Function Semi-Algebraic Set in Partially Observable Markov Decision Processes.
Ryan Anderson, Guido Montufar. In Forty-third International Conference on Machine Learning (ICML 2026). Preprint [arXiv:2606.03048]. Repo [GitHub].

Differentiable Optimization Layers for Guaranteed Fairness in Deep Learning.
David Troxell, Noah Roemer, Guido Montufar. In Forty-third International Conference on Machine Learning (ICML 2026). Preprint [arXiv:2605.17118]. Repo [GitHub].

Harmful Overfitting in Sobolev Spaces.
Kedar Karhadkar, Alexander Sietsema, Deanna Needell, Guido Montufar. In Forty-third International Conference on Machine Learning (ICML 2026). Preprint [arXiv:2602.00825].

Algebraic Invariants of Lightning Self-Attention.
Yulia Alexandr, Hao Duan, Guido Montufar. Preprint [arXiv:2604.15632]. Repo [GitHub].

Algebraic Robustness Verification of Neural Networks.
Yulia Alexandr, Hao Duan, Guido Montufar. Preprint [arXiv:2602.06105]. Repo [GitHub].

Gradient Descent with Large Step Sizes: Chaos and Fractal Convergence Region.
Shuang Liang, Guido Montufar. In International Conference on Learning Representations (ICLR 2026). Preprint [arXiv:2509.25351]. Repo [GitHub].

2025

Supermodular Rank: Set Function Decomposition and Optimization.
Rishi Sonthalia, Anna Seigal, Guido Montufar. SIAM Journal on Mathematics of Data Science 7(4), 2025. Preprint [arXiv:2305.14632]. Repo [GitHub].

Hadamard ranks of algebraic varieties.
Dario Antolini, Guido Montufar, Alessandro Oneto. Preprint [arXiv:2510.05231]. Repo [zenodo].

Low Rank Gradients and Where to Find Them.
Rishi Sonthalia, Michael Murray, Guido Montufar. Neural Information Processing Systems (NeurIPS 2025). Workshop version presented at ICML 2025 Workshop HiLD. Preprint [arXiv:2510.01303]. Repo [GitHub].

Zero-Shot Context Generalization in Reinforcement Learning from Few Training Contexts.
James Chapman, Kedar Karhadkar, Guido Montufar. Neural Information Processing Systems (NeurIPS 2025). Preprint [arXiv:2507.07348]. Repo [GitHub].

Enumeration of max-pooling responses with generalized permutohedra.
Laura Escobar, Patricio Gallardo, Javier Gonzalez-Anaya, Jose L Gonzalez, Guido Montufar, Alejandro H Morales. Annals of Combinatorics, 2025. Preprint [arXiv:2209.16978]. Repo [GitHub].

Constraining the outputs of ReLU neural networks.
Yulia Alexandr, Guido Montufar. Preprint [arXiv:2508.03867].

Understanding learning invariance in deep linear networks.
Hao Duan, Guido Montufar. Preprint [arXiv:2506.13714].

Fisher-Rao Gradient Flows of Linear Programs and State-Action Natural Policy Gradients.
Johannes Müller, Semih Çaycı, Guido Montufar. SIAM Journal on Optimization 35(2), 2025. Preprint [arXiv:2403.19448]. Repo [GitHub].

On the local complexity of linear regions in deep ReLU networks.
Niket Patel, Guido Montufar. International Conference on Machine Learning (ICML 2025). Preprint [arXiv:2412.18283].

Implicit Bias of Mirror Flow for Shallow Neural Networks in Univariate Regression.
Shuang Liang, Guido Montufar. International Conference on Learning Representations (ICLR 2025). Preprint [arXiv:2410.03988]. Repo [GitHub].

Demystifying Topological Message-Passing with Relational Structures: A Case Study on Oversquashing in Simplicial Message-Passing.
Diaaeldin Taha, James Chapman, Marzieh Eidi, Karel Devriendt, Guido Montufar. International Conference on Learning Representations (ICLR 2025). Preprint [arXiv:2506.06582]. Repo [GitHub].

2024

Mildly Overparameterized ReLU Networks Have a Favorable Loss Landscape.
Kedar Karhadkar, Michael Murray, Hanna Tseran, Guido Montufar. Transactions on Machine Learning Research, 2024. Preprint [arXiv:2305.19510]. Repo [GitHub].

Function space and critical points of linear convolutional networks.
Kathlen Kohn, Guido Montufar, Vahid Shahverdi, Matthew Trager. SIAM Journal on Applied Algebra and Geometry 8(2), 2024. Preprint [arXiv:2304.05752].

Bounds for the smallest eigenvalue of the NTK for arbitrary spherical data of arbitrary dimension.
Kedar Karhadkar, Michael Murray, Guido Montufar. Neural Information Processing Systems (NeurIPS 2024). Preprint [arXiv:2405.14630].

Benign overfitting in leaky ReLU networks with moderate input dimension.
Kedar Karhadkar, Erin George, Michael Murray, Guido Montufar, Deanna Needell. Neural Information Processing Systems (NeurIPS 2024). Preprint [arXiv:2403.06903].

The Real Tropical Geometry of Neural Networks.
Marie-Charlotte Brandenburg, Georg Loho, Guido Montufar. Transactions on Machine Learning Research, 2024. Preprint [arXiv:2403.11871].

Pull-back Geometry of Persistent Homology Encodings.
Shuang Liang, Renata Turkes, Jiayi Li, Nina Otter, Guido Montufar. Transactions on Machine Learning Research, 2024. Preprint [arXiv:2310.07073]. Repo [GitHub]. Video [Video].

2023

Continuity and additivity properties of information decompositions.
Johannes Rauh, Pradeep Kr. Banerjee, Eckehard Olbrich, Guido Montufar, Jürgen Jost. International Journal of Approximate Reasoning vol 161, 2023. Workshop version in 12th Workshop on Uncertainty Processing (WUPES 2022). Preprint [arXiv:2204.10982].

Algebraic Optimization of Sequential Decision Problems.
Mareike Dressler, Marina Garrote-Lopez, Guido Montufar, Johannes Müller, Kemal Rose. Journal of Symbolic Computation vol 121, 2023. Preprint [arXiv:2211.09439]. Repo [GitHub].

Geometry and convergence of natural policy gradients.
Johannes Müller, Guido Montufar. Information Geometry, 2023. Preprint [arXiv:2211.02105]. Repo [GitHub].

Implicit bias of gradient descent for mean squared error regression with two-layer wide neural networks.
Hui Jin and Guido Montufar. Journal of Machine Learning Research JMLR 24(137):1-97, 2023. Repo [GitHub]. Preprint [arXiv:2006.07356].

Critical points and convergence analysis of generative deep linear networks trained with Bures-Wasserstein loss.
Pierre Brechet, Katerina Papagiannouli, Jing An, Guido Montufar. In International Conference on Machine Learning (ICML 2023). Preprint [arXiv:2303.03027].

Expected Gradients of Maxout Networks and Consequences to Parameter Initialization.
Hanna Tseran, Guido Montufar. In International Conference on Machine Learning (ICML 2023). Preprint [arXiv:2301.06956]. Repo [GitHub].

Characterizing the Spectrum of the NTK via a Power Series Expansion.
Michael Murray, Hui Jin, Benjamin Bowman, Guido Montufar. In International Conference on Learning Representations (ICLR 2023). Repo [GitHub]. Virtual poster [ICLR]. Preprint [arXiv:2211.07844].

FoSR: First-order spectral rewiring for addressing oversquashing in GNNs.
Kedar Karhadkar, Pradeep Kr Banerjee, Guido Montufar. In International Conference on Learning Representations (ICLR 2023). Repo [GitHub]. Virtual poster [ICLR]. Preprint [arXiv:2210.11790].

2022

Stochastic Feedforward Neural Networks: Universal Approximation.
T. Merkh and G. Montufar. In Mathematical Aspects of Deep Learning, Cambridge University Press, 2022. Preprint [arXiv:1910.09763], [RG].

Sharp bounds for the number of regions of maxout networks and vertices of Minkowski sums.
Guido Montufar, Yue Ren, and Leon Zhang. SIAM Journal on Applied Algebra and Geometry 6(4), 2022. Preprint [arXiv:2104.08135].

Spectral Bias Outside the Training Set for Deep Networks in the Kernel Regime.
Benjamin Bowman and Guido Montufar. In Advances in Neural Information Processing Systems (NeurIPS 2022). Virtual poster [SlidesLive]. Preprint [arXiv:2206.02927].

On the effectiveness of persistent homology.
Renata Turkeš, Guido Montufar, and Nina Otter. In Advances in Neural Information Processing Systems (NeurIPS 2022). Repo [GitHub]. Virtual poster [SlidesLive]. Preprint [arXiv:2206.10551].

Oversquashing in GNNs through the lens of information contraction and graph expansion.
Pradeep Kr. Banerjee, Kedar Karhadkar, YuGuang Wang, Uri Alon, and Guido Montufar. In 58th Annual Allerton Conference on Communication, Control, and Computing (Allerton 2022). Preprint [arXiv:2208.03471].

Cell graph neural networks enable the precise prediction of patient survival in gastric cancer.
Yanan Wang, YuGuang Wang, Changyuan Hu, Ming Li, Yanan Fan, Nina Otter, Ikuan Sam, Hongquan Gou, Yiqun Hu, Terry Kwok, John Zalcberg, Alex Boussioutas, Roger J Daly, Guido Montufar, Pietro Lio, Dakang Xu, Geoffrey I Webb, and Jiangning Song. npj Precision Oncology 6, Article number: 45 (2022).

Geometry of Linear Convolutional Networks.
Kathlen Kohn, Thomas Merkh, Guido Montufar, Matthew Trager. SIAM Journal on Applied Algebra and Geometry 6(3), 2022. Preprint [arXiv:2108.01538].

Solving infinite-horizon POMDPs with memoryless stochastic policies in state-action space.
Johannes Müller and Guido Montufar. In Reinforcement Learning and Decision Making (RLDM 2022). Repo [GitHub]. Preprint [arXiv:2205.14098].

Continuity and additivity properties of information decompositions.
Johannes Rauh, Pradeep Kr. Banerjee, Eckehard Olbrich, Guido Montufar, Jürgen Jost. In 12th Workshop on Uncertainty Processing (WUPES 2022). Preprint [arXiv:2204.10982].

Implicit Bias of MSE Gradient Optimization in Underparameterized Neural Networks.
Benjamin Bowman and Guido Montufar. In The Tenth International Conference on Learning Representations (ICLR 2022). Virtual poster [ICLR]. Preprint [arXiv:2201.04738].

Learning curves for Gaussian process regression with power-law priors and targets.
Hui Jin, Pradeep Kr. Banerjee, and Guido Montufar. In The Tenth International Conference on Learning Representations (ICLR 2022). Preprint [arXiv:2110.12231]. Slides [GitHub].
Power-law asymptotics of the generalization error for GP regression under power-law priors and targets.
Workshop version presented at Workshop on Bayesian Deep Learning NeurIPS 2021.

The Geometry of Memoryless Stochastic Policy Optimization in Infinite-Horizon POMDPs.
Johannes Müller and Guido Montufar. In The Tenth International Conference on Learning Representations (ICLR 2022). Virtual poster [ICLR]. Repo [GitHub]. Preprint [arXiv:2110.07409].

2021

Weisfeiler and Lehman go cellular: CW networks.
Christian Bodnar, Fabrizio Frasca, Nina Otter, Yu Guang Wang, Pietro Lio, Guido Montufar, and Michael Bronstein. Advances in Neural Information Processing Systems 34 (NeurIPS 2021). Repo [GitHub]. Virtual poster [SlidesLive]. Preprint [arXiv:2106.12575].

On the expected complexity of maxout networks.
Hanna Tseran and Guido Montufar. Advances in Neural Information Processing Systems 34 (NeurIPS 2021). Repo [GitHub]. Virtual poster [SlidesLive]. Preprint [arXiv:2107.00379].

A top-down approach to attain decentralized multi-agents.
Alex Tong Lin, Guido Montufar, and Stanley Osher. Handbook of Reinforcement Learning and Control, pp 419-431, Springer, 2021.

Distributed learning via filtered hyperinterpolation on manifolds.
Guido Montufar and Yu Guang Wang. Foundations of Computational Mathematics 22, 1219-1271, 2021. Preprint [arXiv:2007.09392].

How framelets enhance graph neural networks.
Xuebin Zheng, Bingxin Zhou, Junbin Gao, Yu Guang Wang, Pietro Lio, Ming Li, and Guido Montufar. In Proceedings of the 38th International Conference on Machine Learning (ICML 2021), PMLR 139:12761-12771, 2021. Repo [GitHub]. Preprint [arXiv:2102.06986].

Weisfeiler and Lehman go topological: message passing simplicial networks.
Christian Bodnar, Fabrizio Frasca, Yu Guang Wang, Nina Otter, Guido Montufar, Pietro Lio, and Michael Bronstein. In Proceedings of the 38th International Conference on Machine Learning (ICML 2021), PMLR 139:1026-1037, 2021. Virtual poster [ICML]. Preprint [arXiv:2103.03212].
Workshop version presented at Workshop on Geometrical and Topological Representation Learning ICLR, 2021. Virtual poster [SlidesLive].

Tight bounds on the smallest eigenvalue of the neural tangent kernel for deep ReLU networks.
Quynh Nguyen, Marco Mondelli, and Guido Montufar. In Proceedings of the 38th International Conference on Machine Learning (ICML 2021), PMLR 139:8119-8129, 2021. Preprint [arXiv:2012.11654].

Information complexity and generalization bounds.
Pradeep Kumar Banerjee and Guido Montufar. In IEEE International Symposium on Information Theory (ISIT 2021). Preprint [arXiv:2105.01747].

Wasserstein proximal of GANs.
Alex Tong Lin, Wuchen Li, Stanley Osher, and Guido Montufar. In Proceedings of the 5th International Conference Geometric Science of Information (GSI 2021), LNCS, vol 12829, pp 524-533, Springer, 2021. Preprint [arXiv:2102.06862], [CAM report 18-53], [RG]. Poster [pdf].

Decentralized multi-agents by imitations of a centralized controller.
Alex Tong Lin, Mark J. Debord, Katia Estabridis, Gary Hewer, Guido Montufar, and Stanley Osher. In 2nd Annual Conference on Mathematical and Scientific Machine Learning (MSML 2021). Preprint [arXiv:1902.02311].

Wasserstein distance to independence models.
Tuerkue Ozluem Celik, Asgar Jamneshan, Guido Montufar, Bernd Sturmfels, and Lorenzo Venturello. Journal of Symbolic Computation, 104:855-873, 2021. Preprint [arXiv:2003.06725].

PAC-Bayes and information complexity.
Pradeep Kr. Banerjee and Guido Montufar. Presented at Workshop on neural compression: from information theory to applications ICLR, 2021. Poster [GitHub].

2020

Optimization theory for ReLU neural networks trained with normalization layers.
Yonatan Dukler, Quanquan Gu, and Guido Montufar. In Proceedings of the 37th International Conference on Machine Learning (ICML 2020), PMLR 119:2751-2760, 2020. Virtual poster [ICML]. Preprint [arXiv:2006.06878].

Haar graph pooling.
Yu Guang Wang, Ming Li, Zheng Ma, Guido Montufar, Xiaosheng Zhuang, Yanan Fan. In Proceedings of the 37th International Conference on Machine Learning (ICML 2020), PMLR 119:9952-9962, 2020. Repo [GitHub]. Virtual poster [ICML]. Preprint [arXiv:1909.11580].

The variational deficiency bottleneck.
Pradeep Kumar Banerjee and Guido Montufar. In Proceedings of the International Joint Conference on Neural Networks (IJCNN 2020). Preprint [arXiv:1810.11677].

Kernelized Wasserstein Natural Gradient.
Michael Arbel, Arthur Gretton, Wuchen Li, Guido Montufar. In International Conference on Learning Representations (ICLR 2020). Repo [GitHub]. Preprint [arXiv:1910.09652].

Factorized Mutual Information Maximization.
Thomas Merkh and Guido Montufar. Kybernetika 56(5):948-978, 2020. Preprint [arXiv:1906.05460], [RG].

Ricci curvature for parametric statistics via optimal transport.
W. Li and G. Montufar. Information Geometry 3(1):89-117, 2020. Preprint [arXiv:1807.07095].

Can neural networks learn persistent homology features?
Guido Montufar, Nina Otter, and Yu Guang Wang. Presented at Workshop on topological data analysis and beyond NeurIPS, 2020. Preprint [arXiv:2011.14688].

2019

Optimal Transport to a Variety.
T. O. Celik, A. Jamneshan, G. Montufar, B. Sturmfels, L. Venturello. In International Conference on Mathematical Aspects of Computer and Information Sciences (MACIS 2019), LNCS, vol 11989, pp 364-381, Springer, 2020. Preprint [arXiv:1909.11716], [MPI MIS 7/2021].

Affine Natural Proximal Learning.
Wuchen Li, Alex Lin and Guido Montufar. In Proceedings of the 4th International Conference Geometric Science of Information (GSI 2019), LNCS, vol 11712, pp 705-714, Springer, 2019. Preprint [MPI MIS 6/2021], [RG].

Wasserstein of Wasserstein loss for learning generative models.
Y. Dukler, W. Li, A. Lin, and G. Montufar. In Proceedings of the 36th International Conference on Machine Learning (ICML 2019), PMLR 97:1716-1725, 2019. [BibTeX]. Repo [GitHub]. Preprint [MPI MIS 13/2019].

Wasserstein Diffusion Tikhonov Regularization.
Alex Tong Lin, Yonatan Dukler, Wuchen Li, and Guido Montufar. Presented at Optimal Transport and Machine Learning Workshop NeurIPS, 2019. Preprint [arXiv:1909.06860], [RG].

A continuity result for optimal memoryless planning in POMDPs.
J. Rauh, N. Ay, G. Montufar. Presented at The 4th Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM 2019). [pdf]. Preprint [MPI MIS 5/2021], [RG].

Task-Agnostic Constraining in Average Reward POMDPs.
G. Montufar, J. Rauh, N. Ay. Presented at Task-Agnostic Reinforcement Learning Workshop ICLR, 2019. [pdf]. Preprint [MPI MIS 9/2021], [RG].

How Well Do WGANs Estimate the Wasserstein Metric?
Anton Mallasto, Guido Montufar, Augusto Gerolin. Preprint [arXiv:1910.03875].

2018

Natural Gradient via Optimal Transport.
W. Li and G. Montufar. Information Geometry 1, issue 2, pp 181-214, 2018. Preprint [arXiv:1803.07033].

Restricted Boltzmann Machines: Introduction and Review.
G. Montufar. Information geometry and its applications (IGAIA IV), pp 75-115, Springer, 2018. Preprint [arXiv:1806.07066].

Computing the Unique Information.
P. K. Banerjee, J. Rauh, and G. Montufar. IEEE International Symposium on Information Theory (ISIT), pages 141-145, 2018. Repo [GitHub]. Preprint [arXiv:1709.07487].

Mixtures and Products in two Graphical Models.
A. Seigal and G. Montufar. Journal of Algebraic Statistics vol 9 no 1, 2018. Preprint [arXiv:1709.05276].

Uncertainty and Stochasticity of Optimal Policies.
G. Montufar, J. Rauh, N. Ay. In Proceedings of the 11th Workshop on Uncertainty Processing (WUPES 2018). Preprint [MPI MIS 8/2021], [RG].

2017

Geometry of Policy Improvement.
G. Montufar and J. Rauh. In Geometric Science of Information (GSI 2017), LNCS, vol 10589, pp 282-290, Springer, 2017. [BibTeX]. Preprint [arXiv:1704.01785].

Morphological Computation: The good, the bad, and the ugly.
K. Ghazi-Zahedi, R. Deimel, G. Montufar, V. Wall, and O. Brock. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2017). [BibTeX]. Preprint [RG].

Dimension of Marginals of Kronecker Product Models.
G. Montufar and J. Morton. SIAM Journal on Applied Algebra and Geometry 1(1), 2017. [BibTeX]. Supplement [JacobianKronecker.m]. Preprint [arXiv:1511.03570], [MPI MIS 75/2015].

Hierarchical Models as Marginals of Hierarchical Models.
G. Montufar and J. Rauh. International Journal of Approximate Reasoning (IJAR) 88:531-546, 2017. [BibTeX]. Workshop version in Proceedings of the 10th Workshop on Uncertainty Processing (WUPES 2015). Supplement [starcover.m]. Preprint [arXiv:1508.03606], [MPI MIS 27/2016].

Stochasticity of optimal policies for POMDPs.
G. Montufar, K. Ghazi-Zahedi, N. Ay. Presented at Reinforcement Learning and Decision Making (RLDM 2017).

Notes on the number of linear regions of deep neural networks.
G. Montufar. Presented at Mathematics of Deep Learning, Sampling Theory and Applications (SampTA 2017). Preprint [RG].

2016

Mode Poset Probability Polytopes.
G. Montufar and J. Rauh. Journal of Algebraic Statistics 7(1):1-13, 2016. [BibTeX]. Workshop version in Proceedings of the 10th Workshop on Uncertainty Processing (WUPES 2015). Preprint [arXiv:1503.00572], [MPI MIS 22/2015].

Evaluating Morphological Computation in Muscle and DC-motor Driven Models of Hopping Movements.
K. Ghazi-Zahedi, D. Haeufle, G. Montufar, S. Schmitt, and N. Ay. Frontiers in Robotics and AI 3(42):frobt.2016.00042, 2016. [BibTeX]. Preprint [arXiv:1512.00250].

Information Theoretically Aided Reinforcement Learning for Embodied Agents.
G. Montufar, K. Ghazi-Zahedi, and N. Ay. Preprint [arXiv:1605.09735], [RG].

Geometry and Determinism of Optimal Stationary Control in Partially Observable Markov Decision Processes.
G. Montufar, K. Ghazi-Zahedi, and N. Ay. Preprint [arXiv:1503.07206], [MPI MIS 22/2016].

2015

A Theory of Cheap Control in Embodied Systems.
G. Montufar, K. Zahedi, and N. Ay. PLoS Comput Biol 11(9):e1004427, 2015. [BibTeX]. Preprint [arXiv:1407.6836], [MPI MIS 70/2014].

Geometry and Expressive Power of Conditional Restricted Boltzmann Machines.
G. Montufar, N. Ay, and K. Zahedi. Journal of Machine Learning Research JMLR 16(Dec):2405-2436, 2015. [BibTeX]. Preprint [arXiv:1402.3346], [MPI MIS 16/2014].

Discrete Restricted Boltzmann Machines.
G. Montufar and J. Morton. Journal of Machine Learning Research JMLR 16(Apr):653-672, 2015. [BibTeX]. Conference version in International Conference on Learning Representations (ICLR 2013). Preprint [arXiv:1301.3529], [MPI MIS 106/2014].

When Does a Mixture of Products Contain a Product of Mixtures?
G. Montufar and J. Morton. SIAM Journal on Discrete Mathematics (SIDMA) 29(1):321-347, 2015. [BibTeX]. Preprint [arXiv:1206.0387], [MPI MIS 98/2014].

Deep Narrow Boltzmann Machines are Universal Approximators.
G. Montufar. In Third International Conference on Learning Representations (ICLR 2015). [BibTeX]. Preprint [arXiv:1411.3784], [MPI MIS 113/2014].

A Comparison of Neural Network Architectures.
G. Montufar. Presented at Deep Learning Workshop ICML, 2015. [pdf], [pdf].

Universal Approximation of Markov Kernels by Shallow Stochastic Feedforward Networks.
G. Montufar. Preprint [arXiv:1503.07211], [MPI MIS 23/2015].

2014

On the Number of Linear Regions of Deep Neural Networks.
G. Montufar, R. Pascanu, K. Cho, and Y. Bengio. Neural Information Processing Systems 27 (NIPS 2014). [BibTeX]. Preprint [MPI MIS 73/2014], [arXiv:1402.1869].

On the Number of Response Regions of Deep Feedforward Networks with Piecewise Linear Activations.
R. Pascanu, G. Montufar, and Y. Bengio. In Second International Conference on Learning Representations (ICLR 2014). [BibTeX]. Preprint [MPI MIS 72/2014], [arXiv:1312.6098].

On the Fisher Information Metric of Conditional Probability Polytopes.
G. Montufar, J. Rauh, and N. Ay. Entropy 16(6):3207-3233, 2014. [BibTeX]. Preprint [MPI MIS 87/2014], [arXiv:1404.0198].

Scaling of Model Approximation Errors and Expected Entropy Distances.
G. Montufar and J. Rauh. Kybernetika 50(2):234-245, 2014. [BibTeX]. Workshop version WUPES 2012, pp. 137-148. Preprint [arXiv:1207.3399].

Universal Approximation Depth and Errors of Narrow Belief Networks with Discrete Units.
G. Montufar. Neural Computation 26(7):1386-1407, 2014. [BibTeX]. Preprint [MPI MIS 74/2014], [arXiv:1303.7461].

Sequential Recurrence-Based Multidimensional Universal Source Coding of Lempel-Ziv Type.
T. Krueger, G. Montufar, R. Seiler, and R. Siegmund-Schultze. Preprint [MPI MIS 86/2014], [arXiv:1408.4433].

2013

Universally Typical Sets for Ergodic Sources of Multidimensional Data.
T. Krüger, G. Montufar, R. Seiler, and R. Siegmund-Schultze. Kybernetika 49(6):868-882, 2013. [BibTeX]. Preprint [MPI MIS 20/2011], [arXiv:1105.0393].

Mixture Decompositions of Exponential Families Using a Decomposition of their Sample Spaces.
G. Montufar. Kybernetika 49(1):23-39, 2013. [BibTeX]. Preprint [MPI MIS 39/2010], [arXiv:1008.0204].

Maximal Information Divergence from Statistical Models defined by Neural Networks.
G. Montufar, J. Rauh, and N. Ay. In Geometric Science of Information (GSI), LNCS, vol 8085, pp 759-766, Springer, 2013. [BibTeX]. Preprint [MPI MIS 31/2013], [arXiv:1303.0268].

Selection Criteria for Neuromanifolds of Stochastic Dynamics.
N. Ay, G. Montufar, J. Rauh. In Advances in Cognitive Neurodynamics (III), pp 147-154, 2013. [BibTeX]. Preprint [MPI MIS 15/2011].

2012

Kernels and Submodels of Deep Belief Networks.
G. Montufar and J. Morton. Deep Learning and Unsupervised Feature Learning Workshop NIPS, 2012. Preprint [arXiv:1211.0932].

2011

Expressive Power and Approximation Errors of Restricted Boltzmann Machines.
G. Montufar, J. Rauh, and N. Ay. Neural Information Processing Systems 24 (NIPS 2011). [BibTeX]. Preprint [MPI MIS 27/2011], [arXiv:1406.3140].

Refinements of Universal Approximation Results for Restricted Boltzmann Machines and Deep Belief Networks.
G. Montufar and N. Ay. Neural Computation 23(5):1306-1319, 2011. [BibTeX]. Preprint [MPI MIS 23/2010], [arXiv:1005.1593].

2010

Mixture Models and Representational Power of RBMs, DBNs and DBMs.
G. Montufar. Deep Learning and Unsupervised Feature Learning Workshop NIPS, 2010. [pdf], [pdf].

Theses

On the Expressive Power of Discrete Mixture Models, Restricted Boltzmann Machines, and Deep Belief Networks—A Unified Mathematical Treatment.
PhD Thesis, Leipzig University, October 2012. Supervisor: N. Ay. [pdf] (14.4 MB, 155 pages, 30 figures)

Theory of Transport and Photon-Statistics in a Biased Nanostructure.
German Diplom in Physics, Institute for Theoretical Physics, TU-Berlin, December 2008. Supervisors: A. Knorr and T. Brandes.

Q-Sanov Theorem for d ≥ 2.
German Diplom in Mathematics, Institute for Mathematics, TU-Berlin, August 2007. Supervisors: R. Seiler and J.-D. Deuschel.

Math Machine Learning Group

Postdocs

Moritz Grillo, Postdoc at MPI MiS

Yulia Alexandr, Hedrick Assistant Adjunct Professor, UCLA

PhD students

Cash Bowman, Data Theory Major at UCLA

David Troxell, PhD candidate, UCLA Statistics & Data Science

Yiran Wang, PhD candidate, UCLA Electrical & Computer Engineering

Khang Nguyen, PhD candidate, UCLA Mathematics

Shuang Liang, PhD candidate, UCLA Statistics & Data Science

Ryan Anderson, PhD candidate, UCLA Statistics & Data Science

Hao Duan, PhD candidate, UCLA Statistics & Data Science

Past Postdocs

Angelica Torres, Postdoc, MPI MIS, next position: Assistant Professor at Universidad Tecnica Federico Santa Maria, Chile

Rishi Sonthalia, 2021-24 Hedrick Assistant Adjunct Professor, UCLA, co-mentored with A. Bertozzi and J. Foster, next position: Assistant Professor Mathematics Department Boston College

Michael Murray, 2021-24 Hedrick Assistant Adjunct Professor, UCLA, co-mentored with D. Needell, next position: Lecturer University of Bath

Marie Brandenburg, 2023-24 Postdoc MPI MIS, next position: Postdoc at KTH

Pradeep Kr. Banerjee, 2019-23 Postdoc MPI MIS, next position: Postdoc at TUHH

Katerina Papagiannouli, 2021-22 Postdoc at MPI MIS, next position: Postdoc at Learning and Inference Group MPI MIS

Jing An, 2021-22 Postdoc at MPI MIS co-mentored with F. Otto, next position: Phillip Griffiths Assistant Research Professor at Duke

Yu Guang Wang, 2020-21 Postdoc at MPI MIS, next position: Associate Professor at Shanghai Jiao Tong University

Nina Otter, 2018-21 CAM Adjunct Assistant Professor at UCLA co-mentored with M. Porter, next position: Lecturer (Assistant Professor) of Data Science at Queen Mary University of London

Quynh Nguyen, 2020-21 Postdoc at MPI MIS

Past PhD students

Kedar Karhadkar, PhD 2025 at UCLA Math, Thesis: Optimization and Generalization of Neural Networks in Moderate-Dimensional Regimes, next position: G-Research

James Chapman, PhD 2025 at UCLA Math, co-advised with A. Bertozzi, Thesis: Inductive Biases in Multi-Stage Machine Learning Problems and Applications, next position: Google

Jiayi Li, PhD 2025 at UCLA Statistics & Data Science, Thesis: Structure, Symmetry, and Singularity in Learning Systems, next position: Postdoc at MPI CBG

Pierre Bréchet, Dr.rer.nat. Informatik 2025 at MPI MIS / Leipzig University, Thesis: On the Optimization Properties of Optimal-Transport Based Generative Adversarial Linear Networks, next position: Learning and Inference Group MPI MIS

Johannes Müller, Dr.rer.nat. Mathematics 2024 at IMPRS MPI MIS Leipzig University, Thesis: Geometry of Optimization in Markov Decision Processes and Neural Network Based PDE Solvers, co-advised with Nihat Ay, next position: Postdoc at RWTH Aachen

Hanna Tseran, Dr.rer.nat. Informatik 2023 at IMPRS MPI MIS Leipzig University, Thesis: Expected Complexity and Gradients of Deep Maxout Networks and Implications to Parameter Initialization, next position: Project Researcher at University of Tokyo

Benjamin Bowman, PhD 2023 at UCLA Math, Thesis: On the Spectral Bias of Neural Networks in the Neural Tangent Kernel Regime, next position: Applied Scientist at AWS

Hui Jin, PhD 2022 at UCLA Math, Thesis: Generalization of Wide Neural Networks from the Perspective of Linearization and Kernel Learning, next position: Research Engineer at Huawei Co., Ltd.

Yonatan Dukler, PhD 2021 at UCLA Math, Thesis: The geometry and manipulation of natural data for optimizing neural networks / A theory for undercompressive shocks in tears of wine, co-advised with A. Bertozzi, next position: Applied Scientist at AWS

Visitors

Tom Jacobs (CISPA), Winter 2026 visitor at MPI

Renata Turkes (Antwerp), Winter 2026 visitor at MPI

Manjot Singh (LMU), Fall 2025 visitor at UCLA

Carson Newman, Summer 2025 Competitive Edge Fellow at UCLA

Erik Bolager (TUM), Spring 2025 visitor at UCLA

Felix Kornad (Bonn), Feb 2024 visitor at MPI

Dora Klein (Bonn), Summer 2023 Research Intern at MPI

Renata Turkeš (Antwerp), Sep 2021-May 2022 Fulbright visitor at UCLA, co-mentored with Nina Otter

Friedrich Wicke (Berlin / Harvard), May-Sep 2022 Research Intern at MPI MIS

Teaching

2027

Stats 231B - Methods of Machine Learning, Spring 2027

2026

Math 277 - Mathematical Foundations of Machine Learning and Artificial Intelligence, Fall 2026

Stats 100A - Introduction to Probability, Fall 2026

Math 170E - Introduction to Probability and Statistics, Spring 2026

2025

Math 285J - Applied Mathematics Seminar on Verification of AI Systems, Fall 2025

Stats 200A - Applied Probability, Fall 2025

Math 156 - Introduction to Machine Learning, Spring 2025

Stats 100A - Introduction to Probability, Spring 2025

2024

Stats 200A - Applied Probability, Fall 2024

Math 273A - Optimization, Fall 2024

Stats 290 - Current Literature in Statistics, Spring 2024

2023

Stats 200A - Applied Probability, Fall 2023

Math 273A - Optimization, Fall 2023

Stats 231B - Methods of Machine Learning, Spring 2023

2022

Stats 100A - Introduction to Probability, Fall 2022

Math 273A - Optimization, UCLA, Fall 2022

Math 164 - Optimization, UCLA, Spring 2022

Stats 290 - Current Literature in Statistics, UCLA, Spring 2022

2021

Math 273A - Optimization, UCLA, Fall 2021

Stats 100A - Introduction to Probability, UCLA, Fall 2021

Stats 231C - Theories of Machine Learning, UCLA, Spring 2021

2020

Stats 200A - Applied Probability, UCLA, Fall 2020

Math 273A - Optimization and Calculus of Variations, UCLA, Fall 2020

Stats 231C - Theories of Machine Learning, UCLA, Spring 2020

2019

Math 285J - Applied Mathematics Seminar - Deep Learning Topics, UCLA, Fall 2019

Stats 200A - Applied Probability, UCLA, Fall 2019

Stats 231C - Theories of Machine Learning, UCLA, Spring 2019

IMPRS Ringvorlesung - short course, Topics from Deep Learning, MPI MIS, Winter 2019

2018

Math 273 - Optimization, Calculus of Variations, and Control Theory, UCLA, Fall 2018

Math 164 - Optimization, UCLA, Fall 2018

Stat 270 - Mathematical Machine Learning, UCLA, Spring 2018

Math 285J - Applied Mathematics Seminar - Deep Learning Topics, UCLA, Winter 2018

2017

Introduction to the Theory of Neural Networks, MS/PhD Lecture, Leipzig University and MPI MIS, Summer Term 2016

Geometric Aspects of Graphical Models and Neural Networks, with N. Ay, [Abstract], MS/PhD Lecture, Leipzig University and MPI MIS, Winter Term 2014/2015

Talks

2026

Wallenberg Advanced Scientific Forum, Ranas Castle, Sep 2026

Mathematical Foundations of Machine Learning (MFML), Vason, Monte Bondone, May 2026

2025

MPI MiS Workshop on Geometry, Topology, and Machine Learning, MPI MiS, Nov 2025

Munich AI Lecture, LMU, Jul 2025

AMS Special Session Geometry and Machine Learning, JMM, Jan 2025

AMS Special Session Applications of Algebraic Geometry, JMM, Jan 2025

AMS Special Session Geometric and Combinatorial Methods in Deep Learning Theory, JMM, Jan 2025

AMS Special Session Algebraic Statistics in Our Changing World, JMM, Jan 2025

2024

Math of Machine Learning, Incontro INdAM, Cortona, September 2024.

Mathematical foundations of AI, ScaDS.AI Summer School, Leipzig, June 2024.

Math and Stats Departments, University of Wisconsin Madison, April 2024.

Deep Learning: Theory, Applications, and Implications (DL 2024), RIKEN AIP, Tokyo, March 2024.

Kernel Methods in Data Analysis and Computational Science, 94th GAMM, Magdeburg, March 2024.

SPP 2298 Theoretical Foundations of Deep Learning, 94th GAMM, Magdeburg, March 2024.

Applied CATS Seminar, KTH, Stockholm, March 2024.

Winter Conference in Statistics with the theme Mathematical Foundations for AI, Data Science, and Ski, Department of Mathematics and Mathematical Statistics, Umea University, March 2024.

Minisymposium Mathematics of Deep Learning at Symposium on Sparsity and Singular Structures, RWTH Aachen, February 2024.

Advanced Studies Institute in Data Science, Machine Learning and Artificial Intelligence, Urgench State University (Uzbekistan), January 2024.

2023

Bayesian Statistics and Statistical Learning, IMSI Chicago, December 2023.

Workshop on Geometry and Machine Learning, MPI Leipzig, November 2023.

Machine Learning Seminar, Caltech, October 2023.

Mathematics of Data Science Section at DMV Meeting, Ilmenau, September 2023.

Conference on Applied Algebra, Osnabrueck, September 2023.

Machine Learning Seminar, Vanderbilt, August 2023. (online)

Mathematics of Geometric Deep Learning, Minisymposium at the 10th International Congress on Industrial and Applied Mathematics, Tokyo, August 2023.

Minisymposium on Data Geometry and Optimization, SampTA, Yale University, July 2023. (online)

Minisymposium Parameterizations and Nonconvex Optimization Landscapes, SIAM Conference on Optimization (OP23), Seattle, May 2023.

Computations and Data in Algebraic Statistics, Impromptu Session, BIRS CMO, Oaxaca, May 2023.

Joint TILOS and OPTML++ Seminar, MIT, April 2023. (online)

Applied Mathematics and Computation Seminar, Department of Mathematics and Statistics, University of Massachusetts Amherst, March 2023.

Online Machine Learning Seminar, School of Mathematical Sciences, University of Nottingham, February 2023. (online)

Caltech CMX Seminar, Caltech, January 2023.

2022

1W-MINDS Seminar, December 2022.

Learning, Information, Optimization, Networks, and Statistics (LIONS) seminar, Arizona State University, November 2022. (online)

Invited talk at Information Geometry for Data Science, Hamburg University of Technology, September 2022. (online)

Invited talk at Algebraic Statistics Session at COMPSTAT 2022, Bologna, Italy, August 2022.

Oxford Data Science Seminar, Mathematical Institute, University of Oxford, May 2022. (online)

Keynote at Algebraic Statistics, University of Hawai'i at Manoa, Honolulu, HI, May 2022.

Invited talk at Statistics Colloquium, Statistics Department, University of Chicago, April 2022. (online)

Invited talk at Special session on Latinxs in Combinatorics at the Joint Mathematics Meetings (JMM), hosted by the American Mathematical Society, April 2022.

Invited talk at mini-symposium Geometric Methods for Understanding and Applying Machine Learning, SIAM Conference on Imaging Science (IS22), March 2022. (online)

2021

Wilhelm Killing Colloquium, Mathematisches Institut, Universitaet Muenster, Muenster, Germany, December 2021. (online)

Mathematics of Information Processing Seminar, RWTH Aachen University, Aachen, Germany, November 2021. (online)

Mathematics in Imaging, Data and Optimization, Department of Mathematics, Rensselaer Polytechnic Institute, Troy, New York, November 2021. (online)

Institute of Natural Sciences, Shanghai Jiao Tong University, Shanghai, China, October 2021. (online)

Kickoff Workshop Numerical and Probabilistic Nonlinear Algebra, MPI MIS, September 2021.

Mathematics of Deep Learning, Isaac Newton Institute, Cambridge, August 2021. (online)

Mathematical Foundation and Applications of Deep Learning Workshop, Purdue, August 2021. (online)

Numerical Algebra and Optimization seminar, MPI MIS, August 2021.

Mini symposium Algebraic Geometry of Data, SIAM Conference on Applied Algebraic Geometry (AG21), August 2021. (online)

Workshop - Sayan Mukherjee, MPI MIS, July 2021.

Statistics Department Seminar, Department of Statistics, UCLA, June 2021. (online)

Discrete Mathematics/Geometry Seminar, TU Berlin, May 2021. (online)

Mathematical Data Science Seminar, Department of Mathematics, Purdue University, March 2021. (online)

DeepMind/ELLIS CSML Seminar Series, Centre for Computational Statistics and Machine Learning, University College London (UCL), January 2021. (online)

Invited talk at the TRIPODS Winter School and Workshop on the Foundations of Graph and Deep Learning, Mathematical Institute for Data Science (MINDS) at Johns Hopkins University, Baltimore MD, January 2021. (online)

Biostatistics Winter Webinar Series, Department of Biostatistics, UCLA, January 2021. (online)

Applied and Computational Mathematics Seminar, UC Irvine, January 2021. (online)

2020

Mathematics of Data and Decision in Davis (MADDD), UC Davis, December 2020. (online)

Keynote at Workshop Deep Learning through Information Geometry at NeurIPS, December 2020. (online)

Invited talk at GAMM Workshop Computational and Mathematical Methods in Data Science, (Gesellschaft fuer Angewandte Mathematik und Mechanik e.V.), Max Planck Institute MIS, September 2020. (online)

Plenary talk at the Workshop Optimal Transport, Topological Data Analysis and Applications to Shape and Machine Learning, Mathematical Biosciences Institute, Ohio State University, USA, July 2020. (online)

Keynote at Algebraic Statistics in Hawaii, USA, June 2020. (online)

Keynote at Differential Geometry and Machine Learning at CVPR 2020, June 2020.

Deep Learning Seminar, University of Vienna, Austria, February 2020.

Mathematics Seminar, KAUST, Thuwal, Saudi Arabia, January 2020.

Mathematisches Kolloquium, Bielefeld University, Germany, January 2020.

2019

Wasserstein Regularization for Generative and Discriminative Learning, UseDat Conf, Infospace, Moscow, September 2019.

Invited talk Factorized mutual information maximization at Prague Stochastics, Institute of Information Theory and Automation, Academy of Sciences of the Czech Republic, Prague, August 2019.

Wasserstein Information Geometry for Learning from Data, Optimal Transport for Nonlinear Problems, ICIAM, Valencia, Spain, July 2019.

Markov Kernels with Deep Graphical Models, Latent Graphical Models, SIAM AG, Bern, July 2019.

Keynote Computing the Unique Information at 1st Workshop on Semantic Information, CVPR 2019, Long Beach, June 2019.

Tutorial Wasserstein Information Geometry for Learning from Data at Geometry and Learning from Data in 3D and Beyond, IPAM, LA, March 2019. [talk at GLTUT IPAM].

2018

RBM Intro and Review at Boltzmann Machines, AIM, October 2018.

Plenary talk Representation, Approximation, Optimization advances for Restricted Boltzmann Machines at 7th International Conference on Computational Harmonic Analysis, Vanderbilt University, May 2018.

SIAM Annual Meeting 2018: Statistics Mini-Symposium, Portland, Oregon, July 2018.

Tutorial at Transylvanian Machine Learning Summer School, Cluj-Napoca, Romania, 16-22 July 2018.

Mixtures and products in two graphical models, USC Probability and Statistics Seminar, USC, April 2018.

A theory of cheap control in embodied systems, Random Structures in Neuroscience and Biology, Herrsching, Germany, March 2018.

Mode poset probability polytopes, Combinatorics Seminar (Igor Pak), UCLA, February 2018.

Uncertainty and Stochasticity in POMDPs, Machine Learning Seminar (Wilfrid Gangbo), UCLA, February 2018.

2017

Mixtures and products in two graphical models, Level Set Collective (Stanley Osher), IPAM, November 2017.

Graphical models with hidden variables, Systematic Approaches to Deep Learning Methods for Audio, Vienna, Austria, September 2017.

On the Fisher metric of conditional probability polytopes, Geometry of Information for Neural Networks, Machine Learning, Artificial Intelligence, Topological and Geometrical Structures of Information, CIRM Marseille, France, August 2017.

Notes on the number of linear regions of deep neural networks, Mathematics of Deep Learning, Special Session at International Conference on Sampling Theory, Tallinn, Estonia, July 2017.

Neural networks for cheap control of embodied behavior, Peking University (Jinchao Xu), Beijing, China, July 2017.

Selected Topics in Deep Learning, Short Course, Beijing Institute for Scientific Computing, Beijing, China, July 2017.

Learning with neural networks, Tutorial, Training Networks, Summer School, Signal Processing with Adaptive Sparse Structured Representations, Lisbon, Portugal, June 2017.

2016

Dimension of Marginals of Kronecker Product Models, Seminar on Non-Linear Algebra, TU-Berlin, Germany, November 2016.

Geometric and Combinatorial Perspectives on Deep Neural Networks, Theory of Deep Learning Workshop, ICML 2016, New York, USA, June 2016.

Plenary talk Geometry of Boltzmann Machines, [Slides], [Abstract], IGAIA IV, Liblice, Czech Republic, June 2016.

Geometric Approaches to the Design of Embodied Learning Systems, Special Symposium on Intelligent Systems, MPI for Intelligent Systems, Tuebingen, Germany, March 2016.

Artificial Intelligence Overview, LikBez Seminar, MPI MIS, January 2016.

2015

Poster: A comparison of neural network architectures, Deep Learning Workshop, ICML 2015.

Poster: Mode Poset Probability Polytopes, [pdf], Algebraic Statistics 2015, Department of Mathematics University of Genoa, Italy, June 8-11, 2015.

Poster: Deep Narrow Boltzmann Machines are Universal Approximators, [pdf], ICLR 2015.

A Theory of Cheap Control in Embodied Systems, Montreal Institute for Learning Algorithms (MILA), University of Montreal, Canada, December 2015.

Dimension of restricted Boltzmann machines, Department of Mathematics & Statistics, York University, Toronto, Canada, December 2015.

Sequential Recurrence-Based Multidimensional Universal Source Coding, Dynamical Systems Seminar, MPI MIS, November 2015.

Cheap Control of Embodied Systems, Aalto Science Institute, Espoo, Finland, November 2015.

Mode Poset Probability Polytopes, WUPES'15, Moninec, Czech Republic, September 18, 2015.

Hierarchical models as marginals of hierarchical models, WUPES'15, Moninec, Czech Republic, September 17, 2015.

Confining bipartite graphical models by simple classes of inequalities, Special Topics Session Algebraic and Geometric Approaches to Graphical Models, 60th World Statistics Congress - ISI 2015, Rio de Janeiro, Brazil, July 31, 2015.

2014

Poster: On the Number of Linear Regions of Deep Neural Networks, [pdf], NIPS 2014.

Poster: A Framework for Cheap Universal Approximation in Embodied Systems, Autonomous Learning: 3. Symposium DFG Priority Programme 1527, Berlin, September 8-9, 2014.

Poster: Geometry of hidden-visible products of statistical models, [pdf], Algebraic Statistics at IIT, Chicago, IL, 2014.

On the Number of Linear Regions of Deep Neural Networks, Montreal Institute for Learning Algorithms (MILA), Université de Montréal, Montreal, Canada, December 15, 2014.

Information Divergence from Statistical Models Defined by Neural Networks, Workshop: Information Geometry for Machine Learning, RIKEN BSI, Japan, December 2014.

Geometry of Deep Neural Networks and Cheap Design for Autonomous Learning, Google DeepMind, London, UK, October 2014.

Geometry of Hidden-Visible Products of Statistical Models, Joint Workshop on Limit Theorems and Algebraic Statistics, UTIA, Prague, August 25-29, 2014.

2013

How size and architecture determine the learning capacity of neural networks, SFI Seminar, Santa Fe, NM, USA, October 23, 2013.

Maximal Information Divergence from Statistical Models defined by Neural Networks, GSI 2013, Mines ParisTech, Paris, France, August 29, 2013.

Naive Bayes models, Seminario de Postgrado en Ingenieria de Sistemas, Universidad del Valle, Santiago de Cali, Colombia, May 30, 2013.

Discrete Restricted Boltzmann Machines, ICLR2013, Scottsdale, AZ, USA, May 2, 2013.

2012

Poster: When Does a Mixture of Products Contain a Product of Mixtures, [Abstract], NIPS 2012 - Deep Learning and Unsupervised Feature Learning Workshop.

Poster: Kernels and Submodels of Deep Belief Networks, [Abstract], NIPS 2012 - Deep Learning and Unsupervised Feature Learning Workshop.

When Does a Mixture of Products Contain a Product of Mixtures?, Tensor network states and algebraic geometry, ISI Foundation, Torino, Italy, November 06-08, 2012.

Universally typical sets for ergodic sources of multidimensional data, Seminar on probability and its applications (Manfred Denker), Penn State, PA, USA, October 05, 2012.

On the Expressive Power of Discrete Mixture Models, Restricted Boltzmann Machines, and Deep Belief Networks—A Unified Mathematical Treatment, PhD thesis defense, Leipzig University, October 17, 2012.

Scaling of model approximation errors and expected entropy distances, Stochastic Modelling and Computational Statistics Seminar (Murali Haran), Penn State, PA, USA, October 11, 2012.

Scaling of Model Approximation Errors and Expected Entropy Distances, WUPES'12, Mariánské Lázně, Czech Republic, September 13, 2012.

Multivalued Restricted Boltzmann Machines, [Abstract], MPI MIS, Leipzig, Germany, September 19, 2012.

Simplex packings of marginal polytopes and mixtures of exponential families, SIAM Conference on Discrete Mathematics (DM 2012), Dalhousie University, Halifax, Nova Scotia, Canada, June 18-21, 2012.

On Secants of Exponential Families, Algebraic Statistics in the Alleghenies, Penn State, PA, USA, June 08-15, 2012.

Approximation Errors of Deep Belief Networks, Applied Algebraic Statistics Seminar, Penn State, PA, USA, February 08, 2012.

2011

Submodels of Deep Belief Networks, [Abstract], Berkeley Algebraic Statistics Seminar, UC Berkeley, CA, USA, December 07, 2011.

Geometry and Approximation Errors of Restricted Boltzmann Machines, The 5th Statistical Machine Learning Seminar, Institute of Statistical Mathematics, Tachikawa, Tokyo, Japan, September 02, 2011.

Geometry of Restricted Boltzmann Machines Towards Geometry of Deep Belief Networks, RIKEN Workshop on Information Geometry, RIKEN BSI, Japan, August 31, 2011.

Selection Criteria for Neuromanifolds of Stochastic Dynamics, The 3rd International Conference on Cognitive Neurodynamics, Niseko Village, Hokkaido, Japan, June 12, 2011.

On Exponential Families and the Expressive Power of Related Generative Models, [Abstract], Laboratoire d'Informatique des Systèmes Adaptatifs (LISA), Université de Montréal, Canada, March 14, 2011.

Mixtures from Exponential Families, Neuronale Netze und Kognitive Systeme Seminar, MPI MIS, Leipzig, Germany, March 02, 2011.

Universal approximation results for Restricted Boltzmann Machines and Deep Belief Networks, Neuronale Netze und Kognitive Systeme Seminar, MPI MIS, Leipzig, Germany, February 16, 2011.

Necessary conditions for RBM universal approximators, Meeting of the Department of Decision-Making Theory - Institute of Information Theory and Automation UTIA, Marianska, Czech Republic, January 18, 2011.

2010

Poster: Mixture Models and Representational Power of RBMs, DBNs and DBMs, NIPS 2010 - Deep Learning and Unsupervised Feature Learning Workshop, Whistler, Canada.

Poster: Faces of the probability simplex contained in the closure of an exponential family and minimal mixture representations, Information Geometry and its Applications III, Leipzig, Germany, 2010.

Information Geometry of Mean-Field Methods, Fall School on Statistical Mechanics and 5th annual PhD Student Conference in Probability, MPI MIS, Leipzig, Germany, September 07-12, 2009.

Quantum-Sanov-Theorem for correlated States in multidimensional Grids, Dies Mathematicus, TU-Berlin, Germany, February 2008.

Quanten-Sanov-Theorem im mehrdimensionalen Fall, Workshop on Complexity and Information Theory, MPI MIS, Leipzig, Germany, October 2007.

Activities

2027

IPAM Workshop: Algebraic Geometry: A Window to Machine Learning, organized with Y Alexandr, M Murray, R Sonthalia, Institute for Pure & Applied Mathematics, Feb 2027.

2026

ICML Workshop: Combining Theory and Benchmarks: Towards A Virtuous Cycle to Understand and Guarantee Foundation Model Performance, organized with B Hu, N Bottman, Y Yang, Y Yan, J Lee, ICML, Seoul, South Korea, Jul 2026.

MFO Workshop: Modern and Emerging Phenomena in Machine Learning, organized with M Murray, D Needell, R Sonthalia, Mathematisches Forschungsinstitut Oberwolfach, Mar 2026.

2025

Conference on Mathematics of Machine Learning 2025, organized with Nihat Ay, Benjamin Gess, Martin Burger, at TUHH, Sep 22-25, 2025.

Mini Symposium on New Frontiers of Geometry and Combinatorics in Machine Learning, organized with P Gallardo, L Escobar, J Gonzalez, A Morales, SIAM AG, University of Wisconsin-Madison, Jul 2025.

Special Scientific Session in Machine Learning at LatMath 2025, UCLA, Mar 6-8, 2025.

AMS Special Session on Algebraic Methods in Machine Learning and Optimization, organized with J Li, Y Alexandr, J Lindberg, JMM, Jan 10, 2025.

2024

Mini Symposium Algebraic Geometry and Machine Learning, organized with Y Cooper, T Tang, J Rodriguez, at SIAM MDS, Oct 25, 2024.

SQuaRE Neural Network Polytopes, organized with L Escobar, P Gallardo, J Gonzalez-Anaya, J Gonzalez, A Morales, at AIM, August 2024.

2023

SQuaRE Neural Network Polytopes at AIM, August 2023.

GNN Mini Meeting at MPI MIS, June 2023.

2022

Workshop on Combinatorics, Algebraic Geometry, and Machine Learning at MPI MIS, August 2022.

Neural Networks and Polytopes LMRC mini-workshop at UCLA, May 2022.

We had the Reunion Conference of the IPAM Program Geometry and Learning from Data in 3D and Beyond at Lake Arrowhead, December 2021.

We are excited to be part of the Priority Programme Theoretical Foundations of Deep Learning (SPP 2298) with a project on Combinatorial and implicit approaches to deep learning.

Together with Pablo Suarez Serrato, Minh Ha Quang, Rongjie Lai we are organizing the BIRS-CMO Workshop Geometry and Learning from Data, online, October 2021.

Together with Benjamin Gess and Nihat Ay we are organizing the ZiF Conference on Mathematics of Machine Learning, Bielefeld, August 2021.

I am participating at the Mathematics of Deep Learning Program, Isaac Newton Institute for Mathematical Sciences, Cambridge, UK, Jul-Dec 2021.

Starting in June 2021 I will be serving as a research mentor at the year-long Latinx Mathematicians Research Community, AIM, 2021.

Optimal transport in the natural sciences, Mathematisches Forschungsinstitut Oberwolfach (MFO), February 2021.

Together with Wuchen Li we are organizing the Wasserstein Information Geometry special session at GSI 2019, Toulouse, France, August 2019.

I am participating at the National Workshop on Data Science Education, UC Berkeley, CA, USA, June 2019.

IST Workshop on Deep Learning Theory, IST, Vienna, Austria, September 2019

Together with Joan Bruna, Yu Guang Wang, Nina Otter, and Zheng Ma, we had the ICERM Collaborate Group Geometry of Data and Networks, Institute for Computational and Experimental Research in Mathematics, Providence, RI, USA, June 2019.

I am a co-organizer of the Geometry and Learning from Data in 3D and Beyond, IPAM Long Program, Institute for Pure and Applied Mathematics, Los Angeles, CA, USA, March - June 2019.

We had a fantastic Deep Learning Theory Kickoff Meeting at the Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany, March 2019.

DALI, January 2019.

Asja Fischer, Jason Morton, and I are organizing the AIM Workshop Boltzmann Machines, American Institute of Mathematics, San Jose, CA, USA, September 2018.

With Asja Fischer I am organizing the Theory of Deep Learning Workshop at DALI 2018, Lanzarote, Spain, April 2018.

Latinx in the Mathematical Sciences Conference, IPAM, March 2018.

Together with Christiane Goergen, Nihat Ay, and Andre Uschmajew, I am a co-initiator of the Math of Data Initiative, Max Planck Institute for Mathematics in the Sciences, Leipzig, Germany.

NIPS 2017, Long Beach, CA, December 2017.

Geometric Science of Information, Paris, November 2017.

ICML 2017, Principled Approaches to Deep Learning, Program Committee, Sydney, Australia, August 2017.

Oberwolfach Workshop Algebraic Statistics, Mathematisches Forschungsinstitut Oberwolfach, Germany, April 2017.

Santa Fe Institute, Visit for Research Collaboration (Nihat Ay), Santa Fe, NM, USA, October 2016.

NIPS 2015, Montréal, Canada.

Santa Fe Institute, Visit for Research Collaboration (Nihat Ay), Santa Fe, NM, USA, October 15-November 15, 2014.

Information Geometry in Learning and Optimization, University of Copenhagen, September 22-26, 2014.

Autonomous Learning: 3. Symposium DFG Priority Programme 1527, Magnus Haus Berlin, Germany, September 08-09, 2014.

Autonomous Learning: Summer School, MPI MIS, September 01-04, 2014.

Santa Fe Institute, Visit for Research Collaboration (Nihat Ay), October 1-27, 2013.

SFI Working Group "Information Theory of Sensorimotor Loops", Santa Fe Institute, Santa Fe, NM, USA, October 8-11, 2013.

Pennsylvania State University, Visit for Research Collaboration (Jason Morton), PA, USA, September 2013.

Algebraic Statistics in Europe, IST Austria, September 28-30, 2012.

Graduate Summer School: Deep Learning, Feature Learning, IPAM - UCLA, Los Angeles, CA, USA, July 9-27, 2012.

Singular Learning Theory, AIM Workshop, American Institute of Mathematics, Palo Alto, CA, USA, December 12-16, 2011.

RIKEN-BSI, Laboratory for Mathematical Neuroscience (Prof. S. Amari), Internship, Hirosawa, Wako, Saitama, Japan, August-October 2011.

SFI Complex Systems Summer School (CSSS11), Saint John's College, Santa Fe, NM, USA, June 8-July 1, 2011.

Information Geometry and its Applications (IGAIA III), Leipzig University, Germany, August 2010.

Last updated: 03/2022