About

The "AI Tea Talk Singapore" series offers a fully open platform designed to welcome experts from various sub-fields of artificial intelligence. It aims to facilitate the sharing of cutting-edge research with anyone interested in AI, both within Singapore and internationally.

The "AI Tea Talk Singapore" series is a community-based platform led by a group of junior AI researchers in Singapore and supported by senior scientists in the field.

Focus Topics

  • A wide range of AI fields, including method development and applications

Upcoming Talks

Previous Talks

Thursday 31st Oct, 10 AM Singapore/Beijing time; Wed 30th Oct, 10 PM New York time

Harshay Shah

MIT

Decomposing and Editing Predictions by Modeling Model Computation

TBD

How does the internal computation of a machine learning model transform inputs into predictions? In this paper, we introduce a task called component modeling that aims to address this question. The goal of component modeling is to decompose an ML model's prediction in terms of its components—simple functions (e.g., convolution filters, attention heads) that are the "building blocks" of model computation. We focus on a special case of this task, component attribution, where the goal is to estimate the counterfactual impact of individual components on a given prediction. We then present COAR, a scalable algorithm for estimating component attributions; we demonstrate its effectiveness across models, datasets, and modalities. Finally, we show that component attributions estimated with COAR directly enable model editing across five tasks, namely: fixing model errors, "forgetting" specific classes, boosting subpopulation robustness, localizing backdoor attacks, and improving robustness to typographic attacks. We provide code for COAR at https://github.com/MadryLab/modelcomponents.

Sep 24th 10:30 am SGT, Sep 23rd 10:30pm New York time

Lichao Sun

Lehigh University and the Mayo Clinic

BiomedGPT: A generalist vision–language foundation model for diverse biomedical tasks

Traditional biomedical artificial intelligence (AI) models, designed for specific tasks or modalities, often exhibit limited flexibility in real-world deployment and struggle to utilize holistic information. Generalist AI holds the potential to address these limitations due to its versatility in interpreting different data types and generating tailored outputs for diverse needs. However, existing biomedical generalist AI solutions are typically heavyweight and closed source to researchers, practitioners and patients. In this talk, we will discuss the development and performance of BiomedGPT, a novel open-source, lightweight vision-language foundation model designed as a generalist AI for biomedical applications. Unlike previous solutions, BiomedGPT is both computationally efficient and accessible, achieving state-of-the-art results in 16 out of 25 benchmarks across a variety of tasks. We will present human evaluation results that underscore its effectiveness in radiology visual question answering, report generation, and summarization, with performance metrics nearing human expert levels. This talk will explore how BiomedGPT exemplifies the potential of a multi-modal, generalist approach to revolutionize medical diagnostics and improve workflow efficiency.

August 22nd 2024, 8PM SGT, 8AM New York Time

Ru-Yuan Zhang

Associate Professor @ Shanghai Jiao Tong University

A Neural Network Approach for Human Visual Learning

Dr. Zhang currently leads the Cognitive Computational Neuroscience and Brain Imaging Group at the School of Psychology and Shanghai Mental Health Center at Shanghai Jiao Tong University. He has long worked at the intersection of brain science and brain-like intelligence. His research primarily focuses on the neural computational mechanisms of the human brain and artificial intelligence, combining psychophysics, Bayesian probabilistic modeling, deep learning modeling, neuromodulation, and functional magnetic resonance imaging. He has published several cognitive neuroscience papers in PNAS, eLife, J Neurosci, Neuroimage, PLoS Comput Biol, etc. Dr. Zhang's research on brain-like computation has also been published at top machine learning conferences (ICML and IJCAI). He is a reviewer for several brain science journals, such as eLife and Cerebral Cortex, and for machine learning conferences such as ICML, NeurIPS, IJCAI, ICLR, and CVPR. He is also an Area Chair for NeurIPS 2024.

The past decade has seen a surge in the use of sophisticated AI models to reverse-engineer the human mind and behavior. This NeuroAI approach has dramatically promoted interdisciplinary research between neuroscience and AI. This talk focuses on using the neuroAI approach to elucidate human learning mechanisms. The talk will consist of two parts. First, I will present our work on the relationships between the primate visual system and artificial visual systems (i.e., deep neural networks) during the learning of simple visual discrimination tasks. Our deep learning models of biological visual learning successfully reproduce a wide range of neural phenomena observed in the primate visual system during perceptual learning. The novel predictions generated by our models are further validated against multivariate neuroimaging data in humans and multi-electrode recording data in macaques. In the second part, I will discuss our recent work on neural and computational mechanisms of how the human brain mitigates catastrophic forgetting during continual multitask learning. Leveraging neural network modeling on human learning behavior, we show that the human brain directly distills learned knowledge via elastic weight consolidation rather than other methods such as memory replay. These studies have profound implications for interdisciplinary research at the intersection of neuroscience and artificial intelligence.

August 14th 2024, 10AM SGT/August 8th, 10PM New York Time

Alex Lamb

Discovering Agent-Centric Latent States in Theory and in Practice

Alex Lamb is a senior researcher in the AI Frontiers group at Microsoft. He completed his PhD under Yoshua Bengio and has worked on deep learning, generative models, reinforcement learning, and sequence models. He also worked on deep learning for classical Japanese document recognition as well as demand forecasting systems at Amazon.

July 4th 2024: 10AM SGT

Yilun Du

PhD in EECS @ MIT

Incoming assistant professor at Harvard University

Generalizing Outside the Training Distribution through Compositional Generation

Yilun Du is an incoming Assistant Professor at Harvard, starting in Fall 2025 at the Kempner Institute and Computer Science department. He is a final year PhD student in EECS at MIT, advised by Prof. Leslie Kaelbling, Prof. Tomas Lozano-Perez, and Prof. Joshua B. Tenenbaum. Yilun's research focuses on generative models, decision making, robot learning, and embodied agents. His work addresses the challenges of limited decision-making data and generalization to unseen situations using energy landscapes for composable generative models. Yilun aims to develop a decentralized generative architecture for decision-making and enhance models with reinforcement learning, with applications in fields like computational biology.

Generative AI has led to stunning successes in recent years but is fundamentally limited by the amount of data available. This is especially limiting in the embodied setting, where an agent must solve new tasks in new environments. In this talk, I'll introduce the idea of compositional generative modeling, which enables generalization beyond the training data by building complex generative models from smaller constituents. I'll first introduce the idea of energy-based models and illustrate how they enable compositional generative modeling. I'll then illustrate how such compositional models enable us to synthesize complex plans for unseen tasks at inference time. Finally, I'll show how such compositionality can be applied to multiple foundation models trained on various forms of Internet data, enabling us to construct decision-making systems that can plan hierarchically in a zero-shot manner.

YouTube: https://youtu.be/qJyy21-LPQY

June 26th 2024: 10AM SGT

Baifeng Shi

University of California, Berkeley

Scaling Up Visual Pre-Training: What's Next?

Baifeng Shi is a Ph.D. student advised by Prof. Trevor Darrell at UC Berkeley. He previously graduated from Peking University with a B.S. degree in computer science. Baifeng's research focuses on building generalist vision and robotic models.

Larger models, more data, and longer training are the three prongs of scaling up visual pre-training. In this talk, I will first share our recent work that challenges the necessity of larger models. We find that pre-trained and frozen smaller models run on larger image scales (e.g., 224->448->672) are generally better than larger models (e.g., Base->Large->Giant). This trend holds across a variety of vision tasks (including image classification, semantic segmentation, and depth estimation) as well as Multimodal LLM benchmarks and robotic tasks. We demonstrate that smaller models, when pre-trained on multiple image scales, have capacity similar to that of larger models and can perform on par or even better. Next, I will share some thoughts on the future of scaling visual pre-training: specifically, whether we should shift our focus from larger models to larger images, and how to use bottom-up and top-down attention to scale to extremely large images without hitting computational constraints.

April 17th 2024: 12PM SGT (9PM PT)

Prof. Natasha Jaques

University of Washington and Google DeepMind

Reinforcement Learning with Human Feedback

Natasha Jaques is an Assistant Professor of Computer Science and Engineering at the University of Washington, and a Senior Research Scientist at Google DeepMind. Her research focuses on Social Reinforcement Learning in multi-agent and human-AI interactions. During her PhD at MIT, she developed techniques for learning from human feedback signals to train language models, which were later built on by OpenAI’s series of work on Reinforcement Learning from Human Feedback (RLHF). In the multi-agent space, she has developed techniques for improving coordination through the optimization of social influence, and adversarial environment generation for improving the robustness of RL agents. Her work has received various awards, including Best Demo at NeurIPS, an honourable mention for Best Paper at ICML, and the Outstanding PhD Dissertation Award from the Association for the Advancement of Affective Computing. Her work has been featured in Science Magazine, MIT Technology Review, Quartz, IEEE Spectrum, Boston Magazine, and on CBC radio. Natasha earned her Master's degree from the University of British Columbia, and undergraduate degrees in Computer Science and Psychology from the University of Regina.

Fine-tuning language models with reinforcement learning from human feedback (RLHF) has become the dominant paradigm for aligning large language models to human values. This talk will give a tutorial on RLHF, diving into the details of how to actually perform RL-finetuning of language models. I will cover the history of innovations leading to the form of RLHF used in ChatGPT, including my own work on KL-regularized RL fine-tuning of language models and human-centric dialog training, as well as OpenAI’s early work on learning from human preferences with deep RL. Putting it all together, we will see how what has become known as RLHF integrates these techniques. We will then briefly cover recent developments and directions for future work.

Invited Speakers

    Coming Soon

Organizers

  • Kai Wang

    Ph.D. Student

    Data Science/Computing @ NUS

  • Trang Nguyen

    Research Assistant - Medicine @ NUS

    PhD Student - CS @ Stanford

  • Srinivas Anumasa

    Research Fellow

    NUS

  • Lalithkumar Seenivasan

    PhD/Research Fellow

    NUS/Johns Hopkins

  • Xuming Ran

    Research Assistant/PhD Student

    Medicine @ NUS

  • Jayneel Vora

    PhD Student

    CS @ UC Davis

Advisory Committee

  • Dianbo Liu

    Advisor

    Medicine/Engineering @ NUS

  • Yueming Jin

    Advisor

    Engineering @ NUS

  • Yang You

    Advisor

    Computing @ NUS

  • Vincent Y. F. Tan

    Advisor

    Math/Engineering @ NUS

  • Jonathan Scarlett

    Advisor

    CS/Data Science/Math @ NUS

  • Yong Wang

    Advisor

    CCDS @ NTU