Selected Publications

Highlights from recent work on active alignment, multi-agent systems, and uncertainty quantification. The full catalogue appears below.

Active Reward Modeling: Adaptive Preference Labeling for Large Language Model Alignment [ICML 25] Yunyi Shen*, Hao Sun*, Jean-François Ton
Understanding Chain-of-Thought in LLMs through Information Theory [ICLR 25] Jean-François Ton*, Muhammad Faaiz Taufiq*, Yang Liu
ACC-Debate: An Actor-Critic Approach to Multi-Agent Debate [ICLR 25] Jean-François Ton*, Andrew Estornell*, Yuanshun Yao, Yang Liu
Mitigating Reward Overoptimization via Lightweight Uncertainty Estimation [NeurIPS 24] Jean-François Ton*, Xiaoying Zhang*, Wei Shen, Hongning Wang, Yang Liu
Conformal Off-Policy Prediction in Contextual Bandits [NeurIPS 22] Jean-François Ton*, Muhammad Faaiz Taufiq*, Rob Cornish, Yee Whye Teh, Arnaud Doucet

All Publications