Keynote – ‘Thinking’ in Large Language Models

 

Professor Yulan He, KCL

Abstract: It is widely acknowledged that Large Language Models (LLMs), primarily trained for next-token prediction, do not possess the capacity for conscious thought or true understanding. However, they can be made to appear “thoughtful” by eliciting intermediate reasoning steps, reflecting on prior outputs, and refining responses – a process often called test-time scaling. In this talk, I will present our latest work on enhancing “thinking” in LLMs, where intermediate thoughts can be viewed as latent reasoning tokens and Question Answering (QA) can be framed as navigating an embedding space guided by these tokens. We explore sampling diverse reasoning paths through multi-perspective solution seeking and perturbation of the first answer token in the embedding space. We show that, beyond explicit Chain-of-Thought (CoT) reasoning in discrete token space, implicit CoT reasoning can operate within a continuous space. While existing LLM incentive training often relies on external verifiers to assess outputs, we show that incentive training can be applied across diverse text-to-text tasks without such verifiers. Moreover, we introduce approaches to improve the faithfulness of LLM rationales using dual-model verbal reflection and dual-reward probabilistic inference. I will conclude my talk with our recently proposed Bayesian meta-reasoning framework, designed to enable more robust and generalisable reasoning in LLMs.
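
For readers unfamiliar with the idea of perturbing the first answer token, the following minimal sketch (not taken from the talk; it assumes a toy embedding dimension and a simple Gaussian perturbation, and the function and parameter names are illustrative only) shows how noisy copies of the first answer-token embedding could seed several decoding passes, each yielding a different reasoning path.

# Illustrative sketch only: sample diverse reasoning paths by perturbing
# the embedding of the first answer token in continuous space.
# Assumptions (not from the talk): Gaussian noise, toy 8-dim embedding.
import numpy as np

rng = np.random.default_rng(0)

def perturb_first_token(h_first: np.ndarray, n_paths: int = 4, sigma: float = 0.1) -> np.ndarray:
    """Return n_paths noisy copies of the first answer-token embedding.

    In a full pipeline, each perturbed embedding would seed a separate
    decoding pass, and the resulting answers could be compared or
    aggregated (e.g. by majority vote).
    """
    noise = rng.normal(scale=sigma, size=(n_paths, h_first.shape[-1]))
    return h_first[None, :] + noise

# Toy usage: an 8-dimensional stand-in for a model hidden state.
h = rng.normal(size=8)
variants = perturb_first_token(h, n_paths=3)
print(variants.shape)  # (3, 8)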