When a teacher provides examples for a student to study, these examples must
be informative, enabling a student to progress from their current state toward
a target concept or skill. Good teachers must therefore simultaneously infer
what students already know and adapt their teaching to students' changing state
of knowledge. There is increasing interest in using computational models,
particularly large language models, as pedagogical tools. As students, language
models have shown a remarkable ability to adapt to new tasks given small
numbers of examples. But how effectively can these models adapt as
teachers to students of different types? To study this question, we introduce a
suite of models and evaluation methods we call AdapT. AdapT has two components:
(1) a collection of simulated Bayesian student models that can be used for
evaluation of automated teaching methods; (2) a platform for evaluation with
human students, to characterize the real-world effectiveness of these methods.
We additionally introduce (3) AToM, a new probabilistic model for adaptive
teaching that jointly infers students' past beliefs and optimizes for the
correctness of future beliefs. In evaluations with simulated students across
three learning domains (fraction arithmetic, English morphology, function
learning), AToM systematically outperforms LLM-based and standard Bayesian
teaching models. In human experiments, both AToM and LLMs outperform
non-adaptive random example selection. Our results highlight both the
difficulty of the adaptive teaching task and the potential of learned adaptive
models for solving it.
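
To make the idea of adaptive teaching concrete, the following is a minimal sketch of a teacher that infers which kind of student it is facing and then picks the example expected to move that student closest to the target concept. It is not the AToM implementation: the threshold-concept domain, the two student priors (`PRIOR_LOW`, `PRIOR_HIGH`), and all function names are hypothetical choices made for illustration only.

```python
import numpy as np

# Hypothetical toy domain (an assumption, not from the paper):
# concept h labels input x as 1 iff x >= h.
XS = np.arange(6)                                   # candidate teaching inputs 0..5
HS = np.arange(7)                                   # candidate thresholds 0..6
LABELS = (XS[None, :] >= HS[:, None]).astype(int)   # LABELS[h, x]

# Two assumed student types, differing only in their prior over concepts.
PRIOR_LOW = np.array([0.25, 0.25, 0.25, 0.25, 0.0, 0.0, 0.0])
PRIOR_HIGH = np.array([0.0, 0.0, 0.0, 0.25, 0.25, 0.25, 0.25])


def student_update(belief, x, y):
    """Bayesian student: zero out concepts inconsistent with (x, y), renormalize."""
    post = belief * (LABELS[:, x] == y)
    total = post.sum()
    return post / total if total > 0 else belief


def student_predict_prob(belief, x):
    """Probability that a student with this belief predicts label 1 for x."""
    return float(belief @ LABELS[:, x])


def update_type_belief(type_belief, student_beliefs, x, guess):
    """Teacher updates its belief over student types from the student's guess on x."""
    p1 = np.array([student_predict_prob(b, x) for b in student_beliefs])
    likelihood = p1 if guess == 1 else 1.0 - p1
    post = type_belief * likelihood
    total = post.sum()
    return post / total if total > 0 else type_belief


def choose_example(type_belief, student_beliefs, target):
    """Pick the input whose true label maximizes the student's expected
    posterior mass on the target concept, averaged over student types."""
    scores = []
    for x in XS:
        y = LABELS[target, x]
        score = sum(w * student_update(b, x, y)[target]
                    for w, b in zip(type_belief, student_beliefs))
        scores.append(score)
    return int(np.argmax(scores))


if __name__ == "__main__":
    beliefs = [PRIOR_LOW, PRIOR_HIGH]
    type_belief = np.array([0.5, 0.5])
    # The student guesses label 0 for x = 4; the teacher updates its type belief
    # and then selects the next example for target threshold 3.
    type_belief = update_type_belief(type_belief, beliefs, x=4, guess=0)
    print(type_belief, choose_example(type_belief, beliefs, target=3))
```

In this toy setting the most useful example depends on the student: one whose prior favors low thresholds benefits most from a negative example just below the target, while one whose prior favors high thresholds benefits most from a positive example at the target. A non-adaptive teacher cannot tell these cases apart.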