Tom Schaul - Publications by type

Theses

TU Munich[26]	Studies in Continuous Black-box Optimization. Tom Schaul. Ph.D. Thesis at Technische Universität München. 2011. [Link] [Pdf] [BibTeX] [Paperback at lulu.com] Summa cum laude
EPFL[1]	Evolving a Compact Concept-based Sokoban Solver. Tom Schaul. Master's thesis at the École Polytechnique Fédérale de Lausanne. 2005. [Pdf] [BibTeX]

Journals

Nature Comm.[60]	AI for Social Good: Unlocking the Opportunity for Positive Impact. Nenad Tomašev, Julien Cornebise, Frank Hutter, Shakir Mohamed, Angela Picciariello, Bec Connelly, Danielle Belgrave, Daphne Ezer, Fanny Cachat van der Haert, Frank Mugisha, Gerald Abila, Hiromi Arai, Hisham Almiraat, Julia Proskurnia, Kyle Snyder, Mihoko Otake-Matsuura, Mustafa Othman, Tobias Glasmachers, Wilfried de Wever, Yee Whye Teh, Mohammad Emtiyaz Khan, Ruben De Winne, Tom Schaul and Claudia Clopath. Nature Communications 11 (2468), 2020. [Link]
Nature[58]	Grandmaster level in StarCraft II using multi-agent reinforcement learning. Oriol Vinyals, Igor Babuschkin, Wojciech Czarnecki, Michael Mathieu, Andrew Dudzik, Junyoung Chung, David Choi, Richard Powell, Timo Ewalds, Petko Georgiev, Junhyuk Oh, Dan Horgan, Manuel Kroiss, Ivo Danihelka, Aja Huang, Laurent Sifre, Trevor Cai, John Agapiou, Max Jaderberg, Alexander S. Vezhnevets, Remi Leblond, Tobias Pohlen, Valentin Dalibard, David Budden, Yury Sulsky, James Molloy, Tom L. Paine, Caglar Gulcehre, Ziyu Wang, Tobias Pfaff, Yuhuai Wu, Roman Ring, Dani Yogatama, Dario Wunsch, Katrina McKinney, Oliver Smith, Tom Schaul, Timothy Lillicrap, Koray Kavukcuoglu, Demis Hassabis, Chris Apps and David Silver. Nature 575 (7782), 350-354, 2019. [Link] [Preprint] [Blog] [Video]
BBS[52]	Building machines that learn and think for themselves. Matthew Botvinick, David Barrett, Peter Battaglia, Nando de Freitas, Darshan Kumaran, Joel Leibo, Timothy Lillicrap, Joseph Modayil, Shakir Mohamed, Neil Rabinowitz, Danilo Rezende, Adam Santoro, Tom Schaul, Christopher Summerfield, Greg Wayne, Theophane Weber, Daan Wierstra, Shane Legg and Demis Hassabis. Behavioral and Brain Sciences. 2017. [Link]
T-CIAIG[40]	The 2014 General Video Game Playing Competition. Diego Perez, Spyridon Samothrakis, Julian Togelius, Tom Schaul, Simon Lucas, Adrien Couetoux, Jerry Lee, Chong-U Lim and Tommy Thompson. IEEE Transactions on Computational Intelligence and AI in Games. 2015. [Pdf] [BibTeX]
T-CIAIG[39]	An Extensible Video Game Description Language. Tom Schaul. IEEE Transactions on Computational Intelligence and AI in Games. 2014. [Pdf] [BibTeX] [Code]
JMLR[37]	Natural Evolution Strategies. Daan Wierstra, Tom Schaul, Tobias Glasmachers, Yi Sun, Jan Peters and Jürgen Schmidhuber. Journal of Machine Learning Research. 2014. [Pdf] [BibTeX] [arXiv]
Dagstuhl[36]	General Video Game Playing. John Levine, Clare Bates Congdon, Michal Bida, Marc Ebner, Graham Kendall, Simon Lucas, Risto Miikkulainen, Tom Schaul and Tommy Thompson. Dagstuhl Follow-up, volume 6. 2013. [BibTeX] [Preprint]
Dagstuhl[35]	Towards a Video Game Description Language. Marc Ebner, John Levine, Simon Lucas, Tom Schaul, Tommy Thompson and Julian Togelius. Dagstuhl Follow-up, volume 6. 2013. [BibTeX] [Preprint]
Acta Futura[23]	Artificial Curiosity for Autonomous Space Exploration. Vincent Graziano, Tobias Glasmachers, Tom Schaul, Leo Pape, Giuseppe Cuccu, Jürgen Leitner and Jürgen Schmidhuber. Acta Futura. 2011. [Pdf] [BibTeX] [Link]
Scholarpedia[18]	Metalearning. Tom Schaul and Jürgen Schmidhuber. Scholarpedia. 5(6):4650. 2010. [Link] [BibTeX]
JMLR[16]	PyBrain. Tom Schaul, Justin Bayer, Daan Wierstra, Yi Sun, Martin Felder, Frank Sehnke, Thomas Rückstieß and Jürgen Schmidhuber. Journal of Machine Learning Research. 2010. [Pdf] [BibTeX]
JBR[15]	Exploring Parameter Space in Reinforcement Learning. Thomas Rückstieß, Frank Sehnke, Tom Schaul, Daan Wierstra, Yi Sun and Jürgen Schmidhuber. Paladyn Journal of Behavioral Robotics. 2010. [Pdf] [BibTeX]
NIM-A[11]	Assessment of Neural Networks Training Strategies for Histomorphometric Analysis of Synchrotron Radiation Medical Images. Anderson Alvarenga de Moura Meneses, Christiano Pinheiro, Paola Rancoita, Tom Schaul, Luca Gambardella, Roberto Schirru, Regina Barroso and Luís de Oliveira. Nuclear Instruments and Methods in Physics Research, Section A. 2010. [Link] [BibTeX]
KI[7]	Ontogenetic and Phylogenetic Reinforcement Learning. Julian Togelius, Tom Schaul, Daan Wierstra, Christian Igel, Faustino Gomez and Jürgen Schmidhuber. Zeitschrift Künstliche Intelligenz - Special Issue on Reinforcement Learning. 2009. [Pdf] [BibTeX]

Peer-reviewed Conferences

NeurIPS[70]	Plasticity as the Mirror of Empowerment. David Abel, Michael Bowling, André Barreto, Will Dabney, Shi Dong, Steven Hansen, Anna Harutyunyan, Khimya Khetarpal, Clare Lyle, Razvan Pascanu, Georgios Piliouras, Doina Precup, Jonathan Richens, Mark Rowland, Tom Schaul and Satinder Singh. Proceedings of the Neural Information Processing Systems (NeurIPS-2025, San Diego, spotlight). [arXiv] [openreview]
NeurIPS[69]	DataRater: Meta-Learned Dataset Curation. Dan A. Calian, Gregory Farquhar, Iurii Kemaev, Luisa Zintgraf, Matteo Hessel, Jeremy Shar, Junhyuk Oh, András György, Tom Schaul, Jeff Dean, Hado van Hasselt and David Silver. Proceedings of the Neural Information Processing Systems (NeurIPS-2025, San Diego). [arXiv] [openreview]
ICML[68]	AuPair: Golden Example Pairs for Code Repair. Aditi Mavalankar, Hassan Mansoor, Zita Marinho, Mariia Samsikova and Tom Schaul. Proceedings of the International Conference on Machine Learning (ICML-2025) [arXiv] [openreview]
ICML[67]	Open-Endedness is Essential for Artificial Superhuman Intelligence. Edward Hughes, Michael Dennis, Jack Parker-Holder, Feryal Behbahani, Aditi Mavalankar, Yuge Shi, Tom Schaul and Tim Rocktäschel. Proceedings of the International Conference on Machine Learning (ICML-2024, oral presentation). [arXiv] [openreview]
IJCAI[66]	Scaling Goal-based Exploration via Pruning Proto-goals. Akhil Bagaria and Tom Schaul. Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI-2023, Macao, China). [arXiv]
GECCO[65]	Discovering Attention-Based Genetic Algorithms via Meta-Black-Box Optimization. Robert Tjarko Lange, Tom Schaul, Yutian Chen, Chris Lu, Tom Zahavy, Valentin Dalibard and Sebastian Flennerhag. Proceedings of the Genetic and Evolutionary Computation Conference Companion (GECCO-2023, Lisbon). [arXiv] Nominated for Best Paper Award
ICLR[64]	Discovering Evolution Strategies via Meta-Black-Box Optimization. Robert Tjarko Lange, Tom Schaul, Yutian Chen, Tom Zahavy, Valentin Dalibard, Chris Lu, Satinder Singh and Sebastian Flennerhag. Proceedings of the International Conference on Learning Representations (ICLR 2023, Kigali) [arXiv] [openreview]
NeurIPS[63]	The Phenomenon of Policy Churn. Tom Schaul, André Barreto, John Quan and Georg Ostrovski. Proceedings of the Neural Information Processing Systems (NeurIPS-2022, New Orleans) [arXiv]
ICML[62]	Model-Value Inconsistency as a Signal for Epistemic Uncertainty. Angelos Filos, Eszter Vértes, Zita Marinho, Gregory Farquhar, Diana Borsa, Abram Friesen, Feryal Behbahani, Tom Schaul, André Barreto and Simon Osindero. Proceedings of the International Conference on Machine Learning (ICML-2022, Baltimore). [arXiv]
ICLR[61]	When should agents explore? Miruna Pîslar, David Szepesvari, Georg Ostrovski, Diana Borsa and Tom Schaul. Proceedings of the International Conference on Learning Representations (ICLR 2022, spotlight). [arXiv]
AISTATS[59]	Conditional Importance Sampling for Off-Policy Learning. Mark Rowland, Anna Harutyunyan, Hado van Hasselt, Diana Borsa, Tom Schaul, Rémi Munos and Will Dabney. Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS-2020, Palermo). [arXiv]
ICLR[57]	Universal Successor Features Approximators. Diana Borsa, Andre Barreto, John Quan, Daniel Mankowitz, Hado van Hasselt, Remi Munos, David Silver and Tom Schaul. Proceedings of the International Conference on Learning Representations (ICLR 2019, New Orleans). [openreview] [arXiv]
ICML[56]	Transfer in Deep Reinforcement Learning Using Successor Features and Generalised Policy Improvement. Andre Barreto, Diana Borsa, John Quan, Tom Schaul, David Silver, Matteo Hessel, Daniel Mankowitz, Augustin Zidek, Remi Munos. Proceedings of the International Conference on Machine Learning (ICML-2018, Stockholm). [pdf] [link]
GECCO[55]	Meta Learning by the Baldwin Effect. Chrisantha Fernando, Jakub Sygnowski, Simon Osindero, Jane Wang, Tom Schaul, Denis Teplyashin, Pablo Sprechmann, Alexander Pritzel, Andrei Rusu. Proceedings of the Genetic and Evolutionary Computation Conference Companion (GECCO-2018, Kyoto). [arXiv]
AAAI[54]	Rainbow: Combining Improvements in Deep Reinforcement Learning. Matteo Hessel, Joseph Modayil, Hado van Hasselt, Tom Schaul, Georg Ostrovski, Will Dabney, Dan Horgan, Bilal Piot, Mohammad Azar, David Silver. Proceedings of the AAAI Conference on Artificial Intelligence (AAAI-2018, New Orleans). [arXiv]
AAAI[53]	Learning from Demonstrations for Real World Reinforcement Learning. Todd Hester, Matej Vecerik, Olivier Pietquin, Marc Lanctot, Tom Schaul, Bilal Piot, Andrew Sendonaris, Gabriel Dulac-Arnold, Ian Osband, John Agapiou, Joel Z Leibo, Audrunas Gruslys. Proceedings of the AAAI Conference on Artificial Intelligence (AAAI-2018, New Orleans). [arXiv]
NIPS[51]	Natural value approximators: learning when to trust past estimates. Zhongwen Xu, Joseph Modayil, Hado van Hasselt, Andre Barreto, David Silver and Tom Schaul. Proceedings of the Neural Information Processing Systems (NIPS-2017, Long Beach) [pdf]
NIPS[50]	Successor Features for Transfer in Reinforcement Learning. André Barreto, Will Dabney, Rémi Munos, Jonathan Hunt, Tom Schaul, David Silver and Hado van Hasselt. Proceedings of the Neural Information Processing Systems (NIPS-2017, Long Beach) [arXiv] [pdf]
ICML[49]	The Predictron: End-To-End Learning and Planning. David Silver, Hado van Hasselt, Matteo Hessel, Tom Schaul, Arthur Guez, Tim Harley, Gabriel Dulac-Arnold, David Reichert, Neil Rabinowitz, Andre Barreto, Thomas Degris. Proceedings of the International Conference on Machine Learning (ICML-2017, Sydney). [arXiv] [openreview]
ICML[48]	FeUdal Networks for Hierarchical Reinforcement Learning. Alexander Sasha Vezhnevets, Simon Osindero, Tom Schaul, Nicolas Heess, Max Jaderberg, David Silver, Koray Kavukcuoglu. Proceedings of the International Conference on Machine Learning (ICML-2017, Sydney). [arXiv]
ICLR[47]	Reinforcement Learning with Unsupervised Auxiliary Tasks. Max Jaderberg, Volodymyr Mnih, Wojciech Czarnecki, Tom Schaul, Joel Leibo, David Silver, Koray Kavukcuoglu. Proceedings of the International Conference on Learning Representations (ICLR-2017, Toulon). [arXiv] [openreview]
NIPS[46]	Unifying Count-Based Exploration and Intrinsic Motivation. Marc G. Bellemare, Sriram Srinivasan, Georg Ostrovski, Tom Schaul, David Saxton and Rémi Munos. Proceedings of the Neural Information Processing Systems (NIPS-2016) [arXiv] [Video (100k+ views)]
NIPS[45]	Learning to learn by gradient descent by gradient descent. Marcin Andrychowicz, Misha Denil, Sergio Gomez, Matthew W. Hoffman, David Pfau, Tom Schaul and Nando de Freitas. Proceedings of the Neural Information Processing Systems (NIPS-2016) [arXiv]
IEEE-CIG[44]	Analyzing the Robustness of General Video Game Playing Agents. Diego Perez-Liebana, Spyridon Samothrakis, Julian Togelius, Tom Schaul and Simon Lucas. Proceedings of the IEEE Conference on Computational Intelligence in Games (CIG-2016, Greece). [Pdf] [BibTeX]
ICML[43]	Dueling Network Architectures for Deep Reinforcement Learning. Ziyu Wang, Tom Schaul, Matteo Hessel, Hado van Hasselt, Marc Lanctot, Nando de Freitas. Proceedings of the International Conference on Machine Learning (ICML-2016, New York). [arXiv] [BibTeX] Best Paper Award
ICLR[42]	Prioritized Experience Replay. Tom Schaul, John Quan, Ioannis Antonoglou and David Silver. Proceedings of the International Conference on Learning Representations (ICLR-2016, Puerto Rico). [arXiv] [BibTeX]
ICML[41]	Universal Value Function Approximators. Tom Schaul, Daniel Horgan, Karol Gregor and David Silver. Proceedings of the International Conference on Machine Learning (ICML-2015, Lille). [Pdf] [BibTeX]
ICLR[38]	Unit Tests for Stochastic Optimization. Tom Schaul, Ioannis Antonoglou and David Silver. Proceedings of the International Conference on Learning Representations (ICLR-2014, Banff, Canada). [BibTeX] [arXiv] [Code] [Public reviews]
IEEE-CIG[34]	A Video Game Description Language for Model-based or Interactive Learning. Tom Schaul. Proceedings of the IEEE Conference on Computational Intelligence in Games (CIG-2013, Niagara Falls, Canada). [Pdf] [Code] [BibTeX] Runner-up to Best Paper Award
ICML[33]	No more Pesky Learning Rates. Tom Schaul, Sixin Zhang and Yann LeCun. Proceedings of the International Conference on Machine Learning (ICML-2013, Atlanta GA). [arXiv] [Pdf] [Supplementary material] [BibTeX]
IJCAI[32]	Better Generalization with Forecasts. Tom Schaul and Mark Ring. Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI-2013, Beijing, China). [Pdf] [BibTeX] Basis for patent [P1].
ICLR[31]	Adaptive Learning Rates and Parallelization for Stochastic, Sparse, Non-smooth Gradients. Tom Schaul and Yann LeCun. Proceedings of the International Conference on Learning Representations (ICLR-2013, Scottsdale AZ). [Pdf] [BibTeX] [Public reviews]
GECCO[30]	A Linear Time Natural Evolution Strategy for Non-Separable Functions. Yi Sun, Faustino Gomez, Tom Schaul and Jürgen Schmidhuber. Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2013, Amsterdam). [arXiv] [BibTeX]
ICDL[28]	The Organization of Behavior into Temporal and Spatial Neighborhoods. Mark Ring and Tom Schaul. Proceedings of the International Conference on Developmental Learning (ICDL-2012, San Diego). [Pdf] [BibTeX]
GECCO[27]	Natural Evolution Strategies Converge on Sphere Functions. Tom Schaul. Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2012, Philadelphia). [Pdf] [BibTeX]
ICDL[25]	The Two-Dimensional Organization of Behavior. Mark Ring, Tom Schaul and Jürgen Schmidhuber. Proceedings of the International Conference on Developmental Learning (ICDL-2011, Frankfurt). [Pdf] [BibTeX]
AGI[24]	Coherence Progress: A Measure of Interestingness Based on Fixed Compressors. Tom Schaul, Leo Pape, Tobias Glasmachers, Vincent Graziano and Jürgen Schmidhuber. Proceedings of the Fourth Conference on Artificial General Intelligence (AGI-2011, Mountain View). [Pdf] [BibTeX] [Video] Winner of PhD Challenge
IJCAI[22]	Q-error as a Selection Mechanism in Modular Reinforcement-Learning Systems. Mark Ring and Tom Schaul. Proceedings of the International Joint Conference on Artificial Intelligence (IJCAI-2011, Barcelona). [Pdf] [BibTeX] [Video]
GECCO[21]	High Dimensions and Heavy Tails for Natural Evolution Strategies. Tom Schaul, Tobias Glasmachers and Jürgen Schmidhuber. Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2011, Dublin). [Pdf] [BibTeX]
CEC[20]	Curiosity-driven Optimization. Tom Schaul, Yi Sun, Daan Wierstra, Faustino Gomez and Jürgen Schmidhuber. Proceedings of IEEE Congress on Evolutionary Computation (CEC-2011, New Orleans). [Pdf] [BibTeX]
PPSN[19]	A Natural Evolution Strategy for Multi-Objective Optimization. Tobias Glasmachers, Tom Schaul, and Jürgen Schmidhuber. Proceedings of Parallel Problem Solving from Nature (PPSN-2010, Krakow). [Pdf] [BibTeX]
GECCO[17]	Exponential Natural Evolution Strategies. Tobias Glasmachers, Tom Schaul, Yi Sun, Daan Wierstra and Jürgen Schmidhuber. Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2010, Portland). [Pdf] [BibTeX] Nominated for Best Paper Award
AGI[14]	Frontier Search. Yi Sun, Tobias Glasmachers, Tom Schaul and Jürgen Schmidhuber. Proceedings of the Third Conference on Artificial General Intelligence (AGI-2010, Lugano). [Pdf] [BibTeX] Kurzweil Best Paper Prize
AGI[13]	Towards Practical Universal Search. Tom Schaul and Jürgen Schmidhuber. Proceedings of the Third Conference on Artificial General Intelligence (AGI-2010, Lugano). [Pdf] [BibTeX] [Presentation Video]
ICANN[12]	Multi-Dimensional Deep Memory Go-Player for Parameter Exploring Policy Gradients. Mandy Grüttner, Frank Sehnke, Tom Schaul and Jürgen Schmidhuber. Proceedings of the International Conference on Artificial Neural Networks (ICANN-2010, Greece). [Pdf] [BibTeX]
GECCO[10]	Efficient Natural Evolution Strategies. Yi Sun, Daan Wierstra, Tom Schaul and Jürgen Schmidhuber. Proceedings of the Genetic and Evolutionary Computation Conference (GECCO-2009, Montreal). [Pdf] [BibTeX] Best Paper Award
ICML[9]	Stochastic Search using the Natural Gradient. Yi Sun, Daan Wierstra, Tom Schaul and Jürgen Schmidhuber. Proceedings of the International Conference on Machine Learning (ICML-2009, Montreal). [Pdf] [BibTeX]
ICANN[8]	Scalable Neural Networks for Board Games. Tom Schaul and Jürgen Schmidhuber. Proceedings of the International Conference on Artificial Neural Networks (ICANN-2009, Cyprus). [Pdf] [BibTeX]
IEEE-CIG[6]	A Scalable Neural Network Architecture for Board Games. Tom Schaul and Jürgen Schmidhuber. Proceedings of the IEEE Symposium on Computational Intelligence in Games (CIG-2008, Perth). [Pdf] [BibTeX]
PPSN[5]	Fitness Expectation Maximization. Daan Wierstra, Tom Schaul, Jan Peters and Jürgen Schmidhuber. Proceedings of Parallel Problem Solving from Nature (PPSN-2008, Dortmund). [Pdf] [BibTeX]
PPSN[4]	Countering Poisonous Inputs with Memetic Neuroevolution. Julian Togelius, Tom Schaul, Jürgen Schmidhuber and Faustino Gomez. Proceedings of Parallel Problem Solving from Nature (PPSN-2008, Dortmund). [Pdf] [BibTeX]
CEC[3]	Natural Evolution Strategies. Daan Wierstra, Tom Schaul, Jan Peters and Jürgen Schmidhuber. Proceedings of IEEE Congress on Evolutionary Computation (CEC-2008, Hong Kong). [Pdf] [BibTeX]
ICANN[2]	Episodic Reinforcement Learning by Logistic Reward-Weighted Regression. Daan Wierstra, Tom Schaul, Jan Peters and Jürgen Schmidhuber. Proceedings of the International Conference on Artificial Neural Networks (ICANN-2008, Prague). [Pdf] [BibTeX]

Patents

US Patent[P13]	Reinforcement learning to explore environments. Luisa Zintgraf, Zita Marinho, Iurii Kemaev, Louis Kirsch, Junhyuk Oh and Tom Schaul. United States Patent Application 18/846,412. 2025. Based on [W20].
US Patent[P12]	Controlling agents by switching between control policies during task episodes. Tom Schaul and Miruna Pîslar. United States Patent Application 18/294,784. 2024. Based on [61].
US Patent[P11]	Temporal difference scaling when controlling agents using reinforcement learning. Tom Schaul. United States Patent Application 18/275,145. 2024. Based on [T6].
US Patent[P10]	Meta-learned evolutionary strategies optimizer. Robert Tjarko Lange, Tom Schaul, Yutian Chen, Tom Zahavy, Valentin Dalibard, Chris Lu, Satinder Singh and Sebastian Flennerhag. United States Patent Application 18/475,859. 2024. Based on [64].
US Patent[P9]	Modulating agent behavior to optimize learning progress. Tom Schaul, Diana Borsa, David Ding, David Szepesvari, Georg Ostrovski, Will Dabney and Simon Osindero. United States Patent Application 17/032,562. 2021. Based on [W19].
US Patent[P8]	Learning non-differentiable weights of neural networks using evolutionary strategies. Karel Lenc, Karen Simonyan, Tom Schaul and Erich Elsen. United States Patent Application 16/751,169. 2020. Based on [T3].
US Patent[P7]	Continual reinforcement learning with a multi-task agent. Tom Schaul, Matteo Hessel, Hado van Hasselt and Dan Mankowitz. United States Patent Application 16/268,414. 2019. Based on [W17].
US Patent[P6]	Environment prediction using reinforcement learning. David Silver, Tom Schaul, Matteo Hessel and Hado van Hasselt. United States Patent Application 16/403,314. 2019. Based on [49].
US Patent[P5]	Reinforcement learning with auxiliary tasks. Volodymyr Mnih, Wojciech Czarnecki, Maxwell Jaderberg, Tom Schaul, David Silver, Koray Kavukcuoglu. United States Patent Application 16/403,385. 2019. Based on [47].
US Patent[P4]	Training machine learning models. Misha Denil, Tom Schaul, Marcin Andrychowicz, Nando de Freitas, Sergio Gomez, Matthew Hoffman and David Pfau. United States Patent Application 16/302,592. 2019. Based on [45].
US Patent[P3]	Training neural networks using a prioritized experience memory. Tom Schaul, John Quan and David Silver. United States Patent Application 20170140269. 2017. Based on [42].
US Patent[P2]	Selecting reinforcement learning actions using goals and observations. Tom Schaul, Dan Horgan, Karol Gregor and David Silver. United States Patent Application 20160292568. 2016. Based on [41].
US Patent[P1]	Method for Creating Predictive Knowledge Structures from Experience in an Artificial Agent. Mark Ring and Tom Schaul. United States Patent Application 20160012338. 2016. [Link] [BibTeX]

Book chapters

Springer[29]	Optimization with Surrogate Models. Tom Schaul. Chapter 3 in Numerical Methods for Metamaterial Design, edited by Kenneth Diest. 2013. [Link] [BibTeX]

Workshops and Abstracts

RLDM[W24]	Agency Is Frame-Dependent. David Abel, André Barreto, Michael Bowling, Will Dabney, Shi Dong, Steven Hansen, Anna Harutyunyan, Khimya Khetarpal, Clare Lyle, Razvan Pascanu, Georgios Piliouras, Doina Precup, Jonathan Richens, Mark Rowland, Tom Schaul and Satinder Singh. Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM-2025, oral presentation). [arXiv]
NeurIPS-LG[W23]	Boundless Socratic Learning with Language Games. Tom Schaul. NeurIPS Workshop on Language Gamification (NeurIPS-LG-2024, invited talk). [arXiv] [openreview]
NeurIPS-ALOE[W22]	Vision-Language Models as a Source of Rewards. Harris Chan, Volodymyr Mnih, Feryal Behbahani, Michael Laskin, Luyu Wang, Fabio Pardo, Maxime Gazeau, Himanshu Sahni, Dan Horgan, Kate Baumli, Yannick Schroecker, Stephen Spencer, Richie Steigerwald, John Quan, Gheorghe Comanici, Sebastian Flennerhag, Alexander Neitz, Lei Zhang, Tom Schaul, Satinder Singh, Clare Lyle, Tim Rocktäschel, Jack Parker-Holder and Kristian Holsheimer. Second Agent Learning in Open-Endedness Workshop, NeurIPS (ALOE-2023, New Orleans). [openreview]
Barbados[W21]	Scaling Goal-based Exploration via Pruning Proto-goals. Akhil Bagaria, Ray Jiang, Ramana Kumar and Tom Schaul. Barbados Reinforcement Learning Workshop on Lifelong Reinforcement Learning (2023, Holetown, Barbados). [arXiv] [slides]
RLDM[W20]	RL2X: Reinforcement Learning to Explore. Luisa Zintgraf, Zita Marinho, Iurii Kemaev, Louis Kirsch, Junhyuk Oh and Tom Schaul. Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM-2022).
BeTR-RL[W19]	Adapting Behaviour for Learning Progress. Tom Schaul, Diana Borsa, David Ding, David Szepesvari, Georg Ostrovski, Will Dabney and Simon Osindero. ICLR workshop on: Beyond 'Tabula Rasa' in Reinforcement Learning (BeTR-RL-2020) [arXiv]
RLDM[W18]	Ray Interference: a Source of Plateaus in Deep Reinforcement Learning. Tom Schaul, Diana Borsa, Joseph Modayil and Razvan Pascanu. Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM-2019) [arXiv]
RLDM[W17]	Unicorn: Continual Learning with a Universal, Off-policy Agent. Daniel Mankowitz, Augustin Zidek, Andre Barreto, Dan Horgan, Matteo Hessel, John Quan, Junhyuk Oh, Hado van Hasselt, David Silver and Tom Schaul. Multidisciplinary Conference on Reinforcement Learning and Decision Making (RLDM-2019) [arXiv] [openreview]
NIPS-CL[W16]	The Barbados 2018 List of Open Issues in Continual Learning. Tom Schaul, Hado van Hasselt, Joseph Modayil, Martha White, Adam White, Pierre-Luc Bacon, Jean Harb, Shibl Mourad, Marc Bellemare and Doina Precup. NIPS Workshop on Continual Learning (NIPS-CL-2018, Montreal). [arXiv]
Barbados[W15]	Universal Successor Features Approximators. Diana Borsa, Andre Barreto, John Quan, Daniel Mankowitz, Hado van Hasselt, Remi Munos, David Silver and Tom Schaul. 11th Barbados Workshop on Reinforcement Learning (2018, Holetown, Barbados). [openreview]
AAAI-WH[W14]	General Video Game AI: Competition, Challenges and Opportunities. Diego Perez-Liebana, Spyridon Samothrakis, Julian Togelius, Tom Schaul and Simon Lucas. AAAI What's Hot Track (2016, Phoenix). [Pdf]
Barbados[W13]	Universal Value Function Approximators. Tom Schaul, Daniel Horgan, Karol Gregor and David Silver. 9th Barbados Workshop on Reinforcement Learning (2015, Holetown, Barbados). [Slides]
Barbados[W12]	Better Generalization with Forecasts. Tom Schaul and Mark Ring. 8th Barbados Workshop on Reinforcement Learning (2013, Holetown, Barbados). [Slides]
AAAI-LML[W11]	Organizing Behavior into Temporal and Spatial Neighborhoods. Mark Ring and Tom Schaul. AAAI Spring Symposium on Lifelong Machine Learning (AAAI-LML-2013, Stanford University, California). [Pdf]
NIPS-OPT[W10]	No More Pesky Learning Rates. Tom Schaul, Sixin Zhang and Yann LeCun. NIPS Workshop on Optimization for Machine Learning (2012).
NYC-ML[W9]	Adaptive Learning Rates for Stochastic Gradients. Tom Schaul, Sixin Zhang and Yann LeCun. New York City Machine Learning Symposium (2012).
BBOB[W8]	Comparing Natural Evolution Strategies to BIPOP-CMA-ES on Noiseless and Noisy Black-box Optimization Testbeds. Tom Schaul. Black-box Optimization Benchmarking: GECCO Workshop for Real-Parameter Optimization (2012). [Pdf] [BibTeX]
BBOB[W7]	Investigating the Impact of Adaptation Sampling in Natural Evolution Strategies on Black-box Optimization Testbeds. Tom Schaul. Black-box Optimization Benchmarking: GECCO Workshop for Real-Parameter Optimization (2012). [Pdf] [BibTeX]
BBOB[W6]	Benchmarking Separable Natural Evolution Strategies on the Noiseless and Noisy Black-box Optimization Testbeds. Tom Schaul. Black-box Optimization Benchmarking: GECCO Workshop for Real-Parameter Optimization (2012). [Pdf] [BibTeX]
BBOB[W5]	Benchmarking Exponential Natural Evolution Strategies on the Noiseless and Noisy Black-box Optimization Testbeds. Tom Schaul. Black-box Optimization Benchmarking: GECCO Workshop for Real-Parameter Optimization (2012). [Pdf] [BibTeX]
BBOB[W4]	Benchmarking Natural Evolution Strategies with Adaptation Sampling on the Noiseless and Noisy Black-box Optimization Testbeds. Tom Schaul. Black-box Optimization Benchmarking: GECCO Workshop for Real-Parameter Optimization (2012). [Pdf] [BibTeX]
Snowbird[W3]	Decoupling the Data Geometry from the Parameter Geometry for Stochastic Gradients. Tom Schaul, Sixin Zhang and Yann LeCun. Snowbird Learning Workshop (2012). [Pdf] [BibTeX]
AAAI[W2]	The Two-Dimensional Organization of Behavior. Mark Ring, Tom Schaul and Jürgen Schmidhuber. AAAI Workshop on Lifelong Learning from Sensorimotor Experience (AAAI-2011, San Francisco). [Pdf]
ICML-MLOS[W1]	PyBrain. Tom Schaul, Justin Bayer, Daan Wierstra, Yi Sun, Martin Felder, Frank Sehnke, Thomas Rückstieß and Jürgen Schmidhuber. ICML Workshop on Machine Learning Open Source (ICML-2010, Haifa). [Pdf] [Video]

Technical reports

(not otherwise published)

Tech report[T8]	Gemini 2.5: Pushing the Frontier with Advanced Reasoning, Multimodality, Long Context, and Next Generation Agentic Capabilities. Gemini Team, Google (3000+ contributors). 2025. [Pdf] [Blog] [arXiv]
Dagstuhl[T7]	AI for the Social Good. Claudia Clopath, Ruben De Winne and Tom Schaul. 2022. [workshop] [report]
arXiv[T6]	Return-based Scaling: Yet Another Normalisation Trick for Deep RL. Tom Schaul, Georg Ostrovski, Iurii Kemaev, and Diana Borsa. 2021. [arXiv]
arXiv[T5]	Policy Evaluation Networks. Jean Harb, Tom Schaul, Doina Precup and Pierre-Luc Bacon. 2020. [arXiv]
Dagstuhl[T4]	AI for the Social Good. Claudia Clopath, Ruben De Winne, Mohammad Emtiyaz Khan and Tom Schaul. 2019. [workshop] [report]
arXiv[T3]	Non-differentiable Supervised Learning with Evolution Strategies and Hybrid Methods. Karel Lenc, Erich Elsen, Tom Schaul and Karen Simonyan. 2019. [arXiv]
arXiv[T2]	StarCraft II: A New Challenge for Reinforcement Learning. Oriol Vinyals, Timo Ewalds, Sergey Bartunov, Petko Georgiev, Alexander Sasha Vezhnevets, Michelle Yeo, Alireza Makhzani, Heinrich Küttler, John Agapiou, Julian Schrittwieser, Stephen Gaffney, Stig Petersen, Karen Simonyan, Tom Schaul, Hado van Hasselt, David Silver, Timothy Lillicrap, Kevin Calderone, Paul Keet, Anthony Brunasso, David Lawrence, Anders Ekermo, Jacob Repp and Rodney Tsing. 2017. [arXiv]
arXiv[T1]	Measuring Intelligence through Games. Tom Schaul, Julian Togelius and Jürgen Schmidhuber. 2011. [arXiv]