My research spans moral psychology, cognitive science, and the nature and implications of artificial intelligence. In one strand of my research, I draw on empirical and computational work to elucidate the nature of motivation in agents and its implications for questions about rationality and flourishing. In a related strand, I explore the conditions of sentience in animals and artificial agents. Additionally, I work on AI safety, designing benchmarks to evaluate the behavior of frontier models.
Publications
“Do Expected Utility Maximizers Have Commitment Issues?”, Philosophy and Phenomenological Research, Forthcoming.
“Motivation, Pleasure, and Valence” (Paul de Font-Reaulx & Chandra Sripada), Symposium in Philosophia, Forthcoming.
“What Makes Discrimination Wrong?”, Journal of Practical Ethics, Vol. 5(2):105–113, 2017.
Non-Archival Papers
“MoReBench: Evaluating Procedural and Pluralistic Moral Reasoning in Language Models, More than Outcomes” (Yu Ying Chiu, Michael S. Lee, Rachel Calcott, Brandon Handoko, Paul de Font-Reaulx, Paula Rodriguez, Chen Bo Calvin Zhang, Ziwen Han, Udari Madhushani Sehwag, Yash Maurya, Christina Q Knight, Harry R. Lloyd, Florence Bacus, Mantas Mazeika, Bing Liu, Yejin Choi, Mitchell L Gordon, & Sydney Levine) (Under review).
“Machine Theory of Mind and the Structure of Human Values”, NeurIPS 2023 MP2 Workshop.
“Generative Theory of Mind and the Value Misgeneralization Problem”, 2023 AI Alignment Awards Final Prize winner.
“Alignment as a Dynamic Process”, NeurIPS 2022 AI Safety Workshop, AI Risk Analysis award winner.
Works in Progress
“What Stands to Desire as Perception Stands to Belief?” (Draft available upon request).
“A Conflation of Valences” (Draft available upon request).
“Epistemic Virtues for Large Language Model Evaluations” (Draft available upon request).
“Self-Reflective Artificial Superintelligence” (Draft in progress).
“DeliberationBench: A Normative Benchmark for the Influence of Large Language Models on Users’ Views” (Luke Hewitt, Max Kroner Dale, & Paul de Font-Reaulx) (Under review).
Reports
“AI: The Consequences for Human Rights: Initial findings on the expected future use of AI in authoritarian regimes”, Governance of AI Project Report submitted as expert testimony to the House of Representatives Commission on Human Rights, 2017.
Writing for a Popular Audience
“Why gender studies are generally misunderstood”, Expressen, 2019 (in Swedish).
“Where is the meaning of art?”, Medium, 2018.