STAI 2023
The Safe and Trustworthy AI Workshop (ICLP 2023)
The 2023 Workshop on Safe and Trustworthy AI (STAI 23) was held on 9 July 2023 at the International Conference on Logic Programming (ICLP) in London. Accepted papers were presented either as talks (13) or as posters (4). Three award winners were chosen. There were three invited speakers, divided between one talk (on the adversarial susceptibility of large neural networks) and one fireside chat (on the relationship between the AI ethics community and the catastrophic AI risk community). Some financial support was made available to those who would otherwise have been unable to attend for lack of funding.
Speakers
Best Paper Awards
The best papers were determined by a vote among participants.
Second best talk. Richard Willis and Michael Luck. Resolving social dilemmas through reward transfer commitments
Best poster. Aidan Kierans, Hananel Hazan and Shiri Dori-Hacohen. Quantifying Misalignment Between Agents
Accepted Papers
The accepted papers are listed below, ordered alphabetically by title. Only some of the papers appear in full on this website (i.e. only some are linked). In total, 23 submissions were accepted as either a talk or a poster presentation.
Harvey Mannering. Analysing Gender Bias in Text-to-Image Models using Object Detection
Alexander W. Goodall and Francesco Belardinelli. Approximate Model-Based Shielding for Safe Reinforcement Learning
Harry Coppock. Audio-based AI classifiers show no evidence of improved COVID-19 screening over simple symptoms checkers
Paolo Bova, Alessandro Di Stefano and The Anh Han. Both eyes open: Vigilant Incentives help Auditors improve AI Safety
Matt MacDermott, Tom Everitt and Francesco Belardinelli. Decision Theory Using Mechanised Causal Graphs
Adam Kaufman. Grokking Grokking: Investigating Model Performance on Modular Arithmetic Tasks
Francis Rhys Ward. Honesty Is the Best Policy: Defining and Mitigating AI Deception
Anthony DiGiovanni and Jesse Clifton. Improved coordination with fail-safes and belief-conditioned programs
Dylan Cope. Learning to Plan with Tree Search via Deep RL
Usman Anwar, Chris Lu, David Krueger and Jakob Foerster. Noisy ZSC: Breaking The Common Knowledge Assumption In Zero-Shot Coordination Games
Chad DeChant. On the risks and benefits of episodic memory in AI agents
Aidan Kierans, Hananel Hazan and Shiri Dori-Hacohen. Quantifying Misalignment Between Agents
Elfia Bezou-Vrakatseli, Benedikt Brückner and Luke Thorburn. Reasons Why Influence May Be Unethical
Richard Willis and Michael Luck. Resolving social dilemmas through reward transfer commitments
Luke Bailey, Gustaf Ahdritz and Anat Kleiman. Soft Prompts Are Unlike Token Embeddings
Avinash Kori. Unsupervised Conditional Slot Attention for Object Centric Learning
Mattia Villani. Unwrapping all ReLU Networks
There were three types of submissions: (1) Regular original papers (8 pages), presenting more mature work that included some (perhaps preliminary) results and that had not been previously published or accepted for publication, and was not under review at another conference or journal. (2) Short original papers (4 pages), intended for less well-developed work, where results may still have been forthcoming, that had not been previously published or accepted for publication, and was not under review at another conference or journal. (3) Published papers or papers under review (15 pages), reporting on interesting and relevant work that had been published (or accepted for publication) in the preceding 18 months or was under review at another venue.
Programme Committee
The programme chairs were C Henrik Åslund, Francesco Belardinelli, Elizabeth Black, and Francis Rhys Ward (see Organisation). The people below performed the indispensable work of reviewing the submissions:
Adedjouma Morayo, Adin Safokem, Aidan Kierans, Alastair Donaldson, Alessandro Abate, Alex Goodall, Alex Jackson, Alex Spies, Almuthanna Alageel, Amani Abou Rida, Anastasios Lepipas, Andrea Omicini, Andrea Orlandini, Aniello Murano, Areej Alzaidi, Ashwathy T Revi, Caitlin Bentley, Caspar Oesterheld, Catalin Dima, Charlie Rogers-Smith, Chengsong Tan, Dekai Zhang, Dimitrios Letsios, Dipesh Singla, Eleanor Watson, Elena Botoeva, Elfia Bezou Vrakatseli, Elizabeth Black, Emiliano Lorini, Emilio Serrano, Florence Eghwrudje, Francesco Chiariello, Ganesh Pai, Hana Kopecka, Hariram Veeramani, Ibrahim Habli, Ishmeet Kaur, Jack McKinlay, Javier Carnerero-Cano, John Favaro, Juliette Mattioli, Kenneth Co, Kevin Wei, Krystal Maughan, Laurent Perrussel, Leon Lang, Leon van der Torre, Luca Viganò, Luis Croquevielle, Lun Ai, Malinda Vania, Mandar Pitale, Maria Stoica, Marta Bienkiewicz, Marten Kaas, Mary Paterson, Matt MacDermott, Matteo Magnini, Mehrdad Saadatmand, Munyque Mittelmann, Nathan Gavenski, Nicky Pochinkov, Philipp Rader, Philippos Papaphilippou, Pierre Parrend, Rachel Horne, Richard Willis, Rob Alexander, Sandareka Wickramanayake, Sanjay Modgil, Sebastian Benthall, Shanza Ali Zafar, Shikha Bordia, Shubhi Asthana, Sian Carey, Simos Gerasimou, Surabhi Sinha, Temitope Ayano, Teun van der Weij, Tilman Räuker, Xiaotong Ji, Xinyi Ye, Yawen Duan, Yi Chang
The workshop aimed to give early career researchers (ECRs) working in relevant fields the opportunity to gain experience of serving on a PC, and training and support were provided for this. The PC included both experienced reviewers and ECRs, and every paper received at least one review from an experienced PC member.