Dear all,
Next week, we have the pleasure of hosting Prof. Scott Aaronson, who will give a talk in the colloquium.
The seminar will be held on Monday, June 17th at 14:00. Location: C221.
The title, abstract and bio appear below.
Looking forward to seeing you,
Sagie and Liat
*Title:* AI Safety and Theoretical Computer Science
*Abstract:*
Progress on AI safety and alignment, like the current AI revolution more generally, has been almost entirely empirical. In this talk, however, I'll survey a few areas where I think theoretical computer science can contribute to AI safety, including:
- How can we robustly watermark the outputs of Large Language Models and other generative AI systems, to help identify academic cheating, deepfakes, and AI-enabled fraud? I'll explain my proposal and its basic mathematical properties, as well as what remains to be done.
- Can one insert undetectable cryptographic backdoors into neural nets, for good or ill? In what senses can those backdoors also be unremovable? How robust are they against fine-tuning?
- Should we expect neural nets to be "generically" interpretable? I'll discuss a beautiful formalization of that question due to Paul Christiano, along with some initial progress on it, and an unexpected connection to quantum computing.
*Bio:*
Scott Aaronson is Schlumberger Chair of Computer Science at the University of Texas at Austin, and founding director of its Quantum Information Center, currently on leave at OpenAI to work on theoretical foundations of AI safety. He received his bachelor's from Cornell University and his PhD from UC Berkeley. Before coming to UT Austin, he spent nine years as a professor in Electrical Engineering and Computer Science at MIT. Aaronson's research in theoretical computer science has focused mainly on the capabilities and limits of quantum computers. His first book, Quantum Computing Since Democritus, was published in 2013 by Cambridge University Press. He received the National Science Foundation’s Alan T. Waterman Award, the United States PECASE Award, the Tomassoni-Chisesi Prize in Physics, and the ACM Prize in Computing, and is a Fellow of the ACM and the AAAS.
Reminder: this is happening today.
The original announcement above was sent by Sagie Benaim (sagie.benaim@mail.huji.ac.il) on Mon, 10 Jun 2024, 15:25.