Path: aisst.ai

Summary
Harvard researchers are dedicated to mitigating the risks associated with artificial intelligence development. Our group offers a semester-long introductory course to equip future technical and policy researchers. We begin with neural network interpretability, studying how models make decisions. We then explore learning from human feedback as a method for refining agent behavior and avoiding unintended consequences. We also examine goal misgeneralization in reinforcement learning agents, where a model retains its capabilities outside the training distribution but pursues an unintended goal rather than the objective it was trained on. Finally, the program covers evaluations of dangerous capabilities, which aim to steer the trajectory of AI development toward a more beneficial future. By combining rigorous academic study with practical application, the seminar builds a robust understanding of the safety challenges facing AI systems. Ultimately, our mission is to reduce these risks and guide future research toward a trustworthy and ethically aligned path.
Title
AISST
Description
AISST
Keywords
technical, policy, fellowship, resources, papers, research, safety, team, mission, workshops, intro, risks
NS Lookup
A 198.185.159.145, A 198.49.23.145, A 198.185.159.144, A 198.49.23.144
Dates
Created 2026-04-13
Updated 2026-04-14
Summarized 2026-04-16