Known Attacks and Their Impacts on AI Systems

The Japan AI Safety Institute has compiled “Known Attacks and Their Impacts on AI Systems” (this document), which summarizes attacks on AI and AI systems, together with their impacts, as reported in academic papers and other sources. Its aim is to give an overview of the security attacks that are specific to AI systems.
This document supplements the representative attack methods described in our previously published “Guide to Red Teaming Methodology on AI Safety”. The relationships it shows between attacks and their impacts on AI systems can be used to develop risk scenarios and attack scenarios for red teaming. It can also serve as a reference for investigation, study, and research and development in AI security.

We have updated this document to its second edition to reflect papers presented at recent international conferences. (April 24, 2026)