共同テスト演習の評価レポートが公表されました。
Evaluation reports for the 2nd and 3rd joint testing exercises have been published.

国際的なネットワークから10か国が参加し共同で実施した共同テスト演習について、以下の通りレポートが公表されました。
Participants from across the international network, including 10 countries, conducted the joint testing exercises. The evaluation reports have now been published, as listed below.

エージェント評価手法の進展

第3回共同テスト演習の評価レポート"エージェント評価手法の進展"が公表されました。

[pdf]
https://sgaisi.sg/wp-api/wp-content/uploads/2025/07/International-Joint-Testing-Exercise_3JT-Blogpost.pdf

[website]
https://www.aisi.gov.uk/work/international-joint-testing-exercise-agentic-testing

世界の言語におけるLLM評価手法の改善

第2回共同テスト演習の評価レポート"世界の言語におけるLLM評価手法の改善"が公表されました。

[pdf]
https://sgaisi.sg/wp-api/wp-content/uploads/2025/06/Improving-Methodologies-for-LLM-Evaluations-Across-Global-Languages-Evaluation-Report-1.pdf

Advancing Methodologies for Agentic Evaluations Across Domains

The evaluation report of the 3rd joint testing exercise, titled "Advancing Methodologies for Agentic Evaluations Across Domains", has been published.

[pdf]
https://sgaisi.sg/wp-api/wp-content/uploads/2025/07/International-Joint-Testing-Exercise_3JT-Blogpost.pdf

[website]
https://www.aisi.gov.uk/work/international-joint-testing-exercise-agentic-testing

Improving Methodologies for LLM Evaluations Across Global Languages

The evaluation report of the 2nd joint testing exercise, titled "Improving Methodologies for LLM Evaluations Across Global Languages", has been published.

[pdf]
https://sgaisi.sg/wp-api/wp-content/uploads/2025/06/Improving-Methodologies-for-LLM-Evaluations-Across-Global-Languages-Evaluation-Report-1.pdf