Multimodal AI Risk Benchmark Dataset

Dec 30, 2024 · 1 min read


We collaborated with Seoul City University and industry partners to build a large-scale benchmark dataset for AI safety evaluation.
The dataset covers 35 categories of AI-related risks with 11,480 multimodal instances spanning text, image, video, and audio.

Our lab developed the risk data for the image and video modalities, ensuring high-quality, ethically curated data for generative AI safety evaluation.

๐Ÿท๏ธ Contribution

  • Image Data: 860 instances
  • Video Data: 310 instances

Prompt types included:

  • Multiple-Choice
  • Q Only
  • Multi-Session
  • Role-Playing
  • Chain-of-Thought
  • Expert Prompting

📊 Dataset Overview (Full)

  • Total Data: 11,480 instances

    • Text: 9,560
    • Image: 1,160
    • Video: 430
    • Audio: 330
  • Risk Categories: 35

  • Prompt Types: Multiple-Choice, Q Only, Multi-Session, Role-Playing, Chain-of-Thought, Expert Prompting, Rail, Reflection

Authors
Dongkun Lee
Ph.D. AI Researcher | XR Simulation | Explainable AI | Anomaly Detection
I am an AI researcher with a Ph.D. in Computer Science from KAIST, specializing in generative AI for XR simulations and anomaly detection in safety-critical systems.
My work focuses on Explainable AI (XAI) to enhance transparency and reliability across smart infrastructure, security, and education.
By building multimodal learning approaches and advanced simulation environments, I aim to improve operational safety, immersive training, and scalable content creation.