Causal-Entity Reflected Egocentric Traffic Accident Video Synthesis

1 Xi’an Jiaotong University
2 National University of Singapore
3 Nanyang Technological University
4 Cleveland State University
ICCV 2025

The training schema of Causal-VidSyn.

Abstract

Egocentrically comprehending the causes and effects of car accidents is crucial for the safety of self-driving cars, and synthesizing causal-entity reflected accident videos can facilitate the capability test to respond to unaffordable accidents in reality. However, incorporating causal relations as seen in real-world videos into synthetic videos remains challenging. This work argues that precisely identifying the accident participants and capturing their related behaviors are of critical importance. In this regard, we propose a novel diffusion model, Causal-VidSyn, for synthesizing egocentric traffic accident videos. To enable causal entity grounding in video diffusion, Causal-VidSyn leverages the cause descriptions and driver fixations to identify the accident participants and behaviors, facilitated by accident reason answering and gaze-conditioned selection modules. To support Causal-VidSyn, we further construct Drive-Gaze, the largest driver gaze dataset (with 1.54M frames of fixations) in driving accident scenarios. Extensive experiments show that Causal-VidSyn surpasses state-of-the-art video diffusion models in terms of frame quality and causal sensitivity in various tasks, including accident video editing, normal-to-accident video diffusion, and text-to-video generation.

Image 2

Sample visualizations of the N2A task by Latte*, Latte-T, CogV-X*, CogV-X-T, MotionClone, A-OAVD, LAMP, and our Causal-VidSyn (best viewed in zoom mode).

Image 3

Performance on N2A and T2V tasks (bold font: the best).

Image 5

We visualize AEdit results of one crossing situation by LAMP, A-OAVD, and our Causal-VidSyn.

Image 6

Afd is the ratio of checks with IOU(, ) > 0 among all checks. IOU: the intersection over union of two bounding boxes.
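As a rough illustration of the computation described in this caption, the sketch below computes the IOU of two axis-aligned boxes and the fraction of checks whose IOU is greater than 0. The box format `(x_min, y_min, x_max, y_max)`, the function names `iou` and `positive_iou_ratio`, and the example boxes are assumptions made for illustration; the caption does not state which two boxes are compared in each check, so the pairs are simply passed in by the caller.

```python
from typing import Sequence, Tuple

# Assumed box format: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned bounding boxes."""
    # Corners of the intersection rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def positive_iou_ratio(pairs: Sequence[Tuple[Box, Box]]) -> float:
    """Fraction of checks (box pairs) whose IOU is strictly greater than 0."""
    if not pairs:
        return 0.0
    hits = sum(1 for a, b in pairs if iou(a, b) > 0)
    return hits / len(pairs)


# Hypothetical example: one overlapping pair and one disjoint pair.
checks = [
    ((0, 0, 2, 2), (1, 1, 3, 3)),  # boxes overlap
    ((0, 0, 1, 1), (2, 2, 3, 3)),  # boxes are disjoint
]
ratio = positive_iou_ratio(checks)
```

Here one of the two hypothetical checks has overlapping boxes, so the ratio is 0.5. Boxes that only touch along an edge have zero intersection area and do not count as a hit under this sketch.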

Some text-to-video samples

AEdit and N2A samples are large. We will share them via external links on the GitHub page.

GIF 1

Ego-car drives too fast and the braking distance is short, resulting in that the ego-car hitting a crossing car.

GIF 2

The vehicle drives too fast and the braking distance is short, resulting in that the car hitting a crossing pedestrian

GIF 3

Ego-car drives too fast and the braking distance is short, resulting in that the ego-car hitting a car.

GIF 4

Pedestrian does not notice the coming vehicles when crossing the street, resulting in that the ego-car hitting a crossing pedestrian

GIF 5

Cyclist does not notice the coming vehicles when crossing the road, resulting in that the ego-car hitting a crossing cyclist

GIF 6

Motorcycle driver is inattentive, resulting in that the ego-car hitting a motorbike
