The Journal of The Korea Institute of Intelligent Transport Systems Vol.24 No.6 pp.79-93
Fine-tuning of Stable Diffusion Models for Image Data Synthesis on Bicycle Roads
Abstract
The growing use of bicycles has heightened the importance of video-based monitoring for preventing accidents and detecting hazardous situations on bicycle roads. However, collecting video data that adequately reflects variations in season, illumination, and weather requires substantial time, and the high cost of data labeling further limits the development of effective object-detection models. To address these challenges, this study proposes an image synthesis method that simultaneously incorporates structural constraints and style adaptation. The proposed approach integrates Stable Diffusion with ControlNet and Low-Rank Adaptation (LoRA), combining control of scene structure through mask images with fine-grained style adjustment. A dataset was constructed from real CCTV footage, and three Stable Diffusion–based backbone models were evaluated for synthesis performance. Fréchet Inception Distance (FID) and CLIP score were used for quantitative assessment, and the results show that the proposed method achieves superior realism and semantic alignment between images and text. Furthermore, the model successfully generated images reflecting seasonal and weather variations solely through prompt manipulation. This research provides an efficient way to generate diverse environmental conditions that are difficult to capture in practice, thereby alleviating data scarcity in bicycle-road monitoring and supporting the advancement of next-generation object-detection and safety-management technologies.
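As a rough illustration of the image-text alignment metric mentioned in the abstract, the sketch below computes a CLIP-style score from precomputed embeddings. It assumes one common formulation, max(100 · cos(image, text), 0); the abstract does not state which variant or scaling the authors used. The function names (`cosine_similarity`, `clip_score`, `mean_clip_score`) and the `scale` parameter are hypothetical, and in practice the embeddings would come from a CLIP image encoder and text encoder rather than hand-written vectors.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def clip_score(image_embedding, text_embedding, scale=100.0):
    """Assumed formulation: max(scale * cos(image, text), 0).

    The scale of 100 is an assumption, not a value taken from the paper.
    """
    return max(scale * cosine_similarity(image_embedding, text_embedding), 0.0)


def mean_clip_score(pairs, scale=100.0):
    """Average the per-pair score over (image_embedding, text_embedding) pairs."""
    if not pairs:
        raise ValueError("at least one pair is required")
    scores = [clip_score(img, txt, scale) for img, txt in pairs]
    return sum(scores) / len(scores)
```

Under this assumed formulation, a higher mean score over a set of synthesized images and their prompts would indicate closer agreement between what was requested (for example, a season or weather condition) and what was generated.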
