SDMC-Seg: A single-step latent diffusion framework for multi-class stroke lesion segmentation in brain CT
Abstract
Accurate segmentation of stroke subtypes in non-contrast CT is essential for timely diagnosis and treatment planning. However, the inherently low tissue contrast in ischemic regions and the heterogeneous appearance of lesions across subtypes present significant challenges. This study introduces SDMC-Seg, a latent diffusion-based framework designed for multi-class stroke segmentation in brain CT images. Unlike previous latent diffusion methods limited to binary tasks, SDMC-Seg addresses multi-class segmentation by encoding each stroke subtype into a distinct semantic channel, allowing the model to disentangle lesion-specific features in latent space for improved classification and delineation. The architecture integrates a fine-tuned mask autoencoder and a trainable CT image encoder, projecting both modalities into a shared, semantically meaningful latent space. Segmentation masks are reconstructed via a conditional denoising network that operates on noisy latent representations and is conditioned on the CT image. In contrast to traditional diffusion methods requiring multiple iterative steps, SDMC-Seg performs single-step latent inference, enabling faster and more stable predictions while preserving structural detail. The model is evaluated on a curated, anonymized subset of a publicly available stroke CT dataset comprising 2,223 annotated slices across ischemic, hemorrhagic, and normal classes. It achieves an average Dice coefficient of 0.79 and IoU of 0.67, outperforming U-Net, ResUNet, and Swin-UNETR baselines, with a 9% Dice improvement in ischemic stroke segmentation in particular. Ablation studies demonstrate that adapting both the image and mask encoders to the target domain significantly enhances performance. These results suggest that SDMC-Seg provides an efficient and scalable solution for stroke lesion delineation in CT, with strong potential for integration into real-time clinical workflows. Remaining limitations in detecting early ischemic changes highlight the need for future extensions incorporating multimodal imaging or temporal progression modeling.
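
To make the per-class channel encoding and the reported overlap metrics concrete, the following is a minimal NumPy sketch, not the SDMC-Seg implementation. The label convention (0 = normal, 1 = ischemic, 2 = hemorrhagic), the function names `to_channels` and `dice_and_iou`, and the toy 4x4 masks are all assumptions made for illustration; the abstract does not specify them.

```python
import numpy as np

# Hypothetical label convention (not stated in the source):
# 0 = normal/background, 1 = ischemic, 2 = hemorrhagic
NUM_CLASSES = 3


def to_channels(label_map, num_classes=NUM_CLASSES):
    """Turn an (H, W) integer label map into a (C, H, W) stack of binary masks,
    one channel per class."""
    class_ids = np.arange(num_classes)[:, None, None]
    return (class_ids == label_map[None]).astype(np.float32)


def dice_and_iou(pred, target, eps=1e-7):
    """Per-channel Dice and IoU for binary masks of shape (C, H, W)."""
    inter = (pred * target).sum(axis=(1, 2))
    pred_sum = pred.sum(axis=(1, 2))
    target_sum = target.sum(axis=(1, 2))
    dice = (2 * inter + eps) / (pred_sum + target_sum + eps)
    iou = (inter + eps) / (pred_sum + target_sum - inter + eps)
    return dice, iou


# Toy ground truth and prediction (illustrative values only)
target = np.array([[0, 0, 1, 1],
                   [0, 0, 1, 1],
                   [2, 2, 0, 0],
                   [2, 2, 0, 0]])
pred = np.array([[0, 0, 1, 1],
                 [0, 0, 0, 1],
                 [2, 2, 0, 0],
                 [2, 0, 0, 0]])

dice, iou = dice_and_iou(to_channels(pred), to_channels(target))
for name, d, j in zip(["normal", "ischemic", "hemorrhagic"], dice, iou):
    print(f"{name}: Dice={d:.3f}, IoU={j:.3f}")
print(f"mean Dice={dice.mean():.3f}, mean IoU={iou.mean():.3f}")
```

Averaging the per-class values, as in the last line, is one plausible reading of "average Dice" and "IoU" here; whether the paper averages over classes, slices, or both is not stated in the abstract.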