Leveraging multimodal learning to address oral health inequities: A public health policy perspective
Abstract
Oral health disparities persist as a critical public health challenge, particularly in marginalized communities where barriers to care and preventive interventions are common. Conventional public health policy approaches have struggled to address these disparities, largely because of fragmented data systems and insufficient integration of the multifaceted determinants of oral health. This study introduces a multimodal learning framework designed to improve policy-making and oral health outcomes by unifying diverse data sources—including clinical, socioeconomic, behavioral, and environmental factors—into a single analytical model. The framework incorporates a Cross-Modal Coherence Encoder (CMCE), which uses structure-preserving attention mechanisms to align and integrate heterogeneous data modalities, capturing intricate intra- and inter-modal relationships. In addition, a Semantic Anchor Matching (SAM) mechanism refines the learning process by introducing latent semantic anchors, ensuring robust and semantically consistent representations even when data are incomplete or noisy. Experimental evaluations indicate that the proposed framework substantially outperforms traditional unimodal approaches in predicting oral health outcomes and identifying vulnerable populations. By revealing complex interdependencies among diverse determinants, this integrative methodology yields actionable insights for formulating targeted, evidence-based public health policies. The results highlight the potential of advanced machine learning techniques to advance oral health equity and reduce systemic disparities in underserved populations.
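To make the two mechanisms described above concrete, the sketch below illustrates the general ideas in simplified form: one modality attending to another via scaled dot-product attention (the kind of cross-modal alignment the CMCE performs), and fused representations being matched to a small set of latent semantic anchors by cosine similarity (the intuition behind SAM). This is a minimal illustrative sketch, not the paper's implementation: the function names, dimensions, and the choice of plain dot-product attention and cosine matching are assumptions made for exposition.

```python
import numpy as np

def softmax(x, axis=-1):
    # numerically stable softmax
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_modal_attention(queries, keys_values):
    # One modality (queries) attends to another (keys_values) via
    # scaled dot-product attention, producing aligned representations.
    d = queries.shape[-1]
    scores = queries @ keys_values.T / np.sqrt(d)
    return softmax(scores, axis=-1) @ keys_values

def anchor_match(features, anchors):
    # Assign each fused representation to its nearest latent semantic
    # anchor by cosine similarity (illustrative stand-in for SAM).
    f = features / np.linalg.norm(features, axis=1, keepdims=True)
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    return np.argmax(f @ a.T, axis=1)

rng = np.random.default_rng(0)
clinical = rng.normal(size=(5, 8))  # 5 records, 8-dim clinical features
socio = rng.normal(size=(5, 8))     # matching socioeconomic features
anchors = rng.normal(size=(3, 8))   # 3 hypothetical semantic anchors

fused = cross_modal_attention(clinical, socio)
labels = anchor_match(fused, anchors)
print(fused.shape, labels.shape)  # (5, 8) (5,)
```

In a full system, the attention weights and anchors would be learned end-to-end rather than fixed, and missing or noisy modalities would be handled through the anchor-regularized representations the abstract describes.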