Abstract: Human speech perception naturally integrates visual and auditory cues, with lip movements providing critical disambiguation in noisy environments where audio signals are degraded (SNR ≤-5 ...