-
One-class Learning Towards Synthetic Voice Spoofing Detection
Y. Zhang, F. Jiang, Z. Duan
IEEE Signal Processing Letters, vol. 28, pp. 937–941, 2021.
-
Speech Driven Talking Face Generation from a Single Image and an Emotion Condition
S.E. Eskimez, Y. Zhang, Z. Duan
IEEE Transactions on Multimedia, vol. 24, pp. 3480–3490, 2022.
-
UR Channel-Robust Synthetic Speech Detection System for ASVspoof 2021
X. Chen, Y. Zhang*, G. Zhu*, Z. Duan
ASVspoof 2021 Workshop, 2021.
-
SingFake: Singing Voice Deepfake Detection
Y. Zang, Y. Zhang*, M. Heydari, Z. Duan
IEEE ICASSP, 2024.
-
An Empirical Study on Channel Effects for Synthetic Voice Spoofing Countermeasure Systems
Y. Zhang, G. Zhu, F. Jiang, Z. Duan
Interspeech, pp. 4309–4313, 2021.
-
SAMO: Speaker Attractor Multi-Center One-Class Learning for Voice Anti-Spoofing
S. Ding, Y. Zhang, Z. Duan
IEEE ICASSP, 2023.
-
A Probabilistic Fusion Framework for Spoofing Aware Speaker Verification
Y. Zhang, G. Zhu, Z. Duan
Odyssey: The Speaker and Language Recognition Workshop, pp. 77–84, 2022.
-
Global HRTF Personalization Using Anthropometric Measures
Y. Wang, Y. Zhang, Z. Duan, M. Bocko
Audio Engineering Society (AES) 150th Convention, 2021.
-
Rethinking Audio-Visual Synchronization for Active Speaker Detection
A. Wuerkaixi, Y. Zhang, Z. Duan, C. Zhang
IEEE MLSP, 2022.
-
CtrSVDD: A Benchmark Dataset and Baseline Analysis for Controlled Singing Voice Deepfake Detection
Y. Zang, J. Shi, Y. Zhang, et al.
Interspeech, pp. 4783–4787, 2024.
-
VERSA: A Versatile Evaluation Toolkit for Speech, Audio, and Music
J. Shi, H. Shim, J. Tian, Y. Zhang, et al.
NAACL (Demo Track), 2025.
-
HRTF Field: Unifying Measured HRTF Magnitude Representation with Neural Fields
Y. Zhang, Y. Wang, Z. Duan
IEEE ICASSP, 2023.
-
SVDD 2024: The Inaugural Singing Voice Deepfake Detection Challenge
Y. Zhang, Y. Zang, J. Shi, R. Yamamoto, T. Toda, Z. Duan
IEEE SLT, pp. 782–787, 2024.
-
DyViSE: Dynamic Vision-Guided Speaker Embedding for Audio-Visual Speaker Diarization
A. Wuerkaixi, K. Yan, Y. Zhang, Z. Duan, C. Zhang
IEEE MMSP, 2022.
-
Predicting Global Head-Related Transfer Functions from Scanned Head Geometry Using Deep Learning and Compact Representations
Y. Wang, Y. Zhang, Z. Duan, M. Bocko
arXiv preprint, arXiv:2207.14352, 2022.
-
Learning Arousal-Valence Representation from Categorical Emotion Labels of Speech
E. Zhou, Y. Zhang, Z. Duan
IEEE ICASSP, 2024.
-
SVDD Challenge 2024: A Singing Voice Deepfake Detection Challenge Evaluation Plan
Y. Zhang, Y. Zang, J. Shi, et al.
arXiv preprint, arXiv:2405.05244, 2024.
-
ASVspoof 5: Design, Collection and Validation of Resources for Spoofing, Deepfake, and Adversarial Attack Detection Using Crowdsourced Speech
X. Wang, H. Delgado, Y. Zhang, et al.
Computer Speech & Language, 2025.
-
Emotional Dimension Control in Language Model-Based Text-to-Speech: Spanning a Broad Spectrum of Human Emotions
K. Zhou, Y. Zhang, S. Zhao, et al.
arXiv preprint, arXiv:2409.16681, 2024.
-
Mitigating Cross-Database Differences for Learning Unified HRTF Representation
Y. Wen, Y. Zhang, Z. Duan
IEEE WASPAA, 2023.