Mr. You Zhang | Engineering | Best Researcher Award

University of Rochester, United States

You (Neil) Zhang is a Ph.D. candidate in Electrical and Computer Engineering at the University of Rochester, specializing in machine learning for speech, acoustics, and audio signal processing. His research focuses on spatial audio (HRTF personalization), speech anti-spoofing, singing voice deepfake detection, and audio-visual learning. He has held research roles at Dolby, Meta, Microsoft, Tencent, and ByteDance, contributing to work on perceptual HRTF modeling and audio-visual deepfake detection.

Profile:

Google Scholar

🎓 Education:

Ph.D. in Electrical & Computer Engineering, University of Rochester (Expected 2025)
M.S., University of Rochester
B.Eng., University of Electronic Science & Technology of China
Exchange Program, UC Berkeley

🧠 Research Interests:

Spatial Audio & HRTF Personalization 🎧
Speech Deepfake Detection & Audio Security 🔐
Multimodal Learning: Audio-Visual & Emotional Speech Synthesis 🎥🗣️

🏆 Honors & Fellowships:

IEEE SPS Scholarship (2024)
NIJ Graduate Research Fellowship (2023)
ICASSP Rising Star in Signal Processing (2023)
Open Scholarship Award @ UR (2025)

🧪 Research & Industry Experience:

Dolby Labs 🎶 – Sr. Researcher, Multimodal Spatial Audio
Meta Reality Labs 🧠 – HRTF Perceptual Learning
Microsoft, Tencent, ByteDance, IngenID 💼 – AI R&D Internships
Audio Information Research Lab, UR 🎙️ – Deepfake Detection, AV Speech, HRTF Neural Fields

📚 Selected Publications:

Published in IEEE T-MM, IEEE SPL, ICASSP, Interspeech, and NAACL
Co-organizer of SVDD Challenge at SLT 2024 & MIREX 2024
Contributor to Handbook of Biometric Anti-spoofing (Springer)

🎤 Talks & Tutorials:

Invited speaker at CMU, NII Japan, ISCA SPSC
Tutorials @ ASA, ICME, AES (Topics: HRTF, Deepfakes, ML for Acoustics)

🎓 Teaching & Mentorship:

TA for Machine Learning, Audio Signal Processing, Random Processes
Mentored 10+ undergraduate and graduate students at UR, Tsinghua, and UESTC

💼 Professional Service:

Reviewer for IEEE TASLP, TPAMI, ICASSP, Interspeech, CVPR Workshops
Member: IEEE, ASA, ACM, AES
DEI Committee @ UR ECE | Organizer of AR/VR Events

💻 Skills:

Programming: Python, MATLAB, C, Java
Tools: Git, Linux, PyTorch, Slurm
Languages: English 🇺🇸, Mandarin 🇨🇳

🏃 Hobbies & More:

Half-Marathon Finisher 🏅
Enjoys stand-up paddleboarding, travel, and badminton 🌊✈️🏸

Google Scholar Citation Metrics:

Citations: 751 (All time) | 751 (Since 2020)
h-index: 12 (All time) | 12 (Since 2020)
i10-index: 14 (All time) | 14 (Since 2020)

Top Publications:

One-class Learning Towards Synthetic Voice Spoofing Detection
Y. Zhang, F. Jiang, Z. Duan
IEEE Signal Processing Letters, vol. 28, pp. 937–941, 2021.

Speech Driven Talking Face Generation from a Single Image and an Emotion Condition
S.E. Eskimez, Y. Zhang, Z. Duan
IEEE Transactions on Multimedia, vol. 24, pp. 3480–3490, 2021.

UR Channel-Robust Synthetic Speech Detection System for ASVspoof 2021
X. Chen, Y. Zhang*, G. Zhu*, Z. Duan
ASVspoof 2021 Workshop, 2021.

SingFake: Singing Voice Deepfake Detection
Y. Zang, Y. Zhang*, M. Heydari, Z. Duan
IEEE ICASSP, 2024.

An Empirical Study on Channel Effects for Synthetic Voice Spoofing Countermeasure Systems
Y. Zhang, G. Zhu, F. Jiang, Z. Duan
Interspeech, pp. 4309–4313, 2021.

SAMO: Speaker Attractor Multi-Center One-Class Learning for Voice Anti-Spoofing
S. Ding, Y. Zhang, Z. Duan
IEEE ICASSP, 2023.

A Probabilistic Fusion Framework for Spoofing Aware Speaker Verification
Y. Zhang, G. Zhu, Z. Duan
Odyssey: The Speaker and Language Recognition Workshop, pp. 77–84, 2022.

Global HRTF Personalization Using Anthropometric Measures
Y. Wang, Y. Zhang, Z. Duan, M. Bocko
Audio Engineering Society (AES) 150th Convention, 2021.

Rethinking Audio-Visual Synchronization for Active Speaker Detection
A. Wuerkaixi, Y. Zhang, Z. Duan, C. Zhang
IEEE MLSP, 2022.

CtrSVDD: A Benchmark Dataset and Baseline Analysis for Controlled Singing Voice Deepfake Detection
Y. Zang, J. Shi, Y. Zhang, et al.
Interspeech, pp. 4783–4787, 2024.

VERSA: A Versatile Evaluation Toolkit for Speech, Audio, and Music
J. Shi, H. Shim, J. Tian, Y. Zhang, et al.
NAACL (Demo Track), 2025.

HRTF Field: Unifying Measured HRTF Magnitude Representation with Neural Fields
Y. Zhang, Y. Wang, Z. Duan
IEEE ICASSP, 2023.

SVDD 2024: The Inaugural Singing Voice Deepfake Detection Challenge
Y. Zhang, Y. Zang, J. Shi, R. Yamamoto, T. Toda, Z. Duan
IEEE SLT, pp. 782–787, 2024.

DyViSE: Dynamic Vision-Guided Speaker Embedding for Audio-Visual Speaker Diarization
A. Wuerkaixi, K. Yan, Y. Zhang, Z. Duan, C. Zhang
IEEE MMSP, 2022.

Predicting Global Head-Related Transfer Functions from Scanned Head Geometry Using Deep Learning and Compact Representations
Y. Wang, Y. Zhang, Z. Duan, M. Bocko
arXiv preprint, arXiv:2207.14352, 2022.

Learning Arousal-Valence Representation from Categorical Emotion Labels of Speech
E. Zhou, Y. Zhang, Z. Duan
IEEE ICASSP, 2024.

SVDD Challenge 2024: A Singing Voice Deepfake Detection Challenge Evaluation Plan
Y. Zhang, Y. Zang, J. Shi, et al.
arXiv preprint, arXiv:2405.05244, 2024.

ASVspoof 5: Design, Collection and Validation of Resources for Spoofing, Deepfake, and Adversarial Attack Detection Using Crowdsourced Speech
X. Wang, H. Delgado, Y. Zhang, et al.
Computer Speech & Language, 2025.

Emotional Dimension Control in Language Model-Based Text-to-Speech: Spanning a Broad Spectrum of Human Emotions
K. Zhou, Y. Zhang, S. Zhao, et al.
arXiv preprint, arXiv:2409.16681, 2024.

Mitigating Cross-Database Differences for Learning Unified HRTF Representation
Y. Wen, Y. Zhang, Z. Duan
IEEE WASPAA, 2023.
