You Zhang | Engineering | Best Researcher Award

University of Rochester, United States

Mr. You (Neil) Zhang is a Ph.D. candidate in Electrical and Computer Engineering at the University of Rochester, specializing in machine learning for speech, acoustics, and audio signal processing. His research focuses on spatial audio (personalization of head-related transfer functions, or HRTFs), speech anti-spoofing, singing voice deepfake detection, and audio-visual learning. He has held research roles at Dolby, Meta, Microsoft, Tencent, and ByteDance, contributing to perceptual HRTF modeling and audio-visual deepfake detection.

Profile:

🎓 Education:

  • Ph.D. in Electrical & Computer Engineering (Expected 2025)
    University of Rochester

  • M.S., University of Rochester

  • B.Eng., University of Electronic Science & Technology of China

  • Exchange Program, UC Berkeley

🧠 Research Interests:

  • Spatial Audio & HRTF Personalization 🎧

  • Speech Deepfake Detection & Audio Security 🔐

  • Multimodal Learning: Audio-Visual & Emotional Speech Synthesis 🎥🗣️

🏆 Honors & Fellowships:

  • IEEE SPS Scholarship (2024)

  • NIJ Graduate Research Fellowship (2023)

  • ICASSP Rising Star in Signal Processing (2023)

  • Open Scholarship Award @ UR (2025)

🧪 Research & Industry Experience:

  • Dolby Labs 🎶 – Sr. Researcher, Multimodal Spatial Audio

  • Meta Reality Labs 🧠 – HRTF Perceptual Learning

  • Microsoft, Tencent, ByteDance, IngenID 💼 – AI R&D Internships

  • Audio Information Research Lab, UR 🎙️ – Deepfake Detection, AV Speech, HRTF Neural Fields

📚 Selected Publications:

  • Published in IEEE T-MM, IEEE SPL, ICASSP, Interspeech, and NAACL

  • Co-organizer of SVDD Challenge at SLT 2024 & MIREX 2024

  • Contributor to Handbook of Biometric Anti-spoofing (Springer)

🎤 Talks & Tutorials:

  • Invited speaker at CMU, NII Japan, ISCA SPSC

  • Tutorials @ ASA, ICME, AES (Topics: HRTF, Deepfakes, ML for Acoustics)

🎓 Teaching & Mentorship:

  • TA for Machine Learning, Audio Signal Processing, Random Processes

  • Mentored 10+ undergraduate and graduate students at UR, Tsinghua, and UESTC

💼 Professional Service:

  • Reviewer for IEEE TASLP, TPAMI, ICASSP, Interspeech, CVPR Workshops

  • Member: IEEE, ASA, ACM, AES

  • DEI Committee @ UR ECE | Organizer of AR/VR Events

💻 Skills:

  • Programming: Python, MATLAB, C, Java

  • Tools: Git, Linux, PyTorch, Slurm

  • Languages: English 🇺🇸, Mandarin 🇨🇳

🏃 Hobbies & More:

  • Half-Marathon Finisher 🏅

  • Loves stand-up paddleboarding, travel, badminton 🌊✈️🏸

Google Scholar Citation Metrics:

  • Citations: 751 (All time) | 751 (Since 2020)

  • h-index: 12 (All time) | 12 (Since 2020)

  • i10-index: 14 (All time) | 14 (Since 2020)

Top Publications:

  1. One-class Learning Towards Synthetic Voice Spoofing Detection
    Y. Zhang, F. Jiang, Z. Duan
    IEEE Signal Processing Letters, vol. 28, pp. 937–941, 2021.

  2. Speech Driven Talking Face Generation from a Single Image and an Emotion Condition
    S.E. Eskimez, Y. Zhang, Z. Duan
    IEEE Transactions on Multimedia, vol. 24, pp. 3480–3490, 2021.

  3. UR Channel-Robust Synthetic Speech Detection System for ASVspoof 2021
    X. Chen, Y. Zhang*, G. Zhu*, Z. Duan
    ASVspoof 2021 Workshop, 2021.

  4. SingFake: Singing Voice Deepfake Detection
    Y. Zang, Y. Zhang*, M. Heydari, Z. Duan
    IEEE ICASSP, 2024.

  5. An Empirical Study on Channel Effects for Synthetic Voice Spoofing Countermeasure Systems
    Y. Zhang, G. Zhu, F. Jiang, Z. Duan
    Interspeech, pp. 4309–4313, 2021.

  6. SAMO: Speaker Attractor Multi-Center One-Class Learning for Voice Anti-Spoofing
    S. Ding, Y. Zhang, Z. Duan
    IEEE ICASSP, 2023.

  7. A Probabilistic Fusion Framework for Spoofing Aware Speaker Verification
    Y. Zhang, G. Zhu, Z. Duan
    Odyssey: The Speaker and Language Recognition Workshop, pp. 77–84, 2022.

  8. Global HRTF Personalization Using Anthropometric Measures
    Y. Wang, Y. Zhang, Z. Duan, M. Bocko
    Audio Engineering Society (AES) 150th Convention, 2021.

  9. Rethinking Audio-Visual Synchronization for Active Speaker Detection
    A. Wuerkaixi, Y. Zhang, Z. Duan, C. Zhang
    IEEE MLSP, 2022.

  10. CtrSVDD: A Benchmark Dataset and Baseline Analysis for Controlled Singing Voice Deepfake Detection
    Y. Zang, J. Shi, Y. Zhang, et al.
    Interspeech, pp. 4783–4787, 2024.

  11. VERSA: A Versatile Evaluation Toolkit for Speech, Audio, and Music
    J. Shi, H. Shim, J. Tian, Y. Zhang, et al.
    NAACL (Demo Track), 2025.

  12. HRTF Field: Unifying Measured HRTF Magnitude Representation with Neural Fields
    Y. Zhang, Y. Wang, Z. Duan
    IEEE ICASSP, 2023.

  13. SVDD 2024: The Inaugural Singing Voice Deepfake Detection Challenge
    Y. Zhang, Y. Zang, J. Shi, R. Yamamoto, T. Toda, Z. Duan
    IEEE SLT, pp. 782–787, 2024.

  14. DyViSE: Dynamic Vision-Guided Speaker Embedding for Audio-Visual Speaker Diarization
    A. Wuerkaixi, K. Yan, Y. Zhang, Z. Duan, C. Zhang
    IEEE MMSP, 2022.

  15. Predicting Global Head-Related Transfer Functions from Scanned Head Geometry Using Deep Learning and Compact Representations
    Y. Wang, Y. Zhang, Z. Duan, M. Bocko
    arXiv preprint, arXiv:2207.14352, 2022.

  16. Learning Arousal-Valence Representation from Categorical Emotion Labels of Speech
    E. Zhou, Y. Zhang, Z. Duan
    IEEE ICASSP, 2024.

  17. SVDD Challenge 2024: A Singing Voice Deepfake Detection Challenge Evaluation Plan
    Y. Zhang, Y. Zang, J. Shi, et al.
    arXiv preprint, arXiv:2405.05244, 2024.

  18. ASVspoof 5: Design, Collection and Validation of Resources for Spoofing, Deepfake, and Adversarial Attack Detection Using Crowdsourced Speech
    X. Wang, H. Delgado, Y. Zhang, et al.
    Computer Speech & Language, 2025.

  19. Emotional Dimension Control in Language Model-Based Text-to-Speech: Spanning a Broad Spectrum of Human Emotions
    K. Zhou, Y. Zhang, S. Zhao, et al.
    arXiv preprint, arXiv:2409.16681, 2024.

  20. Mitigating Cross-Database Differences for Learning Unified HRTF Representation
    Y. Wen, Y. Zhang, Z. Duan
    IEEE WASPAA, 2023.