- 👋 Hi, I’m Ye Zhen, a PhD student at HKUST.
- 👀 I’m interested in multimodal generation and speech synthesis.
- If you have any questions, please feel free to contact me at zhenye312@gmail.com.
Speech synthesis, Audio generation, Speech LLM
- Hong Kong University of Science and Technology
- Hong Kong
- @zhenye234
- https://huggingface.co/ZhenYe234
- in/zhen-ye-25734a358
Pinned
- Talker-T2AV (Public)
  Talker-T2AV: Joint Talking Audio-Video Generation with Autoregressive Diffusion Modeling
  Python 30
- LLaSA_training (Public)
  LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis
- X-Codec-2.0 (Public)
  Codec for the paper "LLaSA: Scaling Train-time and Inference-time Compute for LLaMA-based Speech Synthesis"
- FlashSpeech (Public)
  ACM MM 2024. FlashSpeech: Efficient Zero-Shot Speech Synthesis
- CoMoSpeech (Public)
  ACM MM 2023. CoMoSpeech: One-Step Speech and Singing Voice Synthesis via Consistency Model
