Real-Time Voice Interaction Large Models

Key Metrics

VITA (open source)

Pasted image 20241010205723.png|500

MoshiChat (open source)

Code: https://github.com/kyutai-labs/moshi
Models: https://huggingface.co/collections/kyutai/moshi-v01-release-66eaeaf3302bef6bd9ad7acd
Report: https://kyutai.org/Moshi.pdf
Release date: 2024.07.04

LLaMA-Omni (open source)

Pasted image 20241010204745.png|500

GLM-4-Voice (open source)

Pasted image 20241229155856.png|500

Freeze-Omni (open source)

Pasted image 20241229160508.png|500

Mini-Omni (open source)

Pasted image 20241229161733.png|500

Qwen2-Audio (open source; not real-time dialogue)

Pasted image 20241229162717.png|500
