openbmb

minicpm-o2.6

A GPT-4o Level MLLM for Vision, Speech and Multimodal Live Streaming on Your Phone

27.3K Pulls 13 Tags Updated 1 year ago

A GPT-4o Level MLLM for Single Image, Multi Image and High-FPS Video Understanding on Your Phone

21.8K Pulls 11 Tags Updated 2 months ago

A Pocket-Sized MLLM for Ultra-Efficient Image and Video Understanding on Your Phone

12.1K Pulls 13 Tags Updated 2 months ago

A Gemini 2.5 Flash Level MLLM for Vision, Speech, and Full-Duplex Multimodal Live Streaming on Your Phone

8,119 Pulls 12 Tags Updated 5 months ago

Highly efficient large language models (LLMs) designed explicitly for end-side devices

6,720 Pulls 4 Tags Updated 2 months ago

A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone

2,903 Pulls 12 Tags Updated 1 year ago

A GPT-4V Level MLLM for Single Image, Multi Image and Video on Your Phone

2,235 Pulls 12 Tags Updated 11 months ago

Highly efficient large language models (LLMs) designed explicitly for end-side devices

1,446 Pulls 1 Tag Updated 10 months ago

A GPT-4V Level Multimodal LLM on Your Phone

487 Pulls 13 Tags Updated 1 year ago