Cloud (Preview)

Run larger models, faster, using Ollama's cloud
$20/mo
Ollama's cloud lets you:
- Speed up model inference
  Run models using datacenter-grade hardware, returning responses much faster.
- Run larger models
  Upgrade to the newest hardware, making it possible to run larger models.
- Privacy first
  Ollama does not retain your data to ensure privacy and security.
- Save battery life
  Take the load of running models off your Mac, Windows, or Linux computer, giving you performance back for your other apps.
Frequently asked questions
- What is Ollama's cloud?
  Ollama's cloud is a new way to run open models using datacenter-grade hardware. Many new models are too large to fit on widely available GPUs, or run very slowly. Ollama's cloud provides a way to run these models fast while using Ollama's App, CLI, and API.
- Does Ollama's cloud work with Ollama's CLI?
  Yes! See the docs for more information; there is also a short usage sketch after this list.
- Does Ollama's cloud work with Ollama's API and JavaScript/Python libraries?
  Yes! See the docs for more information; a hedged Python example follows this list.
- What data do you retain in Ollama's cloud?
  Ollama does not log or retain any queries.
- Where is the hardware that powers Ollama's cloud located?
  All hardware is located in the United States.
- What are the usage limits for Ollama's cloud?
  Ollama's cloud includes hourly and daily limits to avoid capacity issues. Usage-based pricing will be available soon, allowing models to be consumed in a metered fashion.
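
To make the CLI and library answers above concrete, here is a minimal sketch using the official Python library. The host URL, the Bearer-token header, and the "-cloud" model tag are assumptions for illustration, not confirmed values; the docs are the authority on the actual cloud endpoint, available cloud model names, and how API keys are issued. On the CLI side, the docs describe running a cloud model with the same "ollama run" command used for local models.

    # A hedged sketch: chat with a cloud-hosted model via the official Python library.
    # pip install ollama
    from ollama import Client

    # Assumptions for illustration only: the cloud host, the API key header, and the
    # "-cloud" model tag are placeholders; check the docs for the real values.
    client = Client(
        host="https://ollama.com",
        headers={"Authorization": "Bearer YOUR_API_KEY"},
    )

    # Same chat call shape as when talking to a local Ollama server.
    response = client.chat(
        model="gpt-oss:120b-cloud",
        messages=[{"role": "user", "content": "Why can cloud inference be faster?"}],
    )
    print(response["message"]["content"])

The JavaScript library follows the same chat(model, messages) shape; again, see the docs for cloud-specific configuration.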