This repository hosts a Persian (Farsi) sentence embedding model, converted from the original heydariAI/persian-embeddings Hugging Face model into GGUF format and ported to Ollama for easy local use.
With this model, you can generate high-quality vector embeddings for Persian text — useful for semantic search, clustering, classification, recommendation systems, and more.
ollama pull aligh4699/heydariAI-persian-embeddings
Example: embed three Persian sentences and compare them by cosine similarity:
import numpy as np
import ollama
from numpy.linalg import norm

MODEL = "aligh4699/heydariAI-persian-embeddings"

def calculate_cosine_sim(emb_a, emb_b):
    # Cosine similarity: dot product divided by the product of the vector norms
    return np.dot(emb_a, emb_b) / (norm(emb_a) * norm(emb_b))

if __name__ == "__main__":
    # Get embeddings (each call returns a list of vectors; take the first)
    emb1 = np.array(ollama.embed(model=MODEL, input="سلام دنیا. صبح بسیار زیبایی است.")["embeddings"][0])
    emb2 = np.array(ollama.embed(model=MODEL, input="درود جهان. صبحتان پرطراوت باشد.")["embeddings"][0])
    emb3 = np.array(ollama.embed(model=MODEL, input="خداحاظ تا ابد. من از امروز متنفرم.")["embeddings"][0])
    # Cosine similarity for each pair of sentences
    similarity1 = calculate_cosine_sim(emb1, emb2)
    similarity2 = calculate_cosine_sim(emb1, emb3)
    similarity3 = calculate_cosine_sim(emb2, emb3)
    print("Cosine similarity (1,2):", similarity1)
    print("Cosine similarity (1,3):", similarity2)
    print("Cosine similarity (2,3):", similarity3)
Output:
Cosine similarity (1,2): 0.8702924086504231
Cosine similarity (1,3): 0.38726373879395914
Cosine similarity (2,3): 0.46212684228451056
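The same similarity score can drive a small semantic search. The following is a minimal sketch, not part of the original model card. The helper names `cosine_sim`, `rank_by_similarity`, and `search` are hypothetical, and the sketch assumes that `ollama.embed` accepts a list of strings and returns one vector per input, in order. The ranking helpers use plain Python lists so they can be tried without NumPy or a running Ollama server.

```python
import math


def cosine_sim(a, b):
    """Cosine similarity between two equal-length vectors (plain lists)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_emb, doc_embs):
    """Return (index, score) pairs for doc_embs, most similar first."""
    scores = [(i, cosine_sim(query_emb, emb)) for i, emb in enumerate(doc_embs)]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


def search(query, documents, model="aligh4699/heydariAI-persian-embeddings"):
    """Embed the query and documents with Ollama, then rank the documents."""
    # Imported here so the ranking helpers above work without Ollama installed.
    import ollama

    # Assumption: a list input yields one embedding per string, in input order.
    response = ollama.embed(model=model, input=[query] + list(documents))
    query_emb, *doc_embs = response["embeddings"]
    ranked = rank_by_similarity(query_emb, doc_embs)
    return [(documents[i], score) for i, score in ranked]
```

With the model pulled and the Ollama server running, `search("سلام دنیا. صبح بسیار زیبایی است.", docs)` returns the texts in `docs` paired with their scores, best match first. Sending the query and all documents in one `embed` call avoids a separate request per sentence.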
Thanks to: