Leveraging LLMs in your Obsidian Notes

September 21, 2023

Today I saw a post on Hacker News about another plugin for Obsidian that integrates with ChatGPT. There are a bunch of these tools out there, and I love seeing the different ways to use them with Obsidian: making connections, letting you go further with your notes. Some commenters suggested these tools do the work you should be doing yourself, but I think they empower you in new and incredible ways.

Talk to your notes

The first and most obvious thing you'll probably want to do is converse with your notes, asking questions to gain further insights. It would be convenient if you could just point the model at your notes and be done with it, but most models can't accept that much content at once.

When you ask a question, not all of your notes are relevant, so you need to find the relevant parts and hand just those to the model. Obsidian has a Search function, but it only matches exact words and phrases, and we need to search by concept. That's where embeddings come in: each chunk of text gets turned into a vector that captures its meaning, so chunks about similar ideas end up close together. To use them, we have to create an index, and it turns out to be pretty easy to do.
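To make the idea concrete, here is a tiny sketch of what semantic search does under the hood. The embed function is a hypothetical stand-in for whatever embedding model you choose; the point is that each chunk becomes a vector, and we keep the chunks whose vectors are most similar to the question's vector.

// Illustrative only: embed() is a stand-in for any embedding model
// that turns a string into a vector of numbers.
declare function embed(text: string): Promise<number[]>;

// Cosine similarity: how closely two vectors point in the same direction.
function cosineSimilarity(a: number[], b: number[]): number {
  const dot = a.reduce((sum, v, i) => sum + v * b[i], 0);
  const magA = Math.sqrt(a.reduce((sum, v) => sum + v * v, 0));
  const magB = Math.sqrt(b.reduce((sum, v) => sum + v * v, 0));
  return dot / (magA * magB);
}

// Rank note chunks by how conceptually close they are to the question.
async function rankChunks(question: string, chunks: string[]) {
  const q = await embed(question);
  const scored = await Promise.all(
    chunks.map(async (text) => ({ text, score: cosineSimilarity(q, await embed(text)) }))
  );
  return scored.sort((a, b) => b.score - a.score);
}

In practice we won't write any of this by hand; the libraries below handle chunking, embedding, and similarity search for us.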

Let’s build the indexer

When you create an Obsidian plugin, you can have it do something when it loads, and other things when you trigger a command, open a note, or perform other actions in Obsidian. So we want the plugin to build an understanding of your notes when it starts, and to save its progress so it doesn't have to regenerate the index every time. Let's look at a code sample that indexes one of our notes. I am going to use Llama Index here, but LangChain is another great option.

import { VectorStoreIndex, serviceContextFromDefaults, storageContextFromDefaults, MarkdownReader } from "llamaindex";

// Split notes into 256-token chunks, and persist the index to ./storage
const service_context = serviceContextFromDefaults({ chunkSize: 256 });
const storage_context = await storageContextFromDefaults({ persistDir: "./storage" });

// Read the Markdown file whose path was passed on the command line
const mdpath = process.argv[2];
const mdreader = new MarkdownReader();
const thedoc = await mdreader.loadData(mdpath);

The first two lines set things up: we use the simple data store that comes with Llama Index (Chroma DB is another popular option), and the storage context says we'll persist everything we index to disk. Next, I get the path to the file and initialize a reader, then read the file. Llama Index knows Markdown, so it reads it in appropriately and indexes it. It also knows about PDFs, text files, Notion docs, and more. It's not just storing the words; it's also capturing the meaning of the words and how they relate to the other words in this text.

// Chunk and embed the document, then save it into the persistent index
await VectorStoreIndex.fromDocuments(thedoc, { storageContext: storage_context, serviceContext: service_context });

Now, generating the embeddings in this step uses a service from OpenAI, but it's separate from ChatGPT: a different model and a different product. There are alternatives in LangChain that will do this locally, just a bit slower, and Ollama also has an embed function. You could also run those services on a superfast self-hosted instance in the cloud and then shut it down when indexing is done.
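For the Ollama route, here is a rough sketch of what that call might look like. It assumes Ollama is running locally on its default port and that the /api/embeddings endpoint and response shape match your installed version, so treat it as a starting point rather than gospel.

// A rough sketch of generating an embedding with a local Ollama server.
// Assumes the default port (11434) and the /api/embeddings request and
// response shapes of the version installed at the time of writing.
async function embedWithOllama(text: string): Promise<number[]> {
  const res = await fetch("http://localhost:11434/api/embeddings", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ model: "llama2", prompt: text }),
  });
  const data = await res.json();
  return data.embedding; // a vector of floats representing the text's meaning
}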

And now let’s search our notes

Now we have an index for this file. Obsidian can give us a list of all the files in the vault, so we can run the same indexing step for each of them, and since we are persisting the results, it's a one-time job; a rough sketch of that loop is below. Once the index exists, how can we ask a question? We want some code that will find the relevant bits in our notes, hand them to the model, and use that information to come up with an answer.
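Here is a minimal sketch of how that loop might sit inside a plugin. The Obsidian calls (onload, getMarkdownFiles, cachedRead) are the standard plugin API; indexNote is a hypothetical wrapper around the Llama Index code above, and NotesAIPlugin is just a placeholder name.

import { Plugin } from "obsidian";

// Hypothetical wrapper around the Llama Index code above: takes one note's
// path and text and adds it to the persisted index.
declare function indexNote(path: string, text: string): Promise<void>;

export default class NotesAIPlugin extends Plugin {
  async onload() {
    // Walk every Markdown file in the vault and index it once
    for (const file of this.app.vault.getMarkdownFiles()) {
      const text = await this.app.vault.cachedRead(file);
      await indexNote(file.path, text);
    }
  }
}

With the vault indexed, the next snippet handles the question side.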

// Reopen the index we persisted earlier
const storage_context = await storageContextFromDefaults({ persistDir: "./storage" });
const index = await VectorStoreIndex.init({ storageContext: storage_context });

// Ask the retriever for the 5 chunks most relevant to the prompt
const ret = index.asRetriever();
ret.similarityTopK = 5;
const prompt = process.argv[2];
const response = await ret.retrieve(prompt);

// Stitch the matching chunks into a system prompt for the model
const systemPrompt = `Use the following text to help come up with an answer to the prompt: ${response.map(r => r.node.toJSON().text).join(" - ")}`;

So in this code sample, we initialize the index using the content we already processed. The retrieve call takes the prompt, finds all the chunks of our notes that are relevant, and returns their text to us. We asked for the top 5 matches here, so I will get 5 chunks of text from our notes. With that raw information, we can build a system prompt that tells our model what to lean on when we ask a question.

import { Ollama } from "ollama-node"; // the npm library mentioned below

const ollama = new Ollama();
ollama.setModel("llama2"); // a model already pulled locally with "ollama pull llama2"
ollama.setSystemPrompt(systemPrompt); // hand over the retrieved note chunks
const genout = await ollama.generate(prompt); // ask the question

And so now we get to use the model. I am using a library I created a few days ago that is on npm. I can set the model to use llama2, which is already downloaded to my machine using the command ollama pull llama2. You can try different models to find the one that works best for you.

To get answers back quickly, you will want to stick to a small model, but you also need one whose input context is large enough to accept all of our text chunks. Here that's up to 5 chunks of 256 tokens each, roughly 1,280 tokens plus the question itself, which fits comfortably in llama2's 4,096-token context window. I set the model to use the system prompt that includes our text chunks, then just ask the question, and it will give you the answer within a few seconds.

Awesome. Now, our Obsidian plugin would display that answer appropriately.
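What "appropriately" means is up to you. As one minimal option, the answer could go into an Obsidian notice; Notice is the real Obsidian API, and the answer argument is simply the text produced by the generate call above.

import { Notice } from "obsidian";

// One simple way to surface the result: pop it into an Obsidian notice.
// `answer` is the text produced by the generate call above.
function showAnswer(answer: string) {
  new Notice(answer, 10000); // keep the notice on screen for 10 seconds
}

A nicer plugin might write the answer into the current note or show it in a modal instead.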

What else can we do?

You can also think about summarizing the text, or finding the best keywords that match a note and adding them to the front matter so you can make better connections between notes. I've played around with generating 10 good questions and answers to send to Anki. You will want to experiment with different models and prompts to accomplish these different tasks; it is very easy to change the prompt, and even swap in different model weights, to suit the job. A rough sketch of a keyword pass is below.
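This reuses the same library calls shown earlier; the prompt wording, the comma-separated output format, and the file-reading boilerplate are all assumptions you'd tune for your own setup.

import { readFileSync } from "fs";
import { Ollama } from "ollama-node"; // same library as above

// Read a note's text from the path passed on the command line
const noteText = readFileSync(process.argv[2], "utf8");

// Ask the model for keywords; the prompt and output format are just a starting point
const keywordBot = new Ollama();
keywordBot.setModel("llama2");
keywordBot.setSystemPrompt(
  "List the five keywords that best describe the following note. " +
  "Reply with only the keywords, separated by commas."
);
const keywordsOut = await keywordBot.generate(noteText);
// The reply can then be split on commas and written into the note's front matter.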

I hope this post has given you some ideas for how to build the next great plugin for Obsidian or any other note-taking tool. With the latest local AI tools, like the ones you can find at ollama.com, adding this kind of power is a breeze, and I hope you'll show me what you've got going on.