4 Downloads Updated 2 days ago
4bad8923451b · 4.4GB
Research model · Apache 2.0 · Not a medical device · Not a certified instrument · Use at your own risk. Outputs are statistical pattern matches against training data, not analytical measurements. See full disclaimer below.
A small vision-language model for microscopy. Gemma 4 E2B fine-tuned on 122,399 image-question-answer pairs covering 145+ taxonomic genera across diatoms, freshwater and marine zooplankton, fungal spores, and fish larvae. Q4_K_M GGUF, 3.4 GB. Runs on a phone.
ollama run brinzaengineeringai/microlens-v2
Give it a microscopy image. Get back one line of structured taxonomic text. Same image, same answer, every time.
This is a diatom of the genus Navicula, specifically Navicula gregaria.
That format is on purpose. v2 is built for pipelines that need to ingest thousands of images and feed the result into a database. No prose, no chain-of-thought, no markdown surprises. If you want a longer scientific description (morphology, habitat, identification cues), use microlens-v3 instead.
Stratified evaluation on 220 held-out validation images.
| Category | Category match | Genus match | Notes |
|---|---|---|---|
| Diatoms | 100% | ~50% | Largest class in training (8k+ samples) |
| Freshwater zooplankton | 97% | ~45% | Rotifers, copepods, ciliates |
| Marine zooplankton | 100% | ~45% | Copepods, ostracods, krill larvae |
| Fungal spores | 100% | ~50% | Plant-pathogenic conidia |
| Fish larvae | 100% | n/a | Pseudo-genus, see Limitations |
For reference, random guess across the 145+ genera is around 0.7%.
Measured on actual hardware:
The Android client uses llama.cpp + mtmd via JNI. Desktop runs llama-server and reads SSE.
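On the desktop side, consuming llama-server's SSE stream mostly means pulling `data:` lines and concatenating the `content` fields. A sketch of that parsing step — the `data: {json}` line shape and `content` field follow llama.cpp's `/completion` streaming format, but check it against your server version:

```python
import json

def sse_events(lines):
    """Yield decoded JSON payloads from an SSE stream.

    llama-server emits lines of the form 'data: {...}'; blank
    keep-alive lines are skipped.
    """
    for line in lines:
        if isinstance(line, bytes):
            line = line.decode("utf-8")
        line = line.strip()
        if line.startswith("data: "):
            yield json.loads(line[len("data: "):])

def collect_answer(lines):
    """Concatenate streamed 'content' chunks into the final one-line answer."""
    return "".join(ev.get("content", "") for ev in sse_events(lines))
```

With `requests`, feed it `response.iter_lines()` from a streaming POST to the server's completion endpoint (`"stream": true` in the request body).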
MicroLens v2 is a research and educational artefact published under Apache 2.0. It is a fine-tuned neural network, not a regulated instrument.
Designed for:
This model is NOT, and must not be treated as:
The model’s output is a probabilistic pattern match against the training data distribution, not a physical or analytical measurement. The model can be confidently wrong, particularly on:
No warranty. This software is provided “AS IS”, without warranty of any kind, express or implied, including but not limited to the warranties of merchantability, fitness for a particular purpose and non-infringement. In no event shall the author or contributors be liable for any claim, damages or other liability, whether in an action of contract, tort, or otherwise, arising from, out of, or in connection with the model or the use or other dealings in the model.
You assume all risk when downloading, deploying, modifying, or using this model on your own hardware. Always have qualified personnel verify any result that informs a regulatory, environmental, clinical, or health-related decision.
A few things to know:
For fish larvae, the underlying dataset has no species-level annotation. The model returns the category name as the “genus” for this class. Don’t import that into a taxonomic database.
The output format is rigid. That’s a feature for parsers and a limitation for humans. Use v3 if you want flowing text.
Long-tail genera — the roughly 100 with fewer than 100 training samples each — score noticeably lower than the 30 most-common ones. Per-genus precision and recall live in the GitHub model card.
There is no uncertainty score in the standard output. If you need confidence values, pull logprobs from llama-server.
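Turning those logprobs into a usable confidence value is a one-liner of math: a natural-log probability exponentiates back to a 0–1 probability, and the geometric mean over the answer's tokens gives a single crude score. A sketch, assuming you have already extracted per-token logprobs from the server response (the response schema varies across llama.cpp releases, so adapt the extraction to your build):

```python
import math

def token_confidence(logprob: float) -> float:
    """Convert a natural-log probability back to a 0-1 probability."""
    return math.exp(logprob)

def answer_confidence(token_logprobs: list[float]) -> float:
    """Geometric mean of per-token probabilities across the answer:
    a crude but monotonic proxy for how sure the model was overall."""
    if not token_logprobs:
        return 0.0
    return math.exp(sum(token_logprobs) / len(token_logprobs))
```

Treat the score as a relative ranking signal for triage (e.g. "send the bottom 5% for human review"), not a calibrated probability of correctness.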
Three pieces did the heavy lifting:
FastVisionModel with 4-bit QLoRA on a single RTX 3090 Ti: roughly a 2× speedup and half the VRAM compared to vanilla Transformers, which is what made this trainable on consumer hardware in the first place. Apache 2.0. Built for the Kaggle Gemma 4 Good Hackathon 2026, Health & Sciences track.
Serghei Brinza · Vienna, Austria · 2026