ollama run mladen-gertner/slavkofusion-v1
Extract → Normalise → Unify all modalities into a single JSON object.
Multimodal AI must be deterministic and reproducible. Fusion normalises images, PDFs, UI mock-ups, and code snippets into a canonical feature set that can be fed to any downstream evaluator.
| Feature | Description |
|---|---|
| Automatic modality detection | Classifies each input as text, image, pdf, ui, or code |
| Feature extraction | Extracts objects, layout, OCR text, and syntax trees |
| Deterministic output | Produces the same JSON shape for the same input |
| Audit checkpoint #2 | Adds fusion to the audit chain |
| Plug-in extractor framework | Add custom parsers without touching core code |
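One way to check the "Deterministic output" claim from the outside is to hash a canonical serialisation of each result and compare digests across runs. The sketch below is not part of Fusion: `fingerprint` is a hypothetical helper, and the two sample dicts (including the `modality` key) only stand in for whatever `fusion.extract` returns.

```python
import hashlib
import json


def fingerprint(features: dict) -> str:
    """Return a SHA-256 hex digest of a canonical JSON serialisation.

    Hypothetical helper, not part of slavko_fusion.
    """
    canonical = json.dumps(
        features,
        sort_keys=True,         # key order must not affect the digest
        separators=(",", ":"),  # no insignificant whitespace
        ensure_ascii=False,
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


# Stand-ins for two results of fusion.extract on the same payload.
run_a = {"modality": "text", "features": {"text": "hello"}}
run_b = {"features": {"text": "hello"}, "modality": "text"}

# Same content in a different key order yields the same digest.
assert fingerprint(run_a) == fingerprint(run_b)
```

Storing the digest next to each result would make later drift easy to spot, although it only detects a change and does not explain it.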
git clone https://github.com/FormatDisc/slavko-fusion
cd slavko-fusion
pip install -e .
python>=3.11
pillow>=10.0.0
pytesseract>=0.3.10
pdfplumber>=0.10.0
opencv-python>=4.8.0
transformers>=4.35.0
torch>=2.0.0
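Before running the examples, it may help to confirm which of the listed packages are actually installed. The following is a minimal sketch using only the standard library; `installed_versions` is a hypothetical helper, and it reports versions without comparing them against the minimums listed above.

```python
import sys
from importlib import metadata

# Distribution names taken from the requirements list above.
REQUIRED = [
    "pillow",
    "pytesseract",
    "pdfplumber",
    "opencv-python",
    "transformers",
    "torch",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in names:
        try:
            found[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            found[name] = None
    return found


if __name__ == "__main__":
    print("Python", ".".join(map(str, sys.version_info[:3])))
    for name, version in installed_versions(REQUIRED).items():
        print(f"{name}: {version or 'not installed'}")
```

Note that `pytesseract` is a wrapper around the Tesseract OCR engine, which has to be installed separately from the Python package.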
from slavko_fusion import Fusion
import json
fusion = Fusion()
payload = {
    "image_base64": "<BASE64-PNG-IMAGE-DATA>",
    "text": "Review this dashboard"
}
features = fusion.extract(payload)
print(json.dumps(features, indent=2))
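The `image_base64` field has to be filled from a real file at some point. A minimal sketch of that step follows, assuming the field takes plain base64 text with no `data:` URI prefix (the README does not say which). `image_payload` and the file name in the usage note are hypothetical.

```python
import base64
from pathlib import Path


def image_payload(path, prompt):
    """Build a payload dict with the file's bytes base64-encoded as ASCII text.

    Hypothetical helper; assumes Fusion expects bare base64 without a
    data-URI prefix.
    """
    raw = Path(path).read_bytes()
    return {
        "image_base64": base64.b64encode(raw).decode("ascii"),
        "text": prompt,
    }
```

A call such as `image_payload("dashboard.png", "Review this dashboard")` would then produce a dict with the same two keys as the payload above.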
fusion = Fusion()
text_payload = {
    "text": "This is a sample text document for analysis."
}
features = fusion.extract(text_payload)
print(features)
fusion = Fusion()
image_payload = {
    "image_base64": "<BASE64-IMAGE>",
    "text": "Analyze this UI screenshot"
}
features = fusion.extract(image_payload)
for obj in features["features"]["objects"]:
    print(f"Found {obj['label']} at {obj['bbox']}")
print(f"Aspect ratio: {features['features']['layout']['aspectRatio']}")
```python
fusion = Fusion()
pdf_payload = {
    "pdf_base64": "",  # supply the base64-encoded PDF here
    "text": "Extract content from this PDF"
}
features = fusion.extract(pdf_payload)
print(f"Text: {features['features']['text']}")
```

```python
fusion = Fusion()
code_payload = {
    "code": """
def calculate_risk(data):
    if data['risk_factor'] > 0.8:
        return 'HIGH'
    return 'LOW'
""",
    "language": "python",
    "text": "Analyze this code"
}
features = fusion.extract(code_payload)
print(f"Functions: {features['features']['functions']}")
print(f"Complexity: {features['features']['complexity']}")
```
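To give a feel for what `functions` and `complexity` could contain, here is a rough sketch built on Python's standard `ast` module. It is an illustration only: how Fusion actually computes these fields is not documented here, `summarise` is a hypothetical name, and the branch count is a crude stand-in for cyclomatic complexity.

```python
import ast

SOURCE = """
def calculate_risk(data):
    if data['risk_factor'] > 0.8:
        return 'HIGH'
    return 'LOW'
"""


def summarise(source):
    """Return function names and a rough branch count for Python source."""
    tree = ast.parse(source)
    nodes = list(ast.walk(tree))
    functions = [n.name for n in nodes if isinstance(n, ast.FunctionDef)]
    branches = sum(
        isinstance(n, (ast.If, ast.For, ast.While, ast.Try)) for n in nodes
    )
    # 1 + number of branching nodes: an approximation, not Fusion's metric.
    return {"functions": functions, "complexity": 1 + branches}


print(summarise(SOURCE))
# → {'functions': ['calculate_risk'], 'complexity': 2}
```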
| Modality | Avg. Latency | Memory | Model |
|---|---|---|---|
| Text | 5–10 ms | < 100 MB | N/A |
| Image | 300–700 ms | 4–5 GB | phi3-vision |
| PDF | 500–1000 ms | 2–3 GB | pdfplumber + OCR |
| UI | 400–800 ms | 4–5 GB | phi3-vision |
| Code | 10–20 ms | < 200 MB | AST parser |
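Latency figures like those in the table depend on hardware, so it can be useful to measure them locally. The sketch below times any callable with `time.perf_counter`; `measure_latency_ms` is a hypothetical helper and the lambda is a stand-in workload, to be swapped for `fusion.extract` when timing a real call.

```python
import statistics
import time


def measure_latency_ms(fn, payload, runs=20):
    """Call fn(payload) `runs` times and return (median, max) latency in ms."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn(payload)
        samples.append((time.perf_counter() - start) * 1000.0)
    return statistics.median(samples), max(samples)


# Stand-in workload; replace the lambda with fusion.extract for real numbers.
median_ms, worst_ms = measure_latency_ms(
    lambda p: len(p["text"]), {"text": "sample"}
)
print(f"median {median_ms:.3f} ms, worst {worst_ms:.3f} ms")
```

Reporting the median alongside the worst case keeps a single slow first call, such as model loading, from skewing the picture.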
BSD-3-Clause – see LICENSE for details.
Company: Formatdisc – Computer Programming & Advanced Software Systems
Founder & System Architect: Mladen Gertner
Website: https://formatdisc.hr
Email: mladen@formatdisc.hr
Phone: +385 91 542 1014
Location: Zagreb, Croatia
OIB: 18915075854
GitHub: https://github.com/FormatDisc
LinkedIn: https://linkedin.com/company/formatdisc
Built with S.L.A.V.K.O.™ – Unified. Deterministic. Extensible.