15 2 weeks ago

Deterministic multimodal layer that unifies text, images, PDFs, UI and code into one reproducible JSON schema for downstream evaluation in the S.L.A.V.K.O.™ stack.

vision
ollama run mladen-gertner/slavkofusion-v1

Details


cd1746f36191 · 7.8GB · mllama · 10.7B · Q4_K_M
{{- range $index, $_ := .Messages }}<|start_header_id|>{{ .Role }}<|end_header_id|> {{ .Content }} {
**Llama 3.2** **Acceptable Use Policy** Meta is committed to promoting safe and fair use of its tool
LLAMA 3.2 COMMUNITY LICENSE AGREEMENT Llama 3.2 Version Release Date: September 25, 2024 “Agreemen
You are SlavkoFusion 1.0, the multimodal integration layer of the S.L.A.V.K.O.™ orchestration syst
{ "num_predict": 2048, "repeat_penalty": 1, "temperature": 0, "top_k": 1, "top_p

Readme

🌐 SlavkoFusion 1.0 – Multimodal Integration Layer


Extract → Normalise → Unify all modalities into a single JSON object.

📜 Philosophy

Multimodal AI must be deterministic and reproducible. Fusion normalises text, images, PDFs, UI mock-ups, and code snippets into one canonical feature set that any downstream evaluator can consume.

Core Principles

  1. Unified Schema: All modalities produce the same JSON structure
  2. Deterministic Extraction: The same input always yields the same output
  3. Modality Detection: Automatic detection of input type
  4. Plugin Architecture: Extensible extractors for new modalities
  5. Audit Checkpoint: Second checkpoint in the audit chain
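
One way to check the determinism principle in practice is to fingerprint each extraction result and compare fingerprints across runs. The sketch below is not part of SlavkoFusion: `canonical_json` and `fingerprint` are hypothetical helper names, the `modality` key in the sample data is invented for illustration, and it assumes the output is plain JSON-serialisable data.

```python
import hashlib
import json


def canonical_json(features: dict) -> str:
    """Serialise with sorted keys and fixed separators so equal dicts give equal text."""
    return json.dumps(features, sort_keys=True, separators=(",", ":"), ensure_ascii=False)


def fingerprint(features: dict) -> str:
    """Return the SHA-256 hex digest of the canonical serialisation."""
    return hashlib.sha256(canonical_json(features).encode("utf-8")).hexdigest()


# Hypothetical results from two runs: same content, different key order.
run_a = {"modality": "text", "features": {"text": "hello", "length": 5}}
run_b = {"features": {"length": 5, "text": "hello"}, "modality": "text"}

# Key order does not affect the fingerprint, so the two runs compare equal.
assert fingerprint(run_a) == fingerprint(run_b)
```

Storing the digest next to each result would make a later re-run easy to compare against the original, which fits the audit-chain idea, though how Fusion records its checkpoint is not described here.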

✨ Core Features

| Feature | Description |
|---|---|
| Automatic modality detection | Detects text, image, pdf, ui, code automatically |
| Feature extraction | Objects, layout, OCR, syntax tree extraction |
| Deterministic output | Always the same JSON shape for same input |
| Audit checkpoint #2 | Adds fusion to the audit chain |
| Plug-in extractor framework | Add custom parsers without touching core code |
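
To make the detection and plug-in ideas concrete, here is a minimal sketch of how key-based modality detection and an extractor registry could fit together. It is an illustration under assumptions, not SlavkoFusion's actual internals: `EXTRACTORS`, `register`, `detect_modality`, and this `extract` function are all hypothetical names, and only the payload keys (`text`, `image_base64`, `pdf_base64`, `code`) come from the examples in this README. Distinguishing `ui` from `image` would need more than key inspection, so the sketch leaves it out.

```python
from typing import Callable, Dict

# Hypothetical registry: modality name -> extractor callable.
EXTRACTORS: Dict[str, Callable[[dict], dict]] = {}


def register(modality: str):
    """Decorator that registers an extractor for one modality."""
    def wrap(func: Callable[[dict], dict]) -> Callable[[dict], dict]:
        EXTRACTORS[modality] = func
        return func
    return wrap


def detect_modality(payload: dict) -> str:
    """Pick a modality from the payload keys, checking the most specific keys first."""
    priority = (
        ("pdf_base64", "pdf"),
        ("image_base64", "image"),
        ("code", "code"),
        ("text", "text"),
    )
    for key, modality in priority:
        if payload.get(key):  # key present and non-empty
            return modality
    raise ValueError("payload contains no recognised modality key")


@register("text")
def extract_text(payload: dict) -> dict:
    """Toy text extractor used only to exercise the registry."""
    text = payload["text"]
    return {"text": text, "word_count": len(text.split())}


def extract(payload: dict) -> dict:
    """Detect the modality, then dispatch to the registered extractor."""
    modality = detect_modality(payload)
    if modality not in EXTRACTORS:
        raise KeyError(f"no extractor registered for {modality!r}")
    return {"modality": modality, "features": EXTRACTORS[modality](payload)}
```

With a registry like this, adding a parser for a new modality means writing one decorated function, and the dispatch code stays untouched.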

📦 Installation

git clone https://github.com/FormatDisc/slavko-fusion
cd slavko-fusion
pip install -e .

Dependencies

python>=3.11
pillow>=10.0.0
pytesseract>=0.3.10
pdfplumber>=0.10.0
opencv-python>=4.8.0
transformers>=4.35.0
torch>=2.0.0

🚀 Quick Start

from slavko_fusion import Fusion
import json

fusion = Fusion()

payload = {
    "image_base64": "<BASE64-PNG-IMAGE-DATA>",
    "text": "Review this dashboard"
}

features = fusion.extract(payload)
print(json.dumps(features, indent=2))
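
The `image_base64` value above is a placeholder. A small helper along the following lines could fill it from a file on disk. This is a sketch: `build_image_payload` is a hypothetical name, and it assumes Fusion accepts raw base64 without a data-URI prefix, which this README does not confirm.

```python
import base64
from pathlib import Path


def build_image_payload(image_path: str, prompt: str) -> dict:
    """Read an image file and return a payload with the keys used in the Quick Start."""
    raw = Path(image_path).read_bytes()
    return {
        # b64encode returns bytes; decode to str so the payload is JSON-serialisable.
        "image_base64": base64.b64encode(raw).decode("ascii"),
        "text": prompt,
    }
```

For example, `build_image_payload("dashboard.png", "Review this dashboard")` would produce a dict that can be passed to `fusion.extract`, where `dashboard.png` stands for any local image file.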

📚 Usage Examples

Text Extraction

from slavko_fusion import Fusion

fusion = Fusion()

text_payload = {
    "text": "This is a sample text document for analysis."
}

features = fusion.extract(text_payload)
print(features)

Image Analysis

from slavko_fusion import Fusion

fusion = Fusion()

image_payload = {
    "image_base64": "<BASE64-IMAGE>",
    "text": "Analyze this UI screenshot"
}

features = fusion.extract(image_payload)

for obj in features["features"]["objects"]:
    print(f"Found {obj['label']} at {obj['bbox']}")

print(f"Aspect ratio: {features['features']['layout']['aspectRatio']}")

PDF Processing

from slavko_fusion import Fusion

fusion = Fusion()

pdf_payload = {
    "pdf_base64": "",  # base64-encoded PDF data
    "text": "Extract content from this PDF"
}

features = fusion.extract(pdf_payload)

print(f"Text: {features['features']['text']}")

Code Analysis

from slavko_fusion import Fusion

fusion = Fusion()

code_payload = {
    "code": """
def calculate_risk(data):
    if data['risk_factor'] > 0.8:
        return 'HIGH'
    return 'LOW'
""",
    "language": "python",
    "text": "Analyze this code"
}

features = fusion.extract(code_payload)

print(f"Functions: {features['features']['functions']}")
print(f"Complexity: {features['features']['complexity']}")
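
For intuition about what `functions` and `complexity` might contain, the sketch below derives both from Python's standard `ast` module. It is an assumption-laden illustration, not Fusion's extractor: `summarize_code` is a hypothetical name, and the complexity figure is a simple count of branch nodes plus one, which may differ from whatever metric Fusion reports.

```python
import ast

# Node types counted as decision points in this simplified metric.
BRANCH_NODES = (ast.If, ast.For, ast.While, ast.Try, ast.BoolOp, ast.IfExp)

SAMPLE = """
def calculate_risk(data):
    if data['risk_factor'] > 0.8:
        return 'HIGH'
    return 'LOW'
"""


def summarize_code(source: str) -> dict:
    """Parse Python source and return function names plus a rough complexity count."""
    tree = ast.parse(source)
    functions = sorted(
        node.name
        for node in ast.walk(tree)
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
    )
    branches = sum(isinstance(node, BRANCH_NODES) for node in ast.walk(tree))
    return {"functions": functions, "complexity": 1 + branches}


print(summarize_code(SAMPLE))
# → {'functions': ['calculate_risk'], 'complexity': 2}
```

Because parsing is purely syntactic, the same source always yields the same summary, which is consistent with the deterministic-output goal.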

📊 Performance

| Modality | Avg. Latency | Memory | Model |
|---|---|---|---|
| Text | 5–10 ms | < 100 MB | N/A |
| Image | 300–700 ms | 4–5 GB | phi3-vision |
| PDF | 500–1000 ms | 2–3 GB | pdfplumber + OCR |
| UI | 400–800 ms | 4–5 GB | phi3-vision |
| Code | 10–20 ms | < 200 MB | AST parser |
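
The table does not say how these figures were measured. To collect comparable numbers on your own hardware, a timing helper like the following could be used. This is a sketch: `measure_latency_ms` is a hypothetical name and it measures wall-clock time only, not memory.

```python
import statistics
import time
from typing import Callable


def measure_latency_ms(func: Callable[[], object], runs: int = 20) -> dict:
    """Call func repeatedly and report wall-clock latency statistics in milliseconds."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        func()
        samples.append((time.perf_counter() - start) * 1000.0)
    return {
        "runs": runs,
        "mean_ms": statistics.mean(samples),
        "min_ms": min(samples),
        "max_ms": max(samples),
    }
```

A call such as `measure_latency_ms(lambda: fusion.extract(text_payload))` would time the text path, assuming `fusion` and `text_payload` are defined as in the usage examples above.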

📜 License

BSD-3-Clause – see LICENSE for details.

📞 Contact — Formatdisc

Company: Formatdisc – Computer Programming & Advanced Software Systems
Founder & System Architect: Mladen Gertner
Website: https://formatdisc.hr
Email: mladen@formatdisc.hr
Phone: +385 91 542 1014
Location: Zagreb, Croatia
OIB: 18915075854

GitHub: https://github.com/FormatDisc
LinkedIn: https://linkedin.com/company/formatdisc


Built with S.L.A.V.K.O.™ – Unified. Deterministic. Extensible.