A finance + compliance copilot for cloud teams: paste a requirement (or control ID) and get a concrete mapping to AWS services/config patterns, a checklist of evidence artifacts for audits, automation hooks for continuous evidence collection, and gaps/ass

27f827fe8cd5 · 4.7GB · qwen2 · 7.62B · Q4_K_M
{ "num_ctx": 8192, "temperature": 0.3, "top_p": 0.9 }

FinComp Control Mapper

Finance + Compliance + Cloud Control Mapping Copilot (Local → Ollama Publish)

Turn a requirement (e.g., “NIST 800-53 style control”) into a concrete, audit-ready plan:

  • Intent (plain English)
  • AWS services + configuration patterns
  • Evidence artifacts (what to show auditors)
  • Automation hooks (continuous evidence collection)
  • Gaps & assumptions

This repo is a demo accelerator: a tight persona model in Ollama + a minimal RAG wrapper so the model stays grounded in your curated corpus.


What’s in this repo

  • Modelfile
    Creates an Ollama model (fincomp-control-mapper) that outputs strict JSON.

  • mappings/seed_controls.json
    A starter mapping dataset (25 controls) you can expand.

  • Minimal RAG API (Node.js + SQLite)

    • scripts/ingest.js chunks docs in ./knowledge/, embeds with Ollama embeddings, stores vectors in SQLite
    • server.js exposes POST /map (retrieve top‑K context → ask the model → return JSON)
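
The chunking step that scripts/ingest.js performs could look roughly like the sketch below. This is not the repo's actual code: chunkText, the 800-character window, and the 100-character overlap are hypothetical choices made for illustration.

```javascript
// Hypothetical helper: split a document into overlapping character windows.
// chunkSize and overlap are illustrative defaults, not values from scripts/ingest.js.
function chunkText(text, chunkSize = 800, overlap = 100) {
  if (overlap >= chunkSize) {
    throw new Error("overlap must be smaller than chunkSize");
  }
  const chunks = [];
  const step = chunkSize - overlap; // how far each window advances
  for (let start = 0; start < text.length; start += step) {
    const body = text.slice(start, start + chunkSize).trim();
    if (body.length > 0) {
      chunks.push({ index: chunks.length, start, text: body });
    }
    if (start + chunkSize >= text.length) break; // this window reached the end
  }
  return chunks;
}
```

Each chunk would then be embedded and stored alongside its source path. The overlap keeps a sentence that straddles a window boundary retrievable from at least one chunk.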

Architecture (high level)

User Request (control_id + requirement text + workload)
            |
            v
   [Retriever] embed(query) -> topK chunks from SQLite
            |
            v
  Prompt: requirement + workload + CONTEXT(topK chunks)
            |
            v
   Ollama Chat Model (JSON-only schema)
            |
            v
   Response: AWS design + evidence + automation + gaps

Why this works: RAG keeps answers grounded in your sources (policies, AWS references, control catalog excerpts you’re allowed to store).
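
As a rough sketch of the retrieve-then-prompt flow in the diagram, assuming vectors are stored as plain number arrays and ranked by cosine similarity (the repo may do this differently). The function names, the row shape ({ vector, text, source_path }), and the prompt layout are all hypothetical.

```javascript
// Cosine similarity between two equal-length vectors; returns 0 for a zero vector.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Rank stored rows against the query embedding and keep the k best.
// The row shape is an assumption for illustration.
function topK(queryVector, rows, k = 4) {
  return rows
    .map((row) => ({ ...row, score: cosineSimilarity(queryVector, row.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

// Assemble the prompt: requirement + workload + CONTEXT(topK chunks).
function buildPrompt({ control_id, requirement_text, workload }, chunks) {
  const context = chunks
    .map((c, i) => `[C${i + 1}] (${c.source_path})\n${c.text}`)
    .join("\n\n");
  return [
    `CONTROL: ${control_id}`,
    `REQUIREMENT: ${requirement_text}`,
    `WORKLOAD: ${JSON.stringify(workload)}`,
    "CONTEXT:",
    context,
  ].join("\n");
}
```

Labeling each chunk in the prompt gives the model a stable handle to refer back to, which is what makes the answer traceable to a source.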


Prerequisites

  • Ollama installed and running (ollama serve on most systems)
  • Node.js 18+
  • Optional: jq for pretty JSON in terminal

Quickstart (local)

1) Pull models

ollama pull qwen2.5:7b-instruct
ollama pull nomic-embed-text:latest

2) Create the control mapper model

From this repo root:

ollama create fincomp-control-mapper -f ./Modelfile

3) Install + configure

npm install
cp .env.example .env

4) Add trusted knowledge

Drop .md or .txt files into:

./knowledge/

Recommended to start:

  • Your internal standards (logging, access reviews, change management, incident response runbooks)
  • AWS implementation notes you authored
  • Short control text excerpts / summaries you are allowed to store

Tip: keep the first corpus small (10–30 pages total) so retrieval stays high-signal.

5) Ingest (build vector index)

npm run ingest

6) Start the API

npm start

7) Try a mapping request

curl -s http://localhost:7070/map \
  -H "Content-Type: application/json" \
  -d '{
    "control_id": "AU-2",
    "requirement_text": "Identify and log auditable events and retain them for investigations and reporting.",
    "workload": {
      "cloud": "aws",
      "account_model": "multi-account",
      "data_sensitivity": "financial data",
      "regions": ["us-east-1"]
    }
  }' | jq .

API

GET /health

Returns basic config:

{ "ok": true, "ollama": "...", "chat_model": "...", "embed_model": "..." }

POST /map

Body

{
  "control_id": "AU-2",
  "requirement_text": "…",
  "workload": {
    "cloud": "aws",
    "account_model": "single|multi-account",
    "data_sensitivity": "low|medium|high|financial|pii",
    "regions": ["us-east-1"]
  }
}

Response

  • result: JSON mapping (model output)
  • retrieval: which chunks were used (score + preview)


Output schema (guaranteed)

The model is instructed to output ONLY valid JSON in this shape:

{
  "control_id": "AU-2",
  "requirement_summary": "…",
  "intent_plain_english": "…",
  "aws_control_design": {
    "services": ["CloudTrail", "CloudWatch Logs", "S3"],
    "patterns": ["…"]
  },
  "evidence_artifacts": ["…"],
  "automation_hooks": ["…"],
  "gaps_assumptions": ["…"],
  "confidence": "high"
}
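
Because the shape is only requested through the prompt, a caller may still want to check what comes back. The sketch below is a dependency-free shape check for the schema above. checkMapping and its error messages are hypothetical, and it only checks types, not content.

```javascript
// Field names come from the schema above; the helper itself is illustrative.
const REQUIRED_STRINGS = ["control_id", "requirement_summary", "intent_plain_english", "confidence"];
const REQUIRED_ARRAYS = ["evidence_artifacts", "automation_hooks", "gaps_assumptions"];

function isStringArray(value) {
  return Array.isArray(value) && value.every((v) => typeof v === "string");
}

// Parse raw model output and report every field that does not match the shape.
function checkMapping(raw) {
  let data;
  try {
    data = JSON.parse(raw);
  } catch (err) {
    return { ok: false, errors: ["not valid JSON"] };
  }
  if (data === null || typeof data !== "object" || Array.isArray(data)) {
    return { ok: false, errors: ["top level must be an object"] };
  }
  const errors = [];
  for (const key of REQUIRED_STRINGS) {
    if (typeof data[key] !== "string") errors.push(`${key} must be a string`);
  }
  for (const key of REQUIRED_ARRAYS) {
    if (!isStringArray(data[key])) errors.push(`${key} must be an array of strings`);
  }
  const design = data.aws_control_design;
  if (!design || !isStringArray(design.services) || !isStringArray(design.patterns)) {
    errors.push("aws_control_design must contain services[] and patterns[]");
  }
  return { ok: errors.length === 0, errors, data };
}
```

Collecting all errors instead of stopping at the first one makes a failed response easier to debug in a single pass.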

Demo prompt pack (20 prompts)

Use these to showcase “finance + compliance + cloud” value.

Control mapping (core)

  1. Map AU-2 logging to AWS and list evidence artifacts and continuous evidence automation.
  2. Map AC-2 account lifecycle (joiner/mover/leaver) to AWS, including what evidence lives outside AWS.
  3. Map CM-2 baseline configuration to IaC + AWS Config drift detection; include evidence and exceptions.
  4. Requirement: “All privileged actions must be logged and reviewed weekly.” Produce AWS design + evidence.
  5. Requirement: “No production access without approvals and time-bound elevation.” Provide AWS design + evidence.

Finance flavored scenarios

  1. Workload: checkout/payment API on AWS. Provide a minimum audit evidence architecture (logs, retention, integrity).
  2. Requirement: “Retention: 7 years for financial audit logs.” Provide AWS patterns (archive, immutability) + evidence.
  3. Requirement: “Segregation of duties between deployers and approvers.” Provide AWS + process controls + evidence.
  4. Requirement: “Change management: trace every prod change to an approved ticket.” Provide CI/CD evidence artifacts.
  5. Requirement: “Detect anomalous access to financial data.” Provide monitoring/detection + evidence.

Cloud security & reliability tie‑ins

  1. Boundary protection: map to VPC segmentation + WAF; provide evidence and review cadence.
  2. Encryption at rest: data stores + KMS policies + access logs; provide evidence list.
  3. Secrets management: rotation + access control + CI scanning evidence.
  4. Incident handling: show how Security Hub/GuardDuty alerts flow into IR tickets; evidence and runbooks.
  5. Vulnerability scanning and remediation SLA: evidence, exceptions, and automation.

“Hard mode” for credibility

  1. Provide a mapping with assumptions because the requirement is ambiguous (show confidence=medium/low).
  2. Show how to collect evidence across multi-account AWS Organizations.
  3. Produce an “Evidence Index” (table of artifacts + where to find them).
  4. Produce a “Control-to-Implementation” matrix for 5 controls (AC/AU/CM/IR/SC).
  5. Ask it to flag gaps when the provided CONTEXT is insufficient.

Add citations (recommended)

Right now, the API returns the retrieved chunks, but the model output does not embed citations.

Next upgrade: add citations: [{source, chunk_id, quote}] to the schema and instruct the model to reference retrieval IDs (e.g., [#1]).


Publishing to ollama.com

After validating locally:

ollama signin
ollama cp fincomp-control-mapper bharathreddyjanumpally/fincomp-control-mapper
ollama push bharathreddyjanumpally/fincomp-control-mapper

Roadmap ideas (optional)

  • Add citations[] field
  • Add rule-based validators (reject if evidence_artifacts missing)
  • Add JSON schema validation server-side
  • Add “framework presets” (SOC2-like, NIST-like, PCI-like) as prompt profiles
  • Add a POST /matrix endpoint (multi-control mapping)

Disclaimer

This project provides general technical guidance and audit-prep structure. It is not legal advice.


New features

citations[] field

The model now returns:

"citations": [{ "ref": "C1", "source": "./knowledge/...", "quote": "..." }]
  • ref must be a retrieved chunk id (C1, C2, …)
  • source must match the chunk’s source_path
  • quote must be a short excerpt (<= 200 chars) copied from that chunk
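
The three citation rules above can be checked mechanically. The following is a minimal sketch, not the server's actual validator: validateCitations and the chunk shape ({ ref, source_path, text }) are assumptions made for illustration.

```javascript
const MAX_QUOTE_LENGTH = 200; // from the rule above: quote <= 200 chars

// Return a list of human-readable problems; an empty list means all citations pass.
function validateCitations(citations, chunks) {
  const byRef = new Map(chunks.map((c) => [c.ref, c]));
  const errors = [];
  citations.forEach((cite, i) => {
    const chunk = byRef.get(cite.ref);
    if (!chunk) {
      errors.push(`citations[${i}]: ref ${cite.ref} was not retrieved`);
      return; // the remaining rules need the chunk
    }
    if (cite.source !== chunk.source_path) {
      errors.push(`citations[${i}]: source does not match source_path of ${cite.ref}`);
    }
    if (typeof cite.quote !== "string" || cite.quote.length === 0) {
      errors.push(`citations[${i}]: quote is missing`);
    } else if (cite.quote.length > MAX_QUOTE_LENGTH) {
      errors.push(`citations[${i}]: quote exceeds ${MAX_QUOTE_LENGTH} chars`);
    } else if (!chunk.text.includes(cite.quote)) {
      errors.push(`citations[${i}]: quote not found in ${cite.ref}`);
    }
  });
  return errors;
}
```

The exact-substring test is deliberately strict: a paraphrased quote fails, which is the point of requiring text copied from the chunk.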

Server-side JSON Schema validation + rule-based validators

  • Responses are validated with Ajv against src/schema.js
  • A request fails with HTTP 422 if the model's response is missing required fields (e.g., evidence_artifacts is empty) or its citations are invalid

Framework presets (prompt profiles)

Pass framework in requests:

  • nist (default control-language emphasis)
  • soc2 (monitoring cadence, policies/procedures/evidence)
  • pci (segmentation, encryption, logging, vuln mgmt, scope boundaries)
  • generic

Example:

curl -s http://localhost:7070/map \
  -H "Content-Type: application/json" \
  -d '{
    "framework": "pci",
    "control_id": "SC-7",
    "requirement_text": "Protect system boundaries and control communications.",
    "workload": {"cloud":"aws","data_sensitivity":"financial"}
  }' | jq .
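
One way to model the framework presets is a lookup table of prompt guidance. This sketch assumes nist is the fallback when framework is omitted and that unknown names are rejected. Both behaviors, the guidance wording, and the names FRAMEWORK_PROFILES and resolveFramework are hypothetical.

```javascript
// Guidance strings paraphrase the preset descriptions; the real prompt profiles may differ.
const FRAMEWORK_PROFILES = {
  nist: "Emphasize control language.",
  soc2: "Emphasize monitoring cadence, policies, procedures, and evidence.",
  pci: "Emphasize segmentation, encryption, logging, vulnerability management, and scope boundaries.",
  generic: "Use framework-neutral wording.", // assumption: the source does not describe this preset
};

// Resolve the request's framework field to a profile (assumed fallback: nist).
function resolveFramework(name) {
  const key = String(name ?? "nist").toLowerCase();
  if (!Object.hasOwn(FRAMEWORK_PROFILES, key)) {
    throw new Error(`unknown framework: ${name}`);
  }
  return { framework: key, guidance: FRAMEWORK_PROFILES[key] };
}
```

The guidance string would be prepended to the prompt, so adding a preset means adding one table entry rather than touching request handling.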

POST /matrix (multi-control mapping)

Generate mappings for multiple items in one request:

curl -s http://localhost:7070/matrix \
  -H "Content-Type: application/json" \
  -d '{
    "framework":"nist",
    "workload":{"cloud":"aws","account_model":"multi-account"},
    "items":[
      {"control_id":"AU-2","requirement_text":"Identify and log auditable events."},
      {"control_id":"AC-2","requirement_text":"Manage accounts through lifecycle processes."}
    ]
  }' | jq .

Response includes:

  • mappings: list of full mapping JSON objects
  • matrix: compact view (control_id → services → top evidence)
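
The compact matrix view can be derived from the full mappings. The sketch below assumes "top evidence" means the first few entries of evidence_artifacts. That cutoff, the top_evidence field name, and buildMatrix itself are illustrative rather than taken from the server code.

```javascript
// Reduce full mapping objects to control_id -> services -> top evidence.
function buildMatrix(mappings, topEvidence = 2) {
  return mappings.map((m) => ({
    control_id: m.control_id,
    services: m.aws_control_design?.services ?? [],
    top_evidence: (m.evidence_artifacts ?? []).slice(0, topEvidence),
  }));
}
```

Missing fields fall back to empty arrays, so one incomplete mapping does not break the whole matrix.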