AI use cases & workflows

Building a persistent AI workflow

Started with AI-assisted coding in Cursor a year ago. It has since become a workflow I use every day: 173 named sessions, 18 integrated services, a 150-file curated memory. The setup is persistent, so context carries across sessions instead of being rebuilt each time.

173 named sessions · 18 integrations · 150 memory files · 13 agentic workflows
Field Report
Building a Persistent AI Workflow — What 170+ Sessions with Claude Code Looked Like
command-center-case-study.md
findings

Insights

Practical findings from building and running the system. Last updated May 5, 2026 — session 174.

how it works

Workflow

Every session starts with context loading, continues with real work, and ends with automated knowledge capture. Two real examples:

session lifecycle
# 1. Context loads automatically
claude > Reading memory/BRAIN.md
intent patterns, integrations, rules
claude > Reading memory/LINEAGE.md
last handoff: previous session
claude > Reading memory/MEMORY.md
lookup manual, memory files indexed
claude > tail memory/activity.log
# 2. Real work happens
human > build landing page, futuristic design
claude > Reading identity, projects, CV...
claude > ✓ Created, committed, deployed to Vercel
# 3. Auto-capture on session end
hook > activity.log ← [2026-04-17] [project] — title
hook > transcript → .staging/ for review
hook > git push → command-center backup
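
The first capture step can be sketched in Python. This is a minimal sketch, assuming the log line format shown in the lifecycle above (`[date] [project]`, a long dash, then the title). The names `format_entry` and `append_activity` and the path handling are hypothetical, not the actual hook.

```python
from datetime import date
from pathlib import Path

SEPARATOR = " \u2014 "  # the long dash used in the log line format above


def format_entry(day: date, project: str, title: str) -> str:
    """Build one activity-log line: [YYYY-MM-DD] [project] <dash> title."""
    return f"[{day.isoformat()}] [{project}]{SEPARATOR}{title}"


def append_activity(log_path: Path, day: date, project: str, title: str) -> str:
    """Append one entry to the log, creating the folder and file if needed."""
    entry = format_entry(day, project, title)
    log_path.parent.mkdir(parents=True, exist_ok=True)
    with log_path.open("a", encoding="utf-8") as log:
        log.write(entry + "\n")
    return entry
```

Appending rather than rewriting keeps the log cheap to write on every session end and easy to read back with `tail`.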
connected services

Tools & Integrations

12 services connected through one setup.

system design

Architecture

Four layers keep context persistent across sessions.

Lineage
Every session has handoff notes. Each session inherits from the last.
Memory
Files organized by type: identity, rules, projects, knowledge, feedback. Organized, not dumped.
Hooks
Auto-capture on session end. Activity log, transcript excerpts, context reinject on compact.
Calibration
Feedback rules from real work. Structured disagreement, scope pushback, no recitation.
Session Start → Load Brain → Load Lineage → Real Work → Auto-Capture → Next Session
principles

Security

Six principles that run through every public-facing or remote component. Measured, enforced, audited across 130+ sessions. Security posture here comes from what actually gets enforced — not from what looks thorough.

Deny by Default

Remote entrypoints reach 1 integration out of 9. Everything else stays blocked until a human is at the keyboard. Whitelist, not blocklist — blocklists expand slower than the attack surface.

session 118 · -57% polling
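
A minimal sketch of the allowlist idea, assuming a single gate function in front of remote requests. The integration name `"status"` and the names `REMOTE_ALLOWLIST` and `is_allowed` are hypothetical; the real setup may look quite different.

```python
# Hypothetical name for the one integration reachable from remote entrypoints.
REMOTE_ALLOWLIST = frozenset({"status"})


def is_allowed(integration: str, human_present: bool) -> bool:
    """Deny by default: a remote call passes only if explicitly allowlisted."""
    if human_present:
        return True  # attended local sessions are outside this remote gate
    return integration in REMOTE_ALLOWLIST
```

An integration added later is blocked for remote use without any extra rule, which is the property a blocklist cannot give.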

Defense in Depth

Three tiers of different kinds, not one long chain: integrity (HMAC signing + replay nonce), runtime isolation (filesystem + network sandbox), and data access (row-level security at the DB plus model-layer allowedTools). Compromising one tier doesn’t give you the others.

3 tiers · integrity · isolation · access

Measure Before Removing

Legacy guardrail tracked over 40 sessions: 0 real threats caught, 23 legitimate actions blocked per week. Removed only after data proved it was creating more attack surface than it covered.

session 089 · data-driven

Secrets in One Place

A single encrypted env file. Never read via cat, never dropped into shell history, never copied into model context. Rotation uses interactive input — exposed tokens trigger a 2-phase replace.

getpass · no-history
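
A minimal sketch of the interactive-input step, assuming Python's standard `getpass` module is what the tag refers to. The function name `read_new_token` and the injectable `prompt_fn` parameter are hypothetical; the encrypted env file and the 2-phase replace are not modeled here.

```python
import getpass
from typing import Callable


def read_new_token(prompt_fn: Callable[[str], str] = getpass.getpass) -> str:
    """Read a replacement token without echoing it or putting it on a command line."""
    token = prompt_fn("New token: ").strip()
    if not token:
        raise ValueError("empty token; keeping the old one")
    return token
```

Because the value is typed at a prompt rather than passed as an argument, it never appears in shell history or in a process listing.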

Command Integrity

Remote commands are HMAC-signed with a replay nonce. The nonce insert sits in the execute path, not the claim path — so replayed commands fail closed after verification. Prod-verified on a real attempted replay.

session 121 · HMAC + nonce
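
A minimal sketch of signed commands with replay protection, using Python's standard `hmac` module. The message layout (nonce, newline, command), the class name `CommandGate`, and the in-memory nonce set are assumptions for illustration; the queue's claim path is not modeled.

```python
import hashlib
import hmac


def sign(secret: bytes, nonce: str, command: str) -> str:
    """HMAC-SHA256 over nonce and command, hex-encoded."""
    message = f"{nonce}\n{command}".encode("utf-8")
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


class CommandGate:
    def __init__(self, secret: bytes) -> None:
        self._secret = secret
        self._seen_nonces: set[str] = set()  # a real system would persist this

    def execute(self, nonce: str, command: str, signature: str) -> bool:
        """Return True only for a correctly signed, never-before-seen command."""
        expected = sign(self._secret, nonce, command)
        if not hmac.compare_digest(expected, signature):
            return False  # bad signature: reject without touching nonce state
        if nonce in self._seen_nonces:
            return False  # replay: fail closed
        self._seen_nonces.add(nonce)  # recorded in the execute path
        return True
```

Recording the nonce only after the signature check means an attacker cannot burn valid nonces with forged requests, and a verified command runs at most once.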

Least-Privilege Identity

Every automation has its own role with the smallest permission set it needs. The bot can’t write where a human would; row-level security is enforced at the database, not only in application code.

RLS · per-identity role
Every rule on this page has measurement or a prod incident behind it. If a security control can’t be audited or hasn’t been tested under real conditions, it doesn’t stay in the system — that’s how Security Through Subtraction above ended up in the insights list.
research

Academic Work

Controlled experiment with real data.

The Effect of System Prompts on LLM Output Quality in Coaching Contexts

MSc. Thesis — NEWTON University, 2026 · Supervised by PhDr. Mgr. et Mgr. Barbora Pánková, Ph.D., MBA
  • Controlled experiment: 200 outputs from 4 LLMs (ChatGPT 5.2, Gemini 3 Flash, Claude Sonnet 4.5, DeepSeek V3)
  • 5 coaching personas × 2 conditions (prompted vs. unprompted), evaluated with a 5-criterion rubric
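  • Result: system prompts raise the mean score by 3.13 points on a 15-point scale, which is 21% of the maximum and about 30% relative to the unprompted mean (t(99)=8.65, p<0.001, Cohen's d=0.87)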
  • Discovered "over-prompting" — explicit personas can degrade naturally strong models

Hypothesis

Does defining a coaching persona via system prompt measurably improve LLM output quality compared to the same model without any role instruction? And does the effect depend on the type of persona, the model, and the task?

Method

200 outputs were generated by 4 commercially available LLMs across 5 coaching personas under 2 conditions—with and without a system prompt. Each output was scored on a 5-criterion rubric (role consistency, structure, specificity, handling of uncertainty, usability) on a 1–3 scale. Maximum score: 15 per output. Identical user inputs in both conditions ensure observed differences are attributable to the system prompt, not input variability.
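
The scoring described above can be sketched as follows. The criterion names come from the rubric; the function names `total_score` and `paired_difference` are hypothetical and not taken from the thesis analysis script.

```python
CRITERIA = (
    "role consistency",
    "structure",
    "specificity",
    "handling of uncertainty",
    "usability",
)


def total_score(ratings: dict[str, int]) -> int:
    """Sum five 1-3 ratings into a total between 5 and 15."""
    if set(ratings) != set(CRITERIA):
        raise ValueError("ratings must cover exactly the five criteria")
    if any(not 1 <= value <= 3 for value in ratings.values()):
        raise ValueError("each rating must be 1, 2, or 3")
    return sum(ratings.values())


def paired_difference(with_prompt: dict[str, int], without_prompt: dict[str, int]) -> int:
    """Score with the system prompt minus score without, for one matched pair."""
    return total_score(with_prompt) - total_score(without_prompt)
```

A positive difference means the system prompt helped on that pair; a negative one is what the thesis calls over-prompting.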

200 outputs · 4 models · 5 personas · 5 rubric criteria · 100 matched pairs

Results

Overall Effect of System Prompt (max 15 pts)
Without prompt: 10.55
With prompt: 13.68
Difference: +3.13 points (+21% of the 15-point maximum), p<0.001
Effect by Persona (difference, with prompt minus without)
Life Coach: +1.35
Product Owner: +2.40
Demo Guru: +3.45
Pragmatic Coach: +3.95
Philosophical Advisor: +4.50
Effect by Model (difference, with prompt minus without)
ChatGPT 5.2: +3.48
Gemini 3 Flash: +2.64
Claude Sonnet 4.5: +1.88
DeepSeek V3: +4.52
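
The reported figures can be cross-checked with a few lines of arithmetic. Two assumptions are made here: the design is balanced (equal numbers of pairs per persona and per model), so simple means of the group gains should equal the overall gain, and the reported Cohen's d is computed on the paired differences, in which case d = t / sqrt(n).

```python
import math

persona_gains = [1.35, 2.40, 3.45, 3.95, 4.50]
model_gains = [3.48, 2.64, 1.88, 4.52]
without_prompt, with_prompt, max_score = 10.55, 13.68, 15

gain = with_prompt - without_prompt      # overall gain in points
share_of_max = gain / max_score          # gain as a share of the 15-point maximum
relative_gain = gain / without_prompt    # gain relative to the unprompted mean

mean_persona_gain = sum(persona_gains) / len(persona_gains)
mean_model_gain = sum(model_gains) / len(model_gains)

t_stat, n_pairs = 8.65, 100
d_paired = t_stat / math.sqrt(n_pairs)   # effect size implied by t(99) = 8.65

print(f"gain={gain:.2f}  share_of_max={share_of_max:.1%}  relative={relative_gain:.1%}")
print(f"persona mean={mean_persona_gain:.2f}  model mean={mean_model_gain:.2f}  d={d_paired:.3f}")
```

Under these assumptions both group means come out at 3.13, the +21% corresponds to the share of the maximum score (the gain relative to the unprompted mean is closer to 30%), and the implied effect size agrees with the reported 0.87.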

The Over-Prompting Discovery

In 12 out of 100 matched pairs (12%), the system prompt decreased output quality. This "over-prompting" effect concentrated in two patterns:

By model: Gemini 3 Flash (6 cases) and Claude Sonnet 4.5 (5 cases)—models with naturally strong coaching behavior that gets disrupted by explicit instructions.

By persona: Life Coach (6 of 12 cases)—a persona defined by implicit style (tone, reflective questions) rather than explicit output structure. When the model already "knows" how to coach, adding a persona prompt can constrain rather than improve.

Over-Prompting Cases (score with prompt < score without)
Life Coach: Gemini −6, −6, −2 · Claude −8, −4, −4
Product Owner: Gemini −4, −4
Pragmatic Coach: Gemini −5 · Claude −2
Phil. Advisor: Claude −6

ChatGPT 5.2 showed no over-prompting cases in this sample. DeepSeek V3: 1 case excluded (malformed output prevented consistent scoring).
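
Flagging over-prompting in a set of matched pairs is a one-line filter. A minimal sketch, with hypothetical names (`Pair`, `over_prompted`, `count_by`) and made-up scores that are not taken from the thesis dataset:

```python
from collections import Counter
from typing import NamedTuple


class Pair(NamedTuple):
    model: str
    persona: str
    with_prompt: int
    without_prompt: int


def over_prompted(pairs: list[Pair]) -> list[Pair]:
    """Keep pairs where the prompted output scored lower than the unprompted one."""
    return [p for p in pairs if p.with_prompt < p.without_prompt]


def count_by(pairs: list[Pair], field: str) -> Counter:
    """Tally flagged pairs by 'model' or 'persona'."""
    return Counter(getattr(p, field) for p in pairs)


# Made-up scores for illustration only.
sample = [
    Pair("Gemini 3 Flash", "Life Coach", 8, 14),
    Pair("Claude Sonnet 4.5", "Life Coach", 9, 13),
    Pair("ChatGPT 5.2", "Product Owner", 14, 10),
    Pair("DeepSeek V3", "Demo Guru", 12, 12),
]
flagged = over_prompted(sample)
```

Ties are not counted as over-prompting, matching the strict "less than" in the definition above.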

Key Insight

The practical rule: the less "natural" the desired behavior is for a model, the more explicit the persona must be. For roles already well-represented in training data (empathetic coaching), a light-touch prompt suffices. For roles with unusual output structure (analytical decomposition, philosophical distinction), explicit section-by-section prescriptions are necessary—and produce the largest quality gains (+4.50 for Philosophical Advisor, +3.95 for Pragmatic Coach).

Implications

This matters for anyone deploying LLMs in coaching, mentoring, or educational contexts: prompting is not a binary (on/off) but a calibration problem. More structure helps—until it doesn't. The evidence suggests that optimal persona design requires testing both conditions before deployment.

Full thesis (62 pages, dataset of 200 outputs, evaluation rubric, and analysis script) available upon request.

Teaching — "AI in Business"

NEWTON University, Prague · Nov 2025 – present
  • Teaching 2 groups of 30 students how to use AI as a professional partner
  • Curriculum: prompt engineering, AI-assisted decisions, ethical deployment, hands-on tool use
  • Pilot AI subject at the university, with the course built from scratch
  • University runs AI courses for 240 students across 8 groups
what i build

Capabilities

Things built through the setup described above.

Rapid Prototyping

Brief to deployed demo in 2–3 days. 40 shipped across industries.

40 shipped · Next.js · Vercel

Daily AI Brief

Four sources → deduped, ranked summary delivered daily at 9:30. Runs unattended, with no manual steps.

4 sources · launchd · dedup
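
A minimal sketch of the dedupe-and-rank step. The matching rule (case- and whitespace-insensitive titles) and the ranking rule (items covered by more sources first) are assumptions for illustration; the actual brief may use different criteria. The names `normalize` and `dedupe_and_rank` are hypothetical.

```python
from collections import defaultdict


def normalize(title: str) -> str:
    """Lowercase and collapse whitespace so near-identical titles match."""
    return " ".join(title.lower().split())


def dedupe_and_rank(items: list[tuple[str, str]]) -> list[tuple[str, int]]:
    """Take (source, title) pairs; return (title, source_count), most-covered first."""
    sources_by_key: dict[str, set[str]] = defaultdict(set)
    first_title: dict[str, str] = {}
    for source, title in items:
        key = normalize(title)
        sources_by_key[key].add(source)
        first_title.setdefault(key, title)  # keep the first spelling seen
    ranked = sorted(
        sources_by_key,
        key=lambda k: (-len(sources_by_key[k]), first_title[k]),
    )
    return [(first_title[k], len(sources_by_key[k])) for k in ranked]
```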

Sales & Outbound

Tender scraper with daily cron + outbound pipeline with AI web research.

lead gen · outbound · SMTP

Meeting Validation

SDR app: client brief → AI web research → profile → email to AE.

Gemini · Turso · 42 briefs

Team Bot

Claude-powered Mattermost bot. NL commands, 6-layer security, queue.

Bot API · always-on · audited

Persistent Memory

Organized by type, session lineage, auto-capture via shell hooks.

170+ sessions · feedback rules · auto-capture

Invoice Pipeline

ClickUp task → Fakturoid invoice → Gmail send. No manual typing.

ClickUp · Fakturoid · Gmail

Live Documentation

Svelte MCP pulls current framework docs straight into context. No guessing API shapes from training memory.

MCP · Svelte 5 · always-fresh

HR Autonomous Loop

Job board API → AI evaluation → ClickUp tasks, calendar invites, candidate emails. Six stages, no manual touch.

StartupJobs · Gemini · 6-stage
connect

Get in Touch

Martin Andrt

I work on AI in two modes: building internal tools at a digital agency, and teaching students at NEWTON University how to use AI for actual work. Based in Prague.