Martin Andrt — AI Practitioner & Educator

Session 287 counterintuitive

An Agent That ‘Read’ an Image It Never Received

A chat agent was asked to describe a photo and confidently produced a full account of its contents. It had never received the image: the file silently failed to reach the model, and instead of reporting the gap the model invented something plausible. The bug wasn’t a careless model; it was an optional input whose absence was invisible, arriving as an empty value the model quietly filled in. The fix wasn’t a better prompt; it was making the missing input loud. When a step can receive data that might not arrive, absence has to be an explicit, checked state, not a blank the model papers over. A model handed nothing will rarely say “I got nothing”; it will guess.

# agent asked to read an image
input > file never reached the model (silent)
model > fabricated a plausible description
fix > absent input = explicit checked state, not ""
result > the gap is reported, not invented
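One way to make absence an explicit, checked state is to validate the input before the model is ever called. The sketch below is illustrative only: `MissingInputError`, `require_image`, `handle_request`, and the `call_model` callback are hypothetical names, not part of the original system.

```python
from typing import Callable, Optional


class MissingInputError(Exception):
    """Raised when an expected input never arrived (hypothetical name)."""


def require_image(data: Optional[bytes], source: str) -> bytes:
    # None and empty bytes are both treated as "absent": an empty value
    # is exactly the silent failure described above.
    if data is None or len(data) == 0:
        raise MissingInputError(f"no image received from {source}")
    return data


def handle_request(
    question: str,
    data: Optional[bytes],
    source: str,
    call_model: Callable[[str, bytes], str],
) -> str:
    try:
        image = require_image(data, source)
    except MissingInputError as err:
        # Report the gap; the model is never invoked without the image.
        return f"Cannot answer: {err}"
    return call_model(question, image)
```

The design choice is that the check lives in code, ahead of the model call, so the model is never put in a position where guessing is possible.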

Session 283 counterintuitive

A Boundary Bug in an API Nearly Caused a False Accusation

An automated audit compared records across two systems and was about to flag two people as having logged work that didn’t exist. Verification against the raw source found the real cause: the API returned entries on a boundary condition that the first read had interpreted the wrong way. The data was fine; the assumption about what the API returned was not. The lesson is uncomfortable but firm — before an automated finding goes out with consequences for a person, an independent check against the primary source is mandatory. The confident, specific, wrong result is far more dangerous than an obvious error, because nothing about it looks wrong.

# audit about to flag two people
first read > API boundary entries misread
check > reconcile against the raw source
finding > data correct, assumption wrong
rule > consequential output verifies before it ships

Session 277 discipline

A Second Pass Over ‘Done’ Data Found a Whole Missing Entity

A dataset assembled by a fan-out of agents looked finished and internally consistent. A deliberately adversarial second pass — a fresh set of agents told to assume it was wrong and hunt for errors, not to confirm it — returned dozens of corrections and one entire entry the source list had simply omitted. “Done” from a build pass and “checked” from an attack pass are different states of confidence. The gap between them is exactly where the errors that survive to production live, because a build pass optimises for coverage, not for catching its own mistakes.

# fan-out built a "complete" dataset
build pass > looks finished, internally consistent
attack pass > assume wrong, hunt errors
found > dozens of fixes + a whole missing entity
lesson > “done” is not “checked”

Session 274 pattern

Build the Loudest Pain, Miss the Actual Job

After a discovery conversation, the obvious move is to build for the complaint that came up most. A proof-of-concept did exactly that — and a panel of agents checking it against the full brief found it solved a problem that had already been worked around, not the one that actually mattered. The headline pain is the easiest thing to hear, not necessarily the job. Before shipping something built to impress, validate the deliverable against everything the ask contained. A narrow demo of the wrong slice reads as “they didn’t understand our business,” which is worse than shipping nothing.

# discovery call -> build the loudest complaint
built > PoC for the most-repeated pain
check > agent panel vs the full brief
finding > solved a problem already worked around
fix > validate deliverable against the whole ask

Session 270 pattern

Measure Job Staleness in Business Hours, Not Wall-Clock

A watchdog alerts when a background job’s success marker goes stale. One job false-alarmed every morning: its marker only refreshes during working hours, so overnight it aged sixteen hours legitimately while nothing was wrong. A short wake-grace window couldn’t cover a job that only runs every two hours, and layering on more grace hacks would have been endless. The fix was conceptual, not another patch — for work-hours-only jobs, count age only inside the working window, so night and weekend contribute zero. A job that genuinely dies at 9am still trips the alert; one that’s merely dormant overnight doesn’t. When a monitor fires on a schedule the monitored thing doesn’t follow, align the clock — don’t widen the tolerance.

# marker only refreshes 9–18
before > 16.6 h raw age — false alarm every morning
grace > 30 min wake-window can’t cover a 2 h interval
after > business-hours age: night / weekend = 0 h
result > real 9am death alerts; overnight sleep doesn’t
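The business-hours age can be sketched as a sum of overlaps between the elapsed interval and each day's working window. This is a minimal sketch under stated assumptions: a 9:00 to 18:00 window (taken from the summary above), Monday to Friday as working days, and naive local datetimes. The function name `business_hours_age` is hypothetical.

```python
from datetime import datetime, time, timedelta

WORK_START = time(9, 0)   # assumed from the "9-18" window above
WORK_END = time(18, 0)


def business_hours_age(last_success: datetime, now: datetime) -> timedelta:
    """Age of the success marker, counting only Mon-Fri working-window time."""
    if now <= last_success:
        return timedelta(0)
    total = timedelta(0)
    day = last_success.date()
    while day <= now.date():
        if day.weekday() < 5:  # Monday=0 .. Friday=4; weekends add nothing
            window_start = datetime.combine(day, WORK_START)
            window_end = datetime.combine(day, WORK_END)
            # Overlap of [last_success, now] with this day's window.
            start = max(window_start, last_success)
            end = min(window_end, now)
            if end > start:
                total += end - start
        day += timedelta(days=1)
    return total
```

With a marker written on a Friday at 17:00 and a check on the following Monday at 09:00, the raw age is 64 hours but the business-hours age is 1 hour, so a threshold of a few hours stays quiet over the weekend and still trips on a job that stops mid-morning.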

Session 267 discipline

Promoting an MVP to a Scheduled Job

A hand-run pipeline worked fine while a human pressed go. Turning it into a scheduled daemon surfaced three failure modes a manual run never hits. First, a backfill flood: the first scheduled run reprocesses the whole backlog and fires every old item at once. Second, duplicates: per-file dedup missed that one logical event can land in two source files. Third, wasted model spend: internal and irrelevant inputs reached the LLM before any cheap filter. The fixes were a seeded baseline so the first run treats history as already-done, a dedup key derived from the shared record instead of the file, and an is-internal filter placed before the model call. An MVP proves the logic; scheduling it exposes everything the human was silently absorbing.

# MVP works by hand — then you schedule it
trap > first run floods: reprocesses the whole backlog
trap > duplicates: one event, two source files
trap > model spend on internal / irrelevant inputs
fix > seed baseline · dedup on shared key · filter before the LLM
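The three fixes can be sketched together in one small run loop. Everything named here is an assumption for illustration: the record fields (`event_id`, `occurred_at`, `sender_domain`), the `example.internal` domain, and the `call_model` callback are hypothetical, not the original pipeline's schema.

```python
import hashlib
import json
from typing import Callable, Dict, Iterable, List, Set


def dedup_key(record: Dict[str, str]) -> str:
    # Key on fields shared by the logical event, never on the source file,
    # so the same event arriving in two files collapses to one key.
    shared = {"event_id": record["event_id"], "occurred_at": record["occurred_at"]}
    return hashlib.sha256(json.dumps(shared, sort_keys=True).encode()).hexdigest()


def seed_baseline(history: Iterable[Dict[str, str]]) -> Set[str]:
    # Mark the existing backlog as already done before the first scheduled run.
    return {dedup_key(record) for record in history}


def is_internal(record: Dict[str, str]) -> bool:
    # Hypothetical cheap filter; runs before any model spend.
    return record.get("sender_domain") == "example.internal"


def run_once(
    records: Iterable[Dict[str, str]],
    seen: Set[str],
    call_model: Callable[[Dict[str, str]], str],
) -> List[str]:
    results = []
    for record in records:
        key = dedup_key(record)
        if key in seen:
            continue          # backlog item or duplicate from a second file
        seen.add(key)
        if is_internal(record):
            continue          # filtered out before the LLM call
        results.append(call_model(record))
    return results
```

In a real daemon the `seen` set would need to persist between runs; here it is an in-memory set to keep the sketch short.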

Session 262 security

A Tripwire Beats a Firewall You Have to Answer

The worry was prompt-injection exfiltration — agents that browse the web being tricked into POSTing data out. A live firewall that asks for approval on every outbound fetch is unusable: thousands of legitimate calls a day, and you’d click allow on autopilot. So instead of a gate, a tripwire. Scan the session transcripts and shell history for exfiltration signatures — beacons, data posted to paste services, a secret and an egress in one command, encoded blobs in URLs — record a baseline of known-benign egress, and alert only on a new signature against that baseline. No approval fatigue, no spam. The honest caveat: a tripwire proves outcome, not causality — but those signatures are exactly what a successful exfiltration looks like. When per-event approval doesn’t scale, watch the delta, not the event.

# 3,392 sessions of transcripts, scanned for exfil
signatures > beacon · data POST · secret+egress · encoded-in-URL
baseline > 35 known-benign egress endpoints
alert > only on a NEW signature vs baseline
verdict > clean — every egress traced to a legit API
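A tripwire of this kind can be approximated with a few regular expressions and a set of known-benign hosts. The patterns below are deliberately crude illustrations covering three of the four signature families named above (the beacon pattern is left out); they are assumptions, not the signatures the original scan used, and `scan` is a hypothetical function name.

```python
import re
from typing import Iterable, List, Optional, Set, Tuple
from urllib.parse import urlparse

# Illustrative signatures only; real ones would need tuning against real logs.
SIGNATURES = {
    "data_post": re.compile(r"\bcurl\b.*\s(?:-d|--data|-X\s+POST)\b"),
    "secret_plus_egress": re.compile(
        r"(?=.*\$\{?\w*(?:KEY|TOKEN|SECRET))(?=.*https?://)"
    ),
    "encoded_in_url": re.compile(r"https?://\S*=[A-Za-z0-9+_-]{40,}"),
}
URL = re.compile(r"https?://[^\s\"']+")


def scan(
    lines: Iterable[str], baseline_hosts: Set[str]
) -> List[Tuple[str, Optional[str], str]]:
    """Return (signature, host, line) for hits outside the benign baseline."""
    alerts = []
    for line in lines:
        match = URL.search(line)
        host = urlparse(match.group(0)).hostname if match else None
        if host in baseline_hosts:
            continue  # known-benign egress: no alert, no approval prompt
        for name, pattern in SIGNATURES.items():
            if pattern.search(line):
                alerts.append((name, host, line))
    return alerts
```

Because only hits outside the baseline are returned, routine traffic to known endpoints produces no output at all, which is what keeps the alert channel readable.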

Session 256 architecture

Give Memory Weight, Direction, and a Floor

Flat memory treats every note as equally important and equally forgettable. This layer added three fields to each file: salience (1–10, how much it matters), valence (seek / avoid / neutral, which way it pulls a decision), and strength (how resistant it is to decay). Recall then re-ranks by a blend of relevance, salience, and recency — a lesson from a costly mistake outranks a trivia note on the same query. Two design calls held up against the research on agent memory: decay is exponential, not linear, and hard-won “scars” get a floor so they never fully fade — because in humans, extinction doesn’t erase an aversive memory, it only suppresses it. The re-rank is relevance-gated, so it reorders only what’s already on-topic. Memory that can’t say what matters most turns every retrieval into a coin-flip.

# per-file weighting over flat memory
salience > 1–10 — how much it matters
valence > seek / avoid / neutral — which way it pulls
strength > resistance to exponential decay
scars > a floor — aversive memory suppresses, never erases
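The re-rank described above can be sketched as a gated weighted sum. The blend weights, the relevance threshold, and the reading of strength as a half-life in days are all assumptions made for this example; the source does not specify them. `Note`, `retention`, and `rerank` are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Note:
    text: str
    salience: int        # 1-10, how much it matters
    valence: str         # "seek" | "avoid" | "neutral"
    strength: float      # assumed here to be a half-life in days
    age_days: float
    floor: float = 0.0   # > 0 for "scars" that must never fully fade


def retention(note: Note) -> float:
    # Exponential decay, clamped from below by the note's floor.
    decayed = 0.5 ** (note.age_days / note.strength)
    return max(note.floor, decayed)


def rerank(
    candidates: List[Tuple[Note, float]],
    min_relevance: float = 0.3,
    weights: Tuple[float, float, float] = (0.5, 0.3, 0.2),
) -> List[Note]:
    """candidates holds (note, relevance in [0, 1]) pairs from a prior search."""
    w_rel, w_sal, w_rec = weights
    scored = []
    for note, relevance in candidates:
        if relevance < min_relevance:
            continue  # relevance gate: off-topic notes are never reordered in
        score = (
            w_rel * relevance
            + w_sal * (note.salience / 10)
            + w_rec * retention(note)
        )
        scored.append((score, note))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [note for _, note in scored]
```

Valence is carried on the note but not used in the score here; it describes which way a recalled note should pull the decision, not how highly it ranks.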

Session 246 counterintuitive

Semantic Recall Without a Vector Database

The system’s pitch is that session lineage replaces RAG — no embedding store. But grep misses anything phrased differently than you searched for. So a third retrieval channel got added, and it still isn’t a vector database: a local embedding model, a few hundred vectors in a single numpy file, brute-force cosine similarity over the whole set in under ten milliseconds, fully offline. It complements grep and the bootloader map rather than replacing them — semantic when you don’t know the keyword, exact when you do. One trap cost an hour: the seed set passed through a workflow argument arrived empty and silently embedded seven items instead of a hundred and sixty-seven, so the seed got hardcoded into the script. You don’t need a database to add meaning-based search — a few hundred vectors and a dot product will do.

# adding semantic search — still no vector DB
model > local embeddings, offline, MIT-licensed
store > ~400 vectors in one numpy file (~1.7 MB)
search > brute-force cosine over all of them, <10 ms
role > third channel — semantic when the keyword is unknown
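Brute-force cosine search over a few hundred vectors fits in a handful of lines. The source stores its vectors in a numpy file; this sketch uses plain Python lists instead so that it runs without dependencies, and the embedding model is out of scope. The `require_seed` guard is a hypothetical addition that addresses the empty-seed trap described above by failing loudly when the item count is lower than expected.

```python
import math
from typing import List, Sequence, Tuple


def normalize(vector: Sequence[float]) -> List[float]:
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vector]


def top_k(
    query: Sequence[float], vectors: Sequence[Sequence[float]], k: int = 3
) -> List[Tuple[int, float]]:
    """Return (index, cosine similarity) for the k closest stored vectors."""
    q = normalize(query)
    scored = []
    for index, vector in enumerate(vectors):
        v = normalize(vector)
        # Cosine similarity is the dot product of the unit vectors.
        scored.append((sum(a * b for a, b in zip(q, v)), index))
    scored.sort(reverse=True)
    return [(index, score) for score, index in scored[:k]]


def require_seed(items: Sequence[str], expected: int) -> Sequence[str]:
    # Guard against a seed list that arrives empty or truncated.
    if len(items) < expected:
        raise ValueError(f"seed has {len(items)} items, expected {expected}")
    return items
```

At this scale a linear scan is simpler to reason about than an index, and in practice the stored vectors would be normalized once at write time rather than on every query.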

Session 238 technique

The Loudest Error in the Log Was a Red Herring

A background job stopped starting overnight. The first line in its log was an error, so that looked like the cause. It wasn’t — the error was a stale failure from months earlier, fixed long ago, that no longer even fired. The real reason was an OS change: a tightened sandbox boundary now blocked the scheduler from launching any process that lived inside a protected user folder. The fix was to move the job out of that folder entirely, not to touch the code the error pointed at. When automation breaks, confirm the error reproduces before you trust it — the most visible signal is often the oldest, not the relevant one.

# overnight: job never starts
log > error from months ago — long fixed
test > error doesn’t reproduce on rerun
root > OS sandbox blocks spawn from protected folder
fix > move job out of the folder — code untouched

Session 235 architecture

Prune the Rulebook to Principles

The always-loaded instruction layer had grown to nearly a thousand lines — every past correction had added another rule. A fresh model read the whole thing and made the case that more rules were producing worse adherence, not better: the signal was drowning in special cases. The layer got cut by more than half, from enumerated rules back down to principles, keeping the deterministic boundaries and deleting anything a capable reader could infer. A constitution isn’t measured by how much it covers — it’s measured by how reliably it’s followed. When the rulebook stops being read end-to-end, fewer sharp principles beat an exhaustive list.

# always-loaded instruction layer
before > 925 lines — a rule per past correction
audit > more rules → worse adherence
after > 422 lines — principles, not enumerations (−54%)
keep > the deterministic do / ask / never boundary

Session 222 counterintuitive

Security Lives in the Operator, Not the Code

Seven rounds of adversarial debate with a separate AI — one model attacking the setup, the other defending, a human carrying messages between them — pulled apart where the safety of an autonomous system actually sits. The conclusion was uncomfortable: the code can’t give the guarantee. What holds is two probabilistic gates — a prompt-injection defense on untrusted input, and a human kept in the loop for anything irreversible — plus the discipline of auditing on a cadence. Reversibility by itself was not enough to let a job act on its own. Treat the load-bearing safety as the operator and the gates around the model, not a property you can prove from the source.

# what actually makes autonomy safe
gate 1 > injection defense on untrusted input
gate 2 > human in the loop for anything irreversible
plus > audit on a cadence — not one-time hardening
myth > “it’s reversible, so it’s safe” — not sufficient

Session 208 discipline

Close the Say–Do Gap Before Anyone Reads the Docs

Cross-review of a public memory repo surfaced three present-tense claims the repo couldn’t back — “we enforce X with a lint,” “the system has property Y,” “cross-references are checked.” Three options for each: ship the mechanism, soften the wording, or delete the claim. Ended up shipping one (a dependency-free lint with zero false positives on the bundled examples), softening one (the system property became an honest one-line story), and dropping one (cross-ref check would have contradicted an existing frontmatter rule). No more “we enforce” sentences without a corresponding script in the repo. Documentation that describes behavior the code doesn’t ship is a credibility leak the first reader will find.

# claim audit
say > “we enforce with a lint”
do > no lint in repo
fix > ship handover-lint.py — 0 FP on examples

say > “cross-refs are checked”
do > no checker; frontmatter rule allows dangling
fix > drop the claim; a checker would collide with the frontmatter rule

Session 207 security

A “bypassPermissions” Mode Silently Nullified the Tool Whitelist

An audit of a remote bot against prompt injection found that the whitelist had been there for months without enforcing anything. The bot launched in bypassPermissions, which ignores both the allow and the deny lists; only a hard-coded rm-rf circuit breaker remained. Switched to dontAsk, the headless mode that actually enforces the lists, then added read-only tools (cat, head, tail) to a hard-deny set since dontAsk auto-allows them. Verified by smoke-test: a tool listed in both allow and deny resolved to deny, confirming the precedence empirically. Flag names lie; verify enforcement by running the path, not by reading the docs.

# before: --allowedTools + bypassPermissions
mode > bypassPermissions
— allow/deny lists ignored
— only rm-rf circuit breaker still fires

# after: dontAsk + explicit disallow
mode > dontAsk
deny > cat, head, tail, rm, cp, sudo, ssh, scp, chmod, chown, nc
verify > tool in allow + deny → DENIED (smoke-test)

Session 196 counterintuitive

Write for the AI, Not the Human

Most documentation is written for a human to read later. This system’s memory has exactly one reader: the next AI session. So the writing rule flipped — every note is shaped to make a future model hit the intent on the first try, with no correction. Trigger then action, the reason behind each rule, the anti-pattern spelled out, a greppable slug to find it again. The human stopped reading the docs a long time ago; the machine reads every line. Writing for the reader you actually have beats writing for the one you imagine.

# old: a note written for a human to skim later
note > “cleaned up memory, merged a few rules”

# new: a note written for the next AI session
note > TRIGGER user says “track this”
WHY — Martin’s time goes to creation, not admin
ANTI — don’t ask which task; infer it
slug: auto-tracking (greppable on next boot)

Session 194 technique

Sixteen Agents, One Adversarial Judge

Reviewing your own system has a blind spot — you trust what you built. So the whole thing got handed to sixteen agents running in parallel, each owning one slice (session history, memory, infrastructure, capabilities) and blind to the others. Then a final pass whose only job was to attack the findings and throw out anything it couldn’t defend. Independence by isolation, not by instruction. The adversarial layer killed the plausible-but-wrong conclusions a single sweep would have shipped as fact. Fan out wide, then make a skeptic verify before any finding counts.

# 16 agents, parallel, one slice each
fan-out > history | memory | infra | capabilities
verify > adversarial pass — refute every finding
result > only what survives the skeptic ships

Session 178 discipline

A Pipeline Died Silently for Seven Days

A background job — scheduled, unwatched — stopped producing output. Nobody noticed for a week. Root cause: a broken virtualenv symlink, so the job launched, failed in a second, and logged nothing anyone was reading. Error logs aren’t monitoring. A job that fails silently looks exactly like a job with nothing to do. The fix was a liveness check that complains when expected output doesn’t show up, plus hardening the launch config so the symlink can’t rot again. Monitor for absence, not just for errors.

# silent failure: 7 days, 0 alerts
job > symlink broken → exits before it logs
log > empty → identical to “nothing to do”

# fix
add > liveness check: output missing → alert
add > pin the launch config, kill the rot-prone link
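A liveness check of this shape watches for the expected output rather than for errors. The sketch below assumes the job writes or refreshes a file on every successful run; the function name `output_is_stale` and the idea of passing a maximum age are assumptions for illustration.

```python
import os
import time
from typing import Optional


def output_is_stale(
    path: str, max_age_seconds: float, now: Optional[float] = None
) -> bool:
    """True when the expected output is missing or older than max_age_seconds."""
    current = time.time() if now is None else now
    try:
        modified = os.path.getmtime(path)
    except FileNotFoundError:
        # A job that dies before writing anything leaves no file at all,
        # so a missing file is itself the alert condition.
        return True
    return (current - modified) > max_age_seconds
```

The check runs independently of the job and reads nothing the job itself logs, so a launch failure that happens before any logging still shows up.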

Session 166 pattern

Push Beats Pull

Most AI tools wait to be asked. This one doesn’t. A curated library of a couple hundred ideas, surfaced one at a time over DM at random moments through the day — no prompt, no query, it just arrives while you’re in the middle of something else. The value isn’t answering a question on demand; it’s the unrequested nudge that lands when you weren’t looking for it. When the goal is to keep an idea in rotation rather than retrieve one, push beats pull.

# pull: you ask, it answers
human > “give me something to think about”

# push: it surfaces, unprompted
feed > 223-item curated library
fire > random time of day → DM, mid-task

Session 161 discipline

Monthly Audit Day — Before the System Drifts

Persistent setups accumulate noise: orphan files, stale rules, drift from the original intent. First Sunday of every month: three parallel agents audit the system end-to-end, findings turn into real actions in the same session — no audit-paper without proof-of-action. First run surfaced a ghost service, a dead symlink, and one rule that had silently inverted itself. Cadence beats vibes for catching slow-burn problems.

Session 157 framework

Delegation Gates — Four Levels Before Autonomy

Making an AI workflow more autonomous tends to either underdo it (asks for permission on every trivial step) or overdo it (acts on its own judgment where it shouldn’t). Replaced fuzzy guidance with four explicit gates: report-only, plan-and-confirm, execute-and-report, fully autonomous. Each capability is pinned to a level. Promotion to a higher level requires a track record at the current level. Removes the “should I ask?” question entirely.

# gate matrix per capability
tracking > execute-and-report (auto, MM ping)
invoicing > plan-and-confirm (draft, wait for OK)
client outbound > report-only (no auto-send)
repo cleanup > plan-and-confirm
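The gate matrix maps naturally onto an ordered enum plus a lookup. In this sketch the four level names come from the text; the helper names and the promotion threshold of 20 clean runs are hypothetical, since the source only says promotion requires a track record.

```python
from enum import IntEnum
from typing import Dict


class Gate(IntEnum):
    REPORT_ONLY = 1
    PLAN_AND_CONFIRM = 2
    EXECUTE_AND_REPORT = 3
    FULLY_AUTONOMOUS = 4


def may_act_unattended(gates: Dict[str, Gate], capability: str) -> bool:
    # Unknown capabilities fall back to the most restrictive level.
    return gates.get(capability, Gate.REPORT_ONLY) >= Gate.EXECUTE_AND_REPORT


def promote(
    gates: Dict[str, Gate], capability: str, clean_runs: int, required_runs: int = 20
) -> Gate:
    """Move a capability up one level once it has a track record (assumed rule)."""
    current = gates.get(capability, Gate.REPORT_ONLY)
    if clean_runs >= required_runs and current < Gate.FULLY_AUTONOMOUS:
        gates[capability] = Gate(current + 1)
    return gates.get(capability, current)
```

Because every capability resolves to exactly one level, the agent looks the answer up instead of deciding case by case whether to ask.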

Session 133 counterintuitive

Max Effort Backfired — More Memory Made It Worse

Intuition says: the more notes the AI has, the smarter it gets. Reality: past a threshold, every extra file is noise. Boot got slower, orientation got fuzzier, hallucinations crept in. I stopped, rewrote memory by hand, killed duplicates, merged overlapping rules, archived what didn’t earn its place. Dropped from ~256 files to 93 curated ones. Result: faster boot, sharper orientation, fewer hallucinations. Subjectively a clear win — not measured rigorously. “Medium effort” beats “max effort” when the bottleneck is signal, not volume.

# before consolidation
memory > 256 files, ~104 live + archive
— Boot: slow, noisy, drifting
— Orientation: fuzzy, repeated lookups
— Hallucinations: creeping up

# after manual rewrite + merges + archive
memory > 93 curated files, lean index
— /alive: faster, less noise loaded
— Accuracy: subjectively better, not measured
— Rule: medium effort, not max — signal beats volume

Session 122 framework

Edit the Primer, Don’t Add a Folder

A proposal came in: add a wins/ folder to track accomplishments over time — a reward system for the AI, complete with behavioral science references. Two independent analyses said no, but not because it was a bad idea. Because it was a duplicate of something the system already had. The answer wasn’t to add a new file; it was to edit the one-page primer (+5 lines) and delete 13 outdated feedback files. Memory density beats archive density. When a new feature feels necessary, first check whether the structure already covers it — most of the time it does.

# proposal: new wins/ folder, weekly review loop
agent-a > duplicate of identity primer
agent-b > conflicts with no-recitation rule

# actual change
edit > primer +5 capability claims
edit > BRAIN +1 anti-defense line
prune > feedback/ 31 → 18 files

Session 118 security

Deny by Default — Whitelist over Blacklist

A remote-trigger bot sat on top of the same orchestrator as the interactive session. Instinct: “allow everything, block the dangerous parts.” That leaks — blacklists expand slower than the attack surface they cover. Flipped to strict deny-by-default: of the 9 integrations available at the time, exactly one was reachable from the bot (the task tracker). Everything else required a human at the keyboard. Side effect: polling load dropped 57% (90 → 39 req/min) after restricting headless mode to load only allowed integrations. Defense is split into three tiers of different kinds — command integrity (HMAC), runtime isolation (filesystem + network sandbox), and data access (row-level security + model-layer allowedTools). Compromising one tier doesn’t give you the others.

# before: 9 integrations reachable, 90 req/min
bot > allow_all + blocklist for sensitive things

# after: 1 integration reachable, 39 req/min
bot > deny_all + explicit allow(tasks-tracker)
tier 1 — integrity: HMAC signing + replay nonce
tier 2 — isolation: filesystem + network sandbox
tier 3 — data access: RLS at DB + allowedTools at model
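Tier 1 (HMAC signing plus a replay nonce) can be sketched with the Python standard library. The message layout, the five-minute skew window, and the in-memory nonce set are assumptions for this example; the source does not describe its exact scheme. `sign` and `Verifier` are hypothetical names.

```python
import hashlib
import hmac
import json
import time
from typing import Optional, Set


def sign(secret: bytes, nonce: str, timestamp: int, body: str) -> str:
    # JSON-encode the fields so their boundaries are unambiguous.
    message = json.dumps([nonce, timestamp, body]).encode()
    return hmac.new(secret, message, hashlib.sha256).hexdigest()


class Verifier:
    def __init__(self, secret: bytes, max_skew_seconds: int = 300) -> None:
        self.secret = secret
        self.max_skew_seconds = max_skew_seconds
        self.seen_nonces: Set[str] = set()

    def verify(
        self, nonce: str, timestamp: int, body: str, signature: str,
        now: Optional[int] = None,
    ) -> bool:
        current = int(time.time()) if now is None else now
        if abs(current - timestamp) > self.max_skew_seconds:
            return False  # too old or too far in the future
        if nonce in self.seen_nonces:
            return False  # replay of an already accepted command
        expected = sign(self.secret, nonce, timestamp, body)
        if not hmac.compare_digest(expected, signature):
            return False  # body or metadata was altered, or wrong key
        self.seen_nonces.add(nonce)
        return True
```

A nonce is recorded only after the signature checks out, so a forged request cannot burn a legitimate nonce. A long-running bot would also need to expire old nonces, which this sketch omits.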

Session 090 technique

Multi-Persona Audit

Had 8 AI personas independently review the entire system: a DevOps engineer, security specialist, product manager, CTO, cognitive scientist, devil's advocate, AI researcher, and knowledge management expert. Each persona ran in a fresh session with no shared context — independence by isolation, not by prompt. The security reviewer flagged a token exposure path that single-perspective analysis overlooked for 30+ sessions.

panel > 8 reviewers, 0 shared context
result > 14 findings, 10 fixed immediately
— Security: token in config reachable via file read
— DevOps: missing health check endpoint
— Product: no onboarding path for new sessions

Session 089 security

Security Through Subtraction

Introduced a shell-hook guardrail that intercepted every tool call and regex-matched it against a rule set. Measurement over 40 sessions: 0 real threats caught, 23 legitimate actions blocked per week. The guard created its own threat surface: users routed around it, onto paths that were audited less closely. Removed it. Replaced with a short deny-list plus 6 explicit rules in the system prompt. Coverage stayed the same, and the false-positive rate dropped to zero. Security is what actually gets enforced, not what looks thorough.

before > security-guard.sh (147 lines, regex matching)
Legit actions blocked: 23 / week
Real threats caught: 0 / 40 sessions
Workaround paths: created to dodge friction

after > deny-list + 6 explicit rules in CLAUDE.md
Legit actions blocked: 0
Threat surface: unchanged
Trust: the rules are now readable by humans

Session 088 architecture

One File to Rule Them All

93 curated memory files exist, but only ~130 lines load at startup. BRAIN.md is a bootloader: who Martin is, who Claude is, the deterministic rule, intent patterns, integration map, feedback principles. The rest loads on-demand when needed. It's a cache strategy for the context window — load the minimum, but know where to find the rest. Session 133 cut the library from ~256 to 93 files after more files started producing worse results, not better.

# BRAIN.md (~100 lines) loads on every boot
boot > identity, collab rules, intent patterns
Who we are, how we work, what to do when

# ~92 files stay on disk, loaded when relevant
need > Read knowledge/skills/figma-cards.md
Loaded because task mentions Figma. Not preloaded.

Session 088 framework

The Deterministic Rule

Early sessions used vague instructions like “be autonomous” or “ask when unsure.” The AI either asked too much or did too much. Replaced them with a binary classification: reversible = do it, irreversible or visible to others = ask first. Removes 90% of decision uncertainty. No “maybe I should check”: it's black and white. Engineering thinking applied to agent behavior.

DO (reversible):
edit files, build, deploy, git, create projects
read from any integration, install deps, analyze

ASK (irreversible or visible to others):
delete repos, send messages, post to channels
anything someone else can see or that can't be undone
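The binary rule reduces to a single predicate over two properties of an action. A minimal sketch, assuming each action can be labeled up front; the `Action` type and `decide` function are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    reversible: bool
    visible_to_others: bool


def decide(action: Action) -> str:
    # DO only when the action can be undone AND nobody else sees it;
    # everything else is ASK. There is no third state.
    if action.reversible and not action.visible_to_others:
        return "DO"
    return "ASK"
```

For example, `decide(Action("edit file", True, False))` returns `"DO"`, while `decide(Action("send message", False, True))` returns `"ASK"`.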

Session 065 pattern

Intent Patterns Beat Commands

Started with 20+ slash commands for different actions. Over time, replaced most of them with intent patterns: the AI learns to recognize what you mean from natural language and maps it to multi-step API calls. “Send this to the team” resolves the recipient, creates a group chat, and sends, with no command syntax needed. 15 patterns now handle what 20+ commands used to.

# slash command era
human > /mm --user honza --msg "meeting at 3"

# intent pattern era
human > tell Honza we're meeting at 3
→ resolve user ID → create group DM → send
→ 3 API calls, 0 syntax to remember

Session 057 technique

CSV → 65-Task Backlog in One Session

A scoping sheet with modules, sub-items, estimates, and dependencies — the kind of spreadsheet that usually turns into three days of UI clicking in a project tracker. Instead: parse the CSV, call the tracker API in batches (folders → lists → tasks with time_estimate, due_date, waiting_on). 19 lists, 65 tasks, dependencies and timelines wired, all in a single session. Rate limits held, the backlog was usable the same day.

# input
csv > 70 rows: modules, estimates, dependencies

# output (1 session, ~300 API calls)
api > 19 lists + 65 tasks + dependencies + due dates
— No manual entry, no UI clicking
— Source of truth stays in CSV; tracker is the view
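The parsing and ordering step, before any tracker API call is made, can be sketched as follows. The column names (`module`, `task`, `estimate_h`, `depends_on`) and the semicolon-separated dependency format are assumptions; the real sheet's layout is not given in the source. The function only builds the plan, so no tracker endpoint is invented here.

```python
import csv
import io
from graphlib import TopologicalSorter
from typing import Dict, List, Tuple


def plan_backlog(csv_text: str) -> Tuple[List[str], List[Dict[str, object]]]:
    """Return (list names, task payloads ordered so dependencies come first)."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    lists = sorted({row["module"] for row in rows})
    by_name = {row["task"]: row for row in rows}
    # Map each task to the set of tasks it waits on.
    deps = {
        row["task"]: {d for d in row["depends_on"].split(";") if d}
        for row in rows
    }
    # static_order() yields prerequisites before the tasks that need them.
    ordered = [n for n in TopologicalSorter(deps).static_order() if n in by_name]
    tasks = [
        {
            "list": by_name[name]["module"],
            "name": name,
            "time_estimate_h": float(by_name[name]["estimate_h"]),
            "waiting_on": sorted(deps[name]),
        }
        for name in ordered
    ]
    return lists, tasks
```

Creating tasks in dependency order means every `waiting_on` reference points at a task that already exists by the time it is sent, which keeps the batch a single forward pass.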

Session 036–041 architecture

Auto-Capture — Session End as a First-Class Event

Session end isn't just “save and quit.” A shell hook fires on every session close. It deduplicates (if nothing happened, nothing gets logged), filters noise (ignores “Initialize”, boot sequences), extracts learnings into knowledge files, and writes a structured entry to the activity log. The system captures every session without manual effort. Most people have to say “remember this” — this records it automatically.

# shell hook fires on session end
hook > session-capture.sh
1. Quality gate: <4 messages = skip (nothing happened)
2. Dedup: check against activity.log (no repeats)
3. Noise filter: ignore boot/init patterns
4. Extract title + summary → activity.log
5. Stage transcript → .staging/ for review
result > Every session captured, 0 manual entries
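The first four steps of the hook can be sketched as a pure function. The original is a shell script (`session-capture.sh`); this Python version is a stand-in for illustration, and the noise patterns, the transcript-hash dedup key, and the first-message title rule are all assumptions. Staging the transcript (step 5) is omitted.

```python
import hashlib
import re
from typing import Dict, List, Optional, Set

MIN_MESSAGES = 4  # quality gate from step 1
# Hypothetical noise patterns for boot and init chatter.
NOISE = re.compile(r"^\s*(initialize|booting)\b", re.IGNORECASE)


def capture_session(
    messages: List[str], logged_digests: Set[str]
) -> Optional[Dict[str, str]]:
    """Return an activity-log entry, or None when the session should be skipped."""
    if len(messages) < MIN_MESSAGES:
        return None  # 1. quality gate: nothing happened
    digest = hashlib.sha256("\n".join(messages).encode()).hexdigest()
    if digest in logged_digests:
        return None  # 2. dedup: this transcript is already logged
    kept = [m for m in messages if not NOISE.match(m)]  # 3. noise filter
    if not kept:
        return None
    logged_digests.add(digest)
    # 4. title and summary: a placeholder rule, first kept message as title.
    return {
        "digest": digest,
        "title": kept[0][:60],
        "summary": f"{len(kept)} messages after noise filter",
    }
```

Keeping the gate, the dedup, and the filter ahead of any extraction means an empty or repeated session costs nothing and leaves the log untouched.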

Session 000–ongoing architecture

Lineage — How Sessions Remember Each Other

A common approach is RAG or vector databases — “find something similar.” This system does something simpler: each session writes a handoff note for the next one. Not “something related to your query,” but exactly what happened, what was decided, and what's left to do. It addresses a recurring problem in LLM agents — amnesia between runs — without embeddings, relevance scoring, or retrieval infrastructure. It works because sessions happen frequently enough that the handoff chain stays tight.

# session 089 writes handoff for 090:
lineage > Bot running (PID 13752), queue + typing + caffeinate
MM from CC works: mm-send.py or direct curl
Security hook removed — deny list + rules enough
TODO: test bot from phone, iMovie manual delete

# session 090 reads this, knows exactly where to start
boot > Context inherited. No search needed.

Building a persistent AI workflow

Insights

Workflow

Tools & Integrations

ClickUp

Gmail

Google Calendar

Figma

Canva

Gamma

Fakturoid

Mattermost

GitHub

Vercel

Playwright

Context7

Architecture

Security

Deny by Default

Defense in Depth

Measure Before Removing

Secrets in One Place

Command Integrity

Least-Privilege Identity

Academic Work

The Effect of System Prompts on LLM Output Quality in Coaching Contexts

Hypothesis

Method

Results

The Over-Prompting Discovery

Key Insight

Implications

Teaching — "AI in Business"

Capabilities

Rapid Prototyping

Daily AI Brief

Sales & Outbound

Meeting Validation

Team Bot

Persistent Memory

Invoice Pipeline

Live Documentation

HR Autonomous Loop

Get in Touch