---
skill_id: hook-craft-short-form
name: Hook Craft For Short-Form
description: Draft, audit, and rewrite hooks for short-form video, blog leads, social posts, demo openers, and pitch slide 1. Use when an agent needs to produce or evaluate the first 1–5 seconds of any piece of founder, creator, or SaaS content where opt-in is the bottleneck.
skill_type: combo
categories:
    - marketing-and-content
    - writing
parent_id: content_strategy_beyond_blogging
depth: 1
preserve_markdown: true
legacy_paths:
    - hook-craft-short-form
    - marketing-and-content.hook-craft-short-form.skill
    - docs/research/youtube-library/skill-drafts/hook-craft-short-form/SKILL.md
reference_modules:
    - id: hook_craft_short_form.public_hook_audit
      name: Public Hook Audit Checklist
      summary: Portable checklist for auditing hooks against archetype, slot grammar, three-beat structure, visual alignment, and failure modes.
      when_to_load:
          - When using the portable bundle outside BuildOS.
          - When scoring or rewriting an existing hook.
          - When a draft hook needs a short, agent-checkable QA pass before shipping.
      path: references/public-hook-audit.md
      visibility: public
lineage: lineage.yaml
path: apps/web/src/lib/services/agentic-chat/tools/skills/definitions/hook_craft_short_form/SKILL.md
---

# Hook Craft For Short-Form Content

Use this skill to make the first 1–5 seconds of a piece of content earn the next 30 seconds. The hook is the 80 of the 80/20 — never let the body get more revision passes than the hook.

A hook is not one clever sentence. It is **four aligned signals (spoken / visual / text / audio) built from six structural slots, organised across three beats, and audited against four failure modes**. The agent's job is to enforce that order, not to generate from blank.

## When to Use

- Draft or rewrite an opening for short-form video (TikTok, Reels, Shorts, vertical YouTube).
- Draft or rewrite a blog lede, LinkedIn opener, Twitter thread first line, or email subject.
- Open a demo video, sales call recording, conference talk, or pitch deck slide 1.
- Audit a draft hook the founder already wrote — score, diagnose, and propose rewrites.
- Tag a competitor's hook against the slot grammar to learn from it (copy-work).
- Build a hook variant log so the agent learns the founder's voice over time.

Do not use this skill for body copy, payoff structure, or CTAs. Pair with `content-strategy-beyond-blogging` for the strategic layer above the hook, and with the upcoming `viral-video-script-structure` skill for the body that follows.

## Core Principles

1. **Hook = opt-in machine.** Its only job is to deliver topic clarity and on-target curiosity. Anything else is overhead.
2. **Visual is load-bearing.** Eyes process 10–100× more per second than ears. The key visual decides the hook; the words confirm it.
3. **Lead with pain or benefit, not the thing.** Hostile audiences will still listen if the pain is theirs. They will not listen to a feature description.
4. **Slot grammar before cleverness.** Hook craft is slot-stacking. Cleverness is downstream of structure.
5. **Snapback must redirect, not amplify.** A snapback that goes "more of the same, bigger" is escalation, not contrast.
6. **Empty snapbacks are fraud.** If the body cannot pay off the hook's promise, refuse to ship the hook — regardless of cleverness.
7. **Topic clarity beats trickery.** Vague hooks pull wrong audiences and tank watch time. Let non-targets self-select out.
8. **80/20 of revision lives in the hook.** Allocate >50% of revision budget here. Never ship a draft where the body got more passes than the opener.

## The Four Pillars

Every hook this skill produces or audits is built on four stacked frameworks. The agent runs them in this order: **archetype → slot grammar → three-beat structure → four-mistake diagnostic.**

### Pillar 1: Six Archetypes (strategic shape)

Pick one archetype based on the **available key visual**, not the topic. Every topic can be expressed through any of the six.

| Archetype      | Contrast spine                         | Pick when…                                                        | Risk if misused                        |
| -------------- | -------------------------------------- | ----------------------------------------------------------------- | -------------------------------------- |
| Fortune Teller | Present → future                       | A real forward-looking claim and a "future-feeling" visual exist. | Empty trend-chasing.                   |
| Experimenter   | Old method → new method, shown live    | Live demo / screen recording / comparison test b-roll exists.     | Clout demos with no real result.       |
| Teacher        | Failing at X → here's how to win at X  | Hard-won lesson, talking-head footage, slides, no live demo.      | Becomes a lecture; kills retention.    |
| Magician       | Layer/modifier (verbal or visual stun) | Atypical visual, signature object, or visual pacifier available.  | Spectacle without payoff.              |
| Investigator   | Unknown → now you know                 | A genuine "secret" or under-noticed insight backs the body.       | Condescension if secret is well-known. |
| Contrarian     | Conventional wisdom → opposite belief  | A real opposing POV with proof exists.                            | Becomes a hot-takes account.           |

Magician is a **modifier** — never a standalone hook. Always layer it under one of the other five.

For BuildOS founder content, defaults map cleanly:

- Anti-feed / anti-AI POVs → **Contrarian**.
- Context-engineering frameworks and lessons → **Teacher**.
- Live BuildOS demos and tool comparisons → **Experimenter**.
- Original observations, leading-edge tech, market commentary → **Investigator**.
- Category-shifting product moments (new feature, new model) → **Fortune Teller**.

### Pillar 2: Six-Slot Grammar (structural skeleton)

Every hook fills slots:

```
[Subject] [Action] [Objective/End-state] [Contrast] (+ [Proof]) (+ [Time])
```

Mandatory: **Subject, Action, Objective, Contrast**.
Optional: **Proof** (near-mandatory for education, skip for entertainment), **Time** (when credible).

Slot collapses are normal:

- Objective + Contrast often collapse: `0 to 100K`, `before you build it`.
- Action + Objective often collapse: `marry her`.
- Subject doubling is fine: introduce an entity, then `you` becomes the operating subject.

Subject defaults by content type:

- Education / how-to → `you`.
- Lived-experience / case study → `I`.
- Cultural observation → `we`.
- Entity reveal → proper noun.

### Pillar 3: Three-Beat Structure (sentence-level mechanics)

Hooks are 2–3 lines, not 1. Lay out three beats:

1. **Context Lean** — 1–2 sentences. Topic noun in the first 5–7 words. Earn the lean with one of: common ground, pain/benefit reference, simplifying metaphor, or "blow your mind" insight.
2. **Scroll-Stop Interjection** — 1 sentence. Use a contrastive connector (`but`, `however`, `yet`, `although`, `therefore`, `on the other hand`). Reuse an embedded belief, then refuse it. Not a cliffhanger.
3. **Contrarian Snapback** — 1 sentence. Travels in a _different direction_ than the lean (different category of cause/effect), still on topic. Bigger shock = bigger snap.

If clarity + contrast can collapse to one line, prefer that — it is reusable as a template. Never pad to fill three lines.

### Pillar 4: Four-Mistake Diagnostic (rewrite passes)

Run these passes in order. Each pass assumes the prior one is clean. Only ship hooks that pass all four.

1. **Delay** — Is the topic noun in the first 1–2 seconds (first 5–7 words)? If not, delete everything before the topic line.
2. **Confusion** — Can the hook be misread? Run the **ambiguity test**: read just the hook in isolation; list the top 2–3 plausible interpretations; if more than one is viable, rewrite. Drop to 6th-grade reading level, active voice, no nominalisations, max one subordinate clause.
3. **Irrelevance** — Does it use `you/your`? Does it agitate a specific painpoint, not just describe a topic? Convert "I've struggled with…" → "If you've struggled with…" Convert "trends in skincare" → "if you struggle with acne, try these three things."
4. **Disinterest** — Is there explicit A vs B contrast? A = the viewer's status-quo belief. B = your contrarian alternative that is faster, better, or cheaper. Bias B toward re-agitating the painpoint of A.

Stated contrast (say A and B explicitly) is safer for broad audiences. Implied contrast (say only B; assume A is universal) only works in tight niches where A is unambiguous.

## Workflow: Generate A Hook From Scratch

Given a topic, content goal, and target viewer:

1. **Inventory the visual.** What footage, screen recording, prop, signature object, or motion graphic is actually available? If none, decide: manufacture one (text overlay + stock + jump-cut), or kill the video idea. Do not write a hook for a video without a key visual.
2. **Find the biggest contrast.** Of all the angles available on this topic, which produces the largest A → B gap? That contrast picks the archetype.
3. **Pick the archetype** from the available visual + the strength of the contrast (table above).
4. **Slot-fill the spoken hook** by writing each slot in order: Subject, Action, Objective, Contrast (mandatory). Then ask whether Proof and Time earn their place.
5. **Lay out the three beats.** Express the slot stack across context lean, scroll-stop, and contrarian snapback. Cap each sentence at ~12 words. Default to staccato rhythm in the hook; let the body breathe later.
6. **Write the text overlay.** 3–5 words, big bold font, distilling the topic in the audience's working vocabulary — not the literal product or feature name. ("Future of home design" beats "life-size floor plans." "Your second brain catches up" beats "BuildOS phase 2.")
7. **Specify the visual cue.** Talking-head with subtle motion, rapid match-cut, product motion, or visual pacifier. Calibrate motion like a deer's peripheral vision — too much overwhelms; too little bores.
8. **Specify the audio cue.** Music or SFX choice; or none. Default: voice-led, no music, until a recurring audio signature is established.
9. **Run the four-mistake diagnostic.** Delay → Confusion → Irrelevance → Disinterest. Reject and rewrite on any failure.
10. **Run the comprehension sandwich test.** Watch silent → with audio → silent again. The eye scans the visual, then the ear parses the spoken hook, then the eye returns to confirm. Do all four signals (spoken / visual / text / audio) point at the same key visual? If not, stop and rework — or kill the idea.
11. **Generate 3–5 variants** using different archetypes and lean mechanisms. Never ship a single hook draft.
12. **Output** all four signals as a labelled bundle (see Output section).

## Workflow: Audit And Rewrite An Existing Hook

Given a draft hook the founder already wrote:

1. **Slot-tag the draft.** Map each word/phrase to a slot. Mark missing mandatory slots explicitly.
2. **Identify the archetype** the draft is implicitly using.
3. **Run the four-mistake diagnostic** in order. Stop at the first failure and rewrite that pass before continuing.
4. **Run the ambiguity test** in isolation.
5. **Score the two-variable hook test:** topic clarity (yes/no), on-target curiosity (yes/no). Refuse to ship anything below 2/2.
6. **Propose the LLM clarity rewrite** using the verbatim Kallaway prompt: _"I've written a hook for a short-form video about X topic. I need help increasing the clarity and the framing of the sentences I used. I want the meaning to be the exact same, but can you rewrite this in a sixth-grade reading level so that there's no misunderstanding from the viewer?"_ Use the LLM only for clarity rewriting of an existing draft — never for hook generation from scratch.
7. **Generate 3 rewrite candidates** that fix the identified failure mode.
8. **Re-run the comprehension sandwich** on the chosen candidate.

## The Visual Layer

Visual hooks are ~100× more powerful than spoken hooks. Treat the spoken hook as set-up for the visual, not the other way around.

- **Text overlay rules.** 3–5 words. Big bold font. No literal feature names; use the emotional category phrase. Reinforce the visual on the way in and on the visual-confirmation rebound.
- **Motion calibration.** Just enough motion to function like a deer hearing a sound. Too much overwhelms; too little bores.
- **The four-component alignment.** Spoken hook + visual hook + text hook + audio hook must all point at the same key visual. Misalignment → comprehension loss → no curiosity loop.
- **Visual pacifiers** (typing, walking, whiteboarding, working on a real project) compensate for weak topical b-roll, especially for solo founders.
- **Recurring visual signature.** Develop one (signature object, jump-cut, "check this" verbal tic). Pavlovian recognition compounds over a feed.

## BuildOS Voice Translation

The Kallaway register skews toward shock and creator-economy hype ("0 to 100K subs," "make you go viral"). BuildOS positions anti-AI, anti-hype, anti-feed. The slot grammar holds; the register changes.

- **Translate "shock" to "relief" in the objective slot.** Not "0 to 100K subs in 90 days." Try "messy thinking turns into structured work in one brain dump."
- **Default contrast mode is anti-AI / anti-feed.** A = "AI-everywhere makes you faster." B = "AI as a thinking environment, not an autopilot." A = "social media is dead." B = "interest media never was social."
- **Cult-hop anchors must avoid AI hype names.** Use creators, authors, athletes, productivity icons. Never "ChatGPT," "Claude," "agents" as the cult anchor.
- **Contrarian is the BuildOS-native archetype.** The product positioning is itself contrarian. Use it freely — but balance with Teacher and Investigator hooks across the feed so it doesn't degrade into a hot-takes account.
- **Pain-point inventory** for BuildOS audiences: stuck-in-loops, scattered thoughts, ADHD overwhelm, context rot, blank-page paralysis, AI-fatigue, productivity-tool churn. Default the irrelevance pass to one of these.

## Guardrails

- **No empty snapbacks.** If the body does not pay off the hook's promise, refuse to ship — regardless of cleverness. The agent must verify the payoff exists before drafting the hook.
- **No vague-suspense hooks on text-only platforms.** Twitter, LinkedIn, Reddit, blog ledes — verbal clarity must carry the weight. Vague-suspense ("you won't believe this") only works with strong visual or text-overlay scaffolding.
- **No literal product names in text overlays** when an emotional category phrase exists.
- **No jargon in the spoken hook.** "Generative world model" fails. "Ontology" fails. "Context engineering" fails. Translate to working-vocabulary terms or kill the hook.
- **No hooks where the topic is missing from the first 5–7 words** (Delay pass).
- **No hooks scoring below 2/2** on topic clarity + on-target curiosity.
- **No LLM-generated hooks from blank prompts.** LLM is a clarity rewriter, not a generator. Seed first.
- **No first-person `I` openings** unless the credibility/result IS the hook ("I made $30k in 30 days…"). Default to `you/your`.
- **No archetype stacking across a feed.** Rotate archetypes across content cycles. All-Contrarian feeds erode credibility; all-Teacher feeds become lectures.
- **No "future of [abstract noun]" claims** when the visual literally shows something concrete and unrelated. Either match visual to claim or kill the video.
- **No publishing without a hook variant log.** Every shipped hook records: archetype, lean mechanism, contrast type (stated/implied), four-component alignment grade, post-publish performance.

## Output

Return a hook bundle, never a single line:

- **Archetype** (Fortune Teller / Experimenter / Teacher / Magician+layer / Investigator / Contrarian)
- **Slot map** (Subject / Action / Objective / Contrast / [Proof] / [Time])
- **Three beats** (Context Lean / Scroll-Stop / Contrarian Snapback) as labelled fields
- **Spoken hook** — assembled, ≤3 lines, ≤12 words per sentence
- **Text overlay** — 3–5 words, bold-font ready
- **Visual cue** — type and motion calibration note
- **Audio cue** — music/SFX or "voice-led"
- **Lean mechanism** — common ground / pain or benefit / simplifying metaphor / blow-your-mind
- **Contrast mode** — stated or implied; for implied, name the assumed A
- **Two-variable score** — topic clarity (yes/no), on-target curiosity (yes/no)
- **Four-mistake pass result** — Delay / Confusion / Irrelevance / Disinterest each marked clean or fail-with-fix
- **Comprehension sandwich result** — silent / with-audio / silent agreement note
- **3–5 variants** using different archetypes or lean mechanisms
- **Payoff coherence note** — one sentence stating what in the body pays off the hook's promise; if missing, refuse to ship

For audit-mode runs, replace the variants section with a **diagnostic report** naming the failed pass and the proposed fix.

## Notes

- Three rules in this skill are intentionally craft-judgment, not binary judge-checkable: **on-target curiosity** (two-variable score), the snapback traveling in a **different direction** (Pillar 3), and **motion calibration** (the deer-peripheral-vision feel). They stay as taste calls by design — future eval authors should not re-flag them as vague or operationalize them into fake thresholds.

## Worked Example

Condensed gold-standard bundle for the eval Task 1 fixture (brain-dump demo footage, overwhelmed-founder audience); input in `evals.md`. Match this shape.

**Winning variant (full bundle):**

- **Archetype:** Experimenter — picked from the available footage, not the topic: a live screen recording of a rambling transcript becoming a structured board is an old-method → new-method demo shown live.
- **Slot map:**
    - Subject: `you` / `your messy notes`
    - Action: dump everything into one place
    - Objective: a plan you can actually act on (relief, not hype)
    - Contrast: organizing harder across six apps (A) vs one raw dump that structures itself (B)
    - Proof: the transformation happens on screen, live
    - Time: ~40 seconds (credible — the body really does it)
- **Three beats:**
    - Context Lean: "Your messy notes are scattered across six apps."
    - Scroll-Stop Interjection: "But organizing them harder was never the fix."
    - Contrarian Snapback: "Watch one raw ramble become a full plan."
- **Spoken hook:** "Your messy notes are scattered across six apps. But organizing them harder was never the fix. Watch one raw ramble become a full plan." (8 / 8 / 9 words; topic noun at words 2–3.)
- **Text overlay:** `Mess in, plan out` — 4 words, emotional category phrase, no product or feature name.
- **Visual cue:** Cold-open on the screen recording mid-transformation — tasks materializing, phases forming. The board forming IS the motion; no extra zooms or jump-cuts. Talking-head only as a small inset, if at all.
- **Audio cue:** Voice-led, no music (no recurring audio signature established yet).
- **Lean mechanism:** Pain/benefit reference (scattered notes, feeling behind).
- **Contrast mode:** Stated — A = "sort harder across more apps," B = "one raw dump, plan forms itself."
- **Two-variable score:** Topic clarity: yes (messy notes → plan is explicit). On-target curiosity: yes ("can one ramble really become a plan?" pulls exactly the scattered-notes viewer). 2/2 — ship.
- **Four-mistake pass result:**
    - Delay: clean — topic noun in the first 5–7 words.
    - Confusion: clean — one plausible reading; active voice, 6th-grade level, no nominalisations.
    - Irrelevance: clean — `you/your` throughout; agitates the scattered-across-six-apps painpoint.
    - Disinterest: clean — explicit A-vs-B; B re-agitates A's painpoint.
- **Comprehension sandwich:** Silent: forming board + "Mess in, plan out" reads as transformation. With audio: spoken hook names the same pain and the same demo. Silent again: eye returns to the board and confirms. All four signals point at the forming board — aligned.
- **Payoff coherence note:** The body shows the full transcript-to-board transformation in ~40 seconds, ending on the organized board — it pays off "watch one raw ramble become a full plan" exactly.

**Alternate variants (archetype + spoken hook only; full run generated 4):**

- Contrarian: "Your six productivity apps are why you feel behind. The fix isn't another system. It's dumping everything into one messy pile — watch."
- Investigator: "Scattered notes aren't a discipline problem. There's a reason no app fixed it. Watch what happens when everything lands in one place."

## Source Attribution

Distilled from four Kallaway videos:

- [How to Create Irresistible Hooks (and blow up your content)](https://www.youtube.com/watch?v=LmXpbP7dD48) — three-beat structure (context lean / scroll-stop / contrarian snapback), four lean mechanisms, visual hook 100× rule, lead-with-pain rule, cult-hopping, staccato sentences.
- [Give me 15 mins, and I'll make your hooks impossible to skip](https://www.youtube.com/watch?v=2byPP_9F0-Q) — two-variable test (topic clarity + on-target curiosity), four-mistake diagnostic (Delay / Confusion / Irrelevance / Disinterest), stated-vs-implied contrast, ambiguity test, AI clarity-rewrite prompt, 80/20 framing.
- [I Studied 100 Viral Hooks, These 6 Will Make You Go Viral](https://www.youtube.com/watch?v=xnOe8aA9Pmw) — six-archetype catalog (Fortune Teller / Experimenter / Teacher / Magician / Investigator / Contrarian), four-component alignment (spoken / visual / text / audio), visual → audio → visual sandwich, pick-archetype-from-the-visual rule, kill-the-video-if-the-visual-is-weak rule.
- [The ONLY 6 Words You Need to Hook ANY Viewer](https://www.youtube.com/watch?v=S9FlxFv9dxg) — six-slot grammar (Subject / Action / Objective / Contrast / Proof / Time), copy-work drill, slot-collapse patterns, audit-mode-as-primary-use rule.

Converging second source — **Brendan Kane** (_Hook Point: How to Stand Out in a 3-Second World_), via [interview](https://www.youtube.com/watch?v=pGXiK8b7d-E). Kane arrives independently at the same load-bearing rules already distilled from Kallaway above — he adds no new mechanics, thresholds, or workflow steps, but corroborates the core at the principle level: the under-three-second hook threshold, the visual/multi-channel hook ("text, colors, action on screen, facial expressions" — not just words), and the empty-snapback guardrail (a misleading hook tanks the retention graph and viewers churn — _"we're pretty close to the end of the clickbait days"_). Kane's one BuildOS-native addition is the **follower-ROI reframe**, which pairs with the skill's anti-vanity-metrics register: _"how much did you spend to acquire followers… it's the wrong question. The question should be: what was my return on investment?"_ — 10–20K engaged followers can out-earn millions of dead ones. Analysis: `docs/research/youtube-library/analyses/2026-06-11_brendan-kane_hook-point-three-second-world_analysis.md`.

Underlying Kallaway analyses live at:

- `docs/research/youtube-library/analyses/2026-04-29_kallaway_irresistible-hooks_analysis.md`
- `docs/research/youtube-library/analyses/2026-04-29_kallaway_hooks-impossible-to-skip_analysis.md`
- `docs/research/youtube-library/analyses/2026-04-29_kallaway_100-viral-hooks_analysis.md`
- `docs/research/youtube-library/analyses/2026-04-29_kallaway_6-words-hook_analysis.md`

Combo index: `docs/research/youtube-library/skill-combo-indexes/MARKETING_AND_CONTENT.md` ("Hook craft for short-form video"). Pairs naturally with `content-strategy-beyond-blogging` (strategic layer above) and the upcoming `viral-video-script-structure` skill (body that follows the hook).