Your first Shepherd app¶
Page status: release-ready Source state: checked-example Applies to: Shepherd v1.0-dev Owner: @docs-system-owner (TBD) Validation: docs_src/tutorials/first_app/test_first_app.py
This is a tutorial — a learning path in order. For task-specific recipes see the guides; for exact APIs see the reference.
Achievement. Build and run a two-task change reviewer: one task that classifies a code change, one that reviews it, composed with plain Python. ~30–40 minutes.
Prerequisites. Python 3.11+; comfortable with dataclasses and type hints. No prior agent-framework experience assumed.
Every code block on this page is included from one tested file
(docs_src/tutorials/first_app/app.py), and its behavior is asserted in CI
against a recorded, deterministic offline provider — what you read is what
ran. The source-state inventory says exactly
what that means today.
§1. What you'll build¶
A change reviewer made of two model-backed tasks and one ordinary function:
triage_change(diff) -> Triage— classifies a code change: category, priority, rationale.write_review(diff, triage) -> Review— writes a short review, given the diff and its triage.review_change(diff) -> Review— plain Python that feeds the first task's output into the second.
Small as it is, this is the real shape of a Shepherd program: typed functions where you want a model, ordinary Python everywhere in between.
§2. Your first task¶
Start with the imports and the type you want back:
from dataclasses import dataclass
import shepherd as shp
from shepherd.providers import claude
@dataclass(frozen=True)
class Triage:
category: str # bugfix | feature | docs | refactor
priority: str # low | medium | high
rationale: str
Triage is a frozen dataclass — three string fields and two comments. It is
not boilerplate; it is the contract. Whatever the model says, your code
only ever receives a real Triage instance with those fields, or an error.
Now the task:
@shp.task
def triage_change(diff: str) -> Triage:
"""Classify this code change.
Categories are bugfix, feature, docs, and refactor.
Priority reflects user impact, not engineering effort.
"""
That function has no body, and it does not need one. In Shepherd the signature carries the meaning:
- Parameters are the inputs.
diff: strtells Shepherd to hand the model a string nameddiff. You never format a prompt by hand — each parameter is rendered for the model in a way appropriate to its type. - The return type is the validated contract.
-> Triagebecomes the response schema. The reply is checked and coerced into aTriage, or the call raises. There is no JSON parsing anywhere in your code. - The docstring is the instruction. The first line is the job; the rest is elaboration. Write it the way you would brief a careful colleague: plain English, the categories named, the judgment call made explicit ("priority reflects user impact, not engineering effort").
One consequence is worth pausing on: that docstring is behavior, not a comment. It is what the model is actually asked to do, so editing it changes what your program does — and a bodyless task without one is rejected when the decorator runs, not silently accepted.
Checkpoint
You have a typed, model-backed function. Prove the docstring rule to
yourself: delete the docstring and re-import the module — @shp.task
raises TypeError at definition time, because a bodyless task with no
instruction is meaningless. Put it back.
§3. Run it in a workspace¶
A task does not choose its own model. That is the job of the workspace — the ambient context that every task call inside the block inherits:
with shp.workspace(model=claude("sonnet-4-5")):
triage = triage_change(SAMPLE_DIFF)
review = review_change(SAMPLE_DIFF)
This block is the entry point of the finished program, which is why it is
indented — it sits inside the file's main(). Two things to notice:
shp.workspace(model=claude("sonnet-4-5"))pins the model once, at the top. The tasks themselves stay model-agnostic: change that one argument and the same tasks run against a different model.- The second call,
review_change, is the composed reviewer you build in §4. The file ships complete, so you are meeting the entry point one section early.
Calling a task outside any workspace fails fast with an error telling you to open one. There is no hidden default model and no accidental network call.
SAMPLE_DIFF, defined in the same file, is a one-line change to an admin
check:
diff --git a/auth.py b/auth.py
@@ -42,7 +42,7 @@
- if user.is_admin:
+ if user.is_admin or user.has_role("admin"):
Run the program:
Expected output
bugfix/high: approve - Tightens the admin gate in auth.py by requiring an explicit role check; low blast radius, no API change.
The line is deterministic because the offline provider replays recorded transcripts — the same ones CI asserts against, so this page cannot drift from the code.
Checkpoint
The program ran end to end, and triage is a real Triage instance:
triage.category == "bugfix", triage.priority == "high", and
triage.rationale is a sentence you can read. Typed in, typed out — no
parsing code anywhere in the file.
§4. Compose a second task¶
The reviewer needs a second task, and something to connect the two:
@dataclass(frozen=True)
class Review:
summary: str
verdict: str # approve | request-changes
@shp.task
def write_review(diff: str, triage: Triage) -> Review:
"""Write a short review for this change, given its triage."""
def review_change(diff: str) -> Review:
return write_review(diff, triage_change(diff))
Three small pieces:
Reviewis another frozen dataclass — the second contract.write_reviewis another bodyless task. Look at its second parameter:triage: Triage. Tasks can take structured inputs, including the typed output of another task; Shepherd renders the dataclass's fields to the model as labeled inputs, the same way it rendereddiff.review_changeis the composition — and it is not a task. It is a plain function with a single line of ordinary Python: calltriage_change, pass the result towrite_review.
This is the payoff of tasks being functions. Composition is function
call. There is no pipeline object, no graph DSL, no orchestration
framework: a reader who has never seen Shepherd can still read
write_review(diff, triage_change(diff)) correctly. And because each task is
an independent typed function, you can move the two tasks to different
modules, test them separately, or reuse triage_change in another program
tomorrow.
When one task genuinely needs several model calls with control flow between them, you can give it a body and sequence the calls yourself — a later tutorial covers that. Most tasks look like the two on this page.
Checkpoint
review_change(SAMPLE_DIFF) returns a Review with
verdict == "approve" and a summary that names auth.py. The page's
test file asserts exactly this — run it yourself:
pytest docs_src/tutorials/first_app/test_first_app.py.
§5. What's next¶
You have built the core Shepherd shape: typed contracts (Triage,
Review), bodyless tasks whose docstrings are the instructions, one
workspace pinning the model, and composition in plain Python.
From here:
- Concepts: Tasks — the mental model behind what you just did: why signatures carry meaning, what a task may do at runtime, and where the boundaries sit.
- Guides — task-focused recipes (configuring a provider, testing Shepherd code, debugging a failing run) are being drafted and will appear in the nav as each is promoted.
- Source-state inventory — the reference lane's honest ledger: which facts on these pages are backed by shipped source, checked examples, or fixtures today.