What are the building blocks of a strong People Ops prompt?

Five: Role, Context, Task, Constraints, and Output spec. Role sets the perspective, Context gives the model what it cannot know, Task states what you actually need, Constraints set the boundaries, and Output spec fixes the exact shape of the answer. You do not need all five every time. If you can only fix one thing, fix the Output spec: it saves more time than any other block. Context and Output spec are the two that people drop first under time pressure, which is exactly when they matter most.

What is prompt chaining and why does it matter for HR?

Prompt chaining is breaking a complex task into a sequence of smaller prompts, where each output feeds the next. Instead of asking the model to design your performance review process in one shot, you have it list the common failure modes, pick the three that fit your context, draft one version, then stress-test it. Chaining produces work closer to how a senior practitioner thinks, and it makes errors visible at each step so you catch them before they compound. Three to five prompts is the useful range.

How do I stop AI giving confidently wrong answers?

Run a critique pass on every draft before you act on it, not just on the big calls. Ask the model what it assumed and has not verified, what would change its recommendation, and where the answer is weakest. The upgrade most teams miss: run the critique in a different model from the one that drafted. A model will not reliably catch its own blind spots, because the blind spots come from how that model reasons in the first place.

Does the same prompt work across ChatGPT, Claude and Gemini?

No, and treating it as if it does leaves quality on the table. GPT-5 follows instructions very literally and will not reason step by step unless you tell it to. Claude is more interpretive and surfaces nuance unprompted, which makes it strong for drafting and sensitive comms. Gemini is strongest on research and multimodal work. Write one good prompt, then adapt it per model. The switching cost is low and the quality lift is real.

Prompting patterns for People Ops

The wrong belief is that better AI output comes from a better model. Spend more, buy the top tier, and the answers get sharper on their own. It does not work like that. The variable that decides whether AI does anything useful for a People team is not the model, it is the prompt pattern, and prompt patterns are learnable in an afternoon.

Prompting patterns for People Ops are a small set of reusable prompt structures that cover most of the work a People function does with AI. Three families do most of it: structured prompts built from five blocks, prompt chains that break a hard task into steps, and critical-thinking prompts that stress-test a draft before you act on it. Learn those three and you will outperform a team on a more expensive model that is still typing one question in and reading one answer out.

The five building blocks of a strong prompt

Every prompt that earns its place draws from some combination of five elements.

Role. Who is the model acting as? "Act as a senior People Business Partner with experience in regulated environments." This sets the perspective and the implicit expertise level, and it changes the register of everything that follows.

Context. What do you know that the model does not? "We are a 220-person SaaS company, two-year average tenure, just acquired a 40-person team in Berlin, hybrid policy under review." Context is the single biggest lever on output quality, and it is the one people skip because it feels like typing out the obvious.

Task. What do you actually need? Not "help with onboarding," which is a topic. "Draft a 30-60-90 day onboarding plan for an Engineering Manager joining the Berlin team," which is a task the model can finish.

Constraints. The boundaries. Format, tone, length, what to avoid. "Under 600 words, British English, no jargon, do not assume we have a formal levelling framework."

Output spec. The exact shape of the answer. "Return a table with columns: Phase, Goal, Activities, Owner, Success metric." An output spec saves more time than any other block, because it stops the model handing you a wall of prose you then have to reshape by hand.

You do not need all five every time. A throwaway question does not need a role and a constraint stack. But the moment the work matters, the five blocks turn an average answer into one you can take into a meeting.
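One way to make the five blocks a habit is to give them a fixed shape in code. The sketch below is illustrative only: the `StructuredPrompt` class and its field names are assumptions for this example, not part of any tool. It renders whichever blocks you fill in and skips the rest, so a throwaway question can still be just a task.

```python
from dataclasses import dataclass


@dataclass
class StructuredPrompt:
    """Hypothetical container for the five blocks; only the task is required."""
    task: str
    role: str = ""
    context: str = ""
    constraints: str = ""
    output_spec: str = ""

    def render(self) -> str:
        # Fixed block order; empty blocks are skipped rather than printed blank.
        blocks = [
            ("Role", self.role),
            ("Context", self.context),
            ("Task", self.task),
            ("Constraints", self.constraints),
            ("Output", self.output_spec),
        ]
        return "\n".join(f"{label}: {text}" for label, text in blocks if text)


onboarding_prompt = StructuredPrompt(
    role="Act as a senior People Business Partner with experience in regulated environments.",
    context=(
        "We are a 220-person SaaS company, two-year average tenure, just acquired "
        "a 40-person team in Berlin, hybrid policy under review."
    ),
    task="Draft a 30-60-90 day onboarding plan for an Engineering Manager joining the Berlin team.",
    constraints=(
        "Under 600 words, British English, no jargon, do not assume we have "
        "a formal levelling framework."
    ),
    output_spec="Return a table with columns: Phase, Goal, Activities, Owner, Success metric.",
)

print(onboarding_prompt.render())
```

Because the blocks are named fields, a missing Context or Output spec is visible at a glance instead of being buried in a paragraph.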

A prompt that gets used

Names the role and the expertise level it needs

Gives the real context: size, tenure, the acquisition, the open policy

States one finishable task, not a topic

Sets constraints: length, English variant, tone, what to avoid

Specifies the output shape, so the answer arrives usable

A prompt that gets ignored

Assumes the model knows who it should be

Leaves out everything that makes your company yours

Points at a topic and hopes

No boundaries, so the answer sprawls

No shape, so you rebuild the output by hand anyway

Same model, same question. The difference is which blocks are present.

Put it in concrete terms. A weak prompt: "Give me ideas for improving onboarding." You get generic ideas you have read three times. A strong one: "Act as a People Ops lead with onboarding redesign experience in 200-person hybrid SaaS companies. Context: average new-hire ramp is 12 weeks, manager satisfaction with onboarding sits at 6.2 out of 10, we are about to grow engineering by 40%. Task: propose three onboarding redesigns we could pilot in Q3, each addressing a different root cause. Constraints: under 800 words, prioritised by impact-to-effort. Output: a markdown table with columns Pilot name, Hypothesis, Owner, Cost, Risk, Success metric."

The second one produces something you can take to a planning meeting. The first produces a wall of words you then have to think through yourself, which was the job you were trying to hand over.

Chaining: break the work into a sequence

Single prompts have a ceiling. The work that matters in People Ops (redesigning a process, diagnosing a culture problem, drafting a policy that survives legal and the team) is too complex for one prompt. The pattern that breaks the ceiling is chaining: a sequence of prompts where each output feeds the next, so the model works the problem the way a thoughtful practitioner would.

Prompt 1. Map the failure modes. "List the ten most common failure modes in technical onboarding for hybrid SaaS at 150 to 300 people. Name the root cause and the symptom for each."

Prompt 2. Narrow to your case. "Of those ten, pick the three most relevant given this context, and justify the selection."

Prompt 3. Draft the redesign. "For each of the three, redesign the current onboarding flow to fix the root cause without breaking what works."

Prompt 4. Stress-test it. "What would a sceptical engineering lead say? What would Finance ask about cost? What would a new joiner who lived it think?"

Prompt 5. Synthesise the proposal. "One page: what we are testing, why, how we measure, what we stop doing if it fails."

Five prompts, about twenty minutes. The output is meaningfully better than anything a single mega-prompt produces, because the chain forces sequential thinking instead of asking for a finished answer to a question the model has not yet reasoned through.

The other reason chains beat mega-prompts is that errors become visible. If prompt 2 picks the wrong three failure modes, you catch it before prompt 3 builds a redesign on a bad foundation. A single prompt hides its reasoning inside one block of output. A chain exposes it, one decision at a time, where you can still change it. Three to five links is the useful range. Fewer and you have not actually broken the task down. More and you are burning time a single sharp prompt would have saved.
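The mechanics of a chain are simple enough to sketch in a few lines. This is a minimal illustration, not a production runner: `run_chain`, the `{previous}` placeholder and `placeholder_model` are all hypothetical names, and the model call is a stub you would swap for whichever client your team uses. The point it shows is that every link's output is kept, so you can inspect prompt 2 before prompt 3 builds on it.

```python
from typing import Callable, List


def run_chain(steps: List[str], call_model: Callable[[str], str]) -> List[str]:
    """Run prompts in order, substituting the previous output into each step."""
    outputs: List[str] = []
    previous = ""
    for step in steps:
        # str.replace is used instead of str.format so braces in model
        # output cannot break the template.
        prompt = step.replace("{previous}", previous)
        previous = call_model(prompt)
        outputs.append(previous)  # keep every link so it can be reviewed
    return outputs


def placeholder_model(prompt: str) -> str:
    # Hypothetical stand-in for a real model call.
    return f"[model answer to a {len(prompt)}-character prompt]"


ONBOARDING_CHAIN = [
    "List the ten most common failure modes in technical onboarding for hybrid "
    "SaaS at 150 to 300 people. Name the root cause and the symptom for each.",
    "Failure modes:\n{previous}\n\nOf those ten, pick the three most relevant "
    "given this context, and justify the selection.",
    "Selected failure modes:\n{previous}\n\nFor each of the three, redesign the "
    "current onboarding flow to fix the root cause without breaking what works.",
    "Redesign:\n{previous}\n\nWhat would a sceptical engineering lead say? What "
    "would Finance ask about cost? What would a new joiner who lived it think?",
    "Stress test:\n{previous}\n\nOne page: what we are testing, why, how we "
    "measure, what we stop doing if it fails.",
]

for number, answer in enumerate(run_chain(ONBOARDING_CHAIN, placeholder_model), start=1):
    print(f"Prompt {number}: {answer}")
```

In this sketch each step sees only the output of the step before it. Running the chain inside one conversation, or passing earlier outputs forward as well, keeps the full history in view at the cost of longer prompts.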

The critique pass most teams skip

The prompts most teams skip are the ones that catch the costly mistakes. A first draft from any model sounds confident. Confident is not the same as correct, and in People work the gap between the two shows up in a grievance, a botched policy line, or a comp decision that will not survive a tribunal.

So run everything you are about to act on through a critique pass. These are the questions that do the work.

Run every draft through this before you ship it

What did you assume in this answer that I have not verified?

Fails when: The model states a legal or comp fact as settled when it guessed

What evidence would change your recommendation?

Fails when: Nothing would, which means it is not reasoning, it is asserting

What is the weakest part of this argument?

Fails when: The weak part is the bit you were about to present as the headline

Where could this fail in production, and how would I detect it?

Fails when: No failure mode named, so nobody is watching for one

Give me three reasons this is wrong, even if you think it is right.

Fails when: It cannot find three, and the one it finds is trivial

Never ship the first draft. Critique pass, sometimes a second refine, then ship. That single habit is the biggest quality lift available to a People team using AI.
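If the critique pass is going to run on every draft, it helps to have the five questions stored once rather than retyped. A minimal sketch, assuming a hypothetical helper called `build_critique_prompt`; the wording of the wrapper text around the questions is illustrative.

```python
CRITIQUE_QUESTIONS = [
    "What did you assume in this answer that I have not verified?",
    "What evidence would change your recommendation?",
    "What is the weakest part of this argument?",
    "Where could this fail in production, and how would I detect it?",
    "Give me three reasons this is wrong, even if you think it is right.",
]


def build_critique_prompt(draft: str) -> str:
    """Wrap a draft in the five critique questions as a single prompt."""
    numbered = "\n".join(
        f"{number}. {question}"
        for number, question in enumerate(CRITIQUE_QUESTIONS, start=1)
    )
    return (
        "Review the draft below and answer each question separately.\n\n"
        f"Draft:\n{draft}\n\n"
        f"Questions:\n{numbered}"
    )


print(build_critique_prompt("Proposed hybrid policy: three anchor days per week."))
```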

Models are good critics of their own work when you ask them to be. But there is an upgrade most teams miss: run the critique in a different model from the one that drafted. A model will not catch its own blind spots as reliably as a second one will, because the blind spots come from how that specific model reasons. Draft in Claude, critique in GPT-5. The one that did not write the draft has no ego in it.

Which model for which job

The same prompt does not produce the same quality across models, and copy-pasting your library from one to another leaves value on the table. The current generation behaves differently enough that it is worth matching the model to the task.

Model	How it behaves	Reach for it when
GPT-5 and the OpenAI line	Very literal instruction following, large context, will not reason step by step unless told to	Structured outputs, multi-document workflows, agentic system prompts
Claude	More interpretive, surfaces nuance and pushback unprompted, strong long-form voice	Drafting, sensitive comms, policy, anything where tone and judgement matter
Gemini	Strongest on research and multimodal, good at synthesising large public-web context	Market scans, comparative analysis, vendor research
Perplexity	Answer engine, not a chat model in the same sense, cites as it goes	"What does the public web say about this," with sources attached

A working pattern falls out of that table: draft with Claude, structure with GPT-5, research with Gemini or Perplexity, critique with whichever model did not produce the draft. The switching cost is low. The quality lift is real. For a deeper read on which model to reach for by HR task type, see choosing AI models for HR work, and if you are still setting up which tools sit on the desk in the first place, the AI workspace setup for People teams covers the ground under all of this.
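The working pattern above can be written down as a small lookup, which makes the one rule that matters explicit: the critic is never the model that drafted. This is a sketch under stated assumptions. The model names are plain labels rather than API identifiers, and `choose_model` and the critic order are hypothetical choices for illustration.

```python
from typing import Optional

# Defaults taken from the working pattern in the text; adjust to your own tools.
MODEL_BY_JOB = {
    "draft": "Claude",
    "structure": "GPT-5",
    "research": "Gemini",  # Perplexity is the alternative when sources must be attached
}

# Order in which critics are tried (an assumption for this sketch).
CRITIC_CANDIDATES = ("GPT-5", "Claude", "Gemini")


def choose_model(job: str, drafted_by: Optional[str] = None) -> str:
    """Pick a model label for a job; critique always goes to a non-drafting model."""
    if job == "critique":
        if drafted_by is None:
            raise ValueError("critique needs to know which model wrote the draft")
        for model in CRITIC_CANDIDATES:
            if model != drafted_by:
                return model
    if job not in MODEL_BY_JOB:
        raise ValueError(f"unknown job: {job}")
    return MODEL_BY_JOB[job]


print(choose_model("draft"))
print(choose_model("critique", drafted_by="Claude"))
```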

Make your prompting patterns for People Ops shared assets

Treat prompts like code, not like notes. A prompt that works is a small piece of institutional knowledge. If it lives in one person's chat history, it is not an asset, it is a single point of failure wearing a paragraph.

The fix is not complicated, and it is worth being opinionated about.

Name and version them. "Exit-interview-themes v3" beats "that prompt I use." When you improve it, save the new version and note what changed. You will want the old one back at least once.

Document the pattern, not just the prompt. The reusable thing is the shape (five blocks, a chain, a critique pass), not the exact words about onboarding. Write down why the chain has those five steps, so the next person can adapt it to comp or performance without starting cold.

Keep them where the team already works. A Notion page, a shared doc, a folder in your AI workspace. Somewhere findable, not a personal bookmark. The test is simple: could a new joiner find and run your three best prompts on their first Monday, without asking anyone?

Audit what gets used. Patterns nobody runs should be retired or rewritten. A library of forty prompts that no one trusts is worse than five that everyone reaches for.
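Naming, versioning and auditing need very little machinery. The sketch below is an in-memory illustration only: `PromptLibrary` and its methods are hypothetical, and in practice the same structure could live in a Notion page or shared doc. It keeps every version with a change note, counts each fetch, and lists the prompts nobody has run.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class PromptVersion:
    text: str
    change_note: str


@dataclass
class PromptLibrary:
    versions: Dict[str, List[PromptVersion]] = field(default_factory=dict)
    usage: Dict[str, int] = field(default_factory=dict)

    def save(self, name: str, text: str, change_note: str) -> int:
        """Append a new version and return its 1-based version number."""
        history = self.versions.setdefault(name, [])
        history.append(PromptVersion(text, change_note))
        self.usage.setdefault(name, 0)
        return len(history)

    def get(self, name: str, version: Optional[int] = None) -> str:
        """Fetch the latest version, or a specific one, and count the use."""
        history = self.versions[name]
        if version is None:
            chosen = history[-1]
        elif 1 <= version <= len(history):
            chosen = history[version - 1]
        else:
            raise ValueError(f"{name} has no version {version}")
        self.usage[name] += 1
        return chosen.text

    def unused(self) -> List[str]:
        """Names never fetched: candidates to retire or rewrite."""
        return sorted(name for name, count in self.usage.items() if count == 0)


library = PromptLibrary()
library.save("exit-interview-themes", "Summarise themes.", "first version")
library.save("exit-interview-themes", "Summarise themes as a table.", "added output spec")
library.save("onboarding-chain", "Five-step onboarding chain.", "first version")
library.get("exit-interview-themes")
print(library.unused())
```

Keeping old versions costs nothing here, which matters because you will want one back at least once.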

The team that shares prompts compounds faster than the team that hoards them, and it compounds in a way that survives someone leaving. That is the difference between a clever individual and a capable function.

Where prompting stops being enough

Prompting is the entry point, and mastering the three families above will put you ahead of most People teams. But there is a ceiling, and it is worth naming so you do not spend a year polishing prompts when the constraint has moved.

Past a certain point, prompt quality stops being what holds you back. The gap becomes the context the model cannot reach on its own, the workflows it has to sit inside, and the guardrails that make it safe to run without someone watching every output. When you find yourself pasting the same context into every prompt, you have outgrown prompting. That context wants to live in a system the model can call, not in your clipboard.

That is the move from prompts into workflows: a tool like n8n, at around £20 per builder seat per month, SOC 2 and ISO 27001 compliant and self-hostable, wraps your best chain into something that runs on a schedule without a human retyping it. The extraction inside it stays model-only, never a regex fallback, so the judgement you built into the prompt is the judgement that runs in production. That shift is the subject of prompts to systems, and building the agents that come after it is covered in production agents for People Ops. The wider craft of getting AI to do real work in a People function sits under the AI workspace for People Ops pillar.

Get the prompting right first. The systems work pays off far more when the people building it can prompt well, and the fastest way to see where your own function sits on that curve is the Readiness Assessment: sixteen questions, about ten minutes, scored across the four layers a People AI capability actually rests on.

