Few-shot prompting vs. zero-shot prompting: which approach, and when?

Providing examples in your prompt is a technique known as few-shot prompting. Examples are a great way to quickly communicate what your desired response should look like - it is often easier to show the model with a few examples than to describe the pattern in words.
'Shots' is a term taken from the field of machine learning. Each shot is an example given to the model before it performs the task. We talk about 'few-shot' prompting (a handful of examples) and 'zero-shot' prompting (no examples at all).
Knowing when and how to supply examples is key to writing great prompts and getting better results from AI tools.
Let's see how it's done
I'll take my own advice and illustrate what each of these looks like with an example. Here, we'll use AI for a classification task: identifying the sentiment of a customer review.
Zero-shot:
Classify the sentiment of this review:
"The product arrived damaged and support was unhelpful."
Few-shot:
Classify the sentiment of these reviews as Positive, Negative, or Mixed.
Review: "Delivery was late but the item itself was great."
Sentiment: Mixed
Review: "Exactly what I ordered, arrived quickly."
Sentiment: Positive
Review: "The product arrived damaged and support was unhelpful."
Sentiment:
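If you build prompts programmatically, the few-shot structure above can be assembled from a list of example pairs. A minimal sketch in Python (the helper name and example data are my own, not from any library):

```python
# Minimal sketch: assemble a few-shot classification prompt from example pairs.
# The function name and data are illustrative choices, not a standard API.

EXAMPLES = [
    ("Delivery was late but the item itself was great.", "Mixed"),
    ("Exactly what I ordered, arrived quickly.", "Positive"),
]

def build_few_shot_prompt(examples, query, labels=("Positive", "Negative", "Mixed")):
    """Build a prompt that shows the label space, input style, and format."""
    lines = [
        f"Classify the sentiment of these reviews as "
        f"{', '.join(labels[:-1])}, or {labels[-1]}.",
        "",
    ]
    for review, sentiment in examples:
        lines.append(f'Review: "{review}"')
        lines.append(f"Sentiment: {sentiment}")
        lines.append("")
    # End with the new input and a trailing cue for the model to complete.
    lines.append(f'Review: "{query}"')
    lines.append("Sentiment:")
    return "\n".join(lines)

prompt = build_few_shot_prompt(
    EXAMPLES, "The product arrived damaged and support was unhelpful."
)
print(prompt)
```

Keeping the examples as data makes it easy to swap them per task or rebalance them across labels later.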
Why examples can be useful
To understand when and why examples are a helpful addition, researchers (Min et al., 2022) tested what happens when the labels in the examples are replaced with random, incorrect ones. They found that performance barely changed; the model was not primarily relying on the specific correct answers. Instead, the research identified three actual drivers of what makes examples useful:
Identifying the 'label space': seeing the range of possible outputs (e.g. Positive, Negative, Mixed)
Scoping the input distribution: exposure to what typical inputs look like
Learning the format and structure: the overall pattern of input --> output
What this tells us is that examples guide the model through structure and context, not factual recall. That's why even imperfect examples can still help, as long as they expose the model to the right output categories, realistic input patterns, and the format you expect.
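The experimental setup is easy to reproduce in miniature. This sketch (my own illustrative data and names, not the paper's code) builds demonstrations that keep the format and label space intact while randomising label correctness - the one thing the research found barely mattered:

```python
import random

# Illustrative sketch of the Min et al. (2022) setup: preserve the format and
# label space of the demonstrations, but randomise which label each input gets.

LABELS = ["Positive", "Negative", "Mixed"]
demos = [
    ("Delivery was late but the item itself was great.", "Mixed"),
    ("Exactly what I ordered, arrived quickly.", "Positive"),
    ("Broke after one use.", "Negative"),
]

random.seed(0)
# Same inputs, same structure, but labels drawn at random from the label space.
randomised = [(text, random.choice(LABELS)) for text, _ in demos]

def to_prompt(pairs):
    return "\n".join(f'Review: "{t}"\nSentiment: {s}' for t, s in pairs)

# Both versions expose identical structure to the model; only label
# correctness differs between them.
print(to_prompt(randomised))
```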
When zero-shot is enough
Zero-shot is the right option for most everyday tasks. Modern AI models are trained through a process called instruction tuning (Wei et al., 2021), which teaches them to interpret instructions directly and to respond sensibly to task descriptions they have not encountered before.
Before instruction tuning became widespread, techniques like zero-shot chain-of-thought prompting were developed to help unlock reasoning in models that struggled with complex tasks. Kojima et al. (2022) showed that simply adding a phrase like 'Let's think step by step' could dramatically improve accuracy on multi-step reasoning problems. As instruction tuning has improved across successive model generations, this kind of hand-holding has become less necessary for most everyday use, though it can still be worth trying when working through particularly complex, multi-step problems.
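The technique itself is just a suffix on the prompt. A tiny sketch (the question is my own example):

```python
# Sketch: appending the zero-shot chain-of-thought trigger phrase from
# Kojima et al. (2022) to an ordinary question. No examples are provided;
# the phrase alone nudges the model toward step-by-step reasoning.

question = (
    "A shop sells pens in packs of 12. If I need 100 pens, "
    "how many packs should I buy?"
)
cot_prompt = question + "\n\nLet's think step by step."
print(cot_prompt)
```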
Zero-shot works well when:
The task is clear and unambiguous
The expected output format is standard (a summary, a list, a translation)
You are working quickly, and consistency across many outputs is not critical
Enough context is already provided that a worked example would add nothing
For example:
Write a professional email and subject line for a message about a delayed project delivery.
Probably doesn't need any examples to get a useful response. Being explicit about format and including enough context about audience and purpose will cover most situations. However, for:
Here are three emails I've sent to clients [Example 1, 2, 3]. Now, write a follow-up to a new lead about our latest proposal [Context about the specific company].
...providing those examples can be quite useful for ensuring consistent quality, accuracy, and adherence to your specific brand voice.
When to use few-shot prompting
There are situations where zero-shot will keep falling short, regardless of how carefully the request is phrased. Common things to watch for include:
The output format is specific or non-standard
You need consistent results across a large number of inputs
You have refined the prompt several times, and it still misses something
The task involves a judgement call where an example anchors what "correct" looks like
You are teaching the model a custom behaviour that it has no prior training on
Writing good few-shot examples
The structure of your examples matters more than their content. A few practical guidelines:
Match the format you want to receive. If you want concise outputs, keep examples concise. If you need a specific structure, show it exactly. The model will mirror what it sees.
Use representative inputs, not ideal ones. Examples that look like real, typical inputs are more useful than polished ones. If your real inputs will be messy, your examples should reflect that.
Balance across output categories. If the task has three possible outcomes, show all three. A prompt that only demonstrates two of them may handle the third inconsistently.
Include at least one harder case. Clear-cut examples teach the easy end of the task. An edge case helps the model understand where the boundaries are, which is where errors tend to cluster.
Two to four examples is usually enough. A small number of well-chosen examples typically outperforms a longer list of mediocre ones.
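Some of these guidelines can be checked mechanically. A small sketch (the function name and thresholds are my own illustrative choices) that sanity-checks an example set for label coverage and size before you rely on it:

```python
# Sketch: sanity-check a few-shot example set against the guidelines above.
# The function name and the 2-4 example heuristic are illustrative, not a standard.

def check_examples(examples, labels, max_examples=4):
    """Return a list of warnings about label coverage and example count."""
    warnings = []
    shown = {label for _, label in examples}
    missing = set(labels) - shown
    if missing:
        # An uncovered category may be handled inconsistently by the model.
        warnings.append(f"No example for: {sorted(missing)}")
    if len(examples) > max_examples:
        warnings.append(f"{len(examples)} examples; 2-4 is usually enough")
    return warnings

examples = [
    ("Exactly what I ordered, arrived quickly.", "Positive"),
    ("Broke after one use.", "Negative"),
]
print(check_examples(examples, ["Positive", "Negative", "Mixed"]))
# One warning: the 'Mixed' category has no example.
```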
A suggested workflow
Often, the best approach is to start with zero-shot. If the output format is wrong, results are inconsistent, or you keep rephrasing the same prompt, then a few well-chosen examples will help you out.
Zero-shot handles most everyday tasks well. Reach for few-shot when you need more consistency or precision.
Just start simple. Move to examples when you need them. Few-shot prompting can be another tool in your arsenal when tackling a problem with an LLM.
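That escalation can even live in code. A sketch of the workflow, where `ask` stands in for whatever model call you use (everything here is illustrative): try zero-shot first, and fall back to a few-shot prompt only if the output fails a simple validity check.

```python
# Sketch of the start-simple workflow: zero-shot first, few-shot as a fallback.
# `ask` is a placeholder for your model call; prompts and data are illustrative.

VALID = {"Positive", "Negative", "Mixed"}

def classify(review, ask):
    zero_shot = (
        "Classify the sentiment of this review as Positive, Negative, or Mixed:\n"
        f'"{review}"'
    )
    answer = ask(zero_shot).strip()
    if answer in VALID:
        return answer
    # Fall back: add an example that pins down the label space and format.
    few_shot = (
        "Classify the sentiment of these reviews as Positive, Negative, or Mixed.\n\n"
        'Review: "Delivery was late but the item itself was great."\n'
        "Sentiment: Mixed\n\n"
        f'Review: "{review}"\nSentiment:'
    )
    return ask(few_shot).strip()

# Demo with a stub model that rambles on the first call and behaves on retry.
calls = []
def stub(prompt):
    calls.append(prompt)
    return "The sentiment here seems..." if len(calls) == 1 else "Negative"

print(classify("The product arrived damaged.", stub))  # prints "Negative"
```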
References
Brown, T. et al. (2020). Language Models are Few-Shot Learners. NeurIPS 2020. https://arxiv.org/abs/2005.14165
Wei, J. et al. (2021). Finetuned Language Models Are Zero-Shot Learners. ICLR 2022. https://arxiv.org/abs/2109.01652
Min, S. et al. (2022). Rethinking the Role of Demonstrations: What Makes In-Context Learning Work? EMNLP 2022. https://aclanthology.org/2022.emnlp-main.759/
Kojima, T. et al. (2022). Large Language Models are Zero-Shot Reasoners. NeurIPS 2022. https://arxiv.org/abs/2205.11916
