I’ve spent considerable time crafting prompts for numerous professional and personal tasks and projects: knowledge retrieval, research, productivity, writing, gaming, and media. Over time, I’ve developed my preferred prompt-writing style. I’ve learned a few tips along the way, and expect to learn more soon. How do we write the best prompts? Keep it simple.
Before we can start writing prompts, we need to revisit the fundamentals.
Understanding Success Before Writing Prompts
The internet is littered with best practices on writing prompts. Anthropic’s advice distills them into a strong foundation before we get started:
- Clearly define the success criteria for your use case.
- Identify methods to empirically test against those criteria.
- Draft an initial prompt to refine further.
I love these simple steps. Clearly defining what you’re trying to accomplish and how you define success before writing prompts is crucial. Unclear objectives make even well-crafted prompts ineffective. I can’t stress this enough:
Unclear objectives make even well-crafted prompts ineffective
Establish benchmarks by documenting inputs, expected outputs, and clear criteria. This will serve as your “ground truth” for evaluating prompt effectiveness. Make it easy on yourself: use tools like ChatGPT or Claude to draft initial prompts aligned with your success criteria, then iteratively refine based on your documented tests.
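Here’s a minimal sketch of what that ground truth might look like in practice. The `call_llm` helper, the test cases, and the pass/fail criteria are all placeholders for whatever client and examples you actually use.

```python
# Minimal sketch of a "ground truth" benchmark for a prompt.
# call_llm() is a placeholder for your actual LLM client (ChatGPT, Claude, etc.).

def call_llm(prompt: str) -> str:
    raise NotImplementedError("wire up your LLM client here")

# Documented inputs, expected outputs, and criteria (hypothetical examples).
ground_truth = [
    {"input": "Summarize our refund policy in one sentence.",
     "must_include": ["30 days", "original receipt"]},
    {"input": "What year was the company founded?",
     "must_include": ["1998"]},
]

PROMPT_TEMPLATE = "Answer using only the provided documentation.\n\nQuestion: {question}"

def evaluate(prompt_template: str) -> float:
    passed = 0
    for case in ground_truth:
        output = call_llm(prompt_template.format(question=case["input"]))
        if all(term.lower() in output.lower() for term in case["must_include"]):
            passed += 1
    return passed / len(ground_truth)  # fraction of cases the prompt satisfies
```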
Sounds straightforward, right?
When Simple Isn’t Simple
While this process seems simple, experience tells me otherwise. Even with clearly defined success criteria and solid tests, you might encounter significant hurdles with a prompt. The challenge often lies not in the prompt writing itself, but in understanding the fundamental limitations and capabilities of Large Language Models (LLMs).
Before investing time in prompt engineering, we must ask:
Is what you’re asking the LLM technically feasible, or reasonable?
Consider this analogy: Imagine assigning your task to a grad student. While they may be knowledgeable, they might struggle if the task is either too complex or requires specialized knowledge or expertise. Similarly, with LLMs, we need to consider both their capabilities and the complexity of our request.
Beyond the prompt, pay attention to which models you’re using. Prompt writing for chat models is dramatically different from prompt writing for reasoning models, and quality can vary greatly between models, e.g., 4o versus 4o-mini.
Understanding what you need to do, and with what models, helps us determine whether to pursue simple prompt engineering or explore more sophisticated approaches.
Prompt Complexity
Consider basic Retrieval-Augmented Generation (RAG): an LLM is given a question, asks for data to answer the question, and then answers the question, synthesizing an output. Simple, achievable, and well-suited to an LLM’s strengths. A rather simple prompt with a tool can accomplish this well.
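As a rough illustration, the single-prompt version can be as small as the sketch below. The `retrieve` and `call_llm` helpers are placeholders for your search tool and LLM client, and the prompt wording is just an example; whether the model calls the tool itself or you retrieve first, the shape is the same.

```python
# Sketch of basic RAG: retrieve data, put it into one focused prompt, answer.
# retrieve() and call_llm() are placeholders for your search tool and LLM client.

def retrieve(query: str) -> str:
    raise NotImplementedError("query your data source here")

def call_llm(prompt: str) -> str:
    raise NotImplementedError("wire up your LLM client here")

def answer(question: str) -> str:
    context = retrieve(question)
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}"
    )
    return call_llm(prompt)
```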
Now, let’s complicate things slightly. What if the provided data isn’t sufficient and you want to find more data? Your simple prompt evolves into a more complex, iterative approach:
- Find data to answer the user’s question.
- If the data is insufficient, instruct the LLM to specify what’s missing.
- Retrieve the additional data based on the LLM’s guidance.
- Reattempt the task with this newly sourced information.
- Rinse and repeat.
This iterative loop significantly increases complexity. Your prompt isn’t just answering a question; it’s diagnosing knowledge gaps and chasing the right data. And it can get even more complex from here!
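A rough sketch of that loop, reusing the `retrieve` and `call_llm` placeholders from the RAG example above; the MISSING convention and the retry limit are illustrative choices, not a prescribed pattern.

```python
# Sketch of the iterative "find more data" loop described above.
# Reuses the retrieve() and call_llm() placeholders from the earlier sketch.

MAX_ROUNDS = 3  # rinse and repeat, but not forever

def answer_with_gap_filling(question: str) -> str:
    context = retrieve(question)
    output = ""
    for _ in range(MAX_ROUNDS):
        prompt = (
            "Answer the question from the context. If the context is "
            "insufficient, reply with MISSING: <what you still need>.\n\n"
            f"Context:\n{context}\n\nQuestion: {question}"
        )
        output = call_llm(prompt)
        if not output.startswith("MISSING:"):
            return output
        # The LLM told us what's missing; fetch it and try again.
        context += "\n" + retrieve(output.removeprefix("MISSING:").strip())
    return output  # best effort after MAX_ROUNDS
```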
This might be where things get a little confusing…
KISS: Keep It Simple, Stupid!
No, you’re not stupid, and I’m not stupid; it’s just a saying, a principle I aim for.
One best practice in prompt engineering is maintaining simplicity and focus. Each prompt should have a clear purpose with well-structured instructions. This approach aligns with how LLMs process and respond to inputs.
When prompts become overly complex through attempting to handle multiple requirements, conditions, or tasks simultaneously, several risks emerge:
- Increased hallucination risk: Complex prompts can overwhelm the model’s context management, leading to more frequent hallucinations (h11n) or fabricated information.
- Degraded response quality: As prompt complexity grows, the likelihood of the model missing or misinterpreting key instructions increases.
- Reduced reliability: Complex prompts often yield inconsistent results across multiple runs, making outcomes less predictable.
This challenge, where increasing prompt complexity to handle sophisticated tasks actually reduces output quality, highlights why breaking down complex tasks into simpler, focused prompts is often more effective than crafting elaborate single-prompt solutions.
If you find yourself chasing poor outputs and h11ns, maybe your prompt is too complex!
“But my requirements demand I build something bigger…”
RKYPSS: Really, keep your prompt simple, smarty
When business requirements demand more, single-prompt solutions won’t cut it. I’ve found we have to move to multiple prompts. This might be a “well, duh” moment, as it often is with me. However, I need frequent reminding that making one complex prompt isn’t helping.
There are two solid approaches to working with multiple prompts:
Parallel and Independent
Parallel strategies employ independent prompts that address different aspects of the task simultaneously, each contributing toward a final output.
For example, we have an agent that needs to find related content in our advanced search tool. We need to break down the user’s ask into search phrases, identify filter values, identify people’s names, identify the time frame, and expand acronyms. It’s a lot. One prompt might be able to manage it… but in reality, not really. Instead, we break each requirement into its own prompt and run them in parallel.
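Here’s a sketch of that fan-out, with asyncio running the focused prompts concurrently. The prompt texts and the `call_llm_async` helper are illustrative placeholders, not a particular product’s API.

```python
import asyncio

# Sketch of parallel, independent prompts: each handles one focused piece
# of the user's ask. call_llm_async() is a placeholder for an async call
# to your LLM client.

async def call_llm_async(prompt: str) -> str:
    raise NotImplementedError("wire up your async LLM client here")

FOCUSED_PROMPTS = {
    "search_phrases": "Extract search phrases from this request:\n{ask}",
    "filters": "Identify any filter values (status, category, type) in:\n{ask}",
    "people": "List any people's names mentioned in:\n{ask}",
    "time_frame": "Identify the time frame referenced in:\n{ask}",
    "acronyms": "Expand any acronyms found in:\n{ask}",
}

async def analyze_request(ask: str) -> dict:
    # Kick off every focused prompt at once, then collect the results.
    tasks = {
        name: asyncio.create_task(call_llm_async(template.format(ask=ask)))
        for name, template in FOCUSED_PROMPTS.items()
    }
    return {name: await task for name, task in tasks.items()}
```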

Again, pay attention to the models you’re using. If you were using 4o for your single prompt, consider moving to 4o-mini here: given small, focused prompts, 4o-mini does a great job.
Sequential and Dependent
Alternatively, we can run prompts sequentially, each dependent on the last; we call this prompt chaining. Prompt chaining connects multiple prompts, where one prompt’s output directly informs the next, forming a structured workflow.
Taking our example above, prompt chaining might be needed if we need to first know what the search phrases are before we can generate the filters, expand acronyms, etc.
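A sketch of that chain, reusing the `call_llm` placeholder from earlier; the steps and prompt wording are illustrative.

```python
# Sketch of prompt chaining: each step's output feeds the next prompt.
# Reuses the call_llm() placeholder from the earlier sketches.

def build_search(ask: str) -> dict:
    # Step 1: we need the search phrases before anything else can happen.
    phrases = call_llm(f"Extract search phrases from this request:\n{ask}")

    # Step 2: filters depend on the phrases we just extracted.
    filters = call_llm(
        f"Given these search phrases:\n{phrases}\n\n"
        f"Identify filter values implied by the original request:\n{ask}"
    )

    # Step 3: acronym expansion also depends on the phrases.
    acronyms = call_llm(f"Expand any acronyms in these phrases:\n{phrases}")

    return {"phrases": phrases, "filters": filters, "acronyms": acronyms}
```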

As you might imagine, this approach is slower, as we’re making calls to our LLM, one after the other. Pay attention to models!
Sequential and Parallel
One size does not fit all! We generally use a combination of running our prompts sequentially and in parallel, giving us a lot more control and flexibility in our solution.
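As a sketch of the mixed approach, reusing the `call_llm_async` placeholder from the parallel example: one sequential step runs first, then a parallel fan-out that depends on its output.

```python
import asyncio

# Sketch of mixing sequential and parallel prompts. The first step must run
# alone; the later prompts only depend on its output, so they run concurrently.
# Reuses the call_llm_async() placeholder from the parallel sketch.

async def build_search_mixed(ask: str) -> dict:
    # Sequential: search phrases must exist before the dependent prompts run.
    phrases = await call_llm_async(f"Extract search phrases from this request:\n{ask}")

    # Parallel: these only depend on the phrases and the original ask.
    filters, people, acronyms = await asyncio.gather(
        call_llm_async(f"Identify filter values implied by:\n{ask}\nPhrases:\n{phrases}"),
        call_llm_async(f"List any people's names mentioned in:\n{ask}"),
        call_llm_async(f"Expand any acronyms in these phrases:\n{phrases}"),
    )
    return {"phrases": phrases, "filters": filters, "people": people, "acronyms": acronyms}
```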

Impact of Multi-Prompt Strategies
While multi-prompt approaches offer powerful solutions for complex tasks, there are a few considerations you should keep in mind:
- Increased costs: Running multiple prompts requires more LLM calls. More prompts means more token usage, so keep this in mind if you’re watching usage and cost.
- Higher latency: Sequential prompt chaining can be slow. Make sure you stream back statuses so the user knows what’s going on.
- Greater complexity in error handling: Multiple prompts mean more potential failure points, requiring robust error management strategies. Should one prompt’s failure fail the entire flow, or are the others sufficient to complete the task? (See the sketch after this list.)
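One way to answer that question, reusing the `FOCUSED_PROMPTS` and `call_llm_async` placeholders from the parallel sketch: let each prompt fail on its own, then decide whether the surviving results are enough to continue.

```python
import asyncio

# Sketch of tolerant error handling for parallel prompts. Each prompt may
# fail independently; we only abort if a prompt we consider essential failed.
# Reuses FOCUSED_PROMPTS and call_llm_async() from the parallel sketch.

async def analyze_request_tolerant(ask: str) -> dict:
    names = list(FOCUSED_PROMPTS)
    results = await asyncio.gather(
        *(call_llm_async(FOCUSED_PROMPTS[name].format(ask=ask)) for name in names),
        return_exceptions=True,  # don't let one failure cancel the rest
    )
    ok = {n: r for n, r in zip(names, results) if not isinstance(r, Exception)}
    if "search_phrases" not in ok:
        raise RuntimeError("an essential prompt failed; aborting the flow")
    return ok  # partial results are acceptable for the optional prompts
```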
However, the benefits are worth it:
- Enhanced accuracy: Breaking complex tasks into focused prompts often yields more reliable and precise results.
- Better maintainability: Individual prompts are easier to update, test, and optimize compared to monolithic prompts.
Understanding these impacts and benefits helps in making informed decisions about when to implement multi-prompt strategies versus simpler approaches.
When Single-Prompt Engineering Hits Limits
Complex tasks often exceed what a single prompt can reasonably manage. When this occurs, we must explore using more than one prompt, giving the LLM the ability to reasonably consider each of our requirements, one at a time. As we consider our prompts, we should leverage those foundational steps we began with: clearly define our success criteria, empirically test against these criteria, and iteratively refine each prompt accordingly.
Happy prompting!
