Guide to Prompt Engineering

Apr 14, 2025

Kevin Bartley

Customer Success at Stack AI

The difference between AI that churns out mediocre, generic responses and AI that produces million-dollar insights is more than the model. It’s the prompt. Since ChatGPT launched in 2022, an invisible divide has been growing in the user base: those who simply use AI and those who truly command it.

Prompt engineering is a fundamental skill to fully harness the power of AI. Mastering it will help you unlock the latent power in each model, putting their intelligence at your service for generating content, improving decision-making, and automating complex workflows.

This guide equips you with all the core knowledge to get ahead of the pack. You’ll learn a more structured writing process and best practices to get more consistent results. These are useful when interacting with any AI model and also for building powerful AI agents and workflow automation tools in Stack AI.

What is prompt engineering?

Prompt engineering is the art and science of crafting effective instructions to guide AI models to desired outputs.

Even though models are trained on a vast amount of data and display high levels of intelligence, they still require precise instructions to perform tasks with maximum accuracy.

Writing good prompts helps the model locate the areas of its training that best fit your objectives. For example, when you add an instruction to “act like a marketing expert”, the model will lean toward interpreting and responding based on concepts related to that field.

Please note: to focus on the most common use cases for businesses, this guide won’t cover prompting audio, image, video, or niche AI models.

How to structure a prompt

The first step to increase reliability and accuracy is to follow a prompt structure. This is especially important when starting out in prompt engineering, as this will give you a good grasp of how your words control model behavior.

A good prompt structure is composed of the following parts.

Role (recommended)

Adding a role such as “you are a creative writer…”, “think like an expert project manager…”, or “as a financial analysis specialist…” helps the model understand the overall area of knowledge it will work in.

Instruction (recommended)

These are the core commands that tell the model what to do and how to do it. Here are examples of the kind of instructions you can issue:

  • General tasks: these cover common tasks such as “Summarize this article in bullet points”, “Translate this text from Spanish to English”, or “Write a product description for this item, targeting non-technical customers”.

  • Style and tone: control how the model writes with “Make the tone casual and conversational”, “Write in the style of 1920s copywriting”, “Write in simple language that a 5th grader can understand”.

  • Process: instruct on the steps to take before creating the answer, such as “List pros and cons before giving a final answer”, “Ask clarifying questions before answering”, or “Break down the task into smaller subtasks”.

  • Meta: these reveal or control the reasoning process of the model. Examples include “Before answering, explain how you interpreted the prompt”, “Only use information provided, do not make assumptions”, and “Think critically, evaluate the argument, and identify logical fallacies”.

Output preferences (optional)

If you want the output to be formatted in a specific way (as Markdown, JSON, or CSV), you can specify how you want to see the response. This is especially important if the output will be routed to another tool automatically.

Context (recommended)

Adding more information about the context around a task increases the quality of the result. The model will understand more clearly the constraints and opportunities around the instruction, basing parts of the output on the facts contained here.

For example, if you want to generate a project brief, include all the details about your company, objectives, and resources as context to improve the answer’s quality.

Examples (optional)

Showing an example of an ideal response helps the model replicate the structure, tone, or logic with higher accuracy.

This is especially effective for tasks around classification, pattern matching, imitating writing styles, data transformation, and text-to-structured-output conversion. For example, if you're classifying customer feedback by sentiment, you might include:

  • Input: “I love how quickly support resolved my issue!”

  • Output: Positive

Prioritize representative examples and remember that quality is more important than quantity.

Alternatively, if you’re doing one-shot or few-shot prompting (which we’ll cover later in this guide), this is where you should include your example/response pairs.

Input data or user request (required)

This is the core of your prompt, the actual question, task, or data you want the model to consider. All the other sections before this one exist to clarify and guide the model on how to handle this input.

Best practices

  • Be clear and unambiguous. Ask yourself: if someone unfamiliar with the task received this prompt, could they complete it correctly?

  • Balance thoroughness with brevity. Longer prompts consume more of the model’s context window and may lead to reduced performance or diluted instructions. Be specific, but concise.

  • Use consistent terminology. Avoid synonyms or paraphrasing when referencing key parts of your prompt. If you refer to a “daily report,” don’t switch to just “report” later: inconsistency can confuse the model, especially in complex instructions.

  • Frame instructions positively. Tell the model what to do, not what not to do. Positive framing is more direct and easier for the model to follow. Example: Instead of “Don’t be casual,” say “Use a formal, professional tone.”

  • Structure longer prompts. Use dividers like """ or ### to separate sections (e.g., context, input, instruction). This helps the model parse each part more accurately.
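As a quick sketch, the structured sections and dividers described above can also be assembled programmatically. The section names follow this guide’s structure; the helper function and wording are illustrative, not a fixed convention.

```python
# Assemble a structured prompt from named sections, separated by ### dividers.
# Optional sections (output preferences, examples) are skipped when absent.

def build_prompt(role, instruction, context, user_request,
                 output_prefs=None, examples=None):
    sections = [
        ("Role", role),
        ("Instruction", instruction),
        ("Output preferences", output_prefs),
        ("Context", context),
        ("Examples", examples),
        ("User Request", user_request),
    ]
    # Keep only the sections that were actually provided
    return "\n\n###\n\n".join(
        f"{name}: {text}" for name, text in sections if text
    )

prompt = build_prompt(
    role="You are a content strategist and senior copywriter.",
    instruction="Write a blog introduction based on the User Request.",
    context="The post targets small business owners with limited tech experience.",
    user_request="5 Ways Small Businesses Can Use AI Tools to Save Time.",
)
```

Keeping the sections as named parts like this also makes it easy to swap context or examples in and out without rewriting the whole prompt.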

Example of a complete prompt

Here’s what a prompt containing all these elements looks like.

Role: You are a content strategist and senior copywriter with experience in SEO and digital marketing.

Instruction: Write a blog introduction based on the User Request. The tone should be friendly and informative, suitable for small business owners who may not have a technical background. Make the intro engaging and no longer than 100 words.

Output preferences: Format the output in Markdown with the title as an H2 heading and the intro as a regular paragraph.

Context: This blog post is part of a content series aimed at helping small business owners adopt new technologies. The goal is to make AI feel approachable and practical, rather than intimidating. The article will be published on a website that caters to entrepreneurs with limited time and tech experience.

Examples: Here’s an example of an intro we like from a previous post: “Running a small business is no small feat — and your time is your most valuable asset. Luckily, automation tools are making it easier than ever to reclaim your day. In this post, we’ll break down how simple apps can do the heavy lifting for you.”

User Request: “5 Ways Small Businesses Can Use AI Tools to Save Time.”

For reference, here’s how GPT-4o responds to this prompt example:

H2: 5 Ways Small Businesses Can Use AI Tools to Save Time

Running a small business means wearing a lot of hats — and let’s be honest, there never seems to be enough hours in the day. That’s where AI tools come in. You don’t need to be a tech wizard to benefit from them. In fact, many are simple, affordable, and built to make your life easier. In this post, we’ll explore five practical ways small businesses like yours can use AI to save time, reduce stress, and focus more on what really matters: growing your business.

Core techniques

Consider AI model strengths and limitations

Each model can interpret the same instructions differently. A model’s size, architecture, and training data affect the quality of the output.

For a hands-on experience of this, try prompting the older and smaller GPT-3.5 and see how it struggles with nuanced, multi-step instructions. Compare that with how GPT-4o’s higher intelligence amplifies the power of your words into clear results.

Beyond intelligence, model type also matters:

  • SLMs (small language models like Microsoft Phi), are lightweight and efficient for simple tasks, but less adaptable for complex reasoning or creative tasks.

  • LLMs (large language models like GPT-3.5), are optimized for language-based tasks, such as writing, summarizing, and ideating.

  • LMMs (large multimodal models like GPT-4o), can handle multiple types of input and output, useful for tasks involving screenshots, real-time audio, or generating images.

  • Reasoning models (like OpenAI o3 or DeepSeek R1), are better for step-by-step thinking to solve advanced math, logic, or puzzle-like situations.

Adjust the prompt style to the model you’re working with. For example, SLMs need shorter, more direct instructions, while LMMs support combining text and images in a prompt.

System instructions vs user prompt

The OpenAI node in the Stack AI platform, with dedicated input fields for system instructions and user prompts.

When working with development environments or APIs, you’ll notice the distinction between system instructions and the user prompt.

System instructions are hidden from the end user and shape the model’s behavior, acting as persistent background context. They take precedence over user input if there are conflicting instructions.

When adopting the structured prompt formatting outlined above, you can move everything except the input data or user request to the system instructions. This includes the role, instructions, context, output preferences, and examples.

This way, when the tool is live, only the user’s specific input or request needs to be added. The output will be guided by the system-level instructions already in place.
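A minimal sketch of this split, using the role/content chat-message format shared by most major model APIs (the instruction text is illustrative):

```python
# Everything except the user's actual request goes into the system message;
# the user message carries only the live input.

def make_messages(system_instructions, user_request):
    return [
        {"role": "system", "content": system_instructions},
        {"role": "user", "content": user_request},
    ]

messages = make_messages(
    system_instructions=(
        "You are a content strategist. Write a friendly blog introduction "
        "based on the user's request. Format the output in Markdown."
    ),
    user_request="5 Ways Small Businesses Can Use AI Tools to Save Time.",
)
# This list would then be passed to the provider's chat completion endpoint.
```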

Label different data types

AI doesn’t see structure the same way we do. In longer prompts with multiple data sources, separating and labeling each section helps the model understand and differentiate. This is especially important if there’s an overlap in words across sections, as the model might hallucinate.

Here are a few ways to label and divide the prompt:

  • XML tags, such as wrapping a section in <context> and </context>

  • Any set of dividers, such as """ or ###

  • Start each section with the word that describes that data (as we did in the first section of this guide, with instruction, context, examples, input, and output)
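The XML-tag approach can be sketched as follows; the tag names are illustrative, and any consistent, descriptive names work:

```python
# Wrap each data source in a named XML-style tag so the model can tell
# the instruction, the source data, and the format requirements apart.

def tag_section(name, content):
    return f"<{name}>\n{content}\n</{name}>"

prompt = "\n\n".join([
    tag_section("instruction", "Summarize the report and list action items."),
    tag_section("report", "Q3 revenue grew 12%, but churn rose to 5%."),
    tag_section("output_format", "Markdown bullet points."),
])
```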

Zero-shot, one-shot, and few-shot

In Stack AI, you can add examples directly to the prompt, or in separate input containers for modularity and better organization.

These terms describe how many examples you give a model to help it understand and complete a task more effectively.

In zero-shot, a model is given only the instruction with no examples. It relies on pretrained data to respond, generalizing its knowledge to apply to the task. Always start here: if the model performs well without examples, you’ll save token space and reduce prompt complexity.

One-shot prompting adds a single input/output example. This is helpful to guide the model in formatting, tone, or decision logic. If the model can generalize accurately from one example, stick with it.

Few-shot prompting includes 2 to 5 examples, giving the model a pattern to follow, improving response consistency. This is especially useful for classification, transformation, or generating outputs with structured data.

Remember that more examples take more tokens, limiting space for context and risking drifting off-task. Use representative examples to avoid bias: repetitive patterns, such as only adding positive examples, can skew the output. Finally, the order of the examples influences how the model prioritizes them, so structure them with intention.
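One common way to supply few-shot examples in the chat-message format is to add each input/output pair as a prior user/assistant turn, as in this sketch (the sentiment examples are illustrative):

```python
# Build a few-shot message list: instruction first, then example turns,
# then the real input the model should classify.

def few_shot_messages(instruction, examples, new_input):
    messages = [{"role": "system", "content": instruction}]
    for example_input, example_output in examples:
        messages.append({"role": "user", "content": example_input})
        messages.append({"role": "assistant", "content": example_output})
    messages.append({"role": "user", "content": new_input})
    return messages

messages = few_shot_messages(
    "Classify the sentiment of each message as Positive, Negative, or Neutral.",
    [
        ("I love how quickly support resolved my issue!", "Positive"),
        ("The app keeps crashing on startup.", "Negative"),
    ],
    "Delivery was fine, nothing special.",
)
```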

Chain-of-thought prompting

Beyond providing an answer, reasoning models like OpenAI o3 Mini display their reasoning as well.

Chain-of-thought (CoT) encourages the model to slow down and explain its reasoning step-by-step before giving the final answer. It’s especially powerful for tasks involving logic, math, multi-step analysis, or puzzles.

The simplest way to call this behavior is to add sentences like:

  • “Let’s think step by step”

  • “Explain your reasoning before answering”

  • “Break down the problem into smaller steps before responding”

While this is an effective technique to push LLMs and LMMs to reason at a higher level, newer reasoning models were trained and tuned to exhibit this behavior by default. Models like OpenAI o3 and DeepSeek R1 take time to think before answering, offering much higher accuracy, even if some responses to complex prompts can take up to 10 minutes to generate.

If your task demands advanced logic and leaves little room for failure, a reasoning model may be the best choice. But, even with general-purpose models, a simple CoT prompt can greatly improve performance on difficult problems.
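Appending a chain-of-thought trigger is simple enough to automate, as in this small sketch (the sample question is illustrative):

```python
# Append a chain-of-thought trigger phrase to any prompt.

COT_TRIGGER = (
    "Let's think step by step, and explain your reasoning "
    "before giving the final answer."
)

def with_chain_of_thought(prompt):
    return f"{prompt}\n\n{COT_TRIGGER}"

prompt = with_chain_of_thought(
    "A project has 3 phases of 2 weeks each, plus a 1-week review. "
    "How long is it in total?"
)
```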

Prompt chaining

An investment memo writer tool built with Stack AI, using linear prompt chaining to write an investment memo section by section.

In software development, solving a big problem starts by breaking it into smaller, manageable parts. Prompt engineering is similar: if a prompt is too long and complex, breaking it into parts and processing separately or as a sequence is the most effective approach.

Chaining improves accuracy, increases control, and makes it easier to spot and solve problems as they appear. But these benefits come with a trap: it may lead to overengineering in situations where a simpler prompt would’ve done the job.

In this setup, each prompt performs one focused action. This can be summarizing text, extracting a data point, or analyzing a specific element. You pass the output of the first prompt onto the next, so the data is transformed as it moves along the chain, fully processed and ready to use at the end.

There are four types of chains:

  • Linear chains, where the data travels forward through a sequence (A > B > C). For example, summarize a collection of reports > analyze key issues and action points > generate email to allocate workload to team.

  • Conditional chains, where the next prompt depends on the results of the previous one. For example, summarize the project pitch > if it includes mentions of green policies that align with our company’s objectives, extract company name and contacts.

  • Looped chains, that simplify batch processing of large lists. This runs the same prompt chain across each row of the list, delivering the final results without extra manual work.

  • Parallel chains, where multiple prompts run in parallel, and the results are merged or compared at the end. For example, analyzing a request for proposal document > analyzing each part of the document with a unique prompt > combining the assessment of all prompts into a single report.
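The linear chain from the first bullet (summarize > analyze > draft email) can be sketched like this; `call_model` is a stand-in, since in practice each step would hit a real model API:

```python
# Linear prompt chain: each step's output becomes the next step's input.

def call_model(prompt):
    """Stand-in for a real model call; echoes the prompt for illustration."""
    return f"[model output for: {prompt[:40]}...]"

def linear_chain(reports):
    summary = call_model(f"Summarize the following reports:\n{reports}")
    analysis = call_model(f"List the key issues and action points in:\n{summary}")
    email = call_model(
        f"Draft an email allocating this workload to the team:\n{analysis}"
    )
    return email

result = linear_chain("Report 1: ... Report 2: ...")
```

Because each step is a separate call, you can inspect the intermediate summary and analysis, which is exactly what makes problems easier to spot in a chain.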

Beyond the benefits outlined above, choose prompt chaining when building reusable workflows, tools, or automations. It’s a good match for use cases such as data extraction pipelines, research assistants, multi-step agents, report generation, and transforming formats (e.g. JSON > insights > summary).

One warning: check for data drift when chaining prompts, especially for linear chains. Every time data is transformed, there’s a risk the meaning shifts slightly. At the end of the chain, these small changes can compound into a misrepresentation of the original input.

Retrieval-augmented generation (RAG)

RAG is a technique where an AI model dynamically pulls relevant information from an external knowledge base, such as a document store, database, or vector index. Then, it uses that data as additional context to generate more accurate responses.

Instead of manually pasting background context into your prompt, RAG automates that step by searching for information related to the user input and injecting it into the prompt behind the scenes. This saves time, increases accuracy, and allows you to leverage your full data infrastructure, not just what fits into a single prompt.

RAG is especially useful for building tools like customer support agents, search-based chatbots, internal knowledge assistants, and document QA systems. In essence, anywhere the model needs real-time, accurate information grounded in your organization’s private or proprietary content.

Compared to static prompts, RAG enables dynamic, on-demand context injection, so your model responds not just with what it knows, but with what it retrieves in the moment.
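The retrieve-then-inject pattern can be sketched with a toy retriever; real systems use vector search over a proper index, and the keyword-overlap retriever and sample knowledge base here are deliberately naive placeholders:

```python
# Toy RAG: pick the knowledge-base snippet with the most word overlap
# with the question, then inject it into the prompt as context.

KNOWLEDGE_BASE = [
    "Refunds are processed within 5 business days of approval.",
    "Enterprise plans include a dedicated account manager.",
    "Password resets can be triggered from the login page.",
]

def retrieve(question):
    q_words = set(question.lower().split())
    return max(
        KNOWLEDGE_BASE,
        key=lambda doc: len(q_words & set(doc.lower().split())),
    )

def build_rag_prompt(question):
    context = retrieve(question)
    return (
        f"Answer using only this context:\n{context}\n\n"
        f"Question: {question}"
    )

prompt = build_rag_prompt("How long do refunds take to process?")
```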

Function calling and tool use

An example of configuring tool use to send emails using Gmail via a graphical interface in Stack AI.

AI models with function calling or tool use capabilities can connect to external systems. They can retrieve data, perform calculations, or trigger actions beyond the chat interface. This turns the model into a true interactive agent, capable of searching the web, querying databases, or triggering automation workflows, for example.

There are two main ways to configure tool use if the feature is available:

  • In the prompt: describe the condition inside the prompt, for example “If the user wants to search, run a web search and return results”

  • In the builder’s graphical interface: when using a development platform, assign tools and define when to trigger them using a GUI or setup interface
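When configuring tools through an API rather than a GUI, the tool is typically declared in a JSON-schema style, as in this hedged sketch (the web_search tool name and its parameters are illustrative):

```python
# Declare a tool in the JSON-schema style used by function-calling APIs.
# The model reads the name and description to decide when to invoke it.

tools = [
    {
        "type": "function",
        "function": {
            "name": "web_search",
            "description": "Search the web when the user asks for current information.",
            "parameters": {
                "type": "object",
                "properties": {
                    "query": {
                        "type": "string",
                        "description": "The search query.",
                    },
                },
                "required": ["query"],
            },
        },
    }
]
# This list would be passed alongside the messages when calling the model.
```

Note that the description field doubles as the trigger condition: it is the main signal the model uses to decide when the tool applies.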

When designing tool usage logic, it’s important to strike the right balance: if the trigger conditions are too broad, tools may fire when they’re not needed; too narrow, and the model might ignore valid opportunities to use them, making the experience feel stiff.

Clear intent design leads to smoother interactions. Whenever possible, share the list of supported tools, commands, and behaviors with your team or users. This will help them understand what the system can do, how to invoke actions, and what kind of control they have.

Evaluate responses and iterate

The analytics tab of Stack AI projects helps gather technical feedback from users, including a native upvote/downvote feature.

Improving prompt performance requires evaluation and ongoing iteration. Start by creating a reference list of expected inputs and ideal outputs, a baseline that represents what "good" looks like.

As you refine your prompts and observe model behavior, compare actual outputs to your ideal responses. Look for patterns: where is the model accurate? Where does it miss the mark? Focus on things like completeness, tone, formatting, hallucination risk, and consistency across inputs.

Once the tool is deployed to your team, set up a lightweight feedback loop. This could be as simple as:

  • An upvote/downvote system per response

  • A spreadsheet to track outputs, comments, and recurring issues

Prompt quality improves with real-world feedback. Building evaluation into your workflow helps ensure your AI tools stay aligned with your team's expectations, even as use cases evolve.
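A reference list of inputs and expected outputs can be turned into a minimal evaluation loop, as in this sketch (the sentiment cases and the stand-in model are illustrative):

```python
# Minimal evaluation loop: run each reference input through the model
# and report the fraction of outputs that match the expected answer.

reference = [
    ("I love how quickly support resolved my issue!", "Positive"),
    ("The app keeps crashing on startup.", "Negative"),
]

def evaluate(model_fn, cases):
    hits = sum(
        1 for inp, expected in cases if model_fn(inp).strip() == expected
    )
    return hits / len(cases)

def fake_model(text):
    """Stand-in for a real model call, used here only for illustration."""
    return "Positive" if "love" in text else "Negative"

score = evaluate(fake_model, reference)  # 1.0: both cases match
```

Rerunning the same reference set after each prompt revision gives you a concrete number to compare, rather than a gut feeling about whether the change helped.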

Prompt engineering in Stack AI

Stack AI is an enterprise-grade, no-code platform for building generative AI agents and automating complex workflows. It’s a natural home for your prompt engineering efforts — offering flexibility, safety, and support for advanced techniques without requiring any code.

Here’s how Stack AI supports prompt engineering.

Model variety. Stack AI integrates with all major AI model providers, so you can choose the right model for each task. Leverage OpenAI’s flexibility, Anthropic’s high creativity, and Google’s massive contextual memory without extra setup. The interface gives you full control over key inputs, including system instructions, user prompts, and function-calling parameters.

Full support for prompt chaining. Due to Stack AI’s modular nature, you can break complex tasks into smaller steps, run multiple prompts at once, or start batch processing for list-based data. For example, processing a document section by section with customized prompts and combining the results into one final output. You can see it in action in the Stack AI’s investment memo writer tool tutorial.

Built-in RAG. RAG is central to Stack AI’s capabilities. You can integrate with sources like SharePoint, Salesforce, or Amazon S3, syncing knowledge bases on a regular interval to ensure data stays fresh. Just upload documents, connect them to your LLMs, and ask questions with full context. No need to code: simply drop the nodes and connect the flow visually.

Guardrails and PII protection. Stack AI includes built-in safety mechanisms such as message guardrails and PII (personally identifiable information) protection. These ensure AI interactions remain secure and compliant with privacy standards, especially in enterprise environments.

Turn words into action

Prompt engineering will become core digital literacy in the near future, a fundamental skill to harness AI with precision and finesse. More than executing best practices, becoming a good prompt engineer requires deliberate experimentation, an intuitive understanding of how models interpret your words, and a blend of technical savvy with linguistic proficiency.

Ready to put these principles into practice? Get ahead of the curve with a free Stack AI account and start building innovative solutions with your custom instructions today.
