Prompting Techniques: Zero-Shot, Few-Shot, CoT, ToT, RAG, ART, ReAct, and Multimodal

After understanding the basics of prompt engineering, the next step is learning the main techniques used to get better results from AI models.

Each technique has a different purpose. Some are useful for simple tasks, while others help with reasoning, retrieval, tool use, or multimodal analysis involving text and images.

Zero-shot prompting

Zero-shot prompting means asking the model to complete a task without giving it any examples.

You simply provide the instruction, and the model uses what it learned during training to answer.

Example:

“Classify the sentiment of this sentence as positive, negative, or neutral.”

This technique works well for common tasks such as:

• Classification
• Summarization
• Translation
• Explanation
• Simple content generation

Zero-shot is useful when the task is clear and does not require a special format or unusual pattern.
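
As a minimal sketch, a zero-shot prompt can be assembled from the instruction and the input alone. The function name and the "Sentence:" and "Sentiment:" labels are illustrative assumptions, not a required format.

```python
def build_zero_shot_prompt(text: str) -> str:
    """Return an instruction-only prompt; no worked examples are included."""
    instruction = (
        "Classify the sentiment of this sentence as positive, negative, or neutral."
    )
    # The labels below are a hypothetical layout chosen for readability.
    return f"{instruction}\n\nSentence: {text}\nSentiment:"


prompt = build_zero_shot_prompt("The battery lasts all day.")
print(prompt)
```

The resulting string would be sent to whichever model you use; the model call itself is left out on purpose.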

Few-shot prompting

Few-shot prompting includes a few examples before the main request.

These examples show the model what kind of answer you expect.

For example, if you want the model to classify customer feedback, you can provide two or three examples first:

“Great product, I loved it.” → Positive
“The delivery was late and the box was damaged.” → Negative

Then you provide a new sentence and ask the model to classify it.

Few-shot prompting is helpful when you want to:

• Teach the model a response pattern
• Standardize the output
• Improve accuracy in a specific task
• Show the tone, format, or structure you expect

It is especially useful when the task is simple but needs consistency.
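
The feedback example above can be sketched as a small prompt builder. The "Feedback:" and "Label:" layout and the function name are assumptions made for illustration; any consistent layout should work.

```python
# Labeled examples taken from the text above.
EXAMPLES = [
    ("Great product, I loved it.", "Positive"),
    ("The delivery was late and the box was damaged.", "Negative"),
]


def build_few_shot_prompt(examples, new_text):
    """Place labeled examples before the new input so the model can copy the pattern."""
    lines = ["Classify each piece of customer feedback as Positive or Negative.", ""]
    for text, label in examples:
        lines.append(f"Feedback: {text}")
        lines.append(f"Label: {label}")
        lines.append("")
    # The final item has no label; the model is expected to fill it in.
    lines.append(f"Feedback: {new_text}")
    lines.append("Label:")
    return "\n".join(lines)


prompt = build_few_shot_prompt(EXAMPLES, "The app crashes every time I open it.")
print(prompt)
```

Because every example follows the same layout, the model's answer tends to follow it too, which is where the consistency comes from.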

Chain of Thought

Chain of Thought, also known as CoT, is a technique that encourages the model to solve problems step by step.

Instead of asking only for the final answer, you guide the model to organize the reasoning process.

This can help in tasks involving:

• Logic
• Math
• Planning
• Analysis
• Step-by-step decisions

A simple prompt could be:

“Analyze the problem step by step and then give the final answer.”

This technique helps the model break down complex problems into smaller parts. However, the final answer still needs to be checked, because the reasoning can sound convincing even when it contains mistakes.
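
A rough sketch of this workflow: build the step-by-step prompt, pull the final answer out of the response, and check it independently. The response below is hand-written for illustration, not real model output, and the "Final answer:" line format is an assumption added so the answer is easy to parse.

```python
import re


def build_cot_prompt(problem: str) -> str:
    """Ask for step-by-step reasoning followed by a clearly marked final answer."""
    return (
        f"{problem}\n\n"
        "Analyze the problem step by step and then give the final answer.\n"
        "End with a line of the form 'Final answer: <value>'."
    )


def extract_final_answer(response: str):
    """Return the text after 'Final answer:', or None if the line is missing."""
    match = re.search(r"^Final answer:\s*(.+)$", response, flags=re.MULTILINE)
    return match.group(1).strip() if match else None


prompt = build_cot_prompt(
    "I have 3 boxes with 4 pens each and give away 5 pens. How many pens are left?"
)

# Hand-written stand-in for a model response (hypothetical).
sample_response = (
    "Step 1: 3 boxes with 4 pens each is 3 * 4 = 12 pens.\n"
    "Step 2: Giving away 5 pens leaves 12 - 5 = 7 pens.\n"
    "Final answer: 7"
)

answer = extract_final_answer(sample_response)
# Independent check: recompute the result instead of trusting the reasoning text.
is_correct = answer == str(3 * 4 - 5)
print(answer, is_correct)
```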

Self-consistency

Self-consistency is related to Chain of Thought, but it goes further.

Instead of relying on only one reasoning path, you sample several independent reasoning paths for the same question and keep the answer that most of them agree on.

This is useful because complex problems can sometimes be solved in more than one way. If different paths lead to the same result, the answer is more likely to be reliable.

Self-consistency is especially useful for:

• Arithmetic reasoning
• Logic problems
• Common sense questions
• Tasks where one reasoning path may fail

The main idea is simple: compare multiple possibilities before choosing the final response.
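
In practice this usually comes down to a majority vote over the final answers. The sketch below assumes the answers have already been collected from several runs of the same prompt; the list of answers is made up for illustration.

```python
from collections import Counter


def most_consistent_answer(answers):
    """Return the most frequent answer and the share of runs that agreed with it."""
    counts = Counter(a.strip().lower() for a in answers)
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(answers)


# Hypothetical final answers from five separate reasoning runs.
sampled_answers = ["7", "7", "9", "7", "8"]
answer, agreement = most_consistent_answer(sampled_answers)
print(answer, agreement)
```

A low agreement share is a useful signal on its own: it suggests the question is hard for the model and the answer deserves a closer look.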

Tree of Thoughts

Tree of Thoughts, or ToT, is a more advanced reasoning technique.

While Chain of Thought follows a linear path, Tree of Thoughts explores multiple branches of reasoning.

The model can consider different options, compare them, reject weak paths, and continue with stronger ones.

This technique is useful for tasks such as:

• Strategic planning
• Creative writing
• Complex problem solving
• Decision-making
• Tasks with multiple possible answers

For example, when planning a project, the model can explore different strategies before recommending the best one.

Tree of Thoughts is helpful when the first answer is not always the best answer.
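
The branch, compare, and prune cycle can be sketched with a toy search. In a real setup the model would propose and evaluate the thoughts; here both roles are played by simple deterministic functions, and the task (pick three numbers that sum to a target) is an assumption chosen only to keep the example runnable.

```python
TARGET = 10
CHOICES = [1, 2, 3, 5]
DEPTH = 3        # number of reasoning steps
BEAM_WIDTH = 2   # how many branches survive each round


def propose(path):
    """Stand-in for the model proposing next thoughts: extend the path by one choice."""
    return [path + (choice,) for choice in CHOICES]


def score(path):
    """Stand-in for the model evaluating a partial path: closer to TARGET is better."""
    return -abs(TARGET - sum(path))


def tree_of_thoughts():
    frontier = [()]  # start from a single empty path
    for _ in range(DEPTH):
        # Branch: expand every surviving path.
        candidates = [child for path in frontier for child in propose(path)]
        # Compare and prune: keep only the strongest branches.
        candidates.sort(key=score, reverse=True)
        frontier = candidates[:BEAM_WIDTH]
    return frontier[0]


best = tree_of_thoughts()
print(best, sum(best))
```

Note that the branch that looks best after two steps, (5, 5), cannot reach the target in exactly three steps, so keeping a second branch alive is what lets the search finish on the target.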

Retrieval-Augmented Generation

Retrieval-Augmented Generation, known as RAG, combines generative AI with external information retrieval.

Instead of depending only on what the model learned during training, a RAG system searches for relevant documents, data, or knowledge sources and uses that information as context.

This is useful when the answer depends on:

• Updated information
• Company documents
• Technical manuals
• Internal knowledge bases
• Specific sources

RAG is different from fine-tuning.

Fine-tuning changes the model’s weights by training it with additional examples. RAG does not change the model itself. It simply provides relevant information at the moment of the request.

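Compared with fine-tuning, RAG is often more flexible and cost-effective, especially when the information changes often, because updating the knowledge source does not require retraining the model.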
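
The retrieve-then-prompt flow can be sketched as follows. The documents are invented, and the word-overlap ranking is a deliberately simple stand-in; production systems often use embeddings and a vector index instead.

```python
import re

# Hypothetical knowledge base.
DOCUMENTS = [
    "Refunds are accepted within 30 days of purchase with a receipt.",
    "Standard shipping takes 3 to 5 business days.",
    "Support is available by email from Monday to Friday.",
]


def tokenize(text):
    """Lowercase the text and return its set of words and numbers."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents, top_k=1):
    """Rank documents by how many words they share with the question."""
    question_words = tokenize(question)
    ranked = sorted(
        documents,
        key=lambda doc: len(question_words & tokenize(doc)),
        reverse=True,
    )
    return ranked[:top_k]


def build_rag_prompt(question, documents):
    """Insert the retrieved text into the prompt as context; the model is unchanged."""
    context = "\n".join(f"- {doc}" for doc in retrieve(question, documents))
    return (
        "Answer using only the context below. If the context is not enough, say so.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    )


prompt = build_rag_prompt("How many days does shipping take?", DOCUMENTS)
print(prompt)
```

Updating the answer later only requires editing the document list, which is the practical difference from fine-tuning.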

ART: Automatic Reasoning and Tool Use

ART stands for Automatic Reasoning and Tool Use.

This technique helps models solve complex tasks by combining reasoning with external tools.

Instead of trying to solve everything internally, the model can break the task into steps and decide when to use a tool, such as search, a calculator, or code execution.

ART is useful for:

• Multi-step tasks
• Research workflows
• Data analysis
• Technical problem solving
• Tasks that require external tools

The main goal is to make the model more capable by connecting reasoning with action.
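
One way to picture this is a plan of steps, each naming a tool, executed in order. This is a simplified sketch and not the ART implementation: the plan is hand-written where a model would normally generate it, and the tool names and the "{previous}" placeholder are assumptions.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}


def calculator(expression: str) -> str:
    """Evaluate a simple 'a op b' expression, for example '6 * 2'."""
    left, op, right = expression.split()
    return str(OPS[op](float(left), float(right)))


def word_count(text: str) -> str:
    """Count whitespace-separated words."""
    return str(len(text.split()))


TOOLS = {"calculator": calculator, "word_count": word_count}

# Hand-written plan standing in for the steps a model would produce.
PLAN = [
    {"tool": "word_count", "input": "prompt engineering is a practical skill"},
    {"tool": "calculator", "input": "{previous} * 2"},
]


def run_plan(plan, tools):
    """Run each step with its tool, feeding the previous result into the next input."""
    previous = ""
    for step in plan:
        tool_input = step["input"].replace("{previous}", previous)
        previous = tools[step["tool"]](tool_input)
    return previous


result = run_plan(PLAN, TOOLS)
print(result)
```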

ReAct prompting

ReAct combines two ideas: reasoning and acting.

The model does not only think through a problem. It can also take actions, such as searching for information, checking a source, or using a tool.

This is useful because language models can have outdated or incomplete knowledge. By taking action, the model can verify information instead of relying only on memory.

ReAct is especially helpful for:

• Research tasks
• Fact-checking
• Question answering with sources
• Tool-based workflows
• Reducing hallucinations

In simple terms, ReAct helps the model think, act, observe the result, and continue from there.
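
The think, act, observe cycle can be sketched as a loop. The model here is a scripted stand-in that returns fixed text, so the example runs without any API; the "Action: tool[input]" format and all function names are assumptions for illustration.

```python
import operator
import re

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul, "/": operator.truediv}


def calculator(expression: str) -> str:
    """Evaluate a simple 'a op b' expression."""
    left, op, right = expression.split()
    return str(OPS[op](float(left), float(right)))


TOOLS = {"calculator": calculator}


def scripted_model(transcript: str) -> str:
    """Stand-in for a language model: acts first, then answers from the observation."""
    if "Observation:" not in transcript:
        return (
            "Thought: I should compute this instead of guessing.\n"
            "Action: calculator[17 * 23]"
        )
    observation = transcript.rsplit("Observation: ", 1)[1].strip()
    return f"Thought: The tool returned the result.\nFinal Answer: {observation}"


def react_loop(question, model, tools, max_steps=3):
    """Alternate model replies and tool observations until a final answer appears."""
    transcript = f"Question: {question}\n"
    for _ in range(max_steps):
        reply = model(transcript)
        transcript += reply + "\n"
        if "Final Answer:" in reply:
            return reply.split("Final Answer:", 1)[1].strip(), transcript
        action = re.search(r"Action: (\w+)\[(.*)\]", reply)
        if action is None:
            break  # the reply contained neither an action nor an answer
        tool_name, tool_input = action.groups()
        transcript += f"Observation: {tools[tool_name](tool_input)}\n"
    return None, transcript


answer, transcript = react_loop("What is 17 * 23?", scripted_model, TOOLS)
print(transcript)
```

The `max_steps` limit is a design choice worth keeping in any real loop, since a model can otherwise keep acting without ever answering.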

Multimodal prompt engineering

Multimodal prompt engineering is used with models that can process more than one type of input, such as text, images, audio, or video.

Instead of working only with written instructions, the model can also analyze visual or other media-based information.

For example, you can upload a product photo and ask:

“Analyze this image and create a product description highlighting the visible features.”

This is more specific than simply asking:

“What is in this image?”

A good multimodal prompt should explain what the model should focus on and what kind of output you expect.
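
As a sketch, a multimodal request pairs the image with an instruction that states both the focus and the expected output. The dictionary layout below is hypothetical and does not match any specific provider's API; check your provider's documentation for the real request format.

```python
import base64


def build_multimodal_request(image_bytes: bytes, focus: str, output_format: str) -> dict:
    """Bundle an image with an instruction naming the focus and the output format."""
    instruction = f"Analyze this image and {focus}. {output_format}"
    return {
        # Hypothetical layout: one image part followed by one text part.
        "parts": [
            {
                "type": "image",
                "data_base64": base64.b64encode(image_bytes).decode("ascii"),
            },
            {"type": "text", "text": instruction},
        ]
    }


# Placeholder bytes standing in for a real product photo.
fake_image = b"placeholder image bytes"
request = build_multimodal_request(
    fake_image,
    focus="create a product description highlighting the visible features",
    output_format="Return three short bullet points.",
)
print(request["parts"][1]["text"])
```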

Techniques for multimodal prompts

There are several useful techniques for multimodal prompting.

Contextualization

Contextualization means giving background information to guide the analysis.

For example, if you upload a historical image, you can tell the model the time period and ask it to explain the visual elements in that context.

Sequential questioning

Sequential questioning starts with a broad question and then moves to more specific ones.

This helps when analyzing complex images, documents, charts, or technical materials.

Role-based prompting

Role-based prompting asks the model to analyze something from the perspective of a specific professional.

For example:

• As a designer
• As a safety specialist
• As a teacher
• As a doctor, with human supervision
• As a marketing strategist

This helps guide the type of analysis the model should perform.

Iterative prompting

Iterative prompting means improving the result through follow-up prompts.

You start with a general request, review the response, and then ask for refinements.

This is useful when the first answer is good but still needs more detail, a different tone, or a better structure.

Applications of multimodal prompting

Multimodal prompts can be used in many areas, such as:

• Education, for visual study guides
• E-commerce, for product descriptions from photos
• Marketing, for captions and campaign ideas
• Healthcare, for assisted image interpretation with expert review
• Design, for analyzing layouts, colors, and visual elements

These applications show how prompt engineering is expanding beyond text.

Conclusion

Prompt techniques help users get better, more accurate, and more useful responses from AI models.

Zero-shot works well for direct tasks. Few-shot helps when examples are needed. Chain of Thought, self-consistency, and Tree of Thoughts improve reasoning. RAG, ART, and ReAct connect AI with external information and tools. Multimodal prompting allows AI to work with text, images, audio, and video.

References: https://skillbuilder.aws/learn/VF6H4SZ1BU/foundations-of-prompt-engineering/