Designing and Engineering High-Quality Prompts for Generative AI Tasks
Engineering a high-quality prompt is crucial to the success of any generative AI task. Dataiku provides Prompt Studios for designing, testing, and operationalizing AI prompts that serve business goals. In this post, we will explore how to create a reusable AI prompt for a financial news topic detection task.
Use Case Example
Suppose a financial analyst wants to use a large language model to automatically detect topics of interest and discover new themes across thousands of news articles each day.
Creating a New Prompt Studio
The analyst creates a new Prompt Studio for the use case, either starting from a template or adapting an existing prompt built for a similar use case. They start in manual mode to test and iterate on the query quickly before applying it to datasets in the project. They then select the model to use, such as GPT-3.5 Turbo or any other approved service, including private models.
The Prompt Studio Interface
The interface includes sections where the analyst can:
- Explain the task in plain English
- Add examples of inputs
- Specify expected outcomes
The goal is a reusable AI prompt that runs as part of a production data pipeline, so that large datasets can be processed efficiently and the model's output stays consistent.
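The three interface sections above map naturally onto a prompt template. The following is a minimal sketch in plain Python, not Dataiku's API: the `PromptTemplate` class, its fields, and the sample task and example texts are all hypothetical and only illustrate how a task description, examples, and expected outcomes can combine into one reusable prompt.

```python
from dataclasses import dataclass, field


@dataclass
class PromptTemplate:
    """Hypothetical container mirroring the three interface sections."""

    task: str  # plain-English task description
    # Each example pairs an input with its expected outcome.
    examples: list[tuple[str, str]] = field(default_factory=list)

    def render(self, article: str) -> str:
        """Build the full prompt text for one input article."""
        parts = [self.task.strip(), ""]
        for example_input, expected_output in self.examples:
            parts += [f"Input: {example_input}", f"Output: {expected_output}", ""]
        # The new article goes last, with the output left for the model to fill.
        parts += [f"Input: {article}", "Output:"]
        return "\n".join(parts)


# Illustrative values only; they do not come from the original walkthrough.
template = PromptTemplate(
    task="List the topics covered in the financial news article.",
    examples=[("Central bank raises rates by 25 basis points.", "interest rates")],
)
print(template.render("Chipmaker reports record quarterly revenue."))
```

Because the template is separate from the article text, the same prompt can be rendered once per record when it is later applied to a whole dataset.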
Evaluating a Prompt for Financial News Topic Detection
To test the prompt, we give the model the list of topics we are interested in and ask it to determine which of them each financial news article covers. A brief description accompanies each topic to give the model more context. We then manually add one or two test cases, each consisting of a headline and a text preview from a different article.
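As a rough illustration of this setup, the sketch below formats a topic list with descriptions and a manual test case as plain text. The topic names, descriptions, headline, and preview are invented placeholders (the walkthrough does not list the actual ones), and the helper functions are hypothetical rather than part of Dataiku.

```python
# Hypothetical topics and descriptions; the real list is not given in the post.
TOPICS = {
    "Earnings": "quarterly or annual company results",
    "Mergers and acquisitions": "announced or rumored deals",
    "Interest rates": "central bank policy and rate decisions",
}


def format_topic_list(topics: dict[str, str]) -> str:
    """Render each topic with its brief description, one per line."""
    return "\n".join(f"- {name}: {description}" for name, description in topics.items())


def format_test_case(headline: str, preview: str) -> str:
    """Render one manual test case from a headline and a text preview."""
    return f"Headline: {headline}\nText preview: {preview}"


print("Topics of interest:")
print(format_topic_list(TOPICS))
print()
# Placeholder article, invented for illustration.
print(format_test_case(
    "Retailer beats profit forecasts",
    "The company reported stronger than expected quarterly results...",
))
```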
Observations from the Initial Run
- The format of the output is not ideal for downstream use in a data pipeline, as it's not predictable or structured.
- The large language model took some liberties with the task, returning topics not in the provided list of topics of interest. Examples include "vaccine manufacturing risk on mood" and "Personnel changes".
- The model's output varies from run to run, which makes it difficult to use the results systematically.
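The second observation can be checked mechanically once the returned topics are available as a list. The sketch below is one possible approach, assuming the model's answer has already been split into topic strings; `split_topics` and the allowed list are hypothetical, while "Personnel changes" is the off-list example mentioned above.

```python
def split_topics(returned: list[str], allowed: list[str]) -> tuple[list[str], list[str]]:
    """Separate returned topics into those on the allowed list and those off it.

    Matching ignores case and surrounding whitespace.
    """
    allowed_lookup = {topic.lower(): topic for topic in allowed}
    on_list, off_list = [], []
    for topic in returned:
        cleaned = topic.strip()
        key = cleaned.lower()
        if key in allowed_lookup:
            on_list.append(allowed_lookup[key])  # use the canonical spelling
        else:
            off_list.append(cleaned)
    return on_list, off_list


# Hypothetical allowed list; "Personnel changes" is the off-list topic from the run.
allowed = ["Earnings", "Interest rates"]
on_list, off_list = split_topics(["earnings ", "Personnel changes"], allowed)
print("On list:", on_list)
print("Off list:", off_list)
```

Topics that land in `off_list` are not necessarily wrong; they are candidates for the newly discovered themes the analyst also wants, which is why the revised prompt gives them their own place in the output.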
Modifying the Prompt
We'll modify the prompt to make its output more accurate and more predictable:
- Instruct the model to format its response as a JSON object with a defined structure.
- Instruct the model to report any additional topics it detects in a location designated for newly discovered topics, separate from the topics of interest.
- Include examples in the form of sample headlines and text previews.
- Add a validation rule that enforces conformity to the expected output format.
- Use the prompt validation and evaluation options in Prompt Studios to check the results.
- Use a dataset as a batch of inputs and assign column names.
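Asking for a JSON object with a defined structure makes the response checkable by code. The sketch below shows one way such a conformity check could look in plain Python. It is not Dataiku's validation rule, and the assumed structure (a `topics` list restricted to the allowed names plus an `additional_topics` list for new themes) is a hypothetical layout chosen for illustration.

```python
import json


def validate_response(raw: str, allowed: set[str]) -> list[str]:
    """Return a list of problems; an empty list means the response conforms."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value is not a JSON object"]

    problems = []
    # Both keys are assumed to hold lists of strings.
    for key in ("topics", "additional_topics"):
        value = data.get(key)
        if not isinstance(value, list) or not all(isinstance(v, str) for v in value):
            problems.append(f"'{key}' must be a list of strings")
    if not problems:
        # Listed topics must come from the allowed set; new themes go elsewhere.
        unlisted = [topic for topic in data["topics"] if topic not in allowed]
        if unlisted:
            problems.append(f"'topics' contains unlisted entries: {unlisted}")
    return problems


allowed = {"Earnings", "Interest rates"}  # hypothetical topics of interest
good = '{"topics": ["Earnings"], "additional_topics": ["Personnel changes"]}'
bad = '{"topics": ["Personnel changes"]}'
print(validate_response(good, allowed))
print(validate_response(bad, allowed))
```

Returning a list of problems rather than a single boolean makes it easier to see why a given response failed when iterating on the prompt.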
Deploying a Generative AI Prompt in Dataiku
The output of the prompt is now in JSON format and passes the validation rule. Prompt Studio also provides an estimated cost of running the prompt against 1,000 records, which allows teams to:
- Assess the financial impact of embedding generative AI into data pipelines and projects during the design phase.
- Compare the cost of running the same prompt against different models.
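To make the cost comparison concrete, here is a back-of-the-envelope sketch of how a per-1,000-record estimate could be derived from token counts. How Dataiku actually computes its estimate is not described in this post, so the formula is an assumption, and the token counts, model names, and prices are made-up placeholders rather than real rates.

```python
def estimate_cost(
    records: int,
    input_tokens: int,
    output_tokens: int,
    input_price_per_1k: float,
    output_price_per_1k: float,
) -> float:
    """Estimate the total cost of running one prompt over a batch of records.

    Token counts are averages per record; prices are per 1,000 tokens.
    """
    per_record = (input_tokens / 1000) * input_price_per_1k
    per_record += (output_tokens / 1000) * output_price_per_1k
    return records * per_record


# Hypothetical (input price, output price) pairs per 1,000 tokens.
HYPOTHETICAL_MODELS = {
    "model_a": (0.0005, 0.0015),
    "model_b": (0.01, 0.03),
}

for name, (in_price, out_price) in HYPOTHETICAL_MODELS.items():
    # Assume 600 prompt tokens and 100 response tokens per article.
    cost = estimate_cost(1000, 600, 100, in_price, out_price)
    print(f"{name}: ${cost:.2f} per 1,000 records")
```

Since the prompt and the records are held fixed, any difference between the two figures comes from the per-token prices alone.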
The final step is to deploy the prompt in the project flow by clicking "Save as Recipe". The AI enrichment step is now saved as a new recipe in the project flow, making it:
- Efficient to reuse.
- Visually apparent to everyone that generative AI has been applied.
From this point, it is easy for others on the team to:
- Inspect the final prompt logic.
- Validate the resulting outputs.
With Prompt Studios in Dataiku, teams have a single tool to:
- Engineer AI prompts infused with business context.
- Maximize the value obtained from generative AI models.