The Power of Small, Focused Prompts
As adoption of large language models (LLMs) grows, it's tempting to create highly complex prompts to handle a variety of tasks. After all, if an LLM can engage in open-ended dialogue, surely it can tackle any request we throw at it, right?
Not so fast. My experience building dozens of LLM-powered applications has revealed an important insight: Smaller, single-purpose prompts consistently outperform large, complex ones. Let's dive into why this approach is so effective.
The Problem with Prompt Complexity
Many early LLM demos showcase impressively complex and wide-ranging capabilities. But when it comes to building reliable applications, complex prompts often disappoint for a few key reasons:
- Unexpected Interactions: The more tasks and logic you pack into a prompt, the more chances there are for components to interfere with each other in unpredictable ways. What worked in isolation breaks down in combination.
- Changing Requirements: As your application evolves, you'll inevitably need to update the language model's behavior. With a complex prompt, even small changes require careful prompt-wide adjustments to avoid breaking things. Maintainability becomes a nightmare.
- Challenging to Test: Verifying that a prompt with dozens of interacting parts behaves as expected across a wide range of inputs is extremely difficult. Edge cases and unintended behaviors proliferate, and any assurance of a consistent user experience goes out the window.
The Power of Focused Prompts
In contrast, using small prompts that each handle a single, well-scoped task offers major advantages:
- Predictable Behavior: A prompt that does one thing is much easier to understand, control, and reason about. You can thoroughly explore its response space and identify clear boundaries. Nasty surprises are rare.
- Easily Adaptable: When your application requirements change, updating a small prompt is straightforward. The blast radius of any modification is limited, and testing the impact is fast. Your prompt stays nimble.
- Robustly Testable: Validating that a focused prompt behaves as intended is tractable. You can efficiently generate test cases that cover the full range of expected inputs and outputs, giving you high confidence in the prompt's reliability.
An Example: Appointment Scheduling
Let's make this concrete with an example. Imagine we're building an AI assistant that helps users schedule appointments. A complex prompt might look like:
complex_prompt = """
You are an AI assistant that helps the user schedule appointments.
Based on the user's input, determine what kind of appointment they need, extract relevant details like date, time, and location.
Check for conflicts with existing appointments, and add the new appointment to the calendar.
Then, generate a confirmation message summarizing the appointment details.
"""
There's a lot going on in this prompt: parsing intents, extracting entities, validating constraints, updating a data store, and formatting a response. Handling all of these tasks in one place is a recipe for trouble.
Instead, let's break it down into a series of small, focused prompts:
import os

import google.generativeai as genai
import instructor
from dotenv import load_dotenv
from pydantic import BaseModel
from typing import List

# Load environment variables
load_dotenv()

# Configure the Gemini API
genai.configure(api_key=os.getenv("GOOGLE_API_KEY"))

# Initialize a Gemini client, patched by instructor for structured outputs
client = instructor.from_gemini(
    client=genai.GenerativeModel(
        model_name="models/gemini-1.5-flash-latest",
    )
)
# Response models, one per focused prompt
class AppointmentType(BaseModel):
    type: str

class AppointmentDetails(BaseModel):
    date: str  # YYYY-MM-DD
    time: str
    location: str

class Conflict(BaseModel):
    existing_appointment: str
    conflict_reason: str

class CalendarUpdate(BaseModel):
    success: bool
    message: str

class UserConfirmation(BaseModel):
    confirmation_message: str
# Prompt 1: Appointment Type Classification
def classify_appointment_type(user_request: str) -> AppointmentType:
    prompt = f"Classify the type of appointment based on this user request: '{user_request}'"
    return client.chat.completions.create(
        response_model=AppointmentType,
        messages=[
            {"role": "system", "content": "You are an AI that classifies appointment types."},
            {"role": "user", "content": prompt},
        ],
    )
# Prompt 2: Appointment Details Extraction
def extract_appointment_details(user_request: str) -> AppointmentDetails:
    prompt = f"Extract the date (in YYYY-MM-DD format), time, and location from this appointment request: '{user_request}'"
    return client.chat.completions.create(
        response_model=AppointmentDetails,
        messages=[
            {"role": "system", "content": "You are an AI that extracts appointment details. Return the date in YYYY-MM-DD format."},
            {"role": "user", "content": prompt},
        ],
    )
# Prompt 3: Appointment Conflict Checking
def check_conflicts(new_appointment: AppointmentDetails, existing_appointments: List[str]) -> List[Conflict]:
    prompt = f"Check for conflicts between this new appointment: {new_appointment} and these existing appointments: {existing_appointments}"
    return client.chat.completions.create(
        response_model=List[Conflict],
        messages=[
            {"role": "system", "content": "You are an AI that checks for appointment conflicts."},
            {"role": "user", "content": prompt},
        ],
    )
# Prompt 4: Calendar Update
# (The LLM only simulates the update here; a real application would call
# an actual calendar API at this step.)
def update_calendar(appointment: AppointmentDetails) -> CalendarUpdate:
    prompt = f"Add this appointment to the calendar: {appointment}"
    return client.chat.completions.create(
        response_model=CalendarUpdate,
        messages=[
            {"role": "system", "content": "You are an AI that updates the calendar."},
            {"role": "user", "content": prompt},
        ],
    )
# Prompt 5: User Confirmation Generation
def generate_confirmation(appointment: AppointmentDetails) -> UserConfirmation:
    prompt = f"Generate a confirmation message for this appointment: {appointment}"
    return client.chat.completions.create(
        response_model=UserConfirmation,
        messages=[
            {"role": "system", "content": "You are an AI that generates appointment confirmations."},
            {"role": "user", "content": prompt},
        ],
    )
# Example usage
user_request = "I need to schedule a doctor's appointment for next Tuesday at 2 PM at City Hospital."
existing_appointments = ["Monday at 10 AM: Team meeting", "Tuesday at 3 PM: Dentist appointment"]

try:
    appointment_type = classify_appointment_type(user_request)
    print(f"Appointment Type: {appointment_type.type}")

    details = extract_appointment_details(user_request)
    print(f"Appointment Details: {details}")

    conflicts = check_conflicts(details, existing_appointments)
    if not conflicts:
        update = update_calendar(details)
        if update.success:
            confirmation = generate_confirmation(details)
            print(f"Confirmation: {confirmation.confirmation_message}")
        else:
            print(f"Failed to update calendar: {update.message}")
    else:
        print("Conflicts detected:", conflicts)
except Exception as e:
    print(f"Error: {e}")
Output
Appointment Type: doctor's appointment
Appointment Details: date='2024-07-30' time='2 PM' location='City Hospital'
Confirmation: "Your appointment is confirmed for July 30, 2024 at 2 PM at City Hospital."
With this modular approach, each prompt becomes far easier to develop, maintain, and test in isolation. We can thoroughly verify each piece and then compose them together for a robust overall experience.
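For example, the extraction step can be exercised on its own with ordinary unit tests. Here is a minimal sketch using pytest; the inputs and expected values are illustrative, and since LLM outputs can vary (e.g. "2 PM" vs. "2:00 PM"), a production suite would mock the LLM call or normalize fields before asserting:

import pytest

@pytest.mark.parametrize(
    "request_text, expected_time, expected_location",
    [
        ("Checkup on 2024-08-01 at 2 PM at City Hospital", "2 PM", "City Hospital"),
        ("Haircut on 2024-08-02 at 10 AM at Main Street Salon", "10 AM", "Main Street Salon"),
    ],
)
def test_extract_appointment_details(request_text, expected_time, expected_location):
    # Each focused prompt maps to one small, independently testable function
    details = extract_appointment_details(request_text)
    assert details.time == expected_time
    assert details.location == expected_location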
Start Small, Then Expand
Rethinking complex LLM interactions as sequences of small, focused tasks is powerful. Begin by carving off pieces of functionality into single-purpose prompts. Exhaustively test each one. Then, gradually compose them into larger and larger workflows.
This strategy lets you start getting value from LLMs quickly while maintaining a high degree of reliability and control as your application grows. Build complexity one solid, predictable piece at a time.
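As a sketch of what that composition might look like, the five functions above could be wired into a single workflow (error handling and the calendar data source are simplified here):

# One possible composition of the focused prompts into a workflow.
# Each step stays independently testable and swappable.
def schedule_appointment(user_request: str, existing_appointments: List[str]) -> str:
    appointment_type = classify_appointment_type(user_request)  # could route to type-specific flows
    details = extract_appointment_details(user_request)
    conflicts = check_conflicts(details, existing_appointments)
    if conflicts:
        return f"Conflicts detected: {conflicts}"
    update = update_calendar(details)
    if not update.success:
        return f"Failed to update calendar: {update.message}"
    return generate_confirmation(details).confirmation_message

Because each link in the chain is a plain function with a typed result, swapping in a better extractor or conflict checker later touches exactly one piece of the pipeline.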
The Future is Composable
As LLM capabilities continue to expand, the composable approach will become even more essential. Assembling powerful applications from small, reliable building blocks that can be endlessly remixed is the future of LLM development. Investing in focused prompt design today sets you up for long-term success.
So the next time you're tempted to build a massive, all-encompassing prompt, pause and ask yourself: Can I break this down into smaller, single-purpose pieces? Your users (and maintainers) will thank you.