Quick Start Guide

This guide will help you get started with LLM Batch Helper quickly.

🎉 New in v0.3.0: Simplified API - no more async/await syntax needed!

Installation

Install from PyPI:

pip install llm_batch_helper

Environment Setup

Set up your API keys as environment variables:

# For OpenAI (all models including GPT-5)
export OPENAI_API_KEY="your-openai-api-key"

# For OpenRouter (100+ models - Recommended)
export OPENROUTER_API_KEY="your-openrouter-api-key"

# For Together.ai
export TOGETHER_API_KEY="your-together-api-key"

Or create a .env file in your project directory:

OPENAI_API_KEY=your-openai-api-key
OPENROUTER_API_KEY=your-openrouter-api-key
TOGETHER_API_KEY=your-together-api-key

Basic Usage

Simple Prompt Processing

from llm_batch_helper import LLMConfig, process_prompts_batch

# Create configuration
config = LLMConfig(
    model_name="gpt-4o-mini",
    temperature=1.0,
    max_completion_tokens=100,
    max_concurrent_requests=5
)

# Define prompts
prompts = [
    "What is the capital of France?",
    "Explain quantum computing in simple terms.",
    "Write a haiku about programming."
]

# Process prompts - no async/await needed!
results = process_prompts_batch(
    config=config,
    provider="openai",
    prompts=prompts,
    cache_dir="cache"
)

# Display results
for prompt_id, response in results.items():
    print(f"Response {prompt_id}:")
    print(response['response_text'])
    print("-" * 50)

Using OpenRouter (Recommended)

Access 100+ models through OpenRouter:

from llm_batch_helper import LLMConfig, process_prompts_batch

config = LLMConfig(
    model_name="deepseek/deepseek-v3.1-base",  # or openai/gpt-4o, anthropic/claude-3-5-sonnet
    temperature=1.0,
    max_completion_tokens=150
)

prompts = [
    "Explain the benefits of renewable energy.",
    "What are the main programming paradigms?"
]

results = process_prompts_batch(
    config=config,
    provider="openrouter",  # Access to 100+ models!
    prompts=prompts,
    cache_dir="openrouter_cache"
)

for prompt_id, response in results.items():
    print(f"{prompt_id}: {response['response_text']}")

Using Together.ai

from llm_batch_helper import LLMConfig, process_prompts_batch

config = LLMConfig(
    model_name="meta-llama/Meta-Llama-3.1-8B-Instruct-Turbo",
    temperature=1.0,
    max_completion_tokens=150
)

prompts = [
    "Explain machine learning to a 10-year-old.",
    "What are the advantages of open-source software?"
]

results = process_prompts_batch(
    config=config,
    provider="together",  # Use Together.ai
    prompts=prompts,
    cache_dir="together_cache"
)

for prompt_id, response in results.items():
    print(f"{prompt_id}: {response['response_text']}")

File-Based Processing

from llm_batch_helper import LLMConfig, process_prompts_batch

config = LLMConfig(
    model_name="gpt-4o-mini",
    temperature=1.0,
    max_completion_tokens=200
)

# Process all .txt files in a directory
results = process_prompts_batch(
    config=config,
    provider="openai",
    input_dir="my_prompts",  # Directory with .txt files
    cache_dir="file_cache",
    force=False  # Use cached responses if available
)

print(f"Processed {len(results)} files!")

Configuration Options

Key Parameters

model_name: The LLM model to use (required)
temperature: Controls randomness (0.0 to 2.0, default: 1.0)
max_completion_tokens: Maximum tokens in the response (preferred)
max_tokens: Legacy parameter (use max_completion_tokens instead)
max_concurrent_requests: Number of parallel requests (default: 5)
system_instruction: System prompt for the model
max_retries: Number of retry attempts on failure (default: 10)
verification_callback: Custom function to verify response quality

Caching

Responses are automatically cached to avoid redundant API calls:

# First run - makes API calls
results1 = process_prompts_batch(
    config=config,
    provider="openai",
    prompts=prompts,
    cache_dir="my_cache"
)

# Second run - uses cached responses
results2 = process_prompts_batch(
    config=config,
    provider="openai",
    prompts=prompts,  # Same prompts
    cache_dir="my_cache",  # Same cache directory
    force=False  # Don't force regeneration
)

Error Handling

The package includes built-in retry logic with detailed logging and error handling:

config = LLMConfig(
    model_name="gpt-4o-mini",
    max_retries=5,  # Retry up to 5 times
    temperature=1.0
)

results = process_prompts_batch(
    config=config,
    provider="openai",
    prompts=prompts
)

# Check for errors in results
for prompt_id, response in results.items():
    if "error" in response:
        print(f"Error in {prompt_id}: {response['error']}")
    else:
        print(f"Success: {response['response_text']}")

🔍 New Retry Logging: You’ll see detailed logs during retries:

🔄 [14:23:15] Retry attempt 1/5:
   Error: RateLimitError (status: 429)
   Message: Rate limit exceeded...
   Waiting 4.0s before next attempt...

Next Steps

Check out the API Reference reference for detailed documentation
Explore Examples for more complex use cases
Learn about different Provider Support and their features
Try the interactive Tutorials