Cookbooks - Future AGI Documentation

Getting Started

Evals

Learn how to evaluate AI model performance with Future AGI Evals

Protect

Implement AI safeguards and protection mechanisms

Dataset

Work with datasets for model training and evaluation

Knowledge Base

Build and manage knowledge bases for your AI applications

Integrations

Portkey

Connect Future AGI with Portkey for enhanced capabilities

LangChain

Improve reliability in LangChain and LangGraph applications

LlamaIndex

Make a LlamaIndex PDF chatbot production-ready

Evaluation

Meeting Summarization

Evaluate the quality of AI-generated meeting summaries

AI SDR Evaluation

Assess AI-powered sales development representative performance

AI Agent Evaluation

Learn advanced techniques for evaluating AI agent performance

Simulation

Chat Simulation with Fix My Agent

Simulate and test AI chat agents using the Future AGI SDK

Voice Simulation with SDK

Test conversational voice AI agents with the agent-simulate SDK

Observability

LangChain Chatbot

Add monitoring and observability to your AI applications

Text-to-SQL Agent

Evaluate the performance of text-to-SQL conversion agents

RAG

Experimenting with LangChain RAG

Build and improve RAG applications using LangChain

Evaluating RAG Applications

Methods for evaluating retrieval-augmented generation systems

Trustworthy RAG Chatbots

Build reliable and accurate RAG-powered chatbots

Decrease Hallucinations in RAG

Reduce hallucinations in retrieval-augmented generation systems

AI Evaluation SDK

Local Metrics

Catch hallucinations and contradictions locally in under one second

LLM-as-Judge

Use Gemini to judge accuracy when heuristics miss paraphrases

RAG Evaluation

Diagnose retrieval vs generation failures in your RAG pipeline

Guardrails

Block jailbreaks, code injection, and PII leaks in under 10ms

Streaming Safety

Cut off toxic LLM output mid-stream with real-time monitoring

AutoEval

Auto-generate test pipelines from app descriptions for CI/CD

Feedback Loop

Teach your LLM judge from past mistakes with ChromaDB feedback

Multimodal Judge

Judge images and audio alongside text with Gemini vision

Optimization

End-to-End Prompt Optimization

Optimize prompts using the Future AGI platform

Basic Prompt Optimization

Optimize prompts for better performance

Evolutionary Optimization with GEPA

Optimize prompts using an evolutionary algorithm for state-of-the-art results

Using Different Evaluation Metrics

Choose the right metrics for optimization workflows

Choosing the Right Optimizer

Select the best optimization strategy for your specific use case

Using Custom Datasets for Optimization

Prepare and integrate datasets from various sources for optimization
