# Future AGI Documentation

## Docs

- [Admin Settings](https://futureagi.mintlify.app/admin-settings.md)
- [Create agent definition](https://futureagi.mintlify.app/api-reference/agent-definitions/create-agent-definition.md): Create a new agent definition and its first version.
- [Create new version of agent](https://futureagi.mintlify.app/api-reference/agent-versions/create-new-version-of-agent.md): Create a new version of an existing agent definition by providing updated agent properties and a commit message.
- [Apply Evaluation Group](https://futureagi.mintlify.app/api-reference/eval-groups/apply-evaluation-group.md): Applies an evaluation group to a set of data, creating user evaluation metrics.
- [Create Evaluation Group](https://futureagi.mintlify.app/api-reference/eval-groups/create-evaluation-group.md): Creates a new evaluation group within the user's workspace.
- [Delete Evaluation Group](https://futureagi.mintlify.app/api-reference/eval-groups/delete-evaluation-group.md): Soft deletes an evaluation group and removes all its associated evaluation templates.
- [Edit Evaluation Group Members](https://futureagi.mintlify.app/api-reference/eval-groups/edit-evaluation-group-members.md): Adds or removes evaluation templates from an evaluation group.
- [List Evaluation Groups](https://futureagi.mintlify.app/api-reference/eval-groups/list-evaluation-groups.md): Retrieves a paginated list of evaluation groups for the user's workspace, including sample groups.
- [Retrieve Evaluation Group](https://futureagi.mintlify.app/api-reference/eval-groups/retrieve-evaluation-group.md): Retrieves detailed information about a specific evaluation group, including its members.
- [Update Evaluation Group](https://futureagi.mintlify.app/api-reference/eval-groups/update-evaluation-group.md): Updates an entire evaluation group's details.
- [Get Evaluation Log Details](https://futureagi.mintlify.app/api-reference/eval-logs-&-metrics/get-evaluation-log-details.md): Retrieves detailed logs for a specific evaluation template, with support for advanced filtering, sorting, and pagination. This endpoint uses a GET request with a request body to handle complex filtering and sorting configurations.
- [Get Evals List](https://futureagi.mintlify.app/api-reference/evals-list/get-evals-list.md): Retrieves a list of evaluations for a given dataset, with options for filtering and ordering.
- [Health check](https://futureagi.mintlify.app/api-reference/health/health-check.md): Returns 200 status when server is up and running. No authentication required.
- [Get prompt version by name](https://futureagi.mintlify.app/api-reference/prompt-workbench/get-prompt-version-by-name.md): Fetch a prompt version by template name and either an explicit version (e.g. `v1`) or a label name (e.g. `Production`). When both `version` and `label` are provided, `version` takes precedence. Returns the full prompt template data along with the resolved version's configuration, variables, output s…
- [Create a New Test Run](https://futureagi.mintlify.app/api-reference/run-tests/create-a-new-test-run.md): Creates and configures a new test run, associating it with scenarios, an agent definition, and detailed evaluation configurations.
- [Execute a test run](https://futureagi.mintlify.app/api-reference/run-tests/execute-a-test-run.md): Triggers the execution of a specified test run. The execution can be customized to include or exclude specific scenarios.
- [Add empty rows to a scenario](https://futureagi.mintlify.app/api-reference/scenarios/add-empty-rows-to-a-scenario.md): Adds a specified number of empty rows to an existing scenario. This is useful for populating a scenario with placeholders for future data entry.
- [Add rows to a scenario using AI](https://futureagi.mintlify.app/api-reference/scenarios/add-rows-to-a-scenario-using-ai.md): Initiates an asynchronous task to generate and add a specified number of new rows to a scenario's dataset using AI. A description can be provided to guide the content generation.
- [Edit a scenario](https://futureagi.mintlify.app/api-reference/scenarios/edit-a-scenario.md): Updates the properties of a specific scenario, such as its name, description, associated graph, or the simulator agent's prompt.
- [Generate or create a scenario](https://futureagi.mintlify.app/api-reference/scenarios/generate-or-create-a-scenario.md): Creates a new scenario from a dataset, a script, or a generated/provided graph. The creation is processed in the background.
- [Auto-Configure Your Testing Pipeline](https://futureagi.mintlify.app/cookbook/ai-evaluation/autoeval.md): Describe your AI application in plain English and get an auto-generated evaluation pipeline with metrics, scanners, and thresholds -- ready to export to CI/CD.
- [Teach Your Judge from Past Mistakes](https://futureagi.mintlify.app/cookbook/ai-evaluation/feedback-loop.md): Store developer corrections in ChromaDB, retrieve them as few-shot examples via vector search, and inject them into the LLM judge prompt so it stops repeating the same mistakes.
- [Protect Your LLM from Prompt Injection](https://futureagi.mintlify.app/cookbook/ai-evaluation/guardrails.md): Build a sub-10ms security middleware that blocks jailbreaks, code injection, PII leaks, secret exposure, and malicious URLs -- all locally with zero API calls.
- [When Heuristics Aren't Enough: LLM-as-Judge](https://futureagi.mintlify.app/cookbook/ai-evaluation/llm-judge.md): Use an LLM to judge accuracy when local heuristics miss paraphrases, then build custom domain-specific judges and batch QA review pipelines.
- [Catch a Hallucinating Medical Chatbot](https://futureagi.mintlify.app/cookbook/ai-evaluation/local-metrics.md): Build a local validation layer that catches hallucinations, wrong dosages, and contradictions -- all in under one second with zero API keys.
- [Judge Images and Audio with Your LLM](https://futureagi.mintlify.app/cookbook/ai-evaluation/multimodal-judge.md): Pass image and audio URLs to the LLM judge to verify product descriptions match photos, check transcription accuracy, and auto-generate grading criteria.
- [AI Evaluation SDK Cookbooks](https://futureagi.mintlify.app/cookbook/ai-evaluation/overview.md): Hands-on tutorials that solve real problems you will face when building, testing, and securing AI applications with the fi-evals Python SDK.
- [Is Your RAG Pipeline Lying to Users?](https://futureagi.mintlify.app/cookbook/ai-evaluation/rag-evaluation.md): Diagnose exactly where your RAG pipeline fails by measuring retrieval quality and generation quality independently.
- [Stop Toxic Output Mid-Stream](https://futureagi.mintlify.app/cookbook/ai-evaluation/streaming.md): Monitor streaming LLM output token-by-token and cut off generation the instant it turns toxic, incoherent, or off-topic.
- [Meeting Summarization](https://futureagi.mintlify.app/cookbook/cookbook1/AI-Evaluation-for-Meeting-Summarization.md)
- [Dataset](https://futureagi.mintlify.app/cookbook/cookbook10/Using-FutureAGI-Dataset.md): Use FutureAGI Dataset to create and manage your datasets
- [Evals](https://futureagi.mintlify.app/cookbook/cookbook10/Using-FutureAGI-Evals.md): Use FutureAGI Evals to evaluate your AI models
- [Knowledge Base](https://futureagi.mintlify.app/cookbook/cookbook10/Using-FutureAGI-KB.md): Use FutureAGI Knowledge Base to create and manage your knowledge base
- [Protect](https://futureagi.mintlify.app/cookbook/cookbook10/Using-FutureAGI-Protect.md): Use FutureAGI Protect to protect your data
- [Portkey](https://futureagi.mintlify.app/cookbook/cookbook11/integrate-portkey-and-futureagi.md)
- [Text-to-SQL Agent](https://futureagi.mintlify.app/cookbook/cookbook12/Evaluating-Text-to-SQL-Agent-using-Future-AGI.md)
- [LangChain](https://futureagi.mintlify.app/cookbook/cookbook13/Adding-Reliability-to-Your-LangChain-LangGraph-Application-with-Future AGI.md)
- [LlamaIndex](https://futureagi.mintlify.app/cookbook/cookbook14/Build-Reliable-PDF-RAG-chatbots-with-LlamaIndex-and-Future-AGI.md)
- [CrewAI](https://futureagi.mintlify.app/cookbook/cookbook16/Building-AI-Research-Team-with-CrewAI-and-FutureAGI.md): Learn how to build a multi-agent research system using CrewAI with integrated observability and in-line evaluations from FutureAGI for real-time quality monitoring.
- [Testing a Voice AI Agent with Agent Simulate SDK](https://futureagi.mintlify.app/cookbook/cookbook17/simulate-sdk-demo.md): This cookbook demonstrates how to use the agent-simulate SDK to test a conversational voice AI agent.
- [Chat Simulation with Fix My Agent](https://futureagi.mintlify.app/cookbook/cookbook18/chat-simulation-with-fix-my-agent.md): Simulate AI chat agents at scale and get instant AI-powered diagnostics to improve performance
- [AI SDR Evaluation](https://futureagi.mintlify.app/cookbook/cookbook2/AI-Evaluation-for-AI-SDR.md)
- [AI Agent Evaluation](https://futureagi.mintlify.app/cookbook/cookbook3/Mastering-Evaluation-of-AI-Agents.md)
- [Experimenting Langchain RAG](https://futureagi.mintlify.app/cookbook/cookbook5/How-to-build-and-incrementally-improve-RAG-applications-in-Langchain.md)
- [Evaluating RAG Applications](https://futureagi.mintlify.app/cookbook/cookbook6/How-to-evaluate-RAG-Applications.md)
- [Trustworthy RAG Chatbots](https://futureagi.mintlify.app/cookbook/cookbook7/Creating-Trustworthy-RAGs-for-Chatbots.md)
- [LangChain Chatbot](https://futureagi.mintlify.app/cookbook/cookbook8/How-To-Implement-Observability.md): Master AI observability with FutureAGI. Track LLM performance, monitor metrics, and optimize Python apps. Step-by-step guide with examples.
- [Decrease Hallucinations in RAG](https://futureagi.mintlify.app/cookbook/cookbook9/How-To-Decrease-RAG-Hallucination.md)
- [MongoDB](https://futureagi.mintlify.app/cookbook/integrations/mongodb.md): Learn how to build production-grade PDF RAG chatbots using MongoDB Atlas for vector search and Future AGI to trace, evaluate, and monitor the real-time performance of LLM pipelines
- [Basic Prompt Optimization](https://futureagi.mintlify.app/cookbook/optimization/basic-prompt-optimization.md): A hands-on guide to optimizing your first prompt using the agent-opt Python library with a simple Random Search strategy.
- [Choosing the Right Optimizer](https://futureagi.mintlify.app/cookbook/optimization/comparing-optimization-strategies.md): A practical guide to selecting the best optimization strategy (Bayesian Search, Meta-Prompt, GEPA, etc.) based on your specific task and goals.
- [End-to-End Prompt Optimization](https://futureagi.mintlify.app/cookbook/optimization/end-to-end-prompt-optimization.md)
- [Using Different Evaluation Metrics](https://futureagi.mintlify.app/cookbook/optimization/eval-metrics-for-optimization.md): Learn how to use the FutureAGI platform, local LLM-as-a-judge, and local heuristic metrics to guide your prompt optimization.
- [Evolutionary Optimization with GEPA](https://futureagi.mintlify.app/cookbook/optimization/evolutionary-optimization-with-gepa.md): A guide to using GEPA, a powerful evolutionary algorithm for state-of-the-art prompt optimization in complex, high-stakes scenarios.
- [Using Custom Datasets for Optimization](https://futureagi.mintlify.app/cookbook/optimization/importing-and-using-datasets.md): Learn how to prepare and integrate datasets from various sources (in-memory, CSV, JSON, JSONL) for effective prompt optimization.
- [Cookbooks](https://futureagi.mintlify.app/cookbook/overview.md): Practical guides and tutorials for using Future AGI products effectively
- [FAQs](https://futureagi.mintlify.app/faq.md): Find answers to common questions about Future AGI products.
- [Answer Refusal](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/answer-refusal.md): Checks whether an AI model properly refuses to answer harmful, dangerous, or inappropriate requests. It identifies cases where the model should have declined to provide information but instead provided a potentially harmful response.
- [Audio Quality](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/audio-quality.md): Evaluates the perceptual quality of an audio input, assessing aspects like clarity, noise levels, and overall listenability using an LLM.
- [Audio Transcription](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/audio-transcription.md): Analyses the accuracy of a provided transcription against the content of a given audio file.
- [Bias Detection](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/bias-detection.md): Identifies various forms of bias, including gender, racial, cultural, or ideological bias in the output. It evaluates input for balanced perspectives and neutral language use.
- [BLEU Score](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/bleu.md): Measures n-gram overlap precision between the generated and reference text.
- [Caption Hallucination](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/caption-hallucination.md): Evaluates whether an image caption contains fabricated information not actually visible in the image.
- [Chunk Attribution](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/chunk-attribution.md): Evaluates whether a language model references the provided context chunks at all when generating its response. This metric assesses if the output acknowledges and incorporates information from the context, indicating the model's basic ability to leverage provided data.
- [Chunk Utilization](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/chunk-utilization.md): Measures how effectively a language model leverages information from the provided context to produce a coherent and contextually appropriate output.
- [Clinically Inappropriate Tone](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/clinically-inappropriate-tone.md): Evaluates whether text uses an appropriate tone for clinical or healthcare contexts
- [Completeness](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/completeness.md): Evaluates whether the response fully addresses the input query. This evaluation is crucial for ensuring that the generated response is comprehensive and leaves no aspect of the query unanswered.
- [Content Moderation](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/content-moderation.md): Evaluates content safety using OpenAI's content moderation system to detect and flag potentially harmful, inappropriate, or unsafe content. Provides assessment of content against established safety guidelines.
- [Content Safety Violation](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/content-safety-violation.md): Detects harmful, unsafe, or prohibited content that violates safety guidelines.
- [Context Adherence](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/context-adherence.md): Evaluates how well responses stay within the provided context by measuring if the output contains any information not present in the given context. This evaluation is crucial for ensuring factual consistency and preventing hallucination in responses.
- [Context Relevance](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/context-relevance.md): Evaluates whether the provided context is sufficient and relevant to answer the given input query. This evaluation is crucial for RAG systems to ensure that retrieved context pieces contain the necessary information to generate accurate responses.
- [Conversation Coherence](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/conversation-coherence.md): Evaluates how logically a conversation flows and maintains context throughout the dialogue. This metric assesses whether responses are consistent, contextually appropriate, and maintain a natural progression of ideas within the conversation thread.
- [Conversation Resolution](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/conversation-resolution.md): Evaluates whether each user query or statement in a conversation receives an appropriate and complete response from the AI. This metric assesses if the conversation reaches satisfactory conclusions for each user interaction, ensuring that questions are answered and statements are appropriately ackno…
- [Cultural Sensitivity](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/cultural-sensitivity.md): Analyses the output for cultural appropriateness, inclusive language, and awareness of cultural nuances. It identifies potential cultural biases or insensitive content, ensuring that the content respects diverse perspectives and avoids promoting stereotypes or discrimination.
- [Data Privacy Compliance](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/data-privacy.md): Determines whether content aligns with key privacy regulations such as GDPR and HIPAA, ensuring adherence to data protection and compliance standards. This assessment is critical for mitigating risks associated with sensitive data exposure and regulatory violations.
- [Detect Hallucination](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/detect-hallucination.md): Identifies if the model fabricated facts or added information that was not present in the input or context
- [Embedding Similarity](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/embedding-similarity.md): Measures semantic similarity between the generated and reference content.
- [Eval Ranking](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/eval-ranking.md): Provides a ranking score for each context based on specified criteria. This evaluation ensures that contexts are ranked according to their relevance and suitability for the given input.
- [Factual Accuracy](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/factual-accuracy.md): Verifies if the provided output is factually correct based on the given information or the absence thereof. It ensures that the output maintains factual integrity and does not introduce inaccuracies.
- [Fuzzy Match](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/fuzzy-match.md): Compares two texts for similarity using fuzzy matching techniques. It's useful for detecting approximate matches between expected and generated model output when exact matching might be too strict, accounting for minor differences in wording, spelling, or formatting.
- [Groundedness](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/groundedness.md): Assesses whether a response is firmly based on the provided context. This evaluation ensures that the response does not introduce information that is not supported by the context, thereby maintaining factual accuracy and relevance.
- [Hit Rate](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/hit-rate.md): Checks whether at least one relevant chunk was retrieved. A simple, high-level retrieval-stage metric for RAG pipelines that measures basic retrieval coverage.
- [Prompt Instruction Adherence](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/instruction-adherence.md): Measures how closely an output follows given prompt instructions, checking for completion of requested tasks and adherence to specified constraints or formats. This evaluation is crucial for ensuring that generated content meets the intended requirements and follows given instructions accurately.
- [Is Compliant](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/is-compliant.md): Evaluates whether content follows guidelines, standards, and acceptable use policies.
- [Is Concise](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/is-concise.md): Evaluates whether the response is concise and to the point
- [Is Email](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/is-email.md): Evaluates whether the input text is a valid email address. It checks if the text follows standard email formatting rules, including the presence of an @ symbol, a domain name, and a valid top-level domain.
- [Is Factually Consistent](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/is-factually-consistent.md): Evaluates whether output content is factually consistent with provided input or context
- [Is Good Summary](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/is-good-summary.md): Evaluates whether a summary effectively captures the key information from the original source content
- [Is Harmful Advice](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/is-harmful-advice.md): Evaluates whether content contains guidance, recommendations, or instructions that could lead to harm if followed.
- [Is Helpful](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/is-helpful.md): Evaluates whether the response is helpful in solving the user's problem or answering their question
- [Is Informal Tone](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/is-informal-tone.md): Detects whether the tone is informal or casual (e.g., use of slang, contractions, emoji)
- [Is JSON](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/is-json.md): Determines whether a given text conforms to a valid JSON format. Ensuring valid JSON formatting is critical for seamless data interoperability, as incorrect structures can lead to parsing errors and system failures.
- [Is Polite](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/is-polite.md): Evaluates whether the response demonstrates politeness, respect, and appropriate social etiquette. It checks for the presence of courteous language, absence of rudeness, and adherence to social norms in communication.
- [Levenshtein Similarity](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/lavenshtein-similarity.md): Measures text similarity based on the minimum number of single-character edits required to transform one text into another.
- [Length Evals - One Line](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/length-evals.md): Validating the structure and length of text is essential for ensuring that generated content meets specific requirements and maintains a high standard of quality.
- [LLM Function Calling](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/llm-function-calling.md): Evaluates the accuracy and effectiveness of function calls made by an LLM. It checks whether the output correctly identifies the need for a tool call and whether it accurately includes the tool with the appropriate parameters extracted from the input.
- [MRR (Mean Reciprocal Rank)](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/mrr.md): Measures how early the first relevant chunk appears in the ranked retrieval results. A retrieval-stage metric for RAG pipelines that focuses on the position of the first correct answer.
- [NDCG@K](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/ndcg-at-k.md): Normalized Discounted Cumulative Gain at K: measures ranking quality by giving more credit to relevant chunks that appear earlier in the retrieved results. A retrieval-stage metric for RAG pipelines.
- [No Age Bias](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/no-age-bias.md): Evaluates whether content contains age-related bias, stereotypes, or discriminatory content
- [No Apologies](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/no-apologies.md): Evaluates whether the response contains unnecessary apologies or apologetic language
- [No Gender Bias](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/no-gender-bias.md): Evaluates whether content contains gender-related bias, stereotypes, or discriminatory content
- [No Harmful Therapeutic Guidance](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/no-harmful-therapeutic-guidance.md): Evaluates whether content contains inappropriate or potentially harmful medical, psychological, or therapeutic advice.
- [No LLM Reference](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/no-llm-reference.md): Evaluates whether a model response contains references to OpenAI, its models (like ChatGPT, GPT-3, GPT-4), or identifies itself as an OpenAI product
- [No Racial Bias](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/no-racial-bias.md): Evaluates whether content contains racial bias, stereotypes, or discriminatory content
- [Numeric Similarity](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/numeric-similarity.md): Extracts numeric values from the generated output and computes the absolute or normalised difference from the numeric value in the reference
- [Overview](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/overview.md)
- [PII](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/pii.md): PII Detection evaluates text to identify the presence of personally identifiable information. This evaluation is crucial for ensuring privacy and compliance with data protection regulations by detecting and managing sensitive information in text data.
- [Precision@K](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/precision-at-k.md): Out of the top K retrieved chunks, what fraction is actually relevant. A retrieval-stage metric for RAG pipelines that measures how much noise your retriever returns.
- [Prompt Injection](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/prompt-injection.md): Detects attempts to manipulate or bypass the intended behaviour of language models through carefully crafted inputs. This evaluation is crucial for ensuring the security and reliability of AI systems by identifying potential security vulnerabilities in prompt handling.
- [Recall@K](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/recall-at-k.md): Out of all truly relevant chunks, what fraction appears in the top K retrieved results. A core retrieval-stage metric for RAG pipelines that measures how well your retriever surfaces relevant context.
- [Recall Score (Deprecated)](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/recall-score.md): Measures how much of the information in the reference is captured in the hypothesis.
- [ROUGE Score](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/rouge.md): Recall-specific measurement of lexical overlap between generated hypothesis and reference
- [Semantic List Contains](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/semantic-list-contains.md): Evaluates whether a generated response semantically contains one or more reference phrases or keywords.
- [Sexist](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/sexist.md): Detects content that has gender bias. This evaluation is essential for ensuring that content does not perpetuate gender stereotypes or discrimination, promoting inclusivity and respect.
- [Summary Quality](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/summary-quality.md): Evaluates whether a summary effectively captures the main points, maintains factual accuracy, and achieves an appropriate length while preserving the original meaning. It checks for both the inclusion of key information and the exclusion of unnecessary details.
- [Synthetic Image Evaluator](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/synthetic-image-evaluator.md): Evaluates whether an image was generated by AI or captured by a camera/created by humans.
- [Task Completion](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/task-completion.md): Evaluates whether a response successfully completes the task requested in the input.
- [Text to SQL](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/text-to-sql.md): Evaluates the accuracy and quality of SQL queries generated from natural language instructions.
- [Tone](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/tone.md): Tone evaluation analyses the tone and sentiment of content. This evaluation helps in understanding the emotional context and intent behind the text, which is crucial for tailoring communication to specific audiences or purposes.
- [Toxicity](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/toxicity.md): Toxicity assesses the content for harmful or toxic language. This evaluation is crucial for ensuring that content does not contain language that could be offensive, abusive, or harmful to individuals or groups.
- [Translation Accuracy](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/translation-accuracy.md): Evaluates the quality of translation by checking semantic accuracy, cultural appropriateness, and preservation of original meaning. It considers both literal accuracy and natural expression in the target language.
- [Valid Links](https://futureagi.mintlify.app/future-agi/get-started/evaluation/builtin-evals/valid-links.md): Ensures that generated content contains valid hyperlinks. This evaluation helps maintain a high standard of quality by validating the presence and validity of links in generated content.
- [Create Custom Evals](https://futureagi.mintlify.app/future-agi/get-started/evaluation/create-custom-evals.md): Creating custom evaluations allows you to tailor assessment criteria to your specific use case and business requirements. Future AGI provides flexible tools to build evaluations that go beyond standard templates, enabling you to define custom rules, scoring mechanisms, and validation logic.
- [Evaluation Groups](https://futureagi.mintlify.app/future-agi/get-started/evaluation/eval-groups.md): Evaluation groups allow you to organize multiple evaluations into logical collections and run them simultaneously. This feature streamlines the evaluation process by enabling batch execution of evals, making it easier to manage complex evaluation workflows.
- [Evaluate via CI/CD Pipeline](https://futureagi.mintlify.app/future-agi/get-started/evaluation/evaluate-ci-cd-pipeline.md)
- [Evaluation Patterns & Recipes](https://futureagi.mintlify.app/future-agi/get-started/evaluation/evaluate-patterns.md): Common patterns for evaluating AI applications: RAG pipelines, chatbots, agents, image generation, and more.
- [Use Future AGI Models](https://futureagi.mintlify.app/future-agi/get-started/evaluation/future-agi-models.md): Future AGI's proprietary models trained on a vast variety of datasets to perform evaluations - [Running Your First Eval](https://futureagi.mintlify.app/future-agi/get-started/evaluation/running-your-first-eval.md): This guide will walk you through setting up an evaluation in **Future AGI**, allowing you to assess AI models and workflows efficiently. You can run evaluations via the **Future AGI platform** or using the **Python SDK**. - [Use Custom Models](https://futureagi.mintlify.app/future-agi/get-started/evaluation/use-custom-models.md): Future AGI allows you to use your own custom models. This is useful if you want to use a model that is tailor made for your use case. - [Concept](https://futureagi.mintlify.app/future-agi/get-started/knowledge-base/concept.md) - [Create a Knowledge Base using SDK](https://futureagi.mintlify.app/future-agi/get-started/knowledge-base/how-to/create-kb-using-sdk.md): This guide will help you get started with the Knowledge Base (KB) Python SDK. - [Create a Knowledge Base using UI](https://futureagi.mintlify.app/future-agi/get-started/knowledge-base/how-to/create-kb-using-ui.md): This guide will help you seamlessly create a **Knowledge Base (KB)** using the Future AGI platform. - [Overview](https://futureagi.mintlify.app/future-agi/get-started/knowledge-base/overview.md): The Knowledge Base (KB) is the foundation for grounded, context-aware synthetic data generation and accurate evaluations. 
It ensures that every output, whether from data generation or evaluation, is informed by your uploaded content, which is semantically processed and abstracted to reflect your organ… - [Enriching Spans with Attributes, Metadata, and Tags](https://futureagi.mintlify.app/future-agi/get-started/observability/manual-tracing/add-attributes-metadata-tags.md): When building applications, you'll often need to capture additional context beyond what standard frameworks or LLM clients provide. Here's how to enrich your traces with custom information. - [Integrate Events, Exceptions, and Status into Spans](https://futureagi.mintlify.app/future-agi/get-started/observability/manual-tracing/add-events-exceptions-status.md): OpenTelemetry (OTEL) provides support for adding Events, Exceptions, and Status into spans. - [Advanced Tracing (OTEL)](https://futureagi.mintlify.app/future-agi/get-started/observability/manual-tracing/advanced-tracing-examples.md): Exploring Manual Context Propagation, Custom Decorators, and Sampling Techniques - [Adding Annotations to your Spans](https://futureagi.mintlify.app/future-agi/get-started/observability/manual-tracing/annotating-using-api.md): Learn how to annotate your spans in bulk using the API - [Tool Spans Creation](https://futureagi.mintlify.app/future-agi/get-started/observability/manual-tracing/create-tool-spans.md) - [Get Current Tracer and Span](https://futureagi.mintlify.app/future-agi/get-started/observability/manual-tracing/get-current-span-context.md) - [In-line Evaluations](https://futureagi.mintlify.app/future-agi/get-started/observability/manual-tracing/in-line-evals.md): In-line evaluations are a way to evaluate within a trace. - [Instrument with traceAI Helpers](https://futureagi.mintlify.app/future-agi/get-started/observability/manual-tracing/instrument-with-traceai-helpers.md): Future AGI's traceAI library offers convenient abstractions to streamline your manual instrumentation process.
- [Langfuse Integration](https://futureagi.mintlify.app/future-agi/get-started/observability/manual-tracing/langfuse-intergation.md): Integrate Future AGI evaluations with Langfuse - [Logging Prompt Templates & Variables](https://futureagi.mintlify.app/future-agi/get-started/observability/manual-tracing/log-prompt-templates.md) - [Mask Span Attributes](https://futureagi.mintlify.app/future-agi/get-started/observability/manual-tracing/mask-span-attributes.md) - [FI Semantic Conventions](https://futureagi.mintlify.app/future-agi/get-started/observability/manual-tracing/semantic-conventions.md) - [Set Session ID and User ID](https://futureagi.mintlify.app/future-agi/get-started/observability/manual-tracing/set-session-user-id.md): Adding SessionID and UserID as attributes to Spans for Tracing - [Implementing Tracing](https://futureagi.mintlify.app/future-agi/get-started/observability/manual-tracing/set-up-tracing.md): We recommend starting with the auto-instrumentation first. For advanced customization and granular control, you can directly utilize our OTEL-compliant instrumentation API. - [Dataset Optimization](https://futureagi.mintlify.app/future-agi/get-started/optimization/dataset-optimization.md): Automatically improve prompt templates stored in your datasets using evaluation-driven optimization algorithms. - [Use agent-opt Python SDK for Prompt Optimization](https://futureagi.mintlify.app/future-agi/get-started/optimization/how-to/using-python-sdk.md): A step-by-step guide to optimizing your AI workflows programmatically with our agent-opt Python library. Learn to set up optimizers, evaluators, and datasets. - [Bayesian Search Optimizer](https://futureagi.mintlify.app/future-agi/get-started/optimization/optimizers/bayesian-search.md): Learn how to use the Bayesian Search optimizer for intelligent few-shot prompt optimization. A guide on its configuration, parameters, and advanced usage. 
- [GEPA: Evolutionary Prompt Optimization](https://futureagi.mintlify.app/future-agi/get-started/optimization/optimizers/gepa.md): Discover GEPA (Genetic Pareto), a powerful evolutionary algorithm that evolves prompts over generations using reflection and mutation for complex, high-stakes optimization. - [Meta-Prompt Optimizer](https://futureagi.mintlify.app/future-agi/get-started/optimization/optimizers/meta-prompt.md): A guide to the Meta-Prompt optimizer, which uses a teacher LLM for deep reasoning-based prompt refinement through systematic failure analysis and rewriting. - [Prompt Optimization: Concepts and Strategies](https://futureagi.mintlify.app/future-agi/get-started/optimization/optimizers/overview.md): Learn the fundamentals of prompt optimization and compare different algorithms like GEPA, Meta-Prompt, and ProTeGi to choose the right strategy for your use case. - [PromptWizard Optimizer](https://futureagi.mintlify.app/future-agi/get-started/optimization/optimizers/promptwizard.md): Learn about PromptWizard, a multi-stage feedback-driven optimizer that improves prompts through a cycle of mutation, critique, and refinement. - [ProTeGi Optimizer](https://futureagi.mintlify.app/future-agi/get-started/optimization/optimizers/protegi.md): A guide to ProTeGi (Prompt optimization with Textual Gradients), which systematically improves prompts by identifying failures, generating critiques, and applying targeted fixes. - [Random Search Optimizer](https://futureagi.mintlify.app/future-agi/get-started/optimization/optimizers/random-search.md): Understand the Random Search optimizer, a simple and effective gradient-free method for establishing a baseline in prompt optimization by exploring random variations. - [Prompt Optimization Overview](https://futureagi.mintlify.app/future-agi/get-started/optimization/overview.md): An introduction to prompt optimization with the `agent-opt` Python library. 
Learn why it's essential and explore advanced algorithms for refining AI responses. - [Quickstart: Optimizing Your First Prompt](https://futureagi.mintlify.app/future-agi/get-started/optimization/quickstart.md): A quick, hands-on guide to getting started with prompt optimization using the agent-opt Python library. Optimize your first prompt in minutes. - [Concept](https://futureagi.mintlify.app/future-agi/get-started/protect/concept.md): Future AGI's Protect acts as a vital guardrail for AI applications, ensuring security, reliability, and ethical compliance during real-time interactions across text, image, and audio modalities. - [How to Use](https://futureagi.mintlify.app/future-agi/get-started/protect/how-to.md) - [Overview](https://futureagi.mintlify.app/future-agi/get-started/protect/overview.md): Future AGI's Protect module brings real-time safety and policy enforcement directly into your GenAI application flow. - [Evals for Prototype](https://futureagi.mintlify.app/future-agi/get-started/prototype/evals.md) - [Overview](https://futureagi.mintlify.app/future-agi/get-started/prototype/overview.md) - [Quickstart](https://futureagi.mintlify.app/future-agi/get-started/prototype/quickstart.md) - [Choose Winner](https://futureagi.mintlify.app/future-agi/get-started/prototype/winner.md) - [Auto-Instrumentation](https://futureagi.mintlify.app/future-agi/products/observability/auto-instrumentation/overview.md): Auto-instrumentation allows you to add tracing to your LLM applications with minimal code changes. Simply install our integration packages, and Future AGI will automatically capture spans, metrics, and relevant attributes for your LLM interactions. - [Components of Observability](https://futureagi.mintlify.app/future-agi/products/observability/concept/core-components.md): Observability in LLM-based applications relies on a structured framework that captures execution details at different levels of granularity. 
Each request follows a well-defined path, where **individual operations are recorded, grouped into execution flows, and organized for broader analysis.** This… - [What is OpenTelemetry?](https://futureagi.mintlify.app/future-agi/products/observability/concept/otel.md) - [Understanding Observability](https://futureagi.mintlify.app/future-agi/products/observability/concept/overview.md) - [What are Spans?](https://futureagi.mintlify.app/future-agi/products/observability/concept/spans.md) - [What is traceAI?](https://futureagi.mintlify.app/future-agi/products/observability/concept/traceai.md) - [What are Traces?](https://futureagi.mintlify.app/future-agi/products/observability/concept/traces.md): In observability frameworks, a Trace is a comprehensive representation of the execution flow of a request within a system. It is composed of multiple spans, each capturing a specific operation or step in the process. Traces provide a holistic view of how different components interact and contribute… - [Overview](https://futureagi.mintlify.app/future-agi/products/observability/overview.md): Understanding how your LLM application performs is essential for optimization. Future AGI's observability platform helps you monitor critical metrics like cost, latency, and evaluation results through comprehensive tracing capabilities. - [Alerts and Monitors](https://futureagi.mintlify.app/future-agi/products/observe/alerts-and-monitors.md): Alerts and Monitors in Future AGI are designed to detect anomalies and issues in your data. This feature helps you stay informed about critical metrics such as latency, cost, token usage, and evaluation metrics like toxicity, bias detection, and more. - [How to run evals?](https://futureagi.mintlify.app/future-agi/products/observe/evals.md): Future AGI's Eval tasks allow you to create and run automated tasks on your data. These tasks enable **automated workflows** to manage model **evaluation** at scale.
They provide ways to operationalize evaluations and track ongoing results without requiring manual intervention. Users can create and… - [Overview](https://futureagi.mintlify.app/future-agi/products/observe/overview.md): Future AGI's Observability platform delivers enterprise-grade monitoring and evaluation for large language models (LLMs) in production. Our solution provides deep visibility into LLM application performance through advanced telemetry data tracing and sophisticated evaluation metrics. - [Quickstart](https://futureagi.mintlify.app/future-agi/products/observe/quickstart.md) - [Sessions](https://futureagi.mintlify.app/future-agi/products/observe/session.md): Sessions in Future AGI are used to group traces, such as those from chatbot conversations. This feature allows users to view and analyze interactions between a human and AI, making it easier to build or debug chatbot applications. - [User Dashboard](https://futureagi.mintlify.app/future-agi/products/observe/users.md): The User Dashboard provides a consolidated view of all interactions, sessions, and traces linked to a specific user. It enables LLM application developers to debug issues, analyze behavior patterns, and optimize resource allocation at the individual user level. - [Overview](https://futureagi.mintlify.app/future-agi/products/observe/voice/overview.md): The voice observability feature lets you observe every conversation your agent has. You can treat it just like any other observe project: run evals and set up alerts for it. - [Quickstart](https://futureagi.mintlify.app/future-agi/products/observe/voice/quickstart.md): Setting up observability for your voice agent - [What is Future AGI?](https://futureagi.mintlify.app/home.md): Future AGI is an AI lifecycle platform designed to support enterprises throughout their AI journey.
It combines rapid prototyping, rigorous evaluation, continuous observability, and reliable deployment to help build, monitor, optimize, and secure generative AI applications. - [Anthropic](https://futureagi.mintlify.app/integrations/anthropic.md) - [Autogen](https://futureagi.mintlify.app/integrations/autogen.md) - [Bedrock](https://futureagi.mintlify.app/integrations/bedrock.md) - [Crew AI](https://futureagi.mintlify.app/integrations/crewai.md) - [DSPy](https://futureagi.mintlify.app/integrations/dspy.md) - [Google ADK](https://futureagi.mintlify.app/integrations/google_adk.md) - [Google GenAI](https://futureagi.mintlify.app/integrations/google_genai.md) - [Groq](https://futureagi.mintlify.app/integrations/groq.md) - [Guardrails](https://futureagi.mintlify.app/integrations/guardrails.md) - [Haystack](https://futureagi.mintlify.app/integrations/haystack.md) - [Instructor](https://futureagi.mintlify.app/integrations/instructor.md) - [LangChain](https://futureagi.mintlify.app/integrations/langchain.md) - [LangGraph](https://futureagi.mintlify.app/integrations/langgraph.md) - [LiteLLM](https://futureagi.mintlify.app/integrations/litellm.md) - [LiveKit](https://futureagi.mintlify.app/integrations/livekit.md) - [Llama Index](https://futureagi.mintlify.app/integrations/llamaindex.md) - [Llama Index Workflows](https://futureagi.mintlify.app/integrations/llamaindex-workflows.md) - [Mistral AI](https://futureagi.mintlify.app/integrations/mistralai.md) - [n8n](https://futureagi.mintlify.app/integrations/n8n.md): With this integration, you can dynamically retrieve prompts from your Future AGI account, select specific versions, and compile prompts with variables - all within the familiar n8n interface. 
- [Ollama](https://futureagi.mintlify.app/integrations/ollama.md) - [OpenAI](https://futureagi.mintlify.app/integrations/openai.md) - [OpenAI Agents](https://futureagi.mintlify.app/integrations/openai_agents.md) - [Overview](https://futureagi.mintlify.app/integrations/overview.md): Future AGI provides pre-built auto-instrumentation for the following frameworks and LLM providers: - [Pipecat](https://futureagi.mintlify.app/integrations/pipecat.md) - [Portkey](https://futureagi.mintlify.app/integrations/portkey.md) - [Prompt Flow](https://futureagi.mintlify.app/integrations/promptflow.md) - [Smol Agents](https://futureagi.mintlify.app/integrations/smol_agents.md) - [Together AI](https://futureagi.mintlify.app/integrations/togetherai.md) - [Vercel](https://futureagi.mintlify.app/integrations/vercel.md) - [Vertex AI (Gemini)](https://futureagi.mintlify.app/integrations/vertexai.md) - [Overview](https://futureagi.mintlify.app/product/agent-compass/overview.md): Introducing Agent Compass - [Quickstart](https://futureagi.mintlify.app/product/agent-compass/quickstart.md): Understanding components of Agent Compass - [Taxonomy](https://futureagi.mintlify.app/product/agent-compass/taxonomy.md): Taxonomy: actions, outcomes, and classifications. - [Annotation Labels](https://futureagi.mintlify.app/product/annotations/concepts/labels.md): Understand the five annotation label types -- categorical, numeric, text, star rating, and thumbs up/down -- and when to use each. - [Queues & Workflow](https://futureagi.mintlify.app/product/annotations/concepts/queues.md): Learn how annotation queues organize work -- statuses, assignment strategies, reservations, multi-annotator support, and review workflows. - [Scores](https://futureagi.mintlify.app/product/annotations/concepts/scores.md): Understand the Score model -- the unified annotation primitive that stores labels, values, and metadata across all source types. 
- [Add Items to Queues](https://futureagi.mintlify.app/product/annotations/features/add-items.md): Learn how to add traces, spans, sessions, dataset rows, prototypes, and simulation calls to annotation queues. - [Analytics & Agreement](https://futureagi.mintlify.app/product/annotations/features/analytics.md): Track annotation progress, annotator performance, label distribution, and inter-annotator agreement metrics. - [Annotate Items](https://futureagi.mintlify.app/product/annotations/features/annotate.md): Complete guide to the annotation workspace -- label inputs, keyboard shortcuts, navigation, instructions, and completion workflow. - [Automation Rules](https://futureagi.mintlify.app/product/annotations/features/automation.md): Set up rules to automatically add items to queues or pre-fill annotations based on conditions. - [Export Annotations](https://futureagi.mintlify.app/product/annotations/features/export.md): Export completed annotations as datasets (JSON/CSV) for fine-tuning, evaluation, or analysis. - [Inline Annotations](https://futureagi.mintlify.app/product/annotations/features/inline.md): Annotate traces, spans, sessions, and prototypes directly from their detail views without using queues. - [Create & Manage Labels](https://futureagi.mintlify.app/product/annotations/features/labels.md): Step-by-step guide to creating, editing, duplicating, and archiving annotation labels. - [Create & Manage Queues](https://futureagi.mintlify.app/product/annotations/features/queues.md): Step-by-step guide to creating annotation queues, configuring assignment strategies, and managing queue lifecycle. - [Annotations](https://futureagi.mintlify.app/product/annotations/overview.md): Add human feedback to your AI outputs with annotation labels, queues, and scores across traces, datasets, prototypes, and simulations. 
- [Quickstart](https://futureagi.mintlify.app/product/annotations/quickstart.md): Get started with annotations in 5 minutes -- create a label, set up a queue, add items, and start annotating. - [JavaScript SDK](https://futureagi.mintlify.app/product/annotations/sdk/javascript.md): Annotate traces and manage annotation queues programmatically using the FutureAGI JavaScript/TypeScript SDK. - [Python SDK](https://futureagi.mintlify.app/product/annotations/sdk/python.md): Annotate traces and manage annotation queues programmatically using the FutureAGI Python SDK. - [Add Rows to Dataset](https://futureagi.mintlify.app/product/dataset/how-to/add-rows-to-dataset.md): Learn how to add rows to your dataset - [Add Annotations](https://futureagi.mintlify.app/product/dataset/how-to/annotate-dataset.md): Annotations are essential for refining datasets, evaluating model outputs, and improving the quality of AI-generated responses. - [Create Dynamic Column by Executing Code](https://futureagi.mintlify.app/product/dataset/how-to/create-dynamic-column/by-executing-code.md): The **Execute Custom Code** feature allows users to create a dynamic column by writing and running Python code on dataset rows. This enables custom transformations, calculations, or data processing based on existing column values. - [Create Dynamic Column by Extracting Entities](https://futureagi.mintlify.app/product/dataset/how-to/create-dynamic-column/by-extracting-entities.md): This feature allows users to create a dynamic column by defining extraction rules that pull information from an existing column. - [Create Dynamic Column by Extracting JSON](https://futureagi.mintlify.app/product/dataset/how-to/create-dynamic-column/by-extracting-json.md): The **Extract JSON Key** feature allows users to extract specific values from JSON-formatted data stored in a dataset column of the JSON data type.
- [Create Dynamic Column by API Call](https://futureagi.mintlify.app/product/dataset/how-to/create-dynamic-column/using-api-calls.md): The **API Call** feature allows users to dynamically fetch and populate new dataset columns by integrating external APIs. - [Create Dynamic Column by Classification](https://futureagi.mintlify.app/product/dataset/how-to/create-dynamic-column/using-classification.md): The **Classification** feature allows users to categorise dataset rows by applying labels based on text content from a selected column. - [Create Dynamic Column by Conditional Node](https://futureagi.mintlify.app/product/dataset/how-to/create-dynamic-column/using-conditional-node.md): A **conditional node** is a dynamic column type that applies **branching logic** (if/elif/else) to determine operations on each row of a dataset. - [Create Dynamic Column by Running Prompt](https://futureagi.mintlify.app/product/dataset/how-to/create-dynamic-column/using-run-prompt.md): The **Run Prompt** feature allows you to create a dynamic column by running custom prompts through an LLM. - [Create Dynamic Column by Vector Database](https://futureagi.mintlify.app/product/dataset/how-to/create-dynamic-column/using-vector-db.md): Vector database retrieval allows you to fetch relevant data from an external vector database based on similarity searches. - [Create New Dataset](https://futureagi.mintlify.app/product/dataset/how-to/create-new-dataset.md): Learn how to create datasets to run experiments on - [Create Static Column](https://futureagi.mintlify.app/product/dataset/how-to/create-static-column.md): Static columns store fixed values directly within a dataset. They do not require computation, external processing, or updates unless manually modified.
- [Experiments in Dataset](https://futureagi.mintlify.app/product/dataset/how-to/experiments-in-dataset.md): To test, validate, and compare different prompt configurations - [Run Prompt in Dataset](https://futureagi.mintlify.app/product/dataset/how-to/run-prompt-in-dataset.md): Learn how to execute prompts against your dataset and generate responses - [Overview](https://futureagi.mintlify.app/product/dataset/overview.md): Create, manage and analyze datasets for AI model development and evaluation - [Create Prompt from Existing Template](https://futureagi.mintlify.app/product/prompt/how-to/create-prompt-from-existing-template.md): This guide will walk you through the process of creating a new prompt from an existing template in Future AGI. - [Create Prompt from Scratch](https://futureagi.mintlify.app/product/prompt/how-to/create-prompt-from-scratch.md): This guide will walk you through the process of creating a new prompt in Future AGI, configuring its parameters, and running it. - [Linked Traces](https://futureagi.mintlify.app/product/prompt/how-to/linked-traces.md): Linking prompts to traces is essential for monitoring and improving the performance of your language model applications. By establishing this connection, you can track metrics and evaluations for each prompt version, facilitating iterative enhancements over time. - [Manage Prompt Folders](https://futureagi.mintlify.app/product/prompt/how-to/manage-folders.md): This guide will walk you through the process of managing prompt folders in Future AGI. 
- [Prompt Workbench Using SDK](https://futureagi.mintlify.app/product/prompt/how-to/prompt-workbench-using-sdk.md) - [Overview](https://futureagi.mintlify.app/product/prompt/overview.md): Create, manage, and optimize AI prompts for reliable and consistent language model outputs - [Agent Definition](https://futureagi.mintlify.app/product/simulation/agent-definition.md): An agent definition is a configuration that specifies how your AI agent behaves during voice or chat conversations - [Chat Simulation Using SDK](https://futureagi.mintlify.app/product/simulation/how-to/chat-simulation-using-sdk.md): Run Future AGI chat simulations from Python by providing an agent callback and executing an existing Run Test. - [Evaluate Tool Calling](https://futureagi.mintlify.app/product/simulation/how-to/evaluate-tool-calling.md): Evaluate the tool calling capabilities of your agent - [Fix My Agent](https://futureagi.mintlify.app/product/simulation/how-to/fix-my-agent.md): Get AI-powered diagnostics and instant fixes for your agent's performance issues - [Replay](https://futureagi.mintlify.app/product/simulation/how-to/observe-to-simulate.md): Replay real production sessions in a dev environment using chat simulation to debug, iterate, and improve your agent. - [Simulate from Prompt Workbench](https://futureagi.mintlify.app/product/simulation/how-to/prompt-simulation.md): Test your prompts in multi-turn chat simulations directly from the Prompt Workbench - no SDK agent setup required. - [Voice Observability](https://futureagi.mintlify.app/product/simulation/how-to/voice-observability.md): Observe all the conversations that your agent does. 
You can treat it just like any other observe project: run evals and set up alerts for it. - [Overview](https://futureagi.mintlify.app/product/simulation/overview.md): AI agent simulations are controlled environments where AI agents can be tested, evaluated, and refined through various scenarios and interactions - [Personas](https://futureagi.mintlify.app/product/simulation/personas.md): To create realistic scenarios, you need to create personas that will be used in your simulation tests. - [Run Tests](https://futureagi.mintlify.app/product/simulation/run-tests.md): Complete guide to creating and executing simulation tests for your insurance sales agents - [Scenarios](https://futureagi.mintlify.app/product/simulation/scenarios.md): Scenarios define the test cases, customer profiles, and conversation flows that your AI agent will encounter during simulations. - [Generate Synthetic Data](https://futureagi.mintlify.app/quickstart/generate-synthetic-data.md): Synthetic data generation allows you to create realistic, structured datasets without using real-world data. This powerful feature helps you - [Running Evals in Simulation](https://futureagi.mintlify.app/quickstart/running-evals-in-simulation.md) - [Setup MCP Server](https://futureagi.mintlify.app/quickstart/setup-mcp-server.md) - [Setup Observability](https://futureagi.mintlify.app/quickstart/setup-observability.md) - [Release Notes](https://futureagi.mintlify.app/release-notes.md) - [Datasets](https://futureagi.mintlify.app/sdk-reference/datasets.md): Reference for the Dataset class in the Future AGI Python SDK. - [Evaluator (Cloud API)](https://futureagi.mintlify.app/sdk-reference/evals.md): Using the Future AGI Python SDK Evaluator class for cloud-based evaluations with Turing models. - [AI Evaluation SDK](https://futureagi.mintlify.app/sdk-reference/evaluate.md): Run 72+ evaluations locally, use LLM judges with custom criteria, and evaluate images and audio, all through a single function.
- [KnowledgeBase](https://futureagi.mintlify.app/sdk-reference/knowledgebase.md): Reference for the KnowledgeBase class in the Future AGI Python SDK. - [Protect](https://futureagi.mintlify.app/sdk-reference/protect.md): Reference for the Protect class in the Future AGI Python SDK. - [Installation](https://futureagi.mintlify.app/sdk-reference/python-sdk-client.md): Installation of the Future AGI Python SDK and Tracing Libraries - [Test Case](https://futureagi.mintlify.app/sdk-reference/testcase.md): Reference for the Test Case classes in the Future AGI Python SDK. - [Tracing](https://futureagi.mintlify.app/sdk-reference/tracing.md): Reference for tracing and telemetry in the Trace AI Python SDK.

## OpenAPI Specs

- [openapi](https://futureagi.mintlify.app/openapi.json)