Week of 2025-11-17
Features
- Detailed Voice Provider Logs: Full conversation-level logs from voice providers are now surfaced for every simulation and call, offering deeper visibility for debugging and performance analysis.
Bugs/Improvements
- New TTS Model Integrations for Run Prompt and Experiments: Added support for Cartesia, Hume, Neuphonics, and LMNT TTS models, expanding the range of available voices and synthesis characteristics.
- Enhanced Simulation Behaviors and Realism: Simulation output now features more natural persona logic, frustration modeling, improved background noise handling, and smoother conversational transitions for more realistic interactions.
Week of 2025-11-14
Features
- Logs, Latency Metrics, and Cost Breakdown in Simulation Calls: Simulation calls now display detailed conversation logs as well as latency and cost breakdowns across TTS, LLM, and STT components. These insights improve transparency and observability for voice agent performance.
- Run Prompt and Experiment Revamp: The Run Prompt and Experiment interfaces now provide contextual provider selection. Providers are grouped by goal—LLM, TTS, or STT—eliminating the need to scroll through unstructured lists.
- Expanded Evaluation Attributes in Voice Observability: Voice agent evaluations now support additional variable mappings, including prompts, scenario descriptions, and other key attributes for more comprehensive and accurate assessments.
Week of 2025-11-12
Features
- Credit Usage Summary: The Usage Summary experience has been fully redesigned to provide detailed visibility into workspace-level activity. All API call logs across Traces, Observe, Simulation, and Error Analysis now include workspace attribution. A new cumulative usage API provides long-term consumption insights with improved cost and count tracking for financial clarity.
- New Agent Definition UX with Multi-Step Flow: The Agent Definition workflow has been rebuilt into a guided three-step setup—Basic Information, Configuration, and Behaviour. The updated layout improves discoverability, adds a contextual resource panel, and introduces row-level table actions.
- Prompt Workbench Revamp: The Workbench UI has been redesigned to simplify prompt version management and improve collaboration. Prompt versions now follow a commit-based history model, making it easier to review, compare, and maintain consistency across experiments.
- Multi-Language Support in Agent Definition: Agent Definitions now support multilingual configurations directly within agent settings, enabling structured and version-controlled management of multi-language agents.
- Add Columns to Scenarios via AI and Manual Inputs: Scenario creation now supports adding new metadata columns using AI suggestions or manual entry. Duplicate detection, required-field validation, and retrospective schema updates ensure consistency and extensibility.
Bugs/Improvements
- Enhanced Language and Accent Support in Simulation: Simulation now supports a broader range of languages and accents for more comprehensive international testing.
- Simulate Metrics Revamp: Metrics have been refined for improved clarity, accuracy, and alignment with agent versioning, resulting in more reliable evaluation outcomes.
- Dataset Audio Upload Stability Improvements: Audio upload handling has been strengthened with better error handling and extended processing for long or high-quality files.
- Enable User Details on Sessions and User Tab: User metadata—such as email, phone number, and custom identifiers—can now be shown or hidden in Sessions and User pages for deeper segmentation.
- Sorting Persistence on User Tab: Sorting preferences on the User tab now persist across navigation for a more consistent browsing experience.
- DateTime Format Compatibility Fix: Date parsing now supports ISO, RFC, and multiple locale-based date formats, preventing ingestion errors and ensuring consistent processing.
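To illustrate the formats now accepted, here is a minimal Python sketch (using python-dateutil as an assumed stand-in parser; the sample timestamps are hypothetical):

```python
from dateutil import parser

# Hypothetical sample timestamps covering the three families mentioned above.
samples = [
    "2025-11-12T09:30:00Z",            # ISO 8601
    "Wed, 12 Nov 2025 09:30:00 GMT",   # RFC 1123 / RFC 2822 style
    "11/12/2025 09:30 AM",             # US locale style
]

for raw in samples:
    print(raw, "->", parser.parse(raw).isoformat())
```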
Week of 2025-11-04
What’s New
Features
- Retell Integration for Agent Simulation: Retell is now supported as a provider for agent definitions and voice observability in Simulate. Users can monitor and observe their agents directly through Retell, enabling enhanced voice-based insights and analytics.
- Tool Evaluation in Simulate: Users can now evaluate the tools they used when building their agents within Simulate, enabling better insights into tool performance.
- Added Provider Transcript as an Evaluation Attribute: Users can now send the entire transcript as part of their evaluations when running Observe projects, enabling more comprehensive analysis and insights during evaluation.
Bugs/Improvements
- Session History Enhancements: The Session History experience has been improved for better usability, featuring smoother navigation within chats, an enhanced layout, and the ability to move between sessions using Next and Previous buttons.
- Edit Persona Language Update: Resolved an issue where selected languages were not updating correctly when editing a persona, ensuring changes are properly saved.
- Language and Transcript Enhancements: Improved support for Indian languages by addressing the lack of proper accents, and enhanced the Simulate transcript experience for better readability, clarity, and overall usability during scenario analysis and evaluation.
Week of 2025-10-30
What’s New
Features
- Added Voice Output Support in Run Prompt and Run Experiment: Users can now select Audio as an output type in both Run Prompt and Run Experiment workflows. This enhancement allows prompts and experiments to generate voice-based outputs, improving the ability to test and experience spoken responses directly within the platform.
- Pre-built and Custom Persona Feature in Simulate: Users can now define customer personas in Simulate, providing greater control over the persona profiles generated in scenarios. This feature allows users to choose from multiple pre-built personas or create custom personas tailored to their needs. Additionally, personas can be edited after a scenario is generated, offering enhanced flexibility and realism in scenario simulation.
- Enhanced User Onboarding Flow: A redesigned onboarding experience is now available, allowing users to provide their role, define goals, and invite team members to their organization during setup.
- Updated Pricing Calculation in Observe: The pricing mechanism in Observe has been updated to calculate costs during trace ingestion rather than at API runtime. This improvement enables faster retrieval of cost-related metrics, enhancing performance and responsiveness when analyzing traces.
Bugs/Improvements
- Enhancements in Simulate: Improved the Simulate experience with several enhancements, including better persona understanding in transcripts and messages, updated time tracking for each conversation turn, and the ability to enable evaluations for the entire transcript, allowing for more comprehensive scenario assessments.
Week of 2025-10-27
What’s New
Features
- Add Rows in Simulate Scenarios: Scenario tables can now be expanded with maximum flexibility. Rows can be added manually for precision control, generated intelligently using AI for rapid test case creation, or imported directly from existing datasets to leverage historical data. This enhancement streamlines scenario building and dramatically reduces setup time for complex simulations.
- Run Evaluations for Completed Test Runs: New evaluations can now be executed on already completed test runs without rerunning entire simulations, delivering significant time and cost savings. Users can select desired test runs via checkboxes, click Run Evals, and choose specific evaluations to execute. This targeted approach enables efficient resource utilization, faster iteration on evaluation metrics, and flexible experimentation with different criteria.
- Agent Definition Version Selection: Specific Agent Definition Versions can now be selected when creating new test runs and directly from the test run details page. This enhancement provides greater control over testing workflows and ensures reproducibility across experiments, making version comparison seamless and reliable.
Bugs/Improvements
- Enhanced Evaluation Variable Handling in SDK: Evaluation input variables in the Future AGI SDK can now be easily copied and pasted across all evaluations, eliminating the error-prone manual typing process. This improvement reduces manual errors, accelerates variable mapping, and makes evaluation setup more reliable and efficient.
- Agent Version Selection & Scrolling Fixes: Resolved critical issues where incorrect agent definition versions were being selected during test run creation. Additionally, fixed infinite scrolling problems in the Agent Definition Version list, ensuring smooth selection and consistent loading of all versions for a more stable navigation experience.
Week of 2025-10-14
What’s New
Features
- Voice Observability Through Vapi Integration: Voice interactions are now fully observable within the platform. Assistant call logs from Vapi, including voice simulations, are automatically captured and displayed in your Observe project alongside other project data, enabling comprehensive monitoring and analysis of voice-based interactions.
- Eval Groups in Experiment and Optimization: Evaluation groups can now be configured, created, and applied directly within Experiment and Optimization workflows. This integrated approach reduces workflow friction and accelerates the evaluation setup process.
Bugs/Improvements
- Media Visualization in Eval Playground: Media columns now render actual image and audio content instead of raw URL strings, providing complete context and improved clarity in evaluation results.
- Accelerated Learning & Improved Accessibility: Implemented a View Docs button across all major modules to streamline access to relevant documentation. Additionally, specific documentation links have been added directly to individual Evals, enabling quicker understanding and more efficient usage.
- Contextual Flow Analysis Display: The interface has been streamlined by removing flow analysis views from dataset-based scenarios where they are not applicable, resulting in a cleaner and more intuitive user experience.
- Unsaved Changes Protection in Scenario Builder: Added a modal to alert users of unsaved changes when editing scenario graphs, allowing them to save or discard their work before navigating away.
Week of 2025-10-09
What’s New
Features
- Simulate via SDK: You can now simulate realistic, ultra-low-latency customer calls against your deployed LiveKit agents directly through the SDK. This update enables fully local testing without external dependencies, automatically records high-fidelity WAVs and transcripts over the WebRTC stream, and integrates with AI Evaluation for end-to-end performance evaluation. Developers gain full ownership and flexibility—with self-hosted control, customizable ASR, TTS, and model configurations—while cutting simulation costs by roughly 60–70%.
- Selective Test Rerun in Simulate: Users now have precise control over simulation testing with the ability to rerun individual calls. You can choose to rerun the complete call with evaluations or re-execute evaluations independently, enabling targeted debugging and validation without requiring full test restarts.
Week of 2025-10-02
What’s New
Bugs/Improvements
- Evaluation Group Management: Users can now configure and create evaluation groups directly from datasets and Simulate, streamlining evaluation setup and saving time.
- Default Evals Groups: Access preconfigured evaluation groups for common use cases such as RAG and computer vision, saving time in evaluation setup.
- Advanced Simulation Management: Test executions now auto-refresh with real-time data, giving users instant visibility into ongoing runs. Users can stop simulations at any point to prevent unnecessary calls and costs. Enhanced features include Visual Workflow Tracing to pinpoint agent deviations, Real-Time Test Control to efficiently manage test execution, and Comprehensive Performance Metrics (latency, interruption response time, etc.) for precise agent evaluation and optimization.
Week of 2025-09-27
What’s New
Features
- Agent Definition Versioning Upgrades: Managing agent definitions is now faster, simpler, and more organized. Instead of manually copy-pasting and creating new definitions each time, you can instantly create new versions with meaningful commit messages. All test reports are consolidated in one place, making it easy to access and compare logs across versions. With one-click versioning and unified test history, iteration cycles are now much faster—allowing you to update and test new agent configurations in seconds, not minutes.
- Automated Scenario & Workflow Builder: Creating scenarios with synthetic data or uploaded datasets was useful, but it often lacked clarity in visualizing agent interactions. With the new Future AGI Scenario & Workflow Builder, you can simply upload SOPs or conversation transcripts and let the AI automatically generate comprehensive test scenarios—including edge cases that humans might miss. Each run now provides a clear, visual map of the exact conversation paths traversed by your agent, while the interactive workflow builder makes it easy to design, edit, and optimize flows. This enhanced experience delivers deeper insights, targeted edge case discovery, and a more intuitive way to implement and evaluate agent behavior.
- Simplified User Session Tracking: Session management is now effortless. Instead of shutting down the trace provider and re-registering everything, you can simply add a session.id attribute to your spans. This makes it easy to group data into multiple sessions, enabling granular, user-level insights into your application’s performance and behavior (a minimal span-attribute sketch follows this week’s list).
- Direct Trace-to-Prompt Linking: Introduced seamless linking of traces to prompts by leveraging the code snippet on the Prompt Workbench Metrics screen.
- Enhanced Transcript Clarity: Updated transcript terminology so users can easily distinguish between messages from the Agent and responses from the FAGI Simulator, improving readability and context during review.
- Workspace Switching Loader Fix: Fixed the loader behavior during workspace switching, ensuring a smoother transition.
- Large Dataset Upload Stability: Improved dataset upload experience by resolving loading issues for large CSV/JSON files, enhancing stability and user visibility.
- Custom Evaluation Editing Fixes: Resolved bugs in the Evals Playground to ensure smoother and more reliable editing of custom evaluations.
- Group Evaluation UI/UX Improvements: Refined the user interface and experience when editing group evaluations, making the process more intuitive and consistent.
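A minimal sketch of the session tracking pattern described above, using the standard OpenTelemetry Python API; the span name and session value are illustrative placeholders:

```python
from opentelemetry import trace

tracer = trace.get_tracer(__name__)

# "session.id" is the attribute named in the release note; the value is a
# placeholder for whatever identifies the current user session.
with tracer.start_as_current_span("handle_user_message") as span:
    span.set_attribute("session.id", "session-1234")
    # ... application / LLM call logic runs inside this span ...
```

With the attribute set on your spans, trace data can be grouped per session without shutting down and re-registering the trace provider.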
Week of 2025-09-22
What’s New
Features
- Advanced Evaluation Group Management: Streamline your evaluation workflows with comprehensive CRUD operations for evaluation groups. Create, view, edit, and delete evaluation groups seamlessly, then apply them directly to tasks and prompts for consistent scoring across your AI applications. Enhanced with intelligent popovers that display eval input details, LLM/Knowledge Base dependencies, and linked evaluations during the grouping process.
- Enhanced Call Management & Audio Controls: Manage your voice AI testing with the completely revamped Call Details Drawer that displays associated scenarios for each test run. Features a sophisticated multi-channel audio player for separate visualization and playback of assistant and customer audio streams.
- Flexible Call Recording Downloads: Export call recordings in multiple formats (Caller Audio, Agent Audio, Mono Audio, Stereo Audio) to match your analysis workflow requirements. Coupled with granular audio field selection in evaluations for precise control over which conversation segments to score and analyze (a channel-splitting sketch follows this list).
- Enhanced Collaboration Features: Boost team productivity with collaborator support in prompts, allowing you to add and view team members working on specific prompts. Track prompt ownership with visible Created By fields and organize your work more efficiently with sorting capabilities for sample folders, prompts, and prompt templates.
- Annotation & Prompt Import Fixes in Dataset: Enhanced annotation workflows by preventing empty label view selections and resolving prompt overflow issues in Run Experiment interfaces.
- Filter Issues for Evals Selection: Bug fix for eval type filters on evaluations drawer across the platform.
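To illustrate the recording formats above, here is a minimal standard-library sketch that splits a stereo export into caller and agent mono tracks; it assumes, purely for illustration, that the caller is on the left channel and the agent on the right, and the file names are hypothetical:

```python
import wave

# Hypothetical downloaded stereo recording: left channel = caller, right = agent.
with wave.open("call_stereo.wav", "rb") as src:
    params = src.getparams()
    frames = src.readframes(params.nframes)

width = params.sampwidth                          # bytes per sample (2 for 16-bit PCM)
frame_size = width * params.nchannels

caller, agent = bytearray(), bytearray()
for i in range(0, len(frames), frame_size):
    caller += frames[i : i + width]               # left-channel sample
    agent += frames[i + width : i + frame_size]   # right-channel sample

for name, data in (("caller_audio.wav", caller), ("agent_audio.wav", agent)):
    with wave.open(name, "wb") as out:
        out.setnchannels(1)
        out.setsampwidth(width)
        out.setframerate(params.framerate)
        out.writeframes(bytes(data))
```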
Week of 2025-09-08
What’s New
Features
- Intelligent Prompt Organization System: Transform your prompt management with our new folder-based architecture. Organize prompts and templates in a hierarchical structure, create reusable templates from existing prompts, and maintain consistency across your AI workflows. Templates function as fully-featured prompts while eliminating repetitive configuration tasks.
- Enhanced Voice Agent Testing & Analytics: View comprehensive performance metrics of your voice agent test runs in an intuitive dashboard, including Top Performing Scenarios and conversation quality insights. The expanded simulate feature now includes additional scenario columns with grouping capabilities, customizable column visibility, and advanced filtering options—enabling you to optimize your voice AI implementations and focus on the most relevant data for your testing workflows.
- Enhanced Plans & Pricing Experience: Navigate pricing options effortlessly with our completely redesigned pricing page featuring interactive plan comparison cards, a dynamic price calculator, and detailed plan breakdowns. The new design provides clear visibility into feature tiers and helps you make informed decisions about your subscription.
- Enhanced Observability & Dashboard Accuracy: Resolved filtering issues for User ID across User Details Dashboard and Observe sections. Improved project selector clarity in Observe Eval Task Drawer and fixed workspace-level OTEL trace creation issues for more reliable monitoring.
- UI/UX Enhancements: Streamlined simulation flow interfaces for better user experience and standardized decimal precision across the platform (displaying 2 decimal places for all numeric values).
- Enhanced Data Visibility in Dataset Summary: Understand exactly how many data points contributed to your summary results and evaluation metrics, helping with complete transparency.
- Code Snippet for Running Evals via SDK: Copy-paste ready terminal commands to run any evaluation without manual configuration by leveraging code snippet on the evals playground.
- Unified Design System: Experience consistent interactions across the platform with our custom DatePicker component, ensuring a polished and cohesive user experience throughout your workflow.
Week of 2025-09-05
What’s New
Features
- Comprehensive Annotation Quality Dashboard: Monitor annotation quality at scale with our centralized analytics dashboard. Track key metrics including annotator agreement rates, completion times, and advanced quality scores (cosine similarity, Pearson correlation, Fleiss’ kappa) to ensure your training data meets the highest standards (a metrics sketch follows this week’s list).
- Enterprise-Grade Multi-Workspace Security: Deploy with confidence using our complete RBAC framework. Create isolated workspaces, manage team members with full CRUD capabilities (edit, deactivate, resend invitations), and implement role-based access controls that scale with your organization’s security requirements.
- Advanced Observability with Feed Insights: Gain unprecedented visibility into agent performance with the new Feed Insights tab in the Observe section. Identify failed stages, affected spans, view error cluster events, track user counts, and analyze trend data over time for rapid issue diagnosis and agent optimization.
- Intelligent Onboarding Navigation: Experience streamlined onboarding with our redesigned sidebar that prominently highlights the ‘Get Started’ section until all 7 onboarding steps are completed. This ensures new users follow a structured path to success before transitioning to the regular navigation experience.
- No Config Evals – Agent Compass for AI Teams: AI agent developers often struggle to identify performance bottlenecks and system failures across complex execution flows. Traditional evaluation methods and system metrics offer only fragmented, span-level visibility—leaving teams blind to the bigger picture. As a result, diagnosing latency spikes, inefficient prompts, or tool-call failures becomes a time-consuming, manual process. Without actionable, trace-level insights, performance optimization turns reactive, error-prone, and expensive. Agent Compass addresses this with no-configuration, trace-level evaluations that surface these insights automatically.
- Improved Observability Reliability: Enhanced backend resilience for incomplete span creation scenarios and fixed issues when OpenTelemetry exports fail partially, ensuring complete trace visibility.
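The quality scores named above are standard statistics; a minimal sketch of how they can be computed over hypothetical annotator scores (this is not the platform's implementation):

```python
import numpy as np
from scipy.stats import pearsonr
from statsmodels.stats.inter_rater import aggregate_raters, fleiss_kappa

# Hypothetical continuous scores from two annotators on the same five items.
a = np.array([0.9, 0.4, 0.7, 0.2, 0.8])
b = np.array([0.8, 0.5, 0.6, 0.3, 0.9])

cosine = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
pearson, _ = pearsonr(a, b)

# Fleiss' kappa works on categorical labels: rows = items, columns = raters.
labels = np.array([[1, 1], [0, 0], [1, 0], [0, 0], [1, 1]])
table, _ = aggregate_raters(labels)
kappa = fleiss_kappa(table)

print(f"cosine={cosine:.3f}  pearson={pearson:.3f}  fleiss_kappa={kappa:.3f}")
```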
Week of 2025-08-29
What’s New
Features
- Add Rows in Evals Tab of Prompt Workbench: Instantly add new rows with variable values in the evaluations screen, allowing you to generate outputs and evaluate without returning to the Prompt Workbench homepage.
- Trace Linked to Prompt Workbench: View comprehensive performance metrics (latency, cost, tokens, evaluation metrics) for each prompt version linked to traces (and spans) across development, staging, and production environments via the Metrics section in Prompt Workbench.
- Critical Issue Detection & Mitigation Advice on Datasets: Get actionable, AI-powered insights with recommendations to improve your agent’s performance and accelerate your path to production.
- Access FAGI from AWS Marketplace: Sign up or sign in to the FAGI platform via AWS Marketplace and leverage AWS contracts and billing to work with FAGI.
- Support for LlamaIndex OTEL Instrumentation in TypeScript: Easily add observability to agents leveraging the LlamaIndex framework with our TypeScript SDK on the FAGI platform.
- Improved UX for Evaluate Pages: Enhanced the Evaluate Page interface for a consistent experience across devices.
- Faster Alert Graph Loading: Reduced load times of alert graphs in the Alerts feature for quicker and smoother performance.
- UI Improvements for Sidebar Navigation: Enhanced sidebar navigation for better usability.
- User Filtering on Navigation: When navigating from the Users List or User Details Page to the LLM Tracing or Sessions Page, the user’s ID is now automatically applied as a filter.
- User Details Filter Persistence: User filters (for traces and sessions) now persist across page refreshes.
- UI Enhancements for Simulator Agent Form: Improved the user interface for the simulator agent form.
- Support for Video in Trace Detail Screen: Added support for viewing videos in the Trace Details screen.
- Fixed Scroll Issue in Agent Description Box (Simulation): Enabled scroll functionality via mouse in the agent description box within the simulation module.
- Error Handling on Simulation Page: Improved error handling for low credit balances on the simulation homepage to enhance user experience.
- Credit Utilization for Error Localizer: Added visibility of credit utilization for the error localizer in the usage summary screen.
Week of 2025-08-19
What’s New
Features
- Comparison Summary: Now compare evaluations and prompt summaries of two different datasets with detailed graphs and scores.
- Function Evals: Add and edit function-type custom evals from the list of evals supported by Future AGI (a minimal function-eval sketch follows this week’s list).
- Edit Synthetic Dataset: Edit existing synthetic datasets directly or create a new version from changes.
- Document Column Support in Dataset: New document column type to upload/store files in cells (TXT, DOC, DOCX, PDF).
- User Tab in Dashboard and Observe: Searchable, filterable user list and detailed user view with metrics, interactive charts, synced time filters, and traces/sessions tabs.
- Displaying the Timestamp Column in Trace/Spans: Added Start Time and End Time columns in Observe → LLM Tracing and Prototype → All Runs → Run Details.
- Configure Labels: Configure system and custom labels per prompt version in Prompt Management.
- Async Evals via SDK: Run evaluation asynchronously for long-running evaluations or larger datasets.
- SDK Code Snippets: Updated the SDK code snippets for columns and rows on create dataset, add rows, and the dataset landing page.
- Fixed the Editing Issue in the Custom Evals Form: An incorrect config was displayed on the evals page for function evals.
- Fixed the Trace Detail Drawer Bottom Section: Dragging the bottom section caused the entire bottom area to disappear; this has been corrected.
- Optimized UI screens for different screen sizes.
- Bug fixes for the updated summary screen: color, text, and font alignment.
- Fixed cell loading state issues while creating synthetic data.
- UI enhancements for the simulation agent flow.
- Fixed a CSV upload bug in datasets and UI issues in the add feedback pop-up.
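For the Function Evals item above, here is a minimal sketch of the kind of deterministic check a function-type eval can express; the function name, logic, and the way it is registered on the platform are illustrative assumptions:

```python
def contains_required_disclaimer(output: str) -> float:
    """Hypothetical function-type eval: 1.0 if the response carries a disclaimer, else 0.0."""
    disclaimers = ("this is not financial advice", "consult a professional")
    return 1.0 if any(d in output.lower() for d in disclaimers) else 0.0

# Example usage on a sample model response.
print(contains_required_disclaimer("Sure! (This is not financial advice.)"))  # prints 1.0
```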
Week of 2025-08-11
What’s New
Features
- Summary Screen Revamp (Evaluation and Prompt): Unified visual overview of model performance with pass rates and comparative spider/bar/pie charts; includes compare views, drill-downs, and consistent filters.
- Alerts Revamp: Create alert rules in Observe (+New Alert) from Alerts tab or project; notifications via Slack/Email with guided Alert Type and Configuration steps.
- Upgrades in Prompt SDK: Prompts are now cached after the first run, increasing availability. Seamlessly deploy prompts to production, staging, or dev environments and run A/B tests using the Prompt SDK.
- Fixed Run Prompt issues for longer prompts (over 5K words).
- Bug fixes in voice simulation: transcript naming conventions, deleting runs, and agent simulator selection.
Week of 2025-08-07
What’s New
Features
- Voice Simulation: New testing infrastructure that deploys AI agents to conduct real conversations with your voice systems, analyzing actual audio, not just transcripts.
- Edit Evals Config: You can now edit the config (prompt/criteria) for your custom evals via the evals playground; adding new variables is not supported.
- Bug fix for dynamic column creation via Weaviate.
- Reduced dependencies for TraceAI packages (HTTPS & GRPC).
- Automated eval refinement: Retune your evals in evals playground by providing feedback.
- Markdown now available as a default option for improved readability.
- Support for video (traces and spans) in Observe project.
Week of 2025-07-29
What’s New
Features
- Edit, Duplicate, and Delete Custom Evals: Duplicate, edit, or delete evaluations that are no longer in use or whose logic is outdated.
- Bulk Annotation/User Feedback: Bulk annotate your observe traces with user feedback directly using API or SDK.
- JSON View for Evals Log: Access evals log data in JSON format in evals playground.
- Span name visibility in traces for Observe and Prototype.
- Bug fix for adding owner to workspace.
- Error handling for evaluations in prompt workbench.
- Add variables to system and assistant user roles in prompt workbench.
- Speed enhancement for dataset loading.
- Error state handling for evaluations in prompt workbench.
Week of 2025-07-21
What’s New
Features
- Run button on single cells in the evaluations workbench.
- Users can now add notes to Observe traces.
- Improved search logic to render relevant search results in dataset.
- Dataset bugs and API network call optimizations.
- Fixed audio icon.
- Error handling for network connection issues.
- Bug fixes for prompt workbench versioning issues.
- Changed the color mapping for deterministic type evals.
- Updated loaders for evals playground.
- Pagination fix in Observe.
- Added clear functionality in add to dataset column mapping fields in Observe.
- Graph properties are now cleared when the Observe project changes; fixed the thumbs-down icon not rendering.
- Generate variable bug fix in prompt workbench.
- Fixed the experiment page breaking when switching content tabs.
- Fixed the created_at 30-day filter on evals log section.
Week of 2025-07-14
What’s New
Bugs/Improvements
- Prevented overscroll in the X direction across the entire platform.
- Glitch after refreshing while generating sample data.
- Updated the error message and save button status for doc uploads.
- Variable auto-population issue in compare prompt for multiple versions.
- Restricted function tab to LLM spans only.
- Error handling for mandatory system prompt for a few LLM models.
- Added API null check in all places.
- Streaming issues after run prompt when the current prompt version is updated.
- Truncate model name in model details drawer.
- Fixed a "no rows" error on the dataset homepage for some users on slow connections.
- Easier removal of filters for Observe and Prototype.
- Fixed validation in quick filter number-related fields.
- Fixed inconsistent fonts in evaluation workbench.
- Added loading state to evaluations tab.
- Fixed an issue where the knowledge base name was not visible in some cases.
- Fixed spacing issue in run prompt.
- Updated the link for the workbench help section and adjusted the list width.
Week of 2025-05-05
What’s New
Features
- Diff view in experiment.
- Updated sections for Prototype and Observe.
- Error localization in Observe.
- [Observe+Prototype] Adding annotations flow for trace view details.
- Updated dataset layout and table design.
- Higher rate limits to send more traces in Observe.
- Sorting in alert.
- Support for audio in Observe and datasets.
- Improved error handling in prompt versioning.
- Removed unnecessary keys from evaluation outputs.
- Better handling of required keys to column names in add_evaluation in dataset.
- Removed TraceAI code from FutureAGI SDK - experiment rerun fix.
- SSO login issues.
- Eval ranking fixes.
- Fixed sizing and view issue in dataset when row size is adjusted.
- Fixed sidebar item not showing active style when child page is active globally.
- Fixed the red background shown in the edit field for integer-type values.
- Fixed crashing of page when adding JSON value in dataset.
- Fixed knowledge base status update issue in case of network issues.
- Experiment tab bugs for some browsers and loading state issues on experiment page.
- Bug in run insight section of Prototype.
Week of 2025-04-28
What’s New
Features
- Prototype / All Runs columns dropdown change.
- Prototype / Configure project.
- Trace details view for Observe/Prototype.
- Allow search in dataset.
- Run insights view - evals (deployed without the error modal part).
- Improved user flow for synthetic data creation with “best practices” for each input.
- Add to dataset flow from Prototype.
- API for Gmail account signup.
- Enabling search within data.
- First-time user experience walkthrough for newly onboarded users.
- Quick filters for annotations view in Prototype and Observe.
- Compare runs in Prototype.
- Diff view for compare dataset.
- Enhancement of Observe and Prototype.
- Addition of new evals for audio - conversational and completeness evals.
- New choice for Tone Eval if none of the choices are suitable.
- Fixed a bug on the experiment view.
- Fixed UI/UX bugs in the knowledge base and in audio support for evals.
- Fixed required input field column details not appearing for Audio Quality evals.
- UX improvements to the plan screen loader.
- Changed the color and the percentage of the eval chips in experiment.
Week of 2025-04-21
What’s New
Features
- Quick filters in Prototype & Observe.
- Added support for knowledge base creation and updating.
- Optimization of synthetic data generation.
- Evaluate working in compare datasets.
- Improved the UI shown when a rate limit is hit.
- Audio and knowledge base bug fixes.
- Improved wrong evals view.
- Fixes in compare dataset.
- Changed the logo URL.
- Filter issue fixed in Prototype.
- Rate limit error messages now prompt users to upgrade their plan.
- Optimized experiments under datasets to run faster.
- Improved Hugging Face error handling for different datasets.