Observing and Examining AI Agents through AgentOps

Explore how AgentOps monitors, debugs, and tracks costs for LLM-based AI agents in various contexts.

Computer programs that can act autonomously in an environment to achieve a user-specified goal are known as AI agents. These agents can be built on large language models (LLMs), which understand language, apply reasoning and generate text. LLM-based AI agents are queried through prompts that guide the model towards user-specific goals and requirements. By observing how the LLM interacts with prompts and data, a user can gain insight into its strengths and weaknesses. AgentOps is a tool for observing and examining AI agents, supporting generative AI agent development in different contexts.

This article explores AgentOps and how it can be used for monitoring, debugging and cost-tracking AI agents.

Table of Contents

  1. AI Agent Monitoring and its Importance
  2. Understanding AgentOps and its Features
  3. Hands-on Implementation of AgentOps

AI Agent Monitoring and its Importance

Monitoring an AI agent involves overseeing its performance and behaviour to ensure it operates effectively. AI agents that are not observed and monitored properly can malfunction, produce inaccurate results or generate biased outputs. Further challenges include explainability, regulatory compliance and secure operation. These challenges can be addressed by building a proper observation and monitoring approach into the development of LLM-based AI agents.

Observing and examining agents can be accomplished using a wide array of techniques:

Logging and Metrics – Tracking the agent’s activity and logging execution phases, outputs, resource usage and costs (a minimal sketch follows this list). This helps in understanding an agent’s overall operation and how it can be improved.

Visualisation Tools – Monitoring data can be visualised through dashboards and reports, providing a clear picture of the agent’s performance over time.

Replay Analytics – Replaying user interactions, especially in LLM-based agents, reveals usability issues and explains the agent’s behaviour.

Specialised Toolkits – Several toolkits offer exhaustive monitoring and evaluation of AI agents, such as AgentOps, LangFuse and Phoenix.
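To make the logging-and-metrics idea concrete, here is a minimal, framework-agnostic sketch that wraps an LLM call with Python’s standard logging module to record token usage and latency. The call_llm function and the fields it returns are placeholders for illustration, not part of any particular SDK.

import logging
import time

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("agent-monitor")

def monitored_call(call_llm, prompt):
    # call_llm stands in for any LLM client call; it is assumed to
    # return a dict containing the generated text and token counts.
    start = time.time()
    result = call_llm(prompt)
    elapsed = time.time() - start
    logger.info(
        "prompt_tokens=%s completion_tokens=%s latency=%.2fs",
        result.get("prompt_tokens"),
        result.get("completion_tokens"),
        elapsed,
    )
    return result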

Understanding AgentOps and its Features

AgentOps is a comprehensive platform for building reliable AI agents through monitoring, testing and replay analytics. The AgentOps dashboard examines AI agents through sessions. A session encapsulates a single execution of an LLM agent workflow, keeping all the agents, LLMs, agent actions and responses in one container for easy management. Session attributes include ID, Project ID, Timestamps and Tags.

AgentOps provides features such as session drill-down and session overview, which list the recorded sessions, their key details and meta-analysis.

AgentOps utilises the concept of events, which are agent executions based on LLMs, tools and agentic actions. An event consists of attributes such as ID, Session ID, Agent ID, Parameters, Timestamps, LLM Model, Prompt Messages, Tokens, Cost, Thread ID and Logs.
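To make the event structure concrete, the snippet below sketches such a record as a plain Python dictionary. The field names follow the attribute list above, but the values are invented examples and the exact schema AgentOps stores may differ.

# Illustrative only: an event record with the attributes listed above.
example_event = {
    "id": "evt_123",
    "session_id": "sess_456",
    "agent_id": "agent_789",
    "params": {"temperature": 0.1},
    "init_timestamp": "2024-05-01T10:00:00Z",
    "end_timestamp": "2024-05-01T10:00:02Z",
    "llm_model": "gpt-3.5-turbo",
    "prompt_messages": [{"role": "user", "content": "..."}],
    "prompt_tokens": 25,
    "completion_tokens": 120,
    "cost": 0.0002,
    "thread_id": "thread_001",
    "logs": [],
}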

AgentOps can perform LLM call tracking, agent tracking and event recording through its SDK decorators – @track_agent() and @record_function(). The AgentOps SDK detects installed modules such as OpenAI, LiteLLM and Cohere and automatically starts tracking their usage. AgentOps also integrates with the AutoGen and CrewAI frameworks for building multi-agent applications.
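Here is a minimal sketch of how these decorators can be applied, based on the decorator names above; exact signatures and availability vary across AgentOps SDK versions, so treat this as illustrative rather than definitive.

import agentops
from agentops import track_agent, record_function

agentops.init("<AGENTOPS_API_KEY>")  # starts a tracked session

@track_agent(name="qa-agent")  # registers the class as a named agent
class QAAgent:
    @record_function("answer_question")  # records each call as an event
    def answer(self, question: str) -> str:
        # an LLM call here would be tracked automatically
        return "some answer"

QAAgent().answer("What is AgentOps?")
agentops.end_session("Success")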

Hands-on Implementation of AgentOps

Step 1 – Visit https://www.agentops.ai/ and create an account.

AgentOps WebUI

Step 2 – Visit the settings and create a new project, which will also automatically generate an API key (you can also use the default project).

Setting up Project in AgentOps

Step 3 – Generate the API key, then switch to a development environment for further coding. Install the OpenAI, Cohere and AgentOps Python libraries.

!pip install agentops openai cohere

Step 4 – Import the installed libraries.

from openai import OpenAI
import cohere
from google.colab import userdata  # Colab's secrets store for API keys
import agentops

Step 5 – Define the OpenAI, Cohere and AgentOps API keys, then initialise AgentOps using the agentops.init() function to start tracking agent operations.

OPENAI_API_KEY = userdata.get("OPENAI_API_KEY")
AGENTOPS_API_KEY = userdata.get("AGENTOPS_API_KEY")
COHERE_API_KEY = userdata.get("COHERE_API_KEY")

openai = OpenAI(api_key=OPENAI_API_KEY)
agentops.init(AGENTOPS_API_KEY)  # starts a new AgentOps session
co = cohere.Client(COHERE_API_KEY)
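Since sessions carry a Tags attribute (as noted earlier), a session can also be labelled at initialisation to make it easier to filter in the dashboard. The tags argument below reflects the SDK’s init options, though the exact options may vary by version.

# Optional: initialise with tags instead of the plain init above.
agentops.init(AGENTOPS_API_KEY, tags=["colab-demo", "blog-example"])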

Step 6 – Implement a simple agent using OpenAI’s chat completions and Cohere’s chat stream, and track it with AgentOps.

messages = [
    {"role": "system", "content": "You are an assistant."},
    {"role": "user", "content": "What is the main premise of the Game of Thrones series?"},
]

# OpenAI chat completion, consumed chunk by chunk as a stream
openai_stream = openai.chat.completions.create(
    model="gpt-3.5-turbo", messages=messages, temperature=0.1, stream=True
)

openai_response = ""
for chunk in openai_stream:
    if chunk.choices[0].delta.content:
        openai_response += chunk.choices[0].delta.content

# Cohere streaming chat using the web-search connector
stream = co.chat_stream(
    message="What is the main premise of the Game of Thrones series?",
    connectors=[{"id": "web-search"}],
)

cohere_response = ""
for event in stream:
    if event.event_type == "text-generation":
        cohere_response += event.text

Step 7 – Once the agent finishes the text generation with OpenAI and Cohere, end the AgentOps session to complete the tracking process.

agentops.end_session("Success")
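"Success" is one of several end states; the SDK also accepts values such as "Fail" and "Indeterminate" (availability may vary by version). A common pattern, sketched below, is to end the session according to whether the workflow raised an error.

try:
    # ... run the agent workflow ...
    agentops.end_session("Success")
except Exception:
    # Mark the session as failed so the dashboard reflects the error.
    agentops.end_session("Fail")
    raise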

Output

Step 8 – Visit the AgentOps WebUI using the link shown in the previous output to explore the session drill-down and summary features. These help the user examine AI agents comprehensively.

Session Drill-Down and Summary Output 1 – It showcases the key metrics. 

Session Drill-Down and Summary Output 2 – It showcases the graphs representing event-based metrics and chat viewer.

Session Drill-Down and Summary Output 3 – Check the session replay for a step-by-step examination of the agent’s execution.

The session drill-down menu showcases all the artefacts required to monitor, examine and test our agent against different criteria such as cost, token usage and timestamps, along with a complete trace of agent actions.

Step 9 – You can build a Pivot Table using the Pivot Table option in the left panel to cross-tabulate, examine and compare multiple metrics.

In the Pivot Table, we can see the total number of sessions and events, the prompt_tokens and completion_tokens counts, and the total cost incurred.

Final Words

AgentOps is an excellent tool for comprehensively monitoring, tracking and checking an AI agent’s performance, which is beneficial in understanding and optimising the agent’s actions and behaviour. The Session Drill-Down and Pivot Table features provide an exhaustive view of agent execution events, making the entire operation explainable.

References

  1. Link to Code Notebook
  2. AgentOps Documentation
  3. AgentOps GitHub Repo
