Computer programs that can act autonomously in an environment to achieve a user-specified goal are known as AI agents. Such agents can be built on large language models (LLMs), which interpret language, apply reasoning and generate output. LLM-based agents are driven by prompts that steer the model towards the user’s goals and requirements. By observing how the LLM interacts with prompts and data, a user can gain insight into its strengths and weaknesses. AgentOps is one tool for observing and examining AI agents, and it supports building generative AI agents in different contexts.
This article explores AgentOps and how it can be used for monitoring, debugging and cost-tracking AI agents.
Table of Contents
- AI Agent Monitoring and its Importance
- Understanding AgentOps and its Features
- Hands-on Implementation of AgentOps
AI Agent Monitoring and its Importance
Monitoring an AI agent involves overseeing its performance and behaviour to ensure it operates effectively. AI agents that are not observed and monitored properly can malfunction, produce inaccurate results or generate biased outputs. Further challenges include explainability, regulatory compliance and secure operation. These challenges can be addressed by building a proper observation and monitoring approach into the development of LLM-based AI agents.
Agents can be observed and examined using a wide array of techniques:
Logging and Metrics – This covers tracking the agent’s activity and logging its execution phases, outputs, resource usage and costs. It helps in understanding how the agent operates overall and how it can be improved.
Visualisation Tools – Monitoring data can be visualised through dashboards and reports providing a clear idea about the agent’s performance over time.
Replay Analytics – Replaying user interactions, especially in LLM-based agents, reveals usability issues and explains the agent’s behaviour.
Specialised Toolkits – Several toolkits offer extensive monitoring and evaluation of AI agents, such as AgentOps, LangFuse and Phoenix.
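The logging and metrics technique listed above can be sketched with nothing but the Python standard library. This is a minimal illustration, not how any particular toolkit works: the `monitored_phase` helper, the `metrics` list and the phase names are all hypothetical names chosen for this example.

```python
import logging
import time
from contextlib import contextmanager

logging.basicConfig(level=logging.INFO, format="%(levelname)s %(message)s")
logger = logging.getLogger("agent_monitor")

# Collected metrics: one dict per execution phase (hypothetical structure).
metrics = []

@contextmanager
def monitored_phase(name):
    """Log the start and end of a phase and record its duration and status."""
    logger.info("phase started: %s", name)
    start = time.perf_counter()
    record = {"phase": name, "status": "ok"}
    try:
        yield record  # the caller may attach extra fields, e.g. output size
    except Exception:
        record["status"] = "error"
        logger.exception("phase failed: %s", name)
        raise
    finally:
        record["seconds"] = time.perf_counter() - start
        metrics.append(record)
        logger.info("phase finished: %s (%.4fs)", name, record["seconds"])

# Usage: wrap each step of a stand-in agent run.
with monitored_phase("plan") as rec:
    rec["output_chars"] = len("search the web, then summarise")

with monitored_phase("answer") as rec:
    rec["output_chars"] = len("A short answer.")
```

Each entry in `metrics` can later be fed into a dashboard or report, which is the role the visualisation tools above play.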
Understanding AgentOps and its Features
AgentOps is a platform for building reliable AI agents through monitoring, testing and replay analytics. The AgentOps dashboard examines AI agents through sessions. A session covers a single execution of an LLM agent workflow and groups all of its agents, LLMs, agent actions and responses in one container for easy management. Session attributes consist of ID, Project ID, Timestamps and Tags.
AgentOps provides features such as session drill-down and session overview, which list the recorded sessions together with their key details and meta-analysis.
AgentOps utilises the concept of events, which record agent executions involving LLMs, tools and agentic actions. An event consists of attributes such as ID, Session ID, Agent ID, Parameters, Timestamps, LLM Model, Prompt Messages, Tokens, Cost, Thread ID, and Logs.
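To make the relationship between sessions and events concrete, here is a small data-model sketch. It is an assumption-laden illustration and not the AgentOps schema: the `Session` and `Event` classes, their field names and the token and cost figures are invented for this example, and only a subset of the attributes named above is modelled.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from uuid import uuid4

def _now():
    return datetime.now(timezone.utc)

@dataclass
class Event:
    """One recorded agent execution (hypothetical subset of attributes)."""
    session_id: str
    agent_id: str
    model: str
    prompt_tokens: int
    completion_tokens: int
    cost: float
    id: str = field(default_factory=lambda: str(uuid4()))
    timestamp: datetime = field(default_factory=_now)

@dataclass
class Session:
    """Container for every event of a single workflow execution."""
    project_id: str
    tags: list = field(default_factory=list)
    id: str = field(default_factory=lambda: str(uuid4()))
    started_at: datetime = field(default_factory=_now)
    events: list = field(default_factory=list)

    def record(self, **attrs):
        # Every event carries the ID of the session it belongs to.
        event = Event(session_id=self.id, **attrs)
        self.events.append(event)
        return event

    def total_tokens(self):
        return sum(e.prompt_tokens + e.completion_tokens for e in self.events)

    def total_cost(self):
        return sum(e.cost for e in self.events)

# Usage with made-up numbers, for illustration only.
session = Session(project_id="demo-project", tags=["tutorial"])
session.record(agent_id="assistant", model="gpt-3.5-turbo",
               prompt_tokens=30, completion_tokens=120, cost=0.0002)
session.record(agent_id="assistant", model="gpt-3.5-turbo",
               prompt_tokens=20, completion_tokens=150, cost=0.0003)
print(session.total_tokens(), round(session.total_cost(), 6))
```

Keeping the session ID on each event is what allows a dashboard to group events back into sessions and aggregate tokens and cost per run.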
AgentOps can perform LLM call tracking, agent tracking and event recording using its SDK decorators, @track_agent() and @record_function. The AgentOps SDK detects OpenAI, LiteLLM and Cohere when they are installed as modules and starts tracking their usage. AgentOps also integrates with the AutoGen and CrewAI frameworks for building multi-agent applications.
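The idea behind a recording decorator can be shown with a plain-Python stand-in. The sketch below is not the AgentOps implementation and does not call the SDK: `record_call` and `recorded_events` are hypothetical names, and the recorded fields are an assumption about what such a decorator might capture.

```python
import functools
import time

# In-memory store standing in for a monitoring backend (hypothetical).
recorded_events = []

def record_call(event_name):
    """Decorator factory: record parameters, result and duration of each call."""
    def decorator(func):
        @functools.wraps(func)  # keep the wrapped function's name and docstring
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            result = func(*args, **kwargs)
            recorded_events.append({
                "event": event_name,
                "params": {"args": args, "kwargs": kwargs},
                "returns": result,
                "seconds": time.perf_counter() - start,
            })
            return result
        return wrapper
    return decorator

@record_call("word_count")
def count_words(text):
    return len(text.split())

count_words("monitoring makes agents explainable")
print(recorded_events[0]["event"], recorded_events[0]["returns"])
```

The decorated function behaves exactly as before; the recording happens as a side effect, which is why decorators are a convenient way to add tracking to existing agent code.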
Hands-on Implementation of AgentOps
Step 1 – Visit https://www.agentops.ai/ and create an account.
Step 2 – Visit the settings and create a new project, which will also automatically generate an API key (you can also use the default project).
Step 3 – With the API key generated, switch to a development environment for the coding steps (the code below assumes Google Colab). Install the OpenAI, Cohere, and AgentOps Python libraries.
!pip install agentops openai cohere
Step 4 – Import the installed libraries
from openai import OpenAI
import cohere
from google.colab import userdata
import agentops
import os
Step 5 – Define the API keys for OpenAI, Cohere and AgentOps. Initialise AgentOps using the agentops.init() function to start tracking agent operations.
OPENAI_API_KEY = userdata.get("OPENAI_API_KEY")
AGENTOPS_API_KEY = userdata.get("AGENTOPS_API_KEY")
COHERE_API_KEY = userdata.get("COHERE_API_KEY")
openai = OpenAI(api_key = OPENAI_API_KEY)
agentops.init(AGENTOPS_API_KEY)
co = cohere.Client(COHERE_API_KEY)
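The userdata.get() calls above only work inside Google Colab. Outside Colab, one common alternative is to read the keys from environment variables. The sketch below assumes the keys were exported under the same names used above; `load_api_key` is a hypothetical helper, not part of any of the three SDKs.

```python
import os

def load_api_key(name):
    """Return the key stored in environment variable `name`, or raise if absent."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value

# Report which keys are available instead of silently passing None to a client.
for key_name in ("OPENAI_API_KEY", "COHERE_API_KEY", "AGENTOPS_API_KEY"):
    try:
        load_api_key(key_name)
        print(f"{key_name}: found")
    except RuntimeError as err:
        print(err)
```

With this helper, a line such as AGENTOPS_API_KEY = load_api_key("AGENTOPS_API_KEY") would replace the corresponding userdata.get() call.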
Step 6 – Implement a simple agent using OpenAI’s chat completion and Cohere’s chat stream event, and track it with AgentOps.
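# Prompt sent to the OpenAI chat model
message = [
    {"role": "system", "content": "you are an assistant"},
    {"role": "user", "content": "What is the main premise of the Game of Thrones series?"}
]
# stream = True returns an iterator of chunks rather than a finished response
response = openai.chat.completions.create(
    model = "gpt-3.5-turbo", messages = message, temperature = 0.1, stream = True
)
# Read the stream so the completion text is actually collected
openai_response = ""
for chunk in response:
    if chunk.choices:
        openai_response += chunk.choices[0].delta.content or ""
# Ask Cohere the same question, using its web-search connector
stream = co.chat_stream(
    message = "What is the main premise of the Game of Thrones series?",
    connectors = [{"id": "web-search"}]
)
# Collect Cohere's text under its own name so it does not overwrite the OpenAI result
cohere_response = ""
for event in stream:
    if event.event_type == "text-generation":
        cohere_response += event.text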
Step 7 – Once the agent finishes the text completion/generation based on OpenAI and Cohere, end the AgentOps session to finish the tracking process.
agentops.end_session("Success")
Output
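Step 7 always reports "Success", even if one of the LLM calls raised an exception earlier. A small wrapper can report the outcome that actually occurred. This is a sketch under one assumption that should be checked against the SDK version in use: that end_session accepts "Fail" as an end state alongside "Success". `run_tracked` is a hypothetical helper, and the session-ending function is passed in so the example runs without the SDK installed.

```python
def run_tracked(task, end_session):
    """Run `task`, then report "Success" or "Fail" through `end_session`."""
    try:
        result = task()
    except Exception:
        end_session("Fail")  # assumed end state; verify against your SDK version
        raise
    end_session("Success")
    return result

# Stand-in for agentops.end_session: collect the reported states in a list.
reported = []
run_tracked(lambda: "done", reported.append)
print(reported)
```

In the notebook above, the agent code from Step 6 would go into a function passed as `task`, and agentops.end_session would be passed as `end_session`.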
Step 8 – Visit AgentOps’s WebUI using the link shown in the previous output to check the session drill-down and summary features. These features help the user examine AI agents comprehensively.
Session Drill-Down and Summary Output 1 – It showcases the key metrics.
Session Drill-Down and Summary Output 2 – It showcases the graphs representing event-based metrics and chat viewer.
Session Drill-Down and Summary Output 3 – Check the session replay for a step-by-step examination of the agent’s execution.
The session drill-down menu showcases all the necessary artefacts required to monitor, examine and test our agent based on different criteria such as cost, token usage, and timestamps along with a complete trace of agent actions.
Step 9 – You can implement a Pivot Table using the Pivot Table option in the left panel to cross-tabulate, examine, and compare multiple metrics.
In the Pivot Table, we can see the total number of sessions and events, the number of prompt_tokens and completion_tokens, and the total cost incurred, all of which help in examining AI agents.
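What a pivot table computes can be reproduced on a small scale in plain Python. The sketch below is not tied to the AgentOps dashboard or any export format: the `pivot` helper is hypothetical, the event records are made-up sample data, and "cohere-chat" is a placeholder label rather than a real model name.

```python
from collections import defaultdict

# Made-up sample events, for illustration only.
events = [
    {"model": "gpt-3.5-turbo", "prompt_tokens": 30, "completion_tokens": 120},
    {"model": "gpt-3.5-turbo", "prompt_tokens": 25, "completion_tokens": 80},
    {"model": "cohere-chat", "prompt_tokens": 20, "completion_tokens": 150},
]

def pivot(rows, index, values):
    """Group rows by `index`, counting events and summing each column in `values`."""
    table = defaultdict(lambda: {"events": 0, **{v: 0 for v in values}})
    for row in rows:
        cell = table[row[index]]
        cell["events"] += 1
        for v in values:
            cell[v] += row[v]
    return dict(table)

summary = pivot(events, "model", ["prompt_tokens", "completion_tokens"])
for model, totals in summary.items():
    print(model, totals)
```

Grouping by a different key, such as a session ID or a tag, gives the other cross-tabulations that a pivot view offers.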
Final Words
AgentOps is a useful tool for comprehensively monitoring, tracking and checking an AI agent’s performance, which helps in understanding and optimising the agent’s actions and behaviour. The Session Drill-Down and Pivot Table features provide a detailed view of the agent’s execution events, thereby making the entire operation explainable.