llama-agents is a new addition to the llama-index family of frameworks, designed for building, iterating on and deploying multi-agent AI systems efficiently. It employs an async-first architecture that turns individual AI agents into microservices, allowing them to run continuously and process incoming tasks as they arrive. This article explores llama-agents and explains it through a hands-on implementation.
Table of Contents
- Understanding Async-First Architecture
- Understanding the System Layout of llama-agents
- Hands-on Implementation of llama-agents
Understanding Async-First Architecture
llama-agents employs asynchronous communication, which keeps the system efficient when handling multiple tasks simultaneously. Each agent in a multi-agent system works on its assigned portion of a larger problem independently, without waiting for other agents to finish before starting its own operation.
Async-first architecture improves scalability: the system can absorb increased workloads because its components, the agents in this case, are never blocked waiting on one another. Tasks are also completed in parallel, which speeds up overall processing.
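The idea is easiest to see in plain asyncio. The sketch below is an analogy rather than llama-agents code: three hypothetical "agents" run concurrently, so the total wall-clock time is roughly that of the slowest one, not the sum of all three.

import asyncio
import time

async def agent(name: str, seconds: float) -> str:
    # Each "agent" awaits its own work without blocking the others.
    await asyncio.sleep(seconds)  # stand-in for an LLM call or tool execution
    return f"{name} finished after {seconds}s"

async def main() -> None:
    start = time.perf_counter()
    # All three tasks run concurrently on the same event loop.
    results = await asyncio.gather(
        agent("research_agent", 2.0),
        agent("summary_agent", 1.5),
        agent("qa_agent", 1.0),
    )
    print(results, f"elapsed: {time.perf_counter() - start:.1f}s")

asyncio.run(main())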
Understanding the System Layout of llama-agents
Each agent in llama-agents acts as a microservice responsible for specific tasks. This modular structure offers several advantages, such as scalability, maintainability and reusability. llama-agents adopts a distributed, service-oriented architecture, allowing individual AI agents to collaborate on complex tasks.
llama-agents System Layout
The Control Plane is the central hub, responsible for crucial functions such as task management, the agent registry, orchestration and communication management. It receives incoming tasks, breaks them down into smaller subtasks and assigns them to the relevant agents. The control plane also maintains a registry of all agents registered within the system, keeping track of their capabilities and availability.
The Orchestrator is the decision-making engine within the control plane. It determines the sequence of subtasks, the flow of information between agents and the criteria for completing the overall task. llama-agents offers two primary orchestration types: Agentic Orchestration, where an LLM analyzes the task and dynamically decides which agents should be involved and how they should interact; and Explicit Orchestration, where the user defines a predefined workflow outlining the specific sequence of interactions between agents.
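For the explicit style, llama-agents provides a PipelineOrchestrator that wraps a llama-index QueryPipeline. The sketch below follows the pattern shown in the project README; the dummy tool and service names are placeholders, and the API may shift while the framework is young.

from llama_agents import (
    AgentService,
    PipelineOrchestrator,
    ServiceComponent,
    SimpleMessageQueue,
)
from llama_index.core.agent import FunctionCallingAgentWorker
from llama_index.core.query_pipeline import QueryPipeline
from llama_index.core.tools import FunctionTool
from llama_index.llms.openai import OpenAI

def lookup() -> str:
    """A placeholder tool."""
    return "placeholder result"

queue = SimpleMessageQueue()
service = AgentService(
    agent=FunctionCallingAgentWorker.from_tools(
        [FunctionTool.from_defaults(fn=lookup)], llm=OpenAI()
    ).as_agent(),
    message_queue=queue,
    description="A placeholder agent",
    service_name="placeholder_agent",
)

# Wrap the service as a pipeline component; the chain fixes the execution
# order explicitly instead of letting an LLM decide it at runtime.
component = ServiceComponent.from_service_definition(service)
pipeline_orchestrator = PipelineOrchestrator(QueryPipeline(chain=[component]))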
Message Queues act as the communication channels between agents and the control plane. Messages are sent and received asynchronously, allowing agents to work independently without waiting for immediate responses. The message queues enable decoupling and fault tolerance.
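The decoupling is the same idea as a plain producer/consumer queue. The snippet below is only an analogy, not the llama-agents implementation: the producer publishes tasks and moves on, while the consumer drains them at its own pace.

import asyncio

async def producer(queue: asyncio.Queue) -> None:
    for i in range(3):
        await queue.put(f"task-{i}")  # publish and move on; no reply expected
    await queue.put(None)  # sentinel: no more tasks

async def consumer(queue: asyncio.Queue) -> None:
    while (task := await queue.get()) is not None:
        print(f"processing {task}")

async def main() -> None:
    queue: asyncio.Queue = asyncio.Queue()
    await asyncio.gather(producer(queue), consumer(queue))

asyncio.run(main())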
Each agent is a self-contained unit specializing in a specific task and acts as an independent microservice in llama-agents. Agents communicate with each other and with the control plane asynchronously; this design promotes modularity and enhances scalability and reusability.
llama-agents allows external tools to be integrated as services. These tools can handle computationally expensive tasks or specialized functionalities that agents might not support natively. Tool services can be invoked by agents through the control plane, extending the overall capability of the system.
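llama-agents ships a ToolService class for exactly this. A minimal sketch, following the pattern in the project README (the running and step_interval parameters are taken from there; heavy_computation is a hypothetical stand-in for real work):

from llama_agents import SimpleMessageQueue, ToolService
from llama_index.core.tools import FunctionTool

def heavy_computation(x: int) -> int:
    """Stand-in for an expensive or specialized operation."""
    return x * x

message_queue = SimpleMessageQueue()
tool_service = ToolService(
    message_queue=message_queue,
    tools=[FunctionTool.from_defaults(fn=heavy_computation)],
    running=True,       # process queued tool calls continuously
    step_interval=0.5,  # seconds between processing steps
)

Agents can then invoke the tool remotely through the message queue instead of executing it in their own process.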
Hands-on Implementation of llama-agents
Step 1 – Install the required libraries.
!pip install llama-agents llama-index-agent-openai llama-index-embeddings-openai
Step 2 – Import the libraries.
from llama_agents import (
    AgentService,
    ControlPlaneServer,
    SimpleMessageQueue,
    AgentOrchestrator,
)
from llama_index.core.agent import FunctionCallingAgentWorker
from llama_index.core.tools import FunctionTool
from llama_index.llms.openai import OpenAI
import logging
from google.colab import userdata
import os
import nest_asyncio

# Allow nested event loops inside the notebook, since llama-agents is async-first.
nest_asyncio.apply()
Step 3 – Enable logging to see the system's operations in the output.
logging.getLogger("llama_agents").setLevel(logging.INFO)
Step 4 – Set the OPENAI_API_KEY environment variable.
os.environ['OPENAI_API_KEY'] = userdata.get('OPENAI_API_KEY')
Step 5 – Set up the message queue and control plane.
message_queue = SimpleMessageQueue()
control_plane = ControlPlaneServer(
    message_queue=message_queue,
    orchestrator=AgentOrchestrator(llm=OpenAI()),
)
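Note that OpenAI() falls back to the library's default model. If you want the orchestrator pinned to a specific model, the OpenAI class accepts a model argument; the model name below is just an example:

control_plane = ControlPlaneServer(
    message_queue=message_queue,
    orchestrator=AgentOrchestrator(llm=OpenAI(model="gpt-4o")),  # example model name
)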
Step 6 – Create a user-defined tool for the agent, which will later be wrapped as a microservice.
def get_the_syno() -> str:
    """Returns a hard-coded synonym for the term 'Artificial Intelligence'."""
    return "The synonym of the word Artificial Intelligence is: Expert Systems."

tool_1 = FunctionTool.from_defaults(fn=get_the_syno)
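The tool above takes no arguments, but FunctionTool can also infer a parameter schema from type hints and the docstring, which is how the LLM learns to call it. A quick illustration (multiply is a hypothetical extra tool, not used in the rest of this walkthrough):

def multiply(a: int, b: int) -> int:
    """Multiplies two integers and returns the product."""
    return a * b

# Parameter names, type hints and the docstring become the tool's schema.
tool_2 = FunctionTool.from_defaults(fn=multiply)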
Step 7 – Define the agent and create the agent service using the agent, message_queue, description, service_name, host and port parameters.
worker1 = FunctionCallingAgentWorker.from_tools([tool_1], llm=OpenAI())
agent1 = worker1.as_agent()

agent_service_1 = AgentService(
    agent=agent1,
    message_queue=message_queue,
    description="Word Synonym Finder",
    service_name="synonym_finder",
    host="localhost",
    port=8003,
)
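Nothing limits the system to a single service. A second agent could be registered the same way on its own port and added to the launcher list in the next step; the antonym tool below is a hypothetical counterpart to the synonym tool above:

def get_the_anto() -> str:
    """Returns a hard-coded antonym for the word 'artificial'."""
    return "The antonym of the word artificial is: natural."

worker2 = FunctionCallingAgentWorker.from_tools(
    [FunctionTool.from_defaults(fn=get_the_anto)], llm=OpenAI()
)
agent_service_2 = AgentService(
    agent=worker2.as_agent(),
    message_queue=message_queue,
    description="Word Antonym Finder",
    service_name="antonym_finder",
    host="localhost",
    port=8004,  # each service needs its own port
)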
Step 8 – Define a human consumer for handling the published result and launch the service.
from llama_agents import ServerLauncher, CallableMessageConsumer

def handle_result(message) -> None:
    print("Got result:", message.data)

# The human consumer receives messages of type "human", i.e. final results
# published back to the user.
human_consumer = CallableMessageConsumer(
    handler=handle_result, message_type="human"
)

launcher = ServerLauncher(
    [agent_service_1],
    control_plane,
    message_queue,
    additional_consumers=[human_consumer],
)
launcher.launch_servers()
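For quick experiments without standing up HTTP servers, llama-agents also provides a LocalLauncher that runs the whole system in-process for one query at a time (pattern taken from the project README):

from llama_agents import LocalLauncher

local_launcher = LocalLauncher([agent_service_1], control_plane, message_queue)
# Processes a single task end-to-end and returns the final result.
result = local_launcher.launch_single("What is a synonym for Artificial Intelligence?")
print(result)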
Step 9 – Use the real-time monitor to interact with the agent service and submit task queries for agent responses. The command below should be executed in another terminal while the agent service is running.
llama-agents monitor --control-plane-url http://127.0.0.1:8000
Output
The monitoring tool is a point-and-click terminal application that lists the agent services and creates jobs from user queries.
On submitting the task query (creating a new job) – synonym for the word “knowledge”? – the output comes back as “expertise”.
Final Words
llama-agents provides an efficient, scalable and reusable framework for building complex agent-based systems. Its async-first architecture, combined with flexible orchestration options and human-in-the-loop integration, enables users to develop and deploy multi-agent systems with ease. The framework is still in its early stages, but it should keep improving per the roadmap published by the llama-index team.