Functional tokens are becoming a crucial component in developing enterprise-grade agentic systems. These tokens enable more efficient and accurate planning by transforming how functions are predicted and executed within language models. Today, we will delve into the methodology behind using functional tokens for agentic systems, highlighting the advantages and demonstrating a practical implementation.
Traditional Planning Methods and Their Challenges
Most companies engage in prompt engineering for planning tasks, where they provide the model with a set of functions and request it to generate a plan in a specific JSON format. This approach often includes Chain of Thought prompting to enhance the model’s reasoning capabilities. However, this method faces several challenges:
- Prompt Size and Hallucination: Adding numerous functions makes the prompt excessively large, which can cause the LLM to hallucinate and produce incorrect plans.
- Data Control: Complex planning usually requires sending data to a cloud-hosted model, which raises privacy and compliance concerns, since smaller models that could run locally are typically not capable of such tasks.
- Cost: Agentic systems require numerous API calls, resulting in high operational costs.
Introducing Octopus v2: A New Approach
The paper “Octopus v2: On-device language model for super agents” proposes an innovative technique: fine-tuning the model to output dedicated functional tokens that predict functions directly, which simplifies the planning process. Here’s a brief overview of the methodology:
- Tokenizer Enhancement: Adding new tokens to the tokenizer for each function.
- Fine-Tuning: Training the model on a dataset to predict these tokens based on the input questions.
- Efficiency: Because each function is represented by a single token, the model no longer needs lengthy function descriptions in its prompt, avoiding the retrieval and processing of those descriptions at inference time.
- Model Used: The Gemma 2B model was used for fine-tuning in the Octopus v2 paper.
This approach allows the language model to treat function calling as a standard completion task, enhancing efficiency and accuracy.
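To make the completion framing concrete, here is a hypothetical sketch of what a single training pair could look like; the prompt template and argument format are my own assumptions for illustration, not the exact format from the Octopus v2 paper:

```python
# Hypothetical training pair (format assumed for illustration):
# the model learns to complete a user query with a functional token,
# followed by the call arguments.
example = {
    "input": "What is the weather today?",
    "output": "<function1>(location='current')",  # <function1> stands in for a weather lookup
}
```

Because the target is a single token plus arguments, prediction becomes next-token completion rather than free-form JSON generation.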
Implementing Functional Tokens with the Phi-3 Model
I have fine-tuned a Phi-3 model to handle single function calls using a small synthetic dataset generated with ChatGPT. The fine-tuning itself was done with Unsloth AI. Below is a practical demonstration of how to implement this:
Step 1: Setup and Installation
Ensure you have the necessary libraries installed. If not, you can install them using pip:
```python
# Uncomment the following line to install the required libraries
# !pip install -U transformers torch datasets
```
Step 2: Tokenizer Enhancement
First, add new tokens to the tokenizer for each function:
```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/Phi-3-mini-4k-instruct")

# One new token per function the model should learn to predict
new_tokens = ["<function1>", "<function2>", "<function3>"]
tokenizer.add_tokens(new_tokens)
```
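As a quick sanity check, each new token should now encode to a single token ID instead of being split into sub-words:

```python
# Each functional token should map to exactly one ID after the update
for token in new_tokens:
    ids = tokenizer.encode(token, add_special_tokens=False)
    print(token, "->", ids)  # expect a one-element list per token
```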
Step 3: Fine-Tuning the Model
Fine-tune the model on a dataset to predict the functional tokens. Here is a simplified example:
```python
from transformers import AutoModelForCausalLM, Trainer, TrainingArguments

model = AutoModelForCausalLM.from_pretrained("microsoft/Phi-3-mini-4k-instruct")

# Grow the embedding matrix to cover the newly added functional tokens
model.resize_token_embeddings(len(tokenizer))

# Assuming you have a dataset prepared for fine-tuning (see the sketch below)
training_args = TrainingArguments(
    output_dir="./results",
    per_device_train_batch_size=4,
    num_train_epochs=3,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,  # Replace with your dataset
)

trainer.train()
```
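The `train_dataset` above is a placeholder. As a minimal sketch of how one might construct it with the `datasets` library (the toy examples and padding choices are my own assumptions, not data from the paper):

```python
from datasets import Dataset

# Ensure the tokenizer has a pad token before padding (assumption: fall back to EOS)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

# Toy examples pairing queries with their target functional tokens (illustrative only)
pairs = [
    {"text": "What is the weather today? <function1>"},
    {"text": "Set an alarm for 7am. <function2>"},
]

def tokenize(example):
    tokens = tokenizer(example["text"], truncation=True, padding="max_length", max_length=128)
    tokens["labels"] = tokens["input_ids"].copy()  # causal LM: labels mirror the inputs
    return tokens

train_dataset = Dataset.from_list(pairs).map(tokenize)
```

In practice you would generate many such pairs, as with the ChatGPT-generated synthetic dataset mentioned above.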
Step 4: Using the Fine-Tuned Model
After fine-tuning, you can use the model to predict functions directly:
```python
input_text = "What is the weather today?"
input_ids = tokenizer.encode(input_text, return_tensors="pt")

output = model.generate(input_ids, max_length=50)

# Tokens added via add_tokens() are regular tokens, so they survive
# decoding even with skip_special_tokens=True
predicted_function = tokenizer.decode(output[0], skip_special_tokens=True)
print(predicted_function)
```
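The generated text still has to be mapped back to an executable function. Here is a minimal dispatch sketch, where the registry and the `get_weather` implementation are hypothetical stand-ins:

```python
# Hypothetical registry mapping functional tokens to real callables
def get_weather():
    return "Sunny, 22°C"  # placeholder implementation

FUNCTION_REGISTRY = {"<function1>": get_weather}

# Scan the generated text for a known functional token and execute its function
for token, fn in FUNCTION_REGISTRY.items():
    if token in predicted_function:
        print(fn())
        break
```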
Conclusion
Using functional tokens significantly enhances the efficiency and accuracy of agentic systems. By fine-tuning models to predict these tokens directly, we can overcome the challenges of traditional prompt engineering methods, such as prompt size limitations, data control issues, and high costs. The methodology outlined in the Octopus v2 paper and demonstrated with the Phi3 model provides a practical approach to implementing functional tokens in enterprise-grade systems.
For more details and the full implementation, check out the GitHub repository and the Octopus v2 paper.