AI is redefining industries and transforming the way we interact with technology, yet its full potential is constrained by infrastructure and accessibility issues. Enter Falcon 3, the latest open-source large language model (LLM) from the Technology Innovation Institute (TII). Designed to run smoothly on small devices, Falcon 3 combines strong performance with remarkable efficiency in an effort to democratise powerful AI. This article offers a thorough guide to Falcon 3's architecture, capabilities, and real-world applications.
Table of Contents
- Introduction to Falcon 3
- Falcon 3's Key Features
- Hands-On Implementation
- Technical Deep Dive
- Enhanced Capabilities
Introduction to Falcon 3
Falcon 3 is a cutting-edge LLM that redefines efficiency and scalability. It performs exceptionally well on tasks such as reasoning, language comprehension, and code generation, and comes in four model sizes: 1B, 3B, 7B, and 10B. Thanks to its quantised variants (GGUF, AWQ, and GPTQ) and optimised decoder-only architecture, Falcon 3 delivers strong performance even on devices with limited resources.
Why Choose Falcon 3?
- High Accessibility: Runs on lightweight infrastructures.
- State-of-the-Art Performance: Achieves leading results among similarly sized open models on public benchmarks.
- Versatile Applications: Supports generative tasks, conversational AI, and more.
Falcon 3’s Key Features
Key Features Overview
1. Optimized Architecture
Falcon 3 employs a decoder-only design with flash attention and Grouped Query Attention (GQA), reducing memory overhead while improving speed and efficiency.
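To make the idea concrete, here is a minimal PyTorch sketch of grouped query attention. It is not Falcon 3's actual implementation, and the head counts are made up for illustration; the point is that several query heads share one key/value head, so far fewer K/V tensors need to be computed and cached.

# Illustrative GQA sketch (not Falcon 3's code); head counts are arbitrary
import torch
import torch.nn.functional as F

batch, seq_len, head_dim = 2, 16, 64
n_q_heads, n_kv_heads = 12, 4              # several query heads share one KV head
group_size = n_q_heads // n_kv_heads       # 3 query heads per KV head

q = torch.randn(batch, n_q_heads, seq_len, head_dim)
k = torch.randn(batch, n_kv_heads, seq_len, head_dim)
v = torch.randn(batch, n_kv_heads, seq_len, head_dim)

# Expand K/V so each group of query heads attends to its shared KV head
k = k.repeat_interleave(group_size, dim=1)
v = v.repeat_interleave(group_size, dim=1)

out = F.scaled_dot_product_attention(q, k, v)  # uses flash/efficient kernels when available
print(out.shape)  # torch.Size([2, 12, 16, 64])

# Only n_kv_heads (not n_q_heads) copies of K and V need to be cached at inference time,
# cutting KV-cache memory by a factor of n_q_heads / n_kv_heads (3x here).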
2. Advanced Tokenization
The tokenizer supports an extensive vocabulary of 131K tokens, double that of Falcon 2, enabling superior compression and strong performance across diverse tasks.
3. Extended Context Handling
Natively trained with a 32K context window (8K for the 1B model), Falcon 3 excels at processing long and complex inputs.
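Both the vocabulary size and the context window are easy to verify against the published checkpoints. The snippet below is a quick sanity check; the Hugging Face repo ID tiiuae/Falcon3-7B-Instruct and the standard max_position_embeddings config field are assumed here.

# Sanity-check the vocabulary size and context window (repo ID assumed)
from transformers import AutoConfig, AutoTokenizer

model_id = "tiiuae/Falcon3-7B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
config = AutoConfig.from_pretrained(model_id)

print(f"Vocabulary size: {tokenizer.vocab_size}")            # ~131K tokens
print(f"Context window: {config.max_position_embeddings}")   # 32K positions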
4. Quantization for Efficiency
Quantized versions (int4, int8, and 1.58-bit BitNet) enable deployment in low-resource environments with minimal loss in quality.
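As one illustrative way to use int4 quantization with the Hugging Face stack, the sketch below loads a Falcon 3 checkpoint in 4-bit via bitsandbytes (the repo ID is assumed, and the bitsandbytes and accelerate packages must be installed). TII also publishes pre-quantized GGUF, AWQ, and GPTQ variants that can be used directly.

# Load Falcon 3 in 4-bit with bitsandbytes (one possible quantized setup, not the only one)
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "tiiuae/Falcon3-7B-Instruct"   # assumed Hugging Face repo ID
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

inputs = tokenizer("Explain grouped query attention in one sentence.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))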
Performance Benchmark
Hands-On Implementation
We will test Falcon 3 using Ollama on Google Colab.
Step 1: Installing Dependencies
The first stage involves preparing your Colab environment. You’ll need to install two key components:
- pciutils: Helps Ollama detect GPU configurations
- Ollama installation script: Sets up the Ollama service
!sudo apt update
!sudo apt install -y pciutils
!curl -fsSL https://ollama.com/install.sh | sh
Step 2: Starting the Ollama Service
Since Jupyter Notebooks run code sequentially, we’ll use Python’s threading to run the Ollama service in the background:
import threading
import subprocess
import time
def run_ollama_serve():
    # Launch the Ollama server as a background process
    subprocess.Popen(["ollama", "serve"])

# Run the server in a separate thread so the notebook cell doesn't block
thread = threading.Thread(target=run_ollama_serve)
thread.start()
time.sleep(5)  # Allows the service to initialize
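A fixed sleep usually suffices, but a slightly more robust option is to poll the Ollama HTTP endpoint (it listens on http://localhost:11434 by default) until the server responds:

# Optional: wait for the Ollama API instead of relying on a fixed sleep
import time
import requests

for _ in range(30):
    try:
        if requests.get("http://localhost:11434").status_code == 200:
            print("Ollama is up")
            break
    except requests.exceptions.ConnectionError:
        time.sleep(1)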
Step 3: Pulling a Language Model
Ollama offers a wide range of models. For this article, we will pull the falcon3:10b model.
!ollama pull falcon3:10b
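You can optionally confirm the download by listing the models available locally; falcon3:10b should appear once the pull completes:

!ollama list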
Step 4: Integrating with LangChain
To interact with the model, we’ll use LangChain’s Ollama integration:
from langchain_core.prompts import ChatPromptTemplate
from langchain_ollama.llms import OllamaLLM
from IPython.display import Markdown, display
template = """Question: {question}
Answer: Let's think step by step."""
prompt = ChatPromptTemplate.from_template(template)
model = OllamaLLM(model="falcon3:10b")
chain = prompt | model
display(Markdown(chain.invoke({"question": "This theorem states that there are no integers \(a\), \(b\), and \(c\) that can satisfy the equation \(a^{n}+b^{n}=c^{n}\) for \(n>2\). Andrew Wiles proved the theorem in 1994. "})))
Output
Step 1: Identify the given information in the question.
The question states that there are no integers (a), (b), and (c) that can satisfy the equation (a^{n}+b^{n}=c^{n}) for (n>2).
Step 2: Recognize the theorem mentioned in the question.
The given information is a statement of Fermat's Last Theorem, which was proposed by Pierre de Fermat in the 17th century.
Step 3: Understand the significance of Fermat's Last Theorem.
Fermat's Last Theorem is a famous problem in number theory, stating that there are no three positive integers (a), (b), and (c) that can satisfy the equation (a^{n}+b^{n}=c^{n}) for any integer value of (n) greater than 2.
Step 4: Identify the person who proved the theorem.
The question mentions that Andrew Wiles proved Fermat's Last Theorem in 1994.
Step 5: Conclude the answer based on the information provided.
Fermat's Last Theorem states that there are no integers (a), (b), and (c) that can satisfy the equation (a^{n}+b^{n}=c^{n}) for (n>2), and it was proven by Andrew Wiles in 1994.
Final answer: Fermat's Last Theorem.
Technical Deep Dive
Training Paradigm
- Trained on 14 trillion tokens, more than double the training data used for its predecessor, Falcon 2.
- Enhanced with multi-stage training to improve reasoning and mathematical capabilities.
Deployment Insights
- Grouped Query Attention (GQA): Optimizes inference by minimizing Key-Value (KV) cache memory; a back-of-the-envelope estimate follows this list.
- Quantized Models: Int4 and Int8 variants let Falcon 3 run efficiently even without GPU acceleration.
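To see why the KV-cache savings matter at a 32K context, here is a back-of-the-envelope estimate. The layer and head counts are illustrative, not Falcon 3's published configuration.

# Rough KV-cache estimate showing why GQA helps at inference time (illustrative numbers)
def kv_cache_bytes(n_layers, n_kv_heads, head_dim, seq_len, batch=1, bytes_per_elem=2):
    # Two tensors (K and V) per layer, each of shape (batch, n_kv_heads, seq_len, head_dim)
    return 2 * n_layers * batch * n_kv_heads * seq_len * head_dim * bytes_per_elem

seq_len, n_layers, head_dim = 32_768, 40, 128
mha = kv_cache_bytes(n_layers, n_kv_heads=32, head_dim=head_dim, seq_len=seq_len)  # one KV head per query head
gqa = kv_cache_bytes(n_layers, n_kv_heads=8, head_dim=head_dim, seq_len=seq_len)   # grouped KV heads

print(f"MHA KV cache: {mha / 1e9:.1f} GB")  # ~21.5 GB
print(f"GQA KV cache: {gqa / 1e9:.1f} GB")  # ~5.4 GB, 4x smaller with 8 KV heads instead of 32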
Model Specifications
Advancements in Falcon 3
Enhanced Capabilities
The Falcon 3 family excels across scientific, reasoning, and general knowledge tasks, as demonstrated by internal evaluations using lm-evaluation-harness (a reproduction sketch follows the list below). Key highlights include:
- Math Capabilities: 10B-Base achieves 22.9 on MATH-Lvl5 and 83.0 on GSM8K, showcasing its ability to tackle complex mathematical problems.
- Coding Proficiency: 10B-Base scores 73.8 on MBPP, while 10B-Instruct achieves 45.8 on MultiPL-E, demonstrating strong generalization in programming-related tasks.
- Extended Context Handling: Models support up to 32K tokens (8K for Falcon3-1B), with 10B-Instruct scoring 86.3 on BFCL.
- Improved Reasoning: 7B-Base and Falcon3-10B-Base achieve 51.0 and 59.7 on BBH, reflecting advanced reasoning capabilities.
- Scientific Knowledge Expansion: Performance on MMLU benchmarks highlights domain-specific strengths, with Falcon3-7B-Base scoring 67.4/39.2 (MMLU/MMLU-PRO) and Falcon3-10B-Base achieving 73.1/42.5 (MMLU/MMLU-PRO).
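As a rough sketch of how numbers in this style can be reproduced with lm-evaluation-harness (the exact task names, few-shot settings, and prompt formats may differ from TII's internal setup, and evaluating the 10B model requires a sizeable GPU):

!pip install lm-eval
!lm_eval --model hf --model_args pretrained=tiiuae/Falcon3-10B-Base --tasks gsm8k --num_fewshot 5 --batch_size auto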
Final Words
Falcon 3 sets a new standard in accessible AI, offering impressive performance and versatility on modest hardware. Whether you’re a researcher exploring innovative applications or a developer building efficient AI systems, Falcon 3 empowers you to achieve more with less. Start your journey today by downloading Falcon 3 and exploring its capabilities.