Manually processing and interpreting complex documents such as contracts, invoices, legal briefs, and medical records is time-consuming, expensive, and prone to human error. Unstract, a no-code AI-powered platform, is designed to address these challenges. By combining LLMs with advanced OCR techniques, it offers a streamlined way to automate the complete document lifecycle, from ingestion and extraction to transformation and export. This article explores Unstract and walks through a practical implementation.
Table of Contents
- Understanding Unstract
- Overview of LLMWhisperer
- Hands-on Implementation of Unstract
Understanding Unstract
Unstract is a no-code platform for automating complex business processes that involve complicated documents, with a human in the loop. It essentially combines intelligent document processing and robotic process automation, with capabilities extended by large language models. It is available in three editions, which let users experiment with and deploy LLM workflows that process and parse documents without any programming.
The three editions available are the Unstract cloud, open-source, and on-premise. The cloud edition is a fully managed, hosted version which is the easiest way for users to get started and experiment with the platform. It offers a 14-day trial where users can experience enterprise-only features such as LLMChallenge, SinglePass Extraction, Summarized Extraction, Human Quality Review, and SSO Support. These features are also available in the open-source and on-premise editions.
LLMChallenge uses two LLMs to produce output, which helps users compare them. SinglePass Extraction optimizes information extraction from documents: instead of sending a separate prompt to the LLM for each piece of information to be extracted, it combines all the user prompts into a single, large prompt, which reduces token usage and cost. Summarized Extraction, on the other hand, uses LLMs to extract key information from documents and present it in a concise, organized way. It goes beyond simple extraction because it involves understanding the context and relationships within the document to provide a meaningful summary of the extracted information.
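The SinglePass idea described above can be illustrated with a minimal sketch. This is not Unstract's implementation; the function names, the prompt wording, and the JSON reply format are assumptions made for illustration only.

```python
import json


def build_single_pass_prompt(fields, document_text):
    """Combine several field-level questions into one prompt (hypothetical format)."""
    lines = [
        "Answer every question below using only the document.",
        "Reply with a single JSON object whose keys are the field names.",
        "",
    ]
    for name, question in fields.items():
        lines.append(f"- {name}: {question}")
    # The document text is appended once, no matter how many questions there are.
    lines += ["", "Document:", document_text]
    return "\n".join(lines)


def parse_single_pass_reply(reply, fields):
    """Parse the model's JSON reply, keeping only requested fields (missing ones become None)."""
    data = json.loads(reply)
    return {name: data.get(name) for name in fields}


fields = {
    "issuer": "What is the name of the bank that issued this credit card?",
    "closing_date": "What is the statement closing date?",
}
prompt = build_single_pass_prompt(fields, "<document text goes here>")
# A single LLM call with `prompt` stands in for len(fields) separate calls.
```

The saving comes from sending the document text once rather than once per question, which is where most of the tokens usually go.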
Human Quality Review provides a side-by-side comparison of extracted values and the source document, with the relevant source segments highlighted for human review. SSO (single sign-on) support allows users to access multiple applications with a single set of login credentials, which improves both convenience and security when managing user access.
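To make the idea of source segment highlights concrete, here is a deliberately naive sketch that locates an extracted value in the source text and brackets it with some surrounding context. It only handles verbatim matches and is an illustrative assumption, not how Unstract's review screen works; `highlight_source` is a hypothetical helper.

```python
def highlight_source(source_text, extracted_value, context=30):
    """Return a snippet of the source with the extracted value wrapped in [[ ]].

    Returns None when the value does not appear verbatim, which a reviewer
    could treat as a signal to inspect that field more closely.
    """
    start = source_text.find(extracted_value)
    if start == -1:
        return None
    end = start + len(extracted_value)
    left = source_text[max(0, start - context):start]
    right = source_text[end:end + context]
    return f"{left}[[{extracted_value}]]{right}"


snippet = highlight_source("Issued by Acme Bank on 2024-01-05.", "Acme Bank", context=3)
print(snippet)  # prints: by [[Acme Bank]] on
```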
The Unstract open-source edition lets users try its features without a subscription or account. Users can clone the official GitHub repository (https://github.com/Zipstack/unstract) and run it locally with all the features and services. The on-premise edition can be installed on any infrastructure that supports Docker or Kubernetes, and it supports the three major cloud service providers: AWS, Azure, and GCP.
Unstract supports automating complex business processes that involve long, complex documents with human review, based on the following flow –
It features Prompt Studio, a no-code environment designed for handling complex documents. Prompt Studio lets users engineer prompts for custom document types using a combination of LLMs, vector DBs, embedders, and text extractors such as LlamaParse. It also provides prompt monitoring, evaluation of success across multiple sample documents, and the ability to build structured information from unstructured files.
Workflows in Unstract can pull unstructured data from sources such as AWS S3, Dropbox, GDrive, Google Cloud Storage, SFTP/SSH, and Azure Cloud Storage. Users can also send their processed, structured data to destinations such as MariaDB, Snowflake, Redshift, MSSQL/MySQL, OracleDB, PostgreSQL, and BigQuery.
The workflows can also be deployed as APIs, allowing users to POST unstructured documents to the API and receive structured JSON data in response. The workflow deployment can be done as unstructured data APIs, unstructured data ETL pipelines, or custom Q&A apps.
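As a rough sketch of what calling such an API deployment could look like, the snippet below builds a multipart POST request using only the Python standard library. The endpoint URL, the bearer-token header, and the `files` form field name are assumptions for illustration; the actual values should be taken from the API deployment page in your own Unstract instance.

```python
import json
import urllib.request
import uuid


def build_extraction_request(url, api_key, filename, file_bytes, field_name="files"):
    """Build a multipart/form-data POST request carrying one document.

    The auth scheme and the form field name are assumptions, not confirmed API details.
    """
    boundary = uuid.uuid4().hex
    body = b"".join([
        f"--{boundary}\r\n".encode(),
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'.encode(),
        b"Content-Type: application/octet-stream\r\n\r\n",
        file_bytes,
        f"\r\n--{boundary}--\r\n".encode(),
    ])
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


def post_document(request, timeout=300):
    """Send the request and decode the JSON response body."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


# Usage (placeholder URL and key, so it is left commented out):
# with open("statement.pdf", "rb") as f:
#     req = build_extraction_request("https://<your-deployment-url>", "<api-key>", "statement.pdf", f.read())
# print(post_document(req))
```

Building the request separately from sending it keeps the example easy to inspect without a live deployment.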
Overview of LLMWhisperer
LLMWhisperer is another technology by Unstract that presents data from documents of varying layouts and formats to LLMs in a form they can understand. It is available as an API that can be integrated with existing systems to preprocess documents before they are passed as input to an LLM. Unstract lists LLMWhisperer under the text extractor category in its dashboard, alongside other parsers.
The recommended use cases of LLMWhisperer can be understood by the image provided below –
Hands-on Implementation of Unstract
The following implementation steps cover the cloud edition (free trial) and the open-source edition (programmatic way).
Cloud Account (Free trial for 14 days) –
Step 1: Create a free account and log in to reach the dashboard. Make sure you select the Unstract option, then get started with the cloud edition –
Step 2: Setting the LLM, Vector DB, Embedding, and Text Extractor –
By default, the Azure GPT-4o LLM is offered. This can be changed to another LLM by selecting New LLM Profile –
If we select OpenAI, we can configure the required LLM by providing the API key and other parameters, then test the connection –
Once the test connection returns success, click on the submit button to add it to your LLM list –
By default, the Unstract cloud account provides the Postgres Free Trial VectorDB. We can add our own vector DB the same way we added an LLM –
Let’s use Pinecone and provide the necessary details such as API key, Region, and Cloud Service Provider for the database –
Click on the submit button once the test connection is successful –
Using the same method as for the LLM and vector DB, we can also set up embeddings and text extractors (such as LlamaParse). For this tutorial, we will go with the default setup.
Step 3: Click on Prompt Studio on the left panel and select a new project –
Now, we will set the LLM, vector DB, embedding and text extractor profile using the settings option present on the top right –
Step 4: Once the LLM profile is set, we will upload our documents for parsing using the manage documents option –
Step 5: Add a new prompt – “What is the name of the issuer or the bank who has issued this credit card?”
Select run all LLMs for all documents to see the output –
Step 6: Check the output for another prompt – “What is the name of the client and also mention closing date and account ending.”
The output is correct as per the PDF document –
Open-source Programmatic Way –
Prerequisites for running the open-source version programmatically:
Make sure the following are installed on the system –
- Docker
- Docker Compose
- Git
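Before continuing, it can help to confirm that the prerequisites listed above are actually on the PATH. The following is a small optional sketch, not part of Unstract; `missing_prerequisites` is a hypothetical helper name.

```python
import shutil


def missing_prerequisites(required=("docker", "git"), which=shutil.which):
    """Return the names of required executables that were not found on PATH."""
    return [name for name in required if which(name) is None]


# Docker Compose is not checked here: depending on the installation it is
# either the `docker compose` plugin or a standalone `docker-compose` binary.
missing = missing_prerequisites()
if missing:
    print("Not found on PATH:", ", ".join(missing))
else:
    print("docker and git were found on PATH.")
```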
Step 1: Clone the official Git repo using the command: git clone https://github.com/Zipstack/unstract.git
Step 2: Change into the cloned unstract directory and execute the command ./run-platform.sh (make sure Docker is up and running).
Step 3: Visit http://frontend.unstract.localhost in a browser.
Step 4: Log in with unstract as both the username and the password to open the dashboard. Once logged in, the tool can be used just like the cloud edition, as described above for the cloud account.
Final Words
Unstract represents a no-code approach to document processing that can empower businesses of all sizes to automate complex workflows without specialized technical expertise. By using LLMs, vector DBs, and embeddings on the backend, Unstract goes beyond simple data extraction. It understands the context and relationships within documents to deliver accurate and meaningful results. This makes it valuable for businesses that handle and work with unstructured data.