Waverley created a smart chatbot based on generative AI that can be integrated into any messenger and assists users through natural-language interaction that closely resembles talking to a human.
The product is a generic chatbot that can be integrated into any messenger application and adapted to more specific needs. Its main function is to answer users' questions, recognizing their intent and their reaction to the answers provided. The bot can use any data storage, website, or set of documents as a source of information. For example, connected to an online shop's database, the chatbot can answer potential buyers' questions about goods, much like a consultant. Connected to a company's file share or Google Drive, it can scrape internal company documents and act as a personal assistant, answering employees' questions about guidelines, policies, and other operational matters. The chatbot can also generate a piece of text (e.g. an email) in an indicated manner and style, based on a short description from the user. Particular attention is paid to information security in order to prevent data leaks.
At its core, the chatbot consists of a set of data collectors and pipelines (microservices written in GoLang and Python that can use Large Language Models, or LLMs) that gather information from the provided databases, files, websites, etc. and process it into the required format. Large documents are split into digestible chunks of summarized, essential information using ML techniques. When processed by an ML model, paragraphs or pages of text (as well as media files) are transformed into embeddings: unique numeric vectors (large vectors of floating-point numbers) that represent a particular piece of content as seen by that specific ML model. These vectors are stored in a vector database, such as Redis, where they are later used to perform Vector Similarity Search.
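The ingestion flow described above can be sketched in Python. Everything below is a simplified stand-in: a production pipeline would call a real LLM embedding model and write the vectors to Redis, whereas `toy_embed` is a hypothetical bag-of-words hash embedding and the store is a plain in-memory dictionary, used only for illustration.

```python
# Sketch of the ingestion pipeline: split a document into chunks,
# embed each chunk, and keep the vectors in an in-memory store.
import math


def chunk_text(text: str, max_words: int = 50) -> list[str]:
    """Split a large document into digestible chunks of at most max_words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def toy_embed(text: str, dims: int = 64) -> list[float]:
    """Hypothetical embedding: hash words into a fixed-size vector, then
    L2-normalize. A production system would call an ML embedding model."""
    vec = [0.0] * dims
    for word in text.lower().split():
        vec[hash(word) % dims] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


# In-memory stand-in for the vector database: chunk id -> (vector, text)
vector_store: dict[int, tuple[list[float], str]] = {}


def ingest(document: str) -> None:
    """Chunk a document and store one embedding per chunk."""
    for chunk in chunk_text(document):
        vector_store[len(vector_store)] = (toy_embed(chunk), chunk)
```

In the real system each microservice would handle one source type (database, file share, website), but the chunk-then-embed-then-store shape stays the same.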
When a user interacts with the chatbot in natural language, their input is processed by the same ML model and likewise transformed into a numeric vector. This vector is then used in a vector similarity search against the vector database to find the several pieces of text or documents most relevant to the user's input. These matches form the context for further processing by OpenAI's models, which return an answer generated in natural language.
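Under the hood, this retrieval step boils down to a nearest-neighbour search by cosine similarity. The sketch below reimplements that search in plain Python over an in-memory store (Redis performs it natively in the real system), with a stand-in `embed` function in place of the ML model.

```python
# Sketch of the retrieval step: embed the query, rank stored chunks by
# cosine similarity, and return the top matches as context for the LLM.
import math


def embed(text: str, dims: int = 64) -> list[float]:
    """Stand-in for the ML embedding model used in production."""
    vec = [0.0] * dims
    for word in text.lower().split():
        vec[hash(word) % dims] += 1.0
    return vec


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a)) or 1.0
    nb = math.sqrt(sum(y * y for y in b)) or 1.0
    return dot / (na * nb)


def retrieve_context(query: str, store: dict[str, str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query. In production this
    is a Vector Similarity Search executed inside Redis."""
    qv = embed(query)
    ranked = sorted(store.values(),
                    key=lambda chunk: cosine(qv, embed(chunk)),
                    reverse=True)
    return ranked[:k]


# The retrieved chunks would then be prepended to the prompt sent to
# OpenAI, e.g.:
#   prompt = "Answer using this context:\n" + "\n".join(chunks) + "\nQ: " + query
```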
To save costs on requests to OpenAI's servers, it is also possible to perform a vector similarity search in a cache vector database. If a user submits a request whose intent is similar to a question answered earlier (their numeric vectors will be close), the system detects this via the similarity search in the cache. The chatbot then answers with the response previously returned by OpenAI, without sending a new request to OpenAI.
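A minimal sketch of such a semantic cache, assuming a fixed similarity threshold, a stand-in embedding function, and a stubbed OpenAI call (none of these names are from the actual Waverley API):

```python
# Semantic cache sketch: before calling OpenAI, look for an earlier
# question whose embedding is close enough to the new one.
import math


def embed(text: str, dims: int = 64) -> list[float]:
    """Stand-in for the ML embedding model."""
    vec = [0.0] * dims
    for word in text.lower().split():
        vec[hash(word) % dims] += 1.0
    return vec


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a)) or 1.0
    nb = math.sqrt(sum(y * y for y in b)) or 1.0
    return dot / (na * nb)


CACHE: list[tuple[list[float], str]] = []  # (question vector, cached answer)
THRESHOLD = 0.9  # similarity above which two questions count as "the same"


def call_openai(question: str) -> str:
    """Placeholder for the real OpenAI API request."""
    return f"generated answer to: {question}"


def answer(question: str) -> str:
    qv = embed(question)
    # 1. Cache hit: reuse the earlier answer, skip the OpenAI call.
    for vec, cached in CACHE:
        if cosine(qv, vec) >= THRESHOLD:
            return cached
    # 2. Cache miss: call OpenAI and remember the result.
    response = call_openai(question)
    CACHE.append((qv, response))
    return response
```

The threshold is a tuning knob: too low and users get stale or mismatched answers, too high and the cache rarely fires.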
On the client side, there is an API Gateway implemented in GoLang. It provides at least three interfaces for communication with the server: sockets (used by Slack in our case, or by other messengers), REST APIs for the web interface, and gRPC APIs for the mobile interface. The API Gateway has a local cache and a log storage for audit and troubleshooting purposes.
The core of the system is the chatbot engine, which performs several key functions.
The chatbot engine can also interact with third-party APIs and agents to automate business operations that do not require ML models and can follow standard business rules, such as completing a purchase or creating a ticket for the support team. This is handled by a decision-making service, an integration service, and an action executor service implemented in GoLang.
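A hypothetical sketch of how the decision-making service could route a recognized intention either to a rule-based action or onward to the ML pipeline. The service, intent, and action names here are illustrative assumptions, not the actual Waverley API (and the real services are written in GoLang):

```python
# Decision-making sketch: intents with standard business rules go to the
# action executor; everything else falls through to the ML pipeline.
from typing import Callable

# Action executor: operations that need no ML model, only business rules.
ACTIONS: dict[str, Callable[[dict], str]] = {
    "complete_purchase": lambda p: f"purchase {p['order_id']} completed",
    "create_support_ticket": lambda p: f"ticket opened: {p['summary']}",
}


def decide(intent: str, payload: dict) -> str:
    """Decision-making service: apply a business rule if one exists for
    this intent, otherwise forward the request to the ML pipeline."""
    if intent in ACTIONS:
        return ACTIONS[intent](payload)
    return "forwarded to ML pipeline"
```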
The product is partially based on OpenAI’s algorithms, particularly for processing context and generating relevant natural language responses.
Meanwhile, another part of the system is based on open-source ML models, such as Llama, Llama 2, and GPT4All, which are used to scrape, pre-process, and classify the data from the provided sources before it is fed to OpenAI's models. Firstly, this cuts costs on OpenAI's resources. Secondly, these models run locally and help ensure that private and business-critical information is not exposed publicly. In addition, this keeps the chatbot working in regions where OpenAI's ChatGPT is blocked.
Because it relies on machine learning, the system requires a significant amount of computing resources to run, which entails hosting it in the cloud or on a sufficiently powerful local server.
The chatbot can be integrated into a messenger or a website. A key piece of its functionality is a confidentiality-filtering mechanism.
This classification mechanism allows certain information to be manually marked as confidential or non-confidential, filtering out anything that cannot be shared with OpenAI and thus protecting business-critical data. Moreover, if the chatbot is used internally by a company, the information contained in its responses to a user is also filtered based on that user's role and access level. Considering that many businesses restrict the use of OpenAI's ChatGPT due to privacy concerns, losing the opportunity to automate and streamline many operations, Waverley's solution provides a considerable advantage.
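The filtering described above can be sketched as two passes: one before data is sent to OpenAI, and one before a response reaches the user. The field names and numeric access levels below are illustrative assumptions, not the actual data model:

```python
# Confidentiality-filter sketch: documents carry a manually assigned
# confidentiality flag plus a minimum access level, and are filtered
# both outbound (to OpenAI) and inbound (to the user).
from dataclasses import dataclass


@dataclass
class Document:
    text: str
    confidential: bool      # manually classified; must never leave the premises
    min_access_level: int   # minimum user level required to see this content


def shareable_with_openai(docs: list[Document]) -> list[Document]:
    """Only non-confidential material may be included in OpenAI requests;
    confidential chunks stay with the locally run models."""
    return [d for d in docs if not d.confidential]


def visible_to_user(docs: list[Document], user_level: int) -> list[Document]:
    """Filter response sources by the requesting user's role/access level."""
    return [d for d in docs if user_level >= d.min_access_level]
```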
Waverley developed a generic chatbot system that relies on OpenAI's resources but can also function autonomously using locally run ML models, which gives business owners additional information security and allows them to meet strict privacy regulations. The system can be adapted to a variety of business needs and, compared to OpenAI's own solution, offers more control over confidentiality settings and is more cost-efficient.