Waverley created a smart chatbot based on generative AI that can be integrated into any messenger and assists users through natural-language interaction that closely resembles talking to a human.
The product is a generic chatbot that can be integrated into any messenger application and adapted to more specific needs. Its main function is to answer users' questions, recognizing their intent and their reaction to the answers provided. The bot can use any data storage, website, or set of documents as a source of information. For example, connected to an online shop's database, the chatbot can answer potential buyers' questions about goods, much like a consultant. Connected to a company's file share or Google Drive, it can scrape internal company documents and act as a personal assistant, answering employees' questions about guidelines, policies, and other operational matters. The chatbot can also generate a piece of text (e.g. an email) in an indicated manner and style, based on a short description from the user. Particular attention is paid to information security in order to prevent data leaks.
At its core, the chatbot consists of a set of data collectors and pipelines (microservices written in GoLang and Python that can use Large Language Models, or LLMs) that gather information from the provided databases, files, websites, etc. and process it into the required format. Large documents are split into digestible chunks of summarized, essential information using ML techniques. When processed by an ML model, paragraphs or pages of text (as well as media files) are transformed into embeddings: unique numeric vectors (large vectors of floating-point numbers) that represent a particular piece of content as seen by that specific ML model. These vectors are stored in a vector database, such as Redis, where they are later used to perform Vector Similarity Search.
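The ingestion flow described above can be sketched in Python. Everything below is a simplified stand-in: a production pipeline would call a real LLM embedding model and write the vectors to Redis, whereas `toy_embed` is a hypothetical bag-of-words hash embedding and the store is a plain in-memory dictionary, used only for illustration.

```python
# Sketch of the ingestion pipeline: split a document into chunks,
# embed each chunk, and keep the vectors in an in-memory store.
import math


def chunk_text(text: str, max_words: int = 50) -> list[str]:
    """Split a large document into digestible chunks of at most max_words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def toy_embed(text: str, dims: int = 64) -> list[float]:
    """Hypothetical embedding: hash words into a fixed-size vector, then
    L2-normalize. A production system would call an ML embedding model."""
    vec = [0.0] * dims
    for word in text.lower().split():
        vec[hash(word) % dims] += 1.0
    norm = math.sqrt(sum(v * v for v in vec)) or 1.0
    return [v / norm for v in vec]


# In-memory stand-in for the vector database: chunk id -> (vector, text)
vector_store: dict[int, tuple[list[float], str]] = {}


def ingest(document: str) -> None:
    """Chunk a document and store one embedding per chunk."""
    for chunk in chunk_text(document):
        vector_store[len(vector_store)] = (toy_embed(chunk), chunk)
```

In the real system each microservice would handle one source type (database, file share, website), but the chunk-then-embed-then-store shape stays the same.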
When a user interacts with the chatbot in natural language, their input is processed by the same ML model and likewise transformed into a numeric vector. This vector is then used in a vector similarity search against the vector database to find the several pieces of text or documents most relevant to the user's input. These matches form the context for further processing by OpenAI's models, which return an answer generated in natural language.
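Under the hood, this retrieval step boils down to a nearest-neighbour search by cosine similarity. The sketch below reimplements that search in plain Python over an in-memory store (Redis performs it natively in the real system), with a stand-in `embed` function in place of the ML model.

```python
# Sketch of the retrieval step: embed the query, rank stored chunks by
# cosine similarity, and return the top matches as context for the LLM.
import math


def embed(text: str, dims: int = 64) -> list[float]:
    """Stand-in for the ML embedding model used in production."""
    vec = [0.0] * dims
    for word in text.lower().split():
        vec[hash(word) % dims] += 1.0
    return vec


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a)) or 1.0
    nb = math.sqrt(sum(y * y for y in b)) or 1.0
    return dot / (na * nb)


def retrieve_context(query: str, store: dict[str, str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the query. In production this
    is a Vector Similarity Search executed inside Redis."""
    qv = embed(query)
    ranked = sorted(store.values(),
                    key=lambda chunk: cosine(qv, embed(chunk)),
                    reverse=True)
    return ranked[:k]


# The retrieved chunks would then be prepended to the prompt sent to
# OpenAI, e.g.:
#   prompt = "Answer using this context:\n" + "\n".join(chunks) + "\nQ: " + query
```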
To save costs on requests to OpenAI's servers, it is also possible to perform a vector similarity search in a cache vector database. If a user submits a request whose intent is similar to a question answered earlier (their numeric vectors will be close), the system detects this via the similarity search in the cache. The chatbot then answers with the response previously returned by OpenAI, without sending a new request to OpenAI.
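A minimal sketch of such a semantic cache, assuming a fixed similarity threshold, a stand-in embedding function, and a stubbed OpenAI call (none of these names are from the actual Waverley API):

```python
# Semantic cache sketch: before calling OpenAI, look for an earlier
# question whose embedding is close enough to the new one.
import math


def embed(text: str, dims: int = 64) -> list[float]:
    """Stand-in for the ML embedding model."""
    vec = [0.0] * dims
    for word in text.lower().split():
        vec[hash(word) % dims] += 1.0
    return vec


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a)) or 1.0
    nb = math.sqrt(sum(y * y for y in b)) or 1.0
    return dot / (na * nb)


CACHE: list[tuple[list[float], str]] = []  # (question vector, cached answer)
THRESHOLD = 0.9  # similarity above which two questions count as "the same"


def call_openai(question: str) -> str:
    """Placeholder for the real OpenAI API request."""
    return f"generated answer to: {question}"


def answer(question: str) -> str:
    qv = embed(question)
    # 1. Cache hit: reuse the earlier answer, skip the OpenAI call.
    for vec, cached in CACHE:
        if cosine(qv, vec) >= THRESHOLD:
            return cached
    # 2. Cache miss: call OpenAI and remember the result.
    response = call_openai(question)
    CACHE.append((qv, response))
    return response
```

The threshold is a tuning knob: too low and users get stale or mismatched answers, too high and the cache rarely fires.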
On the client side, there is an API Gateway implemented in GoLang. It provides at least three interfaces for communication with the server: sockets (used by Slack in our case, or by other messengers), REST APIs for the web interface, and gRPC APIs for the mobile interface. The API Gateway has a local cache and a log storage for audit and troubleshooting purposes.
The core of the system is the chatbot engine, which performs several key functions.
The chatbot engine can also interact with third-party APIs and agents to automate business operations that do not require ML models and can follow standard business rules, such as completing a purchase or creating a ticket for the support team. This is handled by a decision-making service, an integration service, and an action executor service implemented in GoLang.
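A hypothetical sketch of how the decision-making service could route a recognized intention either to a rule-based action or onward to the ML pipeline. The service, intent, and action names here are illustrative assumptions, not the actual Waverley API (and the real services are written in GoLang):

```python
# Decision-making sketch: intents with standard business rules go to the
# action executor; everything else falls through to the ML pipeline.
from typing import Callable

# Action executor: operations that need no ML model, only business rules.
ACTIONS: dict[str, Callable[[dict], str]] = {
    "complete_purchase": lambda p: f"purchase {p['order_id']} completed",
    "create_support_ticket": lambda p: f"ticket opened: {p['summary']}",
}


def decide(intent: str, payload: dict) -> str:
    """Decision-making service: apply a business rule if one exists for
    this intent, otherwise forward the request to the ML pipeline."""
    if intent in ACTIONS:
        return ACTIONS[intent](payload)
    return "forwarded to ML pipeline"
```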
The product is partially based on OpenAI’s algorithms, particularly for processing context and generating relevant natural language responses.
Meanwhile, another part of the system is based on open-source ML models, such as Llama, Llama 2, and GPT4All, which are used to scrape, pre-process, and classify the data from the provided sources before it is fed to OpenAI's models. Firstly, this cuts costs on OpenAI's resources. Secondly, these models run locally and help ensure that private and business-critical information is not exposed publicly. In addition, this keeps the chatbot working in regions where OpenAI's ChatGPT is blocked.
Because it relies on machine learning, the system requires a significant amount of computing resources to run, which entails hosting it in the cloud or on a sufficiently powerful local server.
The chatbot can be integrated into a messenger or a website. A key piece of its functionality is a confidentiality-filtering mechanism.
This classification mechanism allows certain information to be manually marked as confidential or non-confidential, filtering out anything that cannot be shared with OpenAI and thus protecting business-critical data. Moreover, if the chatbot is used internally by a company, the information contained in its responses to a user is also filtered based on that user's role and access level. Considering that many businesses restrict the use of OpenAI's ChatGPT due to privacy concerns, losing the opportunity to automate and streamline many operations, Waverley's solution provides a considerable advantage.
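The filtering described above can be sketched as two passes: one before data is sent to OpenAI, and one before a response reaches the user. The field names and numeric access levels below are illustrative assumptions, not the actual data model:

```python
# Confidentiality-filter sketch: documents carry a manually assigned
# confidentiality flag plus a minimum access level, and are filtered
# both outbound (to OpenAI) and inbound (to the user).
from dataclasses import dataclass


@dataclass
class Document:
    text: str
    confidential: bool      # manually classified; must never leave the premises
    min_access_level: int   # minimum user level required to see this content


def shareable_with_openai(docs: list[Document]) -> list[Document]:
    """Only non-confidential material may be included in OpenAI requests;
    confidential chunks stay with the locally run models."""
    return [d for d in docs if not d.confidential]


def visible_to_user(docs: list[Document], user_level: int) -> list[Document]:
    """Filter response sources by the requesting user's role/access level."""
    return [d for d in docs if user_level >= d.min_access_level]
```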
Waverley developed a generic chatbot system that relies on OpenAI's resources but can also function autonomously using locally run ML models, which gives business owners additional information security and allows them to meet strict privacy regulations. The system can be adapted to a variety of business needs and, compared to OpenAI's own solution, offers more control over confidentiality settings and is more cost-efficient.