InterSystems Developer Community & SerenityGPT Integration

Introduction

This document describes, at a high level, how SerenityGPT provides smart responses for "DC AI" at https://community.intersystems.com/ask-dc-ai based on community posts and the official IRIS documentation. It contains an overview of the architecture, deployment details, functional overview, and performance considerations.

Notice

This is a private document not part of the SerenityGPT Knowledge Hub. It is maintained by the SerenityGPT team and hosted here for convenience but is not publicly available or visible. If you are reading this document, it's because you are a customer of SerenityGPT or someone has shared the URL of this document with you directly.

If you believe that this document was sent to you in error or have any other concerns, please contact the SerenityGPT team directly at support@serenitygpt.com.

High-level architecture diagram

The following diagram illustrates the integration between the SerenityGPT managed service and the InterSystems Developer Community portal:

[Figure: architecture diagram]

Overview

The operation of the SerenityGPT platform and InterSystems Developer Community can be explained from two perspectives: search and indexing.

Search operation

A user logs into the InterSystems Developer Community portal.
The user searches for information by typing a question into the DC AI search box at https://community.intersystems.com/ask-dc-ai
The HTML page issues an API request to the managed instance of SerenityGPT.
The SerenityGPT Natural Language Search service sends the question to the SerenityGPT embedding service, which rapidly returns the vector representation of the query.
The SerenityGPT NLS service uses the vector representation of the question to query the InterSystems IRIS Vector DB for retrieval of relevant posts and documentation.
The database locates relevant indexed reference docs (posts, documents, etc.) using the HNSW index and passes those references back to the SerenityGPT NLS service.
The NLS service sends the query, together with the relevant documentation excerpts and community posts, to the AI LLM service (via the Azure OpenAI API).
The AI LLM interprets the question in light of the retrieved context and formulates an explanatory answer, along with references to the source posts and documentation pages.
The SerenityGPT NLS service responds to the API request with this explanation and the references to source posts and documentation pages.
The user is presented with these as an answer to their query in the DC AI portal.
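
The retrieval part of the flow above (embed the question, find the nearest indexed chunks, assemble the LLM prompt) can be sketched in a few lines of Python. This is a minimal illustration, not the SerenityGPT implementation: `toy_embed`, `VOCAB`, `retrieve`, `build_prompt`, and the sample references are all hypothetical, the word-count "embedding" stands in for the real embedding service, and an exact brute-force cosine scan stands in for the approximate HNSW lookup performed by InterSystems IRIS.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Vector = List[float]

# Hypothetical toy vocabulary; a real embedding service returns dense vectors.
VOCAB = ["vector", "search", "docker", "install", "sql"]


def toy_embed(text: str) -> Vector:
    """Stand-in for the embedding service: count vocabulary words in the text."""
    words = text.lower().split()
    return [float(words.count(term)) for term in VOCAB]


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(
    question: str,
    corpus: Dict[str, str],
    embed: Callable[[str], Vector] = toy_embed,
    k: int = 2,
) -> List[Tuple[str, float]]:
    """Return the k references most similar to the question.

    Exact scan over every chunk; an HNSW index answers the same question
    approximately without comparing against every stored vector.
    """
    query_vec = embed(question)
    scored = [(ref, cosine_similarity(query_vec, embed(text)))
              for ref, text in corpus.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def build_prompt(question: str, corpus: Dict[str, str],
                 hits: Sequence[Tuple[str, float]]) -> str:
    """Assemble the text sent to the LLM: retrieved excerpts, then the question."""
    context = "\n".join(f"[{ref}] {corpus[ref]}" for ref, _ in hits)
    return f"Context:\n{context}\n\nQuestion: {question}\nCite the references used."


corpus = {
    "post/1": "vector search with sql",
    "doc/2": "install docker",
    "post/3": "docker install guide for docker",
}
hits = retrieve("how to install docker", corpus)
prompt = build_prompt("how to install docker", corpus, hits)
```

In this sketch the reference keys play the role of the source links returned to the user alongside the answer.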

Indexing operation

The SerenityGPT indexing service uses the InterSystems Developer Community API to retrieve new and updated posts.
When new content is found, it is analyzed and split into chunks.
The chunks are sent to an embedding service (the use of the GPU-accelerated service vs CPU-based service is determined by availability and the volume of data that needs to be processed).
Vector representations of data chunks are returned and stored in the InterSystems IRIS Vector DB for later retrieval at search time.
The InterSystems IRIS database periodically rebuilds its HNSW index.
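
The chunking and embedding-backend choice described above might look roughly like the following. This is a sketch under stated assumptions: the source does not specify how content is split or what volume triggers the GPU service, so the word-based windows, the overlap, the function names, and the `gpu_threshold` value are all hypothetical.

```python
from typing import List


def split_into_chunks(text: str, max_words: int = 200, overlap: int = 20) -> List[str]:
    """Split text into word windows of at most max_words words.

    Consecutive windows share `overlap` words so that a sentence cut at a
    boundary still appears intact in one of the chunks (assumed strategy).
    """
    if max_words <= 0:
        raise ValueError("max_words must be positive")
    if not 0 <= overlap < max_words:
        raise ValueError("overlap must satisfy 0 <= overlap < max_words")

    words = text.split()
    step = max_words - overlap
    chunks: List[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def choose_embedding_backend(n_chunks: int, gpu_available: bool,
                             gpu_threshold: int = 1000) -> str:
    """Pick "gpu" only when it is available and the batch is large enough.

    The threshold is an illustrative placeholder, not a documented value.
    """
    if gpu_available and n_chunks >= gpu_threshold:
        return "gpu"
    return "cpu"


sample = " ".join(f"w{i}" for i in range(10))
chunks = split_into_chunks(sample, max_words=4, overlap=1)
backend = choose_embedding_backend(len(chunks), gpu_available=True)
```

With ten words, a window of four, and an overlap of one, the sample yields three chunks, and the small batch is routed to the CPU-based service.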

Deployment details

All SerenityGPT infrastructure is deployed on Microsoft Azure in SerenityGPT's private tenant in the UK South region, with the exception of the GPU-accelerated embedding service, which is hosted on Google Cloud Platform (GCP).

As of October 15, 2024, the active deployment of SerenityGPT operates on an instance of InterSystems IRIS Community Edition, deployed as a Docker image running on the same VM as the instance of the SerenityGPT NLS service (initially co-located for performance).

There are two environments configured: tiger (development) and tuna (production).

The development environment is used for testing and validation of new data sources. It runs a separate instance of the InterSystems IRIS database.
- Machine Type: Standard E4-2as v5 (2 vCPUs, 32 GiB memory).
- Location: UK South 1.

The production environment is dedicated to production workloads. It runs an isolated instance of the InterSystems IRIS database.
- Machine Type: Standard D4s v3 (4 vCPUs, 16 GiB memory).
- Location: UK South 1.

The upcoming release of SerenityGPT, currently undergoing testing, is intended to run on an instance of InterSystems IRIS Enterprise Edition, deployed directly to Azure Container Services and backed directly by durable Azure cloud storage.

Performance considerations

Significant improvements have been made in the performance of the SerenityGPT platform, especially in IRIS integration and vector search. The database now retrieves search results in under 50 ms, about 100 times faster than the previous-generation deployment. This improvement is still pending testing and validation and will be deployed into production in the coming weeks.
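
One way to check a retrieval-latency figure such as the 50 ms target during validation is to time repeated calls and look at the median and 95th percentile rather than a single run. The helper below is a generic sketch: `measure_latency_ms`, `summarize`, and `BUDGET_MS` are hypothetical names, and the callable passed in would be whatever function issues the vector query in the environment under test.

```python
import statistics
import time
from typing import Callable, Dict, List

BUDGET_MS = 50.0  # target taken from the text above


def measure_latency_ms(query: Callable[[], object], runs: int = 50) -> List[float]:
    """Call `query` repeatedly and return each wall-clock duration in milliseconds."""
    samples: List[float] = []
    for _ in range(runs):
        start = time.perf_counter()
        query()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def summarize(samples: List[float]) -> Dict[str, float]:
    """Median, nearest-rank 95th percentile, and maximum of the samples."""
    if not samples:
        raise ValueError("no samples to summarize")
    ordered = sorted(samples)
    # Nearest-rank percentile using integer arithmetic to avoid rounding surprises.
    p95_index = (95 * len(ordered) + 99) // 100 - 1
    return {
        "median_ms": statistics.median(ordered),
        "p95_ms": ordered[p95_index],
        "max_ms": ordered[-1],
    }


def within_budget(summary: Dict[str, float], budget_ms: float = BUDGET_MS) -> bool:
    """True when the 95th-percentile latency stays under the budget."""
    return summary["p95_ms"] < budget_ms


# Placeholder workload; replace the lambda with the real query call.
timings = measure_latency_ms(lambda: sum(range(1000)), runs=5)
report = summarize(timings)
```

Judging the budget on the 95th percentile is a design choice made here for illustration; the source does not say which statistic the 50 ms figure refers to.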

Next steps

For more information on deploying SerenityGPT, refer to the following resources: