Build a Low Code RAG Application #JoelKallmanDay
In previous blog posts, I described how you can create a question and answer application using LangChain and the Oracle Cloud Infrastructure (OCI) Generative AI service. With the Oracle Database 23ai release, we looked at the power of AI Vector Search. This no-cost, built-in database feature provides another alternative to create Retrieval Augmented Generation (RAG) applications using just PL/SQL. However, both solutions require a good amount of code. With the recent availability of the OCI Generative AI Agents service, you can rapidly create a RAG application using low code!
#JoelKallmanDay
Before I continue, I would like to mention that this blog post is published in memory of Joel Kallman. Joel influenced many of us in the community, and I am no exception. Every year in October, the Oracle community comes together to publish a blog post that celebrates Oracle technologies or the community. This year, #JoelKallmanDay falls on October 16.
Today, I celebrate yet another Oracle product that makes developers look awesome! And I am thankful for the work that the team behind this product has done to make creating RAG applications fast and easy.
The OCI Generative AI Agents Service
The OCI Generative AI Agents (GenAI Agents) service was made generally available on September 25, 2024. This service allows developers to create GenAI Agent endpoints that respond to questions specifically related to enterprise data sources. Examples of such enterprise information include corporate policy manuals, technical documentation, product catalogues, and written field reports.
Using the use case in my previous post, I will walk through quickly what is needed to build a similar application with GenAI Agents. Here's the expected output:
OCI Prerequisites
If you have administrative rights to the tenancy, then you will have no issues performing the tasks for setting up the service for the application. If not, please have your administrators perform the steps to set up the necessary Identity and Access Management (IAM) groups and policies to allow access to the service. Here, we assume that you have unrestricted access.
To consume the service endpoint using Oracle APEX, be sure to create a restricted IAM user that:
- Is limited to, for example, generating API keys.
- Is assigned to an IAM group that is then used in a policy to allow members to invoke the GenAI Agent's endpoint. For example, it may include statements like these:
allow group 'Default'/'MyGroup' to use genai-agent-endpoint in compartment mycompartment
allow group 'Default'/'MyGroup' to manage genai-agent-session in compartment mycompartment
Be sure to generate the API key, download the private key, gather all the required configuration information, and then create an Oracle APEX web credential.
We will be working with documents uploaded to the object storage, so create a bucket, and then upload the documents that you wish to include in the knowledge base that we will create shortly.
Knowledge Base
The knowledge base (KB) is an essential component of the GenAI Agents service, and is the first to be created. Text generated by the RAG agent is grounded in and derived from the content available in its associated knowledge bases. The service supports both service-managed and customer-managed knowledge bases. The following data sources can be used:
- OCI Object Storage
- Oracle Database 23ai vector search
- OCI OpenSearch
The easiest to start with is object storage. Simply create a bucket and upload all relevant documents. The demo application described in this article uses object storage as the data source. Before creating the KB, please note the current limits:
- One bucket per data source.
- Documents must be in either PDF or TXT format.
- Maximum file size is 100 MB.
- If a file contains images, charts, and reference tables, the maximum size for these is 8 MB.
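Before uploading a large batch of documents, it can help to pre-check them against the limits listed above. The following is a minimal sketch in Python (not part of the APEX application). The function name `check_document` is my own, and I am assuming that 1 MB means 1024 × 1024 bytes; it only covers the format and file size limits.

```python
from pathlib import PurePosixPath

# Limits quoted in the list above; adjust them if the service limits change.
ALLOWED_EXTENSIONS = {".pdf", ".txt"}
MAX_FILE_SIZE_BYTES = 100 * 1024 * 1024  # assumption: 1 MB = 1024 * 1024 bytes


def check_document(name: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks acceptable."""
    problems = []
    extension = PurePosixPath(name).suffix.lower()
    if extension not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {extension or '(none)'}")
    if size_bytes > MAX_FILE_SIZE_BYTES:
        problems.append("file is larger than 100 MB")
    return problems
```

For example, `check_document("notes.docx", 10)` reports an unsupported format, while a small PDF returns an empty list.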
For a full list of the limitations, please see these guidelines. It would also be good to note that the large language model (LLM) used in the service is able to answer questions pertaining to two-dimensional charts with labelled axes without any additional data preparation.
The OCI Generative AI Agents service is in the Analytics & AI group of services. Navigate to the Generative AI Agents homepage, and then click the link to Knowledge Bases. Click the button Create knowledge base to begin. Provide a suitable name and description, select the compartment to create the KB in, and then choose the option to use an object storage bucket as the document store.
Scroll down and click on the button Specify data source.
In the modal, enter an appropriate data source name, and then select the object storage bucket where the document resides. Change the selected compartment if necessary. You may either check to select all objects in the selected bucket, select individual objects found in the bucket, or filter using prefixes. Click the button Create when done. Note that the option Automatically start ingestion job for above data sources is checked by default. Leave it checked for now.
The data ingestion job will extract data from the selected source documents, convert it into a structured format that can be used by the agents, and then store it. If data privacy is a concern to you, please note this disclaimer from the service when the object storage data store is specified:
Information ingested into the knowledge base is securely stored and used as reference context when the Agent generates responses. The ingestion process does not attempt to identify, filter, or remove personal information from the input.
I wish there were more "legal" clarity on how data is handled on the platform; however, that seems hard to obtain at this time. The reason I am hearing is that a lot of these standards and governance policies are still rapidly evolving, and thus, there isn't an easy way to provide the necessary legalese.
Once the data ingestion has been completed successfully, the status of the knowledge base should show ACTIVE.
Agent
Next, create an agent. Provide a suitable name and description, and select the compartment to create the agent in. Optionally, replace the default welcome message shown when a session with the agent begins.
Select the knowledge bases that the agent will use. Change compartment if necessary.
You will be presented with an option to also create an agent endpoint. Leave it checked, but know that you can create more than one endpoint. Creating an endpoint manually allows you to control the available features:
- Enable or disable session.
- And if session is enabled, the idle timeout in seconds.
- Enable or disable content moderation for the chat input and/or output.
- Enable or disable log traces.
- Enable or disable citations.
When the endpoint is created with the agent, content moderation is disabled for both input and output. However, sessions, citations, and tracing are enabled. All these values may be modified later except for the session feature (and idle timeout if enabled).
After the agent and its endpoint are created, go to the agent endpoint's details page and copy its OCID. This will be needed to construct the request URL when calling the OCI REST API for the GenAI Agents service.
Like the OCI Generative AI service, the GenAI Agents service also provides a "playground" to test your prompts and questions. Either launch the chat from the agent or agent endpoint details page, or navigate through the service's homepage. Regardless of how you get to the playground, make sure that you select the agent and endpoint that you wish to chat with.
Region Availability
Like the OCI Generative AI service, the GenAI Agents service was first launched in the US Chicago region and, as of the date of this article, is not yet available in any other region. This can be a challenge for a few reasons:
- Data residency requirements. In many countries, local laws require IT providers to store data, especially personal information, in a local data centre.
- Documents are stored in a different OCI region. Today, there are many available OCI regions, and it is likely that yours is somewhere other than Chicago. For knowledge bases using object storage as a data store, the buckets need to be in the same region as the service. If your documents are stored in a different region and the source bucket is in a "compatible" region, a quick workaround is to enable cross-region replication.
Implement the Q&A Application
A simple demonstration of using this agent involves one page containing the following components:
- P200_SESSION_ID. A hidden page item for storing the session identifier (note that this does not refer to the APEX session).
- P200_COLLECTION_NAME. A hidden page item to store the name of an APEX collection that stores the question and response history.
- P200_SEQ_ID. A hidden page item to store the sequence identifier of the response to display in the answer panel.
- P200_QUESTION. The text area field that takes the user's input (question).
- ASK. A button that submits the question and executes the process for generating a response.
- NEW_SESSION. A button to clear the APEX session state.
- P200_ANSWER. A display-only page item to render the text generated by the GenAI agent.
- Citations. A classic report to display the list of citations used by the GenAI agent to generate the response.
In addition to the display components, there is a Clear APEX Session State process that clears the APEX session state when the New Session button is clicked, and an execution chain that contains two processes: one that creates a GenAI agent endpoint session if one does not exist, and another that calls the Chat endpoint to generate the RAG agent's response to the user's question.
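To make the flow of the page easier to picture, here is a small sketch of the logic in Python rather than APEX. It is only a model: `PageState`, `ask`, and `new_session` are names I made up, the `create_session` and `chat` arguments are stand-ins for the two REST calls, and a plain list stands in for the APEX collection.

```python
from typing import Callable, Optional


class PageState:
    """Stands in for the hidden page items and the APEX collection."""

    def __init__(self) -> None:
        self.session_id: Optional[str] = None       # P200_SESSION_ID
        self.history: list[tuple[str, str]] = []    # question/response collection
        self.seq_id: Optional[int] = None           # P200_SEQ_ID


def ask(
    state: PageState,
    question: str,
    create_session: Callable[[], str],
    chat: Callable[[str, str], str],
) -> str:
    """Mimic the ASK execution chain: create a session if needed, then chat."""
    if state.session_id is None:
        state.session_id = create_session()
    response = chat(state.session_id, question)
    state.history.append((question, response))
    state.seq_id = len(state.history)  # sequence of the latest response, starting at 1
    return response


def new_session(state: PageState) -> None:
    """Mimic the NEW_SESSION button: forget the agent session and the history."""
    state.session_id = None
    state.history.clear()
    state.seq_id = None
```

The point of the sketch is that the session is created once and reused for every subsequent question until the user asks for a new one.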
Session
A session in the GenAI Agent service context allows a client application to maintain a conversation on behalf of its user. Doing so allows a user to continue a sequence of questions without needing to repeat what was already asked. To start a session, create one using the GenAI Agents client API. In the application, this is performed by the page process Create session if not exists, which executes only if P200_SESSION_ID is null.
The process executes the following PL/SQL code:
declare
  c_agent_endpoint_id constant varchar2(1024) :=
    'ocid1.genaiagentendpoint.oc1...';
  l_response clob;
begin
  apex_web_service.set_request_headers(
    p_name_01 => 'Content-Type'
    , p_value_01 => 'application/json'
  );
  l_response := apex_web_service.make_rest_request(
    p_http_method => 'POST'
    , p_url =>
        'https://agent-runtime.generativeai.us-chicago-1.oci.oraclecloud.com'
        || '/20240531/agentEndpoints/'
        || c_agent_endpoint_id
        || '/sessions'
    , p_credential_static_id => 'OCI_CREDENTIALS'
    , p_body => json_object(
        key 'displayName' value 'AgentTester'
        , key 'description' value 'Application testing the GenAI Agent service.'
      )
  );
  if apex_web_service.g_status_code = 200 then
    apex_json.parse(p_source => l_response);
    -- Store the session ID
    :P200_SESSION_ID := apex_json.get_varchar2(p_path => 'id');
    -- Create a new APEX collection to store the question and response.
    :P200_COLLECTION_NAME := 'SSID_' || to_char(systimestamp
      , 'YYYYMMDDHH24MISS');
    apex_collection.create_or_truncate_collection(
      p_collection_name => :P200_COLLECTION_NAME
    );
  else
    raise_application_error(-20001, 'Error creating session.');
  end if;
end;
Calling the client API requires the agent endpoint's OCID that we obtained earlier. The procedure constructs the request URL using the endpoint's OCID, and then makes a POST request to the endpoint to create a new session. If successful, a response containing the session identifier, in the attribute id, is returned. This is then used to set the value of the page item P200_SESSION_ID.
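If you want to experiment with the request outside APEX, the pieces that matter are the URL, the body, and the id attribute in the response. Here is a rough Python sketch of those three pieces, using the same host, API version, and attribute names as the PL/SQL above. The function names are mine, and actually sending the request (which needs OCI request signing, handled in APEX by the web credential) is deliberately left out.

```python
import json

# Host and API version are taken from the PL/SQL process above.
BASE_URL = "https://agent-runtime.generativeai.us-chicago-1.oci.oraclecloud.com"
API_VERSION = "20240531"


def session_url(agent_endpoint_id: str) -> str:
    """Build the URL for creating a session on the given agent endpoint."""
    return f"{BASE_URL}/{API_VERSION}/agentEndpoints/{agent_endpoint_id}/sessions"


def session_body(display_name: str, description: str) -> str:
    """Build the JSON request body; json.dumps takes care of escaping."""
    return json.dumps({"displayName": display_name, "description": description})


def session_id_from(response_text: str) -> str:
    """Pull the session identifier out of a successful response body."""
    return json.loads(response_text)["id"]
```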
Chat
In the execution chain, the next process to be called will be used to submit the user's questions to the chat API to generate a response. The following PL/SQL procedure is used in the demo application:
declare
  c_agent_endpoint_id constant varchar2(1024) :=
    'ocid1.genaiagentendpoint.oc1...';
  l_response clob;
begin
  apex_web_service.set_request_headers(
    p_name_01 => 'Content-Type'
    , p_value_01 => 'application/json'
  );
  l_response := apex_web_service.make_rest_request(
    p_http_method => 'POST'
    , p_url =>
        'https://agent-runtime.generativeai.us-chicago-1.oci.oraclecloud.com'
        || '/20240531/agentEndpoints/'
        || c_agent_endpoint_id
        || '/actions/chat'
    , p_credential_static_id => 'OCI_CREDENTIALS'
    , p_body => json_object(
        key 'sessionId' value :P200_SESSION_ID
        -- FORMAT JSON emits the boolean false instead of the string "false".
        , key 'shouldStream' value 'false' format json
        -- json_object already escapes string values, so the question is
        -- passed as-is to avoid double escaping.
        , key 'userMessage' value :P200_QUESTION
      )
  );
  if apex_web_service.g_status_code = 200 then
    -- Save the question and its response to the collection, and then save
    -- the generated SEQ ID to the session state.
    :P200_SEQ_ID := apex_collection.add_member(
      p_collection_name => :P200_COLLECTION_NAME
      , p_c001 => :P200_QUESTION
      , p_clob001 => l_response
    );
  else
    raise_application_error(-20002, 'Error generating response.');
  end if;
end;
If successful, the response body will contain a JSON document that has this structure:
{
  "message": {
    "role": "AGENT",
    "content": {
      "text": "The incubation period for dengue virus (DENV)...",
      "citations": [
        {
          "sourceText": "After a person is bitten by an infective ...",
          "sourceLocation": {
            "sourceLocationType": "OCI_OBJECT_STORAGE",
            "url": "https://objectstorage.us-chicago-1.oraclecloud.com/n/***/b/document-store/o/journals/cm000480.pdf"
          }
        }
      ]
    },
    "timeCreated": "2024-10-06T21:13:52.639Z"
  },
  "traces": [
    {
      "traceType": "RETRIEVAL_TRACE",
      "timeCreated": "2024-10-06T21:13:50.448Z",
      "retrievalInput": "What is the incubation period for DENV?",
      "citations": [
        {
          "sourceText": "After a person is bitten by an infective ...",
          "sourceLocation": {
            "sourceLocationType": "OCI_OBJECT_STORAGE",
            "url": "https://objectstorage.us-chicago-1.oraclecloud.com/n/***/b/document-store/o/journals/cm000480.pdf"
          }
        },
        {
          "sourceText": "Sev-\neral genotypes from each of these ...",
          "sourceLocation": {
            "sourceLocationType": "OCI_OBJECT_STORAGE",
            "url": "https://objectstorage.us-chicago-1.oraclecloud.com/n/***/b/document-store/o/journals/11910_2022_Article_1213.pdf"
          }
        },
        ...
      ]
    },
    {
      "traceType": "GENERATION_TRACE",
      "timeCreated": "2024-10-06T21:13:52.317Z",
      "generation": "The incubation period for dengue virus (DENV)..."
    }
  ]
}
A reminder that the citations and traces are returned because these features were enabled for the agent endpoint.
Set the value of the display-only page item P200_ANSWER using a computation. This computation executes a SQL query that returns a single value:
select
  json_value(c.clob001, '$.message.content.text' returning varchar2) as text
from apex_collections c
where c.collection_name = :P200_COLLECTION_NAME
  and c.seq_id = :P200_SEQ_ID
To display the list of citations used, set the query for the classic report Citations using the following SQL statement:
select
  ct.source_text
  , ct.source_location_type
  , ct.source_url
from apex_collections c
  , json_table(c.clob001, '$.message.content.citations[*]'
      columns(
        source_text varchar2(32767) path '$.sourceText'
        , source_location_type varchar2(32767)
            path '$.sourceLocation.sourceLocationType'
        , source_url varchar2(32767) path '$.sourceLocation.url'
      )
    ) ct
where c.collection_name = :P200_COLLECTION_NAME
  and c.seq_id = :P200_SEQ_ID
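For readers less familiar with json_value and json_table, the two queries above do roughly what this Python sketch does with the same response structure. The function names are hypothetical, and I am assuming that a missing attribute should come back as an empty value, much like a NULL column from json_table.

```python
import json


def answer_text(response_text: str) -> str:
    """Equivalent of json_value(..., '$.message.content.text')."""
    return json.loads(response_text)["message"]["content"]["text"]


def citation_rows(response_text: str) -> list[dict]:
    """Equivalent of the json_table over '$.message.content.citations[*]'."""
    content = json.loads(response_text)["message"]["content"]
    rows = []
    for citation in content.get("citations", []):
        location = citation.get("sourceLocation", {})
        rows.append(
            {
                "source_text": citation.get("sourceText"),
                "source_location_type": location.get("sourceLocationType"),
                "source_url": location.get("url"),
            }
        )
    return rows
```

Each dictionary returned by `citation_rows` corresponds to one row of the Citations classic report.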
Important
The documentation for the GenAI Agent chat endpoint describes a request body containing a single ChatDetails resource, and states that the attribute sessionId is optional. This is only true if the endpoint has the session feature disabled. Calling an endpoint with sessions enabled without providing a valid session identifier results in the following response:
{ "code": "400", "message": "sessionId must be provided." }
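One way to avoid that round trip is to check for the session identifier before building the request body. The Python sketch below illustrates the idea; `chat_body` and its `session_enabled` flag are my own inventions, and the attribute names are the ones used in the chat request shown earlier.

```python
import json
from typing import Optional


def chat_body(
    question: str, session_id: Optional[str], session_enabled: bool = True
) -> str:
    """Build the chat request body, failing early if a session ID is required."""
    if session_enabled and not session_id:
        raise ValueError("sessionId must be provided when sessions are enabled")
    payload = {"userMessage": question, "shouldStream": False}
    if session_id:
        payload["sessionId"] = session_id
    # json.dumps escapes quotes and newlines in the question exactly once.
    return json.dumps(payload)
```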
Closing Words
There are still some features to implement before this is a fully functional application. However, the core task of providing context-aware answers to a user's question is easily done. Don't you agree?
A LinkedIn post grabbed my interest recently. It was an intimate account by an AI expert of their personal encounter with cancer and the medical system, and of how AI could perhaps contribute to modern-day medical care. Don't get me wrong. I am not advocating that we should self-medicate or not trust our devoted medical practitioners. If you have been through a similar experience, and I have on more than one occasion, then you might appreciate why I too feel that a well-designed RAG approach can help me be more informed and ask good questions when the challenge arises again.
I wish you good health, and as always, if you'd like to get in touch on how to apply RAG in your organisation, please do not hesitate to reach out.
Photo Credits:
Image by Michal Jarmoluk from Pixabay