Crash Course: RAG with Azure OpenAI
Quickly learn the basic steps of integrating your own data with Azure OpenAI using Retrieval-Augmented Generation (RAG) in today’s straightforward crash course.
Disclaimer: This blog provides instructions and resources for the workshop part of my lectures. It is not a replacement for attending class; it may not include some critical steps and the foundational background of the techniques and methodologies used. The information may become outdated over time as I do not update the instructions after class.
1. Sign up for Azure for Students
Use your university or school email to sign up via https://azure.microsoft.com/free/students
Check if your student benefits got activate via https://portal.azure.com/#blade/Microsoft_Azure_Education/EducationMenuBlade
Request to increase the quota limit for Azure OpenAI following the instructions on this page
2. Create an Azure OpenAI resource to get access to Azure OpenAI Studio
Open Azure OpenAI studio via https://oai.azure.com/
Create an Azure OpenAI resource:
- Subscription:
Azure for Students
- Resource group: click Create new ->
SpeedyRAG
- Region: Keep the
East US
option selected - Name: Make a name starting with firstmodel and then some random characters. Like
firstmodelhrn8c29
Pricing tier: Select the
Standard S0
option- For (2) network, (3) Tags and (4) Review + Submit, just click Next and Create
Azure OpenAI resources are constrained by regional quotas. In the event of a quota limit being reached in the exercise, there’s a possibility you may need to create another resource in a different region. Select randomly from Australia East, Canada East, East US, East US 2, France Central, Japan East, North Central US, Sweden Central, Switzerland North or UK South.
- Subscription:
- While the Azure OpenAI resource is being provisioned, create an Azure AI Search resource with the following settings:
- In https://portal.azure.com/ look for
Azure AI Search
and add an instance - Subscription:
Azure for Students
- Resource group: Select the newly created
SpeedyRAG
- Service name: A unique name of your choice like
speedysearch
- Region: Select the same region as our Azure OpenAI resource
- Pricing tier:
B - Basic
- In https://portal.azure.com/ look for
- While the Azure AI Search resource is being provisioned, create a Storage account resource with the following settings:
- Subscription: Subscription:
Azure for Students
- Resource group: Select the newly created
SpeedyRAG
- Storage This needs to be an unique namegain. Let it start with speedyragstorage and some random characters. Like
speedyragstorage56b7tgy
- Region: The region in which you provisioned your Azure OpenAI resource
- Performance: Standard
- Redundancy: Locally redundant storage (LRS)
- Subscription: Subscription:
- Create a Blob container in the storage account:
- Open the storage account
- In the management menu on the left go to Data storage -> Containers
- Click on
+ Container
to create a new blob storage container - Give it the name
rag-data
and click create - Double click on the newly created container
- In the opened container click on ‘Upload’
- Here upload your own data or use the sample data https://aka.ms/own-data-brochures
3. Deploy the models
We’re going to use two AI models in this exercise: (1) A text embedding model to vectorize the text in the brochures so it can be indexed efficiently for use in grounding prompts and (2) A GPT model that you application can use to generate responses to prompts that are grounded in your data.
In the Azure OpenAI studio landing page, click on ‘Chat playground’ to open the Chat-based model playground
- In Azure AI Studio, on the Deployments page, view your existing model deployments. Then create a new base model deployment of the text-embedding-ada-002 model with the following settings:
- Deployment name:
text-embedding-ada-002
- Model:
text-embedding-ada-002
- Model version: The default version
- Deployment type: Standard
- Tokens per minute rate limit:
5K*
- Content filter:
Default
- Enable dynamic quota:
Enabled
- Deployment name:
- After the text embedding model has been deployed, return to the Deployments page and create a new deployment of the gpt-35-turbo-16k model with the following settings:
- Deployment name:
gpt-35-turbo-16k
- Model:
gpt-35-turbo-16k
(if the 16k model isn’t available, choose gpt-35-turbo) - Model version: The default version
- Deployment type: Standard
- Tokens per minute rate limit:
5K*
- Content filter:
Default
- Enable dynamic quota:
Enabled
Here we will select a model that has a good balance between pricing and performance. For the pricing details of the various models, see Azure OpenAI pricing
- Deployment name:
Recently Microsoft has put a 1000 token per minute quota for Azure for Student subscriptions. Azure OpenAI Service quotas and limits
This makes it harder to follow the tutorial and requires general best practices to remain within rate limits: (1) Implement retry logic in your application, (2) Avoid sharp changes in the workload, (3) Increase the workload gradually, (4) Test different load increase patterns and (5) Increase the quota assigned to your deployment. Move quota from another deployment, if necessary.
3.1 Create an index
To make it easy to use your own data in a prompt, you’ll index it using Azure AI Search. You’ll use the text embedding mdoel you deployed previously during the indexing process to vectorize the text data (which results in each text token in the index being represented by numeric vectors - making it compatible with the way a generative AI model represents text)
- In the https://portal.azure.com/, navigate to your Azure AI Search resource
- On the Overview page, select Import and vectorize data.
- In the Setup your data connection page, select Azure Blob Storage and configure the data source with the following settings:
- Subscription:
Azure for Students
- Blob storage account: The storage account you created previously.
- Blob container: rag-data
- Blob folder: Leave blank
- Enable deletion tracking: Unselected
- Authenticate using managed identity: Unselected
- Subscription:
- On the Vectorize your text page, select the following settings:
- Kind:
Azure OpenAI
- Subscription:
Azure for Students
- Azure OpenAI Service: Your Azure OpenAI Service resource
- Model deployment:
text-embedding-ada-002
- Authentication type: API key
- I acknowledge that connecting to an Azure OpenAI service will incur additional costs to my account: Selected
- Kind:
- On the next page, do not select the option to vectorize images or extract data with AI skills.
- On the next page, enable semantic ranking and schedule the indexer to run once.
- On the final page, set the Objects name prefix to margies-index and then create the index.
4. Add your own data
To make a chatbot that can answer to our prompts based on our data, we are first going to put our data collection into a cloud storage bucket.
Search for
Storage accounts
and open the Storage accounts overview- Click on
+ Create
and create a storage account for our project- Subscription:
Azure for Students
- Resource group: Select the resouce group that we created for this project (
SpeedyRAG
) - Storage account name: This needs to be an unique name again. Let it start with speedyragstorage and some random characters. Like
speedyragstorage56b7tgy
- Region: Select the same region as our Azure OpenAI resource
- Primary service:
Azure Blob Storage
- For the remaining settings defaults can be used, so click on the
Review + Create
button to create the storage account
- Subscription:
Go back to the Chat Playground in the Azure OpenAI Studio, open the ‘Add your data’ tab and click ‘+ Add a data source’
- As a data source select ‘Azure Blob Storage (preview)’ and click on ‘Create a new Azure AI Search resource ’ to create a new search service
- Subscription:
Azure for Students
- Resource group: Select the resouce group that we created for this project (
SpeedyRAG
) - Name: Does not need to be unique accros Azure, so let’s call it
speedysearch
- Location: Select the same region as the other services
- Pricing tier: Click
Change Pricing Tier
and selectB - Basic
- We can use the default setting from here on, so click on ‘Review + Create’
- Wait until the deployment of the Azure AI Search instance is complete…
- Subscription:
- Return to the Azure Open AI ‘Add data’ dialogue and use the following settings:
- Select data source:
Azure Blob Storage
- Subscription:
Azure for Students
- Select Azure Blob storage resource: the name of the storage account that we created
- Select storage container:
rag-data
- Select Azure AI Search resource: select the Azure AI Search resource
speedysearch
that we just created - Enter the index name:
first-index
- Indexer schedule:
Once
- Select data source:
- Click next and in the data management setting choose
- Search type:
Semantic
- Chunk Size:
1024 (Default)
- Search type:
- Click nect and in the Data connection pane select
API key
and click next - Check if there aren’t any errors and click
Save and close
5. Ask questions about your own data
Clean up the resources
To prevent the Azure AI Search service to eat up our precious Azure credit, we have to delete it.
- Delete the model deployments in Azure OpenAI Studio
- Delete the resource group
SpeedyRAG
in the Azure Portal - In the Azure AI services blade (currently found here), purge the deleted Azure OpenAI resources by clicking on Manage deleted resources
Sources and more
- [GitHub] Implement Retrieval Augmented Generation (RAG) with Azure OpenAI Service
- Here you can download an archive of brochure data: https://aka.ms/own-data-brochures. Extract the brochures to a folder
- Optimizing Azure OpenAI: A Guide to Limits, Quotas, and Best Practices