Post

Crash Course: RAG with Azure OpenAI

Crash Course: RAG with Azure OpenAI

Quickly learn the basic steps of integrating your own data with Azure OpenAI using Retrieval-Augmented Generation (RAG) in today’s straightforward crash course.

Disclaimer: This blog provides instructions and resources for the workshop part of my lectures. It is not a replacement for attending class; it may not include some critical steps and the foundational background of the techniques and methodologies used. The information may become outdated over time as I do not update the instructions after class.

1. Sign up for Azure for Students

  1. Use your university or school email to sign up via https://azure.microsoft.com/free/students

  2. Check if your student benefits got activate via https://portal.azure.com/#blade/Microsoft_Azure_Education/EducationMenuBlade

  3. Request to increase the quota limit for Azure OpenAI following the instructions on this page

2. Create an Azure OpenAI resource to get access to Azure OpenAI Studio

  1. Open Azure OpenAI studio via https://oai.azure.com/

  2. Create an Azure OpenAI resource: Desktop View

    • Subscription: Azure for Students
    • Resource group: click Create new -> SpeedyRAG
    • Region: Keep the East US option selected
    • Name: Make a name starting with firstmodel and then some random characters. Like firstmodelhrn8c29
    • Pricing tier: Select the Standard S0 option

    • For (2) network, (3) Tags and (4) Review + Submit, just click Next and Create

      Azure OpenAI resources are constrained by regional quotas. In the event of a quota limit being reached in the exercise, there’s a possibility you may need to create another resource in a different region. Select randomly from Australia East, Canada East, East US, East US 2, France Central, Japan East, North Central US, Sweden Central, Switzerland North or UK South.

  3. While the Azure OpenAI resource is being provisioned, create an Azure AI Search resource with the following settings:
    • In https://portal.azure.com/ look for Azure AI Search and add an instance
    • Subscription: Azure for Students
    • Resource group: Select the newly created SpeedyRAG
    • Service name: A unique name of your choice like speedysearch
    • Region: Select the same region as our Azure OpenAI resource
    • Pricing tier: B - Basic
  4. While the Azure AI Search resource is being provisioned, create a Storage account resource with the following settings:
    • Subscription: Subscription: Azure for Students
    • Resource group: Select the newly created SpeedyRAG
    • Storage This needs to be an unique namegain. Let it start with speedyragstorage and some random characters. Like speedyragstorage56b7tgy
    • Region: The region in which you provisioned your Azure OpenAI resource
    • Performance: Standard
    • Redundancy: Locally redundant storage (LRS)
  5. Create a Blob container in the storage account:
    • Open the storage account
    • In the management menu on the left go to Data storage -> Containers
    • Click on + Container to create a new blob storage container
    • Give it the name rag-data and click create
    • Double click on the newly created container
    • In the opened container click on ‘Upload’
    • Here upload your own data or use the sample data https://aka.ms/own-data-brochures

3. Deploy the models

We’re going to use two AI models in this exercise: (1) A text embedding model to vectorize the text in the brochures so it can be indexed efficiently for use in grounding prompts and (2) A GPT model that you application can use to generate responses to prompts that are grounded in your data.

  1. In the Azure OpenAI studio landing page, click on ‘Chat playground’ to open the Chat-based model playground

  2. In Azure AI Studio, on the Deployments page, view your existing model deployments. Then create a new base model deployment of the text-embedding-ada-002 model with the following settings:
    • Deployment name: text-embedding-ada-002
    • Model: text-embedding-ada-002
    • Model version: The default version
    • Deployment type: Standard
    • Tokens per minute rate limit: 5K*
    • Content filter: Default
    • Enable dynamic quota: Enabled
  3. After the text embedding model has been deployed, return to the Deployments page and create a new deployment of the gpt-35-turbo-16k model with the following settings:
    • Deployment name: gpt-35-turbo-16k
    • Model: gpt-35-turbo-16k (if the 16k model isn’t available, choose gpt-35-turbo)
    • Model version: The default version
    • Deployment type: Standard
    • Tokens per minute rate limit: 5K*
    • Content filter: Default
    • Enable dynamic quota: Enabled

      Here we will select a model that has a good balance between pricing and performance. For the pricing details of the various models, see Azure OpenAI pricing

Recently Microsoft has put a 1000 token per minute quota for Azure for Student subscriptions. Azure OpenAI Service quotas and limits
Desktop View
This makes it harder to follow the tutorial and requires general best practices to remain within rate limits: (1) Implement retry logic in your application, (2) Avoid sharp changes in the workload, (3) Increase the workload gradually, (4) Test different load increase patterns and (5) Increase the quota assigned to your deployment. Move quota from another deployment, if necessary.

3.1 Create an index

To make it easy to use your own data in a prompt, you’ll index it using Azure AI Search. You’ll use the text embedding mdoel you deployed previously during the indexing process to vectorize the text data (which results in each text token in the index being represented by numeric vectors - making it compatible with the way a generative AI model represents text)

  1. In the https://portal.azure.com/, navigate to your Azure AI Search resource
  2. On the Overview page, select Import and vectorize data.
  3. In the Setup your data connection page, select Azure Blob Storage and configure the data source with the following settings:
    • Subscription: Azure for Students
    • Blob storage account: The storage account you created previously.
    • Blob container: rag-data
    • Blob folder: Leave blank
    • Enable deletion tracking: Unselected
    • Authenticate using managed identity: Unselected
  4. On the Vectorize your text page, select the following settings:
    • Kind: Azure OpenAI
    • Subscription: Azure for Students
    • Azure OpenAI Service: Your Azure OpenAI Service resource
    • Model deployment: text-embedding-ada-002
    • Authentication type: API key
    • I acknowledge that connecting to an Azure OpenAI service will incur additional costs to my account: Selected
  5. On the next page, do not select the option to vectorize images or extract data with AI skills.
  6. On the next page, enable semantic ranking and schedule the indexer to run once.
  7. On the final page, set the Objects name prefix to margies-index and then create the index.

4. Add your own data

To make a chatbot that can answer to our prompts based on our data, we are first going to put our data collection into a cloud storage bucket.

  1. Go to https://portal.azure.com/

  2. Search for Storage accounts and open the Storage accounts overview

  3. Click on + Create and create a storage account for our project
    • Subscription: Azure for Students
    • Resource group: Select the resouce group that we created for this project (SpeedyRAG)
    • Storage account name: This needs to be an unique name again. Let it start with speedyragstorage and some random characters. Like speedyragstorage56b7tgy
    • Region: Select the same region as our Azure OpenAI resource
    • Primary service: Azure Blob Storage
    • For the remaining settings defaults can be used, so click on the Review + Create button to create the storage account
  4. Go back to the Chat Playground in the Azure OpenAI Studio, open the ‘Add your data’ tab and click ‘+ Add a data source’

  5. As a data source select ‘Azure Blob Storage (preview)’ and click on ‘Create a new Azure AI Search resource ’ to create a new search service
    • Subscription: Azure for Students
    • Resource group: Select the resouce group that we created for this project (SpeedyRAG)
    • Name: Does not need to be unique accros Azure, so let’s call it speedysearch
    • Location: Select the same region as the other services
    • Pricing tier: Click Change Pricing Tier and select B - Basic
    • We can use the default setting from here on, so click on ‘Review + Create’
    • Wait until the deployment of the Azure AI Search instance is complete…
  • Return to the Azure Open AI ‘Add data’ dialogue and use the following settings:
    • Select data source: Azure Blob Storage
    • Subscription: Azure for Students
    • Select Azure Blob storage resource: the name of the storage account that we created
    • Select storage container: rag-data
    • Select Azure AI Search resource: select the Azure AI Search resource speedysearch that we just created
    • Enter the index name: first-index
    • Indexer schedule: Once
  • Click next and in the data management setting choose
    • Search type: Semantic
    • Chunk Size: 1024 (Default)
  • Click nect and in the Data connection pane select API key and click next
  • Check if there aren’t any errors and click Save and close

5. Ask questions about your own data

Clean up the resources

To prevent the Azure AI Search service to eat up our precious Azure credit, we have to delete it.

  1. Delete the model deployments in Azure OpenAI Studio
  2. Delete the resource group SpeedyRAG in the Azure Portal
  3. In the Azure AI services blade (currently found here), purge the deleted Azure OpenAI resources by clicking on Manage deleted resources

Sources and more

  1. [GitHub] Implement Retrieval Augmented Generation (RAG) with Azure OpenAI Service
  2. Here you can download an archive of brochure data: https://aka.ms/own-data-brochures. Extract the brochures to a folder
  3. Optimizing Azure OpenAI: A Guide to Limits, Quotas, and Best Practices
This post is licensed under CC BY 4.0 by the author.