Post

[HAI5016] Week 4: JSON and API's

[HAI5016] Week 4: JSON and API's

This week’s class is about API’s and our dear friend JSON. After the student’s presentations, we will deploy our Azure OpenAI instance and request increase of the 1K token quota that Microsoft enforces on student subscriptions by default.

Before following the instructions below, make sure that your Azure for Students Subscription is active.

1. Deploy an Azure OpenAI instance in Azure

  1. Open Azure OpenAI studio via https://oai.azure.com/

  2. Create an Azure OpenAI resource: Desktop View

    • Subscription: Azure for Students
    • Resource group: click Create new -> HAI5016
    • Region: Keep the East US option selected
    • Name: Make a name starting with firsthai- and then some random characters. Like firsthai-hrn8c29
    • Pricing tier: Select the Standard S0 option

    • For (2) network, (3) Tags and (4) Review + Submit, just click Next and Create

      Azure OpenAI resources are constrained by regional quotas. In the event of a quota limit being reached in the exercise, there’s a possibility you may need to create another resource in a different region. Select randomly from Australia East, Canada East, East US, East US 2, France Central, Japan East, North Central US, Sweden Central, Switzerland North or UK South.

2. Deploy a chat completion and embedding model in Azure OpenAI Studio

We’re going to use two Large Language models in this course: (1) a text embedding model to vectorize the text in our own data so it can be indexed efficiently for use in grounding prompts and (2) a GPT model that your application can use to generate responses to prompts that are grounded in our data.

  1. Open the Azure OpenAI Studio landing page

  2. On the Azure OpenAI Studio landing page, find Deployments link under the Shared Resources section in the left navigation menu. Click + Deploy Model and select Deploy base model.

  3. Find the Chat Completion model gpt-4o-mini and click Confirm.

  4. Deploy the model with the following settings:

  • Deployment name: gpt-4o-mini
  • Model version: Select the latest version (2024-07-18 as the moment of writing)
  • Deployment type: Global Standard
  • Content filter: Default or Default V2
  • Enable dynamic quota: Enabled

    Here we will select a model that has a good balance between pricing and performance. For the pricing details of the various models, see Azure OpenAI pricing

  1. Then, create a new base model deployment of the text-embedding-3-small model with the following settings:
  • Deployment name: text-embedding-3-small
  • Model: text-embedding-3-small
  • Model version: The default version
  • Deployment type: Standard
  • Tokens per minute rate limit: 120K
  • Content filter: Default
  • Enable dynamic quota: Enabled

3. Increase the token quota

  1. On the landing page of your resource in Azure Open AI Studio, click on the Quota link in the bottom of the left navigation menu.

  2. Check if the right subscription (Azure for Students) and region (probably East US) are selected and then find the ‘Request Quota’ button: Request Quota

  3. Find the quota of the deployed model in the list (e.g. gpt-4o-mini) under the deployment (probably GlobalStandard) and click on the ‘Request Quota’ icon. You probably have to fill in the following information:

  • Your first name
  • Your last name
  • Company Email: use your @g.skku.edu or @skku.edu email address
  • Company Name: Sungkyunkwan University
  • Company Address: 25 , Sungkyunkwan-Ro
  • Company City: Seoul
  • Company Postal Code: 110-745
  • Company Country: South Korea
  • Subscription Id: This is the ID of your Azure for Students subscription. This ID can be found in the URL-bar of your Azure OpenAI Studio tab, or find it in the subscriptions blade on Azure Portal. Find your Azure subscription ID

  • Justification: here you can write
1
   Need a higher token limit to follow along with the RAG tutorial which is used in my class (https://github.com/MicrosoftLearning/mslearn-openai/blob/main/Instructions/Exercises/06-use-own-data.md)
  • Quota Request Type: Global Standard
  • Global Standard Region: Your resource region, probably East US
  • Global Standard Model: gpt-4o-mini
  • Global Standard Quota: 400
This post is licensed under CC BY 4.0 by the author.