Post

[HAI5016] Week 8: Azure OpenAI Studio Model Deployments

[HAI5016] Week 8: Azure OpenAI Studio Model Deployments

Model deployments in Azure OpenAI Studio

Due to midterms, there will be no lecture this week. Instead, I want to ask you to prepare for next week’s class by deploying models in Azure OpenAI Studio.

Required Models for Deployment

You will have to deploy the following models in order to be able to use them in class next weeks:

  • gpt-4o-mini
  • text-embedding-3-small

Deploy models

  1. Open the Azure OpenAI Studio landing page
  2. On the Azure OpenAI Studio landing page, find Deployments link under the Shared Resources section in the left navigation menu.

    Deployments

  3. Click + Deploy Model and select Deploy base model:
    • Find the Chat Completion model gpt-4o-mini and click Confirm
    • Deployment name: gpt-4o-mini
    • Deployment type: Global Standard
    • Click Deploy
  4. Now do the same for the text embedding model:
    • Click + Deploy Model and select Deploy base model.
    • Find the Text Embedding model text-embedding-3-small and click Confirm
    • Deplyment name: text-embedding-3-small
    • Deployment type: Standard
    • Click Deploy

After completing the above, your Model Deployments page should display at least both models as successfully deployed:

Model Deployments

Increase token limit

By default, the token limit for the gpt-4o-mini model is set to 1K tokens per minute. You can increase this limit by clicking on the model and then clicking on the Edit button. You can then increase the token limit to (around) 200K tokens per minute.

Increase token limit

If you face any limitation errors during deployment, or cannot increase the token limit above 1K, you will have to request an increase in the token limit by clicking on the Request Increase button on the Quota page. You will have to provide a reason for the increase in the token limit, which the details can be found in this post.

This post is licensed under CC BY 4.0 by the author.