Commit 3470e68

Added image processing and converted to use azd for deployments

1 parent 5dcd5dd commit 3470e68
11 files changed: 878 additions & 171 deletions
File tree

AI-in-a-Box.sln

AI-in-a-Box.sln — Lines changed: 46 additions & 0 deletions
@@ -0,0 +1,46 @@
+
+Microsoft Visual Studio Solution File, Format Version 12.00
+# Visual Studio Version 17
+VisualStudioVersion = 17.5.002.0
+MinimumVisualStudioVersion = 10.0.40219.1
+Project("{2150E333-8FDC-42A3-9474-1A3956D46DE8}") = "gen-ai", "gen-ai", "{5A9632E5-F638-42BF-BF07-3B35E2BE0605}"
+EndProject
+Project("{2150E333-8FDC-42A3-9474-1A3956D46DE8}") = "semantic-kernel-bot-in-a-box", "semantic-kernel-bot-in-a-box", "{3E61D800-B120-46E9-B7EE-332ED8488CC9}"
+EndProject
+Project("{9A19103F-16F7-4668-BE54-9A1E7A4F7556}") = "SemanticKernelBot", "gen-ai\semantic-kernel-bot-in-a-box\src\SemanticKernelBot.csproj", "{189DDAAC-CE3B-4488-BD53-C71F802E3395}"
+EndProject
+Project("{2150E333-8FDC-42A3-9474-1A3956D46DE8}") = "Assistants", "Assistants", "{176DEFFB-5FDF-4308-84B3-C685641CFA0B}"
+EndProject
+Project("{2150E333-8FDC-42A3-9474-1A3956D46DE8}") = "bot-in-a-box", "bot-in-a-box", "{83389DF2-DAC5-4172-A0DF-017F511DBD8A}"
+EndProject
+Project("{9A19103F-16F7-4668-BE54-9A1E7A4F7556}") = "AssistantBot", "gen-ai\Assistants\bot-in-a-box\src\AssistantBot.csproj", "{BD9D160A-49C7-4BD7-9559-7C0185C76122}"
+EndProject
+Global
+    GlobalSection(SolutionConfigurationPlatforms) = preSolution
+        Debug|Any CPU = Debug|Any CPU
+        Release|Any CPU = Release|Any CPU
+    EndGlobalSection
+    GlobalSection(ProjectConfigurationPlatforms) = postSolution
+        {189DDAAC-CE3B-4488-BD53-C71F802E3395}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
+        {189DDAAC-CE3B-4488-BD53-C71F802E3395}.Debug|Any CPU.Build.0 = Debug|Any CPU
+        {189DDAAC-CE3B-4488-BD53-C71F802E3395}.Release|Any CPU.ActiveCfg = Release|Any CPU
+        {189DDAAC-CE3B-4488-BD53-C71F802E3395}.Release|Any CPU.Build.0 = Release|Any CPU
+        {BD9D160A-49C7-4BD7-9559-7C0185C76122}.Debug|Any CPU.ActiveCfg = Debug|Any CPU
+        {BD9D160A-49C7-4BD7-9559-7C0185C76122}.Debug|Any CPU.Build.0 = Debug|Any CPU
+        {BD9D160A-49C7-4BD7-9559-7C0185C76122}.Release|Any CPU.ActiveCfg = Release|Any CPU
+        {BD9D160A-49C7-4BD7-9559-7C0185C76122}.Release|Any CPU.Build.0 = Release|Any CPU
+    EndGlobalSection
+    GlobalSection(SolutionProperties) = preSolution
+        HideSolutionNode = FALSE
+    EndGlobalSection
+    GlobalSection(NestedProjects) = preSolution
+        {3E61D800-B120-46E9-B7EE-332ED8488CC9} = {5A9632E5-F638-42BF-BF07-3B35E2BE0605}
+        {189DDAAC-CE3B-4488-BD53-C71F802E3395} = {3E61D800-B120-46E9-B7EE-332ED8488CC9}
+        {176DEFFB-5FDF-4308-84B3-C685641CFA0B} = {5A9632E5-F638-42BF-BF07-3B35E2BE0605}
+        {83389DF2-DAC5-4172-A0DF-017F511DBD8A} = {176DEFFB-5FDF-4308-84B3-C685641CFA0B}
+        {BD9D160A-49C7-4BD7-9559-7C0185C76122} = {83389DF2-DAC5-4172-A0DF-017F511DBD8A}
+    EndGlobalSection
+    GlobalSection(ExtensibilityGlobals) = postSolution
+        SolutionGuid = {DB39B9D2-8859-49C2-9636-C4E4D345C9A7}
+    EndGlobalSection
+EndGlobal
README.md — Lines changed: 41 additions & 68 deletions
@@ -1,86 +1,61 @@
-# Video Analysis-Azure Open AI in-a-box
+# Image and Video Analysis-Azure Open AI in-a-box
 ![banner](./readme-assets/banner-aoai-video-analysis-in-a-box.png)
-This solution examines vehicles for damage using Azure Open AI GPT-4 Turbo with Vision and Computer Vision Image Analysis 4.0. All orchestration is done with Azure Data Factory, allowing this solution to be easily customized for your own use cases.
+This solution examines videos and images of vehicles for damage using Azure Open AI GPT-4 Turbo with Vision and Azure AI Vision Image Analysis 4.0. All orchestration is done with Azure Data Factory, allowing this solution to be easily customized for your own use cases.

-Please note that as of this 1/31/2024, Azure Open AI GPT-4 Turbo with Vision and Computer Vision Image Analysis 4 are in Public Preview for limited regions.
+Please note that as of 4/4/2024, Azure Open AI GPT-4 Turbo with Vision and Azure AI Vision Image Analysis 4.0 are in Public Preview for limited regions.

-- [Check here for available regions for Computer Vision Image Analysis 4.0.](https://learn.microsoft.com/en-us/azure/ai-services/computer-vision/overview-image-analysis?tabs=4-0#image-analysis-versions)
+- [Check here for available regions for Azure AI Vision Image Analysis 4.0.](https://learn.microsoft.com/en-us/azure/ai-services/computer-vision/overview-image-analysis?tabs=4-0#image-analysis-versions)
 - [Check here for available regions for GPT-4 Turbo with Vision.](https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models#gpt-4-and-gpt-4-turbo-preview-model-availability)

 ## Solution Architecture

 ![solution-arch](./readme-assets/gpt4-adf-architecture.jpg)

-1. Land videos in Azure Blob storage with Azure Event Grid, Azure Logic Apps, Azure Functions, other ADF pipelines or other applications.
+1. Land images and/or videos in Azure Blob storage with Azure Event Grid, Azure Logic Apps, Azure Functions, other ADF pipelines or other applications.
 1. The ADF pipeline retrieves the Azure AI API endpoints, keys and other configurations from Key Vault.
-1. The blob storage URL for the video file is retrieved.
-1. With Azure Computer Vision, a video retrieval index is created for the file and the video is ingested. Depending on your use case, you could ingest multiple videos to the same index.
-1. Call GPT4-V deployment in Azure Open AI, passing in video URL and the video retrieval index, system message, system prompt and other inputs.
+1. The blob storage URL for the image or video file is retrieved.
+1. For videos, a video retrieval index is created for the file with Azure AI Vision and the video is ingested into the index (see the REST sketch after this list). Depending on your use case, you could ingest multiple videos into the same index. Image analysis does not require an index.
+1. Call the GPT4-V deployment in Azure Open AI, passing in the video or image URL, the video retrieval index (for videos), the system message, the user prompt and other inputs.
 1. Save the response to Azure Cosmos DB.
-1. If the video processes successfully, move the video to an archive folder.
+1. If the file processes successfully, move it to the appropriate archive folder.

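Step 4 corresponds to the Azure AI Vision Video Retrieval preview REST API. A minimal sketch of the two calls, assuming the 2023-05-01-preview api-version from the public docs; the endpoint, key, index and ingestion names here are placeholders:

```bash
# Create a video retrieval index (name is illustrative)
curl -X PUT "https://<your-vision-endpoint>/computervision/retrieval/indexes/vehicle-videos?api-version=2023-05-01-preview" \
  -H "Ocp-Apim-Subscription-Key: <your-vision-key>" \
  -H "Content-Type: application/json" \
  -d '{"features": [{"name": "vision", "domain": "surveillance"}]}'

# Ingest one video into the index via its blob SAS URL
curl -X PUT "https://<your-vision-endpoint>/computervision/retrieval/indexes/vehicle-videos/ingestions/vehicle-ingestion?api-version=2023-05-01-preview" \
  -H "Ocp-Apim-Subscription-Key: <your-vision-key>" \
  -H "Content-Type: application/json" \
  -d '{"videos": [{"mode": "add", "documentId": "video1", "documentUrl": "<blob-sas-url>"}]}'
```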
 ## Resources Deployed in this solution

 ![resources](./readme-assets/resources.jpg)

 - User Assigned Managed Identity which has access to all resources
-- Storage account and containers for input videos and processed videos. Additionally, a SAS key is created which is required at this time for Azure Computer Vision Image Analyis 4.0.
+- Storage account and containers for input and processed images and videos. Additionally, a SAS key is created, which is required at this time for Azure AI Vision Image Analysis 4.0.
 - Azure Key Vault for holding API keys, the storage SAS token, and deployment information.
-- Azure Computer Vision with Image Analysis 4.0 for video ingestion. Note that at this time Image Analysis 4.0 is in Preview and in limited regions. [Check here for available regions.](https://learn.microsoft.com/en-us/azure/ai-services/computer-vision/overview-image-analysis?tabs=4-0#image-analysis-versions)
+- Azure AI Vision with Image Analysis 4.0 for video ingestion and/or image analysis. Note that at this time Image Analysis 4.0 is in Preview and available in limited regions. [Check here for available regions.](https://learn.microsoft.com/en-us/azure/ai-services/computer-vision/overview-image-analysis?tabs=4-0#image-analysis-versions)
 - Azure Open AI resource with a GPT-4 Vision Preview Deployment. [Check here for available regions.](https://learn.microsoft.com/en-us/azure/ai-services/openai/concepts/models#gpt-4-and-gpt-4-turbo-preview-model-availability)

-## Prerequisites
+## Prerequisites for running locally

 1. Install the latest version of [Azure CLI](https://docs.microsoft.com/en-us/cli/azure/install-azure-cli-windows?view=azure-cli-latest)
 1. Install the latest version of [Bicep](https://docs.microsoft.com/en-us/azure/azure-resource-manager/bicep/install)
+1. Install the latest version of [Azure Developer CLI](https://learn.microsoft.com/en-us/azure/developer/azure-developer-cli/install-azd?tabs=winget-windows%2Cbrew-mac%2Cscript-linux&pivots=os-windows)
 1. Install the latest version of [Azure Functions Core Tools](https://learn.microsoft.com/en-us/azure/azure-functions/functions-run-local?tabs=windows%2Cisolated-process%2Cnode-v4%2Cpython-v2%2Chttp-trigger%2Ccontainer-apps&pivots=programming-language-python#v2)
-1. Clone this repo

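A quick way to confirm the prerequisites are installed and on PATH (a minimal sketch; exact output varies by tool and version):

```bash
az version        # Azure CLI
az bicep version  # Bicep, as installed through the Azure CLI
azd version       # Azure Developer CLI
func --version    # Azure Functions Core Tools
```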
-## Deploy Azure Resources
+## Deploy to Azure

-1. Navigate to the **infra** directory in your local repo
-1. Login to your Azure account:
+### Clone this repository locally

-```bash
-az login
-```
-
-1. Set your Azure subscription ID:
-
-```bash
-az account set --subscription <subscription id>
-```
-
-1. Create an Azure Resource group in the same region that is supported by Azure OpenAI GPT-4V:
-
-```bash
-az group create --name <your resource group name> --location <your resource group location>
-```
-
-1. Run command to get the object id for your email address. This is to give you access needed for deployed resources:
-
-```bash
-az ad user show --id 'your email' --query id
-```
-
-1. Copy the objectid value returned from the above command.
-1. Open file main.bicepparam
-1. For **spObjectId**, paste the id value from the previous command over 'your-object-id'.
-1. Add value for your **resourceGroupName**.
-1. For **resourceLocation**, specify a region where GPT-4 Turbo with Vision is available.
-1. For **resourceLocationCV**, specify a region where Computer Vision with Image Analysis 4.0 is available.
-1. Add 2 or 3 alpha characters for both **prefix**, and **suffix**. Some of the resources require unique names across Azure and cannot be the same as a soft-deleted resource.
-1. Save the main.bicepparam file.
+```bash
+git clone https://github.com/Azure/AI-in-a-Box
+```

-## Deploy resources to Azure
+### Deploy resources

-1. Navigate to the **infra** folder and run the following command:
+```bash
+cd gen-ai/a-services/gpt-video-analysis-in-a-box
+azd auth login
+azd up
+```

-```bash
-az deployment group create --resource-group <your resource group name> --template-file main.bicep --parameters main.bicepparam
-```
+You will be prompted for a subscription, a region for GPT-4V, a region for AI Vision, a resource group, a prefix and a suffix.

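If you'd rather not re-enter these values on every run, azd can pre-seed them in a named environment. A minimal sketch, assuming the standard azd environment variables; any solution-specific prefix/suffix parameter names should be checked against the infra templates:

```bash
azd env new gpt-video-dev                            # create and select a named environment
azd env set AZURE_SUBSCRIPTION_ID <subscription-id>
azd env set AZURE_LOCATION eastus                    # region for the GPT-4V resources
azd up                                               # values not pre-seeded are still prompted
```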
-1. Upload videos of vehicles to your new storage account's **videosin** container using [Azure Storage Explorer](https://learn.microsoft.com/en-us/azure/vs-azure-tools-storage-manage-with-storage-explorer), [AzCopy](https://learn.microsoft.com/en-us/azure/storage/common/storage-use-azcopy-files#upload-the-contents-of-a-directory) or within [the Azure portal](https://learn.microsoft.com/en-us/azure/storage/blobs/storage-quickstart-blobs-portal#upload-a-block-blob). You can find some sample videos at the bottom of this blog, [Analyze Videos with Azure Open AI GPT-4 Turbo with Vision and Azure Data Factory](https://techcommunity.microsoft.com/t5/fasttrack-for-azure/analyze-videos-with-azure-open-ai-gpt-4-turbo-with-vision-and/ba-p/4032778).
+### Post deployment
+Upload images and videos of vehicles to your new storage account's **videosin** container using [Azure Storage Explorer](https://learn.microsoft.com/en-us/azure/vs-azure-tools-storage-manage-with-storage-explorer), [AzCopy](https://learn.microsoft.com/en-us/azure/storage/common/storage-use-azcopy-files#upload-the-contents-of-a-directory) or [the Azure portal](https://learn.microsoft.com/en-us/azure/storage/blobs/storage-quickstart-blobs-portal#upload-a-block-blob). You can find some sample images and videos at the bottom of this blog, [Analyze Videos with Azure Open AI GPT-4 Turbo with Vision and Azure Data Factory](https://techcommunity.microsoft.com/t5/fasttrack-for-azure/analyze-videos-with-azure-open-ai-gpt-4-turbo-with-vision-and/ba-p/4032778).

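For example, a local folder of samples can be pushed in one command with the Azure CLI (a sketch — substitute your account name; assumes your login has a data-plane role such as Storage Blob Data Contributor):

```bash
# Upload everything in ./samples to the videosin container
az storage blob upload-batch \
  --account-name <your-storage-account> \
  --destination videosin \
  --source ./samples \
  --auth-mode login
```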
 ## Run the solution

@@ -93,41 +68,39 @@ Please note that as of this 1/31/2024, Azure Open AI GPT-4 Turbo with Vision and
 1. After it runs successfully, go to your Azure Cosmos DB resource and examine the results in Data Explorer:
 ![cosmos](./readme-assets/cosmos-data-explorer.png)

-1. Because the way the system message was instructed to format the results, we can run queries with expressions like the one below to easily see the probability of damage, the severity of any damage, and the kind of damage that occurred:
+1. At this time, GPT4-V does not support response_format={"type": "json_object"}. However, if we still instruct the chat completion to return the results as JSON, we can use a Cosmos query to convert the string to a JSON object:
 ![cosmos query](./readme-assets/cosmos-query.png)

 ```sql
-SELECT gptoutput.filename, gptoutput.fileurl, gptoutput.shortdate,
-SUBSTRING(gptoutput.content, INDEX_OF(gptoutput.content, "Location[") + 9, INDEX_OF(gptoutput.content, "]", INDEX_OF(gptoutput.content, "Location[") + 9) - INDEX_OF(gptoutput.content, "Location[") - 9) AS Location,
-SUBSTRING(gptoutput.content, INDEX_OF(gptoutput.content, "VehicleType[") + 12, INDEX_OF(gptoutput.content, "]", INDEX_OF(gptoutput.content, "VehicleType[") + 12) - INDEX_OF(gptoutput.content, "VehicleType[") - 12) AS VehicleType,
-SUBSTRING(gptoutput.content, INDEX_OF(gptoutput.content, "DamageProbability[") + 18, INDEX_OF(gptoutput.content, "]", INDEX_OF(gptoutput.content, "DamageProbability[") + 18) - INDEX_OF(gptoutput.content, "DamageProbability[") - 18) AS DamageProbability,
-SUBSTRING(gptoutput.content, INDEX_OF(gptoutput.content, "Damage[") + 7, INDEX_OF(gptoutput.content, "]", INDEX_OF(gptoutput.content, "Damage[") + 7) - INDEX_OF(gptoutput.content, "Damage[") - 7) AS DamageType,
-SUBSTRING(gptoutput.content, INDEX_OF(gptoutput.content, "Severity[") + 9, INDEX_OF(gptoutput.content, "]", INDEX_OF(gptoutput.content, "Severity[") + 9) - INDEX_OF(gptoutput.content, "Severity[") - 9) AS Severity,
-gptoutput.content
-FROM gptoutput
+SELECT gptoutput.filename, gptoutput.fileurl, gptoutput.shortdate,
+StringToObject(gptoutput.content) as results
+FROM gptoutput
 ```

 ## Enhance the solution in your environment for your own use cases

 This solution is highly customizable due to the parameterization capabilities in Azure Data Factory. Below are the features you can parameterize out-of-the-box, or should I say, out-of-the-AI-in-Box (insert-nerdy-laugh-here.)

-![parameters](./readme-assets/adf-parms.jpg)
-
 ### Test prompts and other settings

 When developing your solution, you can rerun it with different settings to get the best results from GPT-4V by tweaking the **sys_message**, **user_prompt**, **temperature**, and **top_p** values.

+![parameters](./readme-assets/adf-parms.jpg)
+
 ### Change from batch to real-time

-This solution is set to loop against a container of videos in batch, which is ideal for testing. However, when you move to production, you may want the video to be analyzed in real-time. To do this, you can set up a storage event trigger which will run when a file is landed in blob storage.
-![trigger](./readme-assets/blob-event-trigger.jpg)
-Then eliminate the Get Metadata and For Each activities and call the ChildAnalyzeVideo pipeline after the variables are set and the parameters are retrieved from Key Vault. You can get the file name from the trigger metadata. [Read more about ADF Storage Event triggers here](https://learn.microsoft.com/en-us/azure/data-factory/how-to-create-event-trigger?tabs=data-factory).
+This solution is set to loop over a container of videos and images in batch, which is ideal for testing. However, when you move to production, you may want each file to be analyzed in real time. To do this, you can set up a storage event trigger which will run when a file lands in blob storage.
+![trigger](./readme-assets/blob-event-trigger.jpg)
+Move the If activity inside the For Each loop to the main Orchestrator pipeline canvas and then eliminate the Get Metadata and For Each activities. Call the If activity after the variables are set and the parameters are retrieved from Key Vault. You can get the file name from the trigger metadata (see the CLI sketch after this section). [Read more about ADF Storage Event triggers here](https://learn.microsoft.com/en-us/azure/data-factory/how-to-create-event-trigger?tabs=data-factory).

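The trigger itself can also be scripted with the Azure CLI's datafactory extension. A hedged sketch — the trigger and pipeline names are hypothetical, and the JSON follows the BlobEventsTrigger ARM schema; adjust paths and scope to your deployment:

```bash
# One-time: az extension add --name datafactory
az datafactory trigger create \
  --resource-group <your-rg> \
  --factory-name <your-adf> \
  --name BlobCreatedTrigger \
  --properties '{
    "type": "BlobEventsTrigger",
    "typeProperties": {
      "blobPathBeginsWith": "/videosin/blobs/",
      "events": ["Microsoft.Storage.BlobCreated"],
      "scope": "/subscriptions/<sub-id>/resourceGroups/<your-rg>/providers/Microsoft.Storage/storageAccounts/<your-storage-account>"
    },
    "pipelines": [
      {"pipelineReference": {"referenceName": "Orchestrator", "type": "PipelineReference"}}
    ]
  }'

# Triggers are created stopped; start it when ready
az datafactory trigger start --resource-group <your-rg> --factory-name <your-adf> --name BlobCreatedTrigger
```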
### Use the same Data Factory for other video analysis use cases
96+
### Use the same Data Factory for other image and/or video analysis use cases
12797

12898
You can set up multiple triggers over your Azure Data Factory and pass different parameter values for whatever analysis you need to do:
12999
![triggers](./readme-assets/new-trigger-parm.png)
130100

131-
You can set up different storage accounts for landing videos, then adjust the **storageaccounturl** and **storageaccountcontainer** parameters to ingest and analyze those videos. You can have different prompts and other values sent to GPT-4V in the **sys_message**, **user_prompt**, **temperature**, and **top_p** values for different triggers. You can land the data in a different Cosmos Account, Database and/or Container when setting the **cosmosaccount**, and **cosmosdb**, and **cosmoscontainer** values.
101+
You can set up different storage accounts for landing the files, then adjust the **storageaccounturl** and **storageaccountcontainer** parameters to ingest and analyze the images and/or videos. You can have different prompts and other values sent to GPT-4V in the **sys_message**, **user_prompt**, **temperature**, and **top_p** values for different triggers. You can land the data in a different Cosmos Account, Database and/or Container when setting the **cosmosaccount**, and **cosmosdb**, and **cosmoscontainer** values.
102+
103+
### Only analyze images or videos
104+
If you are only analyzing images OR videos, you can delete the pipeline that is not needed (childAnalyzeImage or childAnalyzeVideo), eliminate the If activity inside the ForEach File activity and specify the Execute Pipeline activity for just the pipeline you need. However, it doesn't hurt to leave the unneeded pipeline there in case you want to use it in the future.
132105

133106
For more details on this solution, check out this blog: [Analyze Videos with Azure Open AI GPT-4 Turbo with Vision and Azure Data Factory](https://techcommunity.microsoft.com/t5/fasttrack-for-azure/analyze-videos-with-azure-open-ai-gpt-4-turbo-with-vision-and/ba-p/4032778)!
azure.yaml — Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
+# yaml-language-server: $schema=https://raw.githubusercontent.com/Azure/azure-dev/main/schemas/v1.0/azure.yaml.json
+
+name: gpt4v-image-and-video-analysis-in-a-box
+metadata:
+  template: azd-init@1.4.4
+hooks:
+  preprovision:
+    windows:
+      shell: pwsh
+      run: ./scripts/setSPObjectId.ps1
+      interactive: true
+      continueOnError: false
+
+
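With this hook in place, azd runs setSPObjectId.ps1 before provisioning. On recent azd versions the hook can also be exercised on its own, which is handy while debugging it (a sketch; flag support varies by azd version):

```bash
# Re-run just the preprovision hook, without provisioning anything
azd hooks run preprovision
```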