Issue with ADLS Gen2 Access Control Setup - System Still Retrieves from Original Container #2913

@ShreyankaWebasto

Description

I'm experiencing an issue when migrating to ADLS Gen2 with Access Control (ACLs). After running the ACL setup script, the system continues to retrieve documents from the original content folder when generating citations, instead of using the newly created gptkbcontainer with ACL-enabled documents.

Environment

Feature: ADLS Gen2 with Access Control (ACLs)
Storage Type: Azure Data Lake Storage Gen2
Release: 2025-11-18

Steps to Reproduce

  1. Initial setup: files were uploaded to the content folder during the standard chatbot setup.
  2. Enable ADLS Gen2 with ACLs by running:

    python ./scripts/adlsgen2setup.py './data/*' --data-access-control './scripts/sampleacls.json' -v

     This uploads documents to gptkbcontainer with ACL metadata.
  3. Run the document preparation script:

    ./scripts/prepdocs.ps1

Current Behavior

Original files exist in the content folder from initial setup
ACL-enabled files are uploaded to gptkbcontainer after running adlsgen2setup.py
During retrieval, citations are still generated using GET /content/... paths
The system appears to retrieve files from the old content folder, not from gptkbcontainer
Citations currently work only because there are no subfolders in the content container

Problem

The system does not switch to using gptkbcontainer after the ACL setup:

Citations continue to point to /content/... instead of /gptkbcontainer/...
The ACL-enabled documents in gptkbcontainer are being ignored during retrieval
This defeats the purpose of setting up access control, as the retrieved documents don't have the ACL metadata
When manually attempting to change retrieval to use gptkbcontainer, the subfolder path structure doesn't work correctly for citations.
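
To make the subfolder concern concrete, here is a minimal sketch of how a citation path could be built from a blob path while keeping its subfolders. It assumes only that citations are served under a route prefix such as /content (as in the GET requests above). The function name `citation_url` and the example folder names are hypothetical and are not taken from the app's code.

```python
from urllib.parse import quote


def citation_url(blob_path: str, route_prefix: str = "/content") -> str:
    """Build a citation URL for a blob path, keeping any subfolders.

    Hypothetical helper for illustration only; not the app's actual code.
    """
    # Split on "/" and drop empty segments (leading or doubled slashes).
    segments = [segment for segment in blob_path.split("/") if segment]
    # Percent-encode each segment separately so the "/" separators survive.
    encoded = "/".join(quote(segment, safe="") for segment in segments)
    return f"{route_prefix}/{encoded}"


# A flat file and a file nested in (hypothetical) subfolders.
print(citation_url("report.pdf"))
print(citation_url("hr/policies/leave policy.pdf"))
```

With a flat container the first form is all that ever occurs, which may be why citations resolve today. The second form is what a container with subfolders would need the citation route to accept.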

Expected Behavior

After running the ADLS Gen2 ACL setup:

Citations should be generated using paths that point to gptkbcontainer instead of content
The ACL-enabled documents with proper access control metadata should be used for retrieval
The subfolder structure should be preserved in gptkbcontainer for proper citation path resolution

Questions

  1. Is there a configuration step or flag that needs to be set to switch the retrieval source from content to gptkbcontainer and still obey the subfolder structure?
  2. Should the old content folder be deleted after running the ACL setup?
  3. How should the citation path resolution be configured to work correctly with gptkbcontainer while maintaining subfolder structure?

Additional information

I also noticed that the container app's environment variables do not include these three variables associated with the Gen2 storage:
AZURE_ADLS_GEN2_STORAGE_ACCOUNT="stt3skr6gj3zoje"
AZURE_ADLS_GEN2_FILESYSTEM="gptkbcontainer"
AZURE_ADLS_GEN2_FILESYSTEM_PATH="/"
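
For anyone reproducing this, a minimal sketch of the check described above. It assumes only the three variable names listed here; the helper `missing_adls_vars` is hypothetical and not part of the repo. It can be run in any environment where the variables are expected, for example a shell inside the container.

```python
import os
from typing import Mapping

# The three ADLS Gen2 variables named in this issue.
REQUIRED_ADLS_VARS = (
    "AZURE_ADLS_GEN2_STORAGE_ACCOUNT",
    "AZURE_ADLS_GEN2_FILESYSTEM",
    "AZURE_ADLS_GEN2_FILESYSTEM_PATH",
)


def missing_adls_vars(env: Mapping[str, str]) -> list[str]:
    """Return the required variable names that are unset or blank in env."""
    return [name for name in REQUIRED_ADLS_VARS if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_adls_vars(os.environ)
    if missing:
        print("Missing ADLS Gen2 settings: " + ", ".join(missing))
    else:
        print("All ADLS Gen2 settings are present.")
```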

Metadata

Labels: auth (Related to user login or data access control features that use Entra, MSAL SDK, Built-in Auth)