This document provides technical details on the Keycloak integration with the DataExchange platform. The integration uses python-keycloak 5.5.0 and focuses on runtime performance while maintaining flexibility in role management.
The DataExchange platform uses Keycloak for authentication while maintaining its own role-based authorization system. This hybrid approach allows for:
- Centralized Authentication: Keycloak handles user authentication, token validation, and session management
- Custom Authorization: DataExchange maintains its own role and permission models in the database
- Flexible Integration: User data is synchronized from Keycloak but roles are managed within the application
The following environment variables must be set for the Keycloak integration to work:
KEYCLOAK_SERVER_URL=https://your-keycloak-server/auth
KEYCLOAK_REALM=your-realm
KEYCLOAK_CLIENT_ID=your-client-id
KEYCLOAK_CLIENT_SECRET=your-client-secret
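A missing variable is easier to diagnose at startup than at the first failed login. The following is a minimal sketch of such a check, assuming the four variable names listed above; `KeycloakSettings` and `load_keycloak_settings` are hypothetical names used for illustration, not part of the DataExchange codebase.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# The variable names come from the list above; all other names are hypothetical.
REQUIRED_VARS = (
    "KEYCLOAK_SERVER_URL",
    "KEYCLOAK_REALM",
    "KEYCLOAK_CLIENT_ID",
    "KEYCLOAK_CLIENT_SECRET",
)


@dataclass(frozen=True)
class KeycloakSettings:
    server_url: str
    realm: str
    client_id: str
    client_secret: str


def load_keycloak_settings(env: Mapping[str, str] = os.environ) -> KeycloakSettings:
    """Read the Keycloak variables and fail early if any is missing or blank."""
    missing = [name for name in REQUIRED_VARS if not env.get(name, "").strip()]
    if missing:
        raise RuntimeError("Missing Keycloak settings: " + ", ".join(missing))
    return KeycloakSettings(
        server_url=env["KEYCLOAK_SERVER_URL"],
        realm=env["KEYCLOAK_REALM"],
        client_id=env["KEYCLOAK_CLIENT_ID"],
        client_secret=env["KEYCLOAK_CLIENT_SECRET"],
    )
```

Passing the environment as a parameter keeps the check testable without touching the real process environment.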
The KeycloakManager class in authorization/keycloak.py is the central component that handles all Keycloak interactions. It provides methods for:
- Token validation
- User information retrieval
- User synchronization
- Role mapping
# Example usage
from authorization.keycloak import keycloak_manager
# Validate a token
user_info = keycloak_manager.validate_token(token)
# Get user roles
roles = keycloak_manager.get_user_roles(token)
# Sync user from Keycloak
user = keycloak_manager.sync_user_from_keycloak(user_info, roles, organizations)
The KeycloakAuthenticationMiddleware in authorization/middleware.py intercepts requests and authenticates users based on Keycloak tokens. It:
- Extracts the token from either the Authorization header or the x-keycloak-token header
- Validates the token using the KeycloakManager
- Retrieves or creates the user in the database
- Attaches the user to the request
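The four steps above can be sketched as a framework-free, Django-style middleware. This is an illustration under stated assumptions, not the actual KeycloakAuthenticationMiddleware: the class name, the injected `validate_token` and `get_or_create_user` callables, and the `ANONYMOUS` placeholder are all hypothetical, and the request is assumed to expose a `headers` mapping.

```python
from types import SimpleNamespace
from typing import Any, Callable, Dict, Optional

# Hypothetical stand-in for an anonymous user object.
ANONYMOUS = SimpleNamespace(is_authenticated=False)


class KeycloakAuthSketchMiddleware:
    """Callable built around get_response, in the style of Django middleware."""

    def __init__(
        self,
        get_response: Callable[[Any], Any],
        validate_token: Callable[[str], Optional[Dict[str, Any]]],
        get_or_create_user: Callable[[Dict[str, Any]], Any],
    ) -> None:
        self.get_response = get_response
        self.validate_token = validate_token
        self.get_or_create_user = get_or_create_user

    def __call__(self, request: Any) -> Any:
        # Step 1: take the token from either header (names compared case-insensitively).
        headers = {k.lower(): v for k, v in request.headers.items()}
        raw = headers.get("authorization") or headers.get("x-keycloak-token") or ""
        token = raw[7:].strip() if raw.lower().startswith("bearer ") else raw.strip()

        # Steps 2-4: validate, load or create the user, attach it to the request.
        request.user = ANONYMOUS
        if token:
            user_info = self.validate_token(token)
            if user_info:
                request.user = self.get_or_create_user(user_info)
        return self.get_response(request)
```

Injecting the validator and the user lookup keeps the sketch runnable without a Keycloak server or a database.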
# The middleware is added to MIDDLEWARE in settings.py
MIDDLEWARE = [
    # ...
    'authorization.middleware.KeycloakAuthenticationMiddleware',
    # ...
]
The ContextMiddleware in api/utils/middleware.py extracts additional context from requests, including:
- Authentication token
- Organization context
- Dataspace context
This middleware supports both standard OAuth Bearer tokens and custom Keycloak tokens via the x-keycloak-token header.
The system includes specialized permission classes for both REST and GraphQL APIs:
REST permissions are defined in authorization/permissions.py and extend Django REST Framework's permission classes.
GraphQL permissions are defined in specialized classes that check user roles and permissions:
class ViewDatasetPermission(DatasetPermissionGraphQL):
    def __init__(self) -> None:
        super().__init__(operation="view")

class ChangeDatasetPermission(DatasetPermissionGraphQL):
    def __init__(self) -> None:
        super().__init__(operation="change")
These classes are used in GraphQL resolvers to enforce permissions:
@strawberry.field(
    permission_classes=[IsAuthenticated, ViewDatasetPermission],
)
def get_dataset(self, dataset_id: uuid.UUID) -> TypeDataset:
    # ...
- Frontend Login:
  - User authenticates with Keycloak directly
  - Frontend receives and stores the Keycloak token
- API Requests:
  - Frontend includes the token in the Authorization: Bearer <token> header
  - The token must be a valid Keycloak JWT with a subject ID
- Token Validation:
  - KeycloakAuthenticationMiddleware intercepts the request and extracts the token
  - Token is validated by contacting Keycloak directly via keycloak_manager.validate_token()
  - The system verifies that the token contains a valid subject ID (sub)
  - If validation fails or the subject ID is missing, the user is treated as anonymous
- User Synchronization:
  - User data is synchronized with the database only if token validation succeeds
  - The system creates or updates the user based on the token information
  - No users are created if Keycloak validation fails
- Permission Checking:
  - Permission classes check if the user has the required role
  - Access is granted or denied based on the user's actual roles
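The validation and synchronization steps of this flow can be sketched as a single function. This is a simplified illustration, not the project's implementation: the `User` dataclass, the in-memory `users` dictionary standing in for the database, and the injected `validate_token` callable are hypothetical. The `preferred_username` claim name is also an assumption, based on the claim Keycloak commonly uses for the username.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional


@dataclass
class User:
    keycloak_id: str
    username: str = ""
    email: str = ""


def authenticate(
    token: str,
    validate_token: Callable[[str], Optional[Dict[str, Any]]],
    users: Dict[str, User],
) -> Optional[User]:
    """Return the synchronized user, or None for an anonymous request."""
    try:
        claims = validate_token(token)
    except Exception:
        # Validation failed: anonymous, and nothing is written to the store.
        return None
    sub = (claims or {}).get("sub")
    if not sub:
        # A token without a subject ID is treated as anonymous as well.
        return None

    # Create the user on first sight, otherwise refresh the stored fields.
    user = users.setdefault(sub, User(keycloak_id=sub))
    user.username = claims.get("preferred_username", user.username)
    user.email = claims.get("email", user.email)
    return user
```

Because the store is only touched after the `sub` check, a failed validation never creates a user record, which matches the behavior described in the flow.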
Roles are managed within the DataExchange database rather than in Keycloak. The system includes:
- Role Model: Defines permissions for viewing, adding, changing, and deleting resources
- OrganizationMembership: Maps users to organizations with specific roles
- DatasetPermission: Provides dataset-specific permissions
The init_roles management command initializes the default roles:
python manage.py init_roles
This creates the following default roles:
- admin: Full access (view, add, change, delete)
- editor: Can view, add, and change but not delete
- viewer: Read-only access
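As a quick illustration, the three default roles can be written as a lookup table from role name to allowed operations. This is a sketch only: the real roles live in the Role model in the database, and `DEFAULT_ROLES` and `role_allows` are hypothetical names.

```python
from typing import Dict, FrozenSet

# Operations per default role, as listed above.
DEFAULT_ROLES: Dict[str, FrozenSet[str]] = {
    "admin": frozenset({"view", "add", "change", "delete"}),
    "editor": frozenset({"view", "add", "change"}),
    "viewer": frozenset({"view"}),
}


def role_allows(role: str, operation: str) -> bool:
    """Return True if the role permits the operation; unknown roles permit nothing."""
    return operation in DEFAULT_ROLES.get(role, frozenset())
```

For example, `role_allows("editor", "delete")` is false, while `role_allows("admin", "delete")` is true.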
The system now exclusively supports the standard OAuth method for sending tokens from the frontend:
Authorization: Bearer <token>
The token extraction is robust and handles:
- Case-insensitive 'Bearer' prefix
- Proper whitespace trimming
- Raw tokens without the 'Bearer' prefix (though using the prefix is recommended)
All tokens must be valid Keycloak JWTs with a subject ID. The system does not support any development mode or fallback authentication mechanisms.
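The extraction rules listed above can be sketched as a small helper. This is a minimal illustration assuming the header value arrives as a plain string; `extract_token` is a hypothetical name, and the actual extraction code in the project may differ.

```python
from typing import Optional


def extract_token(header_value: Optional[str]) -> Optional[str]:
    """Return the bare token from an Authorization header value, or None."""
    if not header_value:
        return None
    value = header_value.strip()
    # Split off the first word and compare it to "bearer" case-insensitively.
    scheme, _, rest = value.partition(" ")
    if scheme.lower() == "bearer":
        token = rest.strip()
    else:
        # No recognized prefix: treat the whole value as a raw token.
        token = value
    return token or None
```

The helper only isolates the token string. Whether the token is accepted is still decided by validating it against Keycloak.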
The Keycloak integration includes comprehensive error handling with detailed logging:
- Token validation errors return anonymous users with specific error messages
- Missing subject ID in tokens is explicitly checked and logged
- User synchronization errors are logged with detailed information
- API views include try-except blocks with appropriate HTTP status codes
- All authentication components provide consistent error responses
- Token Validation: All tokens are validated with Keycloak before granting access
- Role-Based Access Control: Fine-grained permissions based on user roles
- Secure Headers: CORS configuration allows only specific headers
- Token Validation Failures:
  - Check Keycloak server URL and realm configuration
  - Verify token expiration and format
  - Ensure client secret is correct
  - Check that the token contains a valid subject ID (sub)
  - Verify that the Keycloak server is accessible and responding correctly
- Permission Errors:
  - Verify user has appropriate role assignments
  - Check organization memberships
  - Run the init_roles command if roles are missing
- User Synchronization Issues:
  - Check the logs for detailed error messages
  - Ensure the token contains all required user information (sub, email, username)
  - Verify database connectivity and permissions
- Migration Issues:
  - If encountering migration dependency issues, use the techniques described in the migration section
The implementation prioritizes runtime execution over strict type checking:
- Direct Token Validation: Tokens are validated directly with Keycloak for maximum security
- No Caching: User information is not cached to ensure every request uses fresh token validation
- Type Ignores: Strategic use of type ignores to maintain runtime functionality while avoiding circular imports
- Detailed Logging: Comprehensive logging for easier debugging and monitoring
To extend the authorization system:
- Add New Roles: Modify the init_roles command to include additional roles
- Custom Permissions: Create new permission classes that extend existing ones
- Additional Context: Extend the ContextMiddleware to include more context information
The Keycloak integration provides a secure, robust authentication system with no development mode or fallback mechanisms. By requiring valid Keycloak tokens for all authenticated operations, the system ensures that only real users with proper credentials can access protected resources. The authentication flow is designed to be reliable, with comprehensive error handling and detailed logging to facilitate debugging and monitoring.
By separating authentication from authorization, the system allows for fine-grained control over permissions while leveraging Keycloak's robust authentication capabilities. The implementation prioritizes runtime execution and security, ensuring that the system works correctly in production environments.