End-to-End-Sign-Language-AI-Translation-System

Computer Vision, Edge Computing, Cloud Computing, Blockchain

Towards Trustworthy Sign Language Translation: A Privacy-Preserving Edge–Cloud–Blockchain System

This repository accompanies the paper:

Nada Shahin, Leila Ismail
Towards Trustworthy Sign Language Translation System:
A Privacy-Preserving Edge–Cloud–Blockchain Approach
Mathematics, 2025, 13, 3759.
DOI: 10.3390/math13233759

📬 Contact

Prof. Leila Ismail
Intelligent Distributed Computing and Systems (INDUCE) Lab
College of Information Technology, United Arab Emirates University
leila@uaeu.ac.ae

📜 Citation

If you use this work, please cite: Shahin, Nada, and Leila Ismail. 2025. "Towards Trustworthy Sign Language Translation System: A Privacy-Preserving Edge–Cloud–Blockchain Approach" Mathematics 13, no. 23: 3759. https://doi.org/10.3390/math13233759

Overview:

This work envisions a new generation of trustworthy sign language translation systems that go beyond translation accuracy to address privacy, accountability, and real-world deployment. In response to the global shortage of certified sign language interpreters and the growing need for inclusive assistive technologies, the paper introduces a privacy-preserving, consent-aware sign language machine translation (SLMT) architecture that integrates edge computing, cloud intelligence, and blockchain governance. By operating on abstract keypoint representations rather than raw video and enforcing explicit, auditable user consent, the system enables real-time communication while safeguarding user rights and regulatory compliance.

At its core, the proposed system demonstrates that responsible AI and high performance are not competing goals. Through a comparative evaluation of Transformer-based models on large-scale and medical-domain datasets, the study shows that lightweight adaptive architectures can deliver accurate translations with substantially lower latency and computational cost in distributed environments. By embedding consent management and auditability directly into the AI pipeline, this work establishes a blueprint for ethically grounded, scalable assistive AI, with relevance extending beyond sign language translation to privacy-sensitive applications in healthcare and other biometric domains.

✨ Key Contributions

🔹 End-to-end edge–cloud–blockchain architecture for trustworthy SLMT
🔹 Consent-aware design supporting both application-level and system-level privacy
🔹 Comparative evaluation of:
- Encoder–Decoder Transformer
- Adaptive Transformer (ADAT)
🔹 New medical-domain dataset (MedASL) for sign-to-text translation
🔹 Comprehensive runtime analysis, including:
- Training time
- Inference latency
- Edge–cloud communication
- End-to-end system delay

System Architecture:

Our proposed end-to-end edge-cloud-blockchain system for SLMT is presented in Figure 1. It consists of the following modules:

Sign Language Recognition Module

Captures sign videos via camera input for keypoint extraction and processing.

AI-Enabled Translation Application
Acts as a gateway for user interaction and consent management.
Edge Computing Layer
- Keypoint extraction (MediaPipe)
- Preprocessing and real-time inference
- Reduced latency and improved privacy
Cloud Computing Layer
- Model training and retraining
- Dataset storage (with consent)
- Deployment of updated models
Blockchain Layer
- Immutable logging of:
  - User consent receipts
  - Policy versions
  - System certificates
  - Audit trails
The blockchain layer ensures transparency, integrity, and regulatory compliance.
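
To make the idea of immutable consent logging concrete, the sketch below chains each record to the hash of the previous one, so any later modification becomes detectable. It is only an in-memory illustration of tamper-evident logging, not the blockchain layer used in the paper; the class name `ConsentLedger` and the record fields are hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass, field

GENESIS_HASH = "0" * 64  # placeholder hash preceding the first entry


@dataclass
class ConsentLedger:
    """Append-only, hash-chained log (illustrative stand-in, not a real blockchain)."""
    entries: list = field(default_factory=list)

    @staticmethod
    def _digest(prev_hash: str, record: dict) -> str:
        # Serialize deterministically so the same record always hashes the same way.
        payload = json.dumps(record, sort_keys=True)
        return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()

    def append(self, record: dict) -> str:
        """Add a record linked to the previous entry and return its hash."""
        prev_hash = self.entries[-1]["hash"] if self.entries else GENESIS_HASH
        digest = self._digest(prev_hash, record)
        self.entries.append({"record": record, "prev": prev_hash, "hash": digest})
        return digest

    def verify(self) -> bool:
        """Recompute every hash; return False if any entry was altered."""
        prev_hash = GENESIS_HASH
        for entry in self.entries:
            expected = self._digest(prev_hash, entry["record"])
            if entry["prev"] != prev_hash or entry["hash"] != expected:
                return False
            prev_hash = entry["hash"]
        return True
```

Calling `verify()` after editing any stored record returns `False`, which is the property an audit trail relies on.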

🤖 Models Implemented

1. Encoder–Decoder Transformer

Baseline and widely adopted architecture for SLMT
Captures long-range spatiotemporal dependencies
Higher computational cost due to quadratic self-attention

2. Adaptive Transformer (ADAT)

Lightweight and efficient variant
Key features:
- LogSparse Self-Attention (O(n log n))
- Adaptive gating for short- and long-range dependencies
Demonstrates:
- Faster convergence
- Reduced model size
- Lower inference and communication latency
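
As a rough illustration of why LogSparse self-attention scales as O(n log n), the sketch below builds an attention mask in which each position attends to itself and to earlier positions at exponentially growing distances (1, 2, 4, ...). This follows the causal LogSparse pattern known from the time-series Transformer literature; whether ADAT uses exactly this pattern is an assumption, and `logsparse_mask` is a hypothetical helper, not a function from this repository.

```python
import numpy as np


def logsparse_mask(n: int) -> np.ndarray:
    """Return a boolean [n, n] mask; mask[i, j] is True if query i may attend to key j."""
    mask = np.zeros((n, n), dtype=bool)
    for i in range(n):
        mask[i, i] = True  # every position attends to itself
        step = 1
        while i - step >= 0:
            mask[i, i - step] = True  # positions i-1, i-2, i-4, i-8, ...
            step *= 2
    return mask


if __name__ == "__main__":
    mask = logsparse_mask(8)
    print(mask.astype(int))
    print("attended pairs:", int(mask.sum()), "of", mask.size)
```

Each row holds at most floor(log2(n)) + 2 active entries, so the total number of attended pairs grows as O(n log n) rather than the n² of dense self-attention.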

📊 Datasets

🔹 RWTH-PHOENIX-Weather-2014T (PHOENIX14T)

German Sign Language (DGS)
Weather domain
Large-scale, multi-signer benchmark dataset

🔹 MedASL (New Dataset)

American Sign Language (ASL)
Medical and healthcare conversations
Designed to reflect real-world assistive scenarios

Dataset characteristics, preprocessing pipelines, and splits are fully described in the paper.

⚙️ Preprocessing Pipeline

Keypoint extraction using MediaPipe
- Hands, face, pose, and iris landmarks
Normalization, rescaling, and padding
Sliding-window segmentation for inference
Tokenization and subword modeling for text output

This design improves privacy, efficiency, and robustness compared to raw RGB-based approaches.
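
The normalization and padding steps can be sketched as follows. This is a minimal illustration under assumptions: it uses per-feature min-max rescaling and zero padding, which may differ from the exact scheme in the paper, and the function names are hypothetical.

```python
import numpy as np


def normalize_keypoints(seq: np.ndarray, eps: float = 1e-8) -> np.ndarray:
    """Min-max rescale each feature dimension of a [T, D] sequence to roughly [0, 1]."""
    lo = seq.min(axis=0, keepdims=True)
    hi = seq.max(axis=0, keepdims=True)
    # eps avoids division by zero for features that are constant over time.
    return (seq - lo) / (hi - lo + eps)


def pad_or_truncate(seq: np.ndarray, target_len: int) -> np.ndarray:
    """Zero-pad (or cut) a [T, D] sequence to exactly target_len frames."""
    t, d = seq.shape
    if t >= target_len:
        return seq[:target_len]
    padding = np.zeros((target_len - t, d), dtype=seq.dtype)
    return np.concatenate([seq, padding], axis=0)


if __name__ == "__main__":
    frames = np.random.default_rng(0).random((5, 1662)).astype(np.float32)
    fixed = pad_or_truncate(normalize_keypoints(frames), target_len=8)
    print(fixed.shape)
```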

Demo & Code

Sign Language Machine Translation Edge/Cloud Demo with Transformer baseline

End-to-end edge-cloud pipeline for sign language translation, including keypoint capture via camera and model inference on edge + communication with cloud.

Experimental Setup:

Requirements

Edge: opencv-python, mediapipe, numpy, torch, requests, sentencepiece
Cloud: tensorflow, flask, numpy, pyyaml

Quick start

Clone the repository

git clone https://github.com/INDUCE-Lab/End-to-End-Sign-Language-AI-Translation-System.git
cd End-to-End-Sign-Language-AI-Translation-System

Train the translation model
1. Navigate to one of the translation models (Transformer/ or ADAT/)
```
cd Transformer
# or
cd ADAT
```
2. Install dependencies
```
pip install -r requirements.txt
```
3. Train the model. By default, this trains the model end-to-end on randomly generated video/gloss/text sequences:
```
python train.py --config config.yaml
```
4. Forward pass
```
python run.py
```
5. Train on real sign language datasets. To train on real datasets such as PHOENIX14T and ISL-CSLTR, replace the synthetic loader inside data.py with the following preprocessed tensors:
  - video_data: (N, T, 52, 65, 3) or (N, T, feature_dim)
  - gloss_indices: (N, max_g) # Padded integer sequences
  - text_indices: (N, max_t) # Padded integer sequences

Then run:
```
python train.py --config config.yaml
```
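
For orientation, the sketch below builds placeholder arrays with the three shapes listed in step 5, using the (N, T, feature_dim) variant for the video tensor. All sizes, the padding index, and the helper `pad_sequences` are hypothetical; how data.py actually consumes these arrays is not shown here and should be checked against the repository.

```python
import numpy as np

# Hypothetical sizes, for illustration only.
N, T, FEATURE_DIM = 8, 32, 1662
MAX_G, MAX_T = 10, 16
PAD_ID = 0  # assumed padding index; check the vocabulary used by the repo


def pad_sequences(seqs, max_len, pad_id=PAD_ID):
    """Right-pad (or truncate) integer sequences into an (N, max_len) int64 array."""
    out = np.full((len(seqs), max_len), pad_id, dtype=np.int64)
    for row, seq in enumerate(seqs):
        trimmed = seq[:max_len]
        out[row, :len(trimmed)] = trimmed
    return out


rng = np.random.default_rng(0)
video_data = rng.standard_normal((N, T, FEATURE_DIM)).astype(np.float32)
gloss_indices = pad_sequences([[3, 4, 5]] * N, MAX_G)
text_indices = pad_sequences([[7, 8, 9, 10]] * N, MAX_T)

print(video_data.shape, gloss_indices.shape, text_indices.shape)
```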
Prepare the cloud service
1. Place the following files under the model directory:
  - transformer.pth: best Transformer model checkpoint
  - special_ids.json: token IDs and sizes
  - medasl_bpe.model: SentencePiece-BPE model for text decoding
2. Install cloud dependencies and run:
```
pip install -r Cloud_requirements.txt
# run the server
python -m Cloud
# or
python Cloud.py
```
Configure the edge client
1. Open Edge.py and set:
  - CLOUD_UPLOAD_URL = "http://<CLOUD_HOST>:<PORT>/upload_keypoints"
  - CLOUD_MODEL_URL = "http://<CLOUD_HOST>:<PORT>/get_model"
  - Adjust SEQ_LEN, MAX_SAMPLES, and paths if needed.
2. Install edge dependencies and run:
```
pip install -r Edge_requirements.txt
# run the edge client
python -m Edge
# or
python Edge.py
```
3. Controls:
  - When running the edge Python script, a window will pop up showing landmarks and a caption ("translating.." appears until translation confidence/threshold is met).
  - Press u to force an upload + model fetch cycle.
  - Press q to quit.

Data flow

The edge periodically calls /get_model, /get_specials, and /get_spm to refresh its assets.
The camera captures frames; the edge extracts keypoints with MediaPipe and concatenates them into a vector of 1662 floats per frame.
Sliding window of length SEQ_LEN forms an array [SEQ_LEN, 1662].
When the sliding window is full, the window is saved locally (.npy).
The edge preprocesses the npy file.
Inference is run through the model to produce translations.
When the local cache reaches MAX_SAMPLES, the edge packs all windows into samples.npz and transmits them to the cloud, along with the translations.
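
The windowing and upload-packing steps above can be sketched as follows. Several details are assumptions rather than the repository's behavior: a stride of one frame between windows, in-memory storage instead of local .npy files, and the small `SEQ_LEN` and `MAX_SAMPLES` values. The 1662-float frame size is taken from the text; it matches 33×4 + 468×3 + 2×21×3 (pose, face, and two hands), though that layout is itself an assumption. `WindowCache` is a hypothetical name.

```python
import io
from collections import deque

import numpy as np

FRAME_DIM = 1662  # floats per frame, as stated in the data flow
SEQ_LEN = 4       # hypothetical window length
MAX_SAMPLES = 2   # hypothetical cache size that triggers an upload


class WindowCache:
    """Collect sliding windows of frames and pack them once the cache is full."""

    def __init__(self, seq_len=SEQ_LEN, max_samples=MAX_SAMPLES):
        self.frames = deque(maxlen=seq_len)  # oldest frame drops out automatically
        self.windows = []
        self.max_samples = max_samples

    def push(self, frame):
        """Add one frame; return packed .npz bytes when the cache is full, else None."""
        self.frames.append(np.asarray(frame, dtype=np.float32))
        if len(self.frames) < self.frames.maxlen:
            return None  # window not yet full
        self.windows.append(np.stack(self.frames))  # one [SEQ_LEN, FRAME_DIM] window
        if len(self.windows) < self.max_samples:
            return None
        buffer = io.BytesIO()
        np.savez_compressed(buffer, samples=np.stack(self.windows))
        self.windows.clear()
        return buffer.getvalue()  # bytes that could be sent to the cloud
```

The returned bytes correspond to the samples.npz payload; sending them (for example with `requests.post`) and attaching the translations is left out of the sketch.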

Configuration notes

Confidence gate: CONF_THRESH on edge controls when a translation becomes the displayed caption.
Warm-up: MIN_WINDOWS_BEFORE_DISPLAY skips the first windows for stabilization before translating sign language to text.
Local inference: toggle with RUN_LOCAL_INFERENCE on edge. If enabled, the edge loads MODEL_PATH and detokenizes with medasl_bpe.model + special_ids.json.
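
The interplay between the warm-up and the confidence gate can be sketched as below. This is a minimal illustration, assuming the model yields one (text, confidence) pair per window; how Edge.py actually computes confidence is not specified here, and the default values and the `CaptionGate` class are hypothetical.

```python
from dataclasses import dataclass

# Hypothetical defaults; the real values are set in the edge script.
CONF_THRESH = 0.6
MIN_WINDOWS_BEFORE_DISPLAY = 3


@dataclass
class CaptionGate:
    """Decide which caption to display after each inference window."""
    conf_thresh: float = CONF_THRESH
    min_windows: int = MIN_WINDOWS_BEFORE_DISPLAY
    windows_seen: int = 0
    caption: str = "translating.."

    def update(self, text: str, confidence: float) -> str:
        """Record one window's result and return the caption to show."""
        self.windows_seen += 1
        warmed_up = self.windows_seen > self.min_windows
        # Replace the caption only after warm-up and for confident predictions.
        if warmed_up and confidence >= self.conf_thresh:
            self.caption = text
        return self.caption
```

With this design, a low-confidence window leaves the previously displayed caption in place instead of flickering back to the placeholder.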

Deployment

Set the edge CLOUD_* URLs to the cloud host/IP and open the cloud port in the firewall.
Run the edge client from the repo root:

python -m Edge
# or
python Edge.py

Run the cloud service the same way:

python -m Cloud
# or
python Cloud.py

📄 License

This project is released under the Creative Commons Attribution (CC BY 4.0) License, consistent with the published article.
