🏛️ University Knowledge Graph System

This repository contains the complete implementation of the University Knowledge Graph project developed for the Knowledge Engineering and Ontologies course.The project structurally models, validates, and queries the academic ecosystem of the Manisa Celal Bayar University (MCBU) Engineering Faculty, focusing on the Computer, Industrial, and Mechanical Engineering departments.

🎯 Project Objective

The primary objective of this project is to resolve the challenge of fragmented and siloed academic data across university faculties. By designing a unified Semantic Web application, this system:

Formalizes the Domain: Captures complex relationships between students, academic staff (professors and research assistants), courses (mandatory and elective), and academic sub-fields using an OWL 2 ontology.
Ensures Data Integrity: Employs SHACL shapes to validate essential business constraints, such as mandatory student IDs and departmental affiliations.
Enables Intelligent Querying: Leverages SPARQL for logical reasoning and automated deduction (e.g., transitive prerequisite chains).
Integrates Large Language Models (LLMs): Bridges the gap between natural language and structured query logic by translating user questions into ontology-grounded SPARQL queries using the Claude API, drastically minimizing AI hallucinations.

📊 Dataset Sources

]Rather than relying on static or synthetic datasets, real-time data extraction was implemented to reflect the authentic academic ecosystem of the faculty:

MCBU Course Information System: Used to scrape course codes, names, semester data, and multi-tier prerequisite chains.
AVESIS (Academic Data Management System): Used to extract academic staff attributes, including titles, institutional email addresses, and official profile URLs.

Note: Data was collected using an automated Python pipeline leveraging Selenium for dynamic content handling and BeautifulSoup for HTML parsing, cleaned with Pandas, and mapped to the structural schema (TBox) to generate 1,977 RDF triples.

📂 Repository Structure

The repository is organized as follows to fulfill all semantic web and deployment constraints:

├── ontology/
│   └── university-knowledge-graph-v2.rdf  # Core OWL ontology file (Turtle format)
├── shacl/
│   └── shacl-shapes.ttl                   # SHACL constraints for data validation
├── queries/
│   ├── query1_software_staff.sparql       # SPARQL query for sub-field staff retrieval
│   ├── query2_transitive_prereq.sparql    # SPARQL query testing transitive prerequisites
│   ├── query3_professor_emails.sparql     # SPARQL query for professor contact details
│   └── query4_student_enrollments.sparql  # SPARQL query for per-student schedules
├── src/
│   ├── scraper.py                         # Python automation script (Selenium & BeautifulSoup)
│   ├── rdf_mapper.py                      # Relational CSV-to-RDF triple generator
│   ├── validator.py                       # pySHACL validation engine script
│   └── llm_interface.py                   # Claude API Natural Language to SPARQL pipeline
├── docs/                                  # WIDOCO automated ontology documentation
│   └── index.html                         # Interactive visualization and documentation
└── README.md                              # Project documentation overview

1. Running Data Validation (SHACL)

To test data integrity and see the validation constraints in action (including the intentional validation failure of the incomplete :wrong_student record):

python src/validator.py

2. Executing SPARQL Queries Programmatically

You can run the predefined SPARQL queries on the loaded Turtle memory graph:

python src/rdf_mapper.py

Alternatively, you can open ontology/university-knowledge-graph-v2.rdf in Protégé or GraphDB to execute queries directly inside their respective SPARQL Query tabs.

3. Running the LLM Integration (Natural Language Interface)

To launch the semantic natural language interface using Claude Sonnet, make sure to export your API key first:

export ANTHROPIC_API_KEY="your-api-key-here"
python src/llm_interface.py

👥 Team Members

This project was developed collaboratively by:

Efe Şamil Sarıgül

Ayça Selda Keskin

For a detailed look into the TBox/ABox conceptual decisions, axiom justifications, and technical outputs, please explore the interactive documentation inside the docs/ folder or read our full project report.

Name		Name	Last commit message	Last commit date
Latest commit History 47 Commits
README.md		README.md
Widoco.zip		Widoco.zip
graph_try.zip		graph_try.zip
proje 1.docx		proje 1.docx
proje 2.docx		proje 2.docx
shacl_try.zip		shacl_try.zip
university-knowledge-graph-v2.rdf		university-knowledge-graph-v2.rdf
university-knowledge-graph.rdf		university-knowledge-graph.rdf
widoco (2).zip		widoco (2).zip

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

🏛️ University Knowledge Graph System

🎯 Project Objective

📊 Dataset Sources

📂 Repository Structure

1. Running Data Validation (SHACL)

2. Executing SPARQL Queries Programmatically

3. Running the LLM Integration (Natural Language Interface)

👥 Team Members

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Folders and files

Latest commit

History

Repository files navigation

🏛️ University Knowledge Graph System

🎯 Project Objective

📊 Dataset Sources

📂 Repository Structure

1. Running Data Validation (SHACL)

2. Executing SPARQL Queries Programmatically

3. Running the LLM Integration (Natural Language Interface)

👥 Team Members

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Packages