alkidbaci
diff --git a/.github/workflows/test.yml b/.github/workflows/test.yml
Lines changed: 24 additions & 0 deletions
diff --git a/LICENSE b/LICENSE
Lines changed: 21 additions & 661 deletions
diff --git a/README.md b/README.md
Lines changed: 4 additions & 81 deletions
diff --git a/examples/evaluation_table_generator.py b/examples/evaluation_table_generator.py
Lines changed: 1 addition & 1 deletion
diff --git a/ontolearn_light/abstracts.py b/ontolearn_light/abstracts.py
Lines changed: 4 additions & 10 deletions
diff --git a/ontolearn_light/concept_generator.py b/ontolearn_light/concept_generator.py
Lines changed: 0 additions & 3 deletions
diff --git a/ontolearn_light/data_struct.py b/ontolearn_light/data_struct.py
Lines changed: 0 additions & 1 deletion
diff --git a/ontolearn_light/ea_utils.py b/ontolearn_light/ea_utils.py
Lines changed: 0 additions & 1 deletion
diff --git a/ontolearn_light/knowledge_base.py b/ontolearn_light/knowledge_base.py
Lines changed: 16 additions & 35 deletions
@@ -0,0 +1,24 @@
+name: Python package
+
+on: [push,pull_request]
+jobs:
+  build:
+    runs-on: ubuntu-latest
+    strategy:
+      matrix:
+        python-version: ["3.10.13"]
+
+    steps:
+      - uses: actions/checkout@v3
+      - name: Set up Python ${{ matrix.python-version }}
+        uses: actions/setup-python@v4
+        with:
+          python-version: ${{ matrix.python-version }}
+      - name: Install dependencies
+        run: |
+          python -m pip install --upgrade pip
+          pip install -e .
+      - name: Test with pytest
+        run: |
+          pip install pytest
+          python -m pytest -p no:warnings -x
@@ -1,5 +1,9 @@
 # OntoSample
 
+[![Downloads](https://static.pepy.tech/badge/ontosample)](https://pepy.tech/project/ontosample)
+[![Downloads](https://img.shields.io/pypi/dm/ontosample)](https://pypi.org/project/ontosample/)
+[![Pypi](https://img.shields.io/badge/pypi-0.2.6-blue)](https://pypi.org/project/ontosample/0.2.6/)
+
 OntoSample is a python package that offers classic sampling techniques for OWL ontologies/knowledge 
 bases. Furthermore, we have tailored the classic sampling techniques to the setting of concept 
 learning making use of learning problem.
@@ -47,87 +51,6 @@ sampler.save_sample(kb=sampled_kb, filename='sampled_kb')
 
 Check the [examples](https://github.com/alkidbaci/OntoSample/tree/main/examples) folder for more.
 
-
-## About the paper
-
-### Abstact
-
-Node classification is an important task in many fields, e.g., predicting entity types in knowledge graphs, classifying papers in citation
-graphs, or classifying nodes in social networks. In many cases, it
-is crucial to explain why certain predictions are made. Towards
-this end, concept learning has been proposed as a means of interpretable node classification: given positive and negative examples
-in a knowledge base, concepts in description logics are learned that
-serve as classification models. However, state-of-the-art concept
-learners, including EvoLearner and CELOE exhibit long runtimes.
-In this paper, we propose to accelerate concept learning with graph
-sampling techniques. We experiment with seven techniques and tailor them to the setting of concept learning. In our experiments, we
-achieve a reduction in training size by over 90% while maintaining
-a high predictive performance.
-
-### Reproducing paper results
-
-You will find in examples folder the script used to generate the results in paper.
-`evaluation_table_generator.py` generates every result for each dataset-sampler-sampling_size 
-combination and store them in a csv.
-
-#### To generate results of Table 2
-Install the whole ontolearn package to use its learning algorithms like EvoLearner and CELOE because 
-they are not included here to keep the number of dependencies low.
-
-```shell
-pip install ontolearn
-```
-
-The evaluation results for a certain sampling percentage can be simply reproduced by using `examples/evaluation_table_generator.py`.
-
-There are the following arguments that the user can give:
-- `learner` &rarr; type of learner: 'evolerner' or 'celeo'.
-- `datasets_and_lp` &rarr; list containing the name of the json files that contains the path to the knowledge graph and
-                           the learning problem.
-- `samplers` &rarr; list of the abbreviation of the samplers as strings.
-- `csv_path` &rarr; path of the csv file to save the results.
-- `sampling_size` &rarr; the sampling percentage
-- `iterations` &rarr; number of iterations for each sampler
-
-Table 2 results can be  generated using the following instructions:
-
-1. Execute the script `evaluation_table_generator.py` using the default parameters.
-2. After the script has finished executing, set the argument `--learner` to `celoe`
-3. Set the csv path to another path by using the `--csv_path` argument.
-4. Execute again.
-
-In the end you will have 2 csv files, one for each learner.
-
-> **Note 1**: Not all datasets are included in the project because some of them are too large.
-> You can download all the SML-bench datasets [here](https://github.com/SmartDataAnalytics/SML-Bench/tree/updates/learningtasks).
-> They need to go to their respective folder named after them inside KGs directory.
-
-> **Note 2**: Keep in mind that this file needs a considerable amount of time to execute (more than 40 hours for each concept learner
-> depending on the machine specifications) when using the default values which were also used to construct 
-> the results for the paper. 
-> 
-> If you want quicker execution, you can enter a lower number of iterations.
-
----------------------------------------------------
-
-#### To generate results of Figure 1
-
-To generate results used in Figure 1 you need to follow the instructions below
-when writing the command to execute the script `examples/evaluation_table_generator.py`:
-
-
-```shell
-cd examples
-python evaluation_table_generator.py --datasets_and_lp {"hepatitis_lp.json", "carcinogenesis_lp.json"} --samplers {"RNLPC", "RWJLPC", "RWJPLPC", "RELPC", "FFLPC"} --sampling_size 0.25
-```
-
-Repeat the command for sampling sizes of `0.20`, `0.15`, `0.10`, `0.5`
-
-
-> **Note:** Make sure to set a different csv path using the `--csv_path` argument each time you execute to avoid
-> overriding the previous results.
-
-
 ### Citing
 
 ```
 
@@ -131,7 +131,7 @@ def start(args):
                         p = set(examples['positive_examples'])
                         n = set(examples['negative_examples'])
                         for individual in removed_individuals:
-                            individual_as_str = individual.get_iri().as_str()
+                            individual_as_str = individual.str
                             if individual_as_str in p:
                                 p.remove(individual_as_str)
                             if individual_as_str in n:
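
Note on the hunk above: the loop removes every sampled-out individual from the positive and negative example sets, keyed by the IRI string that the updated line reads from `individual.str`. A minimal sketch of the same pruning as a set difference follows. `FakeIndividual` and `prune_examples` are hypothetical names used only for illustration and are not part of the repository; the only assumption carried over from the diff is that an individual exposes its IRI through a `.str` attribute.

```python
from typing import Iterable, Set, Tuple


class FakeIndividual:
    """Hypothetical stand-in for an individual that exposes its IRI as `.str`."""

    def __init__(self, iri: str) -> None:
        self.str = iri


def prune_examples(positives: Iterable[str],
                   negatives: Iterable[str],
                   removed: Iterable[FakeIndividual]) -> Tuple[Set[str], Set[str]]:
    """Drop every removed individual from both example sets."""
    # Collect the IRIs once so each membership check is O(1).
    removed_iris = {individual.str for individual in removed}
    return set(positives) - removed_iris, set(negatives) - removed_iris


p, n = prune_examples(
    positives=["http://example.org#a", "http://example.org#b"],
    negatives=["http://example.org#c"],
    removed=[FakeIndividual("http://example.org#b"),
             FakeIndividual("http://example.org#c")],
)
```

The set difference behaves like the explicit `if ... in p: p.remove(...)` checks in the script, since IRIs missing from a set are simply ignored.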
 
@@ -4,20 +4,18 @@
 from abc import ABCMeta, abstractmethod
 from typing import Set, List, Tuple, Iterable, TypeVar, Generic, ClassVar, Optional
 from owlapy.class_expression import OWLClassExpression
-from owlapy.owl_ontology import OWLOntology
+from owlapy.owl_ontology import Ontology
 from owlapy.utils import iter_count
 from .data_struct import Experience
 from .utils import read_csv
 from collections import OrderedDict
-
+from owlapy import owl_expression_to_dl
 _N = TypeVar('_N')  #:
 _KB = TypeVar('_KB', bound='AbstractKnowledgeBase')  #:
 
 logger = logging.getLogger(__name__)
 
-# @TODO:CD: Each Class definiton in abstract.py should share a prefix, e.g., BaseX or AbstractX.
-# @TODO:CD: All imports must be located on top of the script
-from owlapy import owl_expression_to_dl
+
 class EncodedLearningProblem(metaclass=ABCMeta):
     """Encoded Abstract learning problem for use in Scorers."""
     __slots__ = ()
@@ -28,7 +26,6 @@ class EncodedPosNegLPStandardKind(EncodedLearningProblem, metaclass=ABCMeta):
     __slots__ = ()
 
 
-# @TODO: Why we need Generic[_N] and if we need it why we di not use it in all other abstract classes?
 class AbstractScorer(Generic[_N], metaclass=ABCMeta):
     """
     An abstract class for quality functions.
@@ -54,7 +51,6 @@ def score_elp(self, instances: set, learning_problem: EncodedLearningProblem) ->
         """
         if len(instances) == 0:
             return False, 0
-        # @TODO: It must be moved to the top of the abstracts.py
         from ontolearn_light.learning_problem import EncodedPosNegLPStandard
         if isinstance(learning_problem, EncodedPosNegLPStandard):
             tp = len(learning_problem.kb_pos.intersection(instances))
@@ -82,7 +78,6 @@ def score2(self, tp: int, fn: int, fp: int, tn: int) -> Tuple[bool, Optional[flo
         """
         pass
 
-    # @TODO:CD: Why there is '..' in AbstractNode
     def apply(self, node: 'AbstractNode', instances, learning_problem: EncodedLearningProblem) -> bool:
         """Apply the quality function to a search tree node after calculating the quality score on the given instances.
 
@@ -99,7 +94,6 @@ def apply(self, node: 'AbstractNode', instances, learning_problem: EncodedLearni
             f'Expected EncodedLearningProblem but got {type(learning_problem)}'
         assert isinstance(node, AbstractNode), \
             f'Expected AbstractNode but got {type(node)}'
-        # @TODO: It must be moved to the top of the abstracts.py
         from ontolearn_light.search import _NodeQuality
         assert isinstance(node, _NodeQuality), \
             f'Expected _NodeQuality but got {type(_NodeQuality)}'
@@ -331,7 +325,7 @@ class AbstractKnowledgeBase(metaclass=ABCMeta):
 
     # CD: This function is used as "a get method". Insteadf either access the atttribute directly
     # or use it as a property @abstractmethod
-    def ontology(self) -> OWLOntology:
+    def ontology(self) -> Ontology:
         """The base ontology of this knowledge base."""
         pass
 
 
@@ -52,7 +52,6 @@ def union_from_iterables(a_operands: Iterable[OWLClassExpression],
                              b_operands: Iterable[OWLClassExpression]) -> Iterable[OWLObjectUnionOf]:
         """ Create an union of each class expression in a_operands with each class expression in b_operands."""
         assert (isinstance(a_operands, Generator) is False) and (isinstance(b_operands, Generator) is False)
-        # TODO: if input sizes say 10^4, we can employ multiprocessing
         seen = set()
         for i in a_operands:
             for j in b_operands:
@@ -73,8 +72,6 @@ def intersection(self, ops: Iterable[OWLClassExpression]) -> OWLObjectIntersecti
         Returns:
             Intersection with all operands (intersections are merged).
         """
-        # TODO CD: I would rather prefer def intersection(self, a: OWLClassExpression, b: OWLClassExpression). This is
-        # TODO CD: more advantages as one does not need to create a tuple of a list before intersection two expressions.
         operands: List[OWLClassExpression] = []
         for c in ops:
             if isinstance(c, OWLObjectIntersectionOf):
 
@@ -91,7 +91,6 @@ class Experience:
     """
 
     def __init__(self, maxlen: int):
-        # @TODO we may want to not forget experiences yielding high rewards
         self.current_states = deque(maxlen=maxlen)
         self.next_states = deque(maxlen=maxlen)
         self.rewards = deque(maxlen=maxlen)
 
@@ -136,7 +136,6 @@ def ind_to_string(ind: List[Tree]) -> str:
     return ''.join([prim.name for prim in ind])
 
 
-# TODO: Ugly hack for now
 def owlliteral_to_primitive_string(lit: OWLLiteral, pe: Optional[Union[OWLDataProperty, OWLObjectProperty]] = None) \
         -> str:
     str_ = type(lit.to_python()).__name__ + escape(lit.get_literal())
 
@@ -14,15 +14,12 @@
 from owlapy.owl_datatype import OWLDatatype
 from owlapy.owl_individual import OWLNamedIndividual
 from owlapy.owl_literal import BooleanOWLDatatype, NUMERIC_DATATYPES, DoubleOWLDatatype, TIME_DATATYPES, OWLLiteral
-from owlapy.owl_ontology import OWLOntology
-from owlapy.owl_ontology_manager import OWLOntologyManager
 from owlapy.owl_property import OWLObjectProperty, OWLDataProperty, OWLObjectPropertyExpression, \
     OWLDataPropertyExpression
-from owlapy.owl_reasoner import OWLReasoner
 
 from owlapy.owl_ontology import Ontology
 from owlapy.owl_ontology_manager import OntologyManager
-from owlapy.owl_reasoner import FastInstanceCheckerReasoner, OntologyReasoner
+from owlapy.owl_reasoner import StructuralReasoner
 
 from owlapy.render import DLSyntaxObjectRenderer
 from ontolearn_light.search import EvaluatedConcept
@@ -35,18 +32,16 @@
 from .utils.static_funcs import (init_length_metric, init_hierarchy_instances,
                                  init_named_individuals, init_individuals_from_concepts)
 
-from owlapy.class_expression import OWLDataMaxCardinality, OWLDataSomeValuesFrom
-from owlapy import owl_expression_to_sparql, owl_expression_to_dl
+from owlapy.class_expression import OWLDataSomeValuesFrom
 from owlapy.owl_data_ranges import OWLDataRange
 from owlapy.class_expression import OWLDataOneOf
 
 logger = logging.getLogger(__name__)
 
 
-def depth_Default_ReasonerFactory(onto: OWLOntology) -> OWLReasoner:
+def depth_Default_ReasonerFactory(onto: Ontology) -> StructuralReasoner:
     assert isinstance(onto, Ontology)
-    base_reasoner = OntologyReasoner(ontology=onto)
-    return FastInstanceCheckerReasoner(ontology=onto, base_reasoner=base_reasoner)
+    return StructuralReasoner(ontology=onto, class_cache=False, property_cache=False)
 
 
 class KnowledgeBase(AbstractKnowledgeBase):
@@ -89,9 +84,9 @@ class KnowledgeBase(AbstractKnowledgeBase):
     @overload
     def __init__(self, *,
                  path: str,
-                 ontologymanager_factory: Callable[[], OWLOntologyManager] = OntologyManager(
+                 ontologymanager_factory: Callable[[], OntologyManager] = OntologyManager(
                      world_store=None),
-                 reasoner_factory: Callable[[OWLOntology], OWLReasoner] = None,
+                 reasoner_factory: Callable[[Ontology], StructuralReasoner] = None,
                  length_metric: Optional[OWLClassExpressionLengthMetric] = None,
                  length_metric_factory: Optional[Callable[[], OWLClassExpressionLengthMetric]] = None,
                  individuals_cache_size=128,
@@ -101,8 +96,8 @@ def __init__(self, *,
 
     @overload
     def __init__(self, *,
-                 ontology: OWLOntology,
-                 reasoner: OWLReasoner,
+                 ontology: Ontology,
+                 reasoner: StructuralReasoner,
                  load_class_hierarchy: bool = True,
                  length_metric: Optional[OWLClassExpressionLengthMetric] = None,
                  length_metric_factory: Optional[Callable[[], OWLClassExpressionLengthMetric]] = None,
@@ -112,12 +107,12 @@ def __init__(self, *,
     def __init__(self, *,
                  path: Optional[str] = None,
 
-                 ontologymanager_factory: Optional[Callable[[], OWLOntologyManager]] = None,
-                 reasoner_factory: Optional[Callable[[OWLOntology], OWLReasoner]] = None,
+                 ontologymanager_factory: Optional[Callable[[], OntologyManager]] = None,
+                 reasoner_factory: Optional[Callable[[Ontology], StructuralReasoner]] = None,
                  length_metric_factory: Optional[Callable[[], OWLClassExpressionLengthMetric]] = None,
 
-                 ontology: Optional[OWLOntology] = None,
-                 reasoner: Optional[OWLReasoner] = None,
+                 ontology: Optional[Ontology] = None,
+                 reasoner: Optional[StructuralReasoner] = None,
                  length_metric: Optional[OWLClassExpressionLengthMetric] = None,
 
                  individuals_cache_size=128,
@@ -152,14 +147,13 @@ def __init__(self, *,
                     self.manager.save_world()
                     logger.debug("Synced world to backend store")
 
-        reasoner: OWLReasoner
+        reasoner: StructuralReasoner
         if reasoner is not None:
             self.reasoner = reasoner
         elif reasoner_factory is not None:
             self.reasoner = reasoner_factory(self.ontology)
         else:
-            self.reasoner = FastInstanceCheckerReasoner(ontology=self.ontology, base_reasoner=OntologyReasoner(
-                                                                                                ontology=self.ontology))
+            self.reasoner = StructuralReasoner(ontology=self.ontology, class_cache=False, property_cache=False)
 
         self.length_metric = init_length_metric(length_metric, length_metric_factory)
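
Note on the hunk above: the constructor keeps its three-step precedence for choosing a reasoner (an explicit `reasoner`, then `reasoner_factory`, then a built-in default), and only the default changes. A minimal sketch of that precedence, assuming nothing about owlapy itself: `StubOntology`, `StubReasoner`, and `resolve_reasoner` are hypothetical placeholders, not classes or functions from owlapy or this repository.

```python
from typing import Callable, Optional


class StubOntology:
    """Hypothetical placeholder for a loaded ontology."""


class StubReasoner:
    """Hypothetical placeholder for a reasoner bound to an ontology."""

    def __init__(self, ontology: StubOntology, label: str = "default") -> None:
        self.ontology = ontology
        self.label = label


def resolve_reasoner(
    ontology: StubOntology,
    reasoner: Optional[StubReasoner] = None,
    reasoner_factory: Optional[Callable[[StubOntology], StubReasoner]] = None,
) -> StubReasoner:
    """Pick a reasoner: explicit instance first, then factory, then default."""
    if reasoner is not None:
        return reasoner
    if reasoner_factory is not None:
        return reasoner_factory(ontology)
    # Fallback mirrors the `else` branch in the diff, with a stub default.
    return StubReasoner(ontology)


onto = StubOntology()
explicit = resolve_reasoner(onto, reasoner=StubReasoner(onto, label="explicit"))
from_factory = resolve_reasoner(
    onto, reasoner_factory=lambda o: StubReasoner(o, label="factory"))
fallback = resolve_reasoner(onto)
```

An explicit instance wins even when a factory is also supplied, which matches the order of the `if`/`elif`/`else` branches in the constructor.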
 
@@ -317,7 +311,6 @@ def tbox(self, entities: Union[Iterable[OWLClass], Iterable[OWLDataProperty], It
          If no concept-s|propert-y/ies are given, get all tbox axioms.
 
          Args:
-             @TODO: entities or namedindividuals ?!
              entities: Entities to obtain tbox axioms from. This can be a single
               OWLClass/OWLDataProperty/OWLObjectProperty object, a list of those objects or None. If you enter a list
               that combines classes and properties (which we don't recommend doing), only axioms for one type will be
@@ -525,8 +518,6 @@ def concept_len(self, ce: OWLClassExpression) -> int:
         Returns:
             Length of the concept.
         """
-        # @TODO: CD: Computing the length of a concept should be disantangled from KB
-        # @TODO: CD: Ideally, this should be a static function
 
         return self.length_metric.length(ce)
 
@@ -550,7 +541,7 @@ def cache_individuals(self, ce: OWLClassExpression) -> None:
             raise TypeError
         if ce in self.ind_cache:
             return
-        if isinstance(self.reasoner, FastInstanceCheckerReasoner):
+        if isinstance(self.reasoner, StructuralReasoner):
             self.ind_cache[ce] = self.reasoner._find_instances(ce)  # performance hack
         else:
             temp = self.reasoner.instances(ce)
@@ -666,8 +657,6 @@ def data_properties_for_domain(self, domain: OWLClassExpression, data_properties
 
     def encode_learning_problem(self, lp: PosNegLPStandard):
         """
-        @TODO: A learning problem (DL concept learning problem) should not be a part of a knowledge base
-
         Provides the encoded learning problem (lp), i.e. the class containing the set of OWLNamedIndividuals
         as follows:
             kb_pos --> the positive examples set,
@@ -720,8 +709,6 @@ def evaluate_concept(self, concept: OWLClassExpression, quality_func: AbstractSc
                          encoded_learning_problem: EncodedLearningProblem) -> EvaluatedConcept:
         """Evaluates a concept by using the encoded learning problem examples, in terms of Accuracy or F1-score.
 
-        @ TODO: A knowledge base is a data structure and the context of "evaluating" a concept seems to be unrelated
-
         Note:
             This method is useful to tell the quality (e.q) of a generated concept by the concept learners, to get
             the set of individuals (e.inds) that are classified by this concept and the amount of them (e.ic).
@@ -752,22 +739,16 @@ def get_leaf_concepts(self, concept: OWLClass):
 
     def get_least_general_named_concepts(self) -> Generator[OWLClass, None, None]:
         """Get leaf classes.
-        @TODO: Docstring needed
-        Returns:
         """
         yield from self.class_hierarchy.leaves()
 
     def least_general_named_concepts(self) -> Generator[OWLClass, None, None]:
         """Get leaf classes.
-        @TODO: Docstring needed
-        Returns:
         """
         yield from self.class_hierarchy.leaves()
 
     def get_most_general_classes(self) -> Generator[OWLClass, None, None]:
-        """Get most general named concepts classes.
-        @TODO: Docstring needed
-        Returns:"""
+        """Get most general named concepts classes."""
         yield from self.class_hierarchy.roots()
 
     def get_direct_sub_concepts(self, concept: OWLClass) -> Iterable[OWLClass]: