GitHub - PharminfoVienna/cmpOffTargets: The visualisation for the manuscript "Identifying Differences in the Performance of Machine Learning Models for off-Targets trained on publicly available and proprietary datasets"

Comparing the prediction space of models

This repository consists of:

The visualisation engine
The data retrieval Jupyter notebooks
The KNIME workflows for ML model training and prediction

KNIME workflows

Two KNIME workflows are provided: one for generating off-target ML models and one for evaluating them. The Generate_off-target_ml-model.knwf workflow performs hyperparameter search and 5-fold cross-validation and generates the final models, which can then be evaluated in the Test_models_fin.knwf workflow.
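For readers unfamiliar with 5-fold cross-validation, the splitting step can be sketched in plain Python. This is an illustration of the general technique only, not an export of the KNIME workflow; the function name k_fold_indices, the seed, and the sample count are hypothetical.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    # Shuffle once with a fixed seed so the folds are reproducible.
    random.Random(seed).shuffle(indices)
    # Strided slicing gives k folds whose sizes differ by at most one.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, test_idx


# Hypothetical data set of 23 samples split into 5 folds.
splits = list(k_fold_indices(n_samples=23, k=5))
for fold_number, (train_idx, test_idx) in enumerate(splits, start=1):
    print(f"fold {fold_number}: {len(train_idx)} train / {len(test_idx)} test")
```

Each sample lands in exactly one test fold, so every model configuration is scored on data it was not trained on.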

KNIME version == 4.6

Visualisation

The visualisation of the prediction space of the ChEMBL and Naga et al. models helps to understand the performance of these models and to identify differences between them. It is created using Uniform Manifold Approximation and Projection (UMAP) with a custom distance function that weights the structural similarity of the molecules at 95% and their proximity in the prediction space at 5%.
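A weighted distance of this kind can be sketched as follows. Only the 95%/5% weighting comes from the text above; the choice of Tanimoto distance on fingerprint bit sets for the structural term, the scaled Euclidean distance for the prediction term, and all function names are assumptions made for illustration and may differ from what viz.py actually does.

```python
import math

STRUCTURE_WEIGHT = 0.95   # weight of structural dissimilarity (from the README)
PREDICTION_WEIGHT = 0.05  # weight of prediction-space distance (from the README)


def tanimoto_distance(bits_a, bits_b):
    """1 - Tanimoto similarity for fingerprints given as sets of on-bit indices."""
    if not bits_a and not bits_b:
        return 0.0
    shared = len(bits_a & bits_b)
    return 1.0 - shared / len(bits_a | bits_b)


def prediction_distance(preds_a, preds_b):
    """Root-mean-square difference of probability vectors; stays within [0, 1]."""
    squared = sum((a - b) ** 2 for a, b in zip(preds_a, preds_b))
    return math.sqrt(squared / len(preds_a))


def combined_distance(mol_a, mol_b):
    """Weighted sum of structural and prediction-space distance.

    Each molecule is a (fingerprint_bits, predictions) pair (assumed layout).
    """
    bits_a, preds_a = mol_a
    bits_b, preds_b = mol_b
    return (STRUCTURE_WEIGHT * tanimoto_distance(bits_a, bits_b)
            + PREDICTION_WEIGHT * prediction_distance(preds_a, preds_b))


# Hypothetical molecules: on-bit indices plus one predicted probability per model.
mol_a = ({1, 5, 9, 12}, (0.9, 0.2))
mol_b = ({1, 5, 9, 40}, (0.1, 0.2))
print(combined_distance(mol_a, mol_b))
```

Because both terms are scaled to [0, 1], the weights directly express each term's share of the total. A pairwise matrix built from such a function could be handed to umap-learn with metric="precomputed", assuming that library is used for the projection.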

To create this visualisation, we selected 10,000 random molecules from ChEMBL and projected them using UMAP with the custom distance function. The visualisation is interactive and allows detailed exploration of the prediction space of the models.

Requirements

rdkit for drawing molecules
plotly and dash for visualising
dash-bootstrap-components for styling the visualisation
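Before launching the visualisation, it can help to confirm that these packages are importable. The following is a minimal sketch using only the standard library; the helper name missing_packages is hypothetical, and the module names are the usual import names for the packages listed above.

```python
from importlib.util import find_spec

# Package name as listed above -> module name used in an import statement.
REQUIRED_MODULES = {
    "rdkit": "rdkit",
    "plotly": "plotly",
    "dash": "dash",
    "dash-bootstrap-components": "dash_bootstrap_components",
}


def missing_packages(required=REQUIRED_MODULES):
    """Return the names of packages whose module cannot be found."""
    return [pkg for pkg, module in required.items() if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are available.")
```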

Usage

To start the visualisation, run python viz.py in your terminal or command prompt.

Screenshot

Citation

Please cite the following publication: "Identifying Differences in the Performance of Machine Learning Models for off-Targets trained on publicly available and proprietary datasets" by Aljoša Smajić, Iris Rami, Sergey Sosnin, Gerhard F. Ecker, submitted to Chemical Research in Toxicology.

