Using Jupyter Notebooks for re-training machine learning models

Aljoša Smajić; Melanie Grandits; Gerhard F. Ecker

doi:10.1186/s13321-022-00635-2

Title

Using Jupyter Notebooks for re-training machine learning models

Authors

Aljoša Smajić

Department of Pharmaceutical Sciences, Faculty of Life Sciences, University of Vienna

Melanie Grandits

Department of Pharmaceutical Sciences, Faculty of Life Sciences, University of Vienna

Gerhard F. Ecker

Department of Pharmaceutical Sciences, Faculty of Life Sciences, University of Vienna

Abstract

Machine learning (ML) models require an extensive, user-driven selection of molecular descriptors in order to learn from chemical structures and predict actives and inactives with high reliability. In addition, privacy concerns often restrict access to sufficient data, leading to models that cover only a narrow chemical space. Therefore, we propose a framework of re-trainable models that can be transferred from one local instance to another and that require a less extensive descriptor selection. The models are shared via a Jupyter Notebook, which allows a broader chemical space to be evaluated and incorporated while keeping most of the tunable parameters pre-defined. This enables the models to be updated in a decentralized, facile, and fast manner. The method was evaluated on six transporter datasets (BCRP, BSEP, OATP1B1, OATP1B3, MRP3, P-gp), which revealed the general applicability of this approach.

Keywords

Classification models; Transporter proteins; Decentralization; Re-training; Jupyter Notebook

Object type

journal article

Language

English [eng]

Persistent identifier

phaidra.univie.ac.at/o:1671672

DOI

10.1186/s13321-022-00635-2

Published in

Title