| No. | Title | Authors | Journal |
| --- | --- | --- | --- |
| 183 | Machine learning of toxicological big data enables read-across structure activity relationships (RASAR) outperforming animal test reproducibility | Thomas Luechtefeld, Dan Marsh, Craig Rowlands and Thomas Hartung | Society of Toxicology (): |
Abstract
Earlier we created a chemical hazard database via natural language processing of dossiers submitted to the European
Chemicals Agency with approximately 10 000 chemicals. We identified repeat OECD guideline tests to establish reproducibility
of acute oral and dermal toxicity, eye and skin irritation, mutagenicity and skin sensitization. Based on 350–700+ chemicals
each, the probability that an OECD guideline animal test would output the same result in a repeat test was 78\%–96\%
(sensitivity 50\%–87\%). An expanded database with more than 866 000 chemical properties/hazards was used as training data
and to model health hazards and chemical properties. The constructed models automate and extend the read-across method
of chemical classification. The novel models called RASARs (read-across structure activity relationship) use binary fingerprints
and Jaccard distance to define chemical similarity. A large chemical similarity adjacency matrix is constructed from this
similarity metric and is used to derive feature vectors for supervised learning. We show results on 9 health hazards from 2
kinds of RASARs—“Simple” and “Data Fusion”. The “Simple” RASAR seeks to duplicate the traditional read-across method,
predicting hazard from chemical analogs with known hazard data. The “Data Fusion” RASAR extends this concept by creating
large feature vectors from all available property data rather than only the modeled hazard. Simple RASAR models tested in
cross-validation achieve 70\%–80\% balanced accuracies with constraints on tested compounds. Cross-validation of Data Fusion
RASARs shows balanced accuracies in the 80\%–95\% range across 9 health hazards with no constraints on tested compounds.
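
The similarity and feature-building steps described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the fingerprints and labels are invented toy data, the function names are hypothetical, and the two features (similarity to the closest known positive and the closest known negative analog) are only one plausible reading of the "Simple" RASAR. A plain comparison rule stands in for the supervised learner.

```python
from typing import List, Sequence, Set, Tuple


def jaccard_distance(a: Set[int], b: Set[int]) -> float:
    """Jaccard distance between two binary fingerprints stored as sets of 'on' bit indices."""
    union = a | b
    if not union:
        return 0.0  # assumption: two empty fingerprints count as identical
    return 1.0 - len(a & b) / len(union)


def similarity_matrix(fps: Sequence[Set[int]]) -> List[List[float]]:
    """Pairwise similarity (1 - Jaccard distance), i.e. a dense chemical adjacency matrix."""
    return [[1.0 - jaccard_distance(a, b) for b in fps] for a in fps]


def simple_features(sim: List[List[float]], labels: Sequence[int], i: int) -> Tuple[float, float]:
    """Similarity of chemical i to its closest positive and closest negative analog (itself excluded)."""
    pos = max((sim[i][j] for j, y in enumerate(labels) if j != i and y == 1), default=0.0)
    neg = max((sim[i][j] for j, y in enumerate(labels) if j != i and y == 0), default=0.0)
    return pos, neg


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of sensitivity and specificity; assumes both classes occur in y_true."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return 0.5 * (tp / (tp + fn) + tn / (tn + fp))


# Toy data (invented): four chemicals, two hazardous (1) and two non-hazardous (0).
fingerprints = [{1, 2, 3}, {1, 2, 4}, {7, 8, 9}, {7, 8}]
labels = [1, 1, 0, 0]

sim = similarity_matrix(fingerprints)
features = [simple_features(sim, labels, i) for i in range(len(labels))]

# Stand-in for the supervised model: predict "hazardous" when the closest
# positive analog is more similar than the closest negative analog.
predictions = [1 if pos > neg else 0 for pos, neg in features]
score = balanced_accuracy(labels, predictions)
```

Each chemical is scored against all the others with its own label left out, which loosely mirrors the cross-validation setting the abstract reports. A "Data Fusion" variant would, under the same assumptions, repeat `simple_features` once per available property and concatenate the results into a longer feature vector.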
Date: 2025.07.04 (FRI) 18:00
Presenter: Minji Baek (CSB Lab. MS. student)
Minji Baek led the journal club on the topic "Machine learning-based toxicity prediction" on Friday, July 4, at 6:00 PM.
