Abstract
Transcriptional phenotypic drug discovery has achieved great success, and various compound perturbation-based data resources, such as the Connectivity Map (CMap) and the Library of Integrated Network-based Cellular Signatures (LINCS), have been presented. Computational strategies for fully mining these resources for phenotypic drug discovery have been proposed. Among them, the fundamental issue is to define a proper similarity between transcriptional profiles. Traditionally, such similarity has been defined in an unsupervised way. However, due to the high dimensionality and high noise of high-throughput data, similarity defined in the traditional way lacks robustness and has limited performance. To this end, we present DrSim, a learning-based framework that automatically infers similarity rather than defining it. We evaluated DrSim on publicly available in vitro and in vivo datasets in drug annotation and repositioning. The results indicated that DrSim outperforms the existing methods. In conclusion, by learning transcriptional similarity, DrSim facilitates the broad utility of high-throughput transcriptional perturbation data for phenotypic drug discovery. The source code and manual of DrSim are available at https://github.com/bm2-lab/DrSim/.
Funding
This work was supported by the National Key R&D Program of China (Grant Nos. 2021YFF1201200 and 2021YFF1200900),
the National Natural Science Foundation of China (Grant Nos. 31970638 and 61572361),
the Shanghai Natural Science Foundation Program (Grant No. 17ZR1449400),
the Shanghai Artificial Intelligence Technology Standard Project (Grant No. 19DZ2200900),
the Shanghai Shuguang Scholars Project,
the WeBank Scholars Project,
the Shanghai Outstanding Academic Leaders Project,
and the Fundamental Research Funds for the Central Universities, China.