| Cite this article: | LI Meng, TANG Wenyan. A Bioactivity Prediction Model Integrating Knowledge Priors and Multi-Channel Attention[J]. [Journal name], 2026, 43(2): 43-53 |
|
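| Abstract (translated from the Chinese): |
| Objective To address the inaccurate prediction of molecular activity values and the poor generalization seen in current
drug development, a multi-channel semantic deep neural network that combines knowledge priors with an attention
mechanism is proposed. It uses the SMILES (Simplified Molecular Input Line Entry System) expression of a molecule to
predict the pIC50 bioactivity value of estrogen receptor α subtype (ERα). Methods The network adopts a two-stage feature
extraction strategy. At the semantic layer, a semantic analysis network combining knowledge priors with transfer learning
is designed; it locates the key information in molecular SMILES, descriptors, and graph representations, and obtains
comprehensive molecular SMILES representation information by fine-tuning its parameters on the ERα dataset. At the
channel layer, a 1D-ECA algorithm based on the Efficient Channel Attention (ECA) mechanism is designed and embedded in
the CNN sub-module, forming a multi-channel deep neural 1D-ECA-CNN module that re-extracts features from the molecular
representation and reduces information loss during molecular representation learning. Finally, the semantic layer and
the channel layer are combined into the KBAC (Knowledge-BERT-1D-ECA-CNN) deep neural network, which performs regression
prediction of the pIC50 bioactivity value. Results Experiments show that the proposed framework performs well on all four
evaluation metrics, reaching an MAE of 0.091, an MSE of 0.014, an RMSE of 0.117, and an R² of 0.993. This is a clear
improvement over four representative models and indicates higher prediction accuracy. Conclusion The two-stage feature
extraction process lets the model capture more comprehensive molecular features, which helps in screening candidate
drugs for treating disease. |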
| Keywords: SMILES; Knowledge-BERT; multi-channel attention mechanism; KBAC; pIC50 prediction |
| DOI: |
| Classification number: |
| Funding: |
|
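The four evaluation metrics quoted in the abstract (MAE, MSE, RMSE, R²) follow their standard regression definitions. The sketch below shows one way to compute them in plain Python. The function name `regression_metrics` and the sample pIC50 values are made up for illustration and are not taken from the paper or its dataset.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, RMSE and R^2 for two equal-length sequences of floats."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n   # mean absolute error
    mse = sum(e * e for e in errors) / n    # mean squared error
    rmse = math.sqrt(mse)                   # root mean squared error

    # Coefficient of determination: 1 - (residual sum of squares / total sum of squares).
    mean_true = sum(y_true) / n
    ss_res = sum(e * e for e in errors)
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    r2 = 1.0 - ss_res / ss_tot

    return {"MAE": mae, "MSE": mse, "RMSE": rmse, "R2": r2}


# Illustrative values only (hypothetical pIC50 targets and predictions).
demo_true = [5.0, 6.0, 7.0, 8.0]
demo_pred = [5.1, 5.9, 7.2, 7.8]
demo_metrics = regression_metrics(demo_true, demo_pred)
print(demo_metrics)
```

Because RMSE is the square root of MSE, the two reported values can be checked against each other: 0.117 squared is about 0.0137, which rounds to the reported MSE of 0.014.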
| A Bioactivity Prediction Model Integrating Knowledge Priors and Multi-Channel Attention |
|
LI Meng1,2, TANG Wenyan1,2
|
|
1. School of Mathematics and Statistics, Chongqing Technology and Business University, Chongqing 400067, China
2. Chongqing Key Laboratory of Statistical Intelligent Computing and Monitoring, Chongqing Technology and Business
University, Chongqing 400067, China
|
| Abstract: |
| Objective To tackle the problems of inaccurate prediction of molecular activity values and low generalizability
in current drug R&D, a multi-channel semantic deep neural network based on the combination of knowledge priors and an
attention mechanism is proposed. By using the simplified molecular input line entry system (SMILES) expressions of
molecules, this network can predict the pIC50 bioactivity values of estrogen receptor α isoform (ERα). Methods This
network adopted a two-stage feature extraction strategy. At the semantic layer, a semantic analysis network that combined
knowledge priors and transfer learning was designed. It located the key information of molecular SMILES, descriptors, and
graph representations. By fine-tuning the parameters on the ERα dataset, comprehensive molecular SMILES representation
information was obtained. At the channel layer, based on the efficient channel attention (ECA) mechanism, a 1D-ECA
algorithm was designed and embedded into the CNN sub-module. This formed a multi-channel deep neural 1D-ECA-CNN
module, which re-extracted the features of the molecular representation and reduced the information loss during the
molecular representation learning process. Finally, by combining the semantic layer and the channel layer, a
Knowledge-BERT-1D-ECA-CNN (KBAC) deep neural network was formed to achieve the regression prediction of the pIC50
biological activity value. Results Experimental results demonstrated the superior performance of the proposed framework
across four evaluation metrics. The proposed network achieved a mean absolute error (MAE) of 0.091, a mean squared error
(MSE) of 0.014, a root mean squared error (RMSE) of 0.117, and a coefficient of determination (R²) of 0.993. These results
represent a clear improvement over four representative baseline models, indicating higher prediction accuracy of the
proposed network. Conclusion The two-stage feature extraction process enables the acquisition of more comprehensive
molecular features, facilitating the screening of candidate drugs for treating diseases. |
| Key words: SMILES; Knowledge-BERT; multi-channel attention mechanism; KBAC; pIC50 prediction |
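The abstract does not spell out the 1D-ECA algorithm, so the following is only a minimal sketch of how efficient channel attention is commonly applied to one-dimensional feature maps. It assumes the general ECA recipe (global average pooling per channel, a small 1D convolution across neighbouring channels, a sigmoid gate, then channel-wise rescaling) and may differ from the authors' design. The function names, the fixed kernel weights, and the defaults `gamma=2`, `b=1` are illustrative assumptions. In a trained network the kernel weights would be learned.

```python
import math


def eca_kernel_size(channels, gamma=2, b=1):
    """Adaptive odd kernel size as a function of channel count (assumed ECA-style rule)."""
    t = int(abs((math.log2(channels) + b) / gamma))
    return t if t % 2 == 1 else t + 1


def eca_1d(x, weights):
    """Apply channel attention to x, a list of C channels, each a list of L floats.

    weights is an odd-length kernel shared by all channels (hypothetical fixed values here).
    Returns the rescaled feature map and the per-channel gate.
    """
    if len(weights) % 2 == 0:
        raise ValueError("kernel length must be odd")
    num_channels = len(x)
    pad = len(weights) // 2

    # 1) Squeeze: global average pooling along the sequence axis, one value per channel.
    pooled = [sum(channel) / len(channel) for channel in x]

    # 2) Local cross-channel interaction: zero-padded 1D convolution over the channel axis.
    padded = [0.0] * pad + pooled + [0.0] * pad
    mixed = [
        sum(w * padded[i + j] for j, w in enumerate(weights))
        for i in range(num_channels)
    ]

    # 3) Gate: the sigmoid squashes each channel score into (0, 1).
    gate = [1.0 / (1.0 + math.exp(-m)) for m in mixed]

    # 4) Excite: rescale every position of a channel by that channel's gate.
    scaled = [[g * v for v in channel] for g, channel in zip(gate, x)]
    return scaled, gate


# Toy feature map: 3 channels, sequence length 2 (values are arbitrary).
features = [[1.0, 3.0], [0.0, 0.0], [2.0, 2.0]]
rescaled, channel_gate = eca_1d(features, [0.25, 0.5, 0.25])
print(channel_gate)
print(rescaled)
```

The appeal of this design is that the attention weights come from a single small kernel rather than a fully connected bottleneck, so the number of extra parameters stays equal to the kernel length regardless of how many channels the CNN sub-module has.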