Method for terminating structure-descriptor screening in quantitative structure-activity relationship (QSAR) models of pollutants

Abstract

The invention discloses a method for terminating structure-descriptor screening in quantitative structure-activity relationship (QSAR) models of pollutants. The method integrates the cross-validation correlation coefficient q2 with the adjusted correlation coefficient R2adj of the model. First, a statistical model is built for a given variable subset, yielding the correlation coefficient r2 between the observed values and the model estimates, together with the adjusted correlation coefficient R2adj. Next, the same variable subset is cross-validated to obtain the cross-validation correlation coefficient q2 of the model; two cross-validation methods are used, leave-one-out and leave-many-out. A new parameter, QRadj, is then constructed from the statistical parameters obtained in these steps. For a given system, a larger QRadj value indicates a more stable model with stronger predictive ability. The new criterion QRadj ensures that the selected model has a relatively high q2 while avoiding overfitting, prevents the selection of QSAR variable combinations that have a low r2 but a high q2, and gives a sound description of the model's stability and predictive ability.
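The three statistics that feed the new criterion can be illustrated with a minimal sketch. It assumes an ordinary least-squares model with an intercept and the common textbook definitions R2adj = 1 - (1 - r2)(n - 1)/(n - p - 1) and q2 = 1 - PRESS/SS_tot; the abstract does not state which model type or which exact definitions the invention uses, and the function names `fit_ols`, `predict`, and `fit_statistics` are hypothetical. The formula that combines these values into QRadj is not given in the abstract, so the sketch stops at its inputs.

```python
import numpy as np


def fit_ols(X, y):
    """Least-squares fit with an intercept (assumed model type)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef


def predict(coef, X):
    """Apply fitted coefficients to a 2-D descriptor matrix."""
    A = np.column_stack([np.ones(len(X)), X])
    return A @ coef


def fit_statistics(X, y):
    """Return (r2, r2_adj, q2_loo) for one descriptor subset.

    X: (n, p) matrix of descriptor values, y: (n,) observed activities.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, p = X.shape
    ss_tot = np.sum((y - y.mean()) ** 2)

    # r2 between observed values and model estimates, and its adjusted form
    resid = y - predict(fit_ols(X, y), X)
    r2 = 1.0 - np.sum(resid ** 2) / ss_tot
    r2_adj = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

    # Leave-one-out cross-validation: refit without compound i, predict it
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        coef = fit_ols(X[keep], y[keep])
        press += (y[i] - predict(coef, X[i:i + 1])[0]) ** 2
    q2_loo = 1.0 - press / ss_tot

    return r2, r2_adj, q2_loo
```

A leave-many-out variant would follow the same pattern but hold out a group of compounds in each round instead of a single one.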

Patent Citations (1)

    Publication number | Publication date | Assignee | Title
    CN-101140289-A | March 12, 2008 | Nanjing University | Method for quick screen selecting surroundings organic pollutant male hormone based on molecular structure

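Non-Patent Citations (4)

    Title
    Hua Yuan, et al. Mode of action-based local QSAR modeling for the prediction of acute toxicity in the fathead minnow. Journal of Molecular Graphics and Modelling. 2007, Vol. 22, pp. 327–335.
    Wei Dongbin, et al. A Case Study of Logistic QSAR Modeling Methods and Robustness Tests. Ecotoxicology and Environmental Safety Environmental Research, Section B. 2002, Vol. 52, pp. 143–149.
    Wang Liansheng, et al. Advances in quantitative structure-activity relationship research. Advances in Environmental Science (环境科学进展). 1994, Vol. 2 (No. 4), full text.
    Zhao Caibin, et al. Model of the anticancer activity of emodin-type compounds based on neural networks. Journal of Shaanxi University of Technology (陕西理工学院学报). 2007, Vol. 23 (No. 4), full text.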
