Indian Journal of Chemistry, Vol. 40B, August 2001, pp. 688-697
QSAR of matrix metalloproteinase inhibitor
(S)-3-methyl-2-(substituted biphenyl-4-sulfonylamino)butyric acids
using principal component factor analysis
Kunal Roy†
Division of Pharmaceutical Chemistry, Seemanta Institute of
Pharmaceutical Sciences, Jharpokharia, Mayurbhanj 757 086,
Orissa, India
Received 20 June 2000; accepted (revised) 22 January 2001
Quantitative structure-activity relationship (QSAR) studies of the
matrix metalloproteinase inhibitors (S)-3-methyl-2-(substituted
biphenyl-4-sulfonylamino)butyric acids have been performed to
explore the contribution pattern of phenyl ring substituents to the
activities. Principal component factor analysis was used as the
preprocessing step to identify the important molecular parameters
(physicochemical and structural variables) describing the
activities against different metalloproteinases (pC1, pC2, pC3,
pC7, pC9 and pC13) and to remove collinearities from the
equations. The study reveals that the volume of 4'-substituents on
the biphenyl moiety contributes positively to pC2, pC9 and pC13.
This suggests the possibility of dispersion interactions with the
receptor in these cases. High electronic influence (either
withdrawal or donation) of 4'-substituents on the phenyl ring has a
negative impact on pC1, pC2, pC9 and pC13. For pC1 and pC9, the
presence of a 4'-methoxy or 4'-methyl group is favoured. For pC3
and pC7, a 4'-substituent, except amino, contributes to the
activity. Further, the volume of a meta (3') substituent on the
phenyl ring, or the mere presence of a meta substituent, has a
negative impact on all the activities except pC9. Quantitative
activity-activity relations among the activities have also been
explored.
Matrix metalloproteinases (MMPs)1, a family of zinc-dependent
endopeptidases, are capable of efficient degradation of different
components of the extracellular matrix (e.g., fibrillar and
nonfibrillar collagens, basement membrane collagens, fibronectin,
elastin, gelatins, proteoglycans, amelogenin etc.) at physiological
pH2. These enzymes play a significant physiological role in tissue
remodeling in normal growth and development3. Upregulation of MMPs
has been implicated in the pathogenesis of several disease states
such as destruction of cartilage and bone in rheumatoid arthritis,
tumor growth and metastasis in cancer, degeneration of the aortic
wall in abdominal aortic aneurysms and cardiac dilation in patients
with congestive heart failure3. Besides these, involvement of MMPs
in periodontal disease, skin ulceration, corneal ulceration, gastric
ulcer, liver fibrosis etc. has also been reported2. This has
created recent interest in the design and development of MMP
inhibitors as possible anticancer, antiarthritis, antiinflammatory
or other pharmacotherapeutic agents to correct the imbalance of MMP
regulation in these pathological states1-3.
Recently, O'Brien et al.3 have reported a series of
biphenylsulfonamide MMP inhibitors. The present communication
attempts to explore the QSAR of the compounds and also quantitative
activity-activity relationships (QAAR) among the activities against
different MMPs. Though the original authors attempted structural
variations at different regions of the molecules, some compounds
have not been considered in the present study because of
insufficient data points covering some structural features. A
series of fourteen (S)-3-methyl-2-(substituted
biphenyl-4-sulfonylamino)butyric acids was subjected to QSAR
analysis with different physicochemical and structural parameters
using principal component factor analysis as the preprocessing
step.

Figure 1 - General structure of matrix metalloproteinase inhibitor
(S)-3-methyl-2-(4'-substituted biphenyl-4-sulfonylamino)butyric
acids

† Address for correspondence: C/o Dr. A. G. Saha, Flat No. 2E, 8,
Dr. Ashutosh Sastry Road, Calcutta 700 010 (India); e-mail:
kunalroy_in@yahoo.com
Factor analysis4,5
The principal objectives of factor analysis are to display
multidimensional data in a space of lower dimensionality with
minimal loss of information and to extract the basic features
behind the data with the ultimate goal of interpretation and/or
prediction. Factor analysis can be performed as the data
preprocessing step for regression analysis to eliminate
collinearities and to reduce the number of equations to be
calculated by judicious selection of the variables connected with
biological potency. If a factor analysis is performed on a data set
containing biological activity (log 1/C) and all descriptor
variables which are to be considered, the extracted factors are
first rotated, preferably by VARIMAX rotation, to obtain
Thurstone's simple structure, and then the following conclusions
can be drawn from the resulting factor pattern:
(i) Only variables with non-zero loadings in those factors where
biological potency also has a non-zero loading are important in
explaining the variance of the activity.
(ii) Only variables with non-zero loadings in different factors
may be combined in regression equations.
(iii) The factor pattern indicates whether the biological potency
can be explained in a satisfactory manner in the parameter space.
If not, a different set of variables is to be chosen.
QSAR using factor analysis
The present QSAR analysis was attempted using hydrophobicity (π),
steric (MR) and electronic (σ) parameters as the descriptor
variables and, when these were found insufficient to satisfactorily
describe the variance of the response variable (log 1/C), some
indicator variables (structural descriptors) were tried. The
factors were extracted by the principal component method and then
rotated by VARIMAX rotation to obtain Thurstone's simple structure.
Only factors describing ≥ 5% of the total variance were
considered.

Table I - Structural variations and in vitro MMP inhibitory
activities of (S)-3-methyl-2-(substituted
biphenyl-4-sulfonylamino)butyric acids

Sl.  R          MMP1           MMP2           MMP3           MMP7           MMP9           MMP13
No.             IC50*  pC1#    IC50*  pC2#    IC50*  pC3#    IC50*  pC7#    IC50*  pC9#    IC50*  pC13#
1    H          5.4    2.268   0.040  4.398   0.038  4.420   71     1.149   26     1.585   0.062  4.208
2    4-F        4.2    2.377   0.039  4.409   0.010  5.000   4.8    2.319   64     1.194   0.043  4.367
3    3-F        8.6    2.066   0.049  4.310   0.017  4.770   22     1.658   65     1.187   0.150  3.824
4    4-Br       6      2.222   0.004  5.398   0.007  5.155   7.2    2.143   7.9    2.102   0.008  5.097
5    3-Br       100    1.000   0.535  3.272   0.290  3.538   100    1.000   100    1.000   0.710  3.149
6    4-Cl       6.5    2.187   0.011  4.959   0.009  5.046   7.5    2.125   16     1.796   0.048  4.319
7    2-F,4-Br   3.6    2.444   0.005  5.301   0.016  4.796   2.1    2.678   4.9    2.310   0.007  5.155
8    4-CH3      2.2    2.658   0.002  5.699   0.003  5.523   4.5    2.347   3.9    2.409   0.011  4.959
9    4-OCH3     1.5    2.824   0.003  5.523   0.008  5.097   7.2    2.143   2.2    2.658   0.006  5.222
10   4-NH2      26     1.585   0.036  4.444   0.036  4.444   31     1.509   20     1.699   0.105  3.979
11   4-CF3      4.2    2.377   0.013  4.886   0.009  5.046   7.9    2.102   20     1.699   0.023  4.638
12   4-CN       18     1.745   0.033  4.482   0.006  5.222   7.0    2.155   59     1.229   0.037  4.432
13   4-CHO      3.2    2.495   0.012  4.921   0.008  5.097   4.5    2.347   17     1.770   0.016  4.796
14   4-NO2      12     1.921   0.061  4.215   0.015  4.824   5.6    2.252   38     1.420   0.102  3.991

*IC50 values against different MMPs (MMPj) are in μM.
#pCj = -log [IC50 (mM)]j
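The response variable used throughout, pCj = -log [IC50 (mM)]j
(footnote to Table I), involves a unit conversion because the IC50
values are tabulated in μM. A minimal Python sketch of that
conversion follows; the helper name is hypothetical and is not taken
from the original programs.

```python
import math


def pc_from_ic50_um(ic50_um: float) -> float:
    """Return pC = -log10(IC50 in mM) for an IC50 given in micromolar."""
    ic50_mm = ic50_um / 1000.0  # 1 mM = 1000 uM
    return -math.log10(ic50_mm)


# IC50 values (uM) of compound 1 (R = H) against MMP1, MMP2 and MMP3 (Table I)
for target, ic50 in [("MMP1", 5.4), ("MMP2", 0.040), ("MMP3", 0.038)]:
    print(target, round(pc_from_ic50_um(ic50), 3))
```

For compound 1 this gives values that agree with the tabulated pC1,
pC2 and pC3 entries to three decimals.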
Although different combinations of predictor variables had to be
used in factor analysis to extract the factors that are highly
loaded in the different molecular parameters describing the
activity, only the final combinations are reported here for
brevity.
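The extraction and rotation step described in this section can be
sketched in Python with NumPy. This is only an illustration under
stated assumptions: the function names are hypothetical, the varimax
routine is the common SVD-based iteration without row (Kaiser)
normalisation, and the original work used MINITAB, whose settings are
not given in the text. Signs, factor order and the third decimal of
the loadings may therefore differ from Table III. The data are the six
pC columns of the eleven 4'-substituted compounds from Table I.

```python
import numpy as np

# pC1, pC2, pC3, pC7, pC9, pC13 for compounds 1, 2, 4, 6, 8-14 (Table I)
ACTIVITIES = np.array([
    [2.268, 4.398, 4.420, 1.149, 1.585, 4.208],
    [2.377, 4.409, 5.000, 2.319, 1.194, 4.367],
    [2.222, 5.398, 5.155, 2.143, 2.102, 5.097],
    [2.187, 4.959, 5.046, 2.125, 1.796, 4.319],
    [2.658, 5.699, 5.523, 2.347, 2.409, 4.959],
    [2.824, 5.523, 5.097, 2.143, 2.658, 5.222],
    [1.585, 4.444, 4.444, 1.509, 1.699, 3.979],
    [2.377, 4.886, 5.046, 2.102, 1.699, 4.638],
    [1.745, 4.482, 5.222, 2.155, 1.229, 4.432],
    [2.495, 4.921, 5.097, 2.347, 1.770, 4.796],
    [1.921, 4.215, 4.824, 2.252, 1.420, 3.991],
])


def principal_component_loadings(data, n_factors):
    """Unrotated loadings from the correlation matrix of the columns."""
    corr = np.corrcoef(data, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(corr)          # ascending eigenvalues
    order = np.argsort(eigvals)[::-1][:n_factors]    # keep the largest ones
    return eigvecs[:, order] * np.sqrt(eigvals[order])


def varimax(loadings, max_iter=200, tol=1e-10):
    """Orthogonal varimax rotation (no Kaiser normalisation)."""
    n_vars, n_factors = loadings.shape
    rotation = np.eye(n_factors)
    criterion = 0.0
    for _ in range(max_iter):
        rotated = loadings @ rotation
        col_ss = np.sum(rotated ** 2, axis=0)
        gradient = loadings.T @ (rotated ** 3 - rotated * col_ss / n_vars)
        u, s, vt = np.linalg.svd(gradient)
        rotation = u @ vt
        new_criterion = s.sum()
        if new_criterion <= criterion * (1.0 + tol):
            break
        criterion = new_criterion
    return loadings @ rotation, rotation


unrotated = principal_component_loadings(ACTIVITIES, n_factors=3)
rotated, rotation = varimax(unrotated)
communality = (rotated ** 2).sum(axis=1)             # per variable
variance_fraction = (rotated ** 2).sum(axis=0) / ACTIVITIES.shape[1]
print(np.round(rotated, 3))
print(np.round(communality, 3), np.round(variance_fraction, 3))
```

Because the rotation is orthogonal, the communalities are the same
before and after rotation; only the distribution of variance among the
factors changes.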
The structural variations (Figure 1) and activities of the
compounds against different MMPs as reported by O'Brien et al.3
are presented in Table I. Of the fourteen compounds, only three
possess meta or ortho substituents. At first, these (compounds
3, 5 and 7) were excluded and QSAR analysis of the remaining
compounds (n = 11) was performed to explore the contribution
pattern of 4'-substituents. Factor analysis of the activity matrix
(pCx) was performed to explore the interrelation among the
activities as far as the contribution of 4'-substituents is
concerned. In the next step, the data matrix of inter-related
response variables along with physicochemical parameters and/or
structural descriptors was subjected to factor analysis to select
the important predictor variables which satisfactorily describe the
response variable(s) and are least autocorrelated. Next, regression
analysis was performed and the quality of the equations was judged
by the statistical parameters6
like explained variance (EV%), coefficient of variation (CV%),
correlation coefficient (r or R), standard error of estimate (SEE),
variance ratio (F) at specified degrees of freedom (df) and average
of absolute values of residuals (AVRES)7. The significance of the
regression constants was checked by the 't' test. Where the
intercept of an equation was statistically insignificant and its
omission did not affect the quality of the equation, exclusion of
the intercept gave a statistically more acceptable equation. The
stability of the best equations of each activity series was checked
from "PRESS" statistics7,8. A compound was considered an outlier
when its residual exceeded twice the SEE of the equation. For
equations having outlier(s), an attempt was made to improve the
quality of the equations by excluding the outlier(s).
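The regression statistics listed above can be illustrated with a
short Python sketch. It is not the author's GW-BASIC program RRR98:
the function name is hypothetical, and the formulas (EV% taken as
adjusted R² × 100, CV% as SEE relative to the mean response) are
assumptions that give values close to those reported for Eq. A5 in
Table IV. The data are the pC7 values of the eleven 4'-substituted
compounds from Table I, with Ind = 1 for the 4'-H and 4'-NH2
compounds.

```python
import numpy as np


def ols_statistics(X, y):
    """Least-squares fit with intercept plus the summary statistics."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n, n_pred = X.shape
    design = np.column_stack([X, np.ones(n)])        # last column = intercept
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    dof = n - n_pred - 1
    r2 = 1.0 - ss_res / ss_tot
    see = np.sqrt(ss_res / dof)
    return {
        "coef": coef,                                 # slopes, then intercept
        "r": np.sqrt(r2),
        "EV%": 100.0 * (1.0 - (1.0 - r2) * (n - 1) / dof),
        "CV%": 100.0 * see / y.mean(),
        "SEE": see,
        "F": ((ss_tot - ss_res) / n_pred) / (ss_res / dof),
        "AVRES": np.abs(resid).mean(),
        # outlier rule from the text: |residual| > 2 * SEE
        "outliers": np.flatnonzero(np.abs(resid) > 2.0 * see),
    }


# Compounds 1, 2, 4, 6, 8, 9, 10, 11, 12, 13, 14 (Table I)
ind = np.array([1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0], dtype=float).reshape(-1, 1)
pc7 = np.array([1.149, 2.319, 2.143, 2.125, 2.347, 2.143,
                1.509, 2.102, 2.155, 2.347, 2.252])

stats = ols_statistics(ind, pc7)
for key, value in stats.items():
    print(key, np.round(value, 3))
```

With a single indicator variable the fit reduces to the two group
means, which makes this equation easy to check by hand.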
Though the number of data points is insufficient to explore the
contribution pattern of meta (3') and ortho (2') substituents, a
preliminary attempt was made to include compounds 3, 5 and 7 in the
QSAR analysis. For this, an indicator variable (Im: representing
presence or absence of a 3'-substituent) and steric parameters
(MR3' and MR2'+3': considering involvement of the steric bulk of
aromatic ortho and meta substituents during various ligand-receptor
interactions, e.g., steric hindrance due to ortho substituents9,
influence of the size of meta substituents on the proper
orientation of the ligand molecule10 etc.) were included in the
data matrix containing the specific activity with the important
variables for 4'-substituents (obtained from the previous analysis)
for subsequent factor and regression analyses. Finally,
quantitative activity-activity relationships were also explored.
The data matrix containing all activity values and the important
descriptor variables that emerged from the previous study was
subjected to factor analysis to select important variables for QAAR
equations.
The factor analysis was performed with the statistical software
MINITAB version 1311. The regression analyses and calculation of
PRESS statistics were performed with the GW-BASIC programs RRR98,
KRPRES1 and KRPRES2 developed by the author12. The physicochemical
parameter values (Table II) were taken from the literature13.
Table II - Physicochemical parameters of the aromatic
substituents(a)

Substituent   π       MR(b)   σ (para)
F              0.14   0.092    0.06
Br             0.86   0.888    0.23
Cl             0.71   0.603    0.23
CH3            0.56   0.565   -0.17
OCH3          -0.02   0.787   -0.27
NH2           -1.23   0.542   -0.66
CF3            0.88   0.502    0.54
CN            -0.57   0.633    0.66
CHO           -0.65   0.688    0.42
NO2           -0.28   0.736    0.78
H              0.00   0.103    0.00

(a) Taken from reference 13. (b) Scaled by a factor of 0.1 as
usual.
Results and Discussion
QSAR of (S)-3-methyl-2-(4'-substituted
biphenyl-4-sulfonylamino)butyric acids (n = 11)
The results of principal component factor analysis and regression
analysis of the compounds (excluding those with meta or ortho
substitution) are shown in Tables III and IV. The stability of the
best equations as revealed from PRESS statistics is presented in
Table V.
Factor analysis of the activity data (Table IIIa) revealed that
the activities could be divided into three groups: (i) pC1 (highly
loaded in factor 3), (ii) pC2, pC9 and pC13 (highly loaded in
factor 1) and (iii) pC3 and pC7 (highly loaded in factor 2). These
three factors explain 95.3% of the total variance.
Factor analysis of the data matrix containing pC1 with
physicochemical parameters could not reveal the parameter(s)
satisfactorily describing the activity. When two dummy parameters
IOCH3 (denoting presence or absence of a 4'-OCH3 or 4'-CH3 group)
and IN (denoting presence or absence of groups like NO2/NH2/CN)
were included, the activity was found to be loaded highly in
factor 1 and moderately in factor 3 (Table IIIb). Again, factor 1
is highly loaded in IN and σ4'² and factor 3 in IOCH3. Accordingly,
regression analysis involving pC1 was performed. The equations A1
and A2 (Table IV), explaining 87.7% and 62.4% respectively of the
total variance of pC1, involve the parameters IOCH3, IN and σ4'².
The positive coefficients of IOCH3 in the equations suggest that
the presence of a 4'-methoxy or 4'-methyl group is conducive to the
activity (pC1). Again, the negative coefficient of σ4'² in Eq. A2
suggests that the
Table III - Factor loadings of the variables (after varimax
rotation) for QSAR of (S)-3-methyl-2-(4'-substituted
biphenyl-4-sulfonylamino)butyric acids (n = 11)

(a) Data matrix: pC1, pC2, pC3, pC7, pC9, pC13
Variable     Factor 1  Factor 2  Factor 3  Communality
pC1           0.440    -0.215    -0.868     0.993
pC2           0.897    -0.332    -0.263     0.984
pC3           0.381    -0.886    -0.145     0.952
pC7           0.036    -0.958    -0.157     0.943
pC9           0.937    -0.009    -0.261     0.947
pC13          0.744    -0.405    -0.423     0.897
% variance    0.430     0.337     0.186     0.953

(b) Data matrix: pC1, π4', π4'², MR4', MR4'², σ4', σ4'², IOCH3, IN
Variable     Factor 1  Factor 2  Factor 3  Factor 4  Factor 5  Communality
pC1          -0.747     0.083    -0.461     0.347    -0.163     0.924
π4'          -0.497     0.060    -0.018     0.076    -0.850     0.979
π4'²          0.237     0.159     0.180    -0.918     0.000     0.957
MR4'          0.152     0.976    -0.094    -0.075    -0.059     0.994
MR4'²         0.027     0.987    -0.045    -0.006    -0.015     0.977
σ4'           0.264     0.237     0.557     0.626    -0.375     0.968
σ4'²          0.927     0.214     0.177     0.008     0.085     0.944
IOCH3        -0.216     0.180    -0.948     0.117    -0.037     0.992
IN            0.940     0.079     0.083    -0.082     0.275     0.979
% variance    0.305     0.234     0.167     0.154     0.109     0.968

(c) Data matrix: pC2, pC9, pC13, π4', π4'², MR4', MR4'², σ4', σ4'²
Variable     Factor 1  Factor 2  Factor 3  Factor 4  Factor 5  Communality
pC2           0.471    -0.668     0.395    -0.371     0.001     0.963
pC9           0.514    -0.454     0.666    -0.241     0.075     0.977
pC13          0.459    -0.827     0.125    -0.192     0.129     0.964
π4'           0.015    -0.324    -0.093    -0.934     0.084     0.998
π4'²          0.130     0.141     0.153     0.075    -0.965     0.998
MR4'          0.983    -0.052    -0.029    -0.005    -0.137     0.988
MR4'²         0.957    -0.138    -0.048    -0.008    -0.043     0.939
σ4'           0.224     0.187    -0.887    -0.192     0.262     0.978
σ4'²          0.361     0.749    -0.293     0.341    -0.182     0.926
% variance    0.309     0.231     0.169     0.140     0.120     0.970

(d) Data matrix: pC9, MR4', σ4', σ4'², IOCH3, IN
Variable     Factor 1  Factor 2  Factor 3  Factor 4  Communality
pC9          -0.379    -0.650     0.549    -0.336     0.981
MR4'          0.200    -0.163     0.952     0.136     0.991
σ4'           0.113     0.250     0.092     0.956     0.998
σ4'²          0.923     0.162     0.158     0.234     0.958
IOCH3        -0.158    -0.954     0.135    -0.207     0.996
IN            0.976     0.152     0.023    -0.015     0.977
% variance    0.338     0.245     0.210     0.191     0.983

(e) Data matrix: pC3, pC7, π4', π4'², MR4', MR4'², σ4', σ4'², Ind
Variable     Factor 1  Factor 2  Factor 3  Factor 4  Factor 5  Communality
pC3           0.892     0.210     0.265    -0.035     0.080     0.917
pC7           0.943     0.204     0.016     0.166     0.084     0.965
π4'           0.255     0.013     0.946     0.101     0.038     0.971
π4'²         -0.184     0.152    -0.105    -0.121    -0.950     0.986
MR4'          0.282     0.931    -0.067     0.101    -0.174     0.991
MR4'²         0.274     0.968     0.009     0.066    -0.031     0.991
σ4'           0.274     0.079     0.082     0.907     0.216     0.958
σ4'²         -0.082     0.197    -0.650     0.611    -0.317     0.941
Ind          -0.862    -0.257    -0.220    -0.289    -0.192     0.978
% variance    0.304     0.225     0.162     0.150     0.126     0.966
Table IV - Relations of matrix metalloproteinase inhibition
activities of (S)-3-methyl-2-(4'-substituted
biphenyl-4-sulfonylamino)butyric acids with physicochemical and
structural parameters [n = 11 (if not stated otherwise)]

Each entry gives the equation@ followed by its statistics: EV%, CV%,
F∇, (r or R), (SEE), (AVRES).

A1    pC1 = 0.420 (± 0.247) IOCH3 - 0.571 (± 0.214) IN + 2.321 (± 0.123)
      EV% 87.7; CV% 5.9; F 36.7; (0.950) (0.131) (0.098)
A2    pC1 = 0.440 (± 0.439) IOCH3 - 0.957 (± 0.833) σ4'² + 2.349 (± 0.259)
      EV% 62.4; CV% 10.2; F 9.3; (0.836) (0.229) (0.165)
A3    pC2 = 1.487 (± 0.796) MR4' - 1.876 (± 0.949) σ4'² + 4.387 (± 0.460)
      EV% 73.5; CV% 5.4; F 14.9; (0.888) (0.262) (0.181)
A3A#  pC2 = 1.409 (± 0.555) MR4' - 1.615 (± 0.685) σ4'² + 4.322 (± 0.322)
      EV% 84.3; CV% 3.7; F 25.1; (0.937) (0.177) (0.120)
A4    pC3 = -0.680 (± 0.316) Ind + 5.112 (± 0.135)
      EV% 69.4; CV% 3.6; F 23.7; (0.851) (0.179) (0.105)
A4A#  pC3 = -0.629 (± 0.202) Ind + 5.061 (± 0.090)
      EV% 84.9; CV% 2.2; F 51.7; (0.931) (0.111) (0.068)
A4B$  pC3 = -0.663 (± 0.132) Ind + 5.094 (± 0.062)
      EV% 94.6; CV% 1.4; F 141.4; (0.976) (0.070) (0.045)
A5    pC7 = -0.886 (± 0.226) Ind + 2.215 (± 0.096)
      EV% 88.6; CV% 6.2; F 78.9; (0.947) (0.128) (0.128)
A6    pC9 = 1.339 (± 0.789) MR4' - 0.458 (± 0.476)* σ4' - 1.087 (± 0.997) σ4'² + 1.320 (± 0.510)
      EV% 70.0; CV% 14.2; F 8.6; (0.887) (0.253) (0.192)
A7    pC9 = 0.769 (± 0.603) MR4' + 0.691 (± 0.387) IOCH3 - 0.363 (± 0.333) IN + 1.322 (± 0.386)
      EV% 82.6; CV% 10.7; F 16.9; (0.937) (0.191) (0.119)
A8    pC13 = 1.184 (± 0.768) MR4' - 1.528 (± 0.9?6) σ4'² + 4.185 (± 0.444)
      EV% 65.2; CV% 5.6; F 10.4; (0.849) (0.253) (0.177)

#Excluding compound 8 (n = 10); $Excluding compounds 8 and 14
(n = 9)
@95% confidence intervals of regression constants are shown within
parentheses; 't' values of the regression constants are significant
at 95% level unless superscripted with * (significant at 90% level)
[df = n-np-1; n = no. of data points; np = no. of predictor
variables]
∇F values are significant at 99% level; df = (np, n-np-1).
substitution should not be highly electron donating or electron
withdrawing due to resonance effect. This is corroborated by the
negative coefficient of IN in Eq. A1: cyano (σp = 0.66) and nitro
(σp = 0.78) are strong electron withdrawing groups while amino
(σp = -0.66) is a strong electron releasing group through resonance
effect. The PRESS statistics (Table V) of Eq. A1 suggest
robustness of the equation.
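The PRESS (leave-one-out) check can be sketched as follows. This is a
hedged illustration, not the author's KRPRES1/KRPRES2 programs: the
function name is hypothetical, and the cross-validated summary
quantities are computed in the usual way from the deleted residuals,
which may differ in detail from what those programs do. Each compound
is removed in turn, the equation is refitted on the remaining ones,
and the removed compound is predicted. The data are those of Eq. A5
(pC7 against Ind, n = 11).

```python
import numpy as np


def press_statistics(X, y):
    """Leave-one-out refits; returns PRESS and cross-validated summaries."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)
    design = np.column_stack([X, np.ones(n)])        # last column = intercept
    loo_resid = np.empty(n)
    for i in range(n):
        keep = np.arange(n) != i                     # drop compound i
        coef, *_ = np.linalg.lstsq(design[keep], y[keep], rcond=None)
        loo_resid[i] = y[i] - design[i] @ coef       # predict the dropped one
    press = float(loo_resid @ loo_resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    q2 = 1.0 - press / ss_tot
    return {"PRESS": press, "q2": q2, "r_cv": np.sqrt(q2),
            "AVRES": np.abs(loo_resid).mean()}


# Compounds 1, 2, 4, 6, 8, 9, 10, 11, 12, 13, 14 (Table I)
ind = np.array([1, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0], dtype=float).reshape(-1, 1)
pc7 = np.array([1.149, 2.319, 2.143, 2.125, 2.347, 2.143,
                1.509, 2.102, 2.155, 2.347, 2.252])

cv = press_statistics(ind, pc7)
for key, value in cv.items():
    print(key, round(float(value), 3))
```

On these data the cross-validated correlation and the average absolute
deleted residual come out close to the Table V entries for Eq. A5,
which suggests, but does not prove, that the original programs follow
a similar scheme.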
Table V - Summary of PRESS statistics
Model equation: pCj = Σ βi Xi + α, n = 11

Reference      Average* regression constants (standard deviation)                               PRESS statistics
Eq.   pCj      β1 X1 (sd)            β2 X2 (sd)             β3 X3 (sd)          α (sd)          EV%    r      SEE    AVRES
A1    pC1      0.420 IOCH3 (0.150)   -0.571 IN (0.362)      -                   2.321 (0.599)   80.1   0.895  0.167  0.139
A3    pC2      1.483 MR4' (0.398)    -1.876 σ4'² (0.884)    -                   4.391 (1.086)   65.1   0.807  0.301  0.246
A4    pC3      -0.680 Ind (0.403)    -                      -                   5.112 (1.278)   65.1   0.807  0.191  0.119
A5    pC7      -0.886 Ind (0.486)    -                      -                   2.215 (0.572)   74.6   0.864  0.190  0.148
A7    pC9      0.765 MR4' (0.235)    0.692 IOCH3 (0.209)    -0.362 IN (0.238)   1.323 (0.355)   66.6   0.836  0.265  0.193
A8    pC13     1.184 MR4' (0.329)    -1.519 σ4'² (0.719)    -                   4.185 (1.037)   55.9   0.747  0.285  0.236

*Average of 11 PRESS equations

The data matrix containing pC2, pC9 and pC13 and physicochemical
parameters could be described by 5 factors (Table IIIc). The steric
(MR4') and electronic (σ4'²; additionally, σ4' for pC9) parameters
showed importance in describing the activities. The important
regression equations for pC2, pC9 and pC13 (Eqs. A3, A3A, A6, A8)
are shown in Table IV. The positive coefficients of MR4' in all the
cases indicate a positive contribution of the steric bulk of
4'-substituents, suggesting possible dispersion interaction with
the active site. The negative coefficients of σ4'² indicate that
the electronic influence (donation and withdrawal) of the
4'-substituents (due to resonance) on the phenyl ring should be
minimum. The optimum value of σ4' for pC9 was calculated to be
-0.21 (moderately electron releasing) from Eq. A6. The PRESS
statistics of Eqs. A3 and A8 are shown in Table V. However,
crossvalidation of Eq. A6 shows high dependence of the equation on
compound 10. Hence factor analysis of the data matrix of pC9, MR4',
σ4' and σ4'² along with IOCH3 and IN (Table IIId) was performed and
this showed the importance of IOCH3 and IN along with MR4'. The
corresponding regression equation A7 is of high statistical
quality (Table IV) and robustness (Table V). The equation suggests
that apart from the positive contribution of the steric bulk of
4'-substituents, the presence of 4'-methoxy or 4'-methyl groups and
the absence of groups like NO2/NH2/CN are conducive to the
activity (pC9).
For pC3 and pC7, no physicochemical parameter was found to be an
important contributor. However, when an indicator variable Ind
(denoting presence or absence of a 4'-NH2/4'-H group) was included
in the data matrix for factor analysis (Table IIIe), the activities
were found to be highly loaded with the variable Ind. The
corresponding regression equations (A4 and A5) and their PRESS
statistics are given in Tables IV and V respectively. The negative
coefficient of Ind indicates that a 4'-substituent, excluding
amino, favours the activities.
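The optimum σ4' of -0.21 quoted for pC9 follows from setting the
derivative of Eq. A6 (Table IV) with respect to σ4' to zero. The
worked step below uses the coefficients as printed; the LaTeX
notation is ours.

```latex
\begin{align*}
pC_9 &= 1.339\,MR_{4'} - 0.458\,\sigma_{4'} - 1.087\,\sigma_{4'}^{2} + 1.320 \\
\frac{\partial\,pC_9}{\partial\sigma_{4'}} &= -0.458 - 2(1.087)\,\sigma_{4'} = 0 \\
\sigma_{4'}^{\mathrm{opt}} &= -\frac{0.458}{2 \times 1.087} \approx -0.21
\end{align*}
```

Since the coefficient of the squared term is negative, this stationary
point is a maximum of pC9 with respect to σ4'.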
QSAR of (S)-3-methyl-2-(substituted
biphenyl-4-sulfonylamino)butyric acids (n = 14)
The results of QSAR analysis of all 14 compounds are given in
Tables VI and VII. The factor analysis of the data matrix involving
the activities showed that the factor pattern of the activities
found for the data set of 4'-substituted compounds (n = 11) was
almost retained except that the interrelation between pC1 and pC3
increased.
The factor analysis of the data matrix involving pC1 (Table VIa)
showed the importance of MR3' and MR2'+3' apart from IOCH3 and IN.
The equations involving MR3' and MR2'+3' were statistically
equivalent though the latter were slightly better and are shown in
Table VII. The negative contributions of MR3' and MR2'+3' suggest
that the bulk of ortho and meta substituents is a negative
contributor to the activity.
The factor analysis of the data matrix involving pC2, pC9 and
pC13 (Table VIb) showed that MR3' and
Table VI - Factor loadings of the variables (after varimax
rotation) for QSAR of (S)-3-methyl-2-(substituted
biphenyl-4-sulfonylamino)butyric acids (n = 14)

(a) Data matrix: pC1, IOCH3, IN, Im, MR3', MR2'+3'
Variable     Factor 1  Factor 2  Factor 3  Factor 4  Communality
pC1           0.660     0.523    -0.414     0.318     0.981
IOCH3         0.059     0.120    -0.988     0.065     0.998
IN            0.074    -0.983     0.104     0.107     0.994
Im           -0.445     0.100     0.094    -0.884     0.999
MR3'         -0.962     0.052     0.043    -0.260     0.998
MR2'+3'      -0.962     0.050     0.041    -0.262     0.998
% variance    0.416     0.212     0.195     0.172     0.995

(b) Data matrix: pC2, pC9, pC13, MR4', σ4', σ4'², Im, MR3', MR2'+3'
Variable     Factor 1  Factor 2  Factor 3  Factor 4  Factor 5  Communality
pC2          -0.842    -0.455    -0.187    -0.059    -0.148     0.975
pC9          -0.913    -0.173    -0.110    -0.263    -0.144     0.966
pC13         -0.825    -0.407    -0.189     0.104    -0.265     0.963
MR4'         -0.854    -0.106     0.422     0.207    -0.079     0.968
σ4'           0.030    -0.054     0.189     0.976    -0.044     0.993
σ4'²          0.092    -0.132     0.952     0.198    -0.112     0.983
Im            0.427     0.432    -0.231    -0.077     0.754     0.997
MR3'          0.264     0.944    -0.105    -0.040     0.159     0.999
MR2'+3'       0.269     0.943    -0.103    -0.041     0.158     0.998
% variance    0.365     0.267     0.142     0.125     0.084     0.983

(c) Data matrix: pC3, pC7, Ind, Im, MR3', MR2'+3'
Variable     Factor 1  Factor 2  Factor 3  Communality
pC3           0.743     0.495    -0.342     0.914
pC7           0.456     0.714    -0.474     0.942
Ind           0.085    -0.981    -0.131     0.987
Im           -0.419     0.022     0.903     0.992
MR3'         -0.950    -0.005     0.301     0.994
MR2'+3'      -0.949    -0.008     0.303     0.993
% variance    0.458     0.286     0.226     0.970
Table VII - Relations of matrix metalloproteinase inhibition
activities of (S)-3-methyl-2-(substituted
biphenyl-4-sulfonylamino)butyric acids with physicochemical and
structural parameters [n = 14 (if not stated otherwise)]

Each entry gives the equation@ followed by its statistics: EV%, CV%,
F∇, (r or R), (SEE), (AVRES).

B1    pC1 = 0.441 (± 0.261) IOCH3 - 0.549 (± 0.223) IN - 1.653 (± 0.444) MR2'+3' + 2.640 (± 0.164)
      EV% 90.2; CV% 6.9; F 40.8; (0.961) (0.148) (0.104)
B2    pC1 = 0.468 (± 0.404) IOCH3 - 0.871 (± 0.740) σ4'² - 1.675 (± 0.697) MR2'+3' + 2.662 (± 0.268)
      EV% 76.7; CV% 10.6; F 15.2; (0.906) (0.228) (0.157)
B3    pC2 = 1.456 (± 0.617) MR4' - 1.708 (± 0.853) σ4'² - 1.539 (± 0.832) MR3' + 4.490 (± 0.405)
      EV% 83.8; CV% 5.5; F 23.5; (0.936) (0.259) (0.164)
B3A#  pC2 = 1.396 (± 0.419) MR4' - 1.493 (± 0.592) σ4'² - 1.461 (± 0.565) MR3' + 4.427 (± 0.280)
      EV% 91.8; CV% 3.7; F 46.0; (0.969) (0.172) (0.103)
B4    pC3 = -0.618 (± 0.353) Ind - 1.921 (± 0.610) MR3' + 5.248 (± 0.167)
      EV% 81.1; CV% 4.3; F 28.9; (0.917) (0.208) (0.122)
B4A#  pC3 = -0.571 (± 0.263) Ind - 1.862 (± 0.453) MR3' + 5.195 (± 0.129)
      EV% 88.9; CV% 3.2; F 49.0; (0.953) (0.152) (0.101)
B5    pC7 = -0.932 (± 0.383) Ind - 0.932 (± 0.383) Im + 2.261 (± 0.156)
      EV% 78.4; CV% 11.3; F 24.6; (0.904) (0.225) (0.165)
B6    pC9 = 1.568 (± 0.558) MR4' - 0.472 (± 0.437) σ4' - 0.974 (± 0.861) σ4'² + 1.132 (± 0.349)
      EV% 75.8; CV% 14.4; F 14.6; (0.902) (0.247) (0.183)
B7    pC9 = 1.114 (± 0.460) MR4' + 0.645 (± 0.375) IOCH3 - 0.396 (± 0.319) IN + 1.136 (± 0.285)
      EV% 83.2; CV% 12.0; F 22.4; (0.933) (0.206) (0.126)
B8    pC13 = 1.399 (± 0.625) MR4' - 1.450 (± 0.865) σ4'² - 1.281 (± 0.843) MR3' + 4.147 (± 0.410)
      EV% 80.2; CV% 5.9; F 18.6; (0.921) (0.262) (0.180)

#Excluding compound 8 (n = 13)
@95% confidence intervals of regression constants are shown within
parentheses; 't' values of the regression constants are significant
at 95% level [df = n-np-1; n = no. of data points; np = no. of
predictor variables]
∇F values are significant at 99% level; df = (np, n-np-1).

Table VIII - Calculated MMP inhibitory activities of
(S)-3-methyl-2-(substituted biphenyl-4-sulfonylamino)butyric acids

Sl.   pC1             pC2             pC3             pC7             pC9             pC13
No.   Cal.   Res.     Cal.   Res.     Cal.   Res.     Cal.   Res.     Cal.   Res.     Cal.   Res.
1     2.300  -0.032   4.482  -0.084   4.432  -0.012   1.329  -0.180   1.250   0.335   4.159   0.048
2     2.300   0.077   4.460  -0.051   5.050  -0.050   2.261   0.058   1.238  -0.044   4.138   0.228
3     2.318  -0.252   4.499  -0.189   5.071  -0.301   1.329   0.329   1.250  -0.063   4.173  -0.349
4     2.300  -0.077   5.535  -0.137   5.050   0.105   2.261  -0.118   2.125  -0.022   5.181  -0.084
5     1.002  -0.002   3.274  -0.003   3.542  -0.004   1.329  -0.329   1.250  -0.250   3.154  -0.005
6     2.300  -0.112   5.120  -0.161   5.050  -0.004   2.261  -0.136   1.807  -0.011   4.782  -0.463
7     2.318   0.126   5.535  -0.234   5.050  -0.254   2.261   0.417   2.125   0.185   5.181  -0.026
8     2.741  -0.083   5.106   0.593   5.050   0.473   2.261   0.086   2.410  -0.001   4.764   0.195
9     2.741   0.083   5.354   0.169   5.050   0.047   2.261  -0.118   2.657   0.001   5.011   0.211
10    1.750  -0.165   4.377   0.066   4.432   0.012   1.329   0.180   1.344   0.355   4.142  -0.163
11    2.300   0.077   4.565   0.321   5.050  -0.004   2.261  -0.159   1.695   0.004   4.295   0.344
12    1.750  -0.005   4.510  -0.029   5.050   0.172   2.261  -0.106   1.445  -0.216   4.269   0.162
13    2.300   0.195   5.033  -0.112   5.050   0.047   2.261   0.086   1.902  -0.132   4.722   0.074
14    1.750   0.171   4.365  -0.150   5.050  -0.226   2.261  -0.009   1.560  -0.140   4.163  -0.172

Cal. = Calculated values; Res. = Observed - Calculated values.
Values calculated according to (i) Eq. B1 for pC1, (ii) Eq. B3 for
pC2, (iii) Eq. B4 for pC3, (iv) Eq. B5 for pC7, (v) Eq. B7 for pC9
and (vi) Eq. B8 for pC13.
MR2'+3' were important only for pC2 and pC13. The quality of the
equations involving MR3' was slightly better than that with MR2'+3'
and the former types are reported in Table VII. The equations
reveal a negative impact of meta (and possibly ortho) substituents
on pC2 and pC13 due to steric bulk. However, pC9 was not found to
be significantly influenced by non-para substituents.
The factor analysis of the data set involving pC3 and pC7 (Table
VIc) showed that the former is dependent on MR3' and MR2'+3' while
the latter on Im. The corresponding regression equations are shown
in Table VII. For pC3, the steric bulk of meta (and ortho)
substituents is a negative contributor, while for pC7, the mere
presence of meta substituents is detrimental to the activity.
Though a preliminary attempt was made to explore the contribution
pattern of 3'- and 2'-substituents, more compounds having such
groups are required to reach any conclusion. The calculated
activity values of the compounds according to the best equations of
the individual series along with the corresponding residuals are
shown in Table VIII.
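As an illustration of how the calculated values are obtained, the
Python sketch below evaluates Eq. B1 (Table VII) for pC1 with the MR
values of Table II. The function name is hypothetical. It also assumes
that an unsubstituted 2' or 3' position contributes the MR of hydrogen
(0.103) to MR2'+3'; this is not stated explicitly in the text, but it
gives values within rounding of the pC1 column of Table VIII.

```python
# MR values (scaled by 0.1) from Table II
MR = {"H": 0.103, "F": 0.092, "Br": 0.888, "Cl": 0.603, "CH3": 0.565,
      "OCH3": 0.787, "NH2": 0.542, "CF3": 0.502, "CN": 0.633,
      "CHO": 0.688, "NO2": 0.736}


def pc1_eq_b1(sub2="H", sub3="H", sub4="H"):
    """Calculated pC1 from Eq. B1 for given 2'-, 3'- and 4'-substituents."""
    i_och3 = 1 if sub4 in ("OCH3", "CH3") else 0     # indicator IOCH3
    i_n = 1 if sub4 in ("NO2", "NH2", "CN") else 0   # indicator IN
    mr_2_3 = MR[sub2] + MR[sub3]                     # MR2'+3' (assumed sum)
    return 0.441 * i_och3 - 0.549 * i_n - 1.653 * mr_2_3 + 2.640


# (compound no., substituents, observed pC1 from Table I)
examples = [
    (1, {}, 2.268),
    (5, {"sub3": "Br"}, 1.000),
    (8, {"sub4": "CH3"}, 2.658),
    (10, {"sub4": "NH2"}, 1.585),
]
for number, subs, observed in examples:
    calc = pc1_eq_b1(**subs)
    print(number, round(calc, 3), round(observed - calc, 3))
```

Small differences in the third decimal from Table VIII are expected
because the printed coefficients are themselves rounded.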
Through a molecular modeling study, O'Brien et al.3 suggested
that during interaction with the receptor site, the biphenyl moiety
of the compounds remains within the P1' subsite where it may be
involved in
Table IX - Factor loadings of the variables (after varimax
rotation) for QAAR equations

Data matrix: pC1, pC2, pC3, pC7, pC9, pC13, MR4', σ4'², IOCH3, IN,
Ind, Im, MR3'
Variable     Factor 1  Factor 2  Factor 3  Factor 4  Factor 5  Communality
pC1           0.774    -0.297     0.389    -0.100     0.307     0.943
pC2           0.622    -0.616     0.227    -0.114     0.367     0.966
pC3           0.842    -0.220    -0.065    -0.356     0.243     0.947
pC7           0.624    -0.442    -0.074    -0.586    -0.001     0.934
pC9           0.319    -0.759     0.232     0.049     0.499     0.983
pC13          0.624    -0.667     0.219    -0.159     0.220     0.956
MR4'          0.227    -0.868    -0.303    -0.265     0.066     0.970
σ4'²          0.075    -0.079    -0.957    -0.005    -0.128     0.944
IOCH3         0.142    -0.192     0.132    -0.072     0.954     0.990
IN           -0.025     0.038    -0.973     0.134    -0.066     0.971
Ind          -0.006     0.064    -0.141     0.981    -0.065     0.990
Im           -0.665     0.540     0.250    -0.192     0.022     0.833
MR3'         -0.939     0.149     0.149    -0.102     0.004     0.937
% variance    0.306     0.219     0.184     0.125     0.117     0.951
Table X - QAAR equations among matrix metalloproteinase
inhibition activities of (S)-3-methyl-2-(substituted
biphenyl-4-sulfonylamino)butyric acids [n = 14]

Each entry gives the equation@ followed by its statistics: EV%, CV%,
F∇, (r or R), (SEE), (AVRES).

C1    pC2 = 0.939 (± 0.350) pC1 + 0.925 (± 0.558) MR4' + 2.230 (± 0.720)
      EV% 84.5; CV% 5.3; F 36.5; (0.932) (0.253) (0.163)
C2    pC1 = 0.782 (± 0.356) pC3 - 0.853 (± 0.844) σ4'² - 1.509 (± 1.727)*
      EV% 65.4; CV% 12.9; F 13.3; (0.841) (0.278) (0.214)
C3    pC1 = 0.626 (± 0.397) pC7 - 0.497 (± 0.451) IN + 1.012 (± 0.822)
      EV% 55.7; CV% 14.6; F 9.2; (0.791) (0.315) (0.198)
C4    pC1 = 0.692 (± 0.428) pC7 - 0.914 (± 1.023) σ4'² + 0.919 (± 0.868)
      EV% 49.7; CV% 15.6; F 7.4; (0.758) (0.335) (0.227)
C5    pC7 = 0.569 (± 0.418) pC1 - 0.625 (± 0.544) Ind + 0.859 (± 0.938)*
      EV% 57.1; CV% 15.9; F 9.6; (0.798) (0.317) (0.233)
C6    pC9 = 0.595 (± 0.160) pC1 + 0.842 (± 0.596) MR4'
      EV% 70.5; CV% 15.9; F 293.5; (0.853) (0.273) (0.224)
C7    pC13 = 0.851 (± 0.281) pC1 + 0.910 (± 0.449) MR4' + 2.133 (± 0.579)
      EV% 88.1; CV% 4.6; F 49.2; (0.948) (0.203) (0.136)
C8    pC2 = 0.903 (± 0.065) pC3 + 1.060 (± 0.568) MR4' - 1.323 (± 0.772) σ4'²
      EV% 86.3; CV% 5.0; F 1866.5; (0.940) (0.238) (0.186)
C9    pC2 = 0.789 (± 0.518) pC7 + 0.798 (± 0.690) IOCH3 + 3.042 (± 1.044)
      EV% 61.2; CV% 8.5; F 11.3; (0.820) (0.401) (0.276)
C10   pC2 = 1.156 (± 0.345) pC9 + 2.744 (± 0.617)
      EV% 80.0; CV% 6.1; F 53.2; (0.903) (0.287) (0.235)
C11   pC2 = 1.065 (± 0.028) pC13
      EV% 88.7; CV% 4.6; F 6792.2; (0.942) (0.216) (0.172)
C12   pC3 = 0.793 (± 0.374) pC7 + 3.273 (± 0.766)
      EV% 61.0; CV% 6.2; F 21.3; (0.800) (0.299) (0.220)
C13   pC7 = 0.332 (± 0.077) pC3 + 0.758 (± 0.638) MR4'
      EV% 64.4; CV% 14.5; F 346.5; (0.819) (0.288) (0.203)
C14   pC13 = 0.841 (± 0.071) pC3 + 1.010 (± 0.628) MR4' - 1.103 (± 0.853) σ4'²
      EV% 80.1; CV% 5.9; F 1343.2; (0.912) (0.263) (0.196)
C15   pC7 = 0.441 (± 0.429) pC9 - 0.738 (± 0.594) Ind + 1.343 (± 0.776)
      EV% 46.8; CV% 17.7; F 6.7+; (0.741) (0.353) (0.244)
C16   pC13 = 0.527 (± 0.563) pC7 + 1.110 (± 0.976) MR4' - 1.179 (± 1.066) σ4'² + 3.000 (± 1.232)
      EV% 70.5; CV% 7.2; F 11.3; (0.879) (0.321) (0.220)
C17   pC9 = 0.717 (± 0.290) pC13 - 1.464 (± 1.296)
      EV% 68.4; CV% 16.5; F 29.1; (0.841) (0.283) (0.215)

@95% confidence intervals of regression constants are shown
within parentheses; 't' values of the regression constants are
significant at 95% level unless superscripted with * (significant
at 90% level) [df = n-np-i; n = no. of data points; np = no. of
predictor variables; i = 1 if intercept is present, i = 0
otherwise]
∇F values are significant at 99% level except + (significant at
95% level); df = (np, n-np-i).
hydrophobic interaction and π-π stacking with aromatic amino
acid residues. The present analysis has shown that the steric bulk
of the 4'-substituents has a positive impact on pC2, pC9 and pC13,
suggesting the possibility of dispersion interaction. Again, highly
electron withdrawing or electron donating 4'-substituents have a
detrimental effect on pC1, pC2, pC9 and pC13 and this suggests
that an optimum electronic distribution of the phenyl ring is
required for π-π stacking. Again, except for pC9, the steric bulk
of a meta (and possibly also ortho) substituent or its mere
presence decreases the activities and this is indicative of the
presence of a narrow channel in the active site, which is in good
agreement with the results of X-ray study on the enzymes3.
Exploring QAAR
For deriving QAAR equations, factor analysis of the data matrix
consisting of all activity values and important predictor
variables was made (Table IX) and, from the pattern of factor
loadings, the QAAR equations (Table X) were constructed. The
equations may be helpful in determining the selectivity of the
compounds for a specific activity and in estimating one activity
when another is known.
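A QAAR equation can be used in the way just described with very
little code. The sketch below applies Eq. C10 of Table X
(pC2 = 1.156 pC9 + 2.744) to a few compounds and compares the
estimates with the observed pC2 values of Table I. The function name
is hypothetical, and the example is meant only to show the mechanics,
not to add any new result.

```python
def pc2_from_pc9(pc9: float) -> float:
    """Estimate pC2 from pC9 using Eq. C10 (Table X)."""
    return 1.156 * pc9 + 2.744


# compound no.: (observed pC9, observed pC2), taken from Table I
observed = {1: (1.585, 4.398), 5: (1.000, 3.272), 8: (2.409, 5.699)}

for number, (pc9, pc2) in observed.items():
    estimate = pc2_from_pc9(pc9)
    print(number, round(estimate, 3), round(pc2 - estimate, 3))
```

A large residual for a compound indicates that the two activities do
not run parallel for it, which is the kind of information relevant to
selectivity.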
Acknowledgement
The author thanks Prof. (Mrs.) C Sengupta, Prof. A U De and Sri
Dipak Kumar Pal of the Department of Pharmaceutical Technology,
Jadavpur University, Calcutta (India) for inspiration.
References and Notes
1 Whittaker M, Floyd C D, Brown P & Gearing A J H, Chem Rev, 99,
1999, 2735.
2 Scozzafava A & Supuran C T, J Med Chem, 43, 2000, 1858.
3 O'Brien P M, Ortwine D F, Pavlovsky A G, Picard J A, Sliskovic
D R, Roth B D, Dyer R D, Johnson L L, Man C F & Hallak H, J Med
Chem, 43, 2000, 156.
4 Franke R & Gruska A, in Chemometric Methods in Molecular
Design, edited by H van de Waterbeemd (VCH, Weinheim) 1995, pp
113-157.
5 Franke R, Theoretical Drug Design Methods, (Elsevier,
Amsterdam) 1984, pp 184-193.
6 Snedecor G W & Cochran W G, Statistical Methods, (Oxford & IBH
Publishing Co Pvt Ltd, New Delhi) 1967, pp 381-418.
7 Kier L B & Hall L H, in Advances in Drug Design, Vol. 22,
(Academic Press, New York) 1992, pp 1-38.
8 Wold S & Eriksson L, in Chemometric Methods in Molecular
Design, edited by H van de Waterbeemd (VCH, Weinheim) 1995, 312.
9 Franke R, Theoretical Drug Design Methods, (Elsevier,
Amsterdam) 1984, 41.
10 For example: Roy K, Pal D K, De A U & Sengupta C, Indian J
Chem, 38B, 1999, 1194.
11 MINITAB is a statistical package of Minitab Inc. (New York).
12 The GW-BASIC programs RRR98, KRPRES1 and KRPRES2 were
developed by Kunal Roy (1998) and standardized on known datasets.
13 Kubinyi H, in Burger's Medicinal Chemistry and Drug
Discovery, edited by M E Wolff, 5th edition, Vol. 1, (John Wiley &
Sons, New York) 1995, pp 507-509.