QSAR ANALYSIS AND MOLECULAR DOCKING STUDY OF PYRROLO- AND PYRIDOQUINOLINECARBOXAMIDES WITH DIURETIC ACTIVITY

The aim. The aim of the study was to establish quantitative structure-activity relationships (QSAR) and to ascertain the possible mechanism of action by a docking study in a series of tricyclic quinoline derivatives with diuretic activity. Materials and methods. Pyrrolo- and pyridoquinolinecarboxamides with proven diuretic activity were involved in the study. Molecular descriptors were calculated using HyperChem and DRAGON software, and QSAR models were built using BuildQSAR software. For receptor-oriented flexible docking, the Autodock 4.2 software package was used. Results. Multivariate linear QSAR models were built on two datasets of quinolinecarboxamides: Vol=a∙X1+b∙X2+c∙X3+d, where Vol is the volume of urine produced daily in rats and Xi is a molecular descriptor. QSAR analysis showed that the diuretic activity is determined by the geometric and spatial structure of the molecules, logP, the energy values, and RDF- and 3D-MoRSE-descriptors. Based on internal and external validation of the models, the most informative two-parameter linear QSAR model 3a was proposed. Docking data showed a high affinity of two lead compounds for carbonic anhydrase II. Conclusions. QSAR analysis of tricyclic quinoline derivatives revealed that the diuretic activity increases with increasing logP, refractivity, and dipole moment and with decreasing volume, surface area, and polarization of the molecules. An increase in energy descriptors such as bonds energy, core-core interaction, and energy of the highest occupied molecular orbital results in higher diuresis; a decrease in hydration energy leads to higher diuretic activity. Based on the molecular docking calculations, the mechanism of diuretic action is proposed to be carbonic anhydrase inhibition. The QSAR models and docking data are useful for in-depth study of the diuretic activity of tricyclic quinolines and could serve as a theoretical basis for de novo design of new diuretics.


Introduction
Modern QSAR is an efficient method for building mathematical models that seek a statistically significant correlation between the chemical structure and a continuous (pIC50, pEC50, Ki, etc.) or categorical/binary (active, inactive, toxic, nontoxic, etc.) biological or toxicological property using regression and classification techniques, respectively [1]. QSAR analysis is among the most important strategies for the successful design of new molecules: it helps to identify hits, generate leads, and accelerate the optimization of leads into drug candidates [2]. QSAR models are applied to identify promising compounds with desired properties [3]. In recent decades QSAR has undergone several transformations, ranging from the dimensionality of the molecular descriptors (from 1D to nD) to the methods used for finding a correlation between the chemical structures and the biological property [4]. The method is widely used by medicinal chemists in the development of drug candidates with various pharmacological activities [5][6][7], and it is especially relevant now, supporting the urgent search for drugs against SARS-CoV-2 [8].
Molecular docking is a computational method used to estimate the binding strength between the active site residues and specific molecule(s). It is an expedient tool for investigating the binding compatibility of ligands with a target (receptor) and thus for selecting active molecules, predicting their mechanism of action, and optimizing the lead structure [9]. The combination of QSAR analysis and docking-based scoring is a widely used strategy in the structure-based drug discovery field [10].
Derivatives of 4-hydroxy-2-oxo-1,2-dihydroquinoline-3-carboxylic acid are among the key objects in the search for biologically active substances carried out at the National University of Pharmacy. These compounds have shown various types of pharmacological activity, and diuretic activity is one of them. Diuretics are an important class of medicines because they are the preferred therapy for widespread cardiovascular and non-cardiovascular diseases [11][12][13]. First of all, diuretics are the most common initial treatment for mild hypertension, and thiazide diuretics have been listed as one of the equally weighted first-line treatment options alongside β-blockers, calcium antagonists, and angiotensin converting enzyme inhibitors/angiotensin receptor blockers [14]. Clinically used diuretic drugs generally exhibit an overall favourable risk/benefit balance. However, they are not devoid of side effects such as loss of electrolytes (hyponatraemia, hypokalemia, hypomagnesaemia), hyperuricemia, hypercholesterolemia [15], and diuretic resistance [16]. That is why new molecules with a better pharmacological profile are always needed. High-throughput screening, virtual screening methods, progress in protein structure analysis, and modern methods of chemical modification have opened good possibilities for the identification of new candidates for preclinical and clinical testing, revealing promising classes of diuretics with novel mechanisms of action, for example, vasopressin receptor antagonists, SGLT2 inhibitors, urea transporter inhibitors, aquaporin antagonists, ROMK inhibitors, and so on [17].
The aim of this study was to establish QSAR in a series of tricyclic quinoline derivatives with diuretic activity and to ascertain the affinity of the most active compounds for carbonic anhydrase as a potential biological target, thus providing a theoretical background for hit selection and lead optimization in the design of a new quinolone diuretic substance.

Planning (methodology) of research
To achieve the aim mentioned above, the methodology of the QSAR analysis was developed (Fig. 1).
Carbonic anhydrase II (PDB ID: 1Z9Y) was chosen as the biological target for the docking study of the mechanism of diuretic activity because this crystallographic model is co-crystallized with furosemide, a sulfonamide inhibitor (Fig. 2).
For receptor-oriented flexible docking, the Autodock 4.2 software package was used. Ligands were prepared using the MGL Tools 1.5.6 program. Ligand optimization was performed using the Avogadro program. To perform calculations in Autodock 4.2, the receptor and ligand data were converted to the special PDBQT format. A similar software package was used in our previous studies [23]. The active site of carbonic anhydrase II from the Protein Data Bank (PDB) was used as the biological target for docking. The receptor maps were generated in the MGL Tools and AutoGrid programs. Water molecules, ions, and the ligand were removed from the PDB file ID: 1Z9Y. Visual analysis of the complexes of the substances in the active site of carbonic anhydrase II (PDB ID: 1Z9Y) was performed using the Discovery Studio Visualizer program.

Results
The preliminary optimization of the molecular structures of the investigated compounds was carried out using the molecular mechanics method MM+ (HyperChem software package). The final minimization of their energies was carried out using the AM1 semiempirical quantum chemical method until an RMS gradient below 0.01 kcal/(mol·Å) was reached. The AM1 method was chosen because it allows the most accurate calculation of the electronic and spatial structure of oxygen- and nitrogen-containing heterocyclic compounds [24].
HyperChem software [25] was used to calculate the refractivity (R) and polarization (P) of the molecule, the dipole moment (D), and the volume (V) and surface area (S) of the molecule. Energy descriptors such as the total energy of the molecule (TE), the binding energy (BE), the electronic energy (EE), the isolated atomic energy (IAE), core-core interaction (CCI), and heat of formation (HF), as well as descriptors characterizing donor-acceptor interaction, namely the energies of the highest occupied molecular orbital (HOMO) and the lowest unoccupied molecular orbital (LUMO), were calculated as well [26].
3D descriptors of different classes were calculated using the online service DRAGON 7. Molecular RDF (Radial Distribution Function) descriptors characterize the 3D structure of a molecule, its size, and the mutual arrangement of atoms and groups of atoms that influence biological activity. We calculated the 3D RDF-descriptors weighted by mass (RDF085m), by Sanderson electronegativity (RDF130e), and by polarizability (RDF140p). In addition, 3D-MoRSE-descriptors (3D Molecule Representation of Structures based on Electron diffraction) (Mor14m, Mor28m) and GETAWAY-descriptors (GEometry, Topology, and Atom-Weights AssemblY) (H7e, H7v, H8v, H7p, H8p, R8p) were calculated. Descriptors with high pairwise correlation (R>0.95) and constant descriptors were excluded from the multidimensional descriptor space.
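The descriptor pruning step described above can be illustrated with a minimal sketch. It assumes a greedy rule (the first descriptor of each highly correlated pair is kept), which the text does not specify; the function names `pearson_r` and `filter_descriptors` and all numeric values are hypothetical and serve only as an illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long value lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def filter_descriptors(table, r_max=0.95):
    """Drop constant descriptors and descriptors with |R| > r_max to one already kept.

    table maps a descriptor name to its values over all compounds.
    Assumption: descriptors are examined in insertion order (greedy rule).
    """
    kept = []
    for name, values in table.items():
        if max(values) == min(values):
            continue  # constant descriptor carries no information
        if any(abs(pearson_r(values, table[k])) > r_max for k in kept):
            continue  # nearly collinear with a descriptor that is already kept
        kept.append(name)
    return kept


# Hypothetical descriptor values for five compounds (not taken from the study).
descriptors = {
    "V": [700.0, 720.0, 750.0, 790.0, 800.0],
    "S": [430.0, 441.0, 458.0, 480.0, 486.0],  # almost collinear with V
    "logP": [1.2, 0.8, 1.9, 1.1, 1.6],
    "flag": [1.0, 1.0, 1.0, 1.0, 1.0],  # constant
}
kept = filter_descriptors(descriptors)
print(kept)
```

With these illustrative values, the surface area column is removed because it tracks the volume almost linearly, and the constant column is removed outright.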
Based on the experimental data on diuretic activity and the calculated descriptors, regression analysis was carried out by means of BuildQSAR. The daily volumes of urine in rats (Vol) were used as the diuretic activity parameter.
To build an adequate model, the compounds had to be divided into training (for building models) and test (for external validation) datasets. Cluster analysis of the structures was performed using a Euclidean distance-based method. Number of clusters was defined as 4 n. To reduce the size of the analyzed descriptor matrix, descriptors with high pairwise correlation and constant descriptors of compounds from the training set were excluded once more.
The linear models were built by the Multiple Linear Regression (MLR) method. Statistically, the number of compounds tested (N) and the number of independent variables (M) used in the QSAR model should satisfy the ratio N/M ≥ 5. It allows testing all possible combinations. Analysis of the applicability domain and the search for extreme values (outliers) did not reveal any compounds with a significant deviation from the applicability domain. First, descriptors from different groups were used to build individual models; second, the descriptors that most fully describe the change in biological activity were used to obtain mixed models. The GA-MLRA procedure allowed us to choose single- or multi-parameter models with the maximal correlation coefficient (r) and the minimal standard deviation (s) and sum of squares of prediction error (SPRESS).
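As an illustration of the MLR step, the following sketch fits a linear model by ordinary least squares and reports r, s, and the Fisher coefficient F. It is not the BuildQSAR implementation; the helper names `solve` and `fit_mlr` and the dataset are hypothetical, and the N/M ≥ 5 check simply mirrors the rule stated above.

```python
import math


def solve(a, b):
    """Solve the linear system a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_mlr(X, y):
    """Least-squares fit of y = b0 + b1*x1 + ... + bM*xM.

    Returns (coefficients with the intercept first, r, s, F).
    """
    n, m = len(X), len(X[0])
    if n / m < 5:
        raise ValueError("N/M ratio is below 5")
    A = [[1.0] + list(row) for row in X]  # prepend the intercept column
    k = m + 1
    ata = [[sum(A[i][p] * A[i][q] for i in range(n)) for q in range(k)] for p in range(k)]
    aty = [sum(A[i][p] * y[i] for i in range(n)) for p in range(k)]
    coef = solve(ata, aty)  # normal equations (A^T A) b = A^T y
    pred = [sum(c * v for c, v in zip(coef, row)) for row in A]
    y_mean = sum(y) / n
    ss_res = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    r2 = 1.0 - ss_res / ss_tot
    s = math.sqrt(ss_res / (n - m - 1))          # standard deviation of the fit
    f = (r2 / m) / ((1.0 - r2) / (n - m - 1))    # Fisher coefficient
    return coef, math.sqrt(r2), s, f


# Hypothetical data: 10 compounds, 2 descriptors, y close to 2 + 3*x1 - x2.
x1 = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
x2 = [2.0, 1.0, 4.0, 3.0, 6.0, 5.0, 8.0, 7.0, 10.0, 9.0]
noise = [0.2, -0.1, 0.0, 0.1, -0.2, 0.1, 0.0, -0.1, 0.2, -0.2]
y = [2.0 + 3.0 * a - b + e for a, b, e in zip(x1, x2, noise)]
coef, r, s, f = fit_mlr([[a, b] for a, b in zip(x1, x2)], y)
print(coef, r, s, f)
```

Solving the normal equations directly is adequate for a handful of descriptors; for larger or strongly collinear descriptor sets a QR- or SVD-based solver would be the more robust choice.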
It was found that the diuretic activity of the studied compounds is largely determined by the geometric and spatial structure of their molecules and by the values of RDF- and 3D-MoRSE-descriptors. Among the energy characteristics of the molecules of the studied compounds, the energy of the highest occupied molecular orbital (HOMO), the energy of core-core interaction (CCI), and the bonds energy (BE) have the greatest influence on diuretic activity. The activity of the tested compounds also depends on the logP value, which characterizes the hydrophilic-lipophilic properties of the compounds. As a result, two one-parameter, three two-parameter, and nine three-parameter linear QSAR models were obtained (Table 1). According to the analysis of the derived QSAR models 1a-3i, the diuretic activity increases with increasing values of such molecular descriptors as logP, refractivity of the molecule (R), and dipole moment (D) and with decreasing volume (V), surface area (S), and polarization (P) of the molecule. As for the energy parameters, an increase in the bonds energy (BE), core-core interaction (CCI), and energy of the highest occupied molecular orbital (HOMO) leads to higher diuresis; conversely, the lower the hydration energy (EH), the higher the diuretic activity.
The QSAR models obtained are characterized by high predictive ability, as determined by both internal and external validation. The descriptors included in these models have a structural interpretation, correlate well with activity, and do not correlate with other molecular descriptors; a gradual change of descriptor values corresponds to gradual changes in the structure of the compound. Selected models were analyzed according to the following statistical parameters: the correlation coefficient (r), the standard deviation (s), the Fisher coefficient (F), predictive ability (Q²), the sum of squares of prediction error (SPRESS), and the statistical significance (p) (Table 2).
To confirm the predictive ability of the models, they were validated using both the test dataset (external validation) and cross-validation (internal validation). First, the descriptors for each of the test structures were calculated and substituted into the QSAR equation, and then the diuretic activity values were calculated and compared with the already known experimental values. For internal validation, a leave-one-out cross-validation strategy was used: after removing one or several compounds from the initial dataset, a model was built for the reduced sample, and the properties of the removed compounds were predicted using the model obtained. The models were evaluated by the predicted residual error sum of squares (PRESS). The optimal number of components is the one at which the standard deviation (s) in the cross-validation is minimal. The predictive ability of the models was determined by the values of the cross-validation factor (Q²), which was calculated from the sum of squares of the prediction error (SPRESS). High values of Q² for all two-parameter and three-parameter QSAR models (0.765 to 0.880) indicate a high level of statistical quality and predictive ability. Based on the validation, the most informative two-parameter linear QSAR model 3a was chosen (1). This model has r=0.796 and is characterized by sufficient adequacy (F=31.287) and predictive ability (Q²=0.871). As the next step of the study, we validated model 3a on the test dataset.
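The leave-one-out procedure can be sketched as follows. For brevity the example uses a one-descriptor linear model and the common definition Q² = 1 − PRESS/Σ(y − ȳ)², which is assumed here rather than taken from the BuildQSAR documentation; the names `fit_line` and `loo_q2` and the data are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope*x + intercept."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    slope = sum((a - mx) * (b - my) for a, b in zip(x, y)) / sum((a - mx) ** 2 for a in x)
    return slope, my - slope * mx


def loo_q2(x, y):
    """Leave-one-out cross-validation: returns (Q2, PRESS)."""
    press = 0.0
    for i in range(len(x)):
        # Refit the model without compound i, then predict compound i.
        slope, intercept = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - (slope * x[i] + intercept)) ** 2
    y_mean = sum(y) / len(y)
    ss_tot = sum((v - y_mean) ** 2 for v in y)
    return 1.0 - press / ss_tot, press


# Hypothetical descriptor values and activities for eight compounds.
x_demo = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
y_demo = [2.1, 3.9, 6.2, 7.8, 10.1, 12.2, 13.8, 16.1]
q2, press = loo_q2(x_demo, y_demo)
print(q2, press)
```

Because every prediction is made for a compound that was excluded from the fit, Q² is normally lower than the squared correlation coefficient of the model built on the full training set.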
The data obtained showed that the model gives a good correlation between the experimental and calculated values of diuretic activity (Fig. 4; the diagonal on the graph is the function y=x).
Vc = 46.32(±30.641)∙R + 0.512(±0.178)∙CCI + 132.781(±19.075)∙RDF025u ± 809.120    (1)
Fig. 4. Validation of the QSAR model 3a: dependence of the observed and predicted diuretic activities
Based on the results of molecular docking, the following data were calculated: the scoring function indicating the enthalpy contribution to the free energy of binding (Affinity DG) for the best conformational positions, and the values of the free energy of binding and the binding constants (EDoc, kcal/mol, and Ki, µM) for a specific conformational position of the ligand. These values allow the stability of the complexes formed between the ligands and the corresponding receptor to be assessed (Table 3).
Thus, it can be assumed that the inhibitory activity of the tested molecules towards the receptor PDB ID: 1Z9Y can be realized through the formation of complexes between them. The stability of these complexes is provided mainly by the energetically favourable geometric location of the ligands in the active site of this receptor, the formation of hydrogen bonds between them, and intermolecular electrostatic and donor-acceptor interactions. The thermodynamic probability of such binding is confirmed by the negative values of the scoring function (Affinity DG, kcal/mol), the calculated values of the free energy of binding EDoc (kcal/mol), and the binding constants Ki (µM) (Table 3).
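The relation between the free energy of binding and the binding constant can be made explicit with a short sketch. It assumes the standard thermodynamic estimate Ki = exp(ΔG/(R·T)) at T = 298.15 K, the form in which docking programs usually report an estimated inhibition constant; the function name `ki_micromolar` and the energy of −7.0 kcal/mol are hypothetical and are not values from Table 3.

```python
import math

R_KCAL = 1.98720e-3  # gas constant, kcal/(mol*K)


def ki_micromolar(delta_g_kcal, temperature_k=298.15):
    """Estimate the binding constant Ki (in micromolar) from the free energy of binding."""
    ki_molar = math.exp(delta_g_kcal / (R_KCAL * temperature_k))
    return ki_molar * 1e6  # mol/L -> micromol/L


# Hypothetical binding energy, used only to show the order of magnitude.
ki_demo = ki_micromolar(-7.0)
print(round(ki_demo, 2))
```

Under this relation a change of about 1.36 kcal/mol in the binding energy at room temperature corresponds to a tenfold change in Ki, so more negative energies correspond to smaller binding constants and tighter complexes.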
To understand how the affinity of the studied molecules for the target arises, a detailed analysis of the geometric location of these molecules in the active site of the receptor was conducted. Molecule Ia forms a complex with carbonic anhydrase II through hydrogen bonds between the oxygen and hydrogen atoms of its carbonyl and hydroxyl groups and the amino acid residues glutamine Gln92, asparagine Asn67, threonine Thr199, threonine Thr200, proline Pro201, and histidine His64. The complex is additionally stabilized by π-alkyl and alkyl intermolecular interactions with the amino acid residues Phe131, Ile91, Pro202, Val121, and Leu198 (Fig. 5).
The formation of the complex of molecule Ib with carbonic anhydrase II is facilitated by π-cation and π-π interactions between a fragment of the molecule and the histidine residue His94. Molecule Ib also forms a hydrogen bond between the oxygen atom of its carbonyl group and the amino acid residue glutamine Gln92, and a π-π interaction between its phenyl ring and the phenylalanine residue Phe131. The complex is further stabilized by π-alkyl and alkyl interactions between fragments of the molecule and the amino acid residues Leu141, Leu198, Val135, Val143, Val207, and Pro202 (Fig. 6).
The interatomic distances in the active site of carbonic anhydrase II between fragments of molecules Ia and Ib and the amino acid residues, together with the categories and types of intermolecular interactions, are given in Table 4.
[30]), and E. coli inhibitors (Metelytsia L. et al. [31]) was performed recently as well. However, no relationships were analyzed for derivatives with diuretic activity. Recent publications show that the current strategy of new drug development includes the QSAR method, especially one based on the calculation of 3D descriptors, accompanied by molecular docking, synthesis, and biological testing.
Study limitations. Since this research dealt with tricyclic quinolone carboxamides, substantial changes in the structures of the studied compounds would require building and validating new QSAR models and performing new docking calculations.
Prospects for further research. Based on the QSAR models developed and the docking data, screening of a virtual library of tricyclic quinolone compounds is planned in order to find promising structures with diuretic properties. Together with other in silico resources, such as virtual prediction of types of biological activity, mechanisms of action, and toxicity, QSAR analysis and molecular docking make a well-founded selection of hits possible. With a limited list of promising structures, the cost of their laboratory synthesis would be greatly reduced. The proposed method [20,21] for the synthesis of 6-hydroxy-2-methyl-4-oxo-2,4-dihydro-1H-pyrrolo[3,2,1-ij]quinoline-5-carboxamides (I) and 7-hydroxy-5-oxo-2,3-dihydro-1H,5H-pyrido[3,2,1-ij]quinoline-6-carboxamides (II) makes it possible to vary the substituents on the amide group to obtain related compounds and to complete the structure optimization of an identified lead compound in the future.

Conclusions
QSAR analysis of tricyclic quinoline derivatives revealed that the diuretic activity increases with increasing logP, refractivity, and dipole moment and with decreasing volume, surface area, and polarization of the molecule. An increase in the values of energy descriptors such as bonds energy, core-core interaction, and energy of the highest occupied molecular orbital results in higher diuresis; a decrease in hydration energy leads to higher diuretic activity as well.
Taking into account the detailed analysis of the location of the tested molecules in the active site of carbonic anhydrase II (PDB ID: 1Z9Y), the formation of a number of intermolecular interactions between them, the negative values of the scoring functions, and the calculated values of the binding constants, it can be concluded that the tested molecules have an affinity for this biological target. This suggests that the probable mechanism of diuretic activity of the studied pyrroloquinolines is carbonic anhydrase inhibition.
The QSAR models and docking data obtained are useful for the experimental study of the diuretic activity of tricyclic quinoline derivatives, for the optimization of their chemical structure, and for the identification of hits and promising compounds with desired properties; they help to minimize the labor, time, and cost of preclinical testing of diuretic activity. As a result, they provide a theoretical basis for de novo design of new medicinal drugs with diuretic action for edema and hypertension management.