Computer-aided design of aptamers for SMMC-7721 liver carcinoma cells

  • Xinliang Yu, Jiyong Deng and Qiuping Guo
Published/Copyright: March 1, 2017

Abstract

Objective

A pattern recognition model was developed for aptamers against SMMC-7721 by applying the support vector machine (SVM) algorithm. Subsequently, according to computer-aided aptamer design, eight DNA aptamer sequences were synthesized and tested.

Methods

Candidate aptamer sequences for SMMC-7721 liver carcinoma cells were obtained by whole cell-SELEX. Descriptors were calculated from their reverse-complement sequences to develop a structure-activity relationship (SAR) model. SVM was used to classify candidate aptamer sequences as having high or low affinity for SMMC-7721 liver carcinoma cells. Guided by the model predictions, we designed, synthesized and tested eight DNA aptamer sequences against SMMC-7721.

Results

Five molecular descriptors from reverse-complement sequences were selected to develop the SAR pattern recognition model. The predicted fractions of winner aptamers with high affinity in the 3rd, 5th, 7th, 9th, 11th, and 13th rounds of SELEX selection are 0.09, 0.17, 0.69, 0.84, 0.90 and 0.98, respectively. The fitted curve and the corresponding exponential equation conform to the aptamer evolutionary principles of SELEX-based screening. The newly designed sequences, which belong to the class of sequences with high binding affinity, have experimental dissociation constants (Kd) in the nanomolar range.

Conclusion

The feasibility of applying computer-aided aptamer design has been demonstrated.

Özet (Abstract)

Aim

Using the support vector machine (SVM) algorithm, a pattern recognition model was developed for aptamers against SMMC-7721. Subsequently, on the basis of computer-aided aptamer design, eight DNA aptamer sequences were synthesized and tested.

Method

Candidate aptamer sequences for SMMC-7721 liver carcinoma cells were obtained by whole cell-SELEX. The reverse-complement sequences of these aptamers were used in descriptor calculations to develop a structure-activity relationship (SAR) model. SVM was adopted for pattern recognition of candidate sequences with high and low affinity for SMMC-7721 liver carcinoma cells. Based on the model predictions, eight DNA aptamers against SMMC-7721 were designed, synthesized and tested.

Findings

Five molecular descriptors from the reverse-complement sequences were identified to develop the SAR pattern recognition model. The predicted fractions of successful high-affinity aptamers in the 3rd, 5th, 7th, 9th, 11th, and 13th rounds of SELEX selection were determined to be 0.09, 0.17, 0.69, 0.84, 0.90 and 0.98. The fitted curve for these results and the corresponding exponential equation are consistent with SELEX selection based on the principles of aptamer evolution. The dissociation constants (Kd) of these newly designed sequences, which belong to the class of sequences with high binding affinity, are at nanomolar levels.

Conclusion

The validity of computer-aided aptamer design has been demonstrated.

Introduction

Aptamers are generally 15–60 nucleotides in length. They can fold into complex three-dimensional conformations and bind targets, including molecules, cancer markers and cells, with high affinity and specificity [1], [2]. This binding enables their application not only as receptors in food and environmental analysis, but also as new molecular medicines and as molecular probes for clinical diagnostics and related applications. Compared with other molecular probes, such as antibodies, ligands and enzyme substrates, nucleic acid aptamers are relatively new and promising. They offer inherent advantages in stability and in ease of generation and synthesis, and they also exhibit fast blood clearance and rapid tissue and tumor penetration [3], [4], [5]. Aptamers are typically selected from libraries of random DNA (or RNA) sequences by systematic evolution of ligands by exponential enrichment (SELEX) [6], [7]. The SELEX process partitions bound candidate aptamers from non-binding sequences by means of an affinity method and amplifies the bound sequences by polymerase chain reaction (PCR). In cell-SELEX, whole cells are used as targets [8], [9]. This method has been used to obtain aptamers that identify molecular differences among cancer cells [10], [11].

SELEX experiments generally need 10–15 selection rounds to complete, so the technique is relatively labor-intensive, time-consuming and expensive. Structure-activity relationship (SAR) models, including pattern recognition (classification) models, can be used to reveal the relationships between chemical structures and properties. According to classical chemical theory, it is the molecular structure that determines the chemical, biological, and physical properties. These properties may therefore be elucidated from structural information, which can be expressed through descriptive parameters (descriptors). SAR models can be used to screen a series of molecules on the computer, including those not yet synthesized, in order to select chemical structures with the desired properties. The most promising candidates can then be synthesized for laboratory testing [12]. Thus, computer-aided aptamer design based on a pattern recognition approach can accelerate the process of selecting DNA (or RNA) sequences for use as aptamers or for any other purpose. However, to our knowledge, only a few SAR models have been reported for aptamer sequences [13], [14], [15]. There may be two reasons for this. One is that experimental data for aptamers, such as affinity, are insufficient. The other is that aptamer sequences have large molecular weights, which makes descriptors difficult to calculate.

Designing and synthesizing aptamers for hepatocellular carcinoma (HCC) with high sensitivity and specificity would be of great importance for scientific research and for the clinical diagnosis and treatment of cancer [11]. Schütze et al. [16] found that reverse-complement sequences can recognize targets. The aim of this work is to develop a pattern recognition model (or SAR model) for aptamers against SMMC-7721 liver carcinoma cells, using molecular descriptors from reverse-complement sequences as the input variables of the support vector machine (SVM) classification model. Subsequently, guided by this computer-aided aptamer design, eight DNA aptamer sequences are synthesized and tested.

Materials and methods

Candidate aptamer sequences selected for SMMC-7721 liver carcinoma cells through whole cell-SELEX were taken from our laboratory (see Table S1 in the Supplementary data). After deleting the sequences whose secondary structures contained no base pairs, 226 reverse-complement sequences (5′-GATGACGACCGACTGACTTC – (center sequence) – GGTAGTCAGTGGCAAGTCAA-3′) were obtained and split into two sets: a training set (60 reverse-complement sequences) and a test set (166 reverse-complement sequences).
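The construction of a reverse-complement sequence can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' pipeline: the function names and the `center_of` helper are hypothetical, and only the two flanking regions are taken from the text.

```python
# Flanking regions quoted in the text for the reverse-complement constructs.
LEFT_FLANK = "GATGACGACCGACTGACTTC"
RIGHT_FLANK = "GGTAGTCAGTGGCAAGTCAA"

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence written 5'->3'."""
    seq = seq.upper()
    if set(seq) - set("ACGT"):
        raise ValueError("sequence must contain only A, C, G, T")
    # Complement every base, then reverse so the result also reads 5'->3'.
    return seq.translate(_COMPLEMENT)[::-1]


def center_of(full_seq: str) -> str:
    """Strip both flanks from a full construct (hypothetical helper)."""
    if not (full_seq.startswith(LEFT_FLANK) and full_seq.endswith(RIGHT_FLANK)):
        raise ValueError("flanking regions not found")
    return full_seq[len(LEFT_FLANK):len(full_seq) - len(RIGHT_FLANK)]
```

Applying `reverse_complement` twice returns the original sequence, which is a convenient sanity check when preparing input for structure prediction.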

The training set was drawn at random from the reverse-complement sequences of the 3rd, 11th and 13th rounds of selection. The class labels (i.e. target values) of aptamers from the 3rd, 11th and 13th rounds of SELEX were 1, 2, and 2, respectively, where class labels 1 and 2 denote aptamers with low and high affinity and specificity, respectively. The training set, comprising 35 sequences with class label 1 and 25 sequences with class label 2, was used to train and optimize the pattern recognition models. The test set, which included reverse-complement sequences from the 3rd, 5th, 7th, 9th, 11th, and 13th rounds, was used later to check the predictive power of the developed model.

During the first few SELEX rounds, the binding affinity increases significantly (that is, Kd decreases) [17], [18]. After a certain number of rounds, the affinity reaches a plateau. It is therefore reasonable to assign class labels 1, 2, and 2 to aptamer sequences from the 3rd, 11th and 13th rounds of SELEX, respectively.

RNAstructure software version 5.3 [19] was used to calculate the free energy (E) and obtain the secondary structures of the center sequences of the reverse-complement sequences. The loop of each secondary structure that has smaller hairpin structures and larger stems was used to calculate molecular descriptors. Each loop structure was sketched using ChemBioDraw Ultra 11.0 [20] and optimized using ChemBio3D Ultra 11.0 [20] with the default convergence criteria of the software. Each energy-minimized molecule was then passed to the Dragon software [21], which calculated 1664 molecular descriptors grouped into 20 blocks. It should be noted that ChemBio3D cannot determine the absolute or precise 3D structures of molecules as complex as those in this work. However, the errors due to the approximate nature of the ChemBio3D method are largely transferable within structurally related series [12]. Thus, relative structures optimized with ChemBio3D can be meaningful even though their absolute and precise structures are not directly applicable.

Besides the free energy (E) and the 1664 molecular descriptors above, the percentages of adenine (A%), guanine (G%), cytosine (C%), and thymine (T%) in each center sequence were calculated. In total, 1669 molecular descriptors were calculated in this work for each aptamer.
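The four base-composition descriptors are simple to reproduce. The following is a minimal sketch, assuming the center sequence is given as a plain string of A, G, C and T; the function name is hypothetical.

```python
from collections import Counter


def base_percentages(center_seq: str) -> dict:
    """Return A%, G%, C% and T% for a center sequence."""
    seq = center_seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    counts = Counter(seq)
    # Counter returns 0 for bases that do not occur in the sequence.
    return {f"{base}%": 100.0 * counts[base] / len(seq) for base in "AGCT"}
```

For example, `base_percentages("AAGC")` gives 50% A, 25% G, 25% C and 0% T.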

One of the 20 descriptor blocks, the 3D-MoRSE (3D Molecule Representation of Structures based on Electron diffraction) block, consists of 160 molecular descriptors. Our previous study suggests that 3D-MoRSE descriptors are important in reflecting structural information [15]. The 3D-MoRSE descriptors are calculated with the following expression [20]:

$$\mathrm{Mor}_{sw} = \sum_{i=1}^{nAT-1} \sum_{j=i+1}^{nAT} w_i\, w_j\, \frac{\sin(s\, r_{ij})}{s\, r_{ij}}, \tag{1}$$

where Mor_sw is the scattered electron intensity; w is an atomic property, i.e. the unweighted case (u), atomic mass (m), van der Waals volume (v), Sanderson atomic electronegativity (e) or atomic polarizability (p); r_ij is the distance between atoms i and j; and nAT is the number of atoms. The term s represents the scattering in various directions by a collection of nAT atoms and takes integer values in the range 0–31.
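The double sum in Eq. (1) can be evaluated directly from 3D coordinates and per-atom weights. The sketch below is an illustration of the formula only, not a reimplementation of Dragon: the function name and input layout are assumptions, and Dragon's own scaling of atomic properties is not reproduced. Since s runs over 0–31 while the descriptor names run up to Mor32, it seems plausible that signal k corresponds to s = k − 1, but that mapping is an assumption here.

```python
import math
from itertools import combinations


def morse_descriptor(coords, weights, s):
    """Sum w_i * w_j * sin(s * r_ij) / (s * r_ij) over all atom pairs i < j.

    coords  : list of (x, y, z) tuples, one per atom
    weights : list of atomic property values w_i (use 1.0 for the unweighted case)
    s       : scattering parameter
    """
    total = 0.0
    for i, j in combinations(range(len(coords)), 2):
        r_ij = math.dist(coords[i], coords[j])
        x = s * r_ij
        # sin(x)/x tends to 1 as x -> 0, which covers the s = 0 signal.
        sinc = 1.0 if x == 0 else math.sin(x) / x
        total += weights[i] * weights[j] * sinc
    return total
```

With s = 0 and unit weights, the result is simply the number of atom pairs, which gives a quick check on the implementation.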

Results and discussion

With the SELEX round as the dependent variable and the 1669 molecular descriptors as the independent variables, stepwise multiple linear regression (MLR) was applied to the training set to select the optimal subset of variables for the SVM classification model. By investigating the correlation between the 1669 descriptors and the SELEX round with SPSS 11.5, an optimal descriptor subset was obtained to describe the structural features of each aptamer: E, Mor26e (3D-MoRSE - signal 26 / weighted by atomic Sanderson electronegativities), Mor32p (3D-MoRSE - signal 32 / weighted by atomic polarizabilities), A%, and G%. Their values are listed in Table S2 (see Supplementary data). The two 3D-MoRSE descriptors, Mor26e and Mor32p, retain important structural features of electronegativities and polarizabilities, respectively [15], [21]. The 3D-MoRSE descriptors used thus reflect the electronic distribution of the loops, which is related to the induced-fit behavior and molecular recognition of aptamers. The free energy (E) is related to the stem length of the secondary structure of the center reverse-complement sequences and is correlated with the conformational stability of the secondary structure.

Table S2 indicates that an aptamer with a higher A% (or lower G%) value tends to possess high affinity and selectivity for SMMC-7721 liver carcinoma cells. According to classical theory, entropy is commonly associated with the degree of disorder in a system. During the SELEX process, the amount of disorder (or the entropy) in a center sequence gradually decreases, which means that the fraction of a given base increases or decreases. In addition, the bases A and G differ in chemical composition and structure, and these differences may contribute to molecular recognition through shape, electrostatics, dynamics and entropy, resulting in differences in the affinity and selectivity of aptamers. Therefore, the descriptors A% and G% are related to the binding affinity of aptamers for their targets.

The optimal descriptor subset (E, Mor26e, Mor32p, A%, and G%) was then used to develop a support vector classification (SVC) model with the Gaussian radial basis function (RBF) as the kernel function. The SVM parameters C and γ were optimized with the particle swarm optimization (PSO) method [22]. During the optimization, both the cognitive learning factor c1 and the social learning factor c2 were set to 2. The number of particles and the maximum number of iterations were set to 20 and 200, respectively. The search range of the SVM parameter C was 0.1–1000, and the search range of γ was also 0.1–1000. The leave-one-out (LOO) cross-validation procedure was carried out on the training set during the optimization of C and γ. The LibSVM package (Version 3.12) [23] was used to develop the SVC model.
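The PSO settings above can be illustrated with a compact, standard-library sketch. The swarm size, iteration count, learning factors and search range follow the text; the inertia weight, the random seed, the position clamping and the `toy_loo_error` objective are assumptions introduced only for illustration. In the actual workflow the objective would be the LOO cross-validation error of the LibSVM model at a given (C, γ), which is not reproduced here.

```python
import math
import random


def pso_minimize(objective, bounds, n_particles=20, n_iter=200,
                 c1=2.0, c2=2.0, inertia=0.7, seed=0):
    """Minimize objective(position) inside box bounds with a basic PSO.

    Returns (best_position, best_value, best_value_of_initial_swarm).
    """
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    initial_best = gbest_val
    for _ in range(n_iter):
        for k in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                r1, r2 = rng.random(), rng.random()
                vel[k][d] = (inertia * vel[k][d]
                             + c1 * r1 * (pbest[k][d] - pos[k][d])   # cognitive term
                             + c2 * r2 * (gbest[d] - pos[k][d]))     # social term
                # Keep the particle inside the search range.
                pos[k][d] = min(hi, max(lo, pos[k][d] + vel[k][d]))
            val = objective(pos[k])
            if val < pbest_val[k]:
                pbest[k], pbest_val[k] = pos[k][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[k][:], val
    return gbest, gbest_val, initial_best


def toy_loo_error(params):
    """Hypothetical smooth stand-in for the LOO error surface over (C, gamma)."""
    c, gamma = params
    return (math.log10(c) - 0.5) ** 2 + (math.log10(gamma) + 0.5) ** 2
```

A call such as `pso_minimize(toy_loo_error, [(0.1, 1000.0), (0.1, 1000.0)])` searches the same range of C and γ as described above. Because the global best only ever improves, the returned value is never worse than the best particle of the initial swarm.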

The optimization yielded SVM parameters of C=4.6 and γ=0.1. The SVC model with these parameters has a classification accuracy of 98.3% for the training set. To evaluate the model, we calculated prediction values for the test set, which are listed in Table S2. For the 3rd, 11th, and 13th round experiments, the specificity and sensitivity are 90.91% and 84.91%, respectively. The fractions of winner aptamers with high affinity and specificity against targets for the 3rd, 5th, 7th, 9th, 11th, and 13th round experiments are 0.09, 0.17, 0.69, 0.84, 0.90 and 0.98, respectively. From these fractions, a fitted curve and the corresponding exponential equation were obtained (see Figure 1). The results agree with the aptamer evolutionary principles of SELEX-based screening described above.
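The evaluation quantities used here (sensitivity, specificity and the per-round fraction of predicted winners) can be computed from true and predicted class labels as sketched below. The function names are hypothetical, and treating class 2 as the positive class is an assumption, since the text does not state which class was taken as positive.

```python
def sensitivity_specificity(true_labels, predicted_labels, positive=2):
    """Return (sensitivity, specificity) for a two-class problem."""
    tp = fn = tn = fp = 0
    for t, p in zip(true_labels, predicted_labels):
        if t == positive:
            if p == positive:
                tp += 1
            else:
                fn += 1
        else:
            if p == positive:
                fp += 1
            else:
                tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in true_labels")
    return tp / (tp + fn), tn / (tn + fp)


def winner_fraction(predicted_labels, winner=2):
    """Fraction of sequences in one SELEX round predicted to be winners."""
    if not predicted_labels:
        raise ValueError("no predictions given")
    return sum(1 for p in predicted_labels if p == winner) / len(predicted_labels)
```

With the illustrative labels `true = [2, 2, 2, 2, 1, 1]` and `pred = [2, 2, 2, 1, 1, 2]`, three of four positives and one of two negatives are recovered, giving a sensitivity of 0.75 and a specificity of 0.5.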

Figure 1: Evolutionary curve of aptamers for SMMC-7721 liver carcinoma cells.

Table S2 shows that the reverse-complement sequences Nos. 53, 59, 60, 224, 225, and 226 are predicted to have class label 2, that is, to belong to the class of sequences with high binding affinity. Their experimental effective equilibrium dissociation constants (Kd) are 7.28, 16.44, 14.28, 1.47, 29.79, and 5.34 nM, respectively. The theoretical predictions are consistent with the experimental results.

To further evaluate the SVC model, we designed, synthesized and tested eight candidate DNA aptamer sequences with the HCC cell line SMMC-7721 as target cells. The experimental conditions can be found in Reference [11]. Figure 2 shows the experimental binding curves used to determine Kd for the newly designed sequences, measured with a Becton Dickinson FACSCalibur flow cytometer. The Kd values were obtained from the single binding site scheme [24], B=(Bmax×C)/(Kd+C), where C is the concentration of the candidate aptamer sequence, B is the aptamer adsorption (i.e. the relative Geo Mean value), and Bmax is the maximal adsorption of the candidate aptamer sequence at high aptamer concentrations. The experimental Kd values of the eight newly designed sequences are listed in Table 1. Table 1 shows that these new sequences belong to the class of sequences with high binding affinity and have experimental dissociation constants (Kd) in the nanomolar range. The results show that the eight new sequences can recognize the target SMMC-7721 cells with high binding affinity and may serve as effective tools for the early diagnosis of hepatocellular carcinoma. The feasibility of applying an SAR model based on reverse-complement sequences to aptamer design has thus been demonstrated.
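To make the single binding site scheme concrete, the sketch below estimates Bmax and Kd from concentration and signal pairs. It uses a Hanes-Woolf linearization, C/B = C/Bmax + Kd/Bmax, fitted by ordinary least squares. This is a swapped-in, dependency-free technique chosen for illustration; the fitting procedure actually used by the authors is not stated in the text, and the function names and synthetic data are hypothetical.

```python
def binding_signal(conc, bmax, kd):
    """Single-site binding model: B = Bmax * C / (Kd + C)."""
    return bmax * conc / (kd + conc)


def fit_single_site(concs, signals):
    """Estimate (Bmax, Kd) via the Hanes-Woolf form C/B = C/Bmax + Kd/Bmax.

    concs and signals must be positive and of equal length (at least two points).
    """
    xs = list(concs)
    ys = [c / b for c, b in zip(concs, signals)]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx                    # equals 1 / Bmax
    intercept = mean_y - slope * mean_x  # equals Kd / Bmax
    return 1.0 / slope, intercept / slope
```

On noise-free synthetic data generated with `binding_signal`, the fit recovers the parameters used to generate it. With real flow cytometry data, a nonlinear least-squares fit of the original equation is usually preferable, because the linearization distorts the error structure.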

Figure 2: The experimental Kd curves of new designed sequences.

Table 1: Eight newly designed sequences and experimental Kd values.

No.    Sequence (5′→3′)                            Kd (nM)
Seq1   TGGCGCATTGACGTCAGGTTGAGCTGAAGATCGTAACGT     382.4
Seq2   TGGCGCATTGACGTCAGGTTGAGCTGAAGATCCTACGGT     23.7
Seq3   TGGCGCATTGACGTCAGGTTGAGCTGAAGATCGTACCCT     694.7
Seq4   AGTAGTCGAAGACTGATGGTTGAGCTGATGATCCTACGGT    2.9
Seq5   AATAGACGAAGTCTGATGGTTGAGCTGATGATCCTACGGT    2.1
Seq6   AATAGACGAAGTCTGATGGTTGAGCTGATGATCCTACGGT    1.7
Seq7   AGGTTTCTACCTGGTTGAGCTGAAGATCGTACCGT         1.6
Seq8   ACGTTTCTACGTGGTTGAGCTGAAGATCGTACCGT         1.7

Conclusions

By means of the SVC technique, a pattern recognition model was constructed to classify candidate aptamer sequences into two classes, with high or low affinity for SMMC-7721 liver carcinoma cells. The five descriptors E, Mor26e, Mor32p, A%, and G%, obtained from reverse-complement sequences, were used to describe the structural features of each aptamer and to reflect the properties of aptamer-target interactions. According to the predictions of the SVC model, eight new DNA aptamer sequences with the HCC cell line SMMC-7721 as target cells were designed, synthesized and tested. The experimental Kd values of the eight newly designed sequences are in the nanomolar range, showing that they can bind the target SMMC-7721 cells with high affinity. These results encourage the further application of pattern recognition models based on reverse-complement sequences to the design of other aptamers.

Acknowledgment

This work was supported by the National Natural Science Foundation of China (21190041, 21190044), Natural Science Foundation of Hunan Province (Grant No. 2015JJ2042), and Scientific Research Fund of Hunan Provincial Education Department (16A047).

Conflict of interest: The authors declare that they have no conflicts of interest regarding this work.

References

1. Famulok M, Mayer G, Blind M. Nucleic acid aptamers-from selection in vitro to applications in vivo. Acc Chem Res 2000;33:591–9. doi:10.1021/ar960167q

2. Gold L, Brody E, Heilig J, Singer B. One, two, infinity: genomes filled with aptamers. Chem Biol 2002;9:1259–64. doi:10.1016/S1074-5521(02)00286-7

3. Rimmele M. Nucleic acid aptamers as tools and drugs: recent developments. ChemBioChem 2003;4:963–71. doi:10.1002/cbic.200300648

4. Rusconi CP, Scardino E, Layzer J, Pitoc GA, Ortel TL, Monroe D, et al. RNA aptamers as reversible antagonists of coagulation factor IXa. Nature 2002;419:90–4. doi:10.1038/nature00963

5. Jayasena SD. Aptamers: an emerging class of molecules that rival antibodies in diagnostics. Clin Chem 1999;45:1628–50. doi:10.1093/clinchem/45.9.1628

6. Ellington AD, Szostak JW. In vitro selection of RNA molecules that bind specific ligands. Nature 1990;346:818–22. doi:10.1038/346818a0

7. Tuerk C, Gold L. Systematic evolution of ligands by exponential enrichment: RNA ligands to bacteriophage T4 DNA polymerase. Science 1990;249:505–10. doi:10.1126/science.2200121

8. Fang X, Tan W. Aptamers generated from cell-SELEX for molecular medicine: a chemical biology approach. Acc Chem Res 2010;43:48–57. doi:10.1021/ar900101s

9. Tan WH, Donovan MJ, Jiang JH. Aptamers from cell-based selection for bioanalytical applications. Chem Rev 2013;113:2842–62. doi:10.1021/cr300468w

10. Ninomiya K, Kaneda K, Kawashima S, Miyachi Y, Ogino C, Shimizu N. Cell-SELEX based selection and characterization of DNA aptamer recognizing human hepatocarcinoma. Bioorg Med Chem Lett 2013;23:1797–802. doi:10.1016/j.bmcl.2013.01.040

11. Guo QP, Liu XD, Tan YY, Wang KM, Yang XH. Selection of aptamers for human hepatocellular carcinoma with high specificity. Chin Sci Bull 2013;58:2745–50. doi:10.1360/972013-360

12. Karelson M, Lobanov VS, Katritzky AR. Quantum-chemical descriptors in QSAR/QSPR studies. Chem Rev 1996;96:1027–43. doi:10.1021/cr950202r

13. Li BQ, Zhang YC, Huang GH, Cui WR, Zhang N, Cai YD. Prediction of aptamer-target interacting pairs with pseudo-amino acid composition. PLoS One 2014;9:e86729. doi:10.1371/journal.pone.0086729

14. Musafia B, Oren-Banaroya R, Noiman S. Designing anti-influenza aptamers: novel quantitative structure activity relationship approach gives insights into aptamer–virus interaction. PLoS One 2014;9:e97696. doi:10.1371/journal.pone.0097696

15. Yu XL, Yu RQ, Tang LJ, Guo QP, Zhang Y, Zhou Y, et al. Recognition of candidate aptamer sequences for human hepatocellular carcinoma in SELEX screening using structure–activity relationships. Chemo Intel Lab Syst 2014;13:10–4. doi:10.1016/j.chemolab.2014.05.002

16. Schütze T, Wilhelm B, Greiner N, Braun H, Peter F, Mörl M, et al. Probing the SELEX process with next-generation sequencing. PLoS One 2011;6:e29604. doi:10.1371/journal.pone.0029604

17. Djordjevic M. SELEX experiments: new prospects, applications and data analysis in inferring regulatory pathways. Biomol Eng 2007;24:179–89. doi:10.1016/j.bioeng.2007.03.001

18. Djordjevic M, Sengupta AM. Quantitative modeling and data analysis of SELEX experiments. Phys Biol 2006;3:13–28. doi:10.1088/1478-3975/3/1/002

19. Reuter JS, Mathews DH. RNAstructure: software for RNA secondary structure prediction and analysis. BMC Bioinformatics 2010;11:129. doi:10.1186/1471-2105-11-129

20. Cambridge Soft Inc. ChemBioOffice Ultra Version 11.0, Cambridge, USA, 2008.

21. Todeschini R, Consonni V, Mauri A, Pavan M. DRAGON for Windows (Software for the Calculation of Molecular Descriptors), Version 5.4, Talete srl, Milan, Italy, 2006.

22. Ang K, Chong G, Li Y. PID control system analysis, design, and technology. IEEE Trans Control Syst Technol 2005;13:559–76. doi:10.1109/TCST.2005.847331

23. Chang CC, Lin CJ. LIBSVM: A library for support vector machines. ACM Trans Intell Syst Technol 2011;2:27. doi:10.1145/1961189.1961199

24. Sefah K, Tang ZW, Shangguan DH, Chen H, Lopez-Colon D, Li Y, et al. Molecular recognition of acute myeloid leukemia using aptamers. Leukemia 2009;23:235–44. doi:10.1038/leu.2008.335


Supplemental Material

The online version of this article (DOI: https://doi.org/10.1515/tjb-2016-0166) offers supplementary material, available to authorized users.


Received: 2016-09-24
Accepted: 2016-12-26
Published Online: 2017-03-01
Published in Print: 2017-06-01

©2017 Walter de Gruyter GmbH, Berlin/Boston
