Using AlphaFold to predict the impact of single mutations on protein stability and function
Summary
AlphaFold changed the field of structural biology by achieving three-dimensional (3D) structure prediction from protein sequence at experimental quality. The astounding success even led to claims that the protein folding problem is “solved”. However, the protein folding problem is more than just structure prediction from sequence. Presently, it is unknown whether the AlphaFold-triggered revolution could help to solve other problems related to protein folding. Here we assay the ability of AlphaFold to predict the impact of single mutations on protein stability (ΔΔG) and function. To study the question we extracted the pLDDT and <pLDDT> metrics from AlphaFold predictions before and after a single mutation in a protein and correlated the predicted change with the experimentally known ΔΔG values. Additionally, we correlated the same AlphaFold pLDDT metrics with the impact of a single mutation on structure using a large-scale dataset of single mutations in GFP with experimentally assayed levels of fluorescence. We found a very weak or no correlation between AlphaFold output metrics and the change of protein stability or fluorescence. Our results imply that AlphaFold may not be immediately applied to other problems or applications in protein folding.
Citation: Pak MA, Markhieva KA, Novikova MS, Petrov DS, Vorobyev IS, Maksimova ES, et al. (2023) Using AlphaFold to predict the impact of single mutations on protein stability and function. PLoS ONE 18(3): e0282689. https://doi.org/10.1371/journal.pone.0282689
Editor: Nediljko Budisa, Berlin Institute of Technology, GERMANY
Received: December 20, 2021; Accepted: February 21, 2023; Published: March 16, 2023
Copyright: © 2023 Pak et al. This is an open access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Data Availability: All relevant data are within the manuscript and its Supporting information files.
Funding: The authors thank Zimin Foundation and Petrovax for support of the presented study at the School of Molecular and Theoretical Biology 2021. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.
Competing interests: The authors have declared that no competing interests exist.
Introduction
AlphaFold is widely claimed to have revolutionized protein 3D structure prediction from protein sequence, a long-standing, 50-year-old challenge of protein physics and structural bioinformatics [1]. The fourteenth round of CASP, a blind competition on protein 3D structure prediction [2], demonstrated that AlphaFold, a newcomer to the field, significantly outperforms all other methods. Crucially, AlphaFold models showed an accuracy of their predicted structures that was comparable to structures solved by experimental methods, like X-ray crystallography, NMR, and Cryo-EM [3].
‘It will change everything’, said Andrei Lupas in an interview to Nature [3]. One of the major changes may be that AlphaFold could solve other problems related to protein folding. These problems include the prediction of various protein interactions, such as protein-protein, protein-ligand and protein-DNA/RNA, and the prediction of the impact of mutations on protein stability. AlphaFold proved to be useful for experimental determination of protein structures with molecular replacement phasing [4, 5] and already facilitated elucidation of SARS-CoV-2 protein structures [6, 7]. Next, AlphaFold in collaboration with EMBL-EBI constructed the structure models for the whole protein sequence space [8]. The database of freely available structures of all proteins is expected to “revolutionize the life sciences” [3]. A pool of high-quality predicted structures is a plus for 3D-based prediction of mutation impact on protein stability, since 3D-based prediction is more accurate than 1D-based prediction [9–11]. Furthermore, AlphaFold is expected to bring new insights into our understanding of the structural organization of proteins and to boost the development of new drugs and vaccines [12]. Researchers in the field are already actively testing AlphaFold performance in various bioinformatics tasks, for instance, in peptide-protein docking [13, 14].
Guided by the expected immediate impact of AlphaFold on the solution of a wide range of problems in structural bioinformatics, we explored the capacity of AlphaFold predictions to serve as a proxy for the impact of mutations on protein stability change (ΔΔG). Although AlphaFold provides a disclaimer that it “has not been validated for predicting the effect of mutations” (https://alphafold.ebi.ac.uk/faq), the expectations of AlphaFold are so high that we judged it prudent to check how well AlphaFold predictions could work for estimation of ΔΔG values. Furthermore, given that the pLDDT score reflects confidence in the location of the residue in the structure, it may be expected that this measure correlates with ΔΔG or protein function. We found that the difference between pLDDT scores, the only local AlphaFold prediction metric reported in the output PDB file, had a very weak correlation with experimentally determined ΔΔG values (Pearson correlation coefficient, PCC = -0.17). The difference in the global AlphaFold metric, the pLDDT averaged over all residues, shows no correlation, both in isolation and in combination with the mutated residue’s pLDDT score. Similarly, the same AlphaFold metrics had a very weak correlation with the impact of single mutations on the function (fluorescence) of GFP. Recent results [15] show that the use of AlphaFold models instead of template structures does not improve ΔΔG prediction. Taken together, so far we did not find a use for AlphaFold to predict the impact of a mutation on protein stability. The availability of AlphaFold models allows applying more accurate 3D protein structure-based ΔΔG predictors rather than sequence-based ΔΔG predictors; the bottleneck still seems to be the accuracy of current 3D protein structure-based ΔΔG predictors.
Materials and methods
Dataset of experimental mutations
The data on experimentally measured effects of mutations on protein stability were taken from ThermoMutDB [16] (version 1.3). From 13,337 mutations in the database we extracted single-point mutations with data on ΔΔG measured under experimental conditions of pH between 3 and 9 and temperature between 293 and 300 Kelvin. We also restricted the protein length to less than 250 amino acids. Since stabilizing mutations conventionally have negative ΔΔG, while in ThermoMutDB they are positive, all ΔΔG values from ThermoMutDB were multiplied by −1.
The filtered dataset resulted in 1779 mutations in 80 proteins. We performed the analysis for 1154 randomly chosen mutations in 73 proteins. The final dataset and computed metrics are given in S1 Table.
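The selection criteria above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `MutationRecord` class and all of its field names are hypothetical and do not mirror the ThermoMutDB schema, and the pH and temperature bounds are treated as inclusive, which the text does not specify.

```python
from dataclasses import dataclass, replace
from typing import List


@dataclass(frozen=True)
class MutationRecord:
    # Hypothetical record layout, introduced only for this sketch.
    protein_id: str
    n_substitutions: int   # 1 for a single-point mutation
    ddg: float             # database sign convention (stabilizing is positive)
    ph: float
    temperature_k: float
    protein_length: int


def passes_filters(rec: MutationRecord) -> bool:
    """Apply the selection criteria from the text (inclusive bounds assumed)."""
    return (
        rec.n_substitutions == 1
        and 3.0 <= rec.ph <= 9.0
        and 293.0 <= rec.temperature_k <= 300.0
        and rec.protein_length < 250
    )


def filter_and_flip(records: List[MutationRecord]) -> List[MutationRecord]:
    """Keep records that pass the filters and multiply their ddG by -1,
    so that stabilizing mutations end up with negative values."""
    return [replace(rec, ddg=-rec.ddg) for rec in records if passes_filters(rec)]
```

Flipping the sign in the same pass as the filtering keeps every downstream correlation on a single convention.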
Dataset of GFP mutants fluorescence
We took data on fluorescence levels of GFP mutants from [17]. From the original dataset we randomly extracted 796 single mutants for our analysis. The list of the chosen mutations is given in S2 Table.
Protein structure modeling with AlphaFold
The wild type protein structures were retrieved from the AlphaFold Protein Structure Database (AlphaFold DB) [8] by their UniProt accession code. The structures of original proteins that were absent from the AlphaFold DB, as well as structures of mutant proteins, were modeled by the standalone version of AlphaFold [1] using the fasta file with the UniProt sequence of a protein as the only input in the ‘--fasta_paths’ flag.
Prediction metrics
The per-residue local distance difference test (pLDDT) confidence scores for the protein structure models downloaded from the AlphaFold DB were retrieved from the B-factor field of the coordinate section of the pdb file. The pLDDT confidence scores for the protein structure models that we predicted by standalone AlphaFold were extracted from the “plddt” array of the pickle file. By default, AlphaFold produces five models. The differences in pLDDT and <pLDDT> were statistically significant within the group of five produced models (both for wildtype and mutant); we used in our analysis only the best one, i.e., the one having the highest value of <pLDDT>.
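As an illustration of the extraction step, the sketch below reads pLDDT values from the B-factor column of a PDB-format model and picks the model with the highest <pLDDT>. It is not the authors' script. The function names are hypothetical, and taking one value per residue from the Cα atom is an assumption (AlphaFold writes the same pLDDT to every atom of a residue). Column positions follow the fixed-width PDB ATOM record layout.

```python
from typing import Dict


def plddt_from_pdb(pdb_text: str) -> Dict[int, float]:
    """Map residue number -> pLDDT, read from the B-factor field of CA atoms."""
    scores: Dict[int, float] = {}
    for line in pdb_text.splitlines():
        # Atom name occupies columns 13-16, residue number 23-26,
        # B-factor 61-66 (1-based, fixed width).
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores[int(line[22:26])] = float(line[60:66])
    return scores


def mean_plddt(scores: Dict[int, float]) -> float:
    """Global confidence <pLDDT>: the per-residue pLDDT averaged over the chain."""
    return sum(scores.values()) / len(scores)


def best_model(models: Dict[str, Dict[int, float]]) -> str:
    """Return the name of the model with the highest <pLDDT>."""
    return max(models, key=lambda name: mean_plddt(models[name]))
```

For models produced by standalone AlphaFold, the same per-residue values would instead come from the “plddt” array in the pickle file, as described above.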
Sequence identity of proteins within the dataset of mutations
To determine the sequence identities between the proteins in the dataset of mutations we performed a protein BLAST [18] search of the protein sequences against themselves.
We divided the dataset into training and test sets for the linear regression model based on an arbitrary sequence identity threshold of 50%. Mutations in proteins above the threshold comprised the training set, and the rest of the mutations were used as the test set. The training and test sets resulted in 423 mutations in 50 proteins and 731 mutations in 23 proteins, respectively.
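A minimal sketch of this split follows, assuming the all-against-all BLAST results have already been reduced to a dictionary of pairwise percent identities. The function name, the input layout, and the treatment of an identity of exactly 50% as "above the threshold" are assumptions made for illustration.

```python
from typing import Dict, Iterable, Set, Tuple


def split_proteins(
    proteins: Iterable[str],
    pairwise_identity: Dict[Tuple[str, str], float],
    threshold: float = 50.0,
) -> Tuple[Set[str], Set[str]]:
    """Return (training, test) protein sets.

    A protein goes to the test set if its best identity to any *other*
    protein in the dataset is below the threshold; otherwise to training.
    """
    proteins = set(proteins)
    best = {p: 0.0 for p in proteins}
    for (a, b), identity in pairwise_identity.items():
        if a == b:
            continue  # self-hits carry no information about redundancy
        if a in best:
            best[a] = max(best[a], identity)
        if b in best:
            best[b] = max(best[b], identity)
    training = {p for p in proteins if best[p] >= threshold}
    return training, proteins - training
```

Mutations then inherit the set of their parent protein, so that no test-set protein has a close homolog in the training set.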
Linear regression analysis
A multiple linear regression fit with two parameters was performed using the linear_model module of the scikit-learn (sklearn) library with default parameters.
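For readers who want to see what such a fit computes, here is a dependency-free sketch of ordinary least squares with an intercept and two predictors, solved through the centered normal equations. It is a stand-in for illustration, not the scikit-learn call used in the study; the function names are hypothetical.

```python
from typing import Sequence, Tuple


def fit_two_predictor_ols(
    x1: Sequence[float], x2: Sequence[float], y: Sequence[float]
) -> Tuple[float, float, float]:
    """Return (intercept, b1, b2) minimizing the sum of squared residuals
    of y ~ intercept + b1 * x1 + b2 * x2."""
    n = len(y)
    if not (len(x1) == len(x2) == n) or n < 3:
        raise ValueError("need three or more observations of equal length")
    m1, m2, my = sum(x1) / n, sum(x2) / n, sum(y) / n
    # Centered sums of squares and cross-products.
    s11 = sum((a - m1) ** 2 for a in x1)
    s22 = sum((b - m2) ** 2 for b in x2)
    s12 = sum((a - m1) * (b - m2) for a, b in zip(x1, x2))
    s1y = sum((a - m1) * (c - my) for a, c in zip(x1, y))
    s2y = sum((b - m2) * (c - my) for b, c in zip(x2, y))
    det = s11 * s22 - s12 ** 2
    if det == 0:
        raise ValueError("predictors are collinear")
    # Solve the 2x2 system [[s11, s12], [s12, s22]] @ [b1, b2] = [s1y, s2y].
    b1 = (s1y * s22 - s2y * s12) / det
    b2 = (s2y * s11 - s1y * s12) / det
    return my - b1 * m1 - b2 * m2, b1, b2


def predict(coefs: Tuple[float, float, float], x1: float, x2: float) -> float:
    """Evaluate the fitted linear model for one observation."""
    b0, b1, b2 = coefs
    return b0 + b1 * x1 + b2 * x2
```

In this setting `x1` would hold ΔpLDDT, `x2` would hold Δ<pLDDT>, and `y` the experimental ΔΔG values of the training set.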
Properties of mutated amino acid residues
Mutated amino acids were annotated by relative solvent accessibility, effect of mutation on stability, hydrophobicity, polarity, and side chain size.
Information on solvent accessibility was taken from Stride [19]. The relative solvent accessibility (RSA) of an amino acid residue was calculated according to the equation:
$$\mathrm{RSA} = \frac{\mathrm{ASA}}{\mathrm{maxASA}} \qquad (1)$$
where ASA is the solvent accessible surface area and maxASA is the maximum possible solvent accessible surface area of an amino acid [20]. Following [21] we used the solvent accessibility threshold of 25% to classify residues as exposed or buried.
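The classification can be illustrated with a short sketch. The function names are hypothetical, the maxASA value must be supplied by the caller from the tabulated values in [20], and assigning a residue at exactly 25% to the exposed class is an assumption.

```python
def relative_solvent_accessibility(asa: float, max_asa: float) -> float:
    """RSA as the fraction ASA / maxASA (0.25 corresponds to 25%)."""
    if max_asa <= 0:
        raise ValueError("max_asa must be positive")
    return asa / max_asa


def classify_exposure(asa: float, max_asa: float, threshold: float = 0.25) -> str:
    """Label a residue 'exposed' or 'buried' using the 25% RSA threshold."""
    rsa = relative_solvent_accessibility(asa, max_asa)
    return "exposed" if rsa >= threshold else "buried"
```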
The rest of the properties were assigned according to http://www.imgt.org/IMGTeducation/Aide-memoire/_UK/aminoacids/IMGTclasses.html. The side chain sizes were annotated as very small (1), small (2), medium (3), large (4), and very large (5). We defined ‘no’, ‘small’, and ‘large’ change in side chain volume as an absolute difference in size class of 0, 1 or 2, and 3 or 4, respectively.
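The size-change rule maps directly onto a small function. In this sketch the size classes (1 to 5) of the wild-type and mutant residues are passed in by the caller; the function name is hypothetical, and the assignment of amino acids to classes is left to the IMGT table cited above.

```python
def size_change_category(wt_size: int, mut_size: int) -> str:
    """Classify the change in side chain size class upon mutation.

    Sizes are IMGT volume classes from 1 (very small) to 5 (very large).
    Returns 'no' for a difference of 0, 'small' for 1 or 2, 'large' for 3 or 4.
    """
    for size in (wt_size, mut_size):
        if size not in (1, 2, 3, 4, 5):
            raise ValueError("size class must be an integer from 1 to 5")
    diff = abs(mut_size - wt_size)
    if diff == 0:
        return "no"
    return "small" if diff <= 2 else "large"
```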
All correlations were adjusted for multiple hypothesis testing by the Benjamini-Hochberg correction [22].
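For reference, the Benjamini-Hochberg procedure can be written as a step-up adjustment of the sorted p-values. The sketch below is a generic implementation of the published procedure, not the code used in the study; the function name is hypothetical.

```python
from typing import List, Sequence


def benjamini_hochberg(p_values: Sequence[float]) -> List[float]:
    """Return BH-adjusted p-values in the same order as the input."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

A correlation is then called significant at a chosen false discovery rate if its adjusted p-value falls below that rate.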
Results
Data set of mutations
We used experimental data on protein stability changes upon single-point variations from the ThermoMutDB database [16]. After the filtering procedure (see Materials and methods) we performed the analysis for 1154 mutations in 73 proteins. For the multiple linear regression analysis, the dataset was split into two sets, a training and a test set. The split was based on BLAST [18] results, such that mutations were assigned to the test set if the corresponding proteins had <50% sequence identity to any other protein in the entire dataset (see Materials and methods). All the other mutations were assigned to the training set.
AlphaFold prediction metrics
Along with the coordinates of all heavy atoms of a protein, an AlphaFold model contains “its confidence in form of a predicted lDDT-Cα score (pLDDT) per residue” [1]. LDDT ranges from 0 to 100 and is a superposition-free metric indicating to what extent the protein model reproduces the reference structure [23]. The pLDDT scores averaged across all residues designate the overall confidence for the whole protein chain (<pLDDT>). The distributions of AlphaFold prediction metrics for wildtype and mutant structures differ statistically significantly from each other, both for pLDDT (p-value = 7 ⋅ 10⁻¹⁰) and <pLDDT> (p-value = 3 ⋅ 10⁻³). For each mutation in the dataset, we calculated the difference in pLDDT between the wild type and mutated structures at the mutated position, as well as the difference in <pLDDT> between the wild type and mutant protein structure models. By checking ΔpLDDT and Δ<pLDDT> values as potential proxies for the change of protein stability we explored the hypothesis that the change of protein stability due to mutation is somehow reflected in the difference of AlphaFold confidence between wild type and mutant structures.
Correlation between ΔΔG and ΔpLDDT values
First, we studied the relationship between the effect of mutation on protein structure stability and the difference in the accuracy of protein structure prediction by AlphaFold for the wild-type and mutant proteins. We did not observe a pronounced correlation between the mutation effect and the difference in confidence metrics (Fig 1). The correlation coefficient is -0.17 ± 0.03 (p-value = 10⁻⁸) for ΔpLDDT and 0.02 ± 0.03 (p-value = 0.44) for the Δ<pLDDT>.
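The quantities in this paragraph can be sketched as follows. Two points are assumptions: the sign convention ΔpLDDT = mutant minus wild type (chosen so that a drop in confidence gives a negative value), and the large-sample standard error sqrt((1 − r²)/(n − 2)) for the Pearson coefficient. The text does not state how the ± values were obtained, although this formula gives 0.03 for r = -0.17 and n = 1154. All function names are hypothetical.

```python
import math
from typing import Dict, Sequence


def delta_plddt(wt: Dict[int, float], mut: Dict[int, float], position: int) -> float:
    """Local metric: pLDDT change at the mutated position (mutant - wild type)."""
    return mut[position] - wt[position]


def delta_mean_plddt(wt: Dict[int, float], mut: Dict[int, float]) -> float:
    """Global metric: change in <pLDDT> over the whole chain (mutant - wild type)."""
    return sum(mut.values()) / len(mut) - sum(wt.values()) / len(wt)


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient of two equally long samples."""
    n = len(x)
    if n != len(y) or n < 3:
        raise ValueError("need three or more paired observations")
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pearson_standard_error(r: float, n: int) -> float:
    """Approximate standard error of r for a sample of n pairs (assumed formula)."""
    return math.sqrt((1.0 - r * r) / (n - 2))
```

With this error estimate, a coefficient of 0.02 ± 0.03 is indistinguishable from zero, while -0.17 ± 0.03 is clearly non-zero yet still explains only about 3% of the variance in ΔΔG (r² ≈ 0.03).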
Fig 1. Correlation between ΔΔG and ΔpLDDT.
Correlation between the effect of mutation on protein stability, ΔΔG, and the change of the confidence score of structure prediction, ΔpLDDT. A: The correlation for the mutated amino acid. B: The correlation for the whole structure.
Since the confidence metrics for a given amino acid and the whole protein are weakly correlated (PCC = 0.21 ± 0.03, p-value = 10⁻¹²), we then explored how their combination correlates with the effect of mutation. The multiple linear regression model resulted in the dependence ΔΔG = -0.99 − 0.13 ⋅ ΔpLDDT + 0.03 ⋅ Δ<pLDDT>. We did not obtain any pronounced correlation for either the training set (0.12 ± 0.05, p-value = 0.01) or the test set (0.20 ± 0.04, p-value = 3 ⋅ 10⁻⁸).
Relationship of ΔpLDDT and amino acid properties
We explored the outliers with high absolute values of ΔpLDDT in Fig 1A. As expected, the destabilizing effect of mutations was associated with decreasing pLDDTs: 87% of destabilizing mutations had negative ΔpLDDT (p-value = 10⁻²²). However, there was no correlation between ΔpLDDT and ΔΔG for that 87% of mutations with negative ΔpLDDT.
We explored the correlation between AlphaFold prediction metrics and ΔΔG for different categories of mutations (see Methods, Properties of mutated amino acid residues). The correlation remained poor (|PCC| being less than 0.19 and 0.07 for ΔpLDDT and Δ<pLDDT>, respectively) for mutations stratified by their effect on polarity, hydrophobicity, charge upon mutation and relative solvent accessibility of the mutated residue (S3 Table).
The difference in pLDDT score distributions was significant for positions with different secondary structures of the mutated residue (Kruskal-Wallis p-value = 4 ⋅ 10⁻¹⁰) and for mutations changing the side chain size (Kruskal-Wallis p-value = 0.04). However, the correlation between ΔpLDDT or Δ<pLDDT> and ΔΔG for different types of mutations within these categories was not strong (|PCC| < 0.38, except for 29 mutations with a large increase in size, which showed a correlation of -0.67 for ΔpLDDT, see S3 Table).
Correlation between GFP fluorescence and ΔpLDDT values
Protein stability is intimately coupled with protein functionality. Thus, a reasonable hypothesis holds that the loss of protein functionality due to mutations in most cases results from reduced stability [24]. Therefore, along with testing the correlation of AlphaFold metrics with ΔΔG, it is reasonable to test the correlation of AlphaFold metrics with protein function. Furthermore, the change of pLDDT scores may contribute directly to protein functionality without contributing to protein stability. We checked the correlation between ΔpLDDT values and the fluorescence level of 796 randomly chosen single GFP mutants from [17]. The correlation coefficient is 0.17 ± 0.03 (p-value = 3 ⋅ 10⁻⁶) for ΔpLDDT and 0.16 ± 0.04 (p-value = 10⁻⁵) for the Δ<pLDDT> (Fig 2).
Discussion
The extraordinary success of AlphaFold in predicting protein 3D structure from protein sequence may lead to the temptation to apply this tool to other questions in structural bioinformatics. Here we checked the potential of AlphaFold metrics to serve as a predictor for the impact of mutation on protein stability and function. We found a weak correlation of -0.17 ± 0.03 between ΔpLDDT and ΔΔG associated with specific mutations. Although the correlation was statistically significant (p-value < 10⁻⁸), it is so weak that it cannot be used for accurate ΔΔG predictions (Fig 1), and it is unclear how such predictions can be used in practical applications. Obviously, ΔpLDDT would show a better correlation with ΔΔG if it was measured across bins of averaged ΔΔG. Alternatively, ΔpLDDT could be a separate term in a multiple linear regression model. The averaged metric Δ<pLDDT> shows a correlation with ΔΔG that is statistically indistinguishable from zero. However, a linear combination of the two metrics, ΔpLDDT and Δ<pLDDT>, does not drastically improve the correlation. As for the loss-of-function prediction, the correlation with the impact of mutation on GFP fluorescence showed similar results: PCC was 0.17 ± 0.03 and 0.16 ± 0.04 for ΔpLDDT and Δ<pLDDT>, respectively (Fig 2).
Taken together, our data indicate that AlphaFold predictions cannot be used directly to reliably estimate the impact of mutation on protein stability or function. But why should we have expected such a correlation in the first place? Indeed, AlphaFold was not designed to predict the change of protein stability or function due to mutation. In the words of the authors, “AlphaFold is not expected to produce an unfolded protein structure given a sequence containing a destabilising point mutation” (https://alphafold.ebi.ac.uk/faq). However, the only reason for a protein to fold into the distinct native structure is the stability of this structure, so the protein 3D structure and its stability are closely connected. Logically, an algorithm predicting protein 3D structure from sequence should search for the most stable 3D state under the native (or standard) conditions. If a compact structure becomes unstable (for example, due to mutation) then we might expect that the algorithm shifts its predictions toward an unfolded state. Evidence in favor of this point of view is the successful prediction of natively disordered protein regions by AlphaFold and the correlation between the decrease of pLDDT and the propensity to be in a disordered region [25]. Thus, it is not unreasonable to expect a decrease in the confidence score of the mutated residue or the whole native structure.
Indeed, it was reported many times that 3D-based predictors perform better than 1D-based ones [9–11], so the availability of a pool of high-quality 3D predicted structures could be a plus.
Our results show that AlphaFold repurposing for ΔΔG prediction did not work for the proteins we studied. AlphaFold 3D models can be used to predict the impact of a mutation on protein stability or function by 3D-structure-based ΔΔG predictors. However, the performance of the resulting predictions is going to be far from perfect: the 3D-structure-based ΔΔG predictors show modest performance even using 3D structures from PDB [26], with a correlation of 0.59 or less in independent tests [27]. Using AlphaFold models instead of PDB structures does not make ΔΔG predictions more accurate [15], so ΔΔG predictions based on AlphaFold models are expected to show a correlation of approximately 0.59, which may be too low for many applications.
The deep learning approach demonstrated by AlphaFold may be an inspiring example for developing a deep learning ΔΔG predictor. However, we see a dramatic difference between the situations with 3D structure prediction and ΔΔG prediction that may impede this development. The difference is in the amount of available data. For protein structure prediction AlphaFold used PDB with ∼150,000 files, and each file contained a wealth of information. In contrast to PDB, the number of experimentally measured ΔΔG values is of the order of 10,000, and these are just numbers without accompanying extra data. To make a rough comparison of information in bits, PDB structures occupy 100 Gb, while all the known experimental ΔΔG values occupy about 10 kb. Neural networks are very sensitive to the amount of information in the training set, so the ability of deep learning to tackle the ΔΔG prediction task at present looks hindered mostly by the lack of experimental data.
Overall, we explored the capacity of direct prediction of ΔΔG by all AlphaFold metrics reported in the standard default mode: (i) the difference in the pLDDT score before and after mutation at the mutated position, (ii) the difference in the averaged pLDDT score across all positions before and after mutation. We found that the correlation was weak or absent, and, therefore, AlphaFold predictions are unlikely to be useful for ΔΔG predictions. Taken together with our recent result that AlphaFold models are not better for ΔΔG predictions than best templates [15], we see no straightforward way to use AlphaFold advances for solving the task of prediction of ΔΔG upon mutation. The task of ΔΔG prediction should be solved separately, and it will face the problem of a limited amount of data for training neural networks.
Acknowledgments
The authors acknowledge the use of the Zhores supercomputer [28] for obtaining the results presented in this paper.
References
1. Jumper J, Evans R, Pritzel A, Green T, Figurnov M, Ronneberger O, et al. Highly accurate protein structure prediction with AlphaFold. Nature 2021 596(7873):583–589. pmid:34265844
2. Kryshtafovych A, Schwede T, Topf M, Fidelis K, Moult J. Critical assessment of methods of protein structure prediction (CASP)—Round XIII. Proteins: Structure, Function, and Bioinformatics 2019 87(12):1011–1020. pmid:31589781
3. Callaway E. “It will change everything”: DeepMind’s AI makes gigantic leap in solving protein structures. Nature 2020 588(7837):203–204. pmid:33257889
4. Millán C, Keegan RM, Pereira J, Sammito MD, Simpkin AJ, McCoy AJ, et al. Assessing the utility of CASP14 models for molecular replacement. Proteins 2021. pmid:34387010
5. Hegedűs T, Geisler M, Lukács G, Farkas B. AlphaFold2 transmembrane protein structure prediction shines. bioRxiv 2021.
6. Gupta M, Azumaya CM, Moritz M, Pourmal S, Diallo A, Merz GE, et al. CryoEM and AI reveal a structure of SARS-CoV-2 Nsp2, a multifunctional protein involved in key host processes. bioRxiv 2021.
7. Flower TG, Hurley JH. Crystallographic molecular replacement using an in silico-generated search model of SARS-CoV-2 ORF8. Prot. Sci. 2021 30(4):728–734.
8. Tunyasuvunakool K, Adler J, Wu Z, Green T, Zielinski M, Žídek A, et al. Highly accurate protein structure prediction for the human proteome. Nature 2021 596(7873):590–596. pmid:34293799
9. Montanucci L, Capriotti E, Frank Y, Ben-Tal N, Fariselli P. DDGun: an untrained method for the prediction of protein stability changes upon single and multiple point variations. BMC Bioinformatics 2019 20:S14. pmid:31266447
10. Savojardo C, Fariselli P, Martelli PL, Casadio R. INPS-MD: a web server to predict stability of protein variants from sequence and structure. Bioinformatics 2016 32(16):2542–2544. pmid:27153629
11. Lv X, Chen J, Lu Y, Chen Z, Xiao N, Yang Y. Accurately predicting mutation-caused stability changes from protein sequences using extreme gradient boosting. J. Chem. Inf. Mod. 2020 60(4):2388–2395. pmid:32203653
12. Higgins MK. Can we AlphaFold our way out of the next pandemic? J. Mol. Biol. 2021 433(20):167093. pmid:34116123
13. Ko J, Lee J. Can AlphaFold2 predict protein-peptide complex structures accurately? bioRxiv 2021.
14. Tsaban T, Varga J, Avraham O, Ben-Aharon Z, Khramushin A, Schueler-Furman O. Harnessing protein folding neural networks for peptide-protein docking. Nat. Commun. 2022. pmid:35013344
15. Pak MA, Ivankov DN. Best templates outperform homology models in predicting the impact of mutations on protein stability. Bioinformatics 2022 38(18):4312–4320. pmid:35894930
16. Xavier JS, Nguyen T-B, Karmarkar M, Portelli S, Rezende PM, Velloso JPL, et al. ThermoMutDB: a thermodynamic database for missense mutations. Nucl. Acids Res. 2020 49(D1):D475–D479.
17. Sarkisyan KS, Bolotin DA, Meer MV, Usmanova DR, Mishin AS, Sharonov GV, et al. Local fitness landscape of the green fluorescent protein. Nature 2016 533(7603):397–401. pmid:27193686
18. Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J. Mol. Biol. 1990 215(3):403–410. pmid:2231712
19. Frishman D, Argos P. Knowledge-based protein secondary structure assignment. Proteins 1995 23(4):566–579. pmid:8749853
20. Miller S, Janin J, Lesk A, Chothia C. Interior and surface of monomeric proteins. J. Mol. Biol. 1987 196(3):641–656. pmid:3681970
21. Wu W, Wang Z, Cong P, Li T. Accurate prediction of protein relative solvent accessibility using a balanced model. BioData Min. 2017 24(10):1. pmid:28127402
22. Benjamini Y, Hochberg Y. Controlling the False Discovery Rate: A practical and powerful approach to multiple testing. J. R. Stat. Soc. Ser. B 1995 57:289–300.
23. Mariani V, Biasini M, Barbato A, Schwede T. lDDT: a local superposition-free score for comparing protein structures and models using distance difference tests. Bioinformatics 2013 29(21):2722–2728. pmid:23986568
24. Bershtein S, Segal M, Bekerman R, Tokuriki N, Tawfik DS. Robustness–epistasis link shapes the fitness landscape of a randomly drifting protein. Nature 2006 444(7121):929–932. pmid:17122770
25. Ruff KM, Pappu RV. AlphaFold and implications for intrinsically disordered proteins. J. Mol. Biol. 2021 433(20):167208. pmid:34418423
26. Berman HM. The Protein Data Bank. Nucl. Acids Res. 2000 28(1):235–242. pmid:10592235
27. Potapov V, Cohen M, Schreiber G. Assessing computational methods for predicting protein stability upon mutation: good on average but not in the details. Prot. Eng. Des. Sel. 2009 22(9):553–560.
28. Zacharov I, Arslanov R, Gunin M, Stefonishin D, Bykov A, Pavlov S, et al. “Zhores”—Petaflops supercomputer for data-driven modeling, machine learning and artificial intelligence installed in Skolkovo Institute of Science and Technology. Open Eng. 2019 9(1):512–520.