Frequently Asked Questions
- What kind of predictions can I make with SwissTargetPrediction?
- How does SwissTargetPrediction work?
- Is SwissTargetPrediction restricted to bioactive molecules?
- Can I use SwissTargetPrediction with other molecules (e.g., peptides, antibodies,...)?
- What about other targets not included in SwissTargetPrediction?
- What do the probability bars represent in the result page?
- How are the targets ranked?
- What does Target Class mean?
- How is the number of similar compounds in the result page chosen for each similarity measure?
- What do the similarity values mean and can they be compared between 2D and 3D similarity?
- What does "(by homology)" stand for in the list of predicted targets?
- Why only five organisms?
- Why are most predictions made "by homology" in non-human organisms?
- I know my molecule binds to a protein in another organism but I do not see its ortholog in my predictions?
- Where does the set of small molecule-protein interactions used to train SwissTargetPrediction come from?
- Is SwissTargetPrediction freely available for academic users?
- How should I cite SwissTargetPrediction?
- What do we mean by chemical similarity?
- What is 2D similarity?
- Is there a difference between 2D similarity and Fingerprint similarity?
- How does fingerprint-based similarity work?
- What is 3D similarity?
- What's special about ElectroShape?
- Why is it useful to combine different similarity measures?
What kind of predictions can I make with SwissTargetPrediction?
SwissTargetPrediction is an online tool to predict the targets of bioactive small molecules in human and other vertebrates. This is useful to understand the molecular mechanisms underlying a given phenotype or bioactivity, to rationalize possible side-effects or to predict off-targets of known molecules.
How does SwissTargetPrediction work?
In short, the main idea of SwissTargetPrediction is that two bioactive molecules that are similar are likely to share their targets. Therefore, for a query molecule, we identify the most similar molecules among a set of more than 300'000 known ligands. The predicted targets are those bound by the ligands displaying the highest similarity with the query molecule.
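The idea can be illustrated with a minimal sketch. It is not the SwissTargetPrediction implementation: molecules are reduced to hypothetical feature sets, similarity is a plain Tanimoto coefficient, and each target is scored by its single most similar known ligand. All names and data below are made up for illustration.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two feature sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def predict_targets(query, known_ligands, top_n=3):
    """Rank targets by the highest similarity of any of their ligands to the query."""
    best = {}
    for features, targets in known_ligands:
        s = tanimoto(query, features)
        for target in targets:
            best[target] = max(best.get(target, 0.0), s)
    return sorted(best.items(), key=lambda kv: kv[1], reverse=True)[:top_n]


# Toy data (hypothetical): each known ligand is (feature set, list of targets).
known_ligands = [
    (frozenset({"A", "B", "C"}), ["kinase_X"]),
    (frozenset({"A", "B", "D", "G"}), ["kinase_X", "protease_Y"]),
    (frozenset({"E", "F"}), ["gpcr_Z"]),
]
query = frozenset({"A", "B", "C", "D"})

ranking = predict_targets(query, known_ligands)
for target, score in ranking:
    print(target, round(score, 2))
```

Here kinase_X ranks first because one of its ligands shares three of the four query features; gpcr_Z ranks last because its only ligand shares none.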
Is SwissTargetPrediction restricted to bioactive molecules?
No, you can use any molecule as input and it will predict its target based on the known ligands most similar to your input molecule. However, as SwissTargetPrediction has been trained and validated on molecules with at least one target, you should expect much lower performance if you use random molecules that do not a priori bind to some target or display some bioactivity. In particular, SwissTargetPrediction currently does not include predictions on whether a molecule is bioactive or not. Yet, if you find a known ligand that is highly similar to your query, it is still an interesting indication of a possible interaction and potential bioactivity.
Can I use SwissTargetPrediction with other molecules (e.g., peptides, antibodies,...)?
In theory you can, but you should expect much lower accuracy and longer run-times. This is because SwissTargetPrediction only contains small molecules in its training set. Therefore you are not likely to find ligands similar to your input molecule if it is a large molecule. For short peptides you may still find small molecules that are similar (possibly interesting peptido-mimetics). Antibodies should be avoided altogether. Also, very small molecules (fewer than 8 heavy atoms) should be avoided because similarity measures for these molecules tend not to perform well.
What about other targets not included in SwissTargetPrediction?
Currently SwissTargetPrediction only includes proteins or protein complexes to which at least one molecule has been experimentally observed to bind, or homologs thereof. Therefore predictions are restricted to these targets. However, it should be noted that they already cover most of the main 'druggable targets'.
What do the probability bars represent in the result page?
These values correspond to the probability, estimated on our training dataset, that a protein bound by a ligand displaying a given similarity with the query molecule is a true target of that molecule. As target prediction accuracy is affected by the molecule size (see Gfeller et al.), probabilities were computed separately for each ligand size. It is important to note that these probabilities have been derived from a cross-validation (leave-one-out) analysis over all training data in human. Therefore, they represent an average probability over all ligands and may be affected by inherent biases, such as the fact that many molecules in ChEMBL are derived from the same chemical series and therefore have many similar ligands, which makes target prediction easier for them. For these reasons, it is always advisable to look manually at the ligands of each predicted target before planning follow-up experiments. Moreover, if you do not have any indication that the query molecule is bioactive, you should expect significantly lower probabilities.
How are the targets ranked?
Targets are ranked according to a score that combines both 2D and 3D similarity values with the ligands most similar to the query molecule. Therefore the top-ranking target is not necessarily the one bound by the most similar ligand according to any single similarity measure (see Gfeller et al. for details). Importantly, the ranking of the targets, rather than the absolute values of scores or probabilities, is the most meaningful parameter. A maximum of 15 targets can be displayed.
What does Target Class mean?
Proteins can be classified into different functional classes (enzyme, membrane receptor, ...). In SwissTargetPrediction, the classes have been taken from the ChEMBL target annotation, based on four hierarchical levels (L1 to L4). The 'Target Class' column in the result page shows the most specific target class (e.g., Ser_Thr kinase). The pie chart indicating General Target Class frequencies uses the L1 level, except for enzymes, which have been broken into the following L2 classes: 'Kinase', 'Phosphatase', 'Cytochrome P450', 'Protease' and 'enzyme other'. For predictions based on homology, the target classes have been mapped from the homologous proteins.
How is the number of similar compounds in the result page chosen for each similarity measure?
We use a threshold on each similarity value to decide which ligands are to be displayed for each predicted target. These thresholds are set at 0.75 for ES5D similarity and 0.45 for FP2 similarity. Below these thresholds, ligands most often do not display any meaningful similarity with the query molecule.
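As a small illustration, the filtering could look like the sketch below. The two threshold values come from the text above; the record layout (`name`, `es5d`, `fp2`), the function name and the choice of an inclusive comparison are assumptions made for this example.

```python
ES5D_THRESHOLD = 0.75  # 3D (ElectroShape 5D) similarity threshold, from the text
FP2_THRESHOLD = 0.45   # 2D (FP2 fingerprint) similarity threshold, from the text


def ligands_to_display(ligands):
    """Return the ligand names reaching each threshold (inclusive: an assumption)."""
    return {
        "3D": [l["name"] for l in ligands if l["es5d"] >= ES5D_THRESHOLD],
        "2D": [l["name"] for l in ligands if l["fp2"] >= FP2_THRESHOLD],
    }


# Hypothetical ligands of one predicted target with their similarity to the query.
hits = [
    {"name": "lig1", "es5d": 0.91, "fp2": 0.30},
    {"name": "lig2", "es5d": 0.60, "fp2": 0.52},
    {"name": "lig3", "es5d": 0.70, "fp2": 0.40},
]
shown = ligands_to_display(hits)
print(shown)
```

A ligand can therefore appear in the 3D list, the 2D list, both, or neither, which is why the number of displayed compounds differs between the two measures.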
What do the similarity values mean and can they be compared between 2D and 3D similarity?
For all ligands used to predict a target, we provide the similarity with the query molecule. These values are the exact values as given by the algorithm used to quantify the similarity. However, they can NOT be compared between different similarity measures, as the typical scale is completely different. As a rule of thumb, 3D similarity values (based on ElectroShape) larger than 0.85 often indicate some relevant similarity and 2D similarity values larger than 0.5 are often informative, but this may vary between ligands and targets. So it is always the ranking that is meaningful, and not the absolute similarity values.
What does "(by homology)" stand for in the list of predicted targets?
A query molecule may have very similar ligands that bind to a target in an organism other than the one chosen in the input form. This is a strong indication that the molecule may also bind to orthologs of this target. We therefore include such predictions and mark them as "(by homology)". Note that homology-based predictions are also used between paralogs within the same organism. Homology relationships have been derived from three databases: Ensembl Compara, Treefam and orthoDB.
Why only five organisms?
The five organisms present in SwissTargetPrediction have been selected based on the availability of enough data in the ChEMBL database. For other organisms, far fewer protein-small molecule interactions are available. Therefore including them would not be very useful for training the algorithm. However, you can still use the results of SwissTargetPrediction in other organisms by manually mapping the predicted targets based on homology. For instance, if a molecule binds to a given human protein, it will likely bind to the homologs in Chimpanzee or Gorilla.
Why are most predictions made "by homology" in non-human organisms?
Most small molecule-protein interactions in our dataset involve human targets. Thus, for any query molecule, you are much more likely to find a similar ligand that binds to a human target with an ortholog in your chosen organism. This results in many predictions being made "by homology".
I know my molecule binds to a protein in another organism but I do not see its ortholog in my predictions?
We used three orthology databases (Ensembl Compara, Treefam and orthoDB) to find orthology relationships. However, we cannot exclude that some orthology relationships will be missed, for instance because they are distant orthologs.
Where does the set of small molecule-protein interactions used to train SwissTargetPrediction come from?
We use data from the ChEMBL database (version 16), which is the largest publicly available database of interactions between small molecules and proteins. All interactions with activity (IC50, EC50, Kd, Ki,...) lower than 10 μM in all assays have been used to build our training set.
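One way to read this criterion is sketched below. It assumes that a ligand-target pair is kept only if every assay reported for that pair gives an activity below 10 μM; this interpretation, the tuple layout and the use of nM units are assumptions for the example and do not reflect the actual ChEMBL schema.

```python
from collections import defaultdict

CUTOFF_NM = 10_000.0  # 10 uM expressed in nM


def training_pairs(records):
    """Keep (ligand, target) pairs whose activity is below the cutoff in all assays.

    `records` is an iterable of (ligand, target, activity_in_nM) tuples,
    one per assay measurement (hypothetical layout).
    """
    by_pair = defaultdict(list)
    for ligand, target, value_nm in records:
        by_pair[(ligand, target)].append(value_nm)
    return {pair for pair, values in by_pair.items()
            if all(v < CUTOFF_NM for v in values)}


# Hypothetical assay measurements.
records = [
    ("lig1", "T1", 50.0),
    ("lig1", "T1", 200.0),
    ("lig2", "T1", 500.0),
    ("lig2", "T1", 25_000.0),  # one weak measurement excludes the pair
    ("lig3", "T2", 9_999.0),
]
pairs = training_pairs(records)
print(sorted(pairs))
```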
Is SwissTargetPrediction freely available for academic users?
Yes, SwissTargetPrediction can be used free of charge for academic research purposes.
How should I cite SwissTargetPrediction?
A manuscript describing the web service is currently in preparation. In the meantime you can cite our methodological paper.
What do we mean by chemical similarity?
Chemical similarity aims at quantifying the similarity between two molecules. This notion is very useful since similar molecules often have similar properties (targets, side-effects, toxicity,...). However, developing automated methods to quantify this similarity is a challenging task, especially because different features contribute to the similarity. Two main approaches have been developed, which are often referred to as 2D and 3D similarity.
What is 2D similarity?
2D similarity uses information present in the 2D chemical structure of the molecules. The most common approach for 2D similarity is to use chemical fingerprints, but other techniques have been developed such as Maximum Common Subgraph.
Is there a difference between 2D similarity and Fingerprint similarity?
In principle, yes. 2D similarity is any method that uses the 2D chemical structure as input. However, in practice the two terms are often used with the same meaning, since 2D similarity is most frequently measured using molecular fingerprints.
How does fingerprint-based similarity work?
Fingerprint-based methods encode a molecule into a large number of bits based on all linear fragments up to a maximal path length. Early methods used a database of fragments ("structural keys") and assigned a single bit to each of them based on its presence or absence in a molecule. More recent approaches use hashing methods to map each fragment in a molecule to bits in a fixed-length bit string. To measure the similarity between two fingerprint vectors, the Tanimoto coefficient (also called Jaccard's index) is often used. A more detailed description of fingerprint-based similarity can be found here. In SwissTargetPrediction, we use the FP2 fingerprints as implemented in OpenBabel.
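The following toy sketch illustrates the general principle of a hashed path fingerprint followed by a Tanimoto comparison. It is not the OpenBabel FP2 algorithm: molecules are given as hypothetical lists of atom labels and bonds, bond orders and rings are ignored, and the default path length (7 atoms) and fingerprint size (1024 bits) are assumptions chosen for illustration.

```python
import zlib


def linear_paths(atoms, bonds, max_len=7):
    """Collect label strings for all simple paths of up to max_len atoms."""
    neighbors = {i: set() for i in range(len(atoms))}
    for i, j in bonds:
        neighbors[i].add(j)
        neighbors[j].add(i)
    paths = set()

    def walk(path):
        forward = "-".join(atoms[i] for i in path)
        backward = "-".join(atoms[i] for i in reversed(path))
        # A path and its reverse are the same fragment: keep one canonical form.
        paths.add(min(forward, backward))
        if len(path) < max_len:
            for nxt in neighbors[path[-1]]:
                if nxt not in path:
                    walk(path + [nxt])

    for start in range(len(atoms)):
        walk([start])
    return paths


def fingerprint(atoms, bonds, n_bits=1024, max_len=7):
    """Hash every fragment to a bit position; return the set of bits that are on."""
    return {zlib.crc32(p.encode()) % n_bits
            for p in linear_paths(atoms, bonds, max_len)}


def tanimoto(fp_a, fp_b):
    """Shared on-bits divided by on-bits present in either fingerprint."""
    union = fp_a | fp_b
    return len(fp_a & fp_b) / len(union) if union else 0.0


# Heavy-atom graphs of ethanol (C-C-O) and 1-propanol (C-C-C-O).
ethanol = fingerprint(["C", "C", "O"], [(0, 1), (1, 2)])
propanol = fingerprint(["C", "C", "C", "O"], [(0, 1), (1, 2), (2, 3)])
print(round(tanimoto(ethanol, propanol), 2))
```

Because hashing can send two different fragments to the same bit, the bit string is a lossy summary of the fragment set; larger fingerprints reduce such collisions.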
What is 3D similarity?
3D similarity uses the 3D structure of molecules to find similarities between them. This is an important aspect that is typically neglected when computing 2D similarities. In particular, molecules with similar overall shape can sometimes bind the same pockets even if they are not similar in terms of their chemical structure. Therefore 3D similarity has the potential to uncover common targets between chemically different molecules. Different strategies have been designed to compute the 3D similarity. A standard approach consists of aligning the two molecules in 3D space. However, the alignment can be time-consuming. Therefore other strategies have been developed, such as Ultrafast Shape Recognition. 3D similarity requires computing the 3D structure of molecules. As a molecule can adopt many conformations, it is often advisable to compute many different conformations and compare them all. In SwissTargetPrediction, we typically compute 20 different conformations for each molecule and perform the 20-by-20 comparisons.
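The all-against-all comparison of conformations can be sketched as follows. Each conformation is represented by a hypothetical shape descriptor (a list of numbers), and the closest pair of conformations is retained; how the pairwise comparisons are actually aggregated is not stated above, so taking the minimum distance is an assumption of this example.

```python
from itertools import product


def manhattan(u, v):
    """Sum of absolute coordinate differences between two descriptors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def best_conformer_pair(confs_a, confs_b, distance=manhattan):
    """Compare every conformer of A with every conformer of B.

    Returns (smallest distance, index in A, index in B).
    """
    return min(
        (distance(a, b), i, j)
        for (i, a), (j, b) in product(enumerate(confs_a), enumerate(confs_b))
    )


# Two conformers per molecule here; 20 per molecule would give 400 comparisons.
confs_a = [[0.0, 0.0], [1.0, 1.0]]
confs_b = [[5.0, 5.0], [1.0, 1.5]]
best = best_conformer_pair(confs_a, confs_b)
print(best)
```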
What's special about ElectroShape?
ElectroShape is a very fast method to compare the shape of two molecules. It was developed as an extension of Ultrafast Shape Recognition. ElectroShape associates a five-dimensional vector with each atom of a molecule, consisting of the three Cartesian coordinates, the partial charge and the atomic lipophilicity. It then computes for each atom its distance to six centroids chosen as in Armstrong et al. (2011). A molecule is then described by the first three moments of the distance distribution to each of the six centroids (i.e. an 18-dimensional vector). Distances between two molecules are computed as the Manhattan distance between the 18-dimensional vectors.
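The descriptor construction can be sketched in a few lines. Two parts of this sketch are assumptions rather than the published method: the six centroids are picked by a simple farthest-point rule (a placeholder, not the scheme of Armstrong et al.), and the three moments are taken as the mean, the standard deviation and the cube root of the third central moment, which is one common convention in shape-recognition code. The atom data are hypothetical.

```python
import math


def dist(p, q):
    """Euclidean distance between two points of equal dimension."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def moments(values):
    """Mean, standard deviation and signed cube root of the third central moment."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return [mean, math.sqrt(m2), math.copysign(abs(m3) ** (1 / 3), m3)]


def placeholder_centroids(points, k=6):
    """Mean point, then farthest-point sampling. NOT the Armstrong et al. scheme."""
    dim = len(points[0])
    chosen = [[sum(p[d] for p in points) / len(points) for d in range(dim)]]
    while len(chosen) < k:
        chosen.append(max(points, key=lambda p: min(dist(p, c) for c in chosen)))
    return chosen


def descriptor(points, k=6):
    """Three moments per centroid, i.e. 18 numbers for k = 6."""
    vec = []
    for c in placeholder_centroids(points, k):
        vec.extend(moments([dist(p, c) for p in points]))
    return vec


def manhattan(u, v):
    return sum(abs(a - b) for a, b in zip(u, v))


# Hypothetical atoms: (x, y, z, scaled partial charge, scaled lipophilicity).
mol_a = [
    (0.0, 0.0, 0.0, 0.1, 0.3), (1.5, 0.0, 0.0, -0.2, 0.1),
    (2.2, 1.3, 0.0, 0.0, 0.4), (3.7, 1.4, 0.2, 0.3, -0.1),
    (4.4, 2.7, 0.1, -0.4, 0.2), (5.9, 2.9, 0.5, 0.2, 0.0),
]
mol_b = [tuple(2.0 * x for x in atom) for atom in mol_a]  # a stretched copy

desc_a, desc_b = descriptor(mol_a), descriptor(mol_b)
print(len(desc_a), round(manhattan(desc_a, desc_b), 3))
```

Since each molecule is reduced to a short fixed-length vector, comparing two molecules costs only a handful of subtractions, with no 3D alignment step.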
Why is it useful to combine different similarity measures?
Each similarity measure has its own strengths and weaknesses. In our work, we observed that combining different similarity measures significantly increases the accuracy of predictions. Moreover, this allows us to explore larger regions of the chemical space, which is useful, for instance, to explore different scaffolds ('scaffold-hopping') or to overcome intellectual property issues.