Five Ways of Using AlphaFold 2

Read the previous section: Why Is AlphaFold 2 So Powerful?

Next, I will discuss five different ways in which a researcher can use AlphaFold 2 to predict protein structures, along with the pros and cons of each approach.

The first way: use the AlphaFold database. If you want to view the AlphaFold 2-predicted structure for a known protein (i.e., one listed in UniProt), you can go to https://alphafold.ebi.ac.uk and search for it directly. The AlphaFold database, a collaboration between Google DeepMind and EMBL-EBI, stores pre-computed prediction results for nearly all UniProt proteins.

Pros:
1. It is easy.
2. It is fast. No waiting is needed.
3. It is free.

Cons:
1. It is only available for proteins already in UniProt.
2. You cannot alter how predictions are generated: the database offers no option to adjust prediction parameters (unlike the fifth way, installing and running AlphaFold 2 locally, where you have full control over the prediction process). In addition, the database provides only one prediction per protein, whereas running AlphaFold 2 locally with default parameters generates five.
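For readers who prefer to script the lookup rather than use the search box, the following is a minimal sketch of fetching one database entry by UniProt accession. The file-naming pattern (`AF-<accession>-F1-model_v<version>.pdb`), the default version number, and the function names are assumptions for illustration; check the download link shown on the entry's page at https://alphafold.ebi.ac.uk before relying on them.

```python
from urllib.request import urlopen

# Assumed base path for downloadable model files; verify on the site.
AFDB_FILES = "https://alphafold.ebi.ac.uk/files"


def afdb_model_url(accession: str, version: int = 4,
                   fragment: int = 1, fmt: str = "pdb") -> str:
    """Build the (assumed) download URL for one AlphaFold database entry."""
    acc = accession.strip().upper()
    # Only plain accessions such as P69905 are accepted in this sketch.
    if not acc.isalnum():
        raise ValueError(f"unexpected characters in accession: {accession!r}")
    return f"{AFDB_FILES}/AF-{acc}-F{fragment}-model_v{version}.{fmt}"


def download_model(accession: str, dest: str, **kwargs) -> str:
    """Download the model file to `dest` and return the URL that was used."""
    url = afdb_model_url(accession, **kwargs)
    with urlopen(url, timeout=30) as response, open(dest, "wb") as out:
        out.write(response.read())
    return url
```

For example, `download_model("P69905", "P69905.pdb")` would attempt to save the single stored prediction for that accession. If the request fails, the version number in the file name is the first thing to check.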

The second way: use homology search. If your protein of interest is not listed in UniProt, you can use the homology search functions provided by EMBL-EBI at https://www.ebi.ac.uk/Tools/sss/fasta/. Under "Structures," select "AlphaFold DB." Enter the sequence of your protein in the provided text box, choose a homology search function (such as FASTA), and submit your job. You will receive a list of proteins that are homologous to your protein, for which AlphaFold 2-predicted structures are available.

Pros:
1. Almost as easy as the first way.
2. It is fast.
3. It is free.
4. Feasible for proteins not present in UniProt.

Cons:
1. The predicted structures you will receive are not for your protein of interest, but rather for proteins similar to it.
2. As with the first method, you do not have control over the prediction process when using the homology search function. Additionally, you will only receive one prediction for each protein, rather than the five predictions generated when running AlphaFold 2 locally with default parameters.
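Before pasting a sequence into the search form, it can help to clean it up and put it in FASTA format. The sketch below is one way to do that, assuming the input is a protein sequence written with the 20 standard one-letter amino acid codes. The function names are illustrative, and rejecting ambiguity codes such as X or B is a choice made for this example rather than a requirement of the EMBL-EBI service.

```python
# The 20 standard amino acid one-letter codes.
VALID_RESIDUES = set("ACDEFGHIKLMNPQRSTVWY")


def clean_sequence(raw: str) -> str:
    """Remove whitespace, upper-case, and reject non-standard letters."""
    seq = "".join(raw.split()).upper()
    if not seq:
        raise ValueError("sequence is empty")
    bad = sorted(set(seq) - VALID_RESIDUES)
    if bad:
        raise ValueError(f"non-standard residue codes: {', '.join(bad)}")
    return seq


def to_fasta(name: str, raw: str, width: int = 60) -> str:
    """Return a FASTA record with the sequence wrapped at `width` columns."""
    seq = clean_sequence(raw)
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return f">{name}\n" + "\n".join(lines) + "\n"
```

Calling `to_fasta("my_protein", sequence_text)` returns text that can be pasted directly into the sequence box.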

The third way: use AlphaFold Colab. A Colab notebook is a web page that hosts a pre-written Python program and connects to a computer in the cloud on the backend. AlphaFold Colab is a service provided by Google that allows you to run predictions on your own protein sequences using a modified version of AlphaFold 2 on a Google Cloud computer. The main difference between this version and the full AlphaFold 2 is that templates are not used (templates are part of the homology features used in the training of AlphaFold 2). AlphaFold Colab is free to use with a Google account, although a monthly subscription is also available for $13.99 to access improved service and hardware.

Pros:
1. No coding or installing is needed.
2. You get to make predictions for your protein of interest.
3. It is free (or almost free).
4. While the modified version of AlphaFold 2 used in AlphaFold Colab is not the full version, it still provides relatively accurate predictions for many protein sequences, particularly shorter ones.

Cons:
1. Wait times are long, and execution sometimes times out before completing. It is not suitable for large prediction jobs.
2. The accuracy of predictions made using AlphaFold Colab is not as good as that of the full version of AlphaFold 2, and it may not be reliable, particularly for longer proteins with more than 800 amino acids.
3. Like the first and second methods, you are unable to adjust parameters or control the prediction process when using AlphaFold Colab.

The fourth way: use ColabFold. ColabFold is a community-based Colab developed by Sergey Ovchinnikov and his colleagues, with modifications made to the AlphaFold 2 code. One of the main changes is the replacement of the MSA algorithm used in the original AlphaFold 2 with the faster MMseqs2, resulting in significant time savings. ColabFold also includes a limited set of additional features, most notably the ability to make predictions for both homo-oligomers and hetero-oligomers.

Pros:
1. As with the third way, no coding or installing is needed.
2. As with the third way, you get to make predictions on your protein of interest.
3. It is free (or almost free).
4. ColabFold is generally faster than the third method (AlphaFold Colab), although some sacrifice in accuracy may be observed according to benchmarking.
5. ColabFold offers some additional functions that are not present in the original AlphaFold 2, such as the ability to make predictions for protein complexes (homo-oligomers and hetero-oligomers).

Cons:
1. Although it is faster than the third way, it is a community-based service and is not suitable for large prediction jobs.
2. The accuracy of ColabFold is not as high as that of the original AlphaFold 2. Limited benchmarking makes it difficult to determine the expected degree of reduction in accuracy.
3. As with each of the first three methods, you are unable to adjust the parameters of the predictions or control the prediction process when using ColabFold.
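To illustrate the oligomer feature mentioned above: the ColabFold notebook takes a complex as a single query string in which chain sequences are separated by colons, with a homo-oligomer written as the same sequence repeated. The helper below is a small sketch of building such a string. The function name and the `copies` parameter are hypothetical, and the colon convention should be confirmed against the instructions in the notebook version you are using.

```python
def colabfold_query(chains: list[str], copies: int = 1) -> str:
    """Join chain sequences with ':' to describe a complex.

    `copies` repeats the whole set of chains, so one chain with
    copies=2 describes a homodimer, and two different chains with
    copies=1 describe a heterodimer.
    """
    if copies < 1:
        raise ValueError("copies must be at least 1")
    # Strip whitespace and upper-case each chain before joining.
    cleaned = ["".join(chain.split()).upper() for chain in chains]
    if not cleaned or any(not chain for chain in cleaned):
        raise ValueError("every chain needs a non-empty sequence")
    return ":".join(cleaned * copies)
```

For instance, `colabfold_query(["MKTAYIAK"], copies=2)` describes a homodimer, while `colabfold_query(["MKTAYIAK", "GAVLIPFW"])` describes a heterodimer of two different chains.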

The fifth way: install and run it yourself. Google DeepMind has made AlphaFold 2 open-source, allowing you to install it and run predictions on your own computer. You can choose either the Docker version or the non-Docker setup. While the initial setup cost is relatively high, the benefits of installing and running AlphaFold 2 directly are significant. A fast, multi-core computer with large storage space (an SSD is recommended) is required to host the ~2.2TB database. A GPU is not strictly necessary, but one is recommended for acceptable prediction speed (ideally an A100, which costs around $20,000 on the market). You can also install AlphaFold 2 on Google Cloud Platform (GCP), which can save on costs if you do not anticipate running many predictions over extended periods of time. The main advantage of installing and running AlphaFold 2 directly is access to the full power of the model, including the ability to adjust prediction parameters and make as many predictions as needed. Additionally, with sufficient knowledge, you can modify the code and conduct tests related to your own hypotheses. If you do not have the skills or expertise to install and run AlphaFold 2 on your own, you can seek assistance from AccuraScience.

Pros:
1. You get the full power of AlphaFold 2 in all aspects.

Cons:
1. It is costly.
2. It requires the technical know-how to install, configure, and run the software.
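Because a local run with default parameters produces five predictions per protein, a common follow-up step is to compare them by confidence. AlphaFold 2 writes its per-residue confidence score (pLDDT) into the B-factor column of the PDB files it outputs, so a rough comparison can be sketched as below. The function names and model labels are illustrative, and averaging over C-alpha atoms is just one simple way to summarize a model.

```python
from statistics import mean


def ca_plddt(pdb_text: str) -> list[float]:
    """Collect per-residue pLDDT values from the C-alpha ATOM records."""
    scores = []
    for line in pdb_text.splitlines():
        # PDB fixed columns: atom name in 13-16, B-factor in 61-66.
        if line.startswith("ATOM") and line[12:16].strip() == "CA":
            scores.append(float(line[60:66]))
    return scores


def rank_by_mean_plddt(models: dict[str, str]) -> list[tuple[str, float]]:
    """Return (label, mean pLDDT) pairs, highest confidence first."""
    ranked = []
    for label, pdb_text in models.items():
        scores = ca_plddt(pdb_text)
        if not scores:
            raise ValueError(f"no C-alpha atoms found in {label!r}")
        ranked.append((label, mean(scores)))
    return sorted(ranked, key=lambda item: item[1], reverse=True)
```

Here `models` maps a label of your choosing to the text of each predicted PDB file, for example after reading the five output files from a run directory.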

Read next: Limitations of AlphaFold 2 Modeling

-- About the author

Need assistance in your AlphaFold or structural bioinformatics project? We may be able to help. Take a look at the intro to our bioinformatician team, see some of the advantages of using our team's help here, and check out our FAQ page!

Send us an inquiry, chat with us online (during our business hours 9-5 Mon-Fri U.S. Central Time), or reach us in other ways!


