Folding@Home

The PlayStation 3 Folding@home client displays a 3D model of the protein being simulated.

Author
Vijay Pande

Developer
Stanford University / Pande Group

Latest release
5.04 (Windows),
5.04 (Linux and BSD),
5.02 (Mac OS X),
1.3 (PlayStation 3)

Preview release
6.00beta1 (Windows),
6.00beta1 (Linux),
5.91beta6 (GPU GUI),
6.00beta1 (GPU console),
5.91beta (Windows-SMP),
6.00beta1 (Mac OS X-SMP) / 2007-09-26

Platform
Cross-platform

Genre
Distributed computing

License
Proprietary [1]

Website
folding.stanford.edu

Folding@Home (also known as FAH or F@H) is a distributed computing project designed to perform computationally intensive simulations of protein folding and other molecular dynamics. It was launched on October 1, 2000, and is currently managed by the Pande Group, within Stanford University’s chemistry department, under the supervision of Professor Vijay Pande. Folding@home is the most powerful distributed computing cluster in the world, according to Guinness,[1] and one of the world’s largest distributed computing projects.[2] The goal of the project is “to understand protein folding, misfolding, and related diseases.”[3]

Accurate simulations of protein folding and misfolding enable the scientific community to better understand the development of many diseases, including Alzheimer’s disease, BSE (mad cow disease), cancer, Huntington’s disease, cystic fibrosis and other aggregation-related diseases. [2] More fundamentally, understanding the process of protein folding — how biological molecules assemble themselves into a functional state — is one of the outstanding problems of molecular biology. So far, the Folding@home project has successfully simulated folding in the 5-10 microsecond range — a time scale thousands of times longer than it was previously thought possible to model.[4]
The Pande Group’s goal is to refine the molecular dynamics (MD) and Folding@home distributed computing (DC) methods to the point where they become an essential tool for MD research.[5] To that end, the group collaborates with various scientific institutions.[6]
As of December 13, 2007, fifty-four scientific research papers have been published using the project’s work.[7] A University of Illinois at Urbana-Champaign report dated October 22, 2002 states that Folding@home distributed simulations of protein folding are demonstrably accurate.[8]

On September 16, 2007, the Folding@Home project officially attained a performance level higher than one petaFLOPS, becoming the first computing system of any kind to do so, although it had briefly peaked above one petaFLOPS in March 2007.[9][10] In comparison, the fastest supercomputer in the world as of November 2007, IBM’s Blue Gene/L, peaks at 478.2 teraFLOPS (1,000 teraFLOPS = 1 petaFLOPS).
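The comparison can be checked with a few lines of arithmetic. The sketch below uses only the two figures quoted above; the variable names are illustrative.

```python
# Figures quoted in the paragraph above.
TERA = 10**12
PETA = 10**15

folding_at_home_flops = 1 * PETA        # level passed in September 2007
blue_gene_l_peak_flops = 478.2 * TERA   # Blue Gene/L peak, November 2007

ratio = folding_at_home_flops / blue_gene_l_peak_flops
print(f"1 petaFLOPS = {PETA // TERA} teraFLOPS")
print(f"Folding@home / Blue Gene/L peak: {ratio:.2f}x")
```

On these figures, one petaFLOPS is roughly twice the quoted Blue Gene/L peak.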

How it works

 
Folding@Home does not rely on powerful supercomputers for its data processing; instead, the primary contributors to the Folding@home project are many hundreds of thousands of personal computer users who have installed a small client program. The client will, at the user’s choice, run in the background, utilizing otherwise unused CPU power, or run as a screensaver only while the user is away. In most modern personal computers, the CPU is rarely used to its full capacity; the Folding@Home client takes advantage of this unused processing power.

The Folding@Home client periodically connects to a server to retrieve “work units,” which are packets of data upon which to perform calculations. Each completed work unit is then sent back to the server. As data integrity is a major concern for all distributed computing projects, all work units are validated through the use of a 2048-bit digital signature.
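The retrieve, compute, return and validate cycle can be illustrated with a minimal sketch. It is not the project's actual protocol, whose source code is not public. The function names, the message layout and the key are all hypothetical, and an HMAC-SHA256 tag from the Python standard library stands in for the 2048-bit digital signature. Unlike a true digital signature, an HMAC relies on a shared secret, so this only demonstrates the tamper-detection idea.

```python
import hashlib
import hmac
import json

# Hypothetical shared key. The real project uses a 2048-bit digital
# signature; HMAC-SHA256 is only a stand-in for illustration.
SERVER_KEY = b"demo-key-not-a-real-secret"


def sign(payload: dict) -> str:
    """Return a hex tag over a canonical JSON encoding of the payload."""
    encoded = json.dumps(payload, sort_keys=True).encode("utf-8")
    return hmac.new(SERVER_KEY, encoded, hashlib.sha256).hexdigest()


def issue_work_unit(unit_id: int, data: list) -> dict:
    """Server side: package data into a tagged work unit."""
    payload = {"unit_id": unit_id, "data": data}
    return {"payload": payload, "tag": sign(payload)}


def is_valid(message: dict) -> bool:
    """Check that the payload still matches its tag."""
    return hmac.compare_digest(sign(message["payload"]), message["tag"])


def process(work_unit: dict) -> dict:
    """Client side: validate the unit, run a placeholder calculation, tag the result."""
    if not is_valid(work_unit):
        raise ValueError("work unit failed validation")
    payload = work_unit["payload"]
    result = {"unit_id": payload["unit_id"], "result": sum(payload["data"])}
    return {"payload": result, "tag": sign(result)}


unit = issue_work_unit(1, [1.5, 2.5, 4.0])
completed = process(unit)
print(is_valid(completed), completed["payload"]["result"])

# Changing the payload after it was tagged is detected.
completed["payload"]["result"] = 0.0
print(is_valid(completed))
```

The placeholder calculation is a simple sum; in the real client this step would be the molecular simulation performed by one of the cores.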

The Folding@Home client utilizes modified versions of four molecular simulation programs for calculation: TINKER, GROMACS, AMBER, and CPMD.[11] There are many core variations on these base simulation programs:[12]

  • TINKER
    • Tinker core (currently inactive)
  • GROMACS (all variants of this core use SSE, 3DNow+ or AltiVec optimizations, where available, unless otherwise specified)
    • Gromacs
    • Double Gromacs (Double Precision, uses SSE2 only)
    • Double Gromacs B (an update of Double Gromacs, both are still in use, uses SSE2 only)
    • GBGromacs (Gromacs with the Generalized Born implicit solvent model)
    • Gromacs SREM (Gromacs Serial Replica Exchange Method), also known as GroST (Gromacs Serial replica exchange with Temperatures); uses the Replica Exchange Method, also known as REMD (Replica Exchange Molecular Dynamics), in its simulations
    • GroSimT (Gromacs with Simulated Tempering)
    • Gromacs 33 (using the newest Gromacs 3.3 codebase)
    • Gro-SMP (Symmetric MultiProcessing variant) (runs only on x86 or x64 hardware, uses SSE only)
    • Gro-GPU (Graphics Processing Unit variant) (GPUs do not have optimizations; they are powerful enough to do the calculations simply using brute force)
    • Gro-PS3 (PlayStation 3 variant) (no optimizations, see GPU variant)
  • AMBER
    • PMD core[13]
  • CPMD
    • QMD (currently inactive, because the QMD developer graduated from Stanford University and current research has shifted away from quantum MD; there was also an SSE2 controversy involving Intel libraries and AMD processors)[14]

Possible future additions:

Contributors to Folding@Home may have user names used to keep track of their contributions. Each user may be running the client on one or more CPUs; for example, a user with two computers could run the client on both of them. Users may also contribute under one or more team names; many different users may join together to form a team. Contributors are assigned a score indicating the number and difficulty of completed work units. Rankings and other statistics are posted to the Folding@Home website.

Participation

 
Shortly after breaking the 200,000 active CPU count on September 20, 2005, the Folding@home project celebrated its fifth anniversary on October 1, 2005.

As of January 5, 2008, the Folding@Home project has received computational results from over 2.7 million devices[2] since its launch.

Interest and participation in the project have grown steadily since its launch. The number of active devices increased substantially after the project received wide publicity during the launch of its high-performance clients for ATI graphics cards and the PlayStation 3.

As of November 3, 2007, the overall peak speed of the project had exceeded 1.5 petaFLOPS.[2]

Google & Folding@home

Folding@home formerly cooperated with Google Labs through Google Compute. Google Compute supported Folding@home during its early stage, when the project had about 10,000 active CPUs and a boost of 20,000 machines was very significant. As the project’s own pool of active CPUs grew and few new clients joined through Google Compute (most people opted for the Folding@home client instead), the program was discontinued. The Google Compute clients also had certain limits: they could only run the TINKER core and had limited naming and team options. Folding@home is no longer supported on Google Toolbar, and even the old Google Toolbar client will not work.[16]

High performance platforms

 

Graphics processing units

On October 2, 2006, the Folding@home GPU client was released to the public as a beta test. After nine days of processing with the beta client, the Folding@home project had received 31 teraFLOPS of computational performance from just 450 X1900 GPUs, an average of over 70 times the performance of current CPU submissions.[2]
The next FAH GPU client will support the 2xxx and 3xxx series of ATI GPUs; there are no FAH GPU clients for Nvidia GPUs.[17]
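The per-device figures implied by the beta statistics can be worked out directly. The sketch below uses only the numbers quoted above and treats "over 70 times" as exactly 70, so the per-CPU value is an approximate upper bound rather than a reported measurement.

```python
total_teraflops = 31     # reported after nine days of the beta
gpu_count = 450          # X1900 GPUs contributing
speedup_over_cpu = 70    # "over 70 times", treated here as exactly 70

gflops_per_gpu = total_teraflops * 1000 / gpu_count
implied_gflops_per_cpu = gflops_per_gpu / speedup_over_cpu

print(f"Average per GPU: {gflops_per_gpu:.1f} gigaFLOPS")
print(f"Implied average per CPU: at most about {implied_gflops_per_cpu:.2f} gigaFLOPS")
```

On these figures each GPU averaged about 69 gigaFLOPS, which implies an average CPU submission of a little under one gigaFLOPS.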

PlayStation 3

 
Stanford announced in August 2006 that a folding client was available to run on the Sony PlayStation 3.[18] The intent was that gamers would be able to contribute to the project by merely “contributing electricity,” leaving their PlayStation 3 consoles running the client while not playing games. PS3 firmware version 1.6 (released on Thursday, March 22, 2007) allows for Folding@home software, a 50 MB download, to be used on the PS3.[2]
The project reached a peak output of 990 teraFLOPS on March 25, 2007, at which time the number of FLOPS from each PS3 as reported by Stanford fell, reducing the overall speed rating of those machines by 50%. This lowered the overall project speed to the mid-700 teraFLOPS range and increased the number of active PS3s required to reach the petaFLOPS level to around 60,000. Lately, the console accounts for about 60% of all teraFLOPS.
On April 25, 2007, Sony announced that a new version of Folding@home would be released the next day. The new version would improve folding performance beyond the current capacity, far beyond even the 400 teraFLOPS previously reached by PS3 users.[19] The release led to the breaking of the petaFLOPS barrier for the first time by any computing system in history on September 15, 2007.[20][21] Guinness World Records has recognized Folding@Home as the most powerful distributed computing network, in large part thanks to the PS3.[22]

On December 19, 2007, Sony again updated the FAH firmware to version 2.1 to allow users to play music stored on their hard drives while contributing. Another feature of the 2.1 firmware lets users automatically shut down their console after the current work is done or after a limited period of time (for example, 3 or 4 hours).[23][24] The software update also added the Generalized Born implicit solvent model, giving the FAH PS3 client broader computing capabilities.

Multi-core processing client

 
As newer CPUs are released and multi-core processors become more widely adopted, the Pande Group is adding symmetric multiprocessing (SMP) support to the Folding@home client in the hope of capturing the additional processing power.
On November 13, 2006, the beta SMP Folding@home clients for x86-64 Linux and x86 Mac OS X were released. A beta Win32 SMP Folding@home client is available as well, and a 32-bit Linux client is currently in development.[25]

Folding@home teams

A typical Folding@home user, running the client on a single PC, will likely not be ranked high on the list of contributors. However, a user who joins a team adds the points they receive to a larger collective. Teams are scored by the combined points of all their members, so teams are ranked much higher than individual submitters. Rivalries between teams create friendly competition that benefits the folding community. Many teams publish their own stats, so members can have intra-team competitions for top spots. Teams offer no real benefits other than self-gratification and possibly extra contributions (to add to the team’s rank).[26]
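The way team scores combine individual points can be sketched in a few lines. The user names, team names and point values below are invented for illustration; they are not project statistics, and the project's own statistics code is not public.

```python
from collections import defaultdict

# Hypothetical contributors: (user name, team name, points)
contributions = [
    ("alice", "TeamA", 1200),
    ("bob", "TeamA", 800),
    ("carol", "TeamB", 1500),
    ("dave", "TeamB", 300),
    ("erin", "TeamA", 400),
]

# A team's score is the sum of its members' points.
team_totals = defaultdict(int)
for _user, team, points in contributions:
    team_totals[team] += points

# Rank teams and individuals from highest to lowest score.
team_ranking = sorted(team_totals.items(), key=lambda item: item[1], reverse=True)
user_ranking = sorted(contributions, key=lambda row: row[2], reverse=True)

print(team_ranking)
print(user_ranking[0])
```

In this made-up data the highest-scoring individual has 1,500 points, while both teams total more than that, which is why a team ranks above any single one of its members.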

Development

The Folding@home project does not make the project source code available to the public, citing security and integrity concerns.[27][28] At the same time, the majority of the scientific codes used by FAH (e.g., Cosm, GROMACS, TINKER, AMBER, CPMD, BrookGPU) are largely open-source software (OSS) or under similar licenses.

A development version of Folding@Home runs on the open source BOINC framework; however, this version remains unreleased.[29]

See also

  • Blue Gene
  • List of distributed computing projects

Notes and references

  1. ^ Engadget, among other sites, announces that Guinness has recognized FAH as the most powerful distributed cluster, Oct 31, 2007. Retrieved Nov 5, 2007
  2. ^ a b c d e Client Statistics by OS. Folding@home distributed computing. Stanford University (2006-11-12 (updated automatically)). Retrieved on 2008-01-05.
  3. ^ Vijay Pande (2006). Folding@home distributed computing home page. Stanford University. Retrieved on 2006-11-12.
  4. ^ Validity of Folding@home (Blog). Folding@home support forum. Stanford University. Retrieved on 2006-11-12.
  5. ^ Futures in Biotech 27: Folding@Home at 1.3 Petaflops (Interview, webcast).
  6. ^ Folding@home – About (FAQ).
  7. ^ Vijay Pande and the Folding@home team (2007). Folding@home – Papers. Folding@home distributed computing. Stanford University. Retrieved on 2007-12-13.
  8. ^ C. Snow, H. Nguyen, V. S. Pande, and M. Gruebele. (2002). “Absolute comparison of simulated and experimental protein-folding dynamics”. Nature 420 (6911): 102–106. PMID 12422224.

  9. ^ http://folding.typepad.com/news/2007/09/crossing-the-pe.html
  10. ^ http://folding.typepad.com/news/2007/09/post-petaflop.html
  11. ^ Vijay Pande (2005-10-16). Folding@Home with QMD core FAQ (FAQ). Stanford University. Retrieved on 2006-12-03. The site indicates that Folding@home uses a modification of CPMD allowing it to run on the supercluster environment.
  12. ^ Cores – FaHWiki (FAQ). Retrieved on 2007-11-06.
  13. ^ Cores – FaHWiki (FAQ). Retrieved on 2007-12-15.
  14. ^ FAH & QMD & AMD64 & SSE2 (FAQ).
  15. ^ Folding@home – About (FAQ).
  16. ^ What is the state of Google Compute client? (Blog). Folding@home support forum. Stanford University. Retrieved on 2006-11-12.
  17. ^ http://folding.typepad.com/news/2008/01/misc-gpu-commen.html
  18. ^ Vijay Pande (2006-10-22). PS3 FAQ. Stanford University. Retrieved on 2006-11-13.
  19. ^ PS3 Folding Kicking Ass, Getting Update.
  20. ^ http://folding.typepad.com/news/2007/09/crossing-the-pe.html
  21. ^ http://folding.typepad.com/news/2007/09/post-petaflop.html
  22. ^ http://kotaku.com/gaming/research/ps3-pushes-foldinghome-to-world-record-317151.php
  23. ^ Folding@home™ for PLAYSTATION®3 Version 1.3. Retrieved on 2007-12-31.
  24. ^ Rimon, Noam (2007-12-18). New Folding@Home Features Coming. Retrieved on 2007-12-31.
  25. ^ Vijay Pande (2006-11-13). Folding@home SMP Client FAQ. Stanford University. Retrieved on 2006-11-13.
  26. ^ Folding-community: why have teams?
  27. ^ Why not OpenSource?.
  28. ^ Folding@home Open Source FAQ.
  29. ^ FAH on BOINC. Folding@Home high performance client FAQ.
  • M. R. Shirts and V. S. Pande. (2000). “Screen Savers of the World, Unite!”. Science 290: 1903–1904.

  • C. Snow, H. Nguyen, V. S. Pande, and M. Gruebele. (2002). “Folding of a bba protein: simulation and theory.”. Nature 420: 102–106.

  • C. D. Snow, E. J. Sorin, Y. M. Rhee, and V. S. Pande. (2005). “How well can simulation predict protein folding kinetics and thermodynamics?”. Annual Reviews of Biophysics 34: 43–69.

  • L. T. Chong, C. D. Snow, Y. M. Rhee, and V. S. Pande. (2004). “Dimerization of the p53 oligomerization domain: Identification of a folding nucleus by molecular dynamics simulations.”. Journal of Molecular Biology 345: 869–78.

  • I. Suydam, C. D. Snow, V. S. Pande and S. G. Boxer. (2006). “Electric Fields at the Active Site of an Enzyme: Direct Comparison of Experiment with Theory.”. Science in press.

  • Folding-community: How can you tell the true nature of a Work Unit
  • Folding-community: Vijay – No need to report EUEs