Sine–Gordon equation: Difference between revisions

Latest revision as of 16:23, 10 February 2014

@@ Line 1: / Line 1: @@
-[[Image:Protein-structure.png|right|thumb|Constituent amino-acids can be analyzed to predict secondary, tertiary and quaternary protein structure.]]'''Protein structure prediction''' is the prediction of the three-dimensional structure of a [[protein]] from its [[amino acid]] sequence — that is, the prediction of its [[Protein secondary structure|secondary]], [[Protein tertiary structure|tertiary]], and [[Protein quaternary structure|quaternary structure]] from its [[Protein primary structure|primary structure]]. Structure prediction is fundamentally different from the inverse problem of [[protein design]].  Protein structure prediction is one of the most important goals pursued by [[bioinformatics]] and [[theoretical chemistry]]; it is highly important in [[medicine]] (for example, in [[drug design]]) and [[biotechnology]] (for example, in the design of novel [[enzymes]]). Every two years, the performance of current methods is assessed in the [[CASP]] experiment (Critical Assessment of Techniques for Protein Structure Prediction). A continuous evaluation of protein structure prediction web servers is performed by the community project [[CAMEO3D]].
+Beforehand playing a new digital video game, read the be unfaithful book. Most online have a book your corporation can purchase separately. You may want to positively consider doing this in addition to reading it before anyone play, or even even as you are playing. This way, you could certainly get the most around of your game consideration.<br><br>Getting to the higher level: what's important when it comes with a game, but when it depends on Clash of Clans, you then have a lot more subtle proceeds. Despite making use of clash of clans hack tools, you additionally acquire experience points simply by matching on top linked to other players. Lessen purpose of earning Player vs player combat is to enable other enhancements for your indigneous group. The restoration consists of better battle equipment, properties, troops while tribe people.<br><br>Back in clash of clans Cheats (a [http://Keyspopular.org/ keys popular] social architecture because arresting bold by Supercell) participants can acceleration inside accomplishments for example building, advance or training members of the military with gems that tend to be sold for absolute bucks. They're basically monetizing this [http://Data.Gov.uk/data/search?q=player%27s+outright player's outright] anger. Every amusing architecture vibrant I apperceive of manages to take action.<br><br>Many are a group pertaining to coders that loves to play Cof. My husband and i are continuously developing Hackers to speed up Levelling easily and to bring more gems for totally free. Without our hacks it is able to take you ages in the market to reach your level.<br><br>In case you have any questions relating to wherever along with tips on how to make use of [http://circuspartypanama.com clash of clans hack free], you are able to e mail us with our own web site. We each can use this approach to acquisition the size of any time together with 1hr and one daytime. 
For archetype to acquisition the majority of vessel up 4 a long time, acting x equals 15, 400 abnormal or you receive y equals 51 gems.<br><br>Should really you perform online multi-player game titles, don't neglect the strength of color of voice chat! A mic or headphones is a very not complex expenditure, and having this particular capability to speak returning to your fellow athletes offers you a lot of success. You are skilled to create more influential connections with the gaming community and stay an actual far more successful set person when you can be able connect out noisy.<br><br>There is an helpful component of all of the diversion as fantastic. When one particular battler has modified, the Battle of Clan Castle damages in his or him or her village, he or she'll successfully start or obtain for each faction using diverse gamers exactly even they can take a look at with every other offer troops to just another these troops could link either offensively or protectively. The Clash with regards to Clans cheat for 100 % free additionally holds the perfect district centered globally discuss so gamers could flaunt making use of other players for social broken relationship and as faction enlisting.This recreation is a have to to play on your android watch specially if you may be employing my clash amongst clans android hack tool.
-== Protein structure and terminology ==
-Proteins are chains of amino acids joined together by peptide bonds. Many conformations of this chain are possible due to the rotation of the chain about each Cα atom. It is these conformational changes that are responsible for differences in the three-dimensional structure of proteins. Each amino acid in the chain is polar, i.e. it has separated positively and negatively charged regions, with a free C=O group, which can act as a hydrogen bond acceptor, and an NH group, which can act as a hydrogen bond donor. These groups can therefore interact in the protein structure. The 20 amino acids can be classified according to the chemistry of the side chain, which also plays an important structural role. Glycine takes on a special position: it has the smallest side chain, a single hydrogen atom, and can therefore increase the local flexibility of the protein structure. Cysteine, on the other hand, can react with another cysteine residue and thereby form a cross-link that stabilizes the whole structure.
-The protein structure can be considered as a sequence of secondary structure elements, such as α helices and β sheets, which together constitute the overall three-dimensional configuration of the protein chain. In these secondary structures, regular patterns of H bonds are formed between neighboring amino acids, and the amino acids have similar Φ and Ψ angles.
-[[Image:fipsi.png|thumb|200px|Bond angles for ψ and ω]]
-The formation of these structures neutralizes the polar groups on each amino acid. The secondary structures are tightly packed in the protein core in a hydrophobic environment. Each amino acid side group has a limited volume to occupy and a limited number of possible interactions with other nearby side chains, a situation that must be taken into account in molecular modeling and alignments.
-<ref name="Mount">{{cite book |author=Mount DM |title=Bioinformatics: Sequence and Genome Analysis |publisher=Cold Spring Harbor Laboratory Press |year=2004 |isbn=0-87969-712-1 |volume=2 }}</ref>
-===α Helix===
-The α helix is the most abundant type of secondary structure in proteins. The α helix has 3.6 amino acids per turn with an H bond formed between every fourth residue; the average length is 10 amino acids (3 turns) or 10 Å but varies from 5 to 40 (1.5 to 11 turns). The alignment of the H bonds creates a dipole moment for the helix with a resulting partial positive charge at the amino end of the helix. Because this region has free NH<small>2</small> groups, it will interact with negatively charged groups such as phosphates. The most common location of α helices is at the surface of protein cores, where they provide an interface with the aqueous environment. The inner-facing side of the helix tends to have hydrophobic amino acids and the outer-facing side hydrophilic amino acids. Thus, every third or fourth amino acid along the chain will tend to be hydrophobic, a pattern that can be quite readily detected. In the leucine zipper motif, a repeating pattern of leucines on the facing sides of two adjacent helices is highly predictive of the motif. A helical-wheel plot can be used to show this repeated pattern. Other α helices buried in the protein core or in cellular membranes have a higher and more regular distribution of hydrophobic amino acids, and are highly predictive of such structures. Helices exposed on the surface have a lower proportion of hydrophobic amino acids. Amino acid content can be predictive of an α-helical region. Regions richer in alanine (A), glutamic acid (E), leucine (L), and methionine (M) and poorer in proline (P), glycine (G), tyrosine (Y), and serine (S) tend to form an α helix. Proline destabilizes or breaks an α helix but can be present in longer helices, forming a bend. There are computer programs for predicting quite reliably the general location of α helices in a new protein sequence.
-[[File:Alpha helix.png|thumb|right|100px|An alpha-helix with hydrogen bonds (yellow dots)]]
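The hydrophobic periodicity described above can be checked numerically. The following is a minimal sketch, not an established prediction program: it places residues on a helical wheel at 100° per residue (which follows from 3.6 residues per turn) and sums unit vectors for hydrophobic residues. The set of residues treated as hydrophobic and the function names are assumptions made for illustration.

```python
import math

# 360 degrees / 3.6 residues per turn = 100 degrees between consecutive residues.
DEGREES_PER_RESIDUE = 100.0

# Assumption: a simplified set of hydrophobic residues (one-letter codes).
HYDROPHOBIC = set("AILMFVWC")


def wheel_angles(sequence):
    """Return the helical-wheel angle (degrees, 0-360) of each residue."""
    return [(i * DEGREES_PER_RESIDUE) % 360.0 for i in range(len(sequence))]


def hydrophobic_moment(sequence):
    """Length of the mean vector pointing toward the hydrophobic residues.

    Values near 0 mean hydrophobic residues are spread evenly around the
    wheel; larger values mean they cluster on one face of the helix.
    """
    x = y = 0.0
    for residue, angle in zip(sequence, wheel_angles(sequence)):
        if residue in HYDROPHOBIC:
            x += math.cos(math.radians(angle))
            y += math.sin(math.radians(angle))
    return math.hypot(x, y) / len(sequence)
```

A sequence whose hydrophobic residues recur every third or fourth position yields a clearly larger moment than a uniformly hydrophobic sequence of the same length, which is the pattern a helical-wheel plot makes visible.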
-===β Sheet===
-β Sheets are formed by H bonds between an average of 5–10 consecutive amino acids in one portion of the chain with another 5–10 farther down the chain. The interacting regions may be adjacent, with a short loop in between, or far apart, with other structures in between. Every chain may run in the same direction to form a parallel sheet, every other chain may run in the reverse chemical direction to form an antiparallel sheet, or the chains may be parallel and antiparallel to form a mixed sheet. The pattern of H bonding is different in the parallel and antiparallel configurations. Each amino acid in the interior strands of the sheet forms two H bonds with neighboring amino acids, whereas each amino acid on the outside strands forms only one bond with an interior strand. Looking across the sheet at right angles to the strands, more distant strands are rotated slightly counterclockwise to form a left-handed twist. The Cα atoms alternate above and below the sheet in a pleated structure, and the R side groups of the amino acids alternate above and below the pleats. The Φ and Ψ angles of the amino acids in β sheets vary considerably in one region of the [[Ramachandran plot]]. It is more difficult to predict the location of β sheets than of α helices. The situation improves somewhat when the amino acid variation in multiple sequence alignments is taken into account.
-===Loop===
-Loops are regions of a protein chain that are
-(1) between   α helices and  β  sheets,
-(2) of various lengths and three-dimensional configurations, and
-(3) on the surface of the structure.
-Hairpin loops that represent a complete turn in the polypeptide chain joining two antiparallel β strands may be as short as two amino acids in length. Loops interact with the surrounding aqueous environment and other proteins. Because amino acids in loops are not constrained by space and environment as are amino acids in the core region, and do not have an effect on the arrangement of secondary structures in the core, more substitutions, insertions, and deletions may occur. Thus, in a sequence alignment, the presence of these features may be an indication of a loop. The positions of [[introns]] in genomic DNA sometimes correspond to the locations of loops in the encoded protein.{{Citation needed|date=March 2012}} Loops also tend to have charged and polar amino acids and are frequently a component of active sites. A detailed examination of loop structures has shown that they fall into distinct families.
-===Coils===
-A region of secondary structure that is not an α helix, a β sheet, or a recognizable turn is commonly referred to as a coil.
-<ref name="Mount"/>
-==Protein classification==
-Proteins may be classified according to both structural and sequence similarity. For structural classification, the sizes and spatial arrangements of secondary structures described in the above paragraph are compared in known three-dimensional structures. Classification based on sequence similarity was historically the first to be used. Initially, similarity based on alignments of whole sequences was performed. Later, proteins were classified on the basis of the occurrence of conserved amino acid patterns. [[Databases]] that classify proteins by one or more of these schemes are available.
-In considering protein classification schemes, it is important to keep several observations in mind. First, two entirely different protein sequences from different evolutionary origins may fold into a similar structure. Conversely, the sequence of an ancient gene for a given structure may have diverged considerably in different species while at the same time maintaining the same basic structural features. Recognizing any remaining sequence similarity in such cases may be a very difficult task. Second, two proteins that share a significant degree of sequence similarity either with each other or with a third sequence also share an evolutionary origin and should share some structural features as well. However, gene duplication and genetic rearrangements during evolution may give rise to new gene copies, which can then evolve into proteins with new function and structure.<ref name="Mount"/>
-===Terms Used for Classifying Protein Structures and Sequences===
-The more commonly used terms for evolutionary and structural relationships among proteins are listed below. Many additional terms are used for various kinds of structural features found in proteins. Descriptions of such terms may be found at the CATH Web site, the [[Structural Classification of Proteins]] (SCOP) Web site, and a Glaxo-Wellcome tutorial on the Swiss bioinformatics Expasy Web site.
-;active site : a localized combination of amino acid side groups within the tertiary (three-dimensional) or quaternary (protein subunit) structure that can interact with a chemically specific substrate and that provides the protein with biological activity. Proteins of very different amino acid sequences may fold into a structure that produces the same active site.
-;architecture : the relative orientations of secondary structures in a three-dimensional structure without regard to whether or not they share a similar loop structure.
-;fold : a type of architecture that also has a conserved loop structure.
-;blocks : a conserved amino acid sequence pattern in a family of proteins. The pattern includes a series of possible matches at each position in the represented sequences, but there are not any inserted or deleted positions in the pattern or in the sequences. By way of contrast, sequence profiles are a type of scoring matrix that represents a similar set of patterns that includes insertions and deletions.
-;class : a term used to classify protein domains according to their secondary structural content and organization. Four classes were originally recognized by Levitt and Chothia (1976), and several others have been added in the SCOP database. Three classes are given in the CATH database: mainly-α, mainly-β, and α–β, with the α–β class including both alternating α/β and α+β structures.
-;core : the portion of a folded protein molecule that comprises the hydrophobic interior of α-helices and β-sheets. The compact structure brings together side groups of amino acids into close enough proximity so that they can interact. When comparing protein structures, as in the SCOP database, core is the region common to most of the structures that share a common fold or that are in the same superfamily. In structure prediction, core is sometimes defined as the arrangement of secondary structures that is likely to be conserved during evolutionary change.
-;domain (sequence context) : a segment of a polypeptide chain that can fold into a three-dimensional structure irrespective of the presence of other segments of the chain. The separate domains of a given protein may interact extensively or may be joined only by a length of polypeptide chain. A protein with several domains may use these domains for functional interactions with different molecules.
-;family (sequence context) : a group of proteins of similar biochemical function that are more than 50% identical when aligned. This same cutoff is still used by the [[Protein Information Resource]] (PIR). A protein family comprises proteins with the same function in different organisms (orthologous sequences) but may also include proteins in the same organism (paralogous sequences) derived from gene duplication and rearrangements. If a multiple sequence alignment of a protein family reveals a common level of similarity throughout the lengths of the proteins, PIR refers to the family as a homeomorphic family. The aligned region is referred to as a homeomorphic domain, and this region may comprise several smaller homology domains that are shared with other families. Families may be further subdivided into subfamilies or grouped into superfamilies based on respective higher or lower levels of sequence similarity. The SCOP database reports 1296 families and the CATH database (version 1.7 beta) reports 1846 families.
-:When the sequences of proteins with the same function are examined in greater detail, some are found to share high sequence similarity. They are obviously members of the same family by the above criteria. However, others are found that have very little, or even insignificant, sequence similarity with other family members. In such cases, the family relationship between two distant family members A and C can often be demonstrated by finding an additional family member B that shares significant similarity with both A and C. Thus, B provides a connecting link between A and C. Another approach is to examine distant alignments for highly conserved matches.
-:At a level of identity of 50%, proteins are likely to have the same three-dimensional structure, and the identical atoms in the sequence alignment will also superimpose within approximately 1 Å in the structural model. Thus, if the structure of one member of a family is known, a reliable prediction may be made for a second member of the family, and the higher the identity level, the more reliable the prediction. Protein structural modeling can be performed by examining how well the amino acid substitutions fit into the core of the three-dimensional structure.
-;family (structural context) : as used in the FSSP database ([[Families of structurally similar proteins]]) and the DALI/FSSP Web site, two structures that have a significant level of structural similarity but not necessarily significant sequence similarity.
-;fold : similar to structural motif, includes a larger combination of secondary structural units in the same configuration. Thus, proteins sharing the same fold have the same combination of secondary structures that are connected by similar loops. An example is the Rossmann fold comprising several alternating α helices and parallel β strands. In the SCOP, CATH, and FSSP databases, the known protein structures have been classified into hierarchical levels of structural complexity with the fold as a basic level of classification.
-;homologous domain (sequence context) : an extended sequence pattern, generally found by sequence alignment methods, that indicates a common evolutionary origin among the aligned sequences. A homology domain is generally longer than motifs. The domain may include all of a given protein sequence or only a portion of the sequence. Some domains are complex and made up of several smaller homology domains that became joined to form a larger one during evolution. A domain that covers an entire sequence is called the homeomorphic domain by PIR ([[Protein Information Resource]]).
-;module : a region of conserved amino acid patterns comprising one or more motifs and considered to be a fundamental unit of structure or function. The presence of a module has also been used to classify proteins into families.
-;motif (sequence context) : a conserved pattern of amino acids that is found in two or more proteins. In the Prosite catalog, a motif is an amino acid pattern that is found in a group of proteins that have a similar biochemical activity, and that often is near the active site of the protein. Examples of sequence motif databases are the Prosite catalog (http://www.expasy.ch/prosite) and the Stanford Motifs Database (http://dna.stanford.edu/emotif/).
-;motif (structural context) : a combination of several secondary structural elements produced by the folding of adjacent sections of the polypeptide chain into a specific three-dimensional configuration. An example is the helix-loop-helix motif. Structural motifs are also referred to as supersecondary structures and folds.
-;position-specific scoring matrix (sequence context, also known as weight or scoring matrix) : represents a conserved region in a multiple sequence alignment with no gaps. Each matrix column represents the variation found in one column of the multiple sequence alignment.
-:Position-specific scoring matrix—3D (structural context) represents the amino acid variation found in an alignment of proteins that fall into the same structural class. Matrix columns represent the amino acid variation found at one amino acid position in the aligned structures.
-;primary structure : the linear amino acid sequence of a protein, which chemically is a polypeptide chain composed of amino acids joined by peptide bonds.
-;profile (sequence context) : a scoring matrix that represents a multiple sequence alignment of a protein family. The profile is usually obtained from a well-conserved region in a multiple sequence alignment. The profile is in the form of a matrix with each column representing a position in the alignment and each row one of the amino acids. Matrix values give the likelihood of each amino acid at the corresponding position in the alignment. The profile is moved along the target sequence to locate the best scoring regions by a dynamic programming algorithm. Gaps are allowed during matching and a gap penalty is included in this case as a negative score when no amino acid is matched. A sequence profile may also be represented by a hidden Markov model, referred to as a profile HMM ([[hidden markov model]]).
-;profile (structural context) : a scoring matrix that represents which amino acids should fit well and which should fit poorly at sequential positions in a known protein structure. Profile columns represent sequential positions in the structure, and profile rows represent the 20 amino acids. As with a sequence profile, the structural profile is moved along a target sequence to find the highest possible alignment score by a dynamic programming algorithm. Gaps may be included and receive a penalty. The resulting score provides an indication as to whether or not the target protein might adopt such a structure.
-;quaternary structure : the three-dimensional configuration of a protein molecule comprising several independent polypeptide chains.
-;secondary structure : the interactions that occur between the C, O, and NH groups on amino acids in a polypeptide chain to form α-helices, β-sheets, turns, loops, and other forms, and that facilitate the folding into a three-dimensional structure.
-;superfamily : a group of protein families of the same or different lengths that are related by distant yet detectable sequence similarity. Members of a given superfamily thus have a common evolutionary origin. Originally, Dayhoff defined the cutoff for superfamily status as being the chance that the sequences are not related of 10<sup>−6</sup>, on the basis of an alignment score (Dayhoff et al. 1978). Proteins with few identities in an alignment of the sequences but with a convincingly common number of structural and functional features are placed in the same superfamily. At the level of three-dimensional structure, superfamily proteins will share common structural features such as a common fold, but there may also be differences in the number and arrangement of secondary structures. The PIR resource uses the term ''homeomorphic superfamilies'' to refer to superfamilies that are composed of sequences that can be aligned from end to end, representing a sharing of a single sequence homology domain, a region of similarity that extends throughout the alignment. This domain may also comprise smaller homology domains that are shared with other protein families and superfamilies. Although a given protein sequence may contain domains found in several superfamilies, thus indicating a complex evolutionary history, sequences will be assigned to only one homeomorphic superfamily based on the presence of similarity throughout a multiple sequence alignment. The superfamily alignment may also include regions that do not align either within or at the ends of the alignment. In contrast, sequences in the same family align well throughout the alignment.
-;supersecondary structure : a term with similar meaning to a structural motif.
-;tertiary structure : the three-dimensional or globular structure formed by the packing together or folding of secondary structures of a polypeptide chain.<ref name="Mount"/>
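The position-specific scoring matrix entry in the list above can be illustrated with a short sketch. It builds one column of scores per column of an ungapped alignment and slides the matrix along a target sequence to find the best-scoring window. The uniform background frequency, the pseudocount of 1, and all function names are assumptions made for this illustration; production tools use more careful background models and sequence weighting.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(aligned, pseudocount=1.0):
    """Build log-odds scores from an ungapped multiple sequence alignment.

    Assumption: a uniform background of 1/20 per amino acid.
    Returns one dict (amino acid -> score) per alignment column.
    """
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("all aligned sequences must have the same length")
    background = 1.0 / len(AMINO_ACIDS)
    total = len(aligned) + pseudocount * len(AMINO_ACIDS)
    pssm = []
    for col in range(length):
        counts = Counter(seq[col] for seq in aligned)
        pssm.append({
            aa: math.log2((counts[aa] + pseudocount) / total / background)
            for aa in AMINO_ACIDS
        })
    return pssm


def best_window(pssm, target):
    """Slide the matrix along the target; return (offset, score) of the best match."""
    width = len(pssm)
    best = None
    for offset in range(len(target) - width + 1):
        segment = target[offset:offset + width]
        score = sum(column[aa] for column, aa in zip(pssm, segment))
        if best is None or score > best[1]:
            best = (offset, score)
    return best
```

Because no gaps are allowed, a simple sliding window suffices here; the sequence and structural profiles described in the list permit gaps and therefore need a dynamic programming alignment instead.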
-== Secondary structure ==
-'''Secondary structure prediction''' is a set of techniques in [[bioinformatics]] that aim to predict the local [[secondary structure]]s of [[protein]]s and [[RNA]] sequences based only on knowledge of their primary structure — [[amino acid]] or [[nucleotide]] sequence, respectively. For proteins, a prediction consists of assigning regions of the amino acid sequence as likely [[alpha helix|alpha helices]], [[beta sheet|beta strand]]s (often noted as "extended" conformations), or [[turn (biochemistry)|turns]]. The success of a prediction is determined by comparing it to the results of the [[DSSP (protein)|DSSP]] algorithm applied to the [[X-ray crystallography|crystal structure]] of the protein; for nucleic acids, it may be determined from the [[hydrogen bond]]ing pattern. Specialized algorithms have been developed for the detection of specific well-defined patterns such as [[transmembrane helix|transmembrane helices]] and [[coiled coil]]s in proteins, or canonical [[microRNA]] structures in RNA.<ref name="Mount"/>
-The best modern methods of secondary structure prediction in proteins reach about 80% accuracy;<ref>{{cite journal |title=Protein Secondary Structure Prediction Using BLAST and Exhaustive RT-RICO, the Search for Optimal Segment Length and Threshold |last=Lee |first=Leong |coauthors=Leopold, J.L.; Frank, R.L. |date=May 2012 |url=http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6217208&isnumber=6217201}}</ref> this high accuracy allows the use of the predictions in [[fold recognition]] and [[ab initio]] protein structure prediction, classification of [[structural motif]]s, and refinement of [[sequence alignment]]s. The accuracy of current protein secondary structure prediction methods is assessed in weekly [[Benchmark (computing)|benchmarks]] such as [[LiveBench]] and [[EVA (benchmark)|EVA]].
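Accuracy figures such as the 80% quoted above are usually per-residue agreement between the predicted states and a reference assignment such as DSSP output reduced to three states. A minimal sketch of that comparison follows; the state letters (H for helix, E for strand, C for everything else) and the function name are assumptions chosen for illustration, and the measure is commonly called Q3.

```python
def q3_accuracy(predicted, observed):
    """Fraction of residues whose predicted state matches the reference state.

    Both arguments are equal-length strings over an assumed three-letter
    alphabet: H (helix), E (strand), C (coil or other).
    """
    if len(predicted) != len(observed):
        raise ValueError("prediction and reference must have the same length")
    if not predicted:
        raise ValueError("sequences must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(predicted)
```

For example, a prediction of `HHHEECCC` against a reference of `HHHEEECC` agrees at 7 of 8 positions.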
-===Background===
-Early methods of secondary structure prediction, introduced in the 1960s and early 1970s,<ref>{{cite journal | last = Guzzo | first = AV | year = 1965 | title = Influence of Amino-Acid Sequence on Protein Structure | journal = Biophys. J. | pmid = 5884309 | volume = 5 | issue = 6 | pmc = 1367904 | pages = 809–822 | doi = 10.1016/S0006-3495(65)86753-4 | bibcode=1965BpJ.....5..809G}}<br />{{cite journal | last = Prothero | first = JW | year = 1966 | title = Correlation between Distribution of Amino Acids and Alpha Helices | journal = Biophys. J. | pmid = 5962284 | volume = 6 | issue = 3 | pmc = 1367951 | pages = 367–370 | doi = 10.1016/S0006-3495(66)86662-6 | bibcode=1966BpJ.....6..367P}}<br />{{cite journal | last = Schiffer | first = M | coauthors = Edmundson AB | year = 1967 | title = Use of Helical Wheels to Represent Structures of Proteins and to Identify Segments with Helical Potential | journal =Biophys. J.  | volume = 7 | issue = 2 | pages = 121–35 | doi = 10.1016/S0006-3495(67)86579-2 | pmc = 1368002 | pmid=6048867 | bibcode=1967BpJ.....7..121S}}<br />{{cite journal | last = Kotelchuck | first = D | coauthors = Scheraga HA | year = 1969 | title = The Influence of Short-Range Interactions on Protein Conformation, II. 
A Model for Predicting the α-Helical Regions of Proteins | journal = Proc Natl Acad Sci USA | volume = 62 | pages = 14–21 | doi = 10.1073/pnas.62.1.14 | pmid = 5253650 | issue = 1 | pmc = 285948}}<br />{{cite journal | last = Lewis | first = PN | coauthors = Gō N, Gō M, Kotelchuck D, Scheraga HA | year = 1970 | title = Helix Probability Profiles of Denatured Proteins and Their Correlation with Native Structures | journal = Proc Natl Acad Sci USA| volume = 65 | pages = 810–5 | doi = 10.1073/pnas.65.4.810 | pmid = 5266152 | issue = 4 | pmc = 282987}}</ref> focused on identifying likely alpha helices and were based mainly on [[helix-coil transition model]]s.<ref name="Froimowitz">{{cite journal |doi=10.1021/ma60041a009 |author=Froimowitz M, Fasman GD |year=1974 |title=Prediction of the secondary structure of proteins using the helix-coil transition theory |journal=Macromolecules |volume=7 |issue=5 |pages=583–9 |pmid=4371089}}</ref> Significantly more accurate predictions that included beta sheets were introduced in the 1970s and relied on statistical assessments based on probability parameters derived from known solved structures. These methods, applied to a single sequence, are typically at most about 60-65% accurate, and often underpredict beta sheets.<ref name="Mount" /> The [[evolution]]ary [[conservation (genetics)|conservation]] of secondary structures can be exploited by simultaneously assessing many [[homology (biology)|homologous]] sequences in a [[multiple sequence alignment]], by calculating the net secondary structure propensity of an aligned column of amino acids. 
In concert with larger databases of known protein structures and modern [[machine learning]] methods such as [[artificial neural network|neural nets]] and [[support vector machine]]s, these methods can achieve up to 80% overall accuracy in [[globular protein]]s.<ref name="Dor">{{cite journal |author=Dor O, Zhou Y |year=2006 |pages=838–45 |title=Achieving 80% tenfold cross-validated accuracy for secondary structure prediction by large-scale training |issue=4 |volume=66 |journal=Proteins |pmid=17177203 |doi=10.1002/prot.21298}}</ref> The theoretical upper limit of accuracy is around 90%,<ref name="Dor" /> partly due to idiosyncrasies in DSSP assignment near the ends of secondary structures, where local conformations vary under native conditions but may be forced to assume a single conformation in crystals due to packing constraints. Limitations are also imposed by secondary structure prediction's inability to account for [[tertiary structure]]; for example, a sequence predicted as a likely helix may still be able to adopt a beta-strand conformation if it is located within a beta-sheet region of the protein and its side chains pack well with their neighbors. Dramatic conformational changes related to the protein's function or environment can also alter local secondary structure.
-===Chou-Fasman method===
-{{Main|Chou-Fasman method}}
-The [[Chou-Fasman method]] was among the first secondary structure prediction algorithms developed and relies predominantly on probability parameters determined from relative frequencies of each amino acid's appearance in each type of secondary structure.<ref name="Chou">{{cite journal |doi=10.1021/bi00699a002 |author=Chou PY, Fasman GD |year=1974 |title=Prediction of protein conformation |journal=Biochemistry |volume=13 |issue=2 |pages=222–245 |pmid=4358940}}</ref> The original Chou-Fasman parameters, determined from the small sample of structures solved in the mid-1970s, produce poor results compared to modern methods, though the parameterization has been updated since it was first published. The Chou-Fasman method is roughly 50-60% accurate in predicting secondary structures.<ref name="Mount" />
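The propensity idea behind such parameters can be illustrated with a minimal sketch. It assumes that a propensity is the fraction of an amino acid's occurrences found in a given state, divided by the overall fraction of residues in that state; the function name <code>propensities</code>, the one-letter state alphabet (H, E, C) and the toy data are illustrative assumptions, not the published Chou-Fasman parameters or procedure.

```python
from collections import Counter


def propensities(examples, state):
    """Return {amino acid: propensity} for one secondary-structure state.

    examples: iterable of (sequence, structure) string pairs of equal length,
    where structure holds one state letter per residue (assumed: H, E or C).
    """
    aa_total = Counter()     # occurrences of each amino acid overall
    aa_in_state = Counter()  # occurrences of each amino acid in `state`
    for seq, ss in examples:
        if len(seq) != len(ss):
            raise ValueError("sequence and structure lengths differ")
        for aa, s in zip(seq, ss):
            aa_total[aa] += 1
            if s == state:
                aa_in_state[aa] += 1
    n_total = sum(aa_total.values())
    n_state = sum(aa_in_state.values())
    if n_state == 0:
        raise ValueError("state never observed in the examples")
    background = n_state / n_total  # overall fraction of residues in `state`
    return {aa: (aa_in_state[aa] / aa_total[aa]) / background for aa in aa_total}


# Hypothetical toy data, far too small for meaningful statistics.
TOY = [("AAEGP", "HHHCC"), ("VAVGA", "EHECH")]
helix = propensities(TOY, "H")
```

Values above 1 indicate residues over-represented in the state; with so few observations the toy numbers mean nothing, which reflects the small-sample problem noted above.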
-===GOR method===
-{{Main|GOR method}}
-The [[GOR method]], named for the three scientists who developed it (''G''arnier, ''O''sguthorpe, and ''R''obson), is an [[information theory]]-based method developed not long after Chou-Fasman. It uses the more powerful probabilistic technique of [[Bayesian inference]].<ref name="Garnier">{{cite journal |doi=10.1016/0022-2836(78)90297-8 |author=Garnier J, Osguthorpe DJ, Robson B |year=1978 |title=Analysis of the accuracy and implications of simple methods for predicting the secondary structure of globular proteins |journal=J Mol Biol |volume=120 |issue=1 |pages=97–120 |pmid=642007}}</ref> The method is a specific, optimized application of mathematics and algorithms developed in a series of papers by Robson and colleagues.<ref>{{cite journal |doi=10.1016/0022-2836(71)90243-9 |author=Robson B, Pain RH |title=Analysis of the code relating sequence to conformation in proteins: possible implications for the mechanism of formation of helical regions |journal=J. Mol. Biol. |volume=58 |issue=1 |pages=237–59 |date=May 1971 |pmid=5088928 |url=http://linkinghub.elsevier.com/retrieve/pii/0022-2836(71)90243-9}}</ref><ref>{{cite journal |author=Robson B |title=Analysis of code relating sequences to conformation in globular proteins. Theory and application of expected information |journal=Biochem. J. |volume=141 |issue=3 |pages=853–67 |date=September 1974 |pmid=4463965 |pmc=1168191 }}</ref> The GOR method is capable of continued extension by such principles, and has gone through several versions. The GOR method takes into account not only the probability of each amino acid having a particular secondary structure, but also the [[conditional probability]] of the amino acid assuming each structure given the contributions of its neighbors (it does not assume that the neighbors have that same structure).  
The approach is both more sensitive and more accurate than that of Chou and Fasman because amino acid structural propensities are only strong for a small number of amino acids such as [[proline]] and [[glycine]].  Weak contributions from each of many neighbors can add up to a strong effect overall.  The original GOR method was roughly 65% accurate and was dramatically more successful in predicting alpha helices than beta sheets, which it frequently mispredicted as loops or disorganized regions.<ref name="Mount" /> Later GOR methods also considered pairs of amino acids, significantly improving performance{{Citation needed|date=March 2012}}. The major difference from the machine learning methods that followed is perhaps that the weights in an implied network of contributing terms are assigned ''a priori'', from statistical analysis of proteins of known structure, not by feedback to optimize agreement with a training set of such structures.
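The way weak contributions from neighboring residues add up can be sketched as follows. This is a simplified illustration in the spirit of the method, not the published GOR algorithm: it assumes that each neighbor at offset ''d'' contributes log(P(state | residue at ''d'') / P(state)), that contributions are summed over a short window, and that probabilities are smoothed with a pseudocount. The names <code>train</code>, <code>predict</code>, <code>half_window</code> and <code>pseudo</code> are hypothetical.

```python
import math
from collections import defaultdict


def train(examples, half_window=2, pseudo=1.0):
    """Estimate info[(state, offset, aa)] = log(P(state | aa at offset) / P(state))."""
    joint = defaultdict(float)        # (state, offset, aa) -> count
    marginal = defaultdict(float)     # (offset, aa) -> count
    state_count = defaultdict(float)  # state -> count
    for seq, ss in examples:
        for i, state in enumerate(ss):
            state_count[state] += 1
            for d in range(-half_window, half_window + 1):
                j = i + d
                if 0 <= j < len(seq):
                    joint[(state, d, seq[j])] += 1
                    marginal[(d, seq[j])] += 1
    states = sorted(state_count)      # only states actually observed
    total = sum(state_count.values())
    info = {}
    for (d, aa), n in marginal.items():
        for state in states:
            p_cond = (joint[(state, d, aa)] + pseudo) / (n + pseudo * len(states))
            p_state = (state_count[state] + pseudo) / (total + pseudo * len(states))
            info[(state, d, aa)] = math.log(p_cond / p_state)
    return info


def predict(seq, info, half_window=2):
    """Assign each residue the state with the largest summed information."""
    states = sorted({state for state, _, _ in info})
    result = []
    for i in range(len(seq)):
        scores = {
            state: sum(
                info.get((state, d, seq[i + d]), 0.0)  # unseen terms contribute nothing
                for d in range(-half_window, half_window + 1)
                if 0 <= i + d < len(seq)
            )
            for state in states
        }
        result.append(max(states, key=lambda s: scores[s]))
    return "".join(result)


# Hypothetical toy training set: alanine-rich helix followed by glycine-rich coil.
info = train([("AAAAAAGGGGGG", "HHHHHHCCCCCC")])
```

Because the table is estimated once from known structures and never adjusted against prediction errors, the sketch also reflects the ''a priori'' character of the weights described above.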
-===Machine learning===
-[[Artificial neural network|Neural network]] methods use training sets of solved structures to identify common sequence motifs associated with particular arrangements of secondary structures. These methods are over 70% accurate in their predictions, although beta strands are still often underpredicted due to the lack of three-dimensional structural information that would allow assessment of [[hydrogen bonding]] patterns that can promote formation of the extended conformation required for the presence of a complete beta sheet.<ref name="Mount" />
-[[Support vector machine]]s have proven particularly useful for predicting the locations of [[turn (biochemistry)|turns]], which are difficult to identify with statistical methods.<ref name="Pham">{{cite journal |doi=10.1142/S0219720005001089 |author=Pham TH, Satou K, Ho TB |year=2005 |title=Support vector machines for prediction and analysis of beta and gamma-turns in proteins |journal=J Bioinform Comput Biol |volume=3 |issue=2 |pages=343–358 |pmid=15852509}}</ref> The requirement of relatively small training sets has also been cited as an advantage to avoid overfitting to existing structural data.<ref name="Zhang">{{cite journal |author=Zhang Q, Yoon S, Welsh WJ |year=2005 |title=Improved method for predicting beta-turn using support vector machine |journal=Bioinformatics |volume=21 |issue=10 |pages=2370–4 |pmid=15797917 |doi=10.1093/bioinformatics/bti358}}</ref>
-Extensions of machine learning techniques attempt to predict more fine-grained local properties of proteins, such as [[protein backbone|backbone]] [[dihedral angle]]s in unassigned regions. Both SVMs<ref name="Zimmermann">{{cite journal |author=Zimmermann O, Hansmann UH |year=2006 |title=Support vector machines for prediction of dihedral angle regions |journal=Bioinformatics |volume=22 |issue=24 |pages=3009–15 |pmid=17005536 |doi=10.1093/bioinformatics/btl489}}</ref> and neural networks<ref name="Kuang">{{cite journal |author=Kuang R, Leslie CS, Yang AS |year=2004 |title=Protein backbone angle prediction with machine learning approaches |journal=Bioinformatics |volume=20 |issue=10 |pages=1612–21 |pmid=14988121 |doi=10.1093/bioinformatics/bth136}}</ref> have been applied to this problem.<ref name="Pham" /> More recently, SPINE-X has been used to predict real-valued torsion angles accurately, and these predictions have been employed successfully for ab initio structure prediction.<ref name="torsion">{{cite journal |author=Faraggi E, Yang Y, Zhou Y|title=Predicting continuous local structure and the effect of its substitution for secondary structure in fragment-free protein structure prediction |journal=Structure |year=2009 |volume=17 |pages=1515–1527 |pmid=19913486|pmc=2778607 |doi=10.1016/j.str.2009.09.006}}</ref>
-===Other improvements===
-Secondary structure formation is reported to depend on factors other than the protein sequence alone. For example, secondary structure tendencies also depend on local environment,<ref name="a0">{{cite journal |doi=10.1073/pnas.89.10.4462 |author=Zhong L, Johnson WC Jr |title=Environment affects amino acid preference for secondary structure |journal=Proc Natl Acad Sci USA |volume=89 |issue=10 |pages=4462–5 |year=1992 |pmid=1584778 |pmc=49102}}</ref> solvent accessibility of residues,<ref name="a1">{{cite journal |author=Macdonald JR, Johnson WC Jr |year=2001 |title=Environmental features are important in determining protein secondary structure |journal=Protein Sci. |volume=10 |issue=6 |pages=1172–7 |pmid=11369855 |pmc=2374018 |doi=10.1110/ps.420101}}</ref> protein structural class,<ref name="a2">{{cite journal |author=Costantini S, Colonna G, Facchiano AM |year=2006 |title=Amino acid propensities for secondary structures are influenced by the protein structural class |journal=Biochem Biophys Res Commun. 
|volume=342 |issue=2 |pages=441–451 |pmid=16487481 |doi=10.1016/j.bbrc.2006.01.159}}</ref> and even the organism from which the proteins are obtained.<ref name="a3">{{cite journal |author=Marashi SA, ''et al.'' |year=2007 |title=Adaptation of proteins to different environments: a comparison of proteome structural properties in ''Bacillus subtilis'' and ''Escherichia coli'' |journal=J Theor Biol |volume=244 |issue=1 |pages=127–132 |pmid=16945389 |doi=10.1016/j.jtbi.2006.07.021}}</ref> Based on such observations, some studies have shown that secondary structure prediction can be improved by addition of information about protein structural class,<ref name="m">{{cite journal |author=Costantini S, Colonna G, Facchiano AM |year=2007 |title=PreSSAPro: a software for the prediction of secondary structure by amino acid properties |journal=Comput Biol Chem |volume=31 |issue=5-6 |pages=389–392 |pmid=17888742 |doi=10.1016/j.compbiolchem.2007.08.010}}</ref> residue accessible surface area<ref name="P">{{cite journal |author=Momen-Roknabadi A, ''et al.'' |year=2008 |title=Impact of residue accessible surface area on the prediction of protein secondary structures |journal=BMC Bioinformatics |volume=9 |page=357 |pmid=18759992 |pmc=2553345 |doi=10.1186/1471-2105-9-357}}</ref><ref name="Ph">{{cite journal |author=Adamczak R, Porollo A, Meller J |year=2005 |title=Combining prediction of secondary structure and solvent accessibility in proteins |journal=Proteins |volume=59 |issue=3 |pages=467–475 |pmid=15768403 |doi=10.1002/prot.20441}}</ref> and also [[contact number]] information.<ref name="az">{{cite journal |author=Lakizadeh A, Marashi SA |year=2009 |url=http://www.excli.de/vol8/lakizadeh_03_2009/lakizadeh_250309a_proof.pdf |title=Addition of contact number information can improve protein secondary structure prediction by neural networks |journal=Excli J. |volume=8 |pages=66–73}}</ref>
-Sequence covariation methods rely on the existence of a data set composed of multiple [[homology (biology)|homologous]] RNA sequences with related but dissimilar sequences. These methods analyze the covariation of individual base sites in [[evolution]]; maintenance at two widely separated sites of a pair of base-pairing nucleotides indicates the presence of a structurally required hydrogen bond between those positions. The general problem of pseudoknot prediction has been shown to be [[NP-complete]].<ref name="Lyngso">{{cite journal |doi=10.1089/106652700750050862 |author=Lyngsø RB, Pedersen CN |year=2000 |title=RNA pseudoknot prediction in energy-based models |journal=J Comput Biol |volume=7 |issue=3-4 |pages=409–427 |pmid=11108471}}</ref>
-== Tertiary structure ==
-The practical role of protein structure prediction is now more important than ever. Massive amounts of protein sequence data are produced by modern large-scale [[DNA]] sequencing efforts such as the [[Human Genome Project]]. Despite community-wide efforts in [[structural genomics]], the output of experimentally determined protein structures—typically by time-consuming and relatively expensive [[X-ray crystallography]] or [[Protein NMR|NMR spectroscopy]]—is lagging far behind the output of protein sequences.
-Protein structure prediction remains an extremely difficult and unresolved undertaking. The two main problems are calculation of [[Gibbs free energy|protein free energy]] and [[energy minimization|finding the global minimum]] of this energy. A protein structure prediction method must explore the space of possible protein structures, which is [[Levinthal's paradox|astronomically large]]. These problems can be partially bypassed in "comparative" or [[homology modeling]] and [[fold recognition]] methods, in which the search space is pruned by the assumption that the protein in question adopts a structure that is close to the experimentally determined structure of another homologous protein. On the other hand, the ''de novo'' or [[De novo protein structure prediction|ab initio protein structure prediction]] methods must explicitly resolve these problems. The progress and challenges in protein structure prediction have been reviewed in Zhang 2008.<ref name="zhang2008"/>
-===''Ab initio'' protein modelling===
-{{main|De novo protein structure prediction}}
-====Energy- and fragment-based methods====
-''Ab initio''- or ''de novo''- protein modelling methods seek to build three-dimensional protein models "from scratch", i.e., based on physical principles rather than (directly) on previously solved structures. There are many possible procedures that either attempt to mimic [[protein folding]] or apply some [[stochastic]] method to search possible solutions (i.e., [[global optimization]] of a suitable energy function). These procedures tend to require vast computational resources, and have thus only been carried out for tiny proteins. To predict protein structure ''de novo'' for larger proteins will require better algorithms and larger computational resources like those afforded by either powerful supercomputers (such as [[Blue Gene]] or [[MDGRAPE-3]]) or distributed computing (such as [[Folding@home]], the [[Human Proteome Folding Project]] and [[Rosetta@Home]]). Although these computational barriers are vast, the potential benefits of structural genomics (by predicted or experimental methods) make ''ab initio'' structure prediction an active research field.<ref name="zhang2008">{{cite journal |author=Zhang Y |title=Progress and challenges in protein structure prediction |journal=Curr Opin Struct Biol |volume=18 |issue=3 |pages=342–8 |year=2008 |doi=10.1016/j.sbi.2008.02.004 |pmid=18436442 |pmc=2680823}}</ref>
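The stochastic search of an energy function mentioned above can be illustrated with a minimal simulated annealing sketch. Everything in it is an assumption made for illustration: the conformation is reduced to a list of torsion angles, <code>toy_energy</code> is a hypothetical function with no physical meaning, and the cooling schedule and step size are arbitrary. Real methods use far richer energy functions and move sets.

```python
import math
import random


def toy_energy(angles):
    """Hypothetical energy: each torsion prefers -60 degrees, neighbors prefer to match."""
    local = sum(1.0 - math.cos(math.radians(a + 60.0)) for a in angles)
    coupling = sum(
        1.0 - math.cos(math.radians(a - b)) for a, b in zip(angles, angles[1:])
    )
    return local + 0.5 * coupling


def anneal(energy, n_angles, steps=5000, t_start=2.0, t_end=0.01, seed=0):
    """Metropolis search with geometric cooling; returns (best_angles, best_energy)."""
    rng = random.Random(seed)
    current = [rng.uniform(-180.0, 180.0) for _ in range(n_angles)]
    e_current = energy(current)
    best, e_best = list(current), e_current
    for step in range(steps):
        # Temperature decays geometrically from t_start to t_end.
        t = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        trial = list(current)
        i = rng.randrange(n_angles)
        # Perturb one angle and wrap it back into [-180, 180).
        trial[i] = (trial[i] + rng.gauss(0.0, 30.0) + 180.0) % 360.0 - 180.0
        e_trial = energy(trial)
        # Always accept downhill moves; accept uphill moves with Boltzmann probability.
        if e_trial <= e_current or rng.random() < math.exp((e_current - e_trial) / t):
            current, e_current = trial, e_trial
            if e_current < e_best:
                best, e_best = list(current), e_current
    return best, e_best


best_angles, best_energy = anneal(toy_energy, n_angles=5)
```

The number of energy evaluations grows with both the number of degrees of freedom and the ruggedness of the landscape, which is one reason such searches become expensive for larger proteins.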
-As of 2009, a 50-residue protein could be simulated atom-by-atom on a supercomputer for 1 millisecond.<ref>http://dl.acm.org/citation.cfm?id=1654126</ref> As of 2012, comparable stable-state sampling could be done on a standard desktop with a new graphics card and more sophisticated algorithms.<ref>http://pubs.acs.org/doi/abs/10.1021/ct300284c</ref>
-====Evolutionary covariation to predict 3D contacts====
-As sequencing became more commonplace in the 1990s, several groups used protein sequence alignments to predict correlated [[mutation]]s, in the hope that these coevolved residues could be used to predict tertiary structure (by analogy to distance constraints from experimental procedures such as [[NMR]]). The assumption is that when single-residue mutations are slightly deleterious, compensatory mutations may occur to restabilize residue-residue interactions.
-This early work used what are known as ''local'' methods to calculate correlated mutations from protein sequences, but suffered from indirect false correlations which result from treating each pair of residues as independent of all other pairs.<ref>Gobel, U. et al. (1994): ''Correlated mutations and residue contacts in proteins.'' In: ''Proteins'', 18, 309–317.</ref><ref>Taylor, W. R. & Hatrick, K.(1994): "Compensating changes in protein multiple sequence alignments" In: "Protein Eng" 7, 341-348.</ref><ref>Neher, E (1994): "How frequent are correlated changes in families of protein sequences?" In: "Proc Natl Acad Sci U S A", 91, 98-102.</ref>
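One common example of a ''local'' covariation score is the mutual information between two alignment columns; whether the studies cited above used exactly this measure is not stated here, so the sketch below is illustrative only. The function name and the toy alignment are hypothetical, and gaps, sequence weighting and finite-sample corrections are ignored.

```python
import math
from collections import Counter


def mutual_information(alignment, i, j):
    """Mutual information (in bits) between columns i and j.

    alignment: list of equal-length aligned sequences (strings).
    """
    n = len(alignment)
    col_i = Counter(seq[i] for seq in alignment)
    col_j = Counter(seq[j] for seq in alignment)
    pairs = Counter((seq[i], seq[j]) for seq in alignment)
    mi = 0.0
    for (a, b), n_ab in pairs.items():
        p_ab = n_ab / n
        mi += p_ab * math.log2(p_ab / ((col_i[a] / n) * (col_j[b] / n)))
    return mi


# Hypothetical toy alignment: columns 0 and 1 change together, column 2 is conserved.
TOY_ALIGNMENT = ["AKL", "AKL", "DEL", "DEL"]
covarying = mutual_information(TOY_ALIGNMENT, 0, 1)
conserved = mutual_information(TOY_ALIGNMENT, 0, 2)
```

Because each pair of columns is scored in isolation, a chain of direct couplings (A with B, B with C) also raises the score for the indirect pair (A with C), which is the source of the false correlations described above.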
-In 2011, a different, and this time ''global'', statistical approach demonstrated that predicted coevolved residues were sufficient to predict the 3D fold of a protein, provided that enough sequences are available.<ref name="marks">Marks, D. S. et al. (2011): "Protein 3D structure computed from evolutionary sequence variation". In: "PLoS One" 6, e28766</ref> The method, [http://evfold.org EVfold], uses no homology modeling, threading or 3D structure fragments and can be run on a standard personal computer even for proteins with hundreds of residues. The accuracy of the contacts predicted using this and related approaches has now been demonstrated on many known structures and contact maps,<ref>Lapedes, A. et al. (2012, submitted in 2002): "Using Sequence Alignments to Predict Protein Structure and Stability With High Accuracy." In: "arXiv", 29.</ref><ref>Burger, L. & van Nimwegen, E (2010): "Disentangling direct from indirect co-evolution of residues in protein alignments". In: "PLoS Comput Biol" 6, e1000633.</ref><ref>Morcos F, et al. (2011): ''Direct-coupling analysis of residue coevolution captures native contacts across many protein families.'' In: ''Proc Natl Acad Sci USA'' 108:E1293–E1301.</ref><ref>Jones, D. T. et al.: "PSICOV: precise structural contact prediction using sparse inverse covariance estimation on large multiple sequence alignments". In "Bioinformatics" 28, 184-190.</ref><ref>Nugent T., Jones D.T. (2012): "Accurate de novo structure prediction of large transmembrane protein domains using fragment-assembly and correlated mutation analysis." In: "Proc Natl Acad Sci U S A", 109(24), E1540-7.</ref> including the prediction of experimentally unsolved transmembrane proteins.<ref>Hopf, T. A. et al. (2012): "Three-dimensional structures of membrane proteins from genomic sequencing" In: "Cell" 149, 1607-1621.</ref>
-===Comparative protein modeling===
-Comparative protein modelling uses previously solved structures as starting points, or templates. This is effective because it appears that although the number of actual proteins is vast, there is a limited set of [[tertiary structure|tertiary]] [[structural motif]]s to which most proteins belong. It has been suggested that there are only around 2,000 distinct protein folds in nature, though there are many millions of different proteins.
-These methods may also be split into two groups:<ref name="zhang2008"/>
-;[[Homology modeling]] : is based on the reasonable assumption that two [[Homology (biology)#Homology of sequences in genetics|homologous]] proteins will share very similar structures. Because a protein's fold is more evolutionarily conserved than its amino acid sequence, a target sequence can be modeled with reasonable accuracy on a very distantly related template, provided that the relationship between target and template can be discerned through [[sequence alignment]]. It has been suggested that the primary bottleneck in comparative modelling arises from difficulties in alignment rather than from errors in structure prediction given a known-good alignment.<ref name="zhang2005">{{cite journal |author=Zhang Y and Skolnick J |title=The protein structure prediction problem could be solved using the current PDB library |journal=Proc Natl Acad Sci USA |volume=102 |issue=4 |pages=1029–34 |year=2005 |doi=10.1073/pnas.0407152101 |pmid=15653774 |pmc=545829}}</ref> Unsurprisingly, homology modelling is most accurate when the target and template have similar sequences.
-;[[Protein threading]]:<ref name="bowie1991">{{cite journal |author=Bowie JU, Luthy R, Eisenberg D |title=A method to identify protein sequences that fold into a known three-dimensional structure |journal=Science |volume=253 |issue=5016 |pages=164–170 |year=1991 |doi=10.1126/science.1853201 |pmid=1853201}}</ref> scans the amino acid sequence of an unknown structure against a database of solved structures.  In each case, a scoring function is used to assess the compatibility of the sequence to the structure, thus yielding possible three-dimensional models. This type of method is also known as '''3D-1D fold recognition''' due to its compatibility analysis between three-dimensional structures and linear protein sequences. This method has also given rise to methods performing an '''inverse folding search''' by evaluating the compatibility of a given structure with a large database of sequences, thus predicting which sequences have the potential to produce a given fold.
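The sequence-to-structure compatibility scoring used in threading can be sketched in a deliberately reduced form. The assumptions are substantial and all hypothetical: each template is described only by a string of residue environments (B for buried, E for exposed), the scoring values are invented for illustration, and the sequence is placed without gaps, whereas practical threading methods use richer environment descriptions and alignments with gaps.

```python
# Hypothetical residue classification used only for this sketch.
HYDROPHOBIC = set("AVLIMFWC")


def residue_score(aa, env):
    """Invented compatibility score of one residue with one environment (B or E)."""
    is_hydrophobic = aa in HYDROPHOBIC
    if env == "B":  # buried positions favor hydrophobic residues
        return 1.0 if is_hydrophobic else -1.0
    return -0.5 if is_hydrophobic else 0.5  # exposed positions favor polar residues


def thread(sequence, templates):
    """Try every gapless placement on every template.

    templates: {name: environment string}. Returns (name, offset, score) of the
    best placement, or None if the sequence is longer than every template.
    """
    best = None
    for name, envs in templates.items():
        for offset in range(len(envs) - len(sequence) + 1):
            window = envs[offset:offset + len(sequence)]
            score = sum(residue_score(a, e) for a, e in zip(sequence, window))
            if best is None or score > best[2]:
                best = (name, offset, score)
    return best


# Hypothetical template library with two toy "folds".
TEMPLATES = {"fold1": "EEBBEE", "fold2": "BEBEBE"}
hit = thread("KLVK", TEMPLATES)
```

Swapping the roles of the inputs, by scoring many sequences against one fixed environment string, gives the inverse folding search described above.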
-===Side-chain geometry prediction===
-Accurate packing of the amino acid [[side chain]]s represents a separate problem in protein structure prediction. Methods that specifically address the problem of predicting side-chain geometry include [[dead-end elimination]] and the [[self-consistent mean field (biology)|self-consistent mean field]] methods. Low-energy side-chain conformations are usually determined on a rigid polypeptide backbone, using a set of discrete side-chain conformations known as "[[rotamer]]s." The methods attempt to identify the set of rotamers that minimizes the model's overall energy.
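The optimization problem itself can be stated as a short sketch. The code below is a brute-force enumeration, not dead-end elimination or a mean-field method; it assumes the energy decomposes into a per-rotamer term on the fixed backbone plus pairwise terms, and the names <code>pack_side_chains</code>, <code>self_energy</code> and <code>pair_energy</code> as well as the toy energies are hypothetical.

```python
from itertools import product


def pack_side_chains(self_energy, pair_energy):
    """Exhaustively find the rotamer assignment with the lowest total energy.

    self_energy[i][r]: energy of rotamer r at position i on the fixed backbone.
    pair_energy(i, r, j, s): interaction energy of rotamer r at i with rotamer s at j.
    Returns (tuple of chosen rotamer indices, total energy).
    """
    n = len(self_energy)
    best_choice, best_energy = None, float("inf")
    for choice in product(*(range(len(rotamers)) for rotamers in self_energy)):
        total = sum(self_energy[i][choice[i]] for i in range(n))
        total += sum(
            pair_energy(i, choice[i], j, choice[j])
            for i in range(n)
            for j in range(i + 1, n)
        )
        if total < best_energy:
            best_choice, best_energy = choice, total
    return best_choice, best_energy


def toy_clash(i, r, j, s):
    """Hypothetical pair term: adjacent positions clash when they pick the same rotamer index."""
    return 5.0 if j - i == 1 and r == s else 0.0


# Three positions with two candidate rotamers each (invented energies).
SELF = [[0.0, 1.0], [0.5, 0.0], [0.2, 0.3]]
choice, energy = pack_side_chains(SELF, toy_clash)
```

The number of combinations is the product of the rotamer counts at every position, so enumeration is only feasible for toy cases; pruning or approximation methods such as those named above exist to avoid it.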
-These methods use rotamer libraries, which are collections of favorable conformations for each residue type in proteins. Rotamer libraries may contain information about the conformation, its frequency, and the standard deviations about mean dihedral angles, which can be used in sampling.<ref name="Rotamers21stCentury">{{cite journal |author=Dunbrack, RL |journal=Curr. Opin. Struct. Biol. |year=2002 |volume=12 |issue=4 |pages=431–440 |title=Rotamer Libraries in the 21st Century |pmid=12163064 |doi=10.1016/S0959-440X(02)00344-5}}</ref> Rotamer libraries are derived from [[structural bioinformatics]] or other statistical analysis of side-chain conformations in known experimental structures of proteins, such as by clustering the observed conformations for tetrahedral carbons near the staggered (60°, 180°, -60°) values.
-Rotamer libraries can be backbone-independent, secondary-structure-dependent, or backbone-dependent. Backbone-independent rotamer libraries make no reference to backbone conformation, and are calculated from all available side chains of a certain type (for instance, the first example of a rotamer library, done by Ponder and [[Frederic M. Richards|Richards]] at Yale in 1987).<ref>{{cite journal |doi=10.1016/0022-2836(87)90358-5 |author=Ponder JW, Richards FM |title=Tertiary templates for proteins: use of packing criteria in the enumeration of allowed sequences for different structural classes |journal= J. Mol. Biol. |year=1987 |volume=193 |issue=4 |pages=775–791 |pmid=2441069 }}</ref> Secondary-structure-dependent libraries present different dihedral angles and/or rotamer frequencies for <math>\alpha</math>-helix, <math>\beta</math>-sheet, or coil secondary structures.<ref>{{cite journal |author=Lovell SC, Word JM, [[Jane S. Richardson|Richardson JS]], Richardson DC |title=The penultimate rotamer library |journal=Proteins: Struc. Func. Genet. |year=2000 |volume=40 |pages=389–408 |doi=10.1002/1097-0134(20000815)40:3<389::AID-PROT50>3.0.CO;2-2}}</ref><ref>[http://kinemage.biochem.duke.edu/databases/rotamer.php Richardson Rotamer Libraries]</ref> Backbone-dependent rotamer libraries present conformations and/or frequencies dependent on the local backbone conformation as defined by the backbone dihedral angles <math>\phi</math> and <math>\psi</math>, regardless of secondary structure.<ref name="bbdep2010">{{cite journal |author=Shapovalov MV, Dunbrack, RL |title=A smoothed backbone-dependent rotamer library for proteins derived from adaptive kernel density estimates and regressions |journal=Structure (Cell Press)  |volume=19 |issue=6 |pages=844–858 |year=2011 |doi=10.1016/j.str.2011.03.019 |pmid=    21645855| pmc=3118414}}</ref>
-The modern versions of these libraries as used in most software are presented as multidimensional distributions of probability or frequency, where the peaks correspond to the dihedral-angle conformations considered as individual rotamers in the lists. Some versions are based on very carefully curated data and are used primarily for structure validation,<ref>[http://molprobity.biochem.duke.edu/ MolProbity]</ref> while others emphasize relative frequencies in much larger data sets and are the form used primarily for structure prediction, such as the Dunbrack rotamer libraries.<ref>[http://dunbrack.fccc.edu/bbdep2010 Dunbrack Rotamer Libraries]</ref>
-Side-chain packing methods are most useful for analyzing the protein's [[hydrophobic]] core, where side chains are more closely packed; they have more difficulty addressing the looser constraints and higher flexibility of surface residues, which often occupy multiple rotamer conformations rather than just one.<ref name="voigt2000">{{cite journal |author=Voigt CA, Gordon DB, Mayo SL |title=Trading accuracy for speed: A quantitative comparison of search algorithms in protein sequence design |journal=J Mol Biol |volume=299 |issue=3 |pages=789–803 |year=2000 |doi=10.1006/jmbi.2000.3758 |pmid=10835284}}</ref><ref name="scwrl4">{{cite journal |author=Krivov GG, Shapovalov MV, Dunbrack, RL |title=Improved prediction of protein side-chain conformations with SCWRL4 |journal=Proteins  |volume=77 |issue=3 |pages=778–795 |year=2009 |doi=10.1002/prot.22488 |pmid=19603484| pmc=2885146}}</ref>
-=== Prediction of structural classes ===
-Statistical methods have been developed for predicting structural classes of proteins based on their amino acid composition,<ref name="pmid7587280">{{cite journal | author = Chou KC, Zhang CT | title = Prediction of protein structural classes | journal = Crit. Rev. Biochem. Mol. Biol. | volume = 30 | issue = 4 | pages = 275–349 | year = 1995 | pmid = 7587280 | doi = 10.3109/10409239509083488 }}</ref> [[pseudo amino acid composition]]<ref name="pmid16920060">{{cite journal | author = Chen C, Zhou X, Tian Y, Zou X, Cai P | title = Predicting protein structural class with pseudo-amino acid composition and support vector machine fusion network | journal = Anal. Biochem. | volume = 357 | issue = 1 | pages = 116–21 |date=October 2006 | pmid = 16920060 | doi = 10.1016/j.ab.2006.07.022 }}</ref><ref name="pmid16908032">{{cite journal | author = Chen C, Tian YX, Zou XY, Cai PX, Mo JY | title = Using pseudo-amino acid composition and support vector machine to predict protein structural class | journal = J. Theor. Biol. | volume = 243 | issue = 3 | pages = 444–8 |date=December 2006 | pmid = 16908032 | doi = 10.1016/j.jtbi.2006.06.025 }}</ref><ref name="pmid17330882">{{cite journal | author = Lin H, Li QZ | title = Using pseudo amino acid composition to predict protein structural class: approached by incorporating 400 dipeptide components | journal = J Comput Chem | volume = 28 | issue = 9 | pages = 1463–6 |date=July 2007 | pmid = 17330882 | doi = 10.1002/jcc.20554 }}</ref><ref name="pmid18634802">{{cite journal | author = Xiao X, Wang P, Chou KC | title = Predicting protein structural classes with pseudo amino acid composition: an approach using geometric moments of cellular automaton image | journal = J. Theor. Biol. 
| volume = 254 | issue = 3 | pages = 691–6 |date=October 2008 | pmid = 18634802 | doi = 10.1016/j.jtbi.2008.06.016 }}</ref> and functional domain composition.<ref name="pmid15358128">{{cite journal | author = Chou KC, Cai YD | title = Predicting protein structural class by functional domain composition | journal = Biochem. Biophys. Res. Commun. | volume = 321 | issue = 4 | pages = 1007–9 |date=September 2004 | pmid = 15358128 | doi = 10.1016/j.bbrc.2004.07.059 }}</ref>
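The simplest of these descriptors, amino acid composition, can be computed as in the sketch below. The function name and the ordering of the 20 standard residues are arbitrary choices for illustration, and non-standard characters are simply ignored; the resulting vector would be the input to a classifier, which is not shown.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # 20 standard residues, arbitrary fixed order


def composition(sequence):
    """Return the 20-component amino acid composition (fractions summing to 1)."""
    seq = sequence.upper()
    counts = [seq.count(aa) for aa in AMINO_ACIDS]
    total = sum(counts)  # characters outside the 20 standard residues are ignored
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [c / total for c in counts]


vector = composition("AAGV")  # hypothetical four-residue example
```

Composition discards residue order entirely, which is the limitation that order-aware descriptors such as pseudo amino acid composition are intended to address.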
-==Quaternary structure==
-{{main|Protein–protein interaction prediction}}
-In the case of [[protein complex|complexes of two or more proteins]], where the structures of the proteins are known or can be predicted with high accuracy, [[Macromolecular docking|protein–protein docking]] methods can be used to predict the structure of the complex. Information on the effect of mutations at specific sites on the affinity of the complex helps to understand the complex structure and to guide docking methods.
-== Software ==
-{{main|Protein structure prediction software}}
-[http://zhanglab.ccmb.med.umich.edu/I-TASSER I-TASSER] was the top-ranked server for protein structure prediction according to the 2006-2012 [http://predictioncenter.org CASP experiments] ([http://www.predictioncenter.org/casp7/Casp7.html CASP7], [http://www.predictioncenter.org/casp8/index.cgi CASP8], [http://www.predictioncenter.org/casp9 CASP9] and [http://predictioncenter.org/casp10/groups_analysis.cgi?type=server&tbm=on&tbm_hard=on&tbmfm=on&fm=on&submit=Filter CASP10]). The standalone I-TASSER package is freely available for [http://zhanglab.ccmb.med.umich.edu/I-TASSER/download/ download].
-[[HHpred / HHsearch|HHpred]] was the leading server for template-based protein structure prediction in the 2010 [http://predictioncenter.org/casp9/groups_analysis.cgi?type=server&tbm=on&submit=Filter CASP9 experiment]. It has a median response time of a few minutes instead of days like other top-performing servers. HHpred is often used for remote homology detection and homology-based function prediction. It runs with the free, open-source software package [[HH-suite]] for fast sequence searching, protein threading and remote homology detection.
-[[RaptorX / software for protein modeling and analysis|RaptorX]] excels at aligning hard targets according to the 2010 [http://www.predictioncenter.org/casp9 CASP9] experiments.
-RaptorX generated significantly better alignments for the 50 hardest CASP9 template-based modeling targets than other servers, including those using consensus and refinement methods.
-The RaptorX server is available at [http://raptorx.uchicago.edu raptorx.uchicago.edu].
-[[MODELLER]] is a popular software tool for producing homology models by satisfaction of spatial restraints using methodology derived from [[protein NMR|NMR spectroscopy]] data processing. The [http://salilab.org/modweb ModWeb] comparative protein structure modeling web-server uses primarily MODELLER for automatic comparative modeling.
-[http://swissmodel.expasy.org/ SWISS-MODEL] provides an automated web server for protein structure homology modeling.
-[http://meta.bioinfo.pl/submit_wizard.pl bioinfo.pl] and [http://robetta.bakerlab.org/ Robetta] are widely used servers for protein structure prediction.
-[http://sparks.informatics.iupui.edu/yueyang/sparks-x/ SPARKSx] is one of the top-performing servers in CASP for remote fold recognition.<ref name=SPARKSx>{{cite journal|last=Yang|first=Yuedong|coauthors=Eshel Faraggi, Huiying Zhao, Yaoqi Zhou|title=Improving protein fold recognition and template-based modeling by employing probabilistic-based matching between predicted one-dimensional structural properties of the query and corresponding native properties of templates|journal=Bioinformatics|year=2011|volume=27|pages=2076–82|url=http://bioinformatics.oxfordjournals.org/content/27/15/2076.long|issue=15|doi=10.1093/bioinformatics/btr350}}</ref>
-[http://bioserv.rpbs.univ-paris-diderot.fr/PEP-FOLD/ PEP-FOLD] is a ''de novo'' approach aimed at predicting peptide structures from amino acid sequences, based on a HMM structural alphabet.<ref name="pepfold">{{cite journal |author=Maupetit J, Derreumaux P, Tuffery P |title=A fast and accurate method for large-scale de novo peptide structure prediction. |journal=J Comput Chem.|pages=In press. |year=2009}}</ref><ref name="pepfold2">{{cite journal |author=Maupetit J, Derreumaux P, Tuffery P |title=PEP-FOLD: an online resource for de novo peptide structure prediction. |journal=Nucleic Acids Res.|year=2009 |doi=10.1093/nar/gkp323 |pmid=19433514 |pmc=2703897 |volume=37 |issue=Web Server issue |pages=W498–503}}</ref>
-[[Phyre / Phyre2|Phyre and Phyre2]] are amongst the top performing servers in the CASP international blind trials of structure prediction in homology modelling and remote fold recognition, and are designed with an emphasis on ease of use for non-experts.
-[[RAPTOR (software)]] is protein threading software based on integer programming. The basic algorithm for threading is described in Bowie (1991)<ref name="bowie1991"/> and is fairly straightforward to implement.
-[http://zhanglab.ccmb.med.umich.edu/QUARK QUARK] is an algorithm developed for ''ab initio'' protein structure modeling.
-[http://www.biomolecular-modeling.com/Abalone/index.html Abalone] is a [[Molecular Dynamics]] program for folding simulations with explicit or implicit [[water model]]s.
-[http://www.eidogen-sertanty.com/products_tip_content.html TIP] is a knowledgebase of STRUCTFAST<ref name="debe2006">{{cite journal |author=Debe DA, Danzer JF, Goddard WA, Poleksic A |title=STRUCTFAST: Protein sequence remote homology detection and alignment using novel dynamic programming and profile-profile scoring |journal=Proteins |volume=64 |pages=960–7 |year=2006 |doi=10.1002/prot.21049 |pmid=16786595 |issue=4}}</ref> models and precomputed similarity relationships between sequences, structures, and binding sites. Several [[distributed computing]] projects concerning protein structure prediction have also been implemented, such as [[Folding@home]], [[Rosetta@home]], the [[Human Proteome Folding Project]], [[Predictor@home]], and [[TANPAKU]].
-[http://atlas.princeton.edu/refinement Princeton_TIGRESS (server)] is a protein structure refinement server<ref name="khoury2013">{{cite journal |author=Khoury GA, Tamamis P, Pinnaduwage N, Smadbeck J, Kieslich CA, Floudas CA | title=Princeton_TIGRESS: ProTeIn Geometry REfinement using Simulations and Support vector machines |journal=Proteins |doi=10.1002/prot.24459 }}</ref> whose underlying method ranked fifth in blind predictions during CASP10 (http://predictioncenter.org/casp10/doc/presentations/ranking_CASP10_refinement_DJ.pdf). It uses Monte Carlo and molecular dynamics based sampling techniques to generate candidate structures and support vector machines to select among them. It can consistently improve the accuracy of models produced by many top 3-D structure prediction servers, increasing the potential usability of a predicted structure in a biological application.
-CABS-FOLD<ref>http://nar.oxfordjournals.org/content/41/W1/W406.long</ref> is a server that provides tools for protein structure prediction from sequence only (''de novo'' modeling) and also using alternative templates (consensus modeling).
-Bhageerath<ref>http://nar.oxfordjournals.org/content/34/21/6195.long</ref> is another ''ab initio'' modelling server.
-The [[Foldit]] program seeks to investigate the pattern-recognition and puzzle-solving abilities inherent to the human mind in order to create more successful computer protein structure prediction software.
-Computational approaches provide a fast alternative route to antibody structure prediction. Recently{{when|date=February 2011}} developed high-resolution structure prediction algorithms for the antibody F<sub>V</sub> region, such as [http://antibody.graylab.jhu.edu RosettaAntibody], have been shown to generate high-resolution homology models that have been used for successful docking.<ref>{{cite journal |author=Sivasubramanian A, Sircar A, Chaudhury S, Gray J J |title=Toward high-resolution homology modeling of antibody Fv regions and application to antibody–antigen docking|journal=Proteins |volume=74 |pages=497–514 |year=2009 |doi=10.1002/prot.22309 |pmc=2909601 |pmid=19062174 |issue=2}}</ref>
-Software for structure prediction has been reviewed by Nayeem ''et al.''<ref name="nayeem2006">{{cite journal |author=Nayeem A, Sitkoff D, Krystek S Jr |title=A comparative study of available software for high-accuracy homology modeling: From sequence alignments to structural models |journal=Protein Sci |volume=15 |pages=808–824 |year=2006 |doi=10.1110/ps.051892906 |pmid=16600967 |issue=4 |pmc=2242473}}</ref>
-=== Evaluation of automatic structure prediction servers ===
-{{main|CASP}}
-[[CASP]], which stands for Critical Assessment of Techniques for Protein Structure Prediction, is a community-wide experiment for protein structure prediction that has taken place every two years since 1994. CASP provides users and research groups with an opportunity to assess the quality of available methods and automatic servers for protein structure prediction. The first official assessment of automatic structure prediction servers, in the CASP7 benchmark (2006), is discussed by Battey ''et al.''<ref>{{cite journal
-  | author = Battey JN, Kopp J, Bordoli L, Read RJ, Clarke ND, Schwede T
-  | title = Automated server predictions in CASP7
-  | journal = Proteins
-  | year = 2007
-  | volume = 69
-  | issue = Suppl 8
-  | pages = 68–82
-  | pmid = 17894354
-  | doi = 10.1002/prot.21761
-}}
-</ref> The newest official results of the automated assessment, from CASP10 (2012), are available [http://predictioncenter.org/casp10/groups_analysis.cgi?type=server&tbm=on&tbm_hard=on&tbmfm=on&fm=on&submit=Filter for automated servers] and [http://predictioncenter.org/casp10/groups_analysis.cgi for human and server predictors]. Unofficial assessment results for automatic servers in the CASP10 benchmark are summarized by the Zhang Lab at [http://zhanglab.ccmb.med.umich.edu/casp10 http://zhanglab.ccmb.med.umich.edu/casp10/].
-The [[CAMEO]] (Continuous Automated Model EvaluatiOn) server evaluates automated protein structure prediction servers on a weekly basis using blind predictions for newly released protein structures. CAMEO publishes the results on its website ([http://cameo3d.org]).
-==See also==
-* [[Protein design]]
-* [[Protein function prediction]]
-* [[Protein structure prediction software]]
-* [[De novo protein structure prediction]]
-* [[Molecular design software]]
-* [[List of software for molecular mechanics modeling|Molecular modeling software]]
-* [[Modelling biological systems]]
-* [[Protein fragment library|Fragment libraries]]
-* [[Lattice protein]]s
-* [[Statistical potential]]
-* [[Protein circular dichroism data bank]]
-==References==
-{{reflist|2}}
-{{cite journal |author=Samudrala R, Moult J |title=An all-atom distance-dependent conditional probability discriminatory function for protein structure prediction |journal=J. Mol. Biol. |volume=275 |issue=5 |pages=895–916 |date=February 1998 |pmid=9480776 |doi=10.1006/jmbi.1997.1479 |url=http://linkinghub.elsevier.com/retrieve/pii/S0022-2836(97)91479-0}}
-== External links ==
-*[http://www.cbs.dtu.dk/services/NetSurfP/ NetSurfP — Secondary Structure and Surface Accessibility predictor]
-*[http://predictioncenter.org/ CASP experiments home page]
-*[http://www.russell.embl-heidelberg.de/gtsp/flowchart2.html Structure Prediction Flowchart] (a clickable map)
-* [http://www.expasy.ch/tools/ ExPASy Proteomics tools] — list of prediction tools and servers
-* [http://bioinf.cs.ucl.ac.uk/dompred/ DomPred] — London's Global University
-* [http://www.ics.uci.edu/~baldig/dompro.html DOMpro] — University of California Irvine
-* [http://structure.pitt.edu/servers/domainsplit/ DomainSplit] — University of Pittsburgh
-* [http://www.predictprotein.org/ PredictProtein]
-* [http://protinfo.compbio.washington.edu Protinfo] — comparative and de novo protein structure and complex modelling server
-* [http://scratch.proteomics.ics.uci.edu/ SCRATCH] Protein structure prediction suite that includes SSpro
-* [http://bioinf.cs.ucl.ac.uk/psipred/ PSIPRED] The PSIPRED protein structure prediction server
-* [http://zhanglab.ccmb.med.umich.edu/PSSpred/ PSSpred] A multiple neural network training program for protein secondary structure prediction
-{{Protein structure determination}}
-{{Protein methods}}
-{{Biomolecular structure}}
-{{DEFAULTSORT:Protein Structure Prediction}}
-[[Category:Bioinformatics]]
-[[Category:Protein structure]]
-[[Category:Protein methods]]
