Amino Acids
By Sergey Nosov
Amino acids are the building blocks of living organisms. Chained together, amino acids form peptides and proteins.
In known life forms, twenty-three ribosomally synthesized proteinogenic, or ‘protein creating’, amino acids are directly encoded in DNA or generated through special translation mechanisms. Peptides and proteins also undergo post-translational modifications that result in the presence of non-proteinogenic, or unusual, amino acids in their sequences. Non-proteinogenic amino acids can also be incorporated synthetically, and compounds made entirely of non-proteinogenic amino acids are possible as well.
With the exception of N-Formylmethionine (fMet), which appears at the N-terminus as a starting residue in some bacterial proteins, each proteinogenic amino acid is assigned a capital-letter one-symbol code (in parentheses in the table on this page).
Proteinogenic amino acids have L chirality (stereo-isomeric configuration), which is assumed when not explicitly mentioned.
Non-proteinogenic amino acids with chirality opposite to that of proteinogenic amino acids are referred to as D-amino acids. Each of the proteinogenic amino acids, excluding Glycine, which is achiral, has a D-amino acid counterpart that may be referred to in sequences by the same letter as the proteinogenic amino acid, only in lower case.
In addition to the one-letter codes just discussed, amino acids can be referred to by multi-letter codes. These are three-letter codes for the more frequently used amino acids, and longer codes for others. Let us review a couple of examples.
The first is the proteinogenic amino acid Alanine. Its International Union of Pure and Applied Chemistry (IUPAC) name is (2S)-2-aminopropanoic acid. In peptide and protein sequences it can be referred to by the one-letter code “A” or by the three-letter code “Ala.” Alanine’s chirality is L, so it can be written as “L-Ala,” though the “L” is assumed and is usually dropped.
The D enantiomer of L-Alanine is D-Alanine; IUPAC name: (2R)-2-aminopropanoic acid. In one-letter sequences D-Alanine residues may be referred to by lower case “a.” While D-amino acids are sometimes written in all lower case letters in multi-letter sequences, it is more common to prefix D-amino acid residues with the letter “D” and a dash, as follows: “D-Ala.”
A peptide sequence containing three L-Alanines followed by three D-Alanines (N-terminus to C-terminus, left to right) would look like the following in multi-letter code:
Ala-Ala-Ala-D-Ala-D-Ala-D-Ala
The same sequence in one-letter code is: AAAaaa. The analytical tool on this website understands both sequence notations. Please note that when using one-letter codes, dashes between adjoining amino acids are optional. Use dashes or switch to multi-letter codes if ambiguity arises.
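The relationship between the two notations can be sketched in a few lines of Python. This is only an illustration, not the code behind the analytical tool: the function name `one_to_multi` and the small lookup table are assumptions made for the example, and the table covers only a handful of residues.

```python
# Hypothetical lookup covering only a few residues; extend it from the table on this page.
THREE_LETTER = {"A": "Ala", "F": "Phe", "G": "Gly", "K": "Lys", "L": "Leu"}


def one_to_multi(sequence: str) -> str:
    """Convert a one-letter sequence into dash-separated multi-letter codes."""
    residues = []
    # Dashes are optional in one-letter notation, so they are simply dropped.
    for symbol in sequence.replace("-", ""):
        code = THREE_LETTER.get(symbol.upper())
        if code is None:
            raise ValueError(f"unknown one-letter code: {symbol!r}")
        if symbol.islower():
            # Lower case marks a D-amino acid; Glycine has no D counterpart.
            if symbol == "g":
                raise ValueError("Glycine has no D counterpart")
            code = "D-" + code
        residues.append(code)
    return "-".join(residues)


print(one_to_multi("AAAaaa"))  # Ala-Ala-Ala-D-Ala-D-Ala-D-Ala
```

Because each one-letter symbol is a single character, the sketch can walk the string character by character; multi-letter input would need a real tokenizer instead.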
Several other groups of non-proteinogenic amino acids can be identified based on their similarity to natural amino acids. The amino group in natural amino acids is normally attached at the alpha-carbon. If the amino group is attached to the beta-carbon (the second carbon away from the carboxylic end), the amino acid can be referred to as a beta-amino acid. Similarly, an amino group on the gamma, or third, carbon gives a gamma-amino acid.
Homo amino acids are amino acids with an additional methylene (CH2) group in the side chain attached to the alpha-carbon. If the additional carbon is instead inserted immediately after the carboxyl group, those amino acids can be called beta-Homo. Many other amino acid categories can result from the attachment of various radicals, the loss of hydrogen at the alpha-carbon, or other changes.
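In terms of molecular formula, either kind of homologation adds one carbon and two hydrogens. The short Python sketch below illustrates this using formulas written the way the table on this page writes them (explicit counts, carbon and hydrogen first, then the remaining elements alphabetically). The helper names are hypothetical and chosen only for this example.

```python
import re
from collections import Counter


def parse_formula(formula: str) -> Counter:
    """Split a formula such as 'C9H11N1O2' into element counts."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def format_formula(counts: Counter) -> str:
    """Write counts back out: C, H, then the other elements alphabetically."""
    others = sorted(e for e in counts if e not in ("C", "H"))
    order = [e for e in ("C", "H") if e in counts] + others
    return "".join(f"{e}{counts[e]}" for e in order)


def add_methylene(formula: str) -> str:
    """Return the formula of the homologue with one extra CH2 group."""
    counts = parse_formula(formula)
    counts.update({"C": 1, "H": 2})
    return format_formula(counts)


print(add_methylene("C9H11N1O2"))  # C10H13N1O2
```

The input here is the formula listed for Phenylalanine, and the result matches the formula listed for both Homo-Phenylalanine and beta-Homo-Phenylalanine. The formula alone cannot tell the two apart, because it does not record where the extra carbon sits.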
Description | In Sequence Code | Molecular Formula | MW |
---|---|---|---|
Histidine (H) | His | C6H9N3O2 | 155.15 |
D-Histidine (h) | D-His | C6H9N3O2 | 155.15 |
Isoleucine (I) | Ile | C6H13N1O2 | 131.17 |
beta-Homo-Isoleucine | beta-Homo-Ile | C7H15N1O2 | 145.20 |
D-Isoleucine (i) | D-Ile | C6H13N1O2 | 131.17 |
Leucine (L) | Leu | C6H13N1O2 | 131.17 |
beta-Homo-Leucine | beta-Homo-Leu | C7H15N1O2 | 145.20 |
D-Leucine (l) | D-Leu | C6H13N1O2 | 131.17 |
Lysine (K) | Lys | C6H14N2O2 | 146.19 |
beta-Homo-Lysine | beta-Homo-Lys | C7H16N2O2 | 160.21 |
D-Lysine (k) | D-Lys | C6H14N2O2 | 146.19 |
Methionine (M) | Met | C5H11N1O2S1 | 149.21 |
beta-Homo-Methionine | beta-Homo-Met | C6H13N1O2S1 | 163.24 |
D-Methionine (m) | D-Met | C5H11N1O2S1 | 149.21 |
Phenylalanine (F) | Phe | C9H11N1O2 | 165.19 |
beta-Phenylalanine | beta-Phe | C9H11N1O2 | 165.19 |
beta-Homo-Phenylalanine | beta-Homo-Phe | C10H13N1O2 | 179.22 |
Homo-Phenylalanine | Homo-Phe | C10H13N1O2 | 179.22 |
D-Phenylalanine (f) | D-Phe | C9H11N1O2 | 165.19 |
Homo-D-Phenylalanine | Homo-D-Phe | C10H13N1O2 | 179.22 |
Threonine (T) | Thr | C4H9N1O3 | 119.12 |
beta-Homo-Threonine | beta-Homo-Thr | C5H11N1O3 | 133.15 |
D-Threonine (t) | D-Thr | C4H9N1O3 | 119.12 |
Tryptophan (W) | Trp | C11H12N2O2 | 204.22 |
beta-Homo-Tryptophan | beta-Homo-Trp | C12H14N2O2 | 218.25 |
D-Tryptophan (w) | D-Trp | C11H12N2O2 | 204.22 |
Valine (V) | Val | C5H11N1O2 | 117.15 |
beta-Valine | beta-Val | C5H11N1O2 | 117.15 |
beta-Homo-Valine | beta-Homo-Val | C6H13N1O2 | 131.17 |
D-Valine (v) | D-Val | C5H11N1O2 | 117.15 |
Alanine (A) | Ala | C3H7N1O2 | 89.09 |
beta-Alanine | beta-Ala | C3H7N1O2 | 89.09 |
D-Alanine (a) | D-Ala | C3H7N1O2 | 89.09 |
Arginine (R) | Arg | C6H14N4O2 | 174.20 |
beta-Homo-Arginine | beta-Homo-Arg | C7H16N4O2 | 188.23 |
Homo-Arginine | Homo-Arg | C7H16N4O2 | 188.23 |
D-Arginine (r) | D-Arg | C6H14N4O2 | 174.20 |
Homo-D-Arginine | Homo-D-Arg | C7H16N4O2 | 188.23 |
Asparagine (N) | Asn | C4H8N2O3 | 132.12 |
beta-Homo-Asparagine | beta-Homo-Asn | C5H10N2O3 | 146.14 |
D-Asparagine (n) | D-Asn | C4H8N2O3 | 132.12 |
Aspartic acid (D) | Asp | C4H7N1O4 | 133.10 |
beta-Aspartic acid | beta-Asp | C4H7N1O4 | 133.10 |
beta-Homo-Aspartic acid | beta-Homo-Asp | C5H9N1O4 | 147.13 |
D-Aspartic acid (d) | D-Asp | C4H7N1O4 | 133.10 |
Cysteine (C) | Cys | C3H7N1O2S1 | 121.16 |
Homo-Cysteine | Homo-Cys | C4H9N1O2S1 | 135.19 |
D-Cysteine (c) | D-Cys | C3H7N1O2S1 | 121.16 |
Homo-D-Cysteine | Homo-D-Cys | C4H9N1O2S1 | 135.19 |
Glutamic acid (E) | Glu | C5H9N1O4 | 147.13 |
beta-Glutamic acid | beta-Glu | C5H9N1O4 | 147.13 |
beta-Homo-Glutamic acid | beta-Homo-Glu | C6H11N1O4 | 161.16 |
D-Glutamic acid (e) | D-Glu | C5H9N1O4 | 147.13 |
Glutamine (Q) | Gln | C5H10N2O3 | 146.14 |
beta-Homo-Glutamine | beta-Homo-Gln | C6H12N2O3 | 160.17 |
D-Glutamine (q) | D-Gln | C5H10N2O3 | 146.14 |
Glycine (G) | Gly | C2H5N1O2 | 75.07 |
Proline (P) | Pro | C5H9N1O2 | 115.13 |
beta-Homo-Proline | beta-Homo-Pro | C6H11N1O2 | 129.16 |
D-Proline (p) | D-Pro | C5H9N1O2 | 115.13 |
Serine (S) | Ser | C3H7N1O3 | 105.09 |
beta-Homo-Serine | beta-Homo-Ser | C4H9N1O3 | 119.12 |
Homo-Serine | Homo-Ser | C4H9N1O3 | 119.12 |
D-Serine (s) | D-Ser | C3H7N1O3 | 105.09 |
Homo-D-Serine | Homo-D-Ser | C4H9N1O3 | 119.12 |
Tyrosine (Y) | Tyr | C9H11N1O3 | 181.19 |
beta-Homo-Tyrosine | beta-Homo-Tyr | C10H13N1O3 | 195.21 |
D-Tyrosine (y) | D-Tyr | C9H11N1O3 | 181.19 |
Ornithine | Orn | C5H12N2O2 | 132.16 |
D-Ornithine | D-Orn | C5H12N2O2 | 132.16 |
Selenocysteine (U) | Sec | C3H6N1O2Se1 | 167.06 |
D-Selenocysteine (u) | D-Sec | C3H6N1O2Se1 | 167.06 |
Selenomethionine | Mse | C5H11N1O2Se1 | 196.12 |
D-Selenomethionine | D-Mse | C5H11N1O2Se1 | 196.12 |
Pyrrolysine (O) | Pyl | C12H21N3O3 | 255.31 |
D-Pyrrolysine (o) | D-Pyl | C12H21N3O3 | 255.31 |
Norleucine | Nle | C6H13N1O2 | 131.17 |
beta-Norleucine | beta-Nle | C6H13N1O2 | 131.17 |
D-Norleucine | D-Nle | C6H13N1O2 | 131.17 |
Naphthylalanine | Nal | C13H13N1O2 | 215.25 |
beta-Naphthylalanine | beta-Nal | C13H13N1O2 | 215.25 |
D-Naphthylalanine | D-Nal | C13H13N1O2 | 215.25 |
beta-D-Naphthylalanine | beta-D-Nal | C13H13N1O2 | 215.25 |
Fluorophenylalanine | Fpa | C9H10F1N1O2 | 183.18 |
Aminohexanoic acid | Ahx | C6H13N1O2 | 131.17 |
Norvaline | Nva | C5H11N1O2 | 117.15 |
beta-Norvaline | beta-Nva | C5H11N1O2 | 117.15 |
D-Norvaline | D-Nva | C5H11N1O2 | 117.15 |
Homo-Proline | Pip | C6H11N1O2 | 129.16 |
Homo-D-Proline | D-Pip | C6H11N1O2 | 129.16 |
Phenylglycine | Phg | C8H9N1O2 | 151.16 |
beta-Phenylglycine | beta-Phg | C8H9N1O2 | 151.16 |
D-Phenylglycine | D-Phg | C8H9N1O2 | 151.16 |
beta-D-Phenylglycine | beta-D-Phg | C8H9N1O2 | 151.16 |
Hydroxytryptophan | 5-Htp | C11H12N2O3 | 220.22 |
beta-Hydroxytryptophan | beta-5-Htp | C11H12N2O3 | 220.22 |
D-Hydroxytryptophan | D-5-Htp | C11H12N2O3 | 220.22 |
beta-D-Hydroxytryptophan | beta-D-5-Htp | C11H12N2O3 | 220.22 |
Allylglycine | Hag | C5H9N1O2 | 115.13 |
beta-Allylglycine | beta-Hag | C5H9N1O2 | 115.13 |
D-Allylglycine | D-Hag | C5H9N1O2 | 115.13 |
beta-D-Allylglycine | beta-D-Hag | C5H9N1O2 | 115.13 |
(S)-3,5-Dihydroxyphenylglycine | Dhpg | C8H9N1O4 | 183.16 |
4-hydroxy-glutamic-acid | gamma-Hydroxy-Glu | C5H9N1O5 | 163.13 |
Methionine sulfoxide | Met(R-O) | C5H11N1O3S1 | 165.21 |
Methionine sulfone | Met(O2) | C5H11N1O4S1 | 181.21 |
Pyroglutamic acid | pGlu | C5H7N1O3 | 129.11 |
D-Pyroglutamic acid | D-Pyr | C5H7N1O3 | 129.11 |
gamma-Carboxyglutamic acid | Gla | C6H9N1O6 | 191.14 |
2,3-diaminopropanoic acid | Dap | C3H8N2O2 | 104.11 |
N-Methylleucine | Leu(N-Me) | C7H15N1O2 | 145.20 |
beta-Homo-N-Methylleucine | beta-Homo-Leu(N-Me) | C8H17N1O2 | 159.23 |
N-Methylphenylalanine | Phe(N-Me) | C10H13N1O2 | 179.22 |
beta-N-Methylphenylalanine | beta-Phe(N-Me) | C10H13N1O2 | 179.22 |
beta-Homo-N-Methylphenylalanine | beta-Homo-Phe(N-Me) | C11H15N1O2 | 193.24 |
Homo-N-Methylphenylalanine | Homo-Phe(N-Me) | C11H15N1O2 | 193.24 |
Cyclohexylalanine | Cha | C9H17N1O2 | 171.24 |
D-Cyclohexylalanine | D-Cha | C9H17N1O2 | 171.24 |
Aminobutanoic acid | Abu | C4H9N1O2 | 103.12 |
Statine | Sta | C8H17N1O3 | 175.23 |
Penicillamine | Pen | C5H11N1O2S1 | 149.21 |
D-Penicillamine | D-Pen | C5H11N1O2S1 | 149.21 |
Hydroxyproline | Hyp | C5H9N1O3 | 131.13 |
Sarcosine | Sar | C3H7N1O2 | 89.09 |
Diphenylalanine | Dif | C15H15N1O2 | 241.28 |
3-Nitrotyrosine | Tyr(NO2) | C9H10N2O5 | 226.19 |
beta-Homo-3-Nitrotyrosine | beta-Homo-Tyr(NO2) | C10H12N2O5 | 240.21 |
Anthranilic acid | Abz | C7H7N1O2 | 137.14 |
Dehydroproline | Dehydroproline | C5H7N1O2 | 113.11 |
Amino-PEG2-acid | PEG2 | C7H15N1O4 | 177.20 |
Amino-PEG3-acid | PEG3 | C9H19N1O5 | 221.25 |
Amino-PEG4-acid | PEG4 | C11H23N1O6 | 265.30 |
Amino-PEG5-acid | PEG5 | C13H27N1O7 | 309.36 |
Amino-PEG6-acid | PEG6 | C15H31N1O8 | 353.41 |
Amino-PEG7-acid | PEG7 | C17H35N1O9 | 397.46 |
Amino-PEG8-acid | PEG8 | C19H39N1O10 | 441.51 |
Amino-PEG9-acid | PEG9 | C21H43N1O11 | 485.57 |
Amino-PEG10-acid | PEG10 | C23H47N1O12 | 529.62 |
Amino-PEG11-acid | PEG11 | C25H51N1O13 | 573.67 |
Amino-PEG12-acid | PEG12 | C27H55N1O14 | 617.72 |
Amino-PEG13-acid | PEG13 | C29H59N1O15 | 661.78 |
Amino-PEG14-acid | PEG14 | C31H63N1O16 | 705.83 |
Amino-PEG15-acid | PEG15 | C33H67N1O17 | 749.88 |
Amino-PEG16-acid | PEG16 | C35H71N1O18 | 793.93 |
Amino-PEG17-acid | PEG17 | C37H75N1O19 | 837.99 |
Amino-PEG18-acid | PEG18 | C39H79N1O20 | 882.04 |
Amino-PEG19-acid | PEG19 | C41H83N1O21 | 926.09 |
Amino-PEG20-acid | PEG20 | C43H87N1O22 | 970.14 |
Amino-PEG21-acid | PEG21 | C45H91N1O23 | 1014.20 |
Amino-PEG22-acid | PEG22 | C47H95N1O24 | 1058.25 |
Amino-PEG23-acid | PEG23 | C49H99N1O25 | 1102.30 |
Amino-PEG24-acid | PEG24 | C51H103N1O26 | 1146.35 |
Amino-PEG25-acid | PEG25 | C53H107N1O27 | 1190.41 |
Amino-PEG26-acid | PEG26 | C55H111N1O28 | 1234.46 |
Amino-PEG27-acid | PEG27 | C57H115N1O29 | 1278.51 |
Amino-PEG28-acid | PEG28 | C59H119N1O30 | 1322.56 |
Amino-PEG29-acid | PEG29 | C61H123N1O31 | 1366.62 |
Amino-PEG30-acid | PEG30 | C63H127N1O32 | 1410.67 |
Amino-PEG31-acid | PEG31 | C65H131N1O33 | 1454.72 |
Amino-PEG32-acid | PEG32 | C67H135N1O34 | 1498.77 |
Amino-PEG33-acid | PEG33 | C69H139N1O35 | 1542.83 |
Amino-PEG34-acid | PEG34 | C71H143N1O36 | 1586.88 |
Amino-PEG35-acid | PEG35 | C73H147N1O37 | 1630.93 |
Amino-PEG36-acid | PEG36 | C75H151N1O38 | 1674.98 |
Azidolysine | Azidolysine | C6H12N4O2 | 172.19 |
2-(1-aminoethyl)-1,3-thiazole-4-carboxylic acid | Ala-Thiazole | C6H8N2O2S1 | 172.21 |
2-(1-aminoethyl)-1,3-thiazoline-4-carboxylic acid | Ala-Thiazoline | C6H10N2O2S1 | 174.22 |
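
The MW column follows directly from the Molecular Formula column. As a rough check, the Python sketch below sums atomic weights over a formula. The atomic weights are rounded standard values supplied for this example (they are not taken from this page), and `molecular_weight` is a hypothetical helper name.

```python
import re

# Rounded standard atomic weights; an assumption of this sketch.
# Only the elements that occur in the table are listed.
ATOMIC_WEIGHT = {
    "C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999,
    "S": 32.06, "Se": 78.971, "F": 18.998,
}


def molecular_weight(formula: str) -> float:
    """Sum atomic weights over a formula such as 'C3H7N1O2'."""
    total = 0.0
    for element, count in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_WEIGHT[element] * (int(count) if count else 1)
    return total


rows = [("Ala", "C3H7N1O2"), ("Met", "C5H11N1O2S1"), ("Sec", "C3H6N1O2Se1")]
for code, formula in rows:
    print(f"{code}: {molecular_weight(formula):.2f}")
```

With these weights the three rows come out as 89.09, 149.21 and 167.06, in agreement with the table. Other rows may differ in the last digit depending on which atomic weight values are used.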