Sergey Nosov

Amino Acids

By Sergey Nosov

Amino acids are the building blocks of living organisms. Chained together, amino acids form peptides and proteins.

In known life forms, twenty-three proteinogenic, or ‘protein creating’, amino acids are incorporated into proteins during ribosomal synthesis. They are either directly encoded in DNA or introduced through special translation mechanisms. Peptides and proteins also undergo post-translational modifications, which result in the presence of non-proteinogenic, or unusual, amino acids in their sequences. Non-proteinogenic amino acids can also be incorporated synthetically, and compounds made entirely of non-proteinogenic amino acids are possible as well.

With the exception of N-Formylmethionine (fMet), which appears at the N-terminus as the starting residue in some bacterial proteins, each proteinogenic amino acid is assigned a one-letter code written as a capital letter (shown in parentheses in the table on this page).

Proteinogenic amino acids have L chirality (stereoisomeric configuration), which is assumed when chirality is not explicitly stated.

Non-proteinogenic amino acids with chirality opposite to that of proteinogenic amino acids are referred to as D-amino acids. Each proteinogenic amino acid except Glycine, which is achiral, has a D-amino acid counterpart that may be referred to in sequences by the same letter as the proteinogenic amino acid, only in lower case.

In addition to the one-letter codes just discussed, amino acids can be referred to by multi-letter codes: three-letter codes for the more frequently used amino acids and longer codes for the others. Let us review a couple of examples.

The first example is the proteinogenic amino acid Alanine. Its International Union of Pure and Applied Chemistry (IUPAC) name is (2S)-2-aminopropanoic acid. In peptide and protein sequences it can be referred to by the one-letter code “A” or by the three-letter code “Ala.” Alanine’s chirality is L, so it can be written as “L-Ala,” though the “L” is assumed and is usually dropped.

The D enantiomer of L-Alanine is D-Alanine; its IUPAC name is (2R)-2-aminopropanoic acid. In one-letter sequences, D-Alanine residues may be referred to by the lower-case “a.” While D-amino acids are sometimes written in all lower-case letters in multi-letter sequences, it is more common to prefix a D-amino acid residue with the letter “D” and a dash, as follows: “D-Ala.”

A peptide sequence containing three L-Alanines followed by three D-Alanines (N-terminus to C-terminus, left to right) would look like the following in multi-letter code:

Ala-Ala-Ala-D-Ala-D-Ala-D-Ala

The same sequence in one-letter code is AAAaaa. The analytical tool on this website understands both sequence notations. Note that when using one-letter codes, dashes between adjoining amino acids are optional. Use dashes or switch to multi-letter codes if ambiguity arises.
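The relationship between the two notations can be sketched in a few lines of Python. This is only an illustration of the convention described above, not the code behind the analytical tool on this website: the function name expand_one_letter, the lookup table THREE_LETTER, and the decision to reject a lower-case “g” are all assumptions made for the example.

```python
# One-letter to three-letter codes for the twenty standard amino acids plus
# Selenocysteine (U) and Pyrrolysine (O), as listed in the table on this page.
THREE_LETTER = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "E": "Glu", "Q": "Gln", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
    "U": "Sec", "O": "Pyl",
}


def expand_one_letter(sequence: str) -> str:
    """Convert a one-letter sequence to dash-separated multi-letter codes."""
    residues = []
    # Dashes are optional in one-letter notation, so drop them first.
    for symbol in sequence.replace("-", ""):
        code = THREE_LETTER.get(symbol.upper())
        if code is None:
            raise ValueError(f"unknown one-letter code: {symbol!r}")
        if symbol.islower():
            # Assumption: Glycine is achiral, so a lower-case "g" is rejected.
            if symbol == "g":
                raise ValueError("Glycine has no D-amino acid counterpart")
            code = "D-" + code  # lower case marks the D enantiomer
        residues.append(code)
    return "-".join(residues)


print(expand_one_letter("AAAaaa"))  # Ala-Ala-Ala-D-Ala-D-Ala-D-Ala
```

The sketch covers only residues that have a one-letter code; anything else, such as Orn or beta-Ala, has to be written in multi-letter notation.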

Several other groups of non-proteinogenic amino acids can be identified based on their similarity to natural amino acids. The amino group in natural amino acids is normally attached to the alpha-carbon. If the amino group is attached to the beta-carbon (the second carbon away from the carboxyl end), the amino acid can be referred to as a beta-amino acid. Similarly, an amino group on the gamma, or third, carbon makes a gamma-amino acid.

Homo amino acids are amino acids with an additional methylene (CH2) group in the side chain attached to the alpha-carbon. If the additional carbon is instead inserted immediately after the carboxyl group, between it and the alpha-carbon, those amino acids can be called beta-Homo. Many other amino acid categories can result from the attachment of various radicals, loss of hydrogen at the alpha-carbon, or other means.
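Because a Homo or beta-Homo amino acid differs from its parent by one methylene unit, its molecular formula is the parent formula plus CH2. The short Python sketch below shows that arithmetic on formulas written in the style of the table on this page. The helper names parse_formula, format_formula and add_methylene are hypothetical and exist only for this example.

```python
import re
from collections import Counter


def parse_formula(formula: str) -> Counter:
    """Parse a formula such as 'C9H11N1O2' into per-element atom counts."""
    counts = Counter()
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] += int(number) if number else 1
    return counts


def format_formula(counts: Counter) -> str:
    """Write counts back with C and H first, then the rest alphabetically."""
    order = [e for e in ("C", "H") if e in counts]
    order += sorted(e for e in counts if e not in ("C", "H"))
    return "".join(f"{e}{counts[e]}" for e in order)


def add_methylene(formula: str) -> str:
    """Return the formula extended by one CH2 group."""
    counts = parse_formula(formula)
    counts.update({"C": 1, "H": 2})
    return format_formula(counts)


print(add_methylene("C9H11N1O2"))  # Phenylalanine -> C10H13N1O2
```

The result matches the formula listed for both Homo-Phenylalanine and beta-Homo-Phenylalanine: the two differ in where the extra carbon sits, not in composition.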

Description | In-Sequence Code | Molecular Formula | MW (g/mol)
--- | --- | --- | ---
Histidine (H) | His | C6H9N3O2 | 155.15
D-Histidine (h) | D-His | C6H9N3O2 | 155.15
Isoleucine (I) | Ile | C6H13N1O2 | 131.17
beta-Homo-Isoleucine | beta-Homo-Ile | C7H15N1O2 | 145.20
D-Isoleucine (i) | D-Ile | C6H13N1O2 | 131.17
Leucine (L) | Leu | C6H13N1O2 | 131.17
beta-Homo-Leucine | beta-Homo-Leu | C7H15N1O2 | 145.20
D-Leucine (l) | D-Leu | C6H13N1O2 | 131.17
Lysine (K) | Lys | C6H14N2O2 | 146.19
beta-Homo-Lysine | beta-Homo-Lys | C7H16N2O2 | 160.21
D-Lysine (k) | D-Lys | C6H14N2O2 | 146.19
Methionine (M) | Met | C5H11N1O2S1 | 149.21
beta-Homo-Methionine | beta-Homo-Met | C6H13N1O2S1 | 163.24
D-Methionine (m) | D-Met | C5H11N1O2S1 | 149.21
Phenylalanine (F) | Phe | C9H11N1O2 | 165.19
beta-Phenylalanine | beta-Phe | C9H11N1O2 | 165.19
beta-Homo-Phenylalanine | beta-Homo-Phe | C10H13N1O2 | 179.22
Homo-Phenylalanine | Homo-Phe | C10H13N1O2 | 179.22
D-Phenylalanine (f) | D-Phe | C9H11N1O2 | 165.19
Homo-D-Phenylalanine | Homo-D-Phe | C10H13N1O2 | 179.22
Threonine (T) | Thr | C4H9N1O3 | 119.12
beta-Homo-Threonine | beta-Homo-Thr | C5H11N1O3 | 133.15
D-Threonine (t) | D-Thr | C4H9N1O3 | 119.12
Tryptophan (W) | Trp | C11H12N2O2 | 204.22
beta-Homo-Tryptophan | beta-Homo-Trp | C12H14N2O2 | 218.25
D-Tryptophan (w) | D-Trp | C11H12N2O2 | 204.22
Valine (V) | Val | C5H11N1O2 | 117.15
beta-Valine | beta-Val | C5H11N1O2 | 117.15
beta-Homo-Valine | beta-Homo-Val | C6H13N1O2 | 131.17
D-Valine (v) | D-Val | C5H11N1O2 | 117.15
Alanine (A) | Ala | C3H7N1O2 | 89.09
beta-Alanine | beta-Ala | C3H7N1O2 | 89.09
D-Alanine (a) | D-Ala | C3H7N1O2 | 89.09
Arginine (R) | Arg | C6H14N4O2 | 174.20
beta-Homo-Arginine | beta-Homo-Arg | C7H16N4O2 | 188.23
Homo-Arginine | Homo-Arg | C7H16N4O2 | 188.23
D-Arginine (r) | D-Arg | C6H14N4O2 | 174.20
Homo-D-Arginine | Homo-D-Arg | C7H16N4O2 | 188.23
Asparagine (N) | Asn | C4H8N2O3 | 132.12
beta-Homo-Asparagine | beta-Homo-Asn | C5H10N2O3 | 146.14
D-Asparagine (n) | D-Asn | C4H8N2O3 | 132.12
Aspartic acid (D) | Asp | C4H7N1O4 | 133.10
beta-Aspartic acid | beta-Asp | C4H7N1O4 | 133.10
beta-Homo-Aspartic acid | beta-Homo-Asp | C5H9N1O4 | 147.13
D-Aspartic acid (d) | D-Asp | C4H7N1O4 | 133.10
Cysteine (C) | Cys | C3H7N1O2S1 | 121.16
Homo-Cysteine | Homo-Cys | C4H9N1O2S1 | 135.19
D-Cysteine (c) | D-Cys | C3H7N1O2S1 | 121.16
Homo-D-Cysteine | Homo-D-Cys | C4H9N1O2S1 | 135.19
Glutamic acid (E) | Glu | C5H9N1O4 | 147.13
beta-Glutamic acid | beta-Glu | C5H9N1O4 | 147.13
beta-Homo-Glutamic acid | beta-Homo-Glu | C6H11N1O4 | 161.16
D-Glutamic acid (e) | D-Glu | C5H9N1O4 | 147.13
Glutamine (Q) | Gln | C5H10N2O3 | 146.14
beta-Homo-Glutamine | beta-Homo-Gln | C6H12N2O3 | 160.17
D-Glutamine (q) | D-Gln | C5H10N2O3 | 146.14
Glycine (G) | Gly | C2H5N1O2 | 75.07
Proline (P) | Pro | C5H9N1O2 | 115.13
beta-Homo-Proline | beta-Homo-Pro | C6H11N1O2 | 129.16
D-Proline (p) | D-Pro | C5H9N1O2 | 115.13
Serine (S) | Ser | C3H7N1O3 | 105.09
beta-Homo-Serine | beta-Homo-Ser | C4H9N1O3 | 119.12
Homo-Serine | Homo-Ser | C4H9N1O3 | 119.12
D-Serine (s) | D-Ser | C3H7N1O3 | 105.09
Homo-D-Serine | Homo-D-Ser | C4H9N1O3 | 119.12
Tyrosine (Y) | Tyr | C9H11N1O3 | 181.19
beta-Homo-Tyrosine | beta-Homo-Tyr | C10H13N1O3 | 195.21
D-Tyrosine (y) | D-Tyr | C9H11N1O3 | 181.19
Ornithine | Orn | C5H12N2O2 | 132.16
D-Ornithine | D-Orn | C5H12N2O2 | 132.16
Selenocysteine (U) | Sec | C3H6N1O2Se1 | 167.06
D-Selenocysteine (u) | D-Sec | C3H6N1O2Se1 | 167.06
Selenomethionine | Mse | C5H11N1O2Se1 | 196.12
D-Selenomethionine | D-Mse | C5H11N1O2Se1 | 196.12
Pyrrolysine (O) | Pyl | C12H21N3O3 | 255.31
D-Pyrrolysine (o) | D-Pyl | C12H21N3O3 | 255.31
Norleucine | Nle | C6H13N1O2 | 131.17
beta-Norleucine | beta-Nle | C6H13N1O2 | 131.17
D-Norleucine | D-Nle | C6H13N1O2 | 131.17
Naphthylalanine | Nal | C13H13N1O2 | 215.25
beta-Naphthylalanine | beta-Nal | C13H13N1O2 | 215.25
D-Naphthylalanine | D-Nal | C13H13N1O2 | 215.25
beta-D-Naphthylalanine | beta-D-Nal | C13H13N1O2 | 215.25
Fluorophenylalanine | Fpa | C9H10F1N1O2 | 183.18
Aminohexanoic acid | Ahx | C6H13N1O2 | 131.17
Norvaline | Nva | C5H11N1O2 | 117.15
beta-Norvaline | beta-Nva | C5H11N1O2 | 117.15
D-Norvaline | D-Nva | C5H11N1O2 | 117.15
Homo-Proline | Pip | C6H11N1O2 | 129.16
Homo-D-Proline | D-Pip | C6H11N1O2 | 129.16
Phenylglycine | Phg | C8H9N1O2 | 151.16
beta-Phenylglycine | beta-Phg | C8H9N1O2 | 151.16
D-Phenylglycine | D-Phg | C8H9N1O2 | 151.16
beta-D-Phenylglycine | beta-D-Phg | C8H9N1O2 | 151.16
Hydroxytryptophan | 5-Htp | C11H12N2O3 | 220.22
beta-Hydroxytryptophan | beta-5-Htp | C11H12N2O3 | 220.22
D-Hydroxytryptophan | D-5-Htp | C11H12N2O3 | 220.22
beta-D-Hydroxytryptophan | beta-D-5-Htp | C11H12N2O3 | 220.22
Allylglycine | Hag | C5H9N1O2 | 115.13
beta-Allylglycine | beta-Hag | C5H9N1O2 | 115.13
D-Allylglycine | D-Hag | C5H9N1O2 | 115.13
beta-D-Allylglycine | beta-D-Hag | C5H9N1O2 | 115.13
(S)-3,5-Dihydroxyphenylglycine | Dhpg | C8H9N1O4 | 183.16
4-hydroxy-glutamic-acid | gamma-Hydroxy-Glu | C5H9N1O5 | 163.13
Methionine sulfoxide | Met(R-O) | C5H11N1O3S1 | 165.21
Methionine sulfone | Met(O2) | C5H11N1O4S1 | 181.21
Pyroglutamic acid | pGlu | C5H7N1O3 | 129.11
D-Pyroglutamic acid | D-Pyr | C5H7N1O3 | 129.11
gamma-Carboxyglutamic acid | Gla | C6H9N1O6 | 191.14
2,3-diaminopropanoic acid | Dap | C3H8N2O2 | 104.11
N-Methylleucine | Leu(N-Me) | C7H15N1O2 | 145.20
beta-Homo-N-Methylleucine | beta-Homo-Leu(N-Me) | C8H17N1O2 | 159.23
N-Methylphenylalanine | Phe(N-Me) | C10H13N1O2 | 179.22
beta-N-Methylphenylalanine | beta-Phe(N-Me) | C10H13N1O2 | 179.22
beta-Homo-N-Methylphenylalanine | beta-Homo-Phe(N-Me) | C11H15N1O2 | 193.24
Homo-N-Methylphenylalanine | Homo-Phe(N-Me) | C11H15N1O2 | 193.24
Cyclohexylalanine | Cha | C9H17N1O2 | 171.24
D-Cyclohexylalanine | D-Cha | C9H17N1O2 | 171.24
Aminobutanoic acid | Abu | C4H9N1O2 | 103.12
Statine | Sta | C8H17N1O3 | 175.23
Penicillamine | Pen | C5H11N1O2S1 | 149.21
D-Penicillamine | D-Pen | C5H11N1O2S1 | 149.21
Hydroxyproline | Hyp | C5H9N1O3 | 131.13
Sarcosine | Sar | C3H7N1O2 | 89.09
Diphenylalanine | Dif | C15H15N1O2 | 241.28
3-Nitrotyrosine | Tyr(NO2) | C9H10N2O5 | 226.19
beta-Homo-3-Nitrotyrosine | beta-Homo-Tyr(NO2) | C10H12N2O5 | 240.21
Anthranilic acid | Abz | C7H7N1O2 | 137.14
Dehydroproline | Dehydroproline | C5H7N1O2 | 113.11
Amino-PEG2-acid | PEG2 | C7H15N1O4 | 177.20
Amino-PEG3-acid | PEG3 | C9H19N1O5 | 221.25
Amino-PEG4-acid | PEG4 | C11H23N1O6 | 265.30
Amino-PEG5-acid | PEG5 | C13H27N1O7 | 309.36
Amino-PEG6-acid | PEG6 | C15H31N1O8 | 353.41
Amino-PEG7-acid | PEG7 | C17H35N1O9 | 397.46
Amino-PEG8-acid | PEG8 | C19H39N1O10 | 441.51
Amino-PEG9-acid | PEG9 | C21H43N1O11 | 485.57
Amino-PEG10-acid | PEG10 | C23H47N1O12 | 529.62
Amino-PEG11-acid | PEG11 | C25H51N1O13 | 573.67
Amino-PEG12-acid | PEG12 | C27H55N1O14 | 617.72
Amino-PEG13-acid | PEG13 | C29H59N1O15 | 661.78
Amino-PEG14-acid | PEG14 | C31H63N1O16 | 705.83
Amino-PEG15-acid | PEG15 | C33H67N1O17 | 749.88
Amino-PEG16-acid | PEG16 | C35H71N1O18 | 793.93
Amino-PEG17-acid | PEG17 | C37H75N1O19 | 837.99
Amino-PEG18-acid | PEG18 | C39H79N1O20 | 882.04
Amino-PEG19-acid | PEG19 | C41H83N1O21 | 926.09
Amino-PEG20-acid | PEG20 | C43H87N1O22 | 970.14
Amino-PEG21-acid | PEG21 | C45H91N1O23 | 1014.20
Amino-PEG22-acid | PEG22 | C47H95N1O24 | 1058.25
Amino-PEG23-acid | PEG23 | C49H99N1O25 | 1102.30
Amino-PEG24-acid | PEG24 | C51H103N1O26 | 1146.35
Amino-PEG25-acid | PEG25 | C53H107N1O27 | 1190.41
Amino-PEG26-acid | PEG26 | C55H111N1O28 | 1234.46
Amino-PEG27-acid | PEG27 | C57H115N1O29 | 1278.51
Amino-PEG28-acid | PEG28 | C59H119N1O30 | 1322.56
Amino-PEG29-acid | PEG29 | C61H123N1O31 | 1366.62
Amino-PEG30-acid | PEG30 | C63H127N1O32 | 1410.67
Amino-PEG31-acid | PEG31 | C65H131N1O33 | 1454.72
Amino-PEG32-acid | PEG32 | C67H135N1O34 | 1498.77
Amino-PEG33-acid | PEG33 | C69H139N1O35 | 1542.83
Amino-PEG34-acid | PEG34 | C71H143N1O36 | 1586.88
Amino-PEG35-acid | PEG35 | C73H147N1O37 | 1630.93
Amino-PEG36-acid | PEG36 | C75H151N1O38 | 1674.98
Azidolysine | Azidolysine | C6H12N4O2 | 172.19
2-(1-aminoethyl)-1,3-thiazole-4-carboxylic acid | Ala-Thiazole | C6H8N2O2S1 | 172.21
2-(1-aminoethyl)-1,3-thiazoline-4-carboxylic acid | Ala-Thiazoline | C6H10N2O2S1 | 174.22
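
The MW column follows from the molecular formula: it is the sum of the atomic weights of all atoms in the formula. The Python sketch below shows the calculation. The atomic weights are abridged standard values chosen for this example, and the function name molecular_weight is hypothetical. The table above may have been computed with slightly different atomic weights, so some results can differ from it by 0.01.

```python
import re

# Abridged standard atomic weights (an assumption of this sketch; the table
# on this page may be based on slightly different values).
ATOMIC_WEIGHT = {
    "C": 12.011, "H": 1.008, "N": 14.007, "O": 15.999,
    "S": 32.06, "Se": 78.971, "F": 18.998,
}


def molecular_weight(formula: str) -> float:
    """Sum atomic weights for a formula such as 'C3H7N1O2', rounded to 0.01."""
    total = 0.0
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        total += ATOMIC_WEIGHT[element] * (int(number) if number else 1)
    return round(total, 2)


print(molecular_weight("C3H7N1O2"))     # Alanine: 89.09
print(molecular_weight("C5H11N1O2S1"))  # Methionine: 149.21
```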