The information for building proteins is stored in a highly conserved codon table, and organisms combine just 20 natural amino acids into proteins that perform a wide variety of biological functions. The rapid development of synthetic biology has made it possible to introduce unnatural amino acids (UAAs) into protein synthesis in a controlled way. This has greatly expanded the range of protein structures and functions, and has facilitated the development of biological tools and the study of physiological processes. UAAs bearing reactive groups are widely used in protein structure research, regulation of protein function, construction of new biomaterials, and pharmaceutical research and development. Genetic codon expansion technology uses an orthogonal translation system to reassign codons, in effect extending the central dogma, so that UAAs can be introduced at designated sites in a protein.
Aminoacyl-tRNA synthetases (aaRSs) are a key class of enzymes that link each amino acid to its corresponding transfer RNA (tRNA), a step that is essential for protein synthesis. In both eukaryotes and prokaryotes, aaRSs are responsible for interpreting the genetic code and maintaining high fidelity during translation. Their functions are not limited to protein synthesis: an increasing number of studies have shown that these enzymes also take part in other cellular processes, including signal transduction, apoptosis, and stress response.
Aminoacyl-tRNA synthetases fall into two major categories, Class I and Class II, based on their structural and sequence characteristics. The two classes differ in fold and in the arrangement of their functional domains. Class I enzymes typically have a Rossmann fold, and their active site contains a nucleotide-binding pocket marked by the conserved Class I signature motifs. The catalytic core of Class II enzymes is built on a completely different fold, an antiparallel β-sheet flanked by α-helices, which carries its own set of characteristic conserved motifs and is referred to as the Class II core domain. Although the two classes differ significantly in overall structure and sequence, their mechanisms of action have much in common. Each aaRS specifically recognizes one amino acid and its corresponding tRNA and, using ATP, catalyzes the activation of the amino acid and the aminoacylation of the tRNA. This process involves two main steps: (1) the amino acid reacts with ATP to form an aminoacyl-AMP intermediate; (2) the aminoacyl group is then transferred from the aminoacyl-AMP intermediate to the 3'-end of the tRNA.
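The two steps can be written as a simple reaction scheme. This is the standard textbook form rather than anything specific to one enzyme; the symbols AA (amino acid), PP_i (the pyrophosphate released on activation), and tRNA^AA (the tRNA that corresponds to AA) are notation introduced here for illustration.

```latex
\begin{align}
\mathrm{AA} + \mathrm{ATP} &\rightleftharpoons \mathrm{AA\text{-}AMP} + \mathrm{PP_i}
  && \text{(1) activation} \\
\mathrm{AA\text{-}AMP} + \mathrm{tRNA^{AA}} &\longrightarrow \mathrm{AA\text{-}tRNA^{AA}} + \mathrm{AMP}
  && \text{(2) transfer to the 3'-end of the tRNA}
\end{align}
```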
The main function of aminoacyl-tRNA synthetases is to ensure that each amino acid is attached to its corresponding tRNA. On the ribosome, these charged tRNAs then deliver the amino acids to the nascent peptide chain in the order specified by the mRNA coding sequence. Charging proceeds through the activation and transfer steps described above and depends on ATP.
This step-by-step mechanism ensures high specificity and efficiency, and a series of complementary proofreading mechanisms have evolved to prevent the wrong amino acid from being attached to a tRNA. Some aaRSs, including several class I enzymes, also have editing functions, which further improve the fidelity of translation by hydrolyzing incorrectly charged aminoacyl-tRNAs. Each aaRS is responsible for only one amino acid and its corresponding tRNA, so in a typical cell the number of such enzymes equals the number of amino acids, generally 20. aaRSs cloned and expressed from different species are highly conserved, especially in the active site and the tRNA-binding site. There are exceptions, however: some organisms contain atypical aaRSs that recognize and attach amino acids outside the standard 20.
Of the 64 triplet codons, 61 are sense codons and the other 3 are stop codons. The sense codons encode the 20 natural amino acids, while the stop codons terminate translation. More than 140 amino acids have been identified in naturally occurring proteins, but apart from the 20 natural amino acids most of them are products of post-translational modification. Only selenocysteine (Sec) and pyrrolysine (Pyl) are directly genetically encoded, by the UGA and UAG codons respectively. This discovery shows that the coding capacity of the codon table can be extended, and it suggests that the table can be modified artificially to introduce UAAs, so that an engineered translation system can synthesize the proteins people need.
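The codon arithmetic above can be checked with a few lines of Python. This is a minimal sketch that simply enumerates all triplets over the four RNA bases; the variable names are illustrative.

```python
from itertools import product

BASES = "UCAG"                       # the four RNA bases
STOP_CODONS = {"UAA", "UAG", "UGA"}  # the three standard stop codons

# Every possible triplet: 4 * 4 * 4 = 64 codons.
all_codons = ["".join(triplet) for triplet in product(BASES, repeat=3)]
sense_codons = [codon for codon in all_codons if codon not in STOP_CODONS]

# The two natural recoding events named in the text: a stop codon
# that can also be read as an amino acid.
NATURAL_RECODING = {"UGA": "Sec", "UAG": "Pyl"}

print(len(all_codons), len(sense_codons), len(STOP_CODONS))  # 64 61 3
for codon, amino_acid in NATURAL_RECODING.items():
    print(codon, "is normally a stop codon:", codon in STOP_CODONS,
          "| can encode:", amino_acid)
```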
Early work on introducing UAAs focused mainly on chemical modification of the thiol group of cysteine and the ε-amino group of lysine, and researchers developed a series of synthetic methods for post-translational modification based on these two amino acids. This is still far from meeting the need to explore and develop protein functions. Many technologies are now available for inserting UAAs, also called noncanonical amino acids (ncAAs), into proteins. For example, the selective pressure insertion method uses a strain that cannot synthesize one or more amino acids. During culture, an ncAA with a structure and chemical properties similar to those of the missing amino acid is added so that the host's endogenous translation machinery recognizes and uses it. Although methods such as selective pressure insertion and in vitro aminoacylation can introduce ncAAs into proteins, their selectivity and overall insertion efficiency are low, which limits their application: selective pressure insertion can only introduce amino acids of specific structures, and the product of in vitro aminoacylation is consumed quickly and used inefficiently.
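The limited selectivity of selective pressure insertion can be illustrated with a toy model: because the endogenous machinery treats the analog as if it were the missing amino acid, every position of that residue is replaced, not one chosen site. The sequence below is hypothetical, and the choice of methionine (M) with azidohomoalanine (Aha) as its analog is an assumed example, not one taken from the text.

```python
def residue_specific_replace(protein: str, target: str, analog: str) -> list[str]:
    """Replace every occurrence of `target` with `analog`.

    Models selective pressure insertion: the host cannot tell the analog
    from the missing amino acid, so substitution is global.
    """
    return [analog if residue == target else residue for residue in protein]


protein = "MKTMAMGL"  # hypothetical one-letter sequence
product = residue_specific_replace(protein, target="M", analog="Aha")
positions = [i for i, residue in enumerate(product) if residue == "Aha"]

print(product)
print("modified positions:", positions)
```

All three methionine positions are modified at once, which is why this route cannot target a single designated site.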
* Unnatural amino acids for tRNA synthetase insertion:
Catalog | Name | CAS | Price |
BAT-005802 | 2-Aminoisobutyric Acid | 62-57-7 | Inquiry |
BAT-004069 | N-Methyl-L-alanine hydrochloride | 3913-67-5 | Inquiry |
BAT-007844 | 4-(Aminomethyl)-L-phenylalanine | 150338-20-8 | Inquiry |
BAT-005712 | O-Phospho-L-tyrosine | 21820-51-9 | Inquiry |
BAT-007839 | 3-Nitro-L-tyrosine | 621-44-3 | Inquiry |
BAT-004075 | N-Methyl-L-valine hydrochloride | 2480-23-1 | Inquiry |
BAT-005611 | L-Thyronine | 1596-67-4 | Inquiry |
BAT-005615 | L-α-Aminobutyric acid | 1492-24-6 | Inquiry |
BAT-004174 | O-tert-Butyl-L-serine | 18822-58-7 | Inquiry |
BAT-005598 | L-Phenylglycine | 2935-35-5 | Inquiry |
BAT-007853 | 4-Amino-L-phenylalanine | 943-80-6 | Inquiry |
BAT-006759 | 2-Nitro-L-phenylalanine | 19883-75-1 | Inquiry |
The newly developed genetic codon expansion technology uses an orthogonal translation system to reassign codons, which allows efficient site-specific insertion of ncAAs into proteins. Several kinds of codon expansion have been developed: reassignment of stop codons, quadruplet (or quintuplet) codons, reassigned sense codons, and special codons containing unnatural bases. Most of these options carry a cost. Quadruplet codons require modified ribosomes, exploiting the degeneracy of sense codons requires redesigning the genome, and unnatural base pairs, although they can expand the codon table, are still being studied for stability and fidelity. Stop codons, by contrast, occur far less often than sense codons in protein-coding sequences and do not encode any amino acid, so assigning a stop codon to an ncAA does not compete with endogenous amino acids and is easier to achieve. Codon expansion based on stop codons is the earliest and most mature technology for the site-specific introduction of ncAAs. To date, more than 200 unnatural amino acids have been inserted, over 60 of them by engineering the orthogonal tyrosyl-tRNA synthetase/tRNA pair (MjTyrRS/tRNATyr) of the archaeon Methanococcus jannaschii (M. jannaschii).
The application of codon expansion technology in biology is essentially the application of the reactive groups it introduces. Chemoselective groups are often used in bioorthogonal labeling reactions, which improve our understanding of the structure-activity relationships of proteins and help explore the biochemical processes in which proteins take part. Examples include copper-catalyzed azide-alkyne click chemistry and its metal-free, strain-promoted counterpart (SPAAC); Staudinger ligation between azides and phosphines; inverse electron-demand Diels-Alder (IEDDA) reactions; photo-click reactions between tetrazoles and alkenes or alkynes; and metal-mediated olefin metathesis and Suzuki coupling. These reactions modify proteins chemically at the positions where UAAs place the reactive groups. The same reactive groups can also be used to link proteins of different structures. Through these chemical modifications, the mechanisms of protein post-translational modification can be explored and new physical and chemical properties can be given to proteins. By introducing side-chain groups as probes, research on protein conformation, localization, and intramolecular interactions can be taken further.