Recombinant Proteins Background
Recombinant protein production has become an essential tool for supplying adequate amounts of a protein of interest, both for research (for example, studies of protein structure and function) and for therapy. Moreover, scaled-up facilities produce recombinant vaccines, hormones, antibodies, growth factors, blood components, and enzymes. Advances in genetic engineering have made it possible to express target proteins in recombinant form in different expression hosts, including bacterial, fungal, and other eukaryotic host cells.
Recombinant proteins in Escherichia coli systems
Among all of these expression systems, the enterobacterium Escherichia coli is the most widely used. The main reasons for its extensive use are the detailed knowledge of its genetics (a large number of cloning vectors and mutant host strains are commercially available), ease of use, low cost, and a high yield of the target protein. However, E. coli also has several disadvantages as a host for recombinant protein production. For example, many of the post-translational modifications found in eukaryotes, such as N- and O-glycosylation, amidation, hydroxylation, myristoylation, palmitoylation, or sulfation, are absent in E. coli, which limits its application. In addition, high expression levels of recombinant protein often lead to the accumulation of aggregated, insoluble protein in the form of inclusion bodies in the bacterial cytoplasm. A high translation rate can be a serious problem when the target protein is a heterologous molecule. Thus, soluble expression and native purification of the target protein in E. coli remain an important bottleneck in recombinant protein production. Nevertheless, if the protein to be expressed is cytoplasmic, lacks the above-mentioned post-translational modifications, has few disulfide bonds, and does not have a multidomain composition, E. coli is the recommended host for the first protein production trials.
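The selection criteria in the last sentence can be written down as a simple checklist. The following Python sketch is only an illustration: the class name, the field names, and in particular the cutoff for "few" disulfide bonds are assumptions, since the text gives no number.

```python
from dataclasses import dataclass


@dataclass
class TargetProtein:
    """Properties of a target protein relevant to host choice (hypothetical fields)."""
    cytoplasmic: bool
    needs_eukaryotic_ptms: bool  # e.g. glycosylation, amidation, sulfation
    disulfide_bonds: int
    multidomain: bool


# Assumption: "few" disulfide bonds is read here as at most 2.
MAX_DISULFIDE_BONDS = 2


def ecoli_first_trial(protein: TargetProtein) -> bool:
    """Return True if all four criteria from the text favor E. coli for first trials."""
    return (
        protein.cytoplasmic
        and not protein.needs_eukaryotic_ptms
        and protein.disulfide_bonds <= MAX_DISULFIDE_BONDS
        and not protein.multidomain
    )
```

A protein that fails any one criterion is simply flagged for evaluation in another host; the function does not rank the alternatives.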
Production of recombinant protein in E. coli, whether for biochemical analysis, therapeutics, or structural studies, depends mainly on two crucial steps: (i) soluble expression of the target protein; and (ii) purification and stabilization of a functional molecule.
In the past three decades, considerable efforts have been made to improve the production of soluble and functional recombinant protein. These advances include the development of different expression strains, a wide variety of plasmids under the control of different promoters, and the use of special tags. Co-expression of the target protein with molecular chaperones or folding modulators has also been employed, as has the introduction of mutations in the target gene. Additionally, growth temperature, cell density at induction, and media composition are important variables that are evaluated to improve the solubility and purification of the target protein. Soluble does not always mean functional: quite often the protein forms soluble aggregates that may be unfolded, inactive, and/or difficult to crystallize, which makes the soluble protein useless. Therefore, it is also important to characterize the aggregation state of the target protein after expression. Analytical gel filtration and/or static or dynamic light scattering can be used for this purpose.
Recombinant proteins in mammalian cell systems
In 1987, human tissue plasminogen activator (tPA) became the first recombinant protein produced in mammalian cells to obtain market approval. Since then, the biopharmaceutical market has grown steadily, and by now over 200 biopharmaceuticals have reached the market. The most prominent classes of biopharmaceuticals are monoclonal antibodies (mAbs), hormones, growth factors and diverse fusion proteins. tPA (Activase) is a serine protease that converts plasminogen to plasmin and is therefore used clinically as a thrombolytic agent. In 1989, erythropoietin (EPO, Epogen), a factor that stimulates red blood cell production, was approved and became the first recombinant product to reach blockbuster status, with sales of over $1 billion a year. During the following years, mAbs became the bestselling class of biologics, with annual sales of approximately $18.5 billion on the US market in 2010. The top three were Remicade, Avastin and Rituxan, with individual sales of up to $3.3 billion in 2010.
All of these recombinant biopharmaceuticals have to meet high quality standards, which are controlled by regulatory authorities, to ensure patient safety and the best possible effect. Product quality and functionality are therefore a major issue to be addressed during recombinant production, especially as the complexity of the protein increases.
The proper function of a protein depends crucially on correct post-translational modifications including, for example, phosphorylation, formation of disulfide bonds and glycosylation. Glycosylation is particularly important because it significantly influences the stability and solubility of proteins in serum by inhibiting polymerization, denaturation and degradation by proteases. Sugar residues are added to the nascent protein in different cell organelles as N- or O-glycosylation. O-glycosylation occurs in the Golgi apparatus, where sugar moieties are attached to serine or threonine. N-glycosylation is a more complex process, in which a precursor oligosaccharide is linked to asparagine by oligosaccharyltransferase in the endoplasmic reticulum (ER). In additional steps, this preliminary structure is then processed by different enzymes in the ER and the Golgi apparatus.
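As a small illustration of the asparagine linkage described above, candidate N-glycosylation sites are commonly located by scanning a sequence for the consensus sequon N-X-S/T, where X is any residue except proline. The sequon rule is general background rather than something stated in this text, and the function name and test sequence below are hypothetical. A sequon only marks a potential site; whether it is actually glycosylated depends on the host and on the protein's structure.

```python
import re

# N-X-S/T sequon with X != P. The lookahead keeps the match one residue long,
# so overlapping sequons (e.g. "NNTS") are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_n_glyc_sequons(sequence: str) -> list[int]:
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


# Hypothetical peptide: N2 (N-G-S) and N9 (N-A-T) qualify, N6 (N-P-T) does not.
print(find_n_glyc_sequons("ANGSANPTNAT"))  # [2, 9]
```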
Different host platforms such as bacteria, yeast, insect cells, plants or mammalian cells have been evaluated for their potential to produce high quality recombinant proteins. Bacteria and yeast have proven superior in terms of final product concentration, maximal cell density and growth rate. Nevertheless, mammalian cells are the host of choice for the production of complex proteins because of their ability to form a human-like glycosylation pattern. In contrast, products from bacteria show little or no glycosylation, yeast cells produce products with high-mannose forms, and major differences are also noted for other host systems such as plants or insect cells.
To meet product quality needs, several research groups have developed strategies for genetically engineering yeast, plants and Escherichia coli by modifying or creating a glycosylation pathway to produce more human-like proteins. Nevertheless, mammalian systems have so far remained the predominant expression platform for complex recombinant proteins.
Among mammalian production hosts, Chinese hamster ovary cells (CHO), mouse myeloma cells (NS0), human embryonic kidney cells (HEK293), baby hamster kidney cells (BHK) and many others have been used for the production of recombinant proteins. Although many have proven valuable, CHO is still the most used host system, with 60–70% of all recombinant products being produced in this cell type. This predominance is mainly related to the low susceptibility of CHO cells to infection by human pathogenic viruses, the early availability of amplification markers (DHFR clones), ease of recombinant DNA integration, good growth characteristics, high specific productivity and, of course, the long track record of products accepted by the FDA and EMA. In addition, cells from non-human species are considered less likely to transmit pathogens to humans, because such pathogens would have to cross the species barrier from hamster to human.
Recombinant proteins in yeast systems
Yeast species have been popular industrial hosts for recombinant protein production because they combine the advantages of unicellular organisms (i.e., ease of genetic manipulation and rapid growth) with the ability to perform eukaryotic post-translational modifications. Unlike more complex eukaryotic organisms, yeast expression systems are economical, can rapidly reach high cell densities, produce high protein titers and do not contain pyrogens, pathogens or viral inclusions.
Designing the optimal system for recombinant protein production involves many crucial steps: (1) selecting a host strain that enables proper folding and post-translational modifications, (2) choosing a suitable vector (episomal or integrative) with an appropriate promoter (constitutive, inducible or repressible) and selectable marker, (3) codon-optimizing the gene, (4) fusing the gene to an epitope tag if necessary for affinity purification or detection of the recombinant protein, (5) choosing the signal sequence that targets the recombinant protein to the intracellular compartment or the extracellular medium, (6) preventing proteolytic cleavage of the product, (7) designing the fermentation medium (carbon and nitrogen sources, induction conditions), and (8) optimizing the bioprocess parameters (temperature, pH, oxygen transfer).
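Step (3) can be illustrated with the simplest form of codon optimization, in which every amino acid is encoded by a single codon that the host is assumed to prefer. The Python sketch below is a minimal illustration of that "one amino acid, one codon" strategy, not a description of any particular tool; the function name is hypothetical, and the small preference table is a placeholder rather than measured codon-usage data for any yeast. Practical workflows also weigh factors such as mRNA structure and unwanted restriction sites, which are ignored here.

```python
def back_translate(protein: str, preferred: dict[str, str]) -> str:
    """Build a coding sequence using one preferred codon per amino acid."""
    missing = sorted(set(protein) - set(preferred))
    if missing:
        raise KeyError(f"no preferred codon for: {', '.join(missing)}")
    return "".join(preferred[residue] for residue in protein)


# Placeholder table for illustration only. Each codon does encode the listed
# residue in the standard genetic code, but which synonymous codon a given
# host prefers must be taken from that host's codon-usage data.
EXAMPLE_PREFERRED = {"M": "ATG", "K": "AAG", "L": "TTG", "*": "TAA"}

print(back_translate("MKL*", EXAMPLE_PREFERRED))  # ATGAAGTTGTAA
```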
Saccharomyces cerevisiae, the first and best characterized yeast expression system, was developed in the 1980s and benefited greatly from its traditional use in baking, brewing and wine making. However, numerous cases of plasmid instability, low protein yields and hyperglycosylation of proteins have limited the number of commercial products from S. cerevisiae on the market. Moreover, S. cerevisiae produces proteins whose N-linked glycans terminate in α-1,3-linked mannose residues, which are considered to be allergenic.
These issues have led to the development of alternative expression systems that are now well established, including the two methylotrophic yeasts Pichia pastoris and Hansenula polymorpha, the budding yeast Kluyveromyces lactis, the fission yeast Schizosaccharomyces pombe, and the two dimorphic yeasts Arxula adeninivorans and Yarrowia lipolytica. In addition, host strains are being engineered to perform more humanized N-glycosylation. This was accomplished first in P. pastoris and followed by initial studies in H. polymorpha, Y. lipolytica, K. lactis and S. pombe, opening the route for yeasts to become major industrial hosts for therapeutic proteins. However, no single yeast expression system can provide all the desired properties for recombinant protein production.
Recombinant proteins in plant systems
A broad array of plants has been used in medicine for thousands of years. Thanks to rapid progress in genetic engineering, it was confirmed about 25 years ago that plants are also capable of producing recombinant proteins. In 1990, the first recombinant protein with potential therapeutic use, human serum albumin, was expressed in potato and tobacco leaves, as well as in cell suspension cultures. Since then, hundreds of recombinant proteins have been successfully expressed in a wide variety of plants, but the development of molecular farming, which was supposed to revolutionize the production of biopharmaceuticals, has slowed down. As a result, only a few plant-derived recombinant proteins are currently registered as pharmaceuticals. Compared to animal, bacterial, and yeast cells, the commercial use of plant cells for protein production has started only recently. However, thanks to their advantages in protein processing, plant cells are becoming an accepted alternative to mammalian and microbial platforms. Plants lack pathogens similar to those that affect animals and humans, which results in higher safety. Recombinant proteins are also more stable in plants than in other hosts. Furthermore, plant expression systems are highly scalable and rapid. Therefore, plant-based protein production systems may play an important role in the fast production of large amounts of medicines, such as vaccines during an influenza epidemic. Unlike microbes, higher plants are capable of producing proteins with the desired N-glycosylation (human-like glycomodification) and folding. Moreover, plant cells can produce substances that would be toxic to mammalian or bacterial cells. In addition, unlike mammalian cells, plants are insensitive to slight changes in conditions such as pH, temperature, and availability of metabolites, and their protein yield is high.
Compared to mammalian or other animal cells, upstream processing in plants and plant cells offers a wider range of methods and a higher diversity of species. The cultivation of plants requires lower infrastructure costs because it can use an existing agricultural base. Moreover, storing plant material that contains recombinant proteins (e.g. seeds) is easier and cheaper than storing host cells from other organisms. Seeds can be stored at room temperature for long periods of time. Unlike other recombinant protein platforms, plant-based production can be scaled up to agricultural levels by personnel with less specialized training. The use of plant platforms is economically justified by their production costs, which are the lowest among the recombinant protein hosts. It is estimated that, compared to other systems, the cost of protein production in plants could be 10-50 times lower.
In spite of the many advantages of recombinant protein production in plants, the biotech industry still relies on a small number of standardized technologies. This situation is caused by a few barriers that restrict the clinical development and commercialization of plant-derived pharmaceutical proteins. However, a few plant-derived pharmaceuticals, including the enzyme glucocerebrosidase, interferon alpha 2b, and insulin, are proceeding toward commercialization. The economic efficiency is closely connected to the type of protein produced in plants. To bypass regulatory difficulties, several companies have focused on the production of non-clinical proteins, such as technical reagents, enzymes, and diagnostic proteins, which are commercially successful. Because the process is new, its efficiency still requires improvement, especially in the area of biotechnological procedures: accelerating and optimizing the process, as well as generating new products. Furthermore, problems still occur in meeting the high aseptic standards of biopharmaceutical production, which are tailored to platforms based on animal and microbial cells. When proteins are produced in whole plants, complying with Good Manufacturing Practice (GMP) rules is usually challenging because of the differences in cultivation. There is strong demand for methods that allow plant cultivation under highly standardized conditions, with fewer pollutants and improved logistics.