Expressing and purifying recombinant proteins for therapeutic development or for in vitro studies can be a real challenge. Obtaining high yields of pure, active recombinant protein requires a substantial investment of both time and money. My aim in this post is to give you a simple guide to establishing optimal conditions for the expression and production of problematic, unstable proteins of interest in E. coli.
E. coli is the most widely used protein production system for pharmaceutical target studies. Its main advantages are that bacterial protein production is cheap and delivers protein rapidly. For problematic proteins, the overall optimisation process begins with bioinformatic analysis.
Indeed, analysing the biological and physico-chemical properties of the target protein is the first step to keep in mind. This can be done with online bioinformatic resources or with dedicated software. These tools analyse the primary sequence (MW, pI…) and can also predict secondary and tertiary structures, domains, signal peptide localisation, potentially disordered regions, solubility, stability and hydrophobicity. With this information in hand, you can better identify the experimental strategy you need to adopt.
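As a rough illustration of this first-pass sequence analysis, here is a minimal Python sketch that computes the average molecular weight, the mean hydropathy (GRAVY) and the strongly hydrophobic stretches of a protein sequence. The residue masses and the Kyte-Doolittle hydropathy scale are standard published values rather than anything taken from this post, the function names are hypothetical, and the window size and threshold are only a commonly used rule of thumb. pI is left out because the result depends on which pKa set you choose.

```python
# Average residue masses in Da (amino acid minus one water molecule).
RESIDUE_MASS = {
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167, "V": 99.1326,
    "T": 101.1051, "C": 103.1388, "L": 113.1594, "I": 113.1594, "N": 114.1038,
    "D": 115.0886, "Q": 128.1307, "K": 128.1741, "E": 129.1155, "M": 131.1926,
    "H": 137.1411, "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # added once for the free N- and C-termini

# Kyte-Doolittle hydropathy scale (positive = hydrophobic).
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5, "Q": -3.5, "E": -3.5,
    "G": -0.4, "H": -3.2, "I": 4.5, "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8,
    "P": -1.6, "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def molecular_weight(sequence: str) -> float:
    """Average molecular weight (Da) of an unmodified linear polypeptide.

    Raises KeyError for anything other than the 20 standard residues.
    """
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(RESIDUE_MASS[aa] for aa in seq) + WATER_MASS


def gravy(sequence: str) -> float:
    """Grand average of hydropathy: mean Kyte-Doolittle score over the sequence."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("empty sequence")
    return sum(HYDROPATHY[aa] for aa in seq) / len(seq)


def hydrophobic_windows(sequence: str, window: int = 19, threshold: float = 1.6):
    """Return (start_index, mean_score) for every window above the threshold.

    The defaults (19 residues, 1.6) are an assumed rule of thumb for spotting
    very hydrophobic, possibly membrane-spanning, stretches.
    """
    seq = sequence.upper()
    hits = []
    for start in range(len(seq) - window + 1):
        mean = sum(HYDROPATHY[aa] for aa in seq[start:start + window]) / window
        if mean > threshold:
            hits.append((start, mean))
    return hits
```

A strongly positive GRAVY value or several hits from hydrophobic_windows would be one early hint that solubility needs attention. Dedicated tools remain the better choice for disorder, signal peptide and structure prediction.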
Once you’ve collected all this information and you suspect, or have identified, a difficult-to-express protein, 7 additional parameters can be adapted to optimise recombinant protein expression in E. coli, or even to prevent problems that might appear during the production process.
Tips to remember to optimise protein expression in E. coli
#1. Choose the most suitable bacterial strain
Several strategies and innovative approaches have been developed over the last decades to enhance the quantity and quality of proteins produced in E. coli. A plethora of bacterial strains is available on the market, each with specific characteristics for producing toxic or membrane proteins, proteins with rare codons, or proteins with disulfide bonds (see my previous post “Which bacterial strains for recombinant protein expression?” for some examples).
#2. Choose the right expression vector
Expression vectors combine several elements: replicons, promoters, selection markers, cloning sites and fusion partners.
The number of plasmids available on the market is huge, so choosing the right one for your recombinant protein is a real challenge. Each promoter has its own characteristics, for example:
- The lac promoter, a key component of the lac operon, and its derivative lacUV5 (Müller-Hill, 1996; Makoff and Oxer, 1991; Deuschle et al., 1986). Examples of commercial plasmids that use the lac or tac promoters to drive protein expression are the pUC series (lacUV5 promoter, Thermo Scientific) and the pMAL series of vectors (tac promoter, NEB).
- The T7 promoter, present in the pET vectors (pMB1 ori, medium copy number, Novagen), which is extremely popular for recombinant protein expression. This isn’t surprising, as the target protein can represent 50% of the total cell protein in successful cases (Graumann and Premstaller, 2006; Baneyx, 1999). In this system, the gene of interest is cloned downstream of a promoter recognised by the phage T7 RNA polymerase (T7 RNAP).
- The araPBAD promoter, present in the pBAD vectors (Guzman et al., 1995). It is under positive control, which gives it lower background expression levels (Siegele and Hu, 1997). Interestingly, the AraC protein has the dual role of repressor and activator.
- The pL promoter, which is also widely used. The gene of interest is placed under the control of this regulated phage promoter: the strong leftward promoter (pL) of phage lambda, which directs expression of early lytic genes (Dodd et al., 2005). The promoter is tightly repressed by the λcI repressor protein, which sits on the operator sequences during lysogenic growth.
(see Fig. 1)
#3. Optimise the codons
The vast majority of amino acids are encoded by multiple codons, which means that several tRNAs can correspond to one amino acid. In a given cell, some of these tRNAs are much more abundant than others carrying the same amino acid.
Codon optimisation means switching the codons used in a transgene without changing the amino acid sequence it encodes. This typically increases the abundance of the protein, because rare codons are replaced with codons that are abundant in the host organism (Gustafsson et al., 2012). Optimising the codon usage of the gene of interest for the chosen heterologous expression system is widely accepted as a way to improve its translation rate by exploiting the natural ratios of tRNAs in the host. Nevertheless, the benefit is not systematic for all target proteins (Maertens et al., 2010; Gustafsson et al., 2012).
Indeed, translation elongation is a non-uniform process, and its speed seems to be modulated to give the domains of multi-domain proteins the time they need to fold correctly. Optimising the codon usage increases the elongation speed along the whole sequence, which raises the chances of obtaining partially folded species that are either degraded or sent to inclusion bodies. Novel algorithms need to be developed to take this problem into account.
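To make the idea concrete, here is a minimal Python sketch of the simplest form of codon optimisation: locating rare codons in a coding sequence and swapping each one for a common synonymous codon. The rare-codon list and the chosen replacements are a commonly cited set for E. coli and are an assumption here, not something taken from this post; the function names are hypothetical too. The sketch deliberately touches only the rare codons and leaves the rest of the sequence alone, and it does nothing about the folding-related pauses discussed above.

```python
# Assumed set of codons commonly described as rare in E. coli, each mapped to
# a frequently used synonymous codon (same amino acid, so the protein is unchanged).
RARE_TO_COMMON = {
    "AGG": "CGT", "AGA": "CGT", "CGA": "CGT", "CGG": "CGT",  # Arg
    "ATA": "ATT",                                            # Ile
    "CTA": "CTG",                                            # Leu
    "CCC": "CCG",                                            # Pro
    "GGA": "GGC",                                            # Gly
}


def split_codons(cds: str) -> list:
    """Split an in-frame coding sequence (DNA or RNA letters) into DNA codons."""
    cds = cds.upper().replace("U", "T")
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    return [cds[i:i + 3] for i in range(0, len(cds), 3)]


def rare_codon_positions(cds: str) -> list:
    """Return the 0-based codon indices that hold a rare codon."""
    return [i for i, codon in enumerate(split_codons(cds)) if codon in RARE_TO_COMMON]


def replace_rare_codons(cds: str) -> str:
    """Swap every rare codon for its common synonym; other codons are kept as-is."""
    return "".join(RARE_TO_COMMON.get(codon, codon) for codon in split_codons(cds))
```

Running rare_codon_positions first shows whether rare codons are scattered or clustered, which can help decide between re-synthesising the gene and simply using a strain that supplies the matching rare tRNAs.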
#4. Use lower expression temperatures
Growing bacteria at reduced temperatures is a well-documented way to limit protein aggregation. It slows down protein synthesis and folding kinetics, and weakens the hydrophobic interactions involved in protein self-aggregation (Schumann and Ferreira, 2004; Sorensen and Mortensen, 2005).
Low cultivation temperatures can also reduce protein degradation, because the heat shock proteases that are usually induced during protein overproduction in E. coli are poorly active in the cold (Chesshyre and Hipkiss, 1989). This strategy does have some drawbacks: lowering the temperature also slows replication, transcription and translation, and decreases bacterial growth and protein production yields. Nevertheless, these limitations can be circumvented by using cold-inducible promoters that maximise protein production under low temperature conditions (Mujacic et al., 1999).
#5. Produce the protein in suitable media and conditions
Recombinant protein expression can also be optimised through the composition of the culture medium and the additives used.
The experimental conditions summarised in Fig. 2 were tested in our R&D facilities at tebu-bio’s headquarters (Le Perray-en-Yvelines, France). The aim was to determine the best bacterial growth medium composition(s) and the optimal expression conditions for viral protein X fused to the MBP tag, the main purpose being to enhance the solubility of protein X. This MBP-fused viral protein of 62 kDa is particularly unstable, with:
- 2 predicted disordered regions
- a very hydrophobic core
- 4 disulfide bonds
For this purpose, we tested 72 different expression conditions with 6 different media and then analysed the results. The data clearly show that the quantity of protein X expressed was greatly enhanced in SuperBroth and Magic Media in the presence of a cocktail of additives including magnesium, compared to the amount produced in classical media (LB, 2YT, TB…) (see Fig. 2). But each protein is a particular case, and the optimal conditions won’t be the same for all proteins. These optimisation steps are therefore crucial for increasing protein expression yields.
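A screen of this kind can be laid out as a full factorial grid. The Python sketch below shows one way to enumerate 72 conditions from 6 media. Only the number of media, the five media names and the total of 72 come from this post: the sixth medium is not named, so it appears as a placeholder, and the other factors and their levels (temperature, inducer concentration, additive cocktail) are purely hypothetical choices made so that the grid multiplies out to 72. They are not the conditions actually used in Fig. 2.

```python
from itertools import product

# Five media are named in the post; the sixth is not, hence the placeholder.
MEDIA = ["LB", "2YT", "TB", "SuperBroth", "Magic Media", "medium_6_unspecified"]

# Hypothetical factor levels, chosen only so that 6 x 3 x 2 x 2 = 72.
TEMPERATURES_C = [18, 25, 37]
INDUCER_MM = [0.1, 1.0]
ADDITIVE_COCKTAIL = [False, True]


def build_conditions() -> list:
    """Enumerate every combination of the factor levels (full factorial design)."""
    return [
        {"medium": m, "temperature_c": t, "inducer_mm": i, "additives": a}
        for m, t, i, a in product(MEDIA, TEMPERATURES_C, INDUCER_MM, ADDITIVE_COCKTAIL)
    ]


def rank_conditions(conditions: list, soluble_yield: list) -> list:
    """Sort conditions from highest to lowest measured soluble yield.

    soluble_yield holds one measurement per condition, in the same order.
    """
    if len(conditions) != len(soluble_yield):
        raise ValueError("need exactly one yield value per condition")
    paired = sorted(zip(soluble_yield, conditions), key=lambda p: p[0], reverse=True)
    return [condition for _, condition in paired]
```

Once the yields have been measured, rank_conditions(build_conditions(), measured_yields) returns the grid ordered from best to worst, which makes it easy to see whether the top conditions share a medium or an additive.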
#6. Add stabilising sequences
Several tags can also be used to improve protein solubility and purification efficiency.
a. Big stabilising and solubilising tags
The fusion tag can be added at the N- and/or the C-terminus of the protein. The first tags (initially developed to enhance the solubility of the protein) were Protein A (280 amino acids) and LacZ (1024 amino acids).
Later, a plethora of tags with a wide range of characteristics were developed and are now available to protein experts, such as SUMO, MBP, GST, GFP and Trx (see Tab. 1).
In addition, Strep-tag II and Fh8, which are small tags of 8 and 69 amino acids respectively, are also able to enhance protein solubility (see Tab. 1). Moreover, Strep-tag II does not interfere with membrane translocation or protein folding.
Since no single tag works for every protein, the best way to proceed is to run a high-throughput screen (HTS) to determine which tag is the most efficient for your protein of interest.
b. Small peptides composed of single amino acids
Peptide sequences consisting of a single amino acid type (poly-amino-acid peptides) can help to partly overcome protein solubility problems. When added at the C-terminus of the protein, these sequences tend to amplify the adhesive, aggregation, polymerisation and solubility properties of the amino acid they are made of (have a look at these articles: “Analysis of amino acid contributions to protein solubility using short peptide tags fused to a simplified BPTI variant” by Mohammad Monirul Islam et al., 2012, and “Analysis of protein aggregation kinetics using short amino acid peptide tags” by Monsur Alam Khan et al., 2013).
#7. Co-express molecular chaperones, folding modulators or fusion partner proteins
The major limitation of the E. coli protein expression system is the absence of post-translational modifications (PTMs), which play an important role in protein lifetime and function (see these previous posts related to PTMs).
Identifying the interacting sequences of the target protein’s partners can also help the protein to fold properly, which has a direct impact on its conformation, solubility and activity.
For example, co-expressing a protein of interest with human Jun N-terminal kinase 1 induces its phosphorylation, and co-expressing it with ubiquitin and a ubiquitin ligase can induce its ubiquitination.
Additionally, methylation, myristoylation and acetylation have been successfully performed in E. coli by co-expressing a methyltransferase, a myristoyltransferase and an acetylase, respectively. Co-expressing heat shock proteins (Hsp), a protein partner, or a peptide corresponding to the interacting region of the partner can also help the protein to fold.
To conclude, each protein is a particular case. Expressing an unstable protein in E. coli is not easy, and there are no “magic recipes”.
Nevertheless, solutions exist: take into account the protein’s specific physico-chemical properties, run an efficient design of experiments (DoE), and implement new strategies and/or use innovative reagents. Above all, it takes time to screen the results and properly interpret all the data to make this quest possible.
Want to know more about optimising protein expression & purification?
See how tebu-bio helps its clients in their R&D by optimising protein expression in E. coli & HEK 293 systems, together with purification processes and buffer composition, as presented recently at the SBCN 2018 meeting (Bordeaux, France).
Download your copy of the poster “More and Better – Innovative tips and tricks for production and purification of unstable proteins“.
What about you? Are you considering recombinant protein expression, production and purification in E. coli?
Is your target protein unstable, and you need to quickly find optimal protein expression conditions? Our laboratories have already assisted many life scientists in their recombinant protein expression and purification programs. Why not take advantage of this expertise too? Discover our recombinant protein platform and tailored solutions here.
Don’t hesitate to contact our lab specialists directly, or to get in touch with your local tebu-bio office – we’ll be pleased to help you!