Promoter Toolkits to Study and Control Gene Expression across the Proteobacteria

Material Information

Schuster, Layla Angela
University of Florida
Publication Date:

Thesis/Dissertation Information

Doctorate (Ph.D.)
Degree Grantor:
University of Florida
Degree Disciplines:
Microbiology and Cell Science
Committee Chair:
Dias, Raquel
Committee Co-Chair:
Reisch, Christopher R.
Committee Members:
De Crecy, Valerie A
Rice, Kelly C
Bruner, Steven Douglas


Subjects / Keywords:


General Note:
The precise control of gene expression is crucial for the study of gene function and our ability to engineer bacteria. Promoters are essential regulators of gene expression, playing a key role in the modulation of cellular responses to various stimuli and mediating rates of transcription. Promoters can be broadly differentiated into constitutive promoters and inducible promoters. Constitutive promoters are unregulated by transcription factors and express at a constant rate. Transcription from constitutive promoters is dependent on the sequence of the promoter and the growth of the cell. Inducible systems consist of an allosteric transcription factor that binds regulatory DNA near the controlled promoter to change the rate of transcriptional initiation. Inducible promoters can be regulated across a dynamic range of transcription levels, making them powerful tools in bacterial engineering. Synthetic biology applies engineering principles to model and design genetic systems. In this context, promoters are considered modular genetic parts that can be utilized in synthetic systems and assembled into more complex pathways. To be effective genetic tools, promoters should have characterized, predictable activity and be transferrable between species. However, most genetic systems are optimized for use in Escherichia coli and prior to this work, no easy-to-use genetic toolbox was available for controlled gene expression in a wide range of species. Further, one of the most widely used gene expression systems in E. coli, the T7 system, is difficult to move between species and its activity can be unpredictable. At a fundamental level, it is currently not possible to predict promoter function from its sequence nor how activity will change when the promoter is moved to a different host. In the work presented here, we develop a broad-host-range plasmid toolbox that enables the identification of well-regulated promoters in nine species across three classes of Proteobacteria. 
We then expand this toolbox into E. coli and compare the expression from our inducible promoters to the canonical T7 expression system. Finally, we utilize a previously designed library of synthetic constitutive promoters to measure activity across Proteobacterial species to explore relationships between sequence, function, and phylogeny among the closely related species.

Record Information

Source Institution:
Rights Management:
All applicable rights reserved by the source institution and holding location.
Embargo Date:






© 2022 Layla A. Schuster


To my family


ACKNOWLEDGMENTS

First and foremost, I would like to express my deepest gratitude to my advisor, Dr. Chris Reisch, for his guidance, patience, and enthusiasm in my research and writing efforts. I will always appreciate the time and care he took to answer my scientific questions. I would also like to thank the other members of my committee: Drs. Valérie de Crécy-Lagard, Kelly Rice, Steven Bruner, and Raquel Dias. I would like to especially thank Dr. Raquel Dias, who joined my committee in my last semester and gave her time and energy to help me edit this dissertation.

I would also like to thank the members of the Reisch lab for their immeasurable support through the years. Specifically, I would like to thank Dr. Lidimarie Trujillo Roderiguez, Adam Ellington, Catalina Mejia, Dr. Chanel Mosby Haundrup, Dr. Cláudia Alves, and Dr. Karla Franco. Their helpfulness and enthusiasm created a healthy, productive, and fun lab environment that I will miss. In addition to those in my lab, I would like to thank Autumn Dove, Jill Babor, and Dr. Mark Gorelik, members of my graduate cohort who became very close friends.

On a personal level, I would like to thank my wonderful partner, Tom, for being supportive through every up and down. I am also grateful for my dog, Bernie, and cat, Sylvester, for providing amusement and love, and my robot vacuum for keeping me sane. Last but not least, I would like to thank my family, including my dad, Brian, my mom, Layali, and my brother, Adam, for their unending love and support.


TABLE OF CONTENTS

ACKNOWLEDGMENTS
LIST OF TABLES
LIST OF FIGURES
LIST OF OBJECTS
LIST OF ABBREVIATIONS
ABSTRACT

CHAPTER

1 LITERATURE REVIEW
    Synthetic Biology as an Emerging Field
        Principles of Synthetic Biology
        Applications of Synthetic Biology
        Techniques for Cloning and Plasmid Assembly
        Promoters as Tools
    Constitutive Promoters in Synthetic Biology
        Transcriptional Initiation
        Modular Structure of σ70-dependent Promoters
        Investigating the Fundamentals of Gene Expression Through Synthetic Promoter Toolboxes
    Inducible Systems in Synthetic Biology
        Towards the Ideal Inducible System
        The lac Operon
        AraC and RhaS-RhaR-regulated systems in Secondary Carbon Metabolism
        Quorum-sensing Systems
        Inducible Systems for Antibiotic Resistance
        Hybrid Inducible Promoters
        The T7 Expression System
            Design and application
            Drawbacks and alternative solutions
    Rationale of Presented Studies

2 A PLASMID TOOLBOX FOR CONTROLLED GENE EXPRESSION ACROSS THE PROTEOBACTERIA
    Introduction
    Materials and Methods
        Plasmid Construction
        Plasmid Transformations
        Fluorescence Assays
        Antibiotic Assays
        Library Construction and Screening
        Violacein Experiments
    Results
        Plasmid Design
        Plasmid Screens Across the Proteobacteria
        Assessment of Context Dependence
        Controlled Expression of a Physiologically Relevant Gene
        Promoter Libraries Enable Varied Dynamic Range
    Discussion

3 PLASMIDS FOR CONTROLLED AND TUNABLE HIGH-LEVEL EXPRESSION IN E. COLI
    Introduction
    Materials and Methods
        Plasmid Construction and Transformation
        Fluorescence Assays
        Stability Screen Experiments
        Lycopene Experiments
    Results
        Plasmid Design, Tunability, and Context-dependence
        Expression Stability Over Time
        Growth Medium-Dependent Properties
        Multi-plasmid Strains
        Lycopene Production Comparison
    Discussion

4 CHARACTERIZING CONSTITUTIVE PROMOTERS ACROSS THE PROTEOBACTERIA
    Introduction
    Materials and Methods
        Library Construction and Transformations
        Library Picking
        Library Screening
        Promoter Mapping
        Data Analysis
        Promoter Transferability Experiments
    Results
        Constitutive Promoter Library Screens
        Conservation of Core Promoter Sequences
        The Extended -10 Element
        The Spacer Region
        Characterized Promoters with Varying Lengths
        Promoter Transferability
    Discussion

5 CONCLUSIONS AND FUTURE DIRECTIONS

LIST OF REFERENCES
BIOGRAPHICAL SKETCH


LIST OF TABLES

Table
1-1  Inducible systems and their regulation
2-1  Strains investigated in this study
2-2  Primers used in this study
3-1  List of E. coli BL21(DE3) strains tested in Figure 3-5
3-2  Primers used in this study
4-1  Strains investigated in this study
4-2  Strain growth conditions
4-3  Transformation conditions
4-4  Promoter categorization


LIST OF FIGURES

Figure
1-1  Interactions between E. coli RNAP holoenzyme and a σ70-dependent promoter
1-2  Inducible systems controlled by negative, positive, and both negative and positive regulation
1-3  Canonical design of the T7 expression system
2-1  Plasmid toolbox assembly scheme and nomenclature
2-2  Experimental outline and induction screen results
2-3  Induction range of 12 expression systems in nine Proteobacteria
2-4  Measurement of mRFP at titrated inducer concentrations
2-5  Conditionally essential gene to measure tightness of repression
2-6  Expression of total library and select library isolates
3-1  Expression range of 28 plasmids in E. coli
3-2  Expression across titrated inducer concentrations
3-3  Stability over extended passages
3-4  Comparison between T7 lac and toolbox-regulated promoters in E. coli BL21(DE3)
3-5  Expression data from induction experiments of single- and multi-plasmid strains of E. coli BL21(DE3)
3-6  Lycopene production in two strains of E. coli
4-1  Phylogenetic tree of species included in the study
4-2  Range of expression from promoter libraries screened in each species
4-3  Weblogos of core hexamers from high-expression promoter groups
4-4  Logos of non-consensus -10 elements from high-activity promoters in the Alphaproteobacteria
4-5  Violin plots of GC content of full spacer region for 15 species
4-6  Violin plots of GC content of proximal spacer region for 15 species
4-7  High-expressing promoters without clear core elements
4-8  Promoter transferability screens
4-9  Sequences included in promoter transferability experiments
4-10 Phylogenetic tree constructed from


LIST OF OBJECTS

Object
2-1  Supplementary information for A Plasmid Toolbox for Controlled Gene Expression Across the Proteobacteria, PDF 3.2 MB
3-1  Supplementary information for Plasmids for controlled and tunable high-level expression in E. coli, PDF 1.7 MB
4-1  Table of promoter sequences for all species included in this work. PDF, 662 KB
4-2  Weblogos for all species included in this work. PDF, 728 KB
4-3  KpLogos for all species included in this work. PDF, 850 KB


LIST OF ABBREVIATIONS

AI      Artificial intelligence
ATc     Anhydrotetracycline
bp      Base pair
cAMP    Cyclic adenosine monophosphate
CCR     Carbon catabolite repression
CFU     Colony forming units
CPEC    Circular polymerase extension cloning
CRISPR  Clustered regularly interspaced short palindromic repeats
CRP     cAMP receptor protein
DNA     Deoxyribonucleic acid
GOI     Gene of interest
IPTG    Isopropyl β-D-1-thiogalactopyranoside
LB      Lysogeny broth, Luria-Bertani
MPRA    Massively parallel reporter assay
mRNA    Messenger RNA
OC6     N-(3-oxohexanoyl)-homoserine lactone
OD      Optical density
PCR     Polymerase chain reaction
RE      Restriction enzyme
RFU     Relative fluorescence units
RNA     Ribonucleic acid
RNAP    RNA polymerase


Abstract of Dissertation Presented to the Graduate School of the University of Florida in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy

PROMOTER TOOLKITS TO STUDY AND CONTROL GENE EXPRESSION ACROSS THE PROTEOBACTERIA

By

Layla A. Schuster

August 2022

Chair: Raquel Dias
Cochair: Christopher R. Reisch
Major: Microbiology and Cell Science

The precise control of gene expression is crucial for the study of gene function and our ability to engineer bacteria. Promoters are essential regulators of gene expression, playing a key role in the modulation of cellular responses to various stimuli and mediating rates of transcription. Promoters can be broadly differentiated into constitutive promoters and inducible promoters. Constitutive promoters are unregulated by transcription factors and express at a constant rate. Transcription from constitutive promoters is dependent on the sequence of the promoter and the growth of the cell. Inducible systems consist of an allosteric transcription factor that binds regulatory DNA near the controlled promoter to change the rate of transcriptional initiation. Inducible promoters can be regulated across a dynamic range of transcription levels, making them powerful tools in bacterial engineering.

Synthetic biology applies engineering principles to model and design genetic systems. In this context, promoters are considered modular genetic parts that can be utilized in synthetic systems and assembled into more complex pathways. To be effective genetic tools, promoters should have characterized, predictable activity and be transferrable between species. However, most genetic systems are optimized for use in Escherichia coli and prior to this work, no easy-to-use genetic toolbox was available for controlled gene expression in a wide range of species. Further, one of the most widely used gene expression systems in E. coli, the T7 system, is difficult to move between species and its activity can be unpredictable. At a fundamental level, it is currently not possible to predict promoter function from its sequence nor how activity will change when the promoter is moved to a different host.

In the work presented here, we develop a broad-host-range plasmid toolbox that enables the identification of well-regulated promoters in nine species across three classes of Proteobacteria. We then expand this toolbox into E. coli and compare the expression from our inducible promoters to the canonical T7 expression system. Finally, we utilize a previously designed library of synthetic constitutive promoters to measure activity across Proteobacterial species to explore relationships between sequence, function, and phylogeny among the closely related species.


CHAPTER 1
LITERATURE REVIEW

Synthetic Biology as an Emerging Field

Principles of Synthetic Biology

Synthetic biology leverages an engineering perspective to uncover the design principles of natural systems (1). Through the modification of endogenous pathways and the rational design of new genetic circuits, the general objective is to make cells programmable (2). In this engineering framework, biological systems are broken down into discrete functional blocks that can be reassembled into novel pathways. Synthetic biology is focused on four main tenets: modularity, orthogonality, characterization, and standardization (3, 4). Modularity describes genetic parts that can be efficiently assembled into larger systems while retaining their activity independent of changing contexts (5–8). A system is orthogonal when it does not interfere or cross-react with existing cellular processes or other engineered elements in the host (9, 10). Characterization involves defining the parameters of genetic part or system behavior so that it functions in a predictable way (3, 11). Finally, the standardization of genetic parts and methodologies supports the goals of synthetic biology by improving communication, reproducibility, and compatibility (3, 12, 13). Altogether, the principles of synthetic biology underline the necessity of predictable genetic parts and systems to effectively engineer bacteria.

Cells become more programmable when the elements of their regulatory systems are precisely characterized genetic parts that can be modified and repurposed in synthetic networks (14). In building these pathways, synthetic biologists take a bottom-up approach to build
complex biomolecular systems (15, 16). Genetic parts and assembly methods such as (17, 18). While predictable behavior of novel systems relies on the modularity of the genetic parts involved, this is far from guaranteed in living organisms. Adhering to engineering principles while working in biological systems helps to separate the function of genetic elements from biological noise (19). In this way, researchers can mix and match genetic parts within a system and into different hosts to best suit the needs of a given experiment (15, 20).

The bacterial hosts of synthetic biology are often referred to as microbial chassis because the organism is understood as the supporting frame for harboring synthetic elements and circuits within its molecular machinery (21). E. coli is a model organism in microbiology and most genetic tools are optimized for use in an E. coli host (22). Additionally, this species has operated as a microbial workbench for the development of tools that are destined for other hosts (23, 24). Here, the tenets of synthetic biology are essential to ensure that engineered systems can be moved and utilized effectively in the target bacterial chassis. Without standardized parts that are characterized across genetic backgrounds, researchers are limited to ad hoc system construction that can waste time and resources and, importantly, lacks transferability to function in a different context.

Applications of Synthetic Biology

Synthetic biology is translatable across a range of applications in medicine, agriculture, and sustainability. As early as the 1980s, bacteria have been engineered to produce important pharmaceutical ingredients including insulin and monoclonal antibodies (25–27). Genetically engineered viruses have been used to correct defective
genes in certain inherited diseases (28, 29) and cell lines have been engineered to act as in vivo diagnostic tools to sense disease states and generate a therapeutic response (29, 30). In agriculture, synthetic biology has been utilized to improve carbon efficiency in plants (31), decrease the use of nitrogen fertilizers (32), and increase crop nutritional value through initiatives like the Golden Rice Project (33). Synthetic biology is also enhancing bioremediation capabilities through recombinant DNA technology, generating rhizospheric and endophytic bacterial strains that can break down toxic organic compounds (34, 35).

Much of the deployment of synthetic biology for these applications is dependent on the ability to genetically engineer diverse bacterial hosts (22, 25). Species that have adapted to thrive in harsh physiochemical conditions and those that possess versatile metabolisms are particularly valuable (25). In other words, synthetic biology becomes (29). For example, Agrobacterium fabrum possesses the natural ability to transfer DNA into plant genomes, making it an important host for genetic manipulation experiments (36, 37). Pseudomonas putida is utilized as a biotechnological tool due to its ability to degrade organophosphates, pyrethroids, and carbamates (38). P. putida has also shown encouraging results in soil remediation studies (39, 40). Still, the development of genetic tools for non-model organisms is an emerging field. There is a growing demand for standardized engineering toolsets that are characterized across species so that these novel hosts can be investigated efficiently and effectively.

Techniques for Cloning and Plasmid Assembly

Using recombinant DNA technology to construct synthetic systems relies on the ability to effectively join genetic parts in new combinations. Restriction endonucleases,
more commonly known as restriction enzymes (REs), play a fundamental role in DNA manipulation. Though their primary biological role is to protect the host genome against foreign DNA, their utility as a tool for recombining and cloning DNA was recognized only a few years after their discovery (41). This was spurred by reports that some REs make cuts in the DNA that leave ends that are identical and complementary and, with the addition of DNA ligase, DNA can be recombined from different genomes to generate hybrid molecules (42). These enzymes are now divided into type I, II, III and IV DNA restriction-modification systems, with type II being the most relevant to current DNA assembly techniques (42). Type II REs have been used for recombinant DNA construction from the 1970s to the current day, with the BioBrick (43) and SEVA (44) platforms utilizing REs in their plasmid assembly workflows.

Though it continues to be widely used, RE cloning has a number of drawbacks: protocols are low-efficiency, labor-intensive, time-consuming, and, importantly, rely on the availability of restriction sites in specific locations in the cloning vector and gene of interest (GOI) (45). This necessitates the removal of interfering restriction sites within the vector and GOI so that DNA is not cut in unintended locations, hampering the usability of vector platforms that use RE cloning. For example, any new genetic part added to the SEVA platform must not contain 20 different restriction sites which would interfere with REs that are part of the assembly system (44). The process of removing RE recognition sites from genetic parts prior to cloning is referred to as part domestication (46) and can be a cumbersome step in a RE cloning protocol. Recently,
novel approaches to cloning and genetic part assembly have been developed that are highly efficient, using a different subtype of REs or independent of REs entirely.

Golden Gate cloning is a technique for assembling DNA in a single restriction-ligation reaction (47). The strategy was developed as an alternative to site-specific recombination, which relies on a recombinase enzyme to rearrange and join strands of DNA and leaves behind a recombination sequence in the final construct. To avoid this scar site, Golden Gate cloning utilizes type IIS REs, which cleave DNA at a fixed distance outside of their recognition site. When the RE recognition sites are placed outside of the cleavage site relative to the GOI or destination vector sequence, the recognition site is eliminated in the final construct. This allows restriction and ligation to occur in the same reaction and is unidirectional, greatly increasing the efficiency. This strategy was used in the Modular Cloning (MoClo) platform, where multigene constructs are assembled in a hierarchical fashion (48). Though Golden Gate cloning is highly efficient, it still requires genetic part domestication to remove recognition sites of the REs used in cloning.

Around the same time Golden Gate cloning was described, two alternative sequence-independent methods for cloning were developed that utilize the activity of DNA polymerase rather than REs. Circular polymerase extension cloning, or CPEC, uses a single polymerase to assemble and clone DNA fragments into any vector in a single reaction (49). Here, the vector and GOI share overlapping regions at their ends and the method relies on polymerase extension to extend the double-stranded overlap to form a complete plasmid. The protocol for CPEC is the same as that for a one-cycle PCR using a high-fidelity DNA polymerase. After the DNA is denatured in the first step
of the reaction, the vector and GOI DNA hybridize in the annealing step and then polymerase extension occurs using the GOI and vector as template for each other until the full plasmid circle is completed. This leaves a nick in each strand, which is sealed when the plasmid is introduced into E. coli. CPEC is effective for single gene cloning, library cloning, and multi-fragment cloning, though recommended PCR cycles increase depending on library complexity and number of fragments to be assembled. CPEC is convenient and economical as it only requires a single enzyme that most molecular biology labs have on hand. However, polymerase-derived mutations can still be a problem and mispriming is possible anywhere along the GOI and vector sequence.

A DNA assembly method developed by Gibson et al., referred to as Gibson cloning or Gibson assembly, is another sequence-independent method for assembling multiple fragments of DNA (50). The technique uses the combined activities of three enzymes (a 5′ exonuclease, a DNA polymerase, and a DNA ligase) in a one-step isothermal reaction. Here, the DNA fragments for assembly possess shared terminal sequences, and the exonuclease chews back their 5′ ends at the start of the reaction to reveal complementary single-stranded overhangs. The DNA polymerase then fills in the gaps and DNA ligase seals the nicks. The reaction takes place at a constant temperature of 50°C as the exonuclease is heat labile and will not compete with DNA polymerase activity. All three enzymes are also unable to process circular DNA and so plasmid DNA is enriched during the reaction. The original description of Gibson assembly found that the overhangs could be as short as 40 bp and our own work has found that these overhangs can be even shorter (51). Though the mixture of enzymes is costly, Gibson assembly had the highest rate of success
compared to other DNA assembly methods in our hands and was used to assemble the majority of plasmids presented in the work included herein.

Promoters as Tools

Promoters are key regulatory elements of gene expression in bacteria. They can be grouped into inducible and constitutive promoters, where inducible promoters are regulated by a transcription factor to change the rate of transcriptional initiation (41) and constitutive promoters allow continuous expression of the associated gene (42). Understanding, modifying, and controlling promoters is the focus of a large body of work in synthetic biology. To be useful tools, it is essential that promoters have reliable and predictable behavior (11). Promoters that are native to one species are often reutilized in the same host or moved to a different species to reprogram a genetic circuit, but this can have undesirable effects. Cryptic regulatory elements and alternative transcriptional start sites can yield inconsistent levels of gene expression and tie promoter activity to host endogenous transcriptional regulation (54-56). Natural promoters are usually context-dependent, meaning their transcription is affected by surrounding genetic elements (57, 58). For these reasons, synthetic promoters are often utilized for investigating, modifying, and tuning gene expression (44). For increased orthogonality, genetic insulators like RiboJ are often included in the design of synthetic promoters, decreasing the effect of context dependence and standardizing promoter output in different genetic backgrounds (7). While construction of a single synthetic promoter is sometimes done to improve or change a natural promoter (48), more often they are developed and screened as libraries. Libraries of synthetic promoters with a wide range of expression levels are immensely valuable tools, allowing the user to choose the promoter with an output level


most appropriate for their system. Identifying promoters with very high activity is useful for overexpression in hosts engineered as microbial cell factories (60). Additionally, promoters with mid- to low-level expression are useful to match physiological expression levels in complementation studies or balance metabolic flux (51, 56, 61). Synthetic promoter libraries are mainly generated in three ways: i) through PCR amplification with primers containing degenerate bases at targeted locations (11, 14, 62), ii) through mutagenic PCR (63, 64), or iii) by mixing and matching predefined promoter elements (11, 55, 56, 65). These libraries have been used to optimize metabolic flux (66, 67), improve the dynamic range of inducible systems (55, 68, 69), provide toolkits for community use (11, 61, 70), and to investigate the effect of promoter elements on overall expression (65, 71).

Constitutive Promoters in Synthetic Biology

Transcriptional Initiation

Transcription drives the expression of genetic information and transcriptional initiation is the most well-studied step in the regulation of gene expression (72, 73). The core enzyme of bacterial transcription is the multiunit RNA polymerase (RNAP), α2ββ′ω (74, 75). While core RNAP is capable of RNA polymerization, it requires a σ factor for promoter-specific recognition and binding (76). Similarly, σ factors are incapable of binding and unwinding promoter DNA (75) until associated with RNAP in a complex called the holoenzyme (75, 77, 78). Bacteria [...] transcription from housekeeping genes during exponential growth (79, 80). In E. coli, [...] σ70 (80) [...] σ70-dependent promoters includes the


[...] hexamers making up the core promoter, the spacer region, the UP element [...] (Figure 1-1) (81-83).

Transcriptional initiation by the RNAP holoenzyme is a multi-step process. The holoenzyme first binds to the promoter in a closed complex (RPc), driven by interactions between RNAP-σ70 and the promoter while the DNA is in a fully double-stranded, or closed, state (84). RPc is stabilized by interactions between the C-terminal domains of the RNAP α subunit [...] σ factor and core promoter elements (Figure 1-1) (77, 85). The RNAP enzyme has a crab-claw [...] active site cleft (84). The channel is too narrow to fit double-stranded DNA but it can accommodate the DNA template and DNA-RNA hybrid that forms during transcription (86). With the holoenzyme positioned on the promoter in the closed complex, the DNA bends to move into the active site cleft and the process of DNA melting begins.

For transcription initiation to occur, RNAP-σ70 must undergo major conformational changes, or isomerization, to transition to the open state (RPo), which includes unwinding promoter DNA. There are three intermediate steps between RPc and RPo: bubble nucleation, early melting, and bubble propagation (81). The stages of RPo formation correspond to opening of the claw and movement of DNA into the [...]. Specifically, the highly conserved non-template [...] σ70 (87, 88). This initiates the melting process and stabilizes DNA strand separation (87, 88). In early melting, four to five base pairs are unwound and the enzyme claw shifts to


accommodate the transcription bubble while additional interactions with unpaired DNA bases stabilize the complex (81, 89). [...] σ70 sits in the primary channel but during bubble propagation, the region is ejected. The complex now transitions to RPo, where the DNA around the transcriptional start site is fully separated and the +1 nucleotide is in the active site ready for base pairing (90).

Modular Structure of σ70-dependent Promoters

Constitutive promoters are unregulated by transcription factors and express at a constant rate in the cell (53). [...] σ70 initiates transcription from most bacterial promoters during log [...] σ70-dependent promoters are among the best studied and most well understood (75, 91). The σ70 promoters are generally divided into discrete, modular sequence elements (Figure 1-1) and these elements cooperatively interact with the holoenzyme to determine the level of expression (65, 81-83). Analogous [...] σ70 is a highly modular protein that can be segmented into four domains and additional regions based on function and sequence conservation (79, 80). [...] σ70 interact directly with [...] interacts [...] (Figure 1-1) (83, 87, 92-95). In E. coli, the consensus σ70 promoter is TTGACA (-35 hexamer), spacer (17 ± 2 bp), TATAAT (-10 hexamer) (96). In general, promoters with core elements at or near the consensus sequences have higher levels of expression. In E. coli, however, expression is highest when one but not both of the core elements match consensus (65). Specifically, it has been shown that promoter activity increases as the -10 element approaches consensus but the strongest


[...] consensus by a single nucleotide (65).

The spacer region is another important element [...] transcriptional initiation (78, 97, 98). While it does not have a defined consensus sequence (99), higher GC content is associated with lower promoter output (65). Additionally, the spacer plays an important role in the topology of promoter structure. The optimal spacer length is 17 ± 2 bp in E. coli (97, 100), which places the center of the [...] (97, 101). This corresponds [...] σ70 and enables simultaneous contact [...] σ70 to initiate transcription (69, 97, 102, 103).

[...] σ70 to the promoter (104). UP elements consist of AT-rich regions between the -60 and -40 positions relative to the transcriptional start site and [...] RNAP through minor groove DNA interactions (105, 106). Though UP elements are not required for promoter activity, their presence can benefit weaker promoters with non-consensus core elements (107, 108) and they are most prevalent in ribosomal RNA and protein promoters (97). [...] to -13 (109). Similar to UP elements, promoters with core hexamer sequences [...] that rescue activity (102). Interestingly, while a single TGn motif is correlated with high expression, the presence of tandem TG motifs centered around position -16 dampens promoter activity (97). Rather than weakening the promoter, this reduced activity is likely due to increased σ70 binding affinity such that it is immobilized at


the promoter, analogous to the lower expression observed from some promoters with core sequences at consensus (65, 110).

[...] (65), assigning promoter strength based only on proximity to consensus would be an oversimplification (102, 111). As discussed above, it has been well established that [...] σ70 contacts can make up for elements with weaker interactions (65, 102, 112). Promoters do not require all elements to be active; further, the presence of both elements of the core promoter is not a requirement for function (102) [...] (82, 83, 113) [...] (98) [...] (65, 102). Indeed, a recent study demonstrated that sequence recognition by transcriptional machinery is surprisingly permissive. Here, the researchers found that a single mutation in a 100 bp string of random sequences could lead to substantial promoter activity when the downstream genes conferred a fitness advantage (115).

[...] σ70 and its cognate promoter has encouraged mixing, matching, and removing elements as a strategy to study sequence-function relationships (102). Mixing natural and synthetic elements generates hybrid promoters that can achieve targeted expression levels or be used to better understand how the elements contribute to overall strength (116-119). [...] chimeras have been constructed to identify protein-DNA interactions (120, 121) and to [...] (122, 123). Overall, the modular [...] σ70 and σ70-dependent promoters make them readily accessible for engineering and constructing genetic systems (124, 125).
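
The modular layout described above (a -35 hexamer, a spacer of 17 ± 2 bp, and a -10 hexamer) can be illustrated with a short Python sketch that scans a sequence for the window pair closest to the E. coli consensus. This is only a minimal illustration under stated assumptions: the names `best_core_promoter`, `Hit`, and `count_matches` are hypothetical, and the score is a simple count of matching bases. As the text notes, ranking promoters by proximity to consensus alone would be an oversimplification, so the output should not be read as a prediction of promoter strength.

```python
from typing import NamedTuple, Optional

# E. coli sigma70 consensus elements as given in the text.
MINUS_35 = "TTGACA"
MINUS_10 = "TATAAT"
SPACER_LENGTHS = range(15, 20)  # 17 +/- 2 bp


class Hit(NamedTuple):
    start: int       # 0-based index of the first base of the -35 window
    spacer: int      # number of bases between the two hexamers
    matches_35: int  # bases matching TTGACA (0-6)
    matches_10: int  # bases matching TATAAT (0-6)


def count_matches(window: str, consensus: str) -> int:
    """Count positions at which the window agrees with the consensus."""
    return sum(a == b for a, b in zip(window, consensus))


def best_core_promoter(seq: str) -> Optional[Hit]:
    """Return the -35/-10 window pair with the most consensus matches.

    Every start position and every allowed spacer length is tried; ties are
    resolved in favor of the first pair found. Returns None when the sequence
    is too short to hold both hexamers and a spacer.
    """
    seq = seq.upper()
    best: Optional[Hit] = None
    for start in range(len(seq) - 5):
        box35 = seq[start:start + 6]
        for spacer in SPACER_LENGTHS:
            start10 = start + 6 + spacer
            box10 = seq[start10:start10 + 6]
            if len(box10) < 6:
                continue  # the -10 window would run off the end
            hit = Hit(start, spacer,
                      count_matches(box35, MINUS_35),
                      count_matches(box10, MINUS_10))
            if best is None or (hit.matches_35 + hit.matches_10
                                > best.matches_35 + best.matches_10):
                best = hit
    return best


if __name__ == "__main__":
    # Made-up test sequence, not a characterized promoter.
    example = "GG" + "TTGACA" + "A" * 17 + "TATAAT" + "GG"
    print(best_core_promoter(example))
```

A fuller treatment would also need to account for the UP element, the TGn motif, and spacer composition, which the studies discussed here show interact in ways a per-element match count cannot capture.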


Investigating the Fundamentals of Gene Expression Through Synthetic Promoter Toolboxes

Despite decades of research on promoters, there are still fundamental gaps in our understanding of the relationship between sequence and activity. Consensus sequences of core promoter elements in E. coli are well established and mutation of core sequences towards consensus is a commonly employed method to increase activity in weaker promoters (126). However, promoter function is determined by more [...] (65, 127). Indeed, complex interactions between the UP element, promoter, and background promoter sequence all contribute to promoter function (65).

Early attempts to determine the contribution of these elements to overall promoter output involved characterizing the parts individually. Here, researchers leveraged the [...] σ70-dependent promoters to swap element variants into the same promoter context to determine how each affected activity (109, 118, 128). These studies demonstrated that AT-rich spacer sequences, UP elements containing A-tracts, TGn [...]

While the piecewise dissection of promoter elements has contributed to our understanding of promoters, it does not account for higher-order relationships between elements. For this reason, current methods for promoter sequence-activity investigations test element variants systematically across a variety of promoter backgrounds. In practice, this involves constructing and screening libraries of thousands of promoters and quantifying the activity of each variant. This necessitates the use of high-throughput assays such as massively parallel reporter assays (MPRAs), which were developed to test large libraries of genetic variants for transcriptional activity in a


single experiment (65, 129). Here, genetic variants are designed in silico to include barcode sequences within the transcribed region such that in vivo mRNA output can be linked to a specific variant and quantified to determine activity in downstream RNA-seq data analysis (130).

Recently, Urtecho et al. analyzed the elements of E. coli σ70 promoters to determine the contribution each plays to overall promoter strength, employing an MPRA to explore transcriptional activity with RNA-seq (65). The researchers constructed a library of over 10,000 promoter variants and screened discrete sequence elements in every combination. They then used the expression data in E. coli to train a neural network to predict promoter strength and identify higher-order relationships between the elements. Their data suggested that 74% of the variance in promoter strength comes from the -35 and -10 hexamers and that substantial unexplained variance results from more complex interactions among other sequence elements (65). Further analysis suggested that while the UP element and spacer region contribute independently to promoter strength, there is an avidity between the -10 and -35 hexamers (131). This and related studies (71, 132) are at the forefront of promoter research, utilizing artificial intelligence for promoter prediction based on comprehensive datasets of promoter sequence-function relationships.

Inducible Systems in Synthetic Biology

Inducible promoters are one of the many regulatory mechanisms that organisms have evolved to sense and respond to changes in their intra- and extracellular environments (52). Within inducible promoter-regulator pairs, transcription factors link stimulus to response by binding a metabolite or other small-molecule inducer to activate or repress transcriptional initiation. Transcription factors can modulate expression


through negative regulation, positive regulation, or a combination of both (Figure 1-2, Table 1-1) (133). Negative regulation involves transcriptional repressors that bind to regulatory DNA at or near the controlled promoter, called operator sites, to inhibit the activity of RNAP (134). In the absence of inducer, transcription remains in the off state. When inducer is added to the system, it binds allosterically to the repressor and the resulting conformational change prevents the repressor from binding to the operator sites, allowing transcription to occur (133, 135). Positive regulation involves a transcriptional activator that, when bound to inducer, facilitates the binding of RNAP to the promoter and stimulates expression (136-138). Inducible systems are crucial for bacteria to regulate growth stage-dependent physiological responses (139-141), develop antibiotic resistance (135, 142, 143), coordinate intercellular communication (144-146), and to preferentially metabolize substrates that enable rapid growth (147-149).

Inducible systems are powerful tools for understanding bacterial physiology, gene function, and genotype-phenotype relationships (68, 150). Usually, a recombinant gene or operon is placed under the control of an inducible promoter on a multicopy plasmid to be transformed into the bacterial host of interest (136). While work has mainly focused on model organisms like E. coli, Bacillus subtilis, and Saccharomyces cerevisiae, research is rapidly expanding into non-model species both for foundational molecular biology studies and as novel chassis in genetic engineering (21, 151-154). In the context of synthetic biology, inducible promoter-regulator pairs are utilized in complementation studies (155-158), for the overexpression of heterologous proteins


(159-161), and to construct larger metabolic pathways to produce value-added end products (68, 162-164).

Towards the Ideal Inducible System

Inducible systems allow for multiple avenues of control over gene expression. Inducible promoters are turned on with the addition of inducer, enabling temporal control over gene output (52, 69). This is important in coordinating expression to a certain growth phase for protein overproduction (165) and metabolic engineering studies (164, 166). Ideally, the regulated promoter is tightly controlled over a gradient of expression levels, enabling a dynamic range of gene output. While highly active promoters are valuable in biotechnological applications, promoters with medium and low activity are also necessary when the goal is to control flux in metabolic pathways or match physiological levels of gene expression (136, 167-170). Tuning promoter output is another layer of control over gene expression. Inducible systems that are tunable have a positive correlation between expression and inducer concentration, enabling the user to vary the expression level from a system and avoid overburdening the host (171).

While an ideal inducible system probably does not exist, decades of work have focused on improving existing promoter-regulator pairs and constructing novel inducible systems. While a large dynamic range of expression is optimal, there is often a trade-off between a tight off state and high-level expression in the on state. Leakiness, or expression in the absence of inducer, is a persistent problem for many systems (171-173). Most strategies for developing tightly controlled inducible systems involve engineering regulatory elements such as the operator sites, the regulated promoter, and the promoter of the transcription factor to generate one with the desired characteristics (174). Engineering can be either through rational design, with hybrid systems


constructed from mixed promoter and transcription factor binding elements, or random mutagenesis, with synthetic promoter libraries to identify an improved variant (64, 174-176).

Recently, in an impressive study, Meyer et al. engineered 12 inducible systems for improved dynamic range and orthogonality in E. coli (68). Here, the regulator proteins are integrated into the genomes of three strains of E. coli, MG1655, DH10B, and BL21, to generate a wild-type, cloning, and protein expression strain, respectively. All of the output promoters are contained on a single plasmid. Theoretically, all 12 systems can be used simultaneously in the host as the promoter-regulator pairs are engineered for minimal cross-reactivity. To demonstrate multi-system capabilities, the researchers successfully utilized their system in a five-enzyme lycopene pathway where each gene was controlled by a different promoter-regulator pair. The tunability of their systems enabled them to vary the expression level of each gene to arrive at the optimal combination to maximize lycopene titers, all without the need for additional plasmid construction. The development of the Marionette system is arguably the greatest expansion of the toolset available for recombinant gene expression in E. coli to date.

The lac Operon

Regulation of the lac operon is a classic example of a native inducible system (177, 178). The operon consists of three genes required for the transport and metabolism of lactose in E. coli and other enteric bacteria (179, 180). In general terms, regulation of the lac operon is a form of cellular decision making where, in the presence [...] cellular resources to metabolize other, less preferred sugars (147, 180). In the presence of a mixture of different carbon sources, most bacteria can sequentially utilize carbohydrates according


to a sugar hierarchy (147, 148). Carbon catabolite repression (CCR) acts as a regulatory mechanism to control and coordinate sugar metabolism systems to make the best use of available carbon sources. Where glucose is the preferred carbon source, CCR works to prevent the metabolism of secondary carbon sources when glucose is present and enable the utilization of other sugars when glucose is depleted or otherwise unavailable (149).

To ensure that lactose-metabolizing enzymes are only produced when necessary, the lac operon is controlled through mechanisms that sense levels of both lactose and glucose. The lac operon is both positively and negatively regulated and includes sites for activator and repressor binding (179). In the absence of lactose, the lac repressor protein LacI interferes with RNAP transcription by binding tightly to its operator sites and lac operon genes are transcribed only at low levels (181). In the presence of lactose, a lactose metabolite called allolactose binds to LacI and the resulting conformational change prevents the repressor from staying bound to the operator, allowing transcription to occur. The cell uses glucose to regulate the lac operon indirectly through the signaling molecule cyclic adenosine monophosphate (cAMP) (182). Glucose levels are inversely proportional to concentrations of cAMP in the cell and when glucose is depleted, high levels of cAMP bind to the cAMP receptor protein (CRP), which activates the lac promoter. This transition from glucose to lactose utilization is known as the glucose-lactose diauxic shift and is a paradigm for preferential carbon source metabolism and inducible regulation (147, 183).

In addition to being one of the best-studied mechanisms of gene regulation, the lac operon promoter-repressor pair was also among the first to be repurposed for


biotechnological applications (184, 185). Though Plac has only moderate activity when derepressed and activated (186), its elements have been frequently modified and repurposed to give the system a larger dynamic range (187). For improved orthogonality and more predictable behavior in different genetic contexts, the regulatory architecture of the lac promoter has been modified to be activator-independent. This can be done by replacing the native lac promoter with the lacUV5 promoter, a mutated variant of Plac [...] (Figure 1-2A). In this way, regulation of LacI/PlacUV5 is focused on LacI repression as activation by CRP is no longer required (188, 189). For additional control of the system, the allolactose inducer is replaced with a nonmetabolizable analog, IPTG, which remains at a constant level throughout an experiment (190, 191). LacI/PlacUV5 has been used extensively in E. coli, perhaps the best example of which is the T7 expression system in E. coli BL21(DE3), discussed in detail below.

AraC- and RhaS-RhaR-regulated Systems in Secondary Carbon Metabolism

Systems to metabolize arabinose and rhamnose are also regulated by CCR and similarly reutilized for bacterial engineering (Table 1-1). In E. coli, arabinose metabolism is regulated by AraC, which controls the expression of araFGH and araE, involved in arabinose transport, the poorly characterized araJ (192), and the araBAD operon for arabinose catabolism (193-195). AraC is a positive and negative regulator at these promoters and also regulates its own expression (196). For the purposes of synthetic systems, work has focused on the regulation and reutilization of AraC/PBAD (Figure 1-2C). Here, three AraC binding sites upstream of PBAD, designated I1, I2, and O2, bind the regulator in different configurations to activate or repress transcription (170). AraC is a dimeric protein and in the absence of arabinose, AraC acts as a repressor, binding the O2 and I1


sites to loop DNA upstream of PBAD, inhibiting binding of RNAP. Upon binding arabinose, the dimer undergoes a conformational change and binds I1 and I2, opening the DNA loop and activating transcription. Activation of arabinose metabolism genes is also stimulated by CRP and as such, regulated by CCR (170).

The rhamnose regulons in E. coli include rhaT, encoding the rhamnose transporter, the rhaS and rhaR regulator genes, and the rhaBAD rhamnose catabolism genes. The system is regulated by an induction cascade where, in the presence of rhamnose, RhaR is activated to induce the expression of rhaS, which also binds rhamnose to activate transcription from PrhaT and PrhaBAD (Figure 1-2B) (197). The three promoters regulating rhamnose metabolism, PrhaBAD, PrhaSR, and PrhaT, are also subject to catabolite repression and full activation of expression also requires the binding of CRP (198). The RhaS-RhaR/PrhaBAD promoter-regulator pair is often repurposed for synthetic systems, though an understanding of the entire regulon can be helpful in troubleshooting system failure or adjusting regulatory elements to fit certain experimental needs.

While AraC/PBAD and RhaS-RhaR/PrhaBAD were adopted early for recombinant protein production, some induction characteristics of these systems are not ideal. The AraC-regulated system exhibits stringent and tight regulation, but it is also known to have an all-or-none response to inducer. This means that at sub-saturating levels of arabinose, induction is not homogeneous across the population. Instead, there is a mixture of expressing and non-expressing cells, impeding tunability at the cellular level until higher sugar concentrations induce the entire population (199). This is likely due to regulation of the arabinose transport genes, as replacing ParaE and ParaFGH with


arabinose-independent promoters enables a homogeneous response (194, 200). Similar to AraC/PBAD, the RhaS-RhaR-regulated system exhibits a bimodal response to inducer (201-203). Further, it has been demonstrated that in strains able to metabolize rhamnose, PrhaBAD-regulated protein production stopped in a dose-dependent manner as rhamnose was consumed (203). An E. coli rhaT deletion mutant strain that was also unable to metabolize rhamnose demonstrated a more tightly regulated response when the system relied on the passive diffusion of rhamnose for induction (203). In both cases, untangling the regulatory structure of sugar transport and metabolism genes enabled more fine-tuned and context-independent control of expression.

Both the RhaS-RhaR/PrhaBAD and AraC/PBAD inducible systems have been used extensively in bacterial engineering and are frequently included in toolkits for gene expression in non-model hosts (158, 204-209). The pBAD expression vectors containing the AraC/PBAD promoter-regulator pair were developed in 1995 and are widely used across species (170, 210). Both the arabinose- and rhamnose-inducible systems are often employed when very tight regulation of the recombinant gene is required, as is the case with membrane proteins (210). The study of membrane proteins is relevant as their malfunctioning is implicated in many human diseases (211). Obtaining sufficient quantities for structural and functional studies is primarily done through heterologous expression as membrane proteins are produced at very low levels in native systems (212). However, overexpression of membrane proteins in E. coli causes their aggregation and is toxic to the cell, possibly due to overloading of the Sec-dependent translocation machinery (171, 213). For this reason, both RhaS-RhaR/PrhaBAD and AraC/PBAD are often chosen due to low levels of basal expression and, in modified systems, tunable control of expression (171, 214).

Quorum-sensing Systems

Bacterial quorum sensing is a method of intercellular communication that utilizes inducible promoter-regulator pairs. Quorum-sensing systems are regulatory networks that enable bacteria to alter their gene expression in response to cell density (146). Bacteria perceive cell density through the diffusion of signaling molecules called autoinducers released by nearby cells and as such, extracellular autoinducer concentration increases as cell density increases. When autoinducer concentrations reach a certain threshold, coordinated physiological changes are induced and gene expression profiles shift at the population level (215). Indeed, the term quorum sensing is in reference to the observation that autoinducer signals only accumulate in environments able to support a dense population, or quorum, of bacteria (216). Induced behaviors include protein secretion (217, 218), virulence (219, 220), antibiotic production (221, 222), conjugation (223, 224), and biofilm production (225). These changes allow the population to behave as a multicellular organism, enabling coordinated gene alterations that only occur when bacteria are living in a community (226).

One of the most well-studied examples of quorum sensing in bacteria comes from Aliivibrio fischeri, where this regulatory structure enables symbiotic relationships between the bacteria and different eukaryotic hosts (227). Here, the population density of A. fischeri is [...] (144). Eukaryotes such as the squid Euprymna scolopes and fish Monocentris japonicus have developed specialized organs that allow dense colonization of A. fischeri to grow and


produce light. In these organs, the bacteria receive nutrients to support their growth (228) and the host uses light to evade prey (229), attract a mate (144, 230), and for other purposes (227). The symbiosis between Eu. scolopes and this bioluminescent [...] luminescent-competent A. fischeri (231, 232).

A. fischeri uses a LuxI/R-type quorum-sensing system, as do most other Gram-negative bacteria (216, 233). In these systems, the LuxI enzyme synthesizes the autoinducer acyl-homoserine lactone (AHL), which freely diffuses across biological membranes (234). As population density increases, either within a light organ or laboratory flask, AHL concentration builds up until it reaches a certain threshold, at which point it is detected by and binds to the transcriptional activator LuxR (145). Binding of AHL to the regulator induces a conformational change, enabling the LuxR-AHL complex to bind a palindromic sequence of DNA upstream of the controlled promoter called the lux box. This activates expression of the bioluminescence luxICDABEG operon (217). Operon gene products include LuxI (235), the bioluminescence enzyme luciferase (236), enzymes generating luciferase substrates (237), and a probable flavin reductase (238). Activation of PluxICDABEG increases both light output and synthesis of AHL, inducing more LuxR to continue the process of autoinduction throughout the population (146).

Quorum-sensing systems are targets for programmed gene expression, having been used to construct cell density-dependent gene circuits (239) and for high-level gene expression independent of population size (68). The AHL synthase, regulator, and autoinducer have been utilized in the design of genetic Boolean logic gates with quorum


[...] E. coli strains that house different gates, carrying the signal along the circuit (240). Quorum sensing has also been applied in metabolic engineering, specifically to redirect carbon flux from endogenous pathways and effectively transition cultures from growing to producing the desired product when cell density reached the appropriate level (167). Both the LuxR/PLuxB and CinRAM/PCin quorum-sensing promoter-regulator pairs have been used for tightly regulated, dynamic control of gene expression in E. coli independent of population density (68). Our work expanded on this study by characterizing these systems in nine additional Proteobacteria and our results demonstrated that these systems generated high levels of induced gene expression in the majority of the species tested (51).

Inducible Systems for Antibiotic Resistance

In addition to systems regulating sugar metabolism and intercellular communication, bacteria can also sense and respond to toxic molecules in their environment. Tet repressor proteins (TetR) are some of the best-studied examples of bacterial antibiotic resistance (142). Tetracycline is a bacteriostatic antibiotic that binds to the bacterial ribosome to halt protein production (241). The tetracycline resistance system functions by actively pumping the antibiotic out of the cell through an efflux pump, TetA. Expression of the tet genes is negatively regulated by TetR and in the presence of tetracycline, repression is lifted and the antibiotic is actively pumped out of the cell (242). More specifically, the tetR and tetA genes are divergently transcribed and overlapping TetR operator sites are located within the intergenic region. A pair of TetR dimers blocks transcription of both tetR and tetA in the absence of tetracycline. When the inducer is present, it binds the TetR dimers allosterically, causing a conformational change such that TetR can no longer bind DNA, allowing the transcription of both tetA


and tetR (243, 244). Much of the finely tuned expression of this system results from TetR both negatively regulating TetA and autoregulating its own expression (143).

The TetR/PTetA system naturally possesses qualities that make it well suited for synthetic systems. For the antibiotic resistance mechanism to be effective, the system has evolved to be highly sensitive to its inducer (245). The system is also very tightly regulated as even low levels of TetA confer a fitness cost to the host (246). In synthetic systems, tetracycline is replaced with anhydrotetracycline (ATc), a derivative that is a more potent inducer and less toxic to the cell (247). TetR/PTetA is commonly included in transcription factor toolboxes across bacterial species (157, 248) and in genetic circuitry (249), and its constituent genetic parts have been utilized in mammalian cells (250) and other organisms (251). Because TetR family regulators are widely used in synthetic systems, a selection of TetR-like systems including TtgR, PmeR, and NalC were chosen for engineering in a recent study. Here, the researchers constructed a promoter library with 10^5 operator site variants that had different affinities for their cognate regulators. Their results were used to generate a suite of inducible promoters with a range of gene expression levels and fed into a machine learning model to identify relationships between operator site sequence and function (69).

Hybrid Inducible Promoters

Promoter engineering has been an active field of study since the early 1980s. First identified in a study of lac promoter mutants, the lacUV5 promoter is more active than its parent sequence and catabolite-insensitive (Table 1-1) (188, 189), making it a more valuable genetic part for engineering. In addition to selecting mutants with desirable qualities, researchers have constructed hybrid inducible promoters. By mixing and matching promoter elements and operator sites, tightly controlled, highly inducible


promoters can be made from disparate parts, including from bacteriophages (187). Among the first hybrid inducible promoters was P tac, derived from the trp and lac operon promoters (59). The lac and trp operons contain genes for the metabolism of lactose and tryptophan, respectively, in E. coli (252). In the tac promoter, upstream DNA derived from P trp is fused to downstream DNA derived from P lacUV5 (59). Another lac-derived hybrid promoter is P trc, which differs from P tac by one base pair in the spacer region (253). Accordingly, both P tac and P trc are inducible by IPTG, repressible by LacI, and 5-11 times stronger than P lacUV5 (254). The high activity of the tac and trc promoters comes at the expense of tight repression, and both have high levels of basal expression (255, 256). Still, they have been utilized for engineering orthogonal circuits (163, 257), metabolic engineering (258), and gene expression in diverse species (254, 259, 260). Promoters have also been derived from the strong P L promoter of phage lambda. Here, the core promoter elements of P L are kept intact, and the P L operator sites are replaced by tandem operators for the LacI or TetR repressors to generate P LlacO-1 and P LtetO-1, respectively (187). These hybrid promoters are both tightly repressed and inducible over a large dynamic range of expression, particularly P LtetO-1. When derepressed with the inducer aTc, the LtetO-1 promoter has a dynamic range of 5,000-fold, and when repressed by TetR, basal expression occurs at a rate of one mRNA molecule per three E. coli cells (187, 245). This tight regulation has been exploited to conditionally express the Flp recombinase to invert cloned genes (261), to construct a transposon to screen protein localization and expression patterns (262), and for independent tuning of


multiple genes (263). Both P LtetO-1 and P LlacO-1 are used for high-level gene expression (264) and are included as parts in the BglBrick expression vectors (265). Another hybrid promoter developed by the same group is P lac/ara-1, which contains the binding sites for AraC as well as LacI operator sites. In this way, the lac/ara-1 promoter is regulated by both AraC and LacI as well as their respective inducers, arabinose and IPTG. P lac/ara-1 has been used as a tool to investigate transcription kinetics, as the regulators controlling expression affect intermediate steps of transcriptional initiation differently (266-268). Finally, parts from the T7 promoter and lac promoter were combined to generate P T7lac. The T7lac promoter is a widely used hybrid promoter as part of the T7 expression system in E. coli BL21(DE3). The T7 promoter is a highly active promoter recognized by the bacteriophage T7 RNAP and, when used in a lysogenic protein production strain, this overexpression comes at a high fitness cost to the cell (269). To better control expression, a LacI operator was placed downstream of the T7 promoter to generate P T7lac (270).

The T7 Expression System

Design and application

The T7 system is among the most popular expression systems for protein overproduction. Its canonical design involves a chromosomally encoded bacteriophage T7 RNAP, with the T7 cognate promoter and target gene contained on a plasmid (126). E. coli BL21 is the typical protein production strain of E. coli, as deficiencies in the Lon and OmpT proteases make it more amenable to recombinant protein expression (271). The lysogenic strain BL21(DE3) is used for T7 expression (126, 272). For increased control of gene expression, T7 native regulatory elements have been replaced with those of the lac operon; more specifically, the


promoters regulating both T7 RNAP and the target gene have been replaced or modified to be inducible by IPTG (273). The T7 RNAP promoter was replaced with P lacUV5, and lac operator sites were added downstream of the start site of the T7 promoter, generating P T7lac (Figure 1-3) (270, 274). The pET vectors were developed to take advantage of the T7 system and are widely used in recombinant protein production (270, 274). The T7 system is a popular choice for gene overexpression studies because T7 RNAP recognizes its promoter with stringent specificity and has a high rate of processivity, generating high polymerase flux to maximize target protein production (126, 275, 276). The system is also orthogonal to host metabolism, as the T7 RNAP does not recognize native bacterial promoters and bacterial RNAPs will not transcribe genes regulated by the T7 promoter (277). Due to its capacity for high-level protein overexpression, the T7 system is widely used for generating high titers of value-added products (162, 278, 279), confirming the function of recombinant proteins (280, 281), and generating sufficient levels of a target protein for structural or functional studies (282-284).

Drawbacks and alternative solutions

Though the T7 system is often chosen as the default when high-level protein overexpression is desired, its usage is associated with distinct disadvantages. First, there is a general lack of control over gene expression, even with the regulatory elements from the lac operon in place. The T7 system is both notoriously leaky and not readily tunable. Though LacI represses activity from both P lacUV5 and P T7lac, low levels of uninduced expression of T7 RNAP can lead to substantial expression of the target gene (273, 285, 286). High uninduced expression can


cause transformation toxicity, or difficulty in obtaining transformants, and decrease host stability when the target protein affects cell fitness (270, 287). Additionally, the system lacks tunability, exhibiting a mostly binary response to inducer (288). Another persistent issue with utilizing the T7 system is that the high activity of T7 RNAP places a large metabolic burden on the host, often resulting in toxicity and inhibition of growth (277, 289). T7 RNAP transcribes up to eight times faster than E. coli RNAP (126, 275), resulting in competition for expression machinery (290). While high mRNA production is generally correlated with high levels of protein, this no longer holds true when host resources are exhausted, and the cell frequently mutates away from target gene expression to avoid the high fitness cost (291, 292). T7 system instability is often the result of chromosomal mutation rather than loss of the plasmid containing the target gene (287). Toxicity escape mutations occur in the P lacUV5 promoter regulating T7 RNAP expression and within the T7 RNAP and LacI genes (293-295), dampening the production or activity of T7 RNAP (287, 294, 296-298). Mutant strains with altered T7 RNAP activity have been isolated and cleverly reutilized as protein production hosts as a strategy to improve host stability and protein yields (288, 293). Alternative strategies have been developed for tighter control over T7 system expression and to alleviate phage polymerase-associated toxicity. An early development to combat leaky expression was the utilization of T7 lysozyme to inhibit the activity of T7 RNAP (159, 299, 300). Another strategy is to change the regulation of the T7 system. Rather than rely on LacI repression, the T7 system has been reconstructed and placed under the control of AraC (301), TetR (302), and both auto-inducible (303) and constitutive promoters (287), with varying degrees of success. More complicated


solutions involve utilizing negative feedback loops to control T7 RNAP (304), inhibiting T7 RNAP with RNA aptamers (305), splitting T7 RNAP to alleviate stress and toxicity (257, 306), and regulating T7 RNAP with light (307). Even with its drawbacks, the T7 system remains an attractive option for protein overproduction studies, and creative solutions to improving its regulation continue to expand its usage in the field.

Rationale of Presented Studies

As synthetic biology moves into more diverse hosts, the availability of standardized genetic parts that are characterized across bacterial species continues to lag behind (308). Industrially relevant organisms are underutilized as microbial chassis because the toolboxes available for engineering non-model species are limited (309, 310). Most genetic toolkits are made for use in E. coli, and portability into other hosts, even other Gammaproteobacteria, is not guaranteed (259, 304). Without characterized genetic parts and reliable tools for their assembly in diverse species, the design-build-test-learn cycle of biological circuit engineering can become tediously protracted, increasing the time and resources necessary to develop a functioning genetic system (311). The work presented here expands the systems available for inducible and constitutive gene expression in the Proteobacteria by characterizing genetic parts across species. The development and characterization of our broad-host-range plasmid toolbox is detailed in Chapter 2. Here, we aim to make engineering non-model hosts more accessible in three main ways. First, by defining modular genetic parts and implementing a standardized plasmid construction method, our system enables customizable and efficient plasmid assembly and genetic part swapping. Second, the plasmid toolbox is easily portable between hosts, and our workflows detail the


transformation and screening of our expression plasmids in diverse species. Finally, we characterized the inducible systems across nine species to demonstrate their utility as single- and multi-plasmid systems. Overall, this toolbox expedites the utilization of non-model hosts in further synthetic biology studies. In Chapter 3, we expand the broad-host-range plasmid toolbox into E. coli for protein overexpression experiments (312). While the T7 system in E. coli BL21(DE3) is often chosen when high-level gene expression is desired, its use is associated with certain disadvantages, including toxicity to the host and a general lack of control over expression (313). This presents an opportunity to test alternative systems that might achieve similarly high levels of expression but also possess increased stability and predictability. We demonstrated that our inducible systems did in fact reach the levels of overexpression obtained by the T7 system. Further, our toolbox systems were more stable, tunable, and tightly controlled than the T7 system in direct comparison screens. These results challenge the assumption that the T7 expression system should be used as the default in overexpression experiments. In Chapter 4, we characterized a synthetic constitutive promoter library across 15 species of Proteobacteria to establish a toolset for each species and investigate sequence-activity relationships in highly active promoters. We then screened a subset of these promoters in a second species to test their transferability. This work is valuable, as few studies have tested constitutive promoter libraries across species, and those that have were done only on a small scale (314). Overall, this work provides toolsets of characterized genetic parts that are ready for use across the Proteobacteria and can be leveraged to study foundational aspects of promoter function.


Figure 1-1. Interactions between E. coli RNAP holoenzyme and a σ70-dependent promoter. Dashed lines indicate interactions between regions of σ70 (green) and the C-terminal domains of the α subunits of RNAP (αCTD, blue). Details on the interactions between the holoenzyme and promoter involved in transcriptional initiation are given in the text.


Figure 1-2. Inducible systems controlled by negative, positive, and both negative and positive regulation. A) LacI negatively regulates the LacI/P lacUV5 system in the absence of the inducer, IPTG. When IPTG is added to the system, it binds to LacI allosterically such that LacI is no longer able to bind DNA and the P lacUV5 promoter is derepressed, enabling transcription of the gene of interest (GOI). B) The rhamnose-inducible system is positively regulated through a feedback circuit where the addition of rhamnose causes RhaS to dimerize and activate P rhaSR. RhaR also binds rhamnose, dimerizes, and activates expression of P rhaBAD. C) AraC is both a positive and negative regulator. In the absence of arabinose, AraC binds the O1 and O2 sites and loops DNA to inhibit access of RNAP to P BAD. When arabinose is present, AraC binds to the O1 and I sites, opening the DNA loop and activating expression of P BAD.


Figure 1-3. Canonical design of the T7 expression system. The host cell contains a chromosomally encoded bacteriophage T7 RNAP, most often generated with the DE3 lysogen. T7 RNAP is regulated by the lacUV5 promoter, which is recognized by E. coli RNAP and repressed by LacI in the absence of the inducer, IPTG. The gene of interest (GOI) is on a plasmid, often a pET vector, and under the control of P T7lac, a promoter recognized by T7 RNAP and modified to be repressible by LacI.


Table 1-1. Inducible systems and their regulation.

Promoter | Regulator | Source | Inducer | Description | Ref.
lac | LacI, CRP | E. coli | IPTG | Regulates expression of genes to metabolize lactose | (315)
trp | TrpR | E. coli | IPTG | Regulates expression of genes to metabolize tryptophan | (316)
lacUV5 | LacI | Synthetic | IPTG | P lac mutant with a consensus -10 element, activator independent | (126)
tac | LacI | Synthetic | IPTG | Chimeric fusion of upstream P trp and downstream P lac elements with LacI operator site | (317)
trc | LacI | Synthetic | IPTG | Differs from P tac by one base pair in spacer element | (253)
rhaBAD | RhaS-RhaR | E. coli | L-rhamnose | Regulates expression of L-rhamnose metabolizing genes | (318)
BAD | AraC | E. coli | L-arabinose | Regulates expression of L-arabinose metabolizing genes | (170)
tetA | TetR | E. coli | aTc | Regulates expression of TetA efflux pump | (244)
LlacO-1 | LacI | Synthetic | IPTG | P L promoter with LacI operator sites | (187)
LtetO-1 | TetR | Synthetic | aTc | P L promoter with TetR operator sites | (187)
lac/ara-1 | LacI, AraC | Synthetic | IPTG, L-arabinose | Modified P lac with LacI and AraC binding sites | (187)
T7lac | LacI | Synthetic | IPTG | T7 RNAP promoter with LacI operator sites | (270)
luxICDABEG | LuxR | A. fischeri | OC6 | Regulates expression of bioluminescence genes | (233)


CHAPTER 2
A PLASMID TOOLBOX FOR CONTROLLED GENE EXPRESSION ACROSS THE PROTEOBACTERIA*

* Reprinted with permission from Schuster, L. A. and Reisch, C. R. (2021) A plasmid toolbox for controlled gene expression across the Proteobacteria. Nucleic Acids Res., 49, 7189-7202.

Controlled gene expression is fundamental for the study of gene function and our ability to engineer bacteria. However, there is currently no easy-to-use genetics toolbox that enables controlled gene expression in a wide range of diverse species. To facilitate the development of genetics systems in a fast, easy, and standardized manner, we constructed and tested a plasmid assembly toolbox that will enable the identification of well-regulated promoters in many Proteobacteria and potentially beyond. Each plasmid is composed of four categories of genetic parts: (i) the origin of replication, (ii) resistance marker, (iii) promoter-regulator, and (iv) reporter. The plasmids can be efficiently assembled using ligation-independent cloning, and any gene of interest can be easily inserted in place of the reporter. We tested this toolbox in nine different Proteobacteria and identified regulated promoters with over fifty-fold induction range in eight of these bacteria. We also constructed variant libraries that enabled the identification of promoter-regulators with varied expression levels and increased inducible fold change relative to the original promoter. A selection of over 50 plasmids, which contain all of the toolbox's genetic parts, are available for community use and will enable easy construction and testing of genetics systems in both model and non-model bacteria.

Introduction

Genetic tools to control gene expression commonly consist of an allosteric transcription factor that can bind regulatory DNA near a controlled promoter to initiate or


repress transcriptional initiation. A small-molecule ligand, when added, binds the transcription factor and enables RNA polymerase binding to the promoter for transcriptional initiation (52). These promoter-regulator pairs enable finely tuned genetic control in a few well-studied bacteria (68, 152, 153, 319), facilitating research in areas such as essential gene analysis, metabolic pathway optimization, and biosensor development (151, 320, 321). An ideal system has a large dynamic range of expression, providing a tunable response where promoter output correlates positively with the concentration of inducer added (322). Hindering this fine-tuning ability is expression in the absence of inducer, often referred to as "leakiness" (323, 324). Low leakiness is important for greater predictability of the system and avoids the consequences of unintended low-level expression that can obfuscate physiological experiments, allow the buildup of toxic proteins, or lower product yields in metabolic engineering (325-327). While a high dynamic range is often desired, in practice, inducible promoters often have a tight off state but only a middling on state, or a leaky off state but a very high on state (328, 329). The ability to dynamically control gene expression in E. coli is very well developed, with at least 10 promoter-regulator pairs that can operate orthogonally with high dynamic range (68). Other well-studied bacteria have smaller but still reliable toolboxes (153, 204, 319, 330, 331). The Standard European Vector Architecture (SEVA) toolbox is well designed and possesses genetic parts for a broad range of bacteria (44). However, the complete toolbox is not widely available, and adding parts to the system requires recoding to remove incompatible restriction sites. Moreover, this toolbox was not explicitly designed for controlled gene expression. While a limited


number of regulatory proteins are available, including both a regulatory system and reporter (or gene of interest, GOI) requires additional design and cloning because each element is not inherent in the design scheme. Modular cloning (MoClo) toolboxes are also well developed and highly customizable with hierarchical Golden Gate cloning schemes (48, 332). Construction of these plasmids requires several steps with intermediate plasmids, and compatible restriction sites must be available in each genetic part, limiting the speed and ease of making new vectors. A full comparison between toolboxes can be found in Object 2-1 Supplementary Note 1. Recent work has combined the SEVA and MoClo standards for increased flexibility and cloning efficiency (46, 332). Included in these toolboxes are some tried and tested genetic parts from E. coli, but determining whether these parts function in other bacteria often requires tedious de novo cloning and testing (333, 334). While these previously built systems are valuable and contribute to the goals of parts standardization in synthetic biology, we believe that an easy-to-use toolbox for controlled gene expression is also needed.

Materials and Methods

Plasmid Construction

Plasmids were assembled using NEB HiFi Assembly with PCR-amplified genetic parts. A detailed protocol for plasmid assembly is provided in Object 2-1 Supplementary Note 2, and the source of genetic parts is in Object 2-1 Supplementary Table 2. Sequences of verified plasmids are available at Addgene as indicated in Object 2-1 Supplementary Table 1. Primers used in this study are listed in Table 2-2.

Plasmid Transformations

All recipient strains except for A. baylyi ADP1 and A. fischeri were transformed via electroporation. The cell cultures made electrocompetent were taken from either an


overnight growth (B. thailandensis, P. aeruginosa, P. putida, and X. campestris) or subcultured from an overnight growth and made electrocompetent when the cells reached mid-log phase (A. fabrum, Ruegeria sp. TM1040, Sulfitobacter sp. EE-36). The protocol for the preparation of electrocompetent cells was as follows: 6 mL of each wild-type strain was incubated with shaking in the indicated media and temperature conditions (Object 2-1 Supplementary Table 3) overnight or until mid-log phase. The total culture was then spun down in four 1.5 mL tubes in a microcentrifuge at 5,000 rpm for 2 minutes. Culture supernatants were aspirated, and cell pellets were resuspended in 1 mL of 300 mM sucrose at room temperature and then centrifuged again for 2 minutes at 5,000 rpm. This process was repeated to wash with sucrose twice, and then the pellet was resuspended in a final volume of 1:10 of the initial culture volume, or 150 µL in each of the four microcentrifuge tubes. 50 µL of each suspension was then transferred to a 1 mm gap width electroporation cuvette, and cells were electroporated at the specified voltage for each strain (Object 2-1 Supplementary Table 4). Cells were recovered in 1 mL of their respective recovery media and incubated with shaking at either 30 or 37°C in a deep-well plate for 2 h before plating. A natural transformation protocol was followed for A. baylyi ADP1, adapted from a previous protocol (259). Here, 5 mL of fresh LB was inoculated with wild-type ADP1 from a glycerol stock and grown overnight at 30°C. The next day, 1 mL of fresh LB was inoculated with 70 µL of this culture and approximately 100 ng of the plasmid, and the mixture was incubated for 3 h before plating onto selection plates. Conjugation was performed to introduce plasmids into A. fischeri using the RP4 system in the following steps. On the day prior to conjugation, 5 mL cultures were


inoculated from glycerol stocks of wild-type A. fischeri, all required donor strains of E. coli, and an E. coli helper strain containing pEVS104 and grown overnight. The following day, donor and helper cultures were spun down separately at 10,000 rpm for 1 minute and resuspended in LBS to remove residual antibiotic. A sufficient volume of donor and helper cultures was pelleted and resuspended such that each conjugation used 500 µL of both donor and helper strains in addition to 500 µL of recipient A. fischeri. Each mixture of donor, helper, and recipient was centrifuged at 10,000 rpm for 1 minute and the supernatant decanted, leaving approximately 100 µL of LBS to resuspend the pellet. The resuspensions were spotted on an LBS plate and incubated on the benchtop overnight. Each spot was streaked onto a fresh marine media plate containing the appropriate antibiotic the following day. For transformation efficiency assays, 100 ng of each plasmid was transformed following the protocols above. The recovered cultures were serially diluted ten-fold four times, and 10 µL of each dilution was spotted onto an agar plate with the appropriate antibiotic and incubated for 1-3 days until colonies were visible. For fluorescence assays, plasmids were transformed with the same protocol and, after recovery, cultures were streaked onto agar plates with the appropriate antibiotic. Isolated colonies were then grown up and saved as glycerol stocks.

Fluorescence Assays

For each set of promoter-regulator pairs, glycerol stocks of all relevant strains were streaked onto fresh agar plates to obtain isolated colonies. The following day, a deep-well plate containing 1 mL of rich medium and the appropriate antibiotic was inoculated with isolated colonies and incubated on a plate shaker overnight. After approximately 20 hours of growth, cultures were subcultured to an OD of 0.1 into fresh


media and antibiotic and incubated on a plate shaker until the cultures reached mid-log phase. Cultures were then diluted again to an OD of 0.07 into 96-well plates (Costar, black, clear bottom), where the wells contained 100 µL of rich media with antibiotic or rich media only, for strains with plasmids and wild-type strains, respectively. At least eight wells on each 96-well plate were not inoculated and used as controls. Plates were then incubated on a plate shaker for 0.5 hours. An additional 100 µL of the respective media with 2X inducer concentration was added to each well of the plate so that the final concentration was 1X for induced samples (inducer concentrations in Object 2-1 Supplementary Table 5). The plate was then incubated on a plate shaker, and OD and fluorescence were measured at 2, 4, 6, and 24 hours post induction on a plate reader (Molecular Devices SpectraMax M3), with an additional timepoint taken at 8 hours for slower-growing strains. Fluorescence readings were taken using a plate adapter and top-read settings on the plate reader. All experiments were performed with three technical replicates and with two to three independent experiments. All calculations and data analysis were performed using Microsoft Excel. For screens of the 12 inducible systems within each bacterial strain, absorbance and fluorescence data were organized by timepoint. Optical density was adjusted to a 1 cm pathlength by dividing by a factor of 0.56 or 0.28 when the culture volume in the well was 200 µL or 100 µL, respectively. This adjustment is applicable when the wells of a 96-well plate are completely flat and was empirically validated in our lab. In addition, when optical density measurements were above the threshold for linearity (approximately OD = 1.0 on our machine), cultures were diluted 1:10 into a total of 100 µL/well in another 96-well plate and measured for a more accurate reading.
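The optical density adjustments described above reduce to a few arithmetic steps. The following is a minimal sketch only: the analysis in this work was performed in Microsoft Excel, and Python is substituted here purely for illustration. The function names od_to_1cm and final_inducer_x and their parameters are hypothetical. The divisors 0.56 and 0.28, the 1:10 dilution, and the 2X inducer addition are taken from the text, while multiplying a diluted reading back by its dilution factor is an assumption about how such readings would be reported.

```python
# Illustrative sketch; function and parameter names are assumptions,
# not part of the original Excel-based workflow.

# Divisors from the text that convert a plate-reader OD to a 1 cm pathlength,
# keyed by the culture volume (in µL) in a flat-bottom 96-well plate.
PATHLENGTH_FACTOR = {200: 0.56, 100: 0.28}


def od_to_1cm(raw_od, well_volume_ul, dilution_factor=1.0):
    """Adjust a raw OD reading to a 1 cm pathlength.

    dilution_factor is 10 when the culture was diluted 1:10 before reading
    (done in the text for samples above the linear range), otherwise 1.
    Scaling by the dilution factor is an assumption of this sketch.
    """
    if well_volume_ul not in PATHLENGTH_FACTOR:
        raise ValueError("no pathlength factor for this well volume")
    return raw_od / PATHLENGTH_FACTOR[well_volume_ul] * dilution_factor


def final_inducer_x(stock_x, added_ul, initial_ul):
    """Final inducer concentration (in X) after adding stock to a well."""
    return stock_x * added_ul / (initial_ul + added_ul)


if __name__ == "__main__":
    # A 200 µL well read directly
    print(od_to_1cm(0.28, 200))
    # A dense culture diluted 1:10 into 100 µL before reading
    print(od_to_1cm(0.14, 100, dilution_factor=10))
    # 100 µL of 2X inducer added to 100 µL of culture
    print(final_inducer_x(2.0, 100, 100))
```

Keeping the divisors in a lookup table makes the sketch fail loudly for a well volume that has not been empirically validated, rather than silently applying an inappropriate factor.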


Fluorescence data taken after removing culture for these OD readings were adjusted so that data were consistent across timepoints. The raw fluorescence data were either used directly or further modified by normalizing to the adjusted optical density and subtracting the fluorescence of an empty vector control included in each screen. Fold change was then calculated for the raw and modified fluorescence data.

Antibiotic Assays

The assay was adapted from a previous protocol (320). Strains were grown and diluted following a similar protocol to that outlined for the fluorescence assays. Specifically, three individual colonies were grown up from a freshly streaked agar plate in 1 mL of nutrient-rich broth with the appropriate antibiotics and incubated overnight in a deep-well plate with shaking. The cultures were then diluted into 1 mL of fresh media to an OD of 0.1 and grown to mid-exponential phase. The cultures were then diluted again into 1 mL of fresh media to an OD of 0.07, with each of the three cultures inoculating two additional cultures, one that remained uninduced and one in which inducer was added. These cultures were grown for 0.5 hours, at which point inducer was added to three of the six cultures, marking time zero for the spot dilution plating. Spot dilution plating occurred when cultures were in mid-exponential phase of growth: 200 µL were taken from each of the six cultures and serially diluted 10-fold in water. 10 µL of each dilution, including the undiluted culture, were spotted onto agar plates in the following way: the cultures that were grown in the absence of inducer were spotted onto LB agar with kanamycin to obtain a colony count of the total viable transformants and onto LB agar with gentamicin, where only transformants with leaky expression would grow. The induced cultures were spotted onto the same selection plates as specified above that also contained inducer (cumate) spread at a 1X concentration to preserve induction.
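Colony counts from spot dilution plating of this kind are conventionally converted to a viable titer by scaling for the dilution and the volume spotted. The following is a minimal sketch under that standard convention rather than the calculation used in this work; cfu_per_ml and its parameters are hypothetical names, and only the 10-fold dilution steps and the 10 µL spot volume come from the text.

```python
# Illustrative sketch; names are assumptions introduced for this example.

def cfu_per_ml(colonies, dilution_step, spot_volume_ul=10.0, fold=10.0):
    """Estimate CFU/mL of the undiluted culture from one countable spot.

    colonies:       colonies counted in the spot
    dilution_step:  0 for the undiluted culture, 1 for the first 10-fold
                    dilution, 2 for the second, and so on
    spot_volume_ul: volume spotted onto the agar plate, in µL
    fold:           dilution factor applied at each step
    """
    if colonies < 0 or dilution_step < 0 or spot_volume_ul <= 0:
        raise ValueError("invalid count, dilution step, or spot volume")
    spot_volume_ml = spot_volume_ul / 1000.0
    return colonies * fold ** dilution_step / spot_volume_ml


if __name__ == "__main__":
    # Hypothetical count: 12 colonies in the spot from the third 10-fold dilution
    print(cfu_per_ml(12, 3))
```

Titers computed this way for the same culture on two different selection plates can then be compared directly, because the dilution and spot volume are accounted for in each estimate.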


Spots were grown overnight or until the appearance of colonies. For B. thailandensis, a gentamicin concentration of 20 µg/mL was used, which is inhibitory to wild-type cells. Alongside the experimental strains, wild-type strains were grown and diluted as described above in parallel and plated onto LB gentamicin plates to confirm the lack of visible colonies. To make the graph in Figure 2-5, CFUs within the countable range were recorded on all plates, and counts from the gentamicin selection plates were normalized to those on the kanamycin selection plates. The same was done for the set of plates spotted with induced cultures. Images of the full plates used to make Figure 2-5 are shown in Object 2-1 Supplementary Figure 20.

Library Construction and Screening

Libraries were assembled using NEB HiFi Assembly with PCR-amplified genetic parts. A list of primers and a description of the construction is provided in Object 2-1 Supplementary Note 5. Protocols for amplification and assembly are provided in Object 2-1 Supplementary Note 2. Library mutants were screened in two steps, where the initial screen was a simplified version of the protocol followed for fluorescence assays that did not include subculturing. Library mutant transformants were inoculated by hand or using a QPix2 colony picker into 96- or 384-well plates depending on library size, and plates were incubated overnight on a plate shaker. Each plate also contained at least three wells inoculated with the original plasmid control strain, and at least three wells were left as blanks. The following day, overnight cultures were stamped into two additional plates with a plate replicator, one with inducer added to the media and one without. These plates were incubated overnight, and optical density and fluorescence readings were taken the following day. Data from this screen served to identify non-functional library


mutants and those that displayed a dynamic range of expression similar to or better than the control. These potentially improved mutants were isolated from the screening plate by streaking onto fresh agar plates and were included in the next screen. This second screen followed the full fluorescence assay protocol of subculturing and induction, with both the original plasmid control strain and empty vector control strain included in each plate. OD and fluorescence readings were taken during exponential phase and at 24 hours post induction. Library mutants that were characterized to be an improvement on the original plasmid control were Sanger sequenced.

Violacein Experiments

Cultures were started from single colonies and grown overnight in a deep-well plate. The following day, cultures were subcultured following the protocol outlined in the Fluorescence Assays section. After the final subculture, violacein was extracted from cultures in mid-log phase and after an overnight growth. Violacein extractions were done using a protocol adapted from previous protocols (164, 335) and were as follows: for each measurement, 1 mL of culture was pelleted at 21,000 × g for 10 minutes. The supernatant was removed and the pellet was resuspended in a methanol solution containing 1% v/v acetic acid. Tubes were then incubated at 58°C for 10 minutes with periodic vortexing to extract the violacein, followed by centrifugation to pellet cell debris (21,000 × g, 10 min). Multiple extractions were required from samples producing a high amount of violacein, and background-subtracted measurements were summed during data analysis. From the supernatant, 200 µL was aliquoted into a 96-well plate and absorbance was read at 585 nm.
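The summing of background-subtracted readings across repeated extractions can be written out explicitly. This is a minimal sketch assuming that each extraction of a sample is read at 585 nm alongside a solvent blank; the function name total_violacein_a585, the blank value, and the example readings are hypothetical and are not data from this work.

```python
# Illustrative sketch; the function name and example values are assumptions.

def total_violacein_a585(extraction_readings, blank):
    """Sum background-subtracted A585 readings over sequential extractions
    of a single sample."""
    return sum(reading - blank for reading in extraction_readings)


if __name__ == "__main__":
    # Hypothetical readings from three sequential extractions of one pellet
    print(total_violacein_a585([1.20, 0.45, 0.10], blank=0.05))
```

Subtracting the blank from each extraction before summing, rather than once from the total, avoids undercorrecting the background when a sample needs more than one extraction.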


Results

Plasmid Design

To enable quick and easy assembly of customized plasmids, we developed a combinatorial strategy to construct plasmids that possess inducible expression systems and are compatible with a broad range of bacteria, as outlined in Figure 2-1. Our assembly scheme uses ligation-independent cloning, which requires a unique primer pair for each part's initial cloning into the vector. The 3' ends of the primers anneal to the new part, and the 5' ends have an overlap sequence conserved for each part category. Overlapping sequences between genetic parts were designed for optimal primer annealing temperatures and to give consistently high yields after amplification. After this initial cloning, only four primer pairs are required to assemble any variation of the plasmid (protocol detailed in Object 2-1 Supplementary Note 2). The combinatorial assembly is very efficient in our hands, and we routinely assembled four pieces using the New England Biolabs (NEB, Ipswich, MA) HiFi assembly mix (Object 2-1 Supplementary Notes 1-2). In most cases, we were also able to amplify three parts as a single product and efficiently create a library of variants with the fourth piece. Two- and three-piece assemblies can also be efficiently performed using NEB HiFi assembly or CPEC cloning (49). The toolbox includes several variants of each part to facilitate identifying a part that may function in any member of the Proteobacteria. The available parts for assembly include 4 origins of replication or a Tn7 integration vector as the backbone, 12 promoter-regulator pairs, 8 antibiotic markers, and 7 reporters (detailed in Object 2-1 Supplementary Figures 1-4, Object 2-1 Supplementary Table 1). Of the 12 regulators, 7 were taken from the Marionette strain of E. coli, where these parts were engineered for


increased orthogonality in E. coli. These include quorum sensing systems regulated by CinRAM and LuxR; cumate-, salicylic acid-, and naringenin-inducible systems derived from P. putida; a vanillate-inducible system from C. crescentus; and a PcaU-regulated system from Acinetobacter sp. ADP1 (68) (Object 2-1 Supplementary Tables 2, 5). The remaining promoter-regulator pairs incorporate TetR-regulated systems and sugar-inducible systems regulated by LacI, AraC, and RhaS-RhaR from E. coli. In most cases, the RiboJ ribozyme site was included downstream of the promoter to decrease context dependence issues that may affect gene expression. The ribozyme site self-cleaves in the 5' UTR, so that the same sequence is present regardless of the promoter used (336). The promoter-regulator part also has a strong RBS positioned upstream of the reporter or GOI in the final assembly for seamless insertion of the coding sequence. All four origins of replication in our toolbox are known to be broad-host-range, though not all origins are efficiently transformed or maintained in all Proteobacteria (206, 337).

Plasmid Screens Across the Proteobacteria

To demonstrate that our plasmids can identify functional and inducible promoters across the Proteobacteria, we tested each of the organisms in Table 2-1, representing the 3 major classes of Proteobacteria. We first assessed the transformation efficiency by electroporation of all four origins in our selected bacteria (Table 2-1) and found that the efficiency varied significantly (Object 2-1 Supplementary Figure 5 and Object 2-1 Supplementary Table 4). In some bacteria, all plasmids were transformed with high efficiency; others had large differences in efficiency between origins of replication; and some origins could not be transformed into a particular host. Each origin part also possesses an origin of transfer (OriT) that allows conjugation with RP4 conjugal


machinery for transfer to bacteria that cannot be efficiently electroporated. We confirmed that conjugation was possible using triparental mating with the mobilizable pEVS104 helper plasmid to conjugate Aliivibrio fischeri (338). We next collected data from inducible systems expressing different fluorescent proteins as reporters in a few strains (Object 2-1 Supplementary Figures 6, 7). After preliminary screens using the reporters GFPmut3 (339), mKelly2 (340), tdKatushka2 (341), and mRFP (342), we proceeded with mRFP as it consistently gave a larger dynamic range of expression and was less affected by autofluorescence of cells and medium. The other fluorescent proteins are still included in the toolbox as they could be useful in specific circumstances. Plasmids were then constructed with each of the 12 promoter-regulator pairs, the mRFP reporter, the gentamicin resistance marker, and either the pK, pB, or pF origin. We then transformed the full set of 12 plasmids carrying the most efficiently electroporated or conjugated origin of replication for each bacterium and screened all inducible systems within each strain in parallel. At least five inducer concentrations were included in the screening plates as an exploratory approach to determine the inducer concentration that gave the highest expression without appreciably compromising growth. The cells were induced during early log phase, followed by optical density and fluorescence measurements at mid-log and stationary phases of growth. The promoters that had at least a 50-fold difference between the uninduced and induced cultures at the 24-hour timepoint are shown in Figure 2-2B, and induction results for all 12 inducible systems in the nine Proteobacteria are shown in Figure 2-3 (all data shown in Object 2-1 Supplementary Figures 8-16). Because some of the uninduced cultures possessed


slightly lower fluorescence than a control without mRFP, the background fluorescence was not subtracted and the data is presented as raw values. In eight bacteria, at least two promoter-regulators were found to have an induction range of over 50-fold. Surprisingly, commonly used systems derived from E. coli, such as LacI- and TetR-regulated promoters, were not among those with the largest expression ranges in our dataset. The CinRAM/PCin and LuxR/PLuxB systems were generally the best performing across all of the bacteria we screened, with the CinRAM-regulated system achieving over 120-fold induction in all but three of the nine strains (Figure 2-2C). While eight of the nine strains possess at least one quorum sensing system induced by a homoserine lactone, functionality of these systems does not appear related to native quorum sensing capabilities. The one strain that does not possess related quorum sensing genes, X. campestris, is still inducible by over 100-fold (343). Nonetheless, our data shows these regulators are highly sensitive in the majority of the strains tested, with concentrations below 10 µM sufficient for full induction. NahRAM/PSalTTC was consistently among those with the highest induction, though it was very leaky by 24 hours in some strains. Conversely, CymRAM/PCymRC was moderately inducible but remained tight in the absence of inducer in virtually all strains. Though not as highly inducible as the quorum sensing-derived systems, VanRAM/PVanCC and AraC/PBADmin were functional across all strains tested, with varying levels of leakiness. TtgRAM/PTtg and PcaUAM/P3B5B were most likely to be non-functional, and often there was little to no difference in output between on and off states (Figure 2-2C, Figure 2-3).
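The 50-fold cutoff used to select systems can be illustrated with a short calculation. This is a sketch only: the function names, the OD normalization step, and the example readings are assumptions for illustration and are not taken from the screening data; following the text above, no background fluorescence is subtracted.

```python
def fold_induction(induced_rfu, induced_od, uninduced_rfu, uninduced_od):
    """Ratio of OD-normalized induced to uninduced fluorescence.

    No background is subtracted, mirroring the raw-value presentation
    described in the text. OD normalization is an assumption of this sketch.
    """
    if min(induced_od, uninduced_od, uninduced_rfu) <= 0:
        raise ValueError("OD and uninduced RFU must be positive")
    return (induced_rfu / induced_od) / (uninduced_rfu / uninduced_od)


def passing_systems(readings, threshold=50.0):
    """Return the names of systems whose fold induction meets the threshold."""
    return [name for name, r in readings.items() if fold_induction(*r) >= threshold]


# Hypothetical readings: (induced RFU, induced OD, uninduced RFU, uninduced OD)
example = {
    "system_A": (12000.0, 1.0, 100.0, 1.0),
    "system_B": (3000.0, 1.2, 250.0, 1.0),
}
print(passing_systems(example))
```

Because the ratio is taken on uncorrected values, a leaky or autofluorescent uninduced culture lowers the reported fold change, which makes the cutoff conservative.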


The induction profiles of the Alphaproteobacteria A. fabrum, Ruegeria sp. TM1040, and Sulfitobacter sp. EE-36 share some similarities (Figure 2-3, Object 2-1 Supplementary Figures 8, 14, 15). The cumate-, arabinose-, and OHC14-inducible systems are similarly tight in the absence of inducer, with CymRAM/PCymRC and AraC/PBADmin inducible to nearly the same degree across all three strains. The expression profiles of TetR/PLtetO-1 are also very similar across timepoints, with leakiness apparent at four hours post-induction and moderate though leaky expression after overnight growth. LacI/PLlacO-1 is leaky but still highly inducible in both roseobacter species, and in all three Alphaproteobacteria the RhaS-RhaR- and PcaUAM-regulated systems were leaky and not inducible. Behavior across promoter-regulator pairs in the two Pseudomonas species was surprisingly different given their close phylogenetic relationship. The LuxR- and CinRAM-regulated systems were inducible to 846- and 597-fold in P. putida, but only 14- and 2-fold in P. aeruginosa, due to high leakiness (Figure 2-2C, Figure 2-3, Object 2-1 Supplementary Figure 11). There was also more than a 150-fold difference between the two species in the fold induction of AraC/PBADmin, though both reached similar levels of induced expression. Conversely, while the CymRAM-, TetR-, VanRAM-, and LacI-regulated systems were inducible over 50-fold in P. aeruginosa, the same systems had less than a 20-fold change in P. putida. The shared induction trends in the two roseobacter species and the stark differences between the Pseudomonas species highlight the lack of predictability of tools for controlling gene expression in closely related bacteria. The inducer concentration that yielded the highest expression level was not consistent between the different bacteria tested, likely a consequence of different


uptake capabilities (330, 344) (Figure 2-4A-F, Object 2-1 Supplementary Notes 3, 4). In some cases, there was little difference in expression across multiple inducer concentrations, and in a few cultures the highest level of inducer exhibited toxicity that prevented growth of the culture (Figure 2-4A). Alternatively, some inducible systems gave expression ranges in the shape of a bell curve across titrated inducer concentrations (Figure 2-4B-C). In other cases, saturation of fluorescent protein expression was clearly reached at a concentration less than the maximum (Figure 2-4D). Consistent with promoter responses observed previously, most of our data showed a limited degree of tunability with varying inducer concentration (345) (Figure 2-4D-F, Object 2-1 Supplementary Notes 3, 4). The quorum sensing-derived systems regulated by CinRAM and LuxR frequently exhibited a near-binary response across the inducer concentrations tested, especially after 24 hours of induction, though further titrations of these inducers might show a more linear response. Arabinose-, cumate-, vanillate-, and ATc-inducible systems were more likely to show some tunability across at least three inducer concentrations, indicating a space where finely tuned dose-dependent responses could be explored.

Assessment of Context Dependence

Each genetic part in our library should function independently of the other parts in the plasmid. However, in practice, the issue of context dependence frequently arises, where changing a single gene or genetic part affects nearby parts (5). To demonstrate that plasmid performance was not heavily influenced by context dependence, we first characterized the effect of each origin of replication on the behavior of the promoter-regulator. To do so, we compared four plasmids that were identical except for the origin of replication in both P. putida and P. aeruginosa. The uninduced expression was


similar across the four different origins in both bacteria (Object 2-1 Supplementary Figure 17). The LuxR/PLuxB promoter-regulator pair in P. putida performed similarly on a pBBR, RSF1010, and pSa backbone, with a fold change difference of less than 10 at two of the three inducer concentrations tested. The same system on an RK2 backbone had a noticeably smaller range of expression, likely due to the lower copy number of RK2 in P. putida as compared to other broad-host-range origins (330). This gives an example of how the choice of plasmid origin can be used to add an additional level of control over gene expression (346). VanRAM/PVanCC in P. aeruginosa also exhibited a smaller dynamic range on an RK2 backbone. The Tn7 integration plasmid was tested in P. aeruginosa and A. fabrum with the VanRAM/PVanCC system. Both bacteria had decreased leakiness and a lower range of expression compared to the same regulator on a replicating plasmid, as expected because of single-copy expression from the chromosome (Object 2-1 Supplementary Figure 18). These results demonstrate that promoter-regulator integration provides an alternative to vector-based expression to tune gene expression to the levels needed. We also assessed whether changing the resistance marker on otherwise identical plasmids affected promoter induction in A. fabrum. mRFP expression was measured from four promoter-regulator pairs, VanRAM/PVanCC, CinRAM/PCin, CymRAM/PCymRC, and LuxR/PLuxB, on kanamycin and gentamicin backbones. The results show that the relative expression for each system remained the same, though there were slight differences in the absolute level of expression (Object 2-1 Supplementary Figure 19). Comparing induction of the same system on each marker within each timepoint, only one of the four promoter-regulator pairs had a difference of more than 2-fold in


basal expression and one had a difference of more than 3-fold in induced expression. Relative to the range of expression of these systems, these differences are quite small. In sum, these results demonstrate that there is context dependence in gene expression, but it is minimal when comparing most plasmids. Moreover, the differing levels of expression could be used strategically to optimize expression for a specific use. To test whether these systems were orthogonal to each other and thus allow independent induction in the same cell, two plasmids with different inducible systems were transformed into P. putida, A. fabrum, and P. aeruginosa (plasmid combinations listed in Object 2-1 Supplementary Table 6). The systems were induced both individually and simultaneously to assess cross-reactivity and metabolic burden. In the absence of its cognate inducer, there was little to no expression from any of the regulators from early log phase through late exponential phase (data from overnight induction shown in Figure 2-4G). In A. fabrum and P. putida, strains with both LacI- and AraC-regulated systems were tested specifically because IPTG is known to inhibit AraC activity (347). Induction of AraC/PBADmin in the presence of IPTG was lower than when the two-plasmid strains were induced with only arabinose, though the effect was smaller in P. putida than in A. fabrum and the AraC-regulated systems were still functional in both cases. Induction rank order trends in the two-plasmid systems generally followed those from individual plasmid experiments (plotted on Figure 2-4G with dotted lines, data from Figure 2-3, Object 2-1 Supplementary Figures 8, 9, 11). However, in most cases the expression from single-inducer induction did not achieve the same maximum levels as in the individual plasmid experiments, likely due to stress from the maintenance of two


plasmids and their corresponding antibiotics. Expression was generally lower when both systems were induced, as expected, though the difference in dynamic range between single and simultaneous induction varied considerably depending on the two systems involved. For example, induction of pBLlG2 with IPTG in the presence of rhamnose dramatically decreased expression of GFP in P. putida, while the induction of LacI/PLlacO-1 was virtually unchanged when arabinose was also present. Nevertheless, these results confirm that our plasmids can be used to independently control the expression of multiple genes, though optimization of experimental design may be necessary to attain desired expression levels in any given host.

Controlled Expression of a Physiologically Relevant Gene

While the fluorescent protein assays described above are useful for identifying the fold change in expression, they are not ideal for determining the tightness of the promoter because low levels of fluorescent protein expression can be obscured by autofluorescence of the cells and medium (15, 348). To examine leaky expression with a physiologically relevant gene, we cloned the gentamicin resistance gene aacC1 into a plasmid with the CymRAM/PCymRC regulator to create plasmid pFCyGe2. Based on our fluorescence data, this system remained tightly repressed through log phase in B. thailandensis (Object 2-1 Supplementary Figure 10) and should yield cells that are sensitive to gentamicin in the absence of inducer. After reaching mid-log phase, the strains were serially diluted and spotted on agar plates (Figure 2-5). Both induced and uninduced cultures were spotted onto gentamicin plates, to quantify expression of aacC1, and onto kanamycin plates, to count the total number of viable cells. The results showed nearly a 200,000-fold difference between uninduced and induced colony counts when each value is calculated as a


proportion of total viable cells, demonstrating that this assay was very sensitive and there was a quick response to induction. In the absence of inducer only 5 × 10^4 cells/mL were viable, demonstrating that the promoter-regulators identified in the mRFP screen did in fact possess very low levels of leaky transcription (Object 2-1 Supplementary Figure 20).

Promoter Libraries Enable Varied Dynamic Range

Often, experiments that use inducible expression systems require a specific range of expression or very tight repression (167, 288, 349, 350). Accordingly, we sought to construct plasmids that had an array of dynamic ranges with a single inducible system, such that the user can easily choose the one that best fits their needs. There are a few available methods in the literature to change dynamic range, including modification of promoter architecture or mutation of the transcription factors themselves (351-353). Because our inducible systems are diverse (i.e., utilizing activators or repressors as transcription factors) and each system-host pairing is unique, we chose an exploratory approach through degenerate promoter libraries. Libraries were built in the LuxR/PLuxB, NahRAM/PSalTTC, and TetR/PTetA systems by targeting both the promoter driving expression of the reporter and the promoter of the regulator (Object 2-1 Supplementary Note 5), since the concentration of the regulatory protein has a direct influence on GOI output (328). We then examined these libraries in B. thailandensis, P. aeruginosa, A. fabrum, or A. fischeri by first screening 200-1,100 variants to identify those that possessed decreased leakiness or a larger dynamic range of expression than the original after 24 hours (Figure 2-6). Between 20 and 40 variants were then re-screened in a more sensitive assay to calculate the induction range, a subset of which are shown in the chart inserts in Figure


2-6 (expression data from all re-screened isolates shown in Object 2-1 Supplementary Figure 21). In A. fabrum, libraries of plasmids pFNR5 and pFTR5 had variants with induction fold changes of 54 and 108, compared to 40 and 26 for the original plasmids. For the pFLxR5 library, the fold change increased from 120 to 722, with the tightest variants nearing the detection limit for mRFP in our assay. The original pFTR5 construct performed poorly in B. thailandensis with a fold change of only 3, while the best members of the TetR/PTetA library had a fold change of 105, with increased expression in the on state and decreased leakiness in the off state. In A. fischeri, the pFLxR5 library had variants with an increased on state but none that had decreased leakiness. Lastly, in P. aeruginosa the original pFLxR5 plasmid was very leaky at the 24-hour time point, whereas the selected variants were much tighter in the absence of inducer at 24 hours, increasing the fold change from 29 to 445 (Figure 2-6, Object 2-1 Supplementary Figure 21). Library isolates with an increased dynamic range were either more tightly off in the absence of inducer, more highly expressing when induced, or both. The improved isolates from the LuxR-regulated library in P. aeruginosa and the LuxR- and NahRAM-regulated libraries in A. fabrum were less leaky than the original systems after an overnight induction but were similar to the original in the on state. Expression data from the complete library and the re-screened selected isolates suggest that the original inducible system was already expressing near the physiological limit for that strain (Figure 2-6, Object 2-1 Supplementary Figure 21). Induction results from single-plasmid screens support this, as the LuxR/PLuxB system is among the highest expressing of the 12 systems screened in P. aeruginosa and A. fabrum, and


NahRAM/PSalTTC is the highest overall expressing system in A. fabrum (Figure 2-3). Conversely, the improved isolates from libraries of the LuxR-regulated system in A. fischeri and the TetR/PTetA system in B. thailandensis had dramatically increased induced expression when compared to the original systems. Though a LuxR-regulated system is present natively in A. fischeri, the system included in this toolbox has a mutation in the -10 hexamer of the regulated promoter (68). The mutation was made rationally to improve dynamic range and decrease cross-reactivity in E. coli and ostensibly had the effect of decreasing induction in its native host. In a few libraries, isolates with the highest fold change were so tightly off in the absence of inducer that they neared the detection limit of our fluorescence assay. To further confirm that these LuxR/PLuxB promoters in P. aeruginosa were tightly repressed, we cloned the five-gene violacein biosynthesis pathway from Pseudoalteromonas luteoviolacea with its native operon structure in place of mRFP in the original pFLxR5 plasmid and several members of the LuxR/PLuxB library (354). Despite the large size of this operon (7.4 kb), four-piece plasmid assemblies were efficient. Without inducer, the amount of violacein extracted from each strain with a variant promoter was near the limit of detection, with crude extract measurements similar to the negative control, while that from the strain with the original promoter was nearly 6-fold greater (Object 2-1 Supplementary Note 6). In total, these results demonstrate that the dynamic range can be changed by screening a modest number of variants with targeted degeneracies and that promoter-regulators that appear non-functional can sometimes be improved to produce a robust dynamic response, as was the case with the TetR/PTetA system in B.


thailandensis. Moreover, these libraries can be directly used to screen for activity with desired properties in bacteria where few or no such systems exist.

Discussion

The comprehensive screening of inducible systems described in this work demonstrates the utility of a standard vector assembly method for identifying and characterizing gene expression systems in diverse bacteria. The plasmid-based design of these systems ensures modularity and facilitates quick and easy movement into and between hosts for testing with a fluorescent protein, with the option to integrate the system into the chromosome if desired. The design scheme of the system requires that all four genetic parts are present in the plasmid in the predetermined order, and deviating from the design requires creating new primers (details in Object 2-1 Supplementary Note 1). Nevertheless, moving from testing to utilization is quick since there are no sequence-based restrictions for adding a gene of interest in place of the reporter. The bacteria tested in this work include species that are well studied and widely used, such as Pseudomonas sp., A. fabrum, and B. thailandensis, as well as those less studied, i.e., Ruegeria sp. TM1040 and Sulfitobacter sp. EE-36. Similarly, the inducible systems screened herein include the commonly used rhamnose-, arabinose-, and IPTG-inducible systems (170, 187, 355), while the NahR, LuxR, and CinR regulators are arguably under-utilized given their effectiveness for controlling gene expression in several of the bacteria that we tested. When comparing the data gathered here to published expression data from systems in the same bacteria, our plasmids provide either a larger range of inducibility or a lesser degree of leakiness in many cases. Further, these comparisons demonstrate a lack of predictability for expression


systems moved from host to host, highlighting the need for standardization and broadly available genetic tools. P. putida has emerged as a prominent microbial chassis for metabolic engineering due to its versatile metabolism and stress endurance traits (39). Both native and heterologous inducible systems have been employed to control gene expression with varying success. In a screen of the natural E. coli inducible promoters PRhaB, PAraB, PLacUV5, and PT7 and the Pseudomonas promoters Pm, PSal, and PAlkB in P. putida, all but PAraB had leakiness that was at least two orders of magnitude above background, with the Pm promoter being the leakiest (344). In comparison, our data shows that basal expression was near baseline for the majority of systems screened while still inducible by up to 850-fold (Figure 2-2C). Differences in expression observed when comparing E. coli to P. putida emphasize the unpredictability that comes with moving a system into a new host (208, 344, 356, 357). Increasingly important synthetic biology hosts include A. fabrum and Burkholderia sp., due to their relevance as etiological agents of disease (358-360), and A. baylyi (361), due to its genetic malleability. The most commonly used systems for regulating gene expression in these bacteria are the E. coli IPTG-, arabinose-, and rhamnose-induced systems. Previous work in A. fabrum had mixed results on the effectiveness of LacI-regulated systems, with LacIq/PLac exhibiting only a 6-fold change in expression when induced (337) while LacI/PLac was induced over 300-fold (345). AraC-regulated systems are highly inducible in A. fabrum with low basal expression (206, 345). Our data demonstrate that LuxR- and CinR-regulated systems are significant


additions to these available tools, with induction levels of 100-fold and near 800-fold, respectively. Arabinose- and rhamnose-inducible systems are the most widely used for controlling gene expression in Burkholderia sp. (158, 206, 210), inducing 5- to 21-fold higher than E. coli (205). In B. thailandensis, we observed inductions of over 350-fold with LuxR/PLuxB and over 50-fold with a cumate-inducible system. In A. baylyi, the E. coli promoters PBAD and PTac were inducible to over 100-fold with varying levels of basal expression (153). Similarly, another group found the IPTG-inducible Trc, Tac, and T5 promoters to be highly inducible in A. baylyi ADP1 and generated a Trc promoter library, identifying isolates with up to a 73-fold induction (259). While the E. coli-derived systems that we tested did not have a high range of induction, we observed fold changes of 100 to 200 from the CinRAM-, LuxR-, and CymRAM-regulated systems in A. baylyi, thus expanding the options for gene regulation in this organism. The degenerate library methodology employed here successfully expanded the expression range and presented a simple method to identify promoters with expression in the desired range. Perhaps the most surprising observation in our data was that the largest dynamic range of expression came from the CinRAM-regulated system in Ruegeria sp. TM1040 and Sulfitobacter sp. EE-36, at 1,235-fold and 2,191-fold, respectively. These results confirm that our toolbox can identify systems for gene regulation where none existed before. The toolbox described here allows for the systematic evaluation of the three key components needed to develop genetics systems in non-model bacteria: the origin of replication, antibiotic resistance marker, and promoter-regulator. Designed for inducible gene expression, these plasmids are not as customizable as other toolboxes, but their


simplicity and ease of use expedite the design and build stages of the bioengineering design-build-test-learn cycle. Even in bacteria that have developed genetics systems, these plasmids will enable parts standardization, increased reproducibility, and streamlined cloning to speed plasmid construction. As microbial synthetic biology continues to move into more diverse hosts, predictable broad-host-range expression systems will be essential to advance the field. These systems can be used for various applications, such as directing flux toward value-added products in metabolic engineering (167) and implementing heterologous tools for genetic manipulation. For example, tools such as CRISPR interference or CRISPR activation have enhanced our capacity to manipulate bacterial cells, but they still require control at the transcription level for precise temporal function (362-364). These systems are only as good as the underlying promoters driving the expression of the Cas genes. In most bacteria, there are no well-developed options, making this toolbox of immediate practical value. Though we only tested members of the Proteobacteria, the same genetic parts will likely function in other Gram-negatives, and perhaps an even wider range of bacteria, since there is precedent for the RSF1010 origin of replication being maintained in some Gram-positive bacteria (365). There is evidence that some of the inducible systems tested here are functional in Gram-positive species as well. For example, the Tn10-encoded tet regulatory system has been used in Bacillus subtilis (366), and the AraC-regulated system functions in Corynebacterium glutamicum (367). Importantly, the orthogonality of most of the promoter-regulators and the availability of both multiple origins of replication and antibiotic markers will facilitate experiments that require independent expression of multiple genes in the same cell. With genetic part versatility,


flexible swapping, and ease of new part addition, this toolbox is a valuable addition to the field and will be useful as new microbial hosts are explored.

Object 2-1. Supplementary information for A Plasmid Toolbox for Controlled Gene Expression Across the Proteobacteria, PDF 3.2 MB


Table 2-1. Strains investigated in this study.

Strain                                Phylogenetic Class
Acinetobacter baylyi ADP1             Gammaproteobacteria
Agrobacterium fabrum C58              Alphaproteobacteria
Burkholderia thailandensis E264       Betaproteobacteria
Pseudomonas aeruginosa PAO1           Gammaproteobacteria
Pseudomonas putida KT2440             Gammaproteobacteria
Sulfitobacter sp. EE-36               Alphaproteobacteria
Ruegeria sp. TM1040                   Alphaproteobacteria
Xanthomonas campestris ATCC 33913     Gammaproteobacteria
Aliivibrio fischeri ES114             Gammaproteobacteria


Figure 2-1. Plasmid toolbox assembly scheme and nomenclature. Each plasmid is composed of four genetic parts that share overlapping primer sequences, requiring only four primer pairs to assemble any version of the plasmid. Plasmids are named based on the codes provided, in the order: Origin, Regulator, Reporter, and Marker.
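The size of the design space implied by the part counts given in the Results (4 origins of replication or a Tn7 integration vector, 12 promoter-regulator pairs, 7 reporters, and 8 antibiotic markers) can be worked out with a short sketch. Counting the Tn7 vector as a fifth backbone option is an assumption of this illustration, as are the dictionary keys and function names; the sketch does not claim that every combination has been built or is functional.

```python
from itertools import product
from math import prod

# Part counts from the Results; "backbone" = 4 origins + 1 Tn7 vector
# (treating Tn7 as a fifth backbone is an assumption of this sketch).
PART_COUNTS = {"backbone": 5, "regulator": 12, "reporter": 7, "marker": 8}

# Naming order given in the figure caption: Origin, Regulator, Reporter, Marker.
PART_ORDER = ("backbone", "regulator", "reporter", "marker")


def design_space_size(counts):
    """Number of distinct four-part plasmids that the part counts allow."""
    return prod(counts[name] for name in PART_ORDER)


def enumerate_designs(counts):
    """Yield every combination as a tuple of part indices in naming order."""
    return product(*(range(counts[name]) for name in PART_ORDER))


print(design_space_size(PART_COUNTS))
```

Because every part category shares conserved overlap sequences, the number of primer pairs needed for assembly stays at four regardless of how large this product becomes.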


Figure 2-2. Experimental outline and induction screen results. A) Workflow for inducible systems screens. B) Promoter-regulators with >50-fold induction range. Fold change was calculated without correcting for autofluorescence of the cells and medium. Floating lines represent the induction range of mRFP, with fluorescence in the absence of inducer plotted at the bottom of each line and induced expression plotted at the top of the vertical line. Data is clustered by the host strain. Strains on x-axis: Pa (P. aeruginosa), Pp (P. putida), Af (A. fabrum), Bt (B. thailandensis), Xc (X. campestris), Ab (A. baylyi), Sb (Sulfitobacter sp. EE-36), and Rp (Ruegeria sp. TM1040). Horizontal lines at each cluster represent the average fluorescence of control strains that did not possess mRFP. C) Fold change heatmap of all bacteria and inducible systems. The fold change was calculated from RFU data normalized to OD and background fluorescence of the medium and empty vector control after 24 hours of growth for all bacteria except A. fischeri, where the medium only was used for normalization. Strain abbreviations are the same as in B, plus Av (Aliivibrio fischeri). Inducible systems on the y-axis are labeled with the transcription factor. For TetR systems, TetR 1 refers to TetR/PTetA and TetR 2 refers to TetR/PLtetO-1.


Figure 2-3. Induction range of 12 expression systems in nine Proteobacteria. A) Expression of mRFP for each of 12 inducible expression systems after overnight growth in each bacterium, with induction range represented by floating bars with fluorescence in the absence of inducer plotted at the bottom of each bar and induced expression plotted at the top. B) Expression of mRFP in late exponential phase of growth graphed by the inducible system. On each graph, expression from the nine Proteobacteria is displayed in the following order: P. aeruginosa, B. thailandensis, A. fabrum, P. putida, A. baylyi, X. campestris, Ruegeria sp. TM1040, Sulfitobacter sp. EE-36, and A. fischeri. Data is presented without correcting for autofluorescence of the cells and medium.


Figure 2-4. Measurement of mRFP at titrated inducer concentrations. A) pFLxR5 in P. putida B) pKNR5 in X. campestris C) pBCyR5 in P. aeruginosa. Vertical bars represent range of expression at five concentrations of inducer in RFU and gray circles are OD660 at late stationary phase +/- SE of triplicates. D) pFCiR5 in A. fabrum E) pFAR5 in B. thailandensis F) pBLtR5 in Ruegeria sp. TM1040. Data points represent fluorescence normalized to growth (OD660) from samples grown in the absence of inducer (U) and at five inducer concentrations. Exponential phase (closed circles) and late stationary phase (open squares) +/- SE of triplicates. G) Expression data from independent induction experiments. Strains containing two plasmids with a unique promoter-regulator pair and reporter were induced both individually and simultaneously. For each bacterium, the top and bottom graphs show fluorescence data from GFP and mRFP, respectively. Plasmid combinations are listed in Object 2-1 Supplementary Table 6. For each data cluster, floating lines represent expression from the following conditions in order: expression with inducer (closed circle), expression without inducer (closed circle), and expression with both inducers (open square). Data from strains with the corresponding single plasmid are included on mRFP graphs (dashed line). The data shown are the average RFU of triplicates after an overnight induction.


Figure 2-4. Continued.


Figure 2-5. Conditionally essential gene to measure tightness of repression. The gentamicin acetyltransferase gene aacC1 is placed under control of P_CymRC in non-inducing and inducing conditions. A) B. thailandensis pFCyGe2 strains are plotted as a percentage of the total number of viable cells containing the plasmid. B) Serial dilutions of B. thailandensis plated onto media containing gentamicin or the backbone antibiotic kanamycin, with and without the addition of the inducer cumate. Data from cultures in exponential phase of growth.


Figure 2-6. Expression of total library and select library isolates. Expression data from all screened isolates of the LuxR/P_LuxB library in P. aeruginosa (top left), A. fabrum (top right), and A. fischeri (center left), the TetR/P_TetA library in B. thailandensis (center right) and A. fabrum (bottom left), and the NahR^AM/P_SalTTC library in A. fabrum (bottom right). Grey and black lines show uninduced and induced expression of each isolate, respectively, and the overlaid scatterplot shows the corresponding fold change. Symbols in red represent the fold change of the original plasmids. Data is sorted by induced RFU. Inserted floating bar charts represent expression ranges from isolates with the highest fold change from each respective library. The expression range from the original plasmid is represented in a shaded box, and fluorescence from the empty vector control is shown as a black horizontal line. Data is an average of three replicates after overnight induction.




CHAPTER 3
PLASMIDS FOR CONTROLLED AND TUNABLE HIGH-LEVEL EXPRESSION IN E. COLI

Controlled gene expression is crucial for engineering bacteria for basic and applied research. Inducible systems enable tight regulation of expression, wherein a small-molecule inducer causes the transcription factor to activate or repress transcriptional initiation. The T7 expression system is one of the most widely used inducible systems, particularly for high overexpression of proteins. However, it is well known that the highly active T7 RNA polymerase (RNAP) has several drawbacks, including toxicity to the host and substantial leaky expression in the absence of an inducer. Much work has been done to address these issues; however, current solutions require special strains or additional plasmids, making the system more complicated and less accessible. Here, we challenge the assumption that the T7 expression system is the best choice for obtaining high protein titers. We hypothesized that strong inducible promoters on high-copy plasmids could compete with the expression levels obtained from T7 RNAP while offering improved control of transcription. Employing inducible systems from a toolbox we developed previously, we demonstrate that our plasmids consistently give higher outputs and greater fold changes over basal expression than the T7 system across rich and minimal media. In addition, we show that they outperform the T7 system using an engineered metabolic pathway to produce lycopene.

Genetic systems for protein overexpression are required tools in microbiological and biochemical research. Ideally, these systems include standardized genetic parts with predictable behavior, enabling the construction of stable expression systems in the host organism. Modularity of a genetic system is advantageous so that the expression


system can be easily moved into a host that best suits the needs of a given experiment. The T7 expression system lacks both predictability and stability and requires special host strains to function. Despite this, it remains one of the most popular systems for protein overproduction. This study directly compares the T7 system to four inducible systems from our broad-host-range plasmid toolbox, demonstrating that these alternative expression systems have distinct advantages over T7. The systems are entirely plasmid-based and not constrained to a specific bacterial host, expanding the options for high-level protein expression across strains.

Introduction

Escherichia coli has been a workhorse in the field of microbiology for decades, serving as both a model organism and an intracellular workbench for molecular biology studies (22-24, 368). A variety of systems exist for heterologous protein expression in these cellular factories, with the T7 expression system among the most popular (126, 270, 274). This system involves a chromosomally encoded bacteriophage T7 RNA [...] The T7 promoter regulates expression of the target gene and is usually contained on a plasmid. The T7 RNAP recognizes its promoter sequence with stringent specificity and is very efficient, generating high polymerase flux to maximize target protein production (126, 275). For greater control of gene expression, the native T7 RNAP promoter was replaced with the lacUV5 inducible promoter, while an inducible variant of the T7 promoter, T7lac, is often used to drive expression of the target gene (126, 270, 369).

Even with these parts in place, T7 systems are notoriously leaky. Due to the high activity of the T7 RNAP, even low-level basal expression of the polymerase leads to high expression of the target gene. This basal expression of T7 RNAP decreases the


stability of protein production strains, particularly when the target proteins affect cell fitness (273, 285, 286). Additionally, the high processivity of T7 RNAP can come at a large fitness cost to the host due to competition for cellular resources (125, 126, 269, 274). Numerous strategies have been used to increase the stringency of T7 RNAP repression, but toxicity and leakiness remain a concern (59, 159, 171, 288, 293, 299-301, 324). The conventional T7 system also lacks tunability, meaning that inducer concentration is not correlated with protein output levels in a dose-dependent manner (288). Moreover, induction often results in a mixed population of cells that express the target protein at different levels. Uniformity of expression can vary depending on the available carbon source and the presence of toxicity escape mutants, which can lower or abolish protein production (288, 293, 294, 296, 303, 370). These problems necessitate the use of freshly transformed cells due to the propensity for chromosomal mutations of the host strain that diminish levels of T7 RNAP (159, 297).

Still, systems for high-level overexpression of recombinant proteins from single genes or multi-gene pathways are extremely valuable. The regulated coexpression of multiple genes is necessary for building biosynthetic pathways and can reduce the occurrence of non-functional protein aggregates (368). The Duet plasmids (Novagen) were developed to meet this need, enabling the coexpression of up to eight genes on four compatible vectors (371), and T7lac promoters regulate all target genes, so many of the limitations discussed above apply. While plasmid-borne T7 RNAP can also be used, the aforementioned problems are often exacerbated (163, 287, 372). Significantly, use of the T7 promoter to


regulate all inserted genes only allows for a rough measure of tunability in the form of relative copy number, making fine control of gene expression virtually impossible (373).

More recently, an impressive work by Meyer et al. greatly expanded the tools available for coexpression with the development of the Marionette system. Here, E. coli strains house 12 evolved transcriptional regulators on the chromosome and cognate output promoters on a single plasmid (68). The researchers demonstrated that many of the promoters have large dynamic ranges, and that several could be used together to construct a biosynthetic pathway in which each gene can be independently tuned. However, the system is restricted to the Marionette strains, which include all regulatory elements integrated into the chromosome, and lacks the additional level of tunability that comes from varying copy number with different plasmid backbones (346).

We recently developed a broad-host-range plasmid toolbox for tunable gene expression and tested it across nine species of Proteobacteria (51). Having attained very high levels of expression in many of these species, we wondered if some of the inducible systems were capable of competing with the expression levels obtained from T7 RNAP in E. coli. We hypothesized that expression from our toolbox promoter-regulator pairs would enable high protein overexpression and improved control of transcription. Our plasmid system's ease of assembly helped us efficiently construct a collection of 28 plasmid variants to test four toolbox inducible systems on a set of origin and marker backbones. We were interested in demonstrating a dynamic range of expression using high- and low-copy plasmids, tuning transcription via titrated inducer, or a combination of both strategies.


In the work presented here, we add four E. coli origins of replication to our toolbox and report characterization data for four inducible expression systems to demonstrate their large dynamic range and utility in gene coexpression experiments in E. coli. We also assessed whether these plasmids could outperform canonical T7 promoter plasmids in strength, stability, and utility.

Materials and Methods

Plasmid Construction and Transformation

Plasmids were assembled using NEB HiFi Assembly with PCR-amplified genetic parts using a protocol established in previous work from this lab (51). A list of regulatory parts and their sources is available in Object 3-1 Supplementary Table 2, and schematics of genetic parts are in Object 3-1 Supplementary Figure 1. Plasmids available at Addgene are listed in Object 3-1 Supplementary Table 1. Primers used in this study are listed in Table 3-2.

All recipient E. coli strains were transformed via electroporation using the following protocol: 5 mL cultures were started from isolated colonies and incubated with shaking overnight. The following day, the cells were subcultured 1:50 in 5 mL of fresh media and grown until they reached exponential growth, or an OD of approximately 0.5. The total culture volume was then spun down in microcentrifuge tubes at 5000 × g for 2 min. Culture supernatants were aspirated, and cell pellets were resuspended in 1 mL of 300 mM sucrose at room temperature and centrifuged again for an additional 2 min at 5000 rpm. The wash was repeated, and then cells were resuspended in a final volume of 1/10 of the initial culture volume. Suspensions of 50 µL were electroporated in a 1 mm electroporation cuvette at 1.8 kV. Cells were recovered in 1 mL of LB and incubated for 1 hour at 37°C.


Both plasmids were electroporated together for the multiple-plasmid systems tested in Figure 3-5. Strains utilized in the lycopene production experiments in Figure 3-6 could not be efficiently transformed simultaneously, so the transformations were done sequentially. Wild-type E. coli MG1655 or BL21(DE3) was transformed with the first plasmid following the protocol outlined above; a single colony from the transformation plate was then grown up, and the same process was followed to transform the second plasmid.

Fluorescence Assays

Fluorescence measurements displayed in Figures 3-1 through 3-5 were taken as follows. For each set of plasmids housed in E. coli MG1655 to be screened, glycerol stocks were streaked onto fresh plates. Isolated colonies were used to inoculate 1 mL of media in a deep-well plate and incubated on a plate shaker overnight. The following day, the cultures were diluted to an OD of 0.1 in 1 mL of fresh media and antibiotics in a deep-well plate. At exponential phase, the cultures were diluted into 100 µL of fresh media in a 96-well plate (Costar, black, clear bottom) to an OD of 0.07 and incubated with shaking for 0.5 hours. At this point, 100 µL of media with antibiotic and 2× inducer was added to wells to induce samples, and 100 µL of media with antibiotic only was added to uninduced control wells. The plate was incubated on a plate shaker, and fluorescence and OD measurements were taken at 1, 2, 4, 6, and 24 h post induction in a plate reader (Molecular Devices SpectraMax M3). All experiments were performed in three technical replicates, and both WT and empty vector controls (cells transformed with a plasmid lacking mRFP) were included on each plate as negative controls. The same protocol was followed for measurements in E. coli BL21(DE3), except that all cultures were started from freshly transformed cells.
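The dilution steps in the fluorescence assay can be sketched as simple C1·V1 = C2·V2 arithmetic. The snippet below is only an illustration, not the authors' procedure: the function names are hypothetical, the starting OD of 2.0 is an invented example value, and it assumes OD scales linearly with cell density over the range used.

```python
def dilution_volumes(od_culture, od_target, final_volume_ul):
    """Return (culture_ul, fresh_media_ul) for a C1*V1 = C2*V2 dilution.

    Assumption: OD is proportional to cell density over the range used.
    """
    if od_culture <= 0 or od_target <= 0:
        raise ValueError("OD values must be positive")
    if od_target > od_culture:
        raise ValueError("cannot dilute to a higher OD than the culture")
    culture_ul = final_volume_ul * od_target / od_culture
    return culture_ul, final_volume_ul - culture_ul


def final_concentration(stock_conc, stock_volume_ul, culture_volume_ul):
    """Concentration in the well after adding inducer-containing media."""
    total_volume_ul = stock_volume_ul + culture_volume_ul
    return stock_conc * stock_volume_ul / total_volume_ul


if __name__ == "__main__":
    # Hypothetical overnight culture at OD 2.0, diluted to OD 0.1 in 1 mL.
    culture_ul, media_ul = dilution_volumes(2.0, 0.1, 1000)
    print(f"{culture_ul:.1f} uL culture + {media_ul:.1f} uL fresh media")

    # Adding 100 uL of 2x inducer media to 100 uL of culture yields 1x.
    print(final_concentration(2.0, 100, 100))
```

The second helper makes explicit why the inducer is prepared at 2×: mixing equal volumes halves the concentration, so the well ends up at the intended 1× level.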


Calculations and data analysis were performed using Microsoft Excel. Each screening dataset was first organized into timepoint OD and RFU measurements, and OD was adjusted to a 1 cm pathlength by dividing by a factor of 0.56 or 0.28 for the measurement of 200 or 100 µL of culture, respectively (51). As noted in the figure descriptions, raw fluorescence data was either used directly or normalized to optical density readings. Fold change values were calculated by subtracting the uninduced fluorescence from the induced fluorescence and dividing this value by the uninduced fluorescence.

Stability Screen Experiments

Culturing: Cultures were started from freshly transformed BL21(DE3) cells in 1 mL M9 glucose 0.4% with the appropriate antibiotics in a deep-well plate. Eight technical replicates were included for each of the 12 strains under study. The deep-well plate was grown with shaking overnight at 37°C, and the following day, the cultures were diluted 1:1000 into fresh media in a deep-well plate, marking day 1 of the screen. This plate was grown with shaking overnight, and the following day, the culture was diluted 1:100 into two deep-well plates: one with glucose-supplemented minimal media and relevant antibiotics only, and one with relevant inducers added to the media. This process was repeated, diluting overnight cultures from the induced and uninduced plates 1:100 into fresh media with and without inducer, respectively, for 12 total days.

Fluorescence measurement: On each day, 200 µL of overnight culture was transferred from both deep-well plates into 96-well plates (Costar, black, clear bottom) to read fluorescence in a plate reader (Molecular Devices SpectraMax M3). On day 11, 50 µL of culture was taken from induced cultures and streaked onto LB plates with the


appropriate antibiotic and relevant inducer. After an overnight incubation at 37°C, pictures were taken of each plate under blue light to visualize RFP fluorescence of colonies.

Visualization: The day 12 plate of induced cultures was used for cytometry analysis. After an overnight growth, 500 µL was transferred to a deep-well plate and spun down in a plate spinner until cells were pelleted in each well. The supernatant of minimal media was aspirated, and the pellet was resuspended in 500 µL of a 4% formalin cell-fixing solution and incubated for 10 minutes. The plate was then spun down again to pellet the cells, the fixing solution was removed using suction, and the cells were resuspended in 500 µL of PBS.

Flow cytometry: Measurements were taken using a green laser (488 nm) with the standard 670 LP filter, and 10000 events were used for analysis. Samples were left ungated to allow for the detection of multiple peaks. Data was analyzed using FlowJo (version 10.8.1) and displayed as histograms. The geometric mean and coefficient of variation are shown for each replicate in Object 3-1 Supplementary Table 4.

Lycopene Experiments

Cultures were started from freshly transformed BL21 and MG1655 cells and grown overnight at 37°C in 5 mL of LB media with the appropriate antibiotics. Overnight cultures were inoculated at 1% vol/vol into 3 mL of M9 media supplemented with 0.32% glucose, 0.5% casamino acids, ATCC Trace Mineral supplement, and the appropriate antibiotics, and grown in glass tubes with rubber stoppers at 37°C until cultures reached exponential phase. At this point, cognate inducer(s) and 25 mM isoprenol were added to the cultures, and the glass vials were sealed with a Teflon-coated stopper and crimp-sealed to prevent the evaporation of isoprenol. Inducer


concentrations per strain were as follows: 0.1 mM IPTG for LycO, 10 µM OC6 for Lyc1, and 10 µM OC6 and 100 µM cumate for Lyc2. Cultures were then grown overnight, and lycopene extraction was performed the following morning. 500 µL of each culture was spun down in 1.5 mL microcentrifuge tubes at 16,000 × g for 1 min. The supernatant was decanted, and the pellet was resuspended in 1 mL of a solution of 50% ethanol and 50% acetone to extract lycopene. The tubes were then vortexed for 15 minutes and centrifuged again for 1 min at 16,000 × g to remove particulates. The extraction took place in a darkened room, as lycopene is light sensitive. 200 µL was then transferred to a microplate, and absorbance was recorded at 475 nm.

Results

Plasmid Design, Tunability, and Context Dependence

Plasmids were constructed with four interchangeable parts with common linker sequences so that combinatorial assembly by ligation-independent cloning could be quickly performed. With four antibiotic markers, four promoter-regulator pairs, and the addition of four enteric-bacteria-specific origins of replication to our toolbox, we constructed 28 plasmid variants and characterized their expression in E. coli. The expression data in Figure 3-1 is grouped in shaded boxes such that only one genetic part is varied within each box, to assess how this single variable changed the expression levels. Our a priori assumption was that changing the marker would have little influence on the expression levels, while changing the origin and regulator would have greater influence.

Pairwise comparison between plasmids that differ only in their origin shows that expression levels change greatly between some plasmids (Figure 3-1A). In E. coli, the pACYC backbone has a copy number of 10-12, pCOLA and pCDF have a copy number


of 20-40, and the pET backbone has a copy number of ~40 (368, 371). Plasmid copy number is often directly related to gene expression because more DNA templates are available for transcription, though this relationship is not always maintained (15, 169, 330, 346). Our expression data mostly follows copy number trends, with all expression systems on a pACYC backbone giving a maximal expression around 10^3 RFU, while systems on pCDF and pET backbones were more likely to approach 10^4 RFU. LacI-regulated systems on a pET backbone are notable exceptions; while pELx, pECy, and pEV have an output near 10^4 RFU by late exponential phase, pELl was at or under 10^3 RFU after an overnight induction. Indeed, expression from LacI/P_LlacO-1 was consistently lower than that from the other promoter-regulator pairs tested across origin and marker combinations, suggesting that the lower output is a property of this system rather than of its plasmid context.

In Figure 3-1B, expression levels after changing only the marker are compared within four different origin-regulator pairs, and in Figure 3-1C, inducible systems are compared on the same origin-marker backbone. At the late exponential phase measurement, leakiness and expression levels are remarkably similar for pCLx, pELl, and pAV, regardless of the marker. Expression from pDCy is comparatively less consistent and is influenced by the marker to a greater extent. Similarly, in the late exponential phase, expression from plasmids with the pCDF backbone varied with the promoter-regulator pair more than expression from plasmids with other origins. Another trend is the high level of uninduced expression of mRFP in VanR^AM-regulated systems after overnight growth. This is true across plasmid backbones, suggesting a characteristic intrinsic to VanR^AM/P_VanCC in E. coli. While the original description of the evolved VanR^AM showed leakiness in M9


medium (68), our results showed high uninduced expression in rich medium as well. All other systems maintain a low basal expression level, except for pCLxR4, which becomes leaky at stationary phase.

We next assessed titrating expression level by changing the inducer concentration. The pA R1 plasmids with each promoter-regulator pair were screened with varying inducer levels, and the fluorescence was measured at late exponential and stationary phases (Figure 3-2, Object 3-1 Supplementary Figure 2). Both the LacI- and VanR^AM-regulated systems exhibit a bell curve of expression levels across titrated inducer concentrations, particularly at the late exponential timepoint. Across inducer concentrations, the VanR^AM/P_VanCC system is inducible from 19- to nearly 430-fold at six hours post induction, though much of this range is lost by stationary phase due to increased leakiness in the absence of inducer. After an overnight induction, pAVR1 has a maximal fold change of 40 with 2 mM vanillate. At the late exponential measurement, the LacI-regulated system has a smaller range of induction across inducer concentrations than VanR^AM/P_VanCC. However, LacI/P_LlacO-1 maintains a relatively low level of basal expression through stationary phase and remains inducible over 200-fold [...] LlacO-1 is sensitive to inducer such that it exhibits a 30 [...] IPTG at both timepoints measured, suggesting a space for further titrations to tune expression while minimizing inducer cost.

Across the inducer concentrations tested, the LuxR-regulated system is tunable [...] change values following a nearly log-linear response. At stationary phase, LuxR/P_LuxB is also inducible to 40-fold at the


lowest concentration of inducer tested (64 nM OC6), suggesting that this system is highly sensitive and that further inducer titrations may continue the log-linear trend seen across most of this dataset. Though CymR^AM/P_CymRC has the highest maximal fold change of the entire dataset at both timepoints measured, this system exhibits the most binary response. Though expression reached 50- to 80-fold with 3.2 µM cumate at both timepoints, fold changes under 3 or over 300 were found at other inducer [...] tuned expression.

Expression Stability Over Time

We were interested in testing the stability of our plasmids to ascertain whether expression remained consistent throughout several passages. If this were true, our toolbox systems would demonstrate a distinct advantage over T7 expression systems in BL21(DE3) by maintaining stable and predictable high-level expression. Our experiment measured mRFP expression from eight toolbox plasmids and four Duet plasmids in minimal medium over 12 daily passages under the pressure of continuous induction. Eight replicates were included for each of the 12 plasmids to monitor changes in mRFP expression (Figure 3-3). On day 11, one replicate from each sample was streaked on a plate spread with inducer and photographed under blue light to visualize colonies with red fluorescence (Object 3-1 Supplementary Figure 3). The percent change in growth-normalized fluorescence on each day relative to the benchmark measurement (day 2) was plotted for each replicate in Figure 3-3.

The T7 systems were very unstable, consistent with previous studies (287). Among the Duet vectors, there were no apparent differences in system behavior among the different origin and marker backbones. The T7 expression systems consistently lost


protein production capacity across all plasmid genetic backgrounds, and most replicates had lost over 80% of their fluorescence signal by day 7. The loss of fluorescence occurred most rapidly in pCOLADuet-1, with five of the eight replicates exhibiting over a 75% decrease compared to benchmark measurements by day 4. Though pETDuet-1 possesses the origin with the highest copy number, the percent change throughout the experiment is not notably different from the others. In fact, at least one replicate from all other Duet vectors lost almost 60% fluorescence after just 2 to 3 passages, whereas this degree of reduction was not seen in pETDuet-1 until day 5. This suggests that copy number alone does not determine the stability of an expression vector, and that toxicity escape mutations cannot necessarily be avoided by using a low-copy plasmid.

Among the toolbox expression systems, those tested on pACYC and pCOLA backbones best maintained expression over the 12-day experiment (Figure 3-3). Expression from pACyR1 and pCCyR2 was particularly stable, and by day 12, only three of sixteen replicates from these two plasmids had decreased expression compared to benchmark measurements, with the greatest decrease at just 12%. Though mRFP expression from LacI/P_LlacO-1 on pA and pC backbones generally decreased throughout the experiment compared to benchmark measurements, percent decreases averaged 20% and 30% by day 12 for pCLlR2 and pALlR1, respectively. Conversely, the expression decline from pDLxR6 and pDCyR6 was only slightly less pronounced than that from the T7 system on the same backbone, though pCDFDuet-1 lost fluorescence sooner and was less consistent across replicates. A notable anomaly among our toolbox plasmids is pELxR4. Fluorescence from each replicate in pETDuet-1 and pELxR4 is less than half the benchmark


measurements by day 7, and most have decreased over 90% by day 12. CymR^AM/P_CymRC on the same backbone, pECyR4, lost an average of 6% fluorescence by day 12 compared to 93% from pELxR4. Again, expression from pETDuet-1 is less consistent across replicates, likely a result of accumulated mutations that affect protein production in different ways.

To measure changes in promoter activity at the cellular level, we applied flow cytometry to the day 12 samples (Object 3-1 Supplementary Figure 4). In the Duet vectors, cell counts peak at different fluorescence intensities, and the spread of peaks on the x axis representing relative fluorescence varies considerably (Object 3-1 Supplemental Table 4). With our toolbox plasmids, the peaks are narrower and more uniform, though there is increased variability among replicates in pDLxR6 and pELxR4, as expected based on the population measurements. For most of the Duet vector samples, the geometric mean of fluorescence intensity is an order of magnitude lower than those of the toolbox plasmids, and three of the four Duet vectors had replicates with a coefficient of variation an order of magnitude higher than any replicate from the toolbox plasmids (Object 3-1 Supplementary Table 4). This highlights the poor stability and lack of predictability in T7 expression and an overall decline in protein production. The fully plasmid-based inducible systems in this toolbox on the same backbones consistently generate higher protein levels after many subculturings and have reduced cell-to-cell variation.

Growth Medium-Dependent Properties

We next took the pD toolbox plasmid variants and pCDFDuet-1 with mRFP cloned into the first multiple cloning site and compared the expression levels in E. coli BL21(DE3) with different growth media: Luria Broth (LB), M9 supplemented with


glucose (M9Glu), and M9 supplemented with glycerol (M9Gly). While we expected overall expression to be higher in LB, we were interested in comparing expression levels between M9Glu and M9Gly, as glycerol is less expensive than glucose and is being used as an alternative carbon source in metabolic engineering experiments (374, 375).

In both rich and minimal media, all toolbox plasmids exhibited lower leaky expression than pCDFDuet-1. At the same time, many maintained output levels that were as high or higher when induced (Figure 3-4). Output levels from the T7lac promoter across all media tested were less than 15-fold above basal expression by the stationary phase timepoint, and in all cases, these low fold changes can be attributed to high uninduced expression, highlighting a lack of controlled induction. In LB, the highest mRFP expression over basal levels came from pDCyR6, with a fold change of over 650 at both timepoints taken. Here, leaky expression from pDCyR6 remained two orders of magnitude below that of the T7 promoter by stationary phase and was among the lowest of all expression vectors tested. Expression from pDCyR6 also had the biggest fold change of the pD plasmid variants by stationary phase in M9Glu and M9Gly, with similarly low uninduced expression. In this way, pDCyR6 provides a versatile option for tightly regulated plasmid-based expression in different culturing conditions. While induction profiles show that pDCyR6 and pDCyR2 have similar maximal RFU outputs across media types, pDCyR2 has lower fold changes due to leaky expression, consistent with our previous screening data (Figure 3-1).

While very high levels of inducible expression are often desirable, some experiments require low expression to match physiological levels or avoid over-burdening the host (169). Of the four CymR^AM-regulated systems tested here, pDCyR1 consistently had the lowest expression across the three media types by stationary phase and remained very tightly off after overnight growth. The pDLlR6 plasmid had the lowest induced expression of the three remaining systems while remaining tightly off in the absence of an inducer, consistent with data showing that LacI/P_LlacO-1 generally has lower induced expression levels (Figure 3-1). The pDLxR6 plasmid had the highest overall RFU output, with expression 4-fold and 12-fold higher than the T7 promoter in M9Glu and M9Gly, respectively, after an overnight induction and growth, and over 350-fold in rich media at both timepoints. This is consistent with our previous data, ranking pDLxR6, pELxR4, and pDCyR2 among the highest-expressing plasmids in the toolbox. Though pDLxR6 and pDCyR2 have higher leaky expression than the same systems on different backbones, these plasmids are still useful where high output is necessary and expression in the absence of inducer is less of a concern. Though the LuxR construct was leaky at both timepoints, its uninduced expression was still an order of magnitude lower than uninduced T7 expression on the same pCDF backbone in both rich and minimal media.

Multi-plasmid Strains

We next measured expression in strains that possessed two plasmids to determine whether this increased burden affected expression. The main feature of the Duet vectors is expression of several target genes from multiple plasmids with compatible origins, though because P_T7lac regulates all genes, expression is tunable only through variation in copy number (371). In our plasmid system, target genes are controlled by different inducible systems on compatible plasmids, enabling more fine-tuned and temporal control for expressing different genes or operons. Though plasmid-based expression of multiple genes is employed frequently in metabolic engineering, direct comparisons to the widely used T7 system are lacking (324, 376, 377).

Accordingly, E. coli BL21(DE3) strains were constructed that possessed Duet and toolbox plasmids both as single-plasmid and multi-plasmid systems. Strains are organized into six groups with multi-plasmid strains designated mT7s1, mT7s2, and mS1-4 and single-plasmid strains des [...] listed in Table 3-1. The single-plasmid strains in this experiment were also used in Figures 3-3 and 3-4; however, the single-plasmid strain labels in Table 3-1 are used here for clarity. Across groups, induction profiles from toolbox multi-plasmid strains in which both plasmids were induced simultaneously were compared to the Duet multi-plasmid strains in the presence and absence of IPTG, and within groups, expression from multi-plasmid strains is compared to that from single-plasmid strains (Figure 3-5).

First, we compared the effect of genetic context in the six multi-plasmid strains (Table 3-1). Expression from the T7 promoters in mT7s1 and mT7s2 is consistent with expected behavior based on plasmid copy number. GFP expression from pETDuet-1 in mT7s1 is consistently higher than expression from all other systems in the multi-plasmid Duet strains, and mRFP expression from the low-copy pACYCDuet-1 plasmid in mT7s2 is the lowest. Among the toolbox plasmids, mRFP expression from the CymR^AM-regulated systems in mS1 and mS2 is very similar by the stationary phase timepoint, and GFP expression from pELxG4 in both mS1 and mS2 is consistently higher than GFP expression from pDLxG6 in mS4. These trends remain consistent across both timepoints in both rich and minimal media, supporting the idea that copy number itself


can effectively be a rough mechanism for tuning expression of target genes on different plasmids.

Among the toolbox strains, mS1 and mS2 both possess pELxG4, and mRFP is expressed from CymR^AM-regulated systems on different plasmid backbones (Figure 3-5). Though the pCDFDuet-1 and pCOLADuet-1 backbones have similar copy numbers, mRFP expression from pCCyR2 in mS1 is consistently higher than that from pDCyR1 in mS2, both in the presence of only cumate and in the presence of both cognate inducers (Object 3-1 Supplementary Figure 5, Figure 3-5). Strain mS3 possesses pDLxR6, which is among the highest-expressing toolbox plasmids in single-plasmid strains (Figures 3-1, 3-3, 3-5). Surprisingly, mRFP expression from pDLxR6 in mS3 was the lowest of the four toolbox multi-plasmid strains by stationary phase in minimal media conditions. Independent induction of pDLxR6 in mS3 shows similar results. These induction profiles exemplify how changing plasmid pairings and culturing media influences independent and dual expression in multi-plasmid systems in unexpected ways.

We then compared expression between multi- and single-plasmid strains. Expression levels from multi-plasmid strains are generally expected to be lower than from single-plasmid strains containing the same plasmids individually due to an increased plasmid burden. Expression from the single-plasmid Duet strains was consistently higher and leakier than expression of the same reporter in mT7s1 and mT7s2, except for T7s1g, which had slightly lower induced expression levels than GFP expression from mT7s1 at stationary phase. Expression from T7s2r was markedly higher than mRFP expression from mT7s2, which had the lowest expression level of all Duet multi-plasmid strains. This could be due to the phenomenon of T7 RNAP


sequestration (166, 304), where the amount of T7 RNAP available for transcription is split between each copy of different plasmids, and expression from the lower-copy plasmid, in this case pAT7R1, can be disproportionately reduced by the presence of higher-copy plasmids, here pDT7G6.

Differences between single- and multi-plasmid systems were most extreme for groups S3 and S4 in minimal media. By stationary phase, fold changes were two orders of magnitude higher for both S3r and S3g compared to mRFP and GFP expression from mS3, respectively, and, with the exception of S4r in M9Glu, fold changes from S4r and S4g were an order of magnitude higher than those from mS4. Conversely, mRFP expression from mS1 and mS2 was not notably different from that of the S1r and S2r single-plasmid strains, and fold changes were similar across media types. Interestingly, GFP expression from mS1 was higher than from S1g across all measurements except for M9Glu after overnight growth. By stationary phase, GFP expression from mS2 was higher than that from S2g.

In comparison, strains with two toolbox plasmids had a much higher dynamic range than multi-plasmid Duet strains in most cases, especially in minimal media. With IPTG induction of mT7s1 and mT7s2 in minimal media, neither mRFP nor GFP was induced more than 90-fold at late exponential phase, and the fold change dropped to a maximum of 12 after an overnight induction, mostly due to leaky expression. In contrast, at least one multi-plasmid toolbox strain expressed both GFP and mRFP over 60-fold in LB and over 185-fold in M9Glu and M9Gly by the stationary phase timepoint. Moreover, the overall expression of each reporter in at least one toolbox strain was higher than that of the T7lac-driven reporters after an overnight induction across all media conditions.
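The fold changes discussed above can be computed from paired induced and uninduced fluorescence readings. The following is a minimal Python sketch, assuming that fold change is the simple ratio of induced to uninduced RFU with no background subtraction. The function name and the example readings are hypothetical and are not taken from the data in this work.

```python
def fold_change(induced_rfu: float, uninduced_rfu: float) -> float:
    """Ratio of induced to uninduced fluorescence (assumed definition)."""
    if uninduced_rfu <= 0:
        raise ValueError("uninduced fluorescence must be positive")
    return induced_rfu / uninduced_rfu


# Hypothetical readings, chosen only to illustrate the arithmetic.
tight = fold_change(induced_rfu=1850.0, uninduced_rfu=10.0)   # low leak
leaky = fold_change(induced_rfu=1850.0, uninduced_rfu=185.0)  # high leak
```

In this illustration the induced output is identical in both cases, so the lower fold change of the second case comes entirely from higher uninduced expression. This mirrors the explanation given above for the drop in T7 fold change after overnight induction.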


These data suggest that toolbox inducible expression systems can effectively be utilized in a multi-plasmid system and outperform widely used T7 systems in several key ways. First, toolbox plasmids allow independent expression of target genes with minimal crosstalk, a feature inherently unattainable with multi-plasmid Duet systems. Second, expression in the absence of inducer is similarly low or lower in the toolbox systems as compared to Duet systems. Based on our previous data (Figure 3-4), this was expected and holds true in multi-plasmid systems. Finally, toolbox systems have induced expression levels that are higher than T7 systems in almost every case, even when both inducers are present. Though the toxicity of the bacteriophage expression system is often seen as a necessary evil to obtain very high expression levels, our data suggest that the same expression levels can be achieved in a more controllable manner and without the associated issues.

Lycopene Production Comparison

To demonstrate that our toolbox plasmids can also compete with the T7 promoter in metabolic engineering experiments, we re-cloned a previously described pathway for lycopene production and compared it against the original plasmids (162). In the previous two-plasmid system, the IUP genes ChK, IPK, and idi were expressed from the pro4 constitutive promoter (pSEVA228-pro4IUPi), while the lycopene genes crtI, crtB, ipi, and ggpps were expressed from a low-copy plasmid with the T7 promoter (p5T7-LYCipi-ggpps). To compare our toolbox plasmids, we cloned the IUP and lycopene genes under the control of CymR^AM/P_CymRC and LuxR/P_LuxB, respectively. For consistency, the same markers were used, and the p15A and RK2 origins were chosen, which closely matched the low plasmid copy numbers used in the original work (162, 371, 378, 379).


We first measured lycopene production in BL21(DE3) cells with three different plasmid pathways: (i) the original p5T7-LYCipi-ggpps and pSEVA228-pro4IUPi system, designated LycO; (ii) pALxLyc6 and pSEVA228-pro4IUPi, designated Lyc1; and (iii) pALxLyc6 and pRCyIUP2, designated Lyc2. The results show that lycopene production from Lyc2 is significantly higher than from Lyc1 and LycO in minimal medium (Figure 3-6A). In rich medium, Lyc1 and Lyc2 also produced significantly more lycopene than LycO.

Because our toolbox plasmids are not constrained to BL21(DE3) strains, we also tested the E. coli K-12 strain MG1655. The strains were screened in parallel with the Lyc1 and Lyc2 systems in BL21(DE3) (Figure 3-6B). Surprisingly, we found that the highest lycopene production came from the Lyc2 system in MG1655, with production 13-fold over basal levels compared to 6-fold from the same system in BL21(DE3). Induced lycopene production from Lyc2 in MG1655 was significantly higher than production from Lyc1 in both MG1655 and BL21(DE3) and from Lyc2 in BL21(DE3). Output from the Lyc1 systems in the two E. coli strains was not significantly different.

Overall, lycopene production was notably lower in BL21(DE3) in the first assay (Figure 3-6A) than in the same strain in the assay comparing BL21(DE3) to MG1655 (Figure 3-6B). However, this may be due to a difference in protocols during the subculturing of the strains rather than to physiological differences between the two strains. In the assay across BL21(DE3) strains, the cells were subcultured twice prior to induction, similar to the fluorescence assays (Methods). While this method has been shown to increase output in other studies (51, 68), it had the effect of decreasing lycopene production in this experiment.


Discussion

In this study, we expand the application of our previously described plasmid toolbox to include both large dynamic range and high overexpression in E. coli. Utilizing our combinatorial assembly method (51), we efficiently constructed plasmids with four promoter-regulator pairs, four antibiotic markers, and four enteric bacteria-specific origins of replication, generating a collection of 28 variants. After validating that LuxR/P_LuxB, CymR^AM/P_CymRC, LacI/P_LlacO-1, and VanR^AM/P_VanCC were functional on these vector backbones in E. coli, we assessed the dynamic range of expression compared to T7 systems in the same genetic background and found that our toolbox promoter-regulator pairs outperformed the T7 system in most cases. We show that our toolbox systems are tunable, capable of independent expression in multi-plasmid systems, and produce higher titers of the end product in a re-engineered metabolic pathway. These findings challenge the use of the T7 system as the default when protein overexpression is desired.

Protein overexpression is necessary to generate high titers of value-added end products (162, 278, 279), to confirm recombinant protein function (280, 281), or to generate sufficient levels of a target protein for functional or structural studies (282-284). The T7 system is often chosen due to the high processivity of the RNAP and its specificity in recognizing its promoter (126, 274). The T7 RNAP has been lauded for its high activity based on the premise that more mRNA would result in more protein, but this relationship quickly breaks down when host resources are exhausted, leading to growth inhibition and low protein yields (126, 159, 277, 289, 291). The system exhibits substantial leaky expression, as only small amounts of T7 RNAP can lead to high target gene expression in the absence of an inducer. In fact, omitting the inducer entirely and


treating the T7 RNAP as constitutively expressed has been used effectively to generate membrane and secretory proteins (380). However, high uninduced expression can cause difficulty in obtaining transformants, for example when the metabolic burden for protein production is too high or when the heterologous protein is toxic to the host (270, 287).

In protein overexpression experiments, it is often necessary to manage the level of T7 RNAP activity to balance protein production and host growth (298). Because expression of and from T7 RNAP is not easily tunable by titrating inducer concentration, alternative solutions have been utilized, including the addition of T7 lysozyme to inhibit the activity of T7 RNAP (159, 299, 300, 381) and the construction or identification of mutants that are better equipped to handle the metabolic stress (288, 293, 294, 296, 298). T7 lysozyme is usually included on a separate plasmid, and while it can decrease leaky expression by up to 10-fold, it also decreases protein expression following induction and results in growth inhibition, as the lysozyme, a bifunctional enzyme, can also cut a specific bond in the cell wall of E. coli (286, 299). Moreover, additional plasmids complicate the system, limiting options in multi-plasmid experiments and increasing any necessary fine-tuning for optimal protein production (299, 371).

T7 expression systems are notoriously unstable and frequently mutate to reduce the burden of protein overexpression on the host cell. This instability is often the result of mutation rather than plasmid loss (287), and toxicity escape mutations have been found within the lacUV5 promoter regulating T7 RNAP expression (293), within the regulatory region of the T7 RNAP gene (295), and within LacI (294). These mutations often dampen the production or activity of T7 RNAP or lower the affinity of T7 RNAP to


its promoter (287, 294, 296-298). Mutant hosts have emerged from overproduction experiments, themselves becoming popular protein-producing strains and also helping to inform targeted mutations that offer increased control and reduced toxicity (171, 292, 293). The obvious drawback to these proposed solutions is the necessity for specific strains, i.e., the Walker strains, C44(DE3), and C45(DE3), and in some cases additional inducers and plasmids (171, 288, 293). Additionally, these mutations are unpredictable and often have undesirable effects, including the total loss of target gene expression (297, 382, 383). Other proposed solutions include utilizing different inducible systems, constitutive promoters, or negative feedback loops to control T7 RNAP (287, 301, 304), splitting T7 RNAP to alleviate stress and toxicity (306), or inhibiting the T7 RNAP through other means (163, 305-307, 384). Though effective, splitting T7 RNAP is labor intensive, and reconfiguring the T7 system to change its regulation necessitates re-engineering and may still result in toxicity intrinsic to T7 RNAP (287, 289).

In the work presented here, we demonstrate that our plasmid toolbox systems are capable of high, tightly controlled expression. The combinatorial strategy we have developed for plasmid assembly with choice of origin, marker, inducible system, and target gene is straightforward and efficient (51). Our plasmids offer a dynamic range of expression levels, and most systems are tunable through titrated inducer concentration (Figure 3-2) and stable over several passages (Figure 3-3), all without the need for special mutant strains or auxiliary plasmids.

Our plasmids also provide independent control over different target proteins in coexpression experiments. While the Duet vectors are important additions to the plasmids available for coexpression, their


combined usage has been associated with unexpected problems. When both high- and low-copy Duet vectors are used in a metabolic pathway, T7 RNAP sequestration can occur, in which expression from genes on the lower-copy plasmid is less than expected (166). This was evident in our experiments when comparing induced expression from strain mT7s2 to expression from the pACYCDuet-1 single-plasmid strain T7s2r (Figure 3-5). This differential partitioning phenomenon is inherent to the orthogonality of T7 RNAP to its promoter and can occur when multiple T7 promoters are used in the same system (304). Because the T7lac promoter regulates all target genes in the Duet vectors, independent and tunable expression is not possible. Our toolbox promoter-regulator pairs can effectively replace the T7 system in a synthetic multi-plasmid pathway synthesizing lycopene.

We argue that the modularity and tunability of our plasmid toolbox offer advantages over the systems currently available for protein overproduction in E. coli and have the potential to be expanded to other hosts. Conventional T7 systems, with a chromosomally integrated T7 RNAP and cognate expression plasmid, have been used in non-model hosts but are often inefficient or completely non-functional, likely due to phage polymerase-associated toxicity (385). In developing successful phage-derived expression systems outside of E. coli, researchers have turned to part-mining T7-like expression systems (385-387) or to strategies similar to those discussed above for controlling the activity of T7 RNAP. Notably, the UBER and HITES systems have been used effectively in Gram-negative and Gram-positive hosts. Still, they require that T7 RNAP basal expression be repressed through elaborate feedback loops or antisense RNA (304, 372). Even so, toxicity remained a concern and continues to restrict use of the T7 expression system in


non-model bacteria (309). In a previous study, we constructed and tested broad-host-range vectors containing 12 inducible systems, including the four utilized here, across nine members of the Proteobacteria. We demonstrated that LuxR/P_LuxB, CymR^AM/P_CymRC, LacI/P_LlacO-1, and VanR^AM/P_VanCC have a range of induction levels in these species and can be used successfully in two-plasmid systems (51). Where E. coli BL21(DE3) is not an ideal host for protein overproduction or structure/function studies, our expression systems can be moved to and utilized in other bacteria quickly and effectively.

Overall, we achieved high expression from our toolbox plasmids in both E. coli BL21(DE3) and MG1655 in rich and minimal media conditions, demonstrating that our inducible systems can accommodate versatile protein production strategies. We also demonstrated the evolutionary stability of our expression systems as compared to T7 systems through an extended passage experiment under the pressure of constant induction. Finally, we re-engineered a previously constructed metabolic pathway that utilized the T7 system to incorporate our toolbox promoter-regulator pairs and showed that the re-engineered pathway produced a higher titer of the end product, lycopene. We argue that the benefits of orthogonality in the T7 systems are outweighed by their toxicity and lack of control over expression and that our toolbox plasmids offer an alternative with distinct advantages. These expression plasmids further expand the broad-host-range toolbox for investigating, coordinating, and optimizing gene expression in E. coli.

Object 3-1. Supplementary information for Plasmids for controlled and tunable high level expression in E. coli, PDF 1.7 MB


Figure 3-1. Expression range of 28 plasmids in E. coli. Induction data is shown for all plasmid variants included in this study, grouped to best show the effect of a single genetic part on overall expression. A) Plasmids are grouped with only the origin part changed, B) plasmids are grouped with only the marker changed, and C) plasmids are grouped with only the promoter-regulator pair changed. For each plasmid, expression data is shown from both the late exponential (open bar) and stationary phase (filled bar) timepoints. Bars represent the induction range of mRFP for each plasmid, with fluorescence in the absence of inducer plotted at the left end of each bar and induced expression plotted at the right end. Inducer concentrations are listed in Object 3-1 Supplementary Table 3. All measurements are in E. coli MG1655 grown in rich media and are the average of three technical replicates.


Figure 3-2. Expression across titrated inducer concentrations. Expression of mRFP in E. coli MG1655 strains containing plasmids pACyR1, pALxR1, pALlR1, and pAVR1 was induced with titrated inducer and measured at exponential and stationary phase of growth. Graphs show average RFU induction data (horizontal bars) and calculated fold change (colored notches) within each data cluster. Vertical lines at each cluster represent the average fluorescence of uninduced samples. Inducer was serially diluted 5-fold from 10 mM (cym, IPTG, van) and 1 mM (OC6) across seven concentrations for each plasmid. All measurements were taken in rich media and are the average of three technical replicates.
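The titration series described in the caption above can be written out explicitly. The sketch below is illustrative only and assumes that the starting concentration (10 mM or 1 mM) is itself the first of the seven concentrations; the function and variable names are hypothetical.

```python
def dilution_series(start_mM: float, fold: float = 5.0, n: int = 7) -> list[float]:
    """Concentrations (mM) of an n-point serial dilution, highest first.

    Assumes the starting concentration counts as the first point.
    """
    return [start_mM / fold**i for i in range(n)]


# Cumate, IPTG, and vanillate start at 10 mM; OC6 starts at 1 mM.
high_start_mM = dilution_series(10.0)
oc6_mM = dilution_series(1.0)
```

Under this assumption, each series spans six 5-fold steps, or a little over four orders of magnitude between the highest and lowest inducer concentrations.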


Figure 3-3. Stability over extended passages. Sparkline plots of percent change in fluorescence compared to baseline measurements for all replicates across four Duet vectors and eight toolbox plasmids. Plots are bound on the y axis by 100% and 20%. Data was excluded from graphing when growth of a replicate was below a threshold of OD = 0.2. For replicates that grew over the threshold after the first day of measurements, the first fluorescence measurement taken when cultures were above OD = 0.2 is used as the baseline measurement. All plotted fluorescence is normalized to growth, and inducer concentrations are listed in Object 3-1 Supplementary Table 3.
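The baseline and exclusion rules in the caption above can be sketched in a few lines of Python. This is a minimal illustration, not the analysis used to produce the figure. It assumes that "normalized to growth" means fluorescence divided by OD, and the function name and example readings are hypothetical.

```python
OD_THRESHOLD = 0.2  # growth cutoff stated in the caption


def percent_change_series(rfu, od, threshold=OD_THRESHOLD):
    """Percent change in growth-normalized fluorescence versus baseline.

    Measurements with OD below the threshold are excluded (None).
    The baseline is the first measurement at or above the threshold.
    Normalization as RFU / OD is an assumption of this sketch.
    """
    baseline = None
    changes = []
    for f, d in zip(rfu, od):
        if d < threshold:
            changes.append(None)  # excluded from graphing
            continue
        normalized = f / d
        if baseline is None:
            baseline = normalized  # first usable measurement
        changes.append(100.0 * (normalized - baseline) / baseline)
    return changes


# Hypothetical replicate that only grows past the threshold on day 2.
example = percent_change_series(
    rfu=[50.0, 1000.0, 500.0, 1000.0],
    od=[0.1, 1.0, 1.0, 0.5],
)
```

For the hypothetical replicate, the first day is excluded, the second day becomes the baseline, and later days are reported relative to it.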


Figure 3-4. Comparison between T7lac and toolbox-regulated promoters in E. coli BL21(DE3). Expression of mRFP regulated by the T7 system and by toolbox systems regulated by LuxR, CymR^AM, VanR^AM, and LacI was measured from a set of eight pCDF plasmids in LB and in minimal media with two different carbon sources. Measurements were taken at late exponential (open bar) and stationary (filled bar) phase of growth. Horizontal bars represent the induction range of mRFP for each toolbox plasmid, with fluorescence in the absence of inducer plotted at the left end of each bar and induced expression plotted at the right end. The induction range from pCDFDuet-1 is extended through the charts with dotted lines. All measurements are the average of three technical replicates.


Table 3-1. List of E. coli BL21(DE3) strains tested in Figure 3-5. Each strain group includes a multi-plasmid strain and two single-plasmid strains, each possessing one of the two plasmids included in the multi-plasmid strain.

Group   Strain   Plasmid(s)
T7s1    mT7s1    pDT7R6 + pET7G4
        T7s1r    pDT7R6
        T7s1g    pET7G4
T7s2    mT7s2    pAT7R1 + pDT7G6
        T7s2r    pAT7R1
        T7s2g    pDT7G6
S1      mS1      pCCyR2 + pELxG4
        S1r      pCCyR2
        S1g      pELxG4
S2      mS2      pDCyR1 + pELxG4
        S2r      pDCyR1
        S2g      pELxG4
S3      mS3      pDLxR6 + pACyG1
        S3r      pDLxR6
        S3g      pACyG1
S4      mS4      pEVR4 + pDLxG6
        S4r      pEVR4
        S4g      pDLxG6


Figure 3-5. Expression data from induction experiments of single- and multi-plasmid strains of E. coli BL21(DE3). Each strain group on the x axis includes induction data from the multi-plasmid strain (solid bars) and single-plasmid strains (patterned bars), with mRFP readings at the top and GFP readings at the bottom of each timepoint pairing. Expression of GFP and mRFP from multi-plasmid strains was induced simultaneously by both cognate inducers, and expression from single-plasmid strains was induced by the single cognate inducer. Vertical bars represent the induction range of mRFP or GFP for each strain, with fluorescence in the absence of inducer plotted at the bottom of each bar and induced expression plotted at the top. Fold change for each strain is represented by diamonds. Strains were tested in three media types (LB, M9Glu, and M9Gly), and data was recorded at two timepoints. Inducer concentrations are listed in Object 3-1 Supplementary Table 3. All data is the average of triplicates.


Figure 3-6. Lycopene production in two strains of E. coli. E. coli MG1655 and BL21(DE3) were tested for lycopene production via a two-plasmid system incorporating a pathway developed by Stephanopoulos et al. A) Lycopene pathways were screened in BL21(DE3) on three different plasmid backbones, including p5T7-LYCipi-ggpps and pSEVA228-pro4IUPi (LycO), pALxLyc6 and pSEVA228-pro4IUPi (Lyc1), and pALxLyc6 and pRCyIUP2 (Lyc2). Induced production of lycopene is shown in two media conditions: minimal (blue bars) and rich (yellow bars). B) Lyc1 and Lyc2 were compared in E. coli MG1655 and BL21(DE3) in minimal media. For each pair of bars, the first bar represents uninduced expression and the second, induced expression. Lycopene is quantified through absorbance readings taken at 475 nm. Data shown is the average of three replicates, with standard deviation displayed on the charts. Statistical significance was determined with two-way ANOVA (** p < 0.01, *** p < 0.001).




CHAPTER 4
CHARACTERIZING CONSTITUTIVE PROMOTERS ACROSS THE PROTEOBACTERIA

Although research on promoters has spanned decades, the precise prediction of promoter activity from DNA sequence remains a challenge even in model organisms (65, 388). Recent literature has identified important differences in the core sequence of σ70 promoters across classes of Proteobacteria as well as a lack of transferability when promoters are moved from host to host (122, 389, 390). Currently, there is a need for synthetic constitutive promoters spanning a range of expression levels in species outside of Escherichia coli. Additionally, characterization data defining the behavior of the same promoter across multiple species would be extremely valuable to the field. Here, we analyzed promoter activity in three classes of Proteobacteria, which enabled us to better understand the sequence elements correlated with a strong promoter in different hosts. In doing so, we identified and characterized constitutive promoters spanning a range of expression in these species for community use and described the portability of a subset of these promoters as they were moved between hosts. These promoter libraries have broad applications as predictable genetic tools to control gene expression in diverse species (3, 51, 330). This work adds to the toolkit for gene expression in non-model bacteria and is a step towards the larger goal of accurate promoter prediction in a given host from a de novo sequence.

Introduction

Promoters are essential regulators of gene expression. They are largely responsible for the modulation of cellular responses to various stimuli and affect organism behavior by mediating rates of transcriptional initiation (57, 65, 391). As such, they have immense utility as genetic tools to control gene expression (151, 392, 393).


[...] specificity factor. In Gram-negative bacteria, σ70 is responsible for the transcription of most active promoters during log-phase growth (75, 76, 80, 91, 394). The σ70 proteins and an amino [...] (81, 395). Promoters [...] σ70 are generally divided into discrete sequence elements including the -10 and -35 hexamers, comprising the core promoter, and the UP element, spacer region, and extended -10 element, if present (81-83, 396, 397). These elements [...] σ70 during promoter binding (87, 395). Region 2.4 of [...] -10 element and has nearly 100% identity in [...] -35 element and is less but still very well conserved (83, 87, 93-95, 389, 398).

Despite decades of research on promoters and transcriptional initiation (91, 394), we are still unable to predict the activity of a promoter from its sequence or how that activity will vary when the promoter is moved between hosts (65). More urgently, constitutive promoters that are characterized outside of E. coli and validated in different species are severely lacking in the field.

The closer core promoter sequences are to the consensus, the stronger the promoter (399). In E. coli, a σ70-dependent consensus promoter sequence includes a TTGACA -35 hexamer and a TATAAT -10 hexamer separated by a 17 bp spacer (96). For non-model species in some classes of Proteobacteria, the consensus promoter sequence has been predicted to be very similar to that of E. coli, but studies have conflicting results (390, 400). Accordingly, studies have shown inconsistencies in


expression levels when the same promoter is moved between Proteobacteria. For example, E. coli promoters retain similar relative activity levels in some Pseudomonas and Alphaproteobacteria species, but the reverse is not true in either case (400-405). This is known as the transcriptional laxity phenomenon (122), with some species able to recognize promoters with more laxity than others. Without characterized promoters in these Proteobacteria, researchers are constrained to genetic tools optimized for E. coli or to moving systems into E. coli.

In the work described here, we reutilize a previously described E. coli constitutive promoter toolbox to test promoters in 15 species across the Alpha-, Beta-, and Gammaproteobacteria. We present characterized libraries of 15 to 43 promoter sequences for each species, with expression spanning 3-5 orders of magnitude within each library. We then surveyed the promoter libraries within and across species to identify conserved elements and other characteristics of high-expressing promoters. Specifically, we compared sequences of the core hexamers, the presence or absence of an extended -10 element, and the GC content of the spacer region. Finally, we tested the transferability of a subset of promoters from the libraries by rescreening them in three species, one from each Proteobacterial class.

Materials and Methods

Library Construction and Transformations

This work utilized a previously described synthetic constitutive promoter library with 4350 unique promoter variants (71). To move the promoter variants into our previously developed broad-host-range vectors, we followed our protocol for new part addition in the combinatorial assembly workflow (51). Namely, the promoter variant and mRFP reporter gene were amplified as a single piece from the promoter library via


PCR, treated with DpnI restriction enzyme, run on a gel for size verification, and gel purified. The same protocol was followed for amplification of the plasmid backbone, and amplified parts were assembled using NEB HiFi Assembly. Here, the backbone contained a pBBR origin and gentamicin marker for all species except for A. baylyi, where the RK2 origin was used. Recipient strains were transformed via electroporation, conjugation, or natural transformation, as specified in Table 4-3.

Electroporation: Cells to be made electrocompetent were taken either from overnight cultures or subcultured from an overnight growth and made electrocompetent when in mid-log phase. The protocol for the preparation of electrocompetent cells was as follows: 6 mL of each wild type strain was incubated with shaking in fresh media with the specific culturing conditions listed in Table 4-2. The total culture was then spun at 5000 rpm for 2 min to pellet the cells. Culture supernatants were aspirated, and cell pellets were resuspended in 1 mL of 300 mM sucrose at room temperature and then centrifuged again for 2 min at 5000 rpm. This process was repeated to wash the cell pellet with sucrose twice, and then the pellet was resuspended in a final volume of 1:10 of the initial culture volume. 50 µL of each sucrose cell suspension was then transferred to a 1 mm gap width electroporation cuvette, and cells were electroporated at the specified voltage for each strain (Table 4-3). Cells were recovered in 1 mL of their respective recovery media and incubated for 2 h in a deep-well plate with the specified conditions (Table 4-2) before plating onto selection plates.

Natural transformation: A natural transformation protocol adapted from a previous protocol (259) was followed for A. baylyi. Here, 5 mL of fresh LB was inoculated with wild type ADP1 from a glycerol stock and grown overnight at 30°C. The


next day, 1 mL of fresh LB was inoculated with 70 µL of this culture and approximately 100 ng of the plasmid and incubated for 3 h before plating onto selection plates.

Conjugation: Conjugation was performed using the RP4 system in the following steps. On the day prior to conjugation, 5 mL cultures of the wild type recipient strains, the required E. coli donor strains, and an E. coli helper strain containing pEVS104 were inoculated from glycerol stocks and grown overnight. The following day, donor and helper cultures were spun down separately at 10,000 rpm for 1 min and resuspended in fresh media to remove residual antibiotic. A sufficient volume of donor and helper cultures was pelleted and resuspended such that each conjugation used 500 µL of both donor and helper strains in addition to 500 µL of the recipient strain to be transformed. Each mixture of donor, helper, and recipient was centrifuged at 10,000 rpm for 1 min and the supernatant decanted, leaving approximately 100 µL of media to resuspend the pellet. The resuspensions were spotted on agar plates and incubated on the benchtop overnight. Each spot was streaked onto a selection plate the following day.

Library Picking

For the promoter library in each strain, colonies were picked to inoculate 92 wells of a 96-well plate, with the remaining wells inoculated with the wild type strain to be used as background fluorescence controls. The transformants were picked by hand, and a mix of red, pink, and white colonies was chosen to provide a range of promoter expression levels in the relatively small sampling. The picked plate was incubated overnight with shaking and, the following day, used to inoculate four 96-well plates, including screening plates in triplicate and a stock plate. Stock and screening plates were inoculated at 1:200 for fast-growing strains or 1:50 for slow-growing strains. The stock plate was incubated until cultures were in stationary phase,


glycerol was added to a final concentration of 25%, and then the plate was frozen at -80°C to be used for library strain retrieval. The screening plates (Costar, black, clear bottom) were used to measure mRFP expression. After inoculation of the stock and screening plates, the picked plate was frozen to be used for downstream PCR amplification.

Library Screening

Fluorescence readings of mRFP were taken to measure promoter variant activity. Inoculation of the screening plates from the picked plate marks the start of the screen. OD660 and fluorescence readings were taken in a plate reader (Molecular Devices SpectraMax M3) every hour from 0-5 hours for faster-growing strains, or from 4-8 hours for slower-growing strains, and a final timepoint was taken at 24 hours. Absorbance at 660 nm is used to measure growth, as 600 nm is significantly absorbed by mRFP (348). Libraries were screened in triplicate.

Promoter Mapping

To determine the promoter sequence associated with the fluorescence measurement of each well of the screening plates, a hierarchical barcoding scheme was employed. Here, unique pairings of barcoded primers were assigned to each well of the screening plate to amplify the promoter region, and in a second PCR reaction, unique indexing primers were used to differentiate screening plates, including replicate plates of the same species. Specifically, primers were aliquoted into wells of 96-well PCR plates according to predetermined maps such that each well had a unique combination of forward and reverse primers. These aliquoted primer plates were prepared in bulk and frozen ahead of experiments for increased efficiency. To prepare the PCR reaction, OneTaq master mix was added to the primer-aliquoted PCR plates

and cell material was added to the wells from the thawed picked plate via plate stamper. The promoter variant part was amplified with 30 cycles, the reaction products from each plate were pooled into a single tube, and a small volume was run on a gel for verification. 200 µL was used in the second PCR, where indexing primers and adapters were added for Illumina sequencing. Here, amplification was limited to 7-10 cycles. Pooled plate amplification products were gel purified, measured with Qubit, and sequenced using 2 x 150 reads on a shared NovaSeq 6000 instrument. In this way, each promoter variant in every test species could be identified and mapped to a fluorescence measurement characterizing its behavior. Plates were sequenced in technical duplicates except for A. fabrum, E. coli, P. putida, and R. sp. TM1040, which did not have sequencing replicates. These data were processed in a slightly modified workflow as discussed below.

Data Analysis

All calculations and data analysis were performed using Microsoft Excel and a custom R script to parse promoter sequences into core promoter, spacer, and extended elements. For library screens in each bacterial strain, absorbance and fluorescence data were organized by timepoint. Optical density was adjusted to a 1 cm pathlength by dividing by a factor of 0.56 for a culture volume of 200 µL in the wells of the 96-well plate, as done in a previous study (51). The average raw fluorescence of the replicates was used directly in the graphs and analysis included herein unless specified otherwise. Sequencing data for each species were analyzed through a pipeline that included joining the reads with FLASH, trimming the reads with Cutadapt, and performing quality checks with FastQC. Expression from promoters of the entire screened library in each
species is represented in Figure 4-1. A subset of the entire promoter library passed quality checking and those sequences are analyzed in depth in this work. Only reads that were consistent across the two replicate sequencing runs were included in the final promoter set. For the four species without replicate sequencing data (E. coli, A. fabrum, P. putida, and R. sp. TM1040), only wells with >30% of the total sequences having a single defined promoter sequence were included in the study. The original in silico design of the nonrepetitive promoters specifies a length of 78 bp (71). As such, promoters were separated into those with a length of 78 bp and those that deviated from that length. Promoters with a length of 78 bp are listed in Object 4-1 and were used to generate Weblogos and KpLogos as discussed below. Promoters with this length make up the vast majority of the library within each species. Promoters deviating from 78 bp are discussed separately, as their core promoter elements are less positionally defined but may nonetheless be highly active in some species.

Sequence logos were built as Weblogos (406) and KpLogo probability logos (407). For all species except R. sp. TM1040, promoters with a length of 78 bp were grouped into high, medium, and low expressing promoters and logos were built for each group. In R. sp. TM1040, promoters were split into only high and low expressing groups, as only a single promoter expressed at a medium level. The majority of promoter groupings differ from each other by an order of magnitude of RFU expression. The full promoter sequences are included in a larger table in Object 4-1. Here, promoters are additionally separated into their discrete regions, including the spacer and extended elements.
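
The processing rules described above (pathlength correction of optical density, the >30% single-sequence threshold for wells without replicate sequencing, and the split at the 78 bp design length) can be illustrated with a short sketch. This is a minimal Python illustration under stated assumptions, not the custom R script used in this work: the function names and the input format (a list of promoter read sequences per well) are hypothetical.

```python
from collections import Counter

# Values taken from the text; names are hypothetical.
PATHLENGTH_FACTOR = 0.56  # 200 µL culture in a 96-well plate
DESIGN_LENGTH = 78        # designed promoter length in bp
MIN_FRACTION = 0.30       # threshold for species without replicate sequencing


def od_to_1cm(od_raw, factor=PATHLENGTH_FACTOR):
    """Scale a plate-reader OD reading to a 1 cm pathlength."""
    return od_raw / factor


def call_well_promoter(reads, min_fraction=MIN_FRACTION):
    """Return the dominant promoter sequence in a well, or None.

    Assumption: `reads` is a list of promoter sequences recovered
    from one well. The well is kept only if its most common sequence
    accounts for more than `min_fraction` of all reads.
    """
    if not reads:
        return None
    sequence, count = Counter(reads).most_common(1)[0]
    if count / len(reads) > min_fraction:
        return sequence
    return None


def split_by_length(promoters, design_length=DESIGN_LENGTH):
    """Separate promoters into design-length and off-length lists."""
    standard = [p for p in promoters if len(p) == design_length]
    variable = [p for p in promoters if len(p) != design_length]
    return standard, variable
```

A well whose reads are spread evenly over four different sequences (25% each) would be discarded by `call_well_promoter`, while a well in which one sequence makes up 60% of reads would be assigned that sequence.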

Promoter Transferability Experiments

Promoter swapping experiments were done by moving promoter variants from the source species, where library expression data were gathered, into a target species, where promoter variants were measured again in a new host. Based on the mRFP expression data, promoter variants across a range of expression levels were chosen from a subset of the 15 species for transferability experiments. The promoter variants were streaked from their associated wells in the stock plate and isolated colonies were grown in 5 mL cultures for subsequent plasmid extraction. The plasmid extracts were then transformed into E. coli, the transformants were grown, plasmids were extracted, and the promoter region was sequenced via Sanger sequencing to verify that the sequence matched the Illumina results. Once verified, a collection of plasmids from the chosen source species was transformed into a single target strain. The transformants were used to start 1 mL cultures in a deep-well plate, incubated overnight, and subcultured 1:200 or 1:50 into screening plates for mRFP readings. Promoter transferability fluorescence measurements followed the same steps as library fluorescence measurements, with readings taken in exponential and late stationary phase.

Results

Constitutive Promoter Library Screens

A previously developed library of 4,350 synthetic promoters was utilized in this study (71). The library is a product of the Nonrepetitive Parts Calculator, which generates nonrepetitive DNA sequences with the goal of increasing stability in constructed genetic systems. The library used here is a combination of three toolboxes: the first consisting of 800 promoters with consensus hexamers, the second with 3,500
promoters with 500 each of 0-6 mismatches in the hexamers, and the third with 50 promoters where all 12 positions of the hexamers are non-consensus. The hexamers are modeled on the E. coli consensus -10 and -35 elements and the spacer region is 17 bp long for all promoters. Outside of the core hexamers, the surrounding sequence is degenerate, though still within the constraints of a nonrepetitive sequence. To test this promoter library across different Proteobacteria, we moved the promoter and mRFP reporter into our previously described broad-host-range vector backbones using the combinatorial plasmid assembly strategy (51) (Object 2-1). Based on our previous transformation efficiency data (51), we constructed two plasmid libraries with differing origins of replication to obtain sufficient transformants in all our chosen species for the library screens. Except for Acinetobacter baylyi, all species were efficiently transformed with plasmids containing the pBBR origin; RK2 was used for A. baylyi, and all plasmids possessed a gentamicin resistance marker.

To characterize promoter activity in the Proteobacteria, we tested the libraries in each of the 15 species listed in Table 4-1, which are spread across the Alpha-, Beta-, and Gammaproteobacteria (Figure 4-1). We measured relative fluorescence units (RFU) of mRFP expression as a proxy for promoter activity, and the promoter sequences were obtained from high-throughput sequencing. Approximately 100 library transformants were picked in each species, and fluorescence and optical density readings were taken through exponential and late stationary phase of growth to measure promoter activity. The RFU output of all the promoters screened in each species is represented in Figure 4-2. The promoters were then amplified from each library transformant in a high-throughput workflow and sent for deep sequencing. To
match promoter sequence to RFU output, the promoters were barcoded during PCR amplification such that each barcode corresponded to a well in the 96-well screening plate. In this way, we established sets of promoters with expression spanning at least 3 orders of magnitude in species across the Alpha-, Beta-, and Gammaproteobacteria. After data processing and quality checks, the number of unique promoters included in each library ranged from 15 to 43 (Table 4-4). A comprehensive table of promoter sequences and outputs for all species included in this study is in Object 4-1.

Object 4-1. Table of promoter sequences for all species included in this work. PDF, 662 KB

Conservation of Core Promoter Sequences

To survey trends in the promoter elements correlated with high activity in our dataset, we began by examining the core promoter elements. Sequences of the -10 and -35 elements are among the most important determinants of promoter function (65) and we were interested in the relationship between conserved motifs and promoter output in each phylogenetic class. We first categorized the promoters from each species into those with high, medium, and low outputs (promoter counts in each category are listed in Table 4-4). Ruegeria sp. TM1040 is an exception, as its library included mostly promoters with high or low expression and only one sequence expressed within a middle range (Object 4-1). This is apparent in the expression data from the total library screen in R. sp. TM1040 (Figure 4-2). We then used the core promoter elements from the promoters within each category to build sequence logos (Weblogos for all species are available in Object 4-2, KpLogo probability logos for all species are available in Object 4-3).

Object 4-2. Weblogos for all species included in this work. PDF, 728 KB
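
As an illustration of what a sequence logo summarizes, the per-position information content of a set of aligned hexamers can be computed directly. The following is a minimal Python sketch, not the Weblogo or KpLogo implementation: it omits the small-sample correction that logo tools typically apply, and the function names and example hexamers are hypothetical.

```python
import math
from collections import Counter

BASES = "ACGT"


def position_frequencies(sequences):
    """Per-position base frequencies for equal-length sequences."""
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("all sequences must have the same length")
    freqs = []
    for i in range(length):
        counts = Counter(s[i] for s in sequences)
        freqs.append({b: counts.get(b, 0) / len(sequences) for b in BASES})
    return freqs


def information_content(freqs):
    """Bits per position: 2 minus the Shannon entropy of the column."""
    bits = []
    for column in freqs:
        entropy = -sum(p * math.log2(p) for p in column.values() if p > 0)
        bits.append(2.0 - entropy)
    return bits


# Hypothetical -10 hexamers, for illustration only (not data from this work).
example = ["TATAAT", "TATAAT", "TAGAAT", "TACAAT"]
bits = information_content(position_frequencies(example))
```

In this toy set, the fully conserved first position carries 2 bits, while the third position (T in half the sequences, G and C in a quarter each) carries 0.5 bits.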

Object 4-3. KpLogos for all species included in this work. PDF, 850 KB

Because the housekeeping σ factors in Proteobacteria are highly conserved (81, 395), we expected the core hexamers of high activity promoters to also be conserved at most positions. Unsurprisingly, the E. coli consensus motifs on which the library is based, TATAAT and TTGACA in the -10 and -35 regions, respectively, were prominent in the logos of the high expression promoter groups (Figure 4-3). Within these motifs, certain positions are known to be more highly conserved than others due to specific interactions with σ70 during transcriptional initiation. The adenine at position -11 and the thymine at position -7 in the -10 element are thought to be highly conserved due to their stabilizing effect during DNA strand separation (88, 389). In our data, the thymine at position -7 was completely conserved in the high expression promoters across 14 species. The notable exception was Xanthomonas campestris, where the second highest expressing promoter contained a non-consensus cytosine in that position. The adenine at -11 is well conserved across the Beta- and Gammaproteobacteria but less so in the Alphaproteobacteria, where promoters screened in both A. fabrum and R. sp. TM1040 possess a guanine in that position. Indeed, an AGTAAT -10 element paired with a consensus -35 and extended -10 element was the highest expressing promoter in A. fabrum (Object 4-1). The -12 position is important in fork junction recognition by σ70 (408, 409). The Alphaproteobacteria were also more likely to possess a non-consensus nucleotide at position -12, particularly R. sp. TM1040, where three of the eight high expression promoters did not contain a thymine in this position. Sequence logos of the non-consensus -10 elements of the high expressing promoters in four Alphaproteobacteria reveal little conservation beyond the conserved -7 thymine (Figure 4-4). R. palustris is
excluded from these logos as only one high expressing promoter lacks a consensus -10 element.

The TTG motif in the -35 element is also known to be highly conserved across species (389). As expected, our data show that TTG is more conserved than the ACA motif at the end of the hexamer. Some Alpha- and Gammaproteobacteria did not have perfect consensus of this motif in the high expressing promoters (Figure 4-3). In E. coli, the fifth highest expressing promoter did not contain the TTG motif and, interestingly, the highest expressing promoter in P. syringae possessed an ACG motif in place of TTG, though both a consensus -10 and extended -10 element were present (Object 4-1). While these results were not statistically significant, possibly due to small datasets, it is nonetheless interesting to identify promoters lacking conserved nucleotides in positions that are thought to be highly conserved across species or necessary for transcriptional initiation. However, we did not map transcriptional start sites experimentally, so it is possible that alternate promoters are responsible for expression.

The presence of consensus hexamers is not necessarily indicative of a highly active promoter. Even in our small datasets, two thirds of the species tested contained a promoter sequence with both consensus hexamers in the lowest expressing promoter category, including E. coli, R. palustris, M. niabensis, B. thailandensis, E. meliloti, S. enterica, P. syringae, P. aeruginosa, S. sp. EE-36, and E. cloacae. Over a quarter of the promoters in the low activity groups in B. thailandensis, P. syringae, and R. palustris contained -10 and -35 consensus sequences but were inactive regardless. This may be a result of RNAP holoenzyme binding these elements so tightly that it cannot
effectively clear the promoter (78, 91, 102). Among promoters in the lowest expressing category that possessed one core consensus element, the -10 element was absent more frequently than the -35 element in most species. In our dataset, a consensus -10 element was absent in all inactive promoters in A. baylyi. Though X. campestris contained the smallest promoter library, our data show that at least one consensus core element was required for moderate to high activity. The lack of a definitive association between the presence or absence of two consensus core elements and high or low activity, respectively, is consistent with previous studies (54, 65). This emphasizes the need for larger, more comprehensive datasets that can thoroughly explore interactions between core hexamers and other sequence elements that influence overall promoter output.

The Extended -10 Element

The extended -10 element is a TGn motif immediately upstream of the -10 element, and its presence has been shown to rescue activity from promoters with core hexamer sequences that deviate from consensus in E. coli and other Proteobacteria (102, 410). To investigate the effect of the extended -10 element in our promoter libraries, we analyzed sequences with and without the TGn motif in relation to the identities of the core promoter hexamers. Of the 15 species under study, A. baylyi and E. meliloti did not contain promoters with extended -10 elements in their characterized libraries. However, the small sample size of screened promoters prevents conclusions about the significance of this. Across the remaining 13 species, 43 of the 416 promoters screened contained an extended -10 element. The motif was more often found in promoters with at least one non-consensus core promoter element than in sequences with the consensus. Twenty of the 43 promoters were in the high expression categories
across species. Of these, three promoters contained a consensus -35/extended -10 element pairing and two contained a consensus -10/extended -10 element pairing. In A. fabrum and P. syringae, these were the highest expressing promoters in their respective libraries, supporting the theory that an extended -10 element can compensate for weak core promoter elements (102). The remaining 23 promoters with an extended -10 element were paired with two consensus or two non-consensus core promoter elements.

While the extended -10 element is often discussed as a compensatory binding site for σ70 in the absence of two core promoter elements, this was not the case for many of the promoters in our dataset. Eleven of the 20 highly expressing promoters with a TGn motif also contained consensus sequences for both core elements, and in the remaining four promoters, neither core element was at consensus. For example, the eighth highest expressing promoter in S. enterica contained an extended -10 element and non-consensus TACAAT and TTGCGA motifs in the -10 and -35 core elements, respectively. Here, expression reached the same order of magnitude as the highest expressing promoter of that library. This high expression could be due to cryptic activator sites that make the promoter less dependent on the sequence of its core elements (411) or to more complex interactions among sequence elements. Indeed, in the ranking of highest to lowest activity promoters in S. enterica, the third promoter contains neither core consensus hexamers nor an extended -10 element (Object 4-1). Conversely, three highly active promoters containing an extended -10 element in B. thailandensis also possessed consensus core promoter hexamers. Though the -10 and -35 elements are at consensus, the distance between the elements may not be ideal for
some species and, as such, these promoters may benefit from additional binding sites like the extended -10 element (122, 400).

The presence of an extended -10 element did not consistently yield a more highly active promoter. In fact, in M. niabensis and R. sp. TM1040, the TGn motif was only found in promoters with little to no detectable expression, even when the sequence contained one or both consensus hexamers (Object 4-1). A recent study investigated the effect of multiple TG motifs centered at position -16. They found that while a single TG motif was present in the majority of strong promoters as the extended -10 element, tandem TG motifs dampened promoter activity (97). To investigate the effect of tandem TG motifs in our dataset, we took a subset of the spacer region from positions -19 to -14 and tallied the occurrence of TGTG and TGTGTG, and we found one promoter that met these criteria. One of the lowest expressing promoters in A. fabrum contained TGTGTG at positions -19 to -14. Because this promoter also contained non-consensus core hexamers that are a unique pairing in the A. fabrum promoter library, the contribution of the tandem TG motif to low expression is unclear and would be an interesting avenue for further study.

The Spacer Region

While the spacer region between the two core promoter elements does not have a defined consensus sequence, its length and GC content are known to affect expression (128). Further, a recent study found a correlation between a specific region of the spacer element between positions -20 and -13 and expression in a screen of synthetic promoters in E. coli (65). Because virtually all our promoters have a set spacer length of 17 bp, we were interested in examining the GC content of the entire spacer as well as the -20 to -13 region to see how each affected expression and if the -20 to -13
region was a better indicator of promoter activity.

We compared the GC content of the full spacer region from the high expression group to that from the low expression group for each species and visualized the data as violin plots (Figure 4-5). GC content in the high expressing promoter group was not notably lower than that of the low expressing promoters, even in E. coli. Other studies have found a correlation between lower GC content and increased promoter activity, and so this trend is somewhat surprising (65, 97). This could be due to the size of the dataset and the well-studied phenomenon of complex non-linear interactions among promoter elements in determining overall promoter function (65, 71, 102). The genome GC content of each species is another confounding factor (412). Though no obvious trends were apparent across the phylogenetic classes, some insight can be gained by examining more closely related species. In the Gammaproteobacteria, both P. putida and P. syringae have a cluster of low expressing promoters with a GC content of 47%, while most of the promoters in the high expression group have a GC content over 50%. In the Alphaproteobacteria, more than half of the low expression promoters had a GC content under 50%, except in R. sp. TM1040. The only two species with a majority of high expressing promoters with spacer regions of less than 50% GC content were A. baylyi and B. thailandensis. While A. baylyi has the lowest genome GC content of the 15 species at 40%, B. thailandensis is among those with the highest, and so this difference is the most striking.

Next, we compared the GC content of the -20 to -13 spacer region across species. The -20 to -13 region in the spacer is proximal to the -10 element, which is where DNA strands are first separated during transcriptional initiation (87), and GC content at these positions may be more directly related to promoter activity. The -20 to
-13 region in the spacer will be referred to as the proximal spacer for clarity. We analyzed these trends by generating another set of violin plots (Figure 4-6). For most promoter libraries across species, the rank order of promoters by GC content of the full spacer was similar to the rank order by GC content of the proximal spacer. While the lowest GC content of the full spacer region in a promoter across species was 18%, the proximal spacer GC percentage was as low as 12.5% for promoters in A. baylyi, M. niabensis, S. enterica, and S. sp. EE-36. For S. enterica and A. baylyi, at least one of these promoters was in the high expression category. Indeed, our results show that across most Gammaproteobacteria, the median GC content of the proximal spacer was lower than the median GC content of the full spacer in the high expressing promoters. However, this was not true for E. coli or P. aeruginosa; in fact, E. coli had the highest median proximal spacer GC content of high activity promoters of all species screened, at 75%. In comparison, the median GC content of the full spacer in this promoter group in E. coli was 59%, and it was 54% for all promoters screened in E. coli. High spacer region GC content was not tied to the presence of one or both consensus core hexamers and the genome GC content of E. coli is 50%, so biological causes remain unclear.

In some species, higher GC content in the full spacer and proximal spacer regions was more likely to be correlated with very high expression. For both species of Betaproteobacteria, R. palustris, R. sp. TM1040, E. coli, and X. campestris, a promoter with 75% GC content in the proximal spacer was among the top three highest expressing promoters in that species (Object 4-1). In all but one case, this high GC spacer region was paired with two consensus hexamers. For some species, differences in GC content between high and low expression promoters are more exaggerated when
the proximal spacer is analyzed rather than the full spacer, as in A. baylyi and R. palustris. Echoing the same trend, more proximal spacers in the high expression group have GC contents at or above 75% compared to GC contents of full spacers in most Gammaproteobacteria. As mentioned above, the datasets analyzed here are not large enough to outline definitive relationships between certain promoter characteristics and expression, though the high GC content of the spacer regions in high expressing promoters was somewhat surprising.

Characterized Promoters with Varying Lengths

The nonrepetitive promoter library tested here was modeled on the standard σ70-dependent promoter in E. coli, with the core hexamers at defined positions relative to the spacer region of 17 bp (71). Promoters with variable spacer region lengths may also be highly active in E. coli and other Proteobacteria (54, 122, 390, 402), and so we were interested in identifying high expression promoters with spacer lengths greater or less than 17 bp. The results discussed in this work so far are based on promoters that follow the standard promoter architecture of element positioning, which made up the majority of promoters screened in this work. However, some promoters had variable lengths and required a different data processing workflow to extract the core and spacer elements. There were two main issues in identifying the elements of these non-standard promoters: i) every sequence position of a promoter in the library has a likelihood of being degenerate, and so identification of discrete elements in variable positions was difficult, and ii) synthesis of the custom oligonucleotide promoters was imperfect and many sequences have variable lengths, making a standard workflow for parsing core and spacer elements impossible. Analysis of the standard length promoters revealed that the TTG motif at the upstream end of the -35 element was present in most promoters
across all expression categories (Object 4-2) and provided an anchoring point for element positioning in the promoters of variable lengths. In doing so, we were able to extract hypothetical core and spacer elements from these promoters and identify promoters with a higher potential for a non-standard spacer length. One such promoter was in S. sp. EE-36, where two consensus core hexamer sequences were 16 bp apart and RFU expression reached three orders of magnitude. Another promoter with consensus core elements and a 16 bp spacer was in R. palustris and here, expression reached three orders of magnitude, the same as the highest expressing promoter in this species.

Discrete promoter elements were more easily identified when both core hexamers were at consensus, but this was not always the case. Some promoters with variable lengths had poorly defined -35 and -10 regions but still achieved high expression. One of the highest expression promoters in P. syringae had no discernible AT-rich -10 element, nor was the TTG motif present in the upstream region. In S. enterica, a high expressing promoter lacked the TTG motif as well, but a potential -10 element was clear. Both promoters and their predicted core promoter elements are shown in Figure 4-7.

Promoter Transferability

To investigate the transferability of our promoters between Proteobacteria, we chose a subset of the characterized promoters from our libraries to rescreen in another species. We refer to the species where promoter library expression data were collected as the source species and the species where the promoter was rescreened as the target species. We chose E. coli NEB5, B. thailandensis, and S. sp. EE-36 as the target species for rescreening and tested promoters from four to eight source species in
each. For most source species, promoters across 2-3 categories of expression levels were chosen for rescreening. The results of the promoter rescreening experiment are shown in Figure 4-8 and an accompanying table of promoter names, sequences, and RFU output is in Figure 4-9.

In E. coli NEB5, 22 promoters were rescreened from 8 source species (Figure 4-8). For most promoters, trends in expression from a source species were similar in the NEB5 target. In other words, high, mid, and low expressing promoters were likely to follow this same rank order. The two notable exceptions were promoters from S. sp. EE-36 and M. niabensis, where promoter expression rank order changed from source to target species. For promoters from the S. sp. EE-36 library, promoter Su1 in the source species remained high in NEB5 at the exponential phase timepoint, but the rank order of expression from Su2 and Su3 was flipped in the target compared to the source. By stationary phase, promoter Su3 became the highest expressing promoter of Su1-Su3 in E. coli NEB5. In fact, this promoter gave the highest expression across all promoters screened in E. coli NEB5 in this experiment. This promoter contained consensus sequences for both core hexamers, had spacer and proximal spacer region GC contents above 60%, and did not contain an extended -10 element (Figure 4-9). This is consistent with our previous data suggesting that promoters from our library with higher spacer region GC contents had high expression in E. coli. For promoters from the M. niabensis library, the highest expressing promoter in M. niabensis, Mn1, gave the lowest expression in E. coli NEB5, with expression 131-fold higher in the Betaproteobacterium. This promoter also contained consensus core promoter hexamers. The low activity of Mn1 in E. coli could be the result of RNAP binding to the promoter
too tightly and being unable to clear the promoter and transition to elongation (78, 91, 102). Both promoters from the B. thailandensis library tested in E. coli NEB5 achieved similar levels of expression in source and target species, so this lack of transferability cannot be extended to all promoters moved from Beta- to Gammaproteobacteria. These results highlight the complex interplay of promoter elements and host-dependent factors that are involved in promoter function.

While the rank order of promoter expression from the source bacteria to the E. coli NEB5 target species was consistent across most bacteria, the level of expression from the same promoter varied widely. By late stationary phase, the Pa3 promoter from P. aeruginosa was virtually non-functional in E. coli NEB5; here, expression from the same promoter was 81-fold higher in P. aeruginosa than in E. coli. Neither core promoter element of Pa3 matched consensus, yet RFU expression reached three orders of magnitude in the Pseudomonas species. Similarly, the highest expressing promoter chosen from the A. fabrum library, Af1, expressed 28-fold higher in A. fabrum than in E. coli NEB5 by late stationary phase. This promoter contained a consensus -35 element and an extended -10 element. In genetic engineering experiments where genetic systems are built in E. coli and moved into a non-model species, this promoter characterization data is valuable to ensure predictable outcomes.

We also rescreened promoters in the B. thailandensis and S. sp. EE-36 target species (Figure 4-8). In B. thailandensis, promoters from the P. aeruginosa library expressed in the same rank order between source and target species. While expression levels from the same promoter in B. thailandensis and P. putida were similar, expression from Pa2 and Pa3 was almost 20-fold higher in P. aeruginosa. Indeed,
expression from promoters from the P. aeruginosa library was consistently higher when screened in P. aeruginosa than in any target species, and Pa1, the highest expressing promoter from this library, consistently had the highest expression of all promoters in the transferability experiments when tested in P. aeruginosa. This high level of expression in P. aeruginosa is consistent with another recent study that also found a higher capacity for transcriptional activation in this species (412). Promoter Ec2 expressed an order of magnitude higher in E. coli MG1655 than in B. thailandensis, and the rank order of expression was swapped between promoters Ec2 and Ec3 in the two species. The Ec2 promoter differs from consensus by one base pair in both core elements (Figure 4-9). Similarly, rank order was reversed in promoters Af2 and Af3, with the more highly expressing promoter in A. fabrum expressing two orders of magnitude lower in B. thailandensis by stationary phase. Af2 contains both consensus core hexamers and, while most high expressing promoters in the B. thailandensis library also contained both core elements at consensus, these data suggest that the promoter sequence outside of the core hexamers negatively impacted expression in this species.

In S. sp. EE-36, rank order of expression was the same for promoters from the E. coli MG1655 and B. thailandensis source species but not for those from P. aeruginosa and P. putida. The Pa1 promoter expressed highly in P. aeruginosa but had mid-level expression in S. sp. EE-36. Rather, Pa2 had the highest expression of Pa1-Pa3 in this target species. The Pa2 promoter contains AATAAT and TTGACT as its -10 and -35 elements, respectively. This is consistent with our dataset showing that the thymine at position -12 had low conservation in high expressing promoters with a non-consensus -10 element in the Alphaproteobacteria (Figure 4-4). The Pp1 promoter from the P.
putida library was highly expressing in both the source and S. sp. EE-36 target species, but the rank order of Pp2 and Pp3 was swapped. Overall, though our dataset was too small to draw statistically significant differences in hexamer conservation across species, this work highlights the utility of a larger dataset to determine the requirements for high conservation in diverse species.

Discussion

In this study, we established a toolbox of synthetic constitutive promoters characterized in 15 diverse species of Proteobacteria. The toolbox includes libraries of 15-43 promoters tested in each species, and promoter expression spans 3-5 orders of magnitude in each library. From these data, we surveyed trends in core promoter sequences, the GC content of the spacer region, and the presence or absence of an extended -10 element in high expressing promoters. Though our dataset was too small to draw significant conclusions, we noted characteristics of promoters that behaved in unexpected ways and compared high activity promoters across species in the same Proteobacterial class. We then tested the transferability of some promoters by selecting a subset for rescreening in another species. For promoters in more than half of the species included in the rescreening experiments, rank order of expression changed from source to target species. In some cases, a highly expressing promoter in one species lost virtually all activity when moved to another. Data gathered from these experiments highlight the variable activity of genetic parts across different host contexts, even when hosts are closely related species. Promoter transferability data are valuable in their own right and complement the promoter toolbox with cross-species promoter characterization. This work represents the most comprehensive characterization of a promoter toolbox across Proteobacterial species to date.
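
The two comparisons used throughout the transferability results, rank order of expression and fold difference between source and target hosts, can be sketched as follows. This is a minimal Python illustration; the function names are hypothetical and the RFU values are invented for the example and are not measurements from this work.

```python
def rank_order(rfu_by_promoter):
    """Promoter names sorted from highest to lowest RFU."""
    return sorted(rfu_by_promoter, key=rfu_by_promoter.get, reverse=True)


def fold_changes(source, target):
    """Source/target RFU ratio for each promoter measured in both hosts."""
    return {
        name: source[name] / target[name]
        for name in source
        if name in target and target[name] > 0
    }


def rank_preserved(source, target):
    """True if shared promoters have the same rank order in both hosts."""
    shared = [name for name in source if name in target]
    src = rank_order({name: source[name] for name in shared})
    tgt = rank_order({name: target[name] for name in shared})
    return src == tgt


# Hypothetical RFU values for three promoters (illustration only).
source_rfu = {"P1": 5000.0, "P2": 800.0, "P3": 40.0}
target_rfu = {"P1": 2500.0, "P2": 20.0, "P3": 100.0}

preserved = rank_preserved(source_rfu, target_rfu)  # False: P2 and P3 swap
ratios = fold_changes(source_rfu, target_rfu)       # P2 is 40-fold higher in the source
```

Keeping rank order and fold change as separate measures mirrors the observation above that a promoter set can retain its rank order in a new host while individual expression levels differ widely.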

Prior to this work, there were few established collections of constitutive promoters for use in non-model bacteria (56, 413). Of those, many consist of endogenous promoters, identified and reused in the same host. Using native promoters in synthetic circuits can be problematic, as they are more likely to contain cryptic regulatory elements and alternative transcriptional start sites, making them host-context dependent rather than orthogonal genetic parts (56, 414). For these reasons, we chose to test synthetic promoter libraries, as they offer more reliable and reproducible expression (55). Constitutive promoter libraries have proven valuable for the construction and testing of genetic systems because they provide options for fine-tuning expression levels. For example, the Anderson promoter library is among the most popular collections of synthetic constitutive promoters and includes 19 promoters spanning a gradient of expression levels in E. coli (70). Though the collection was developed for use in E. coli, it has been utilized in diverse species to optimize expression of transcriptional regulators, control the expression of an sgRNA, and expand toolkits for species where few characterized promoters are available (61, 150, 310, 415, 416). This exemplifies the demand for characterized genetic parts in non-model species, as these promoters are often chosen due to a lack of alternatives (150). The promoter libraries tested in this work are made available here to meet this need, providing the sequence and output of at least 15 promoters in 15 species spanning the Alpha-, Beta-, and Gammaproteobacteria.

More recently, synthetic promoter libraries have been used to study the relationship between promoter sequence and function. In a landmark study by Urtecho et al., the authors employed a massively parallel reporter assay to explore the transcriptional activity of over 10,000 promoter variants in E. coli (65). Their library consisted of every combination of a set of discrete promoter elements, including eight -35 elements, eight -10 elements, three UP elements, eight spacer regions, and eight background sequences. From their expression data, they were able to train a statistical model to predict promoter strength and describe more complex interactions between elements that increased or hampered activity. Another study screened over 14,000 promoters to predict site-specific transcriptional rates for σ70-dependent promoters in E. coli, with over 100 variants each of the UP element, -10 and -35 core elements, spacer, extended -10 element, discriminator, and initially transcribed region (54). While our goal was different and focused on characterizing promoter activity in non-model species, we surveyed the results of our screens to find notable trends and promoters with unexpected results. Unlike the screens mentioned above, our datasets were too small to draw statistically significant conclusions or definitively identify markers of high promoter function in the Proteobacteria; instead, our work illustrates the complex and nonlinear relationships among the elements that contribute to promoter function. For example, in 11 of the 15 species included in our screen, at least one of the highly active promoters did not possess a consensus core promoter element, and in 10 of the 15 species, we found at least one inactive promoter with both consensus hexamers. These consensus sequences are based on research in E. coli, but they are likely similar in other Proteobacteria based on existing research (390, 400, 402) and the high conservation among housekeeping σ factors (94, 95). However, core sequences are only part of a larger picture when considering the determinants of a highly active promoter.
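As a small worked illustration of the two ideas above, the following Python sketch counts how far a core hexamer sits from the canonical E. coli σ70 consensus (TTGACA for the -35 element, TATAAT for the -10 element) and checks the size of a full combinatorial library built from the element counts quoted for the Urtecho et al. design. The function name is an assumption introduced for illustration; the Pa2 hexamers are those reported in the transferability results.

```python
import math

CONSENSUS_35 = "TTGACA"  # canonical E. coli sigma70 -35 hexamer
CONSENSUS_10 = "TATAAT"  # canonical E. coli sigma70 -10 hexamer


def mismatches(hexamer, consensus):
    """Count positions at which a hexamer differs from the consensus."""
    if len(hexamer) != len(consensus):
        raise ValueError("hexamer and consensus must have the same length")
    return sum(a != b for a, b in zip(hexamer.upper(), consensus))


# Core elements reported for promoter Pa2 in the transferability results.
pa2_35, pa2_10 = "TTGACT", "AATAAT"
print(mismatches(pa2_35, CONSENSUS_35), mismatches(pa2_10, CONSENSUS_10))

# Every combination of eight -35 elements, eight -10 elements, three UP
# elements, eight spacers, and eight background sequences.
library_size = math.prod([8, 8, 3, 8, 8])
print(library_size)
```

A mismatch count treats every position equally, which is a simplification; the observations above (inactive promoters with both consensus hexamers, active promoters without them) are exactly the cases such a count cannot explain.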

While not done to the scale of related work in E. coli (65, 71), our work still identified some interesting and contradictory relationships between sequence and expression that could be further explored in more systematic screens of promoter activity. One unexpected trend in our dataset was the high GC content of the spacer regions of highly active promoters. In larger studies of promoter function in E. coli, GC content is negatively correlated with the level of promoter expression (65), but most work investigating promoters in other Proteobacteria has focused on the length of the spacer region and not its sequence composition (390, 402). A recent study investigating spacer elements has suggested that the GC content of the spacer region plays a crucial role in DNA supercoiling sensitivity, impacting the timing of expression and potentially the phasing of the -10 and -35 core promoter regions on the face of DNA (97). The effect of spacer region length and sequence on overall promoter output provides another avenue to further explore promoter function in non-model species. The effect of an extended -10 element could also be better explored in a larger dataset. Similar to its function in E. coli, there is evidence that this motif functions to stabilize the RNAP holoenzyme and compensate for weak core element binding in other Proteobacteria (102). The contribution of the extended -10 element to overall promoter function has not been thoroughly explored outside of E. coli (65, 71), and exploring it would further our understanding of promoter function across species.

In this study, we rescreened a subset of our promoters in a second species to evaluate variations in activity when the same genetic part is moved into a new host. Cross-species promoter characterization is valuable, as research has shown that promoter behavior is not consistent even between closely related species (259, 314). This may be due in part to transcriptional laxity, which refers to the capacity to efficiently transcribe from mutated endogenous promoters or horizontally transferred DNA (122). Some species, particularly the Alphaproteobacteria, have higher transcriptional laxity than others and, as such, are able to utilize a larger range of sequences effectively as promoters (122, 390). In our experiments, promoters originally screened in P. aeruginosa consistently achieved higher expression in P. aeruginosa than in the rescreening species (Figure 4-8). P. aeruginosa is an opportunistic pathogen with a complex metabolism that enables it to adapt to free-living and pathogenic lifestyles (417); hence, an increased transcriptional laxity in this species would be evolutionarily beneficial. The same may be true of A. fabrum, P. putida, and other metabolically diverse species, but our dataset is not large enough to capture these trends. Given a larger dataset, we hypothesize that species that are opportunistic pathogens or possess diverse metabolisms would have a high degree of transcriptional laxity and an increased likelihood of effectively using a given promoter. It would be interesting to systematically rescreen promoters through this lens, comparing levels of transcriptional laxity to the lifestyles and metabolisms of diverse Proteobacteria.

Overall, we characterized libraries of constitutive promoters in 15 diverse species of Proteobacteria and further characterized a subset of these promoters in a second species to test the transferability of the genetic parts. Though our dataset was too small to draw conclusions, we note interesting and unexpected sequence-function relationships that can be further explored in a more comprehensive study. Our work highlights the complexities of determining promoter activity from sequence across species despite the high conservation of the housekeeping σ factors in Proteobacteria. Indeed, a phylogenetic tree made from the protein sequences of housekeeping σ factors in the 15 species studied here is virtually identical to a tree constructed from a 16S rRNA alignment (Figure 4-1, Figure 4-10), consistent with similar studies (95). In a multiple sequence alignment of housekeeping σ factors in these species, is 100% conserved and is less but still highly conserved. The discrepancy between the high the sequences of highly active promoters suggests greater complexity in the sequence determinants of a strong promoter than proximity of core elements to the canonical E. coli consensus. In characterizing promoters across species and identifying transferable and non-transferable sequences, we have contributed to the body of work focused on promoter sequence-function relationships. The study of promoters in different species and how they behave when transferred between hosts is highly valuable to the field and will accelerate the study and engineering of previously unexplored bacteria.

Table 4-1. Strains investigated in this study

Strain                                       Phylogenetic class
Acinetobacter baylyi ADP1                    Gammaproteobacteria
Agrobacterium fabrum C58                     Alphaproteobacteria
Burkholderia thailandensis E264 *            Betaproteobacteria
DSM Massilia niabensis                       Betaproteobacteria
Ensifer meliloti SM1021                      Alphaproteobacteria
Enterobacter cloacae ATCC 13047              Gammaproteobacteria
Escherichia coli MG1655                      Gammaproteobacteria
Pseudomonas aeruginosa PAO1                  Gammaproteobacteria
Pseudomonas putida KT2440                    Gammaproteobacteria
Pseudomonas syringae pv. syringae IBSBF 281  Gammaproteobacteria
Rhodopseudomonas palustris                   Alphaproteobacteria
Ruegeria sp. TM1040                          Alphaproteobacteria
Salmonella enterica Typhimurium              Gammaproteobacteria
Sulfitobacter sp. EE-36 *                    Alphaproteobacteria
Xanthomonas campestris ATCC 33913            Gammaproteobacteria

* Strains included in promoter swapping experiments

Table 4-2. Strain growth conditions

Strain                                       Growth medium  Incubation temperature  Gentamicin concentration
Acinetobacter baylyi ADP1                    LB             30°C
Agrobacterium fabrum C58                     LB             30°C
Burkholderia thailandensis E264              LSLB           37°C
DSM Massilia niabensis                       R2A            30°C
Ensifer meliloti SM1021                      LB             30°C
Enterobacter cloacae ATCC 13047              LB             37°C
Escherichia coli MG1655                      LB             37°C
Pseudomonas aeruginosa PAO1                  LB             37°C
Pseudomonas putida KT2440                    LB             30°C
Pseudomonas syringae pv. syringae IBSBF 281  LB             RT
Rhodopseudomonas palustris                   LB             30°C
Ruegeria sp. TM1040                          ½ YTSS         30°C
Salmonella enterica Typhimurium              LB             37°C
Sulfitobacter sp. EE-36                      ½ YTSS         30°C
Vibrio vulnificus                            LB             30°C
Xanthomonas campestris ATCC 33913            LB             30°C

Table 4-3. Transformation conditions

Strain                                       Transformation method   Recovery medium
Acinetobacter baylyi ADP1                    Natural transformation  LB
Agrobacterium fabrum C58                     Electroporation 2 •     LB
Burkholderia thailandensis E264              Electroporation 1 •     LSLB
DSM Massilia niabensis                       Conjugation             R2A
Ensifer meliloti SM1021                      Conjugation             LB
Enterobacter cloacae ATCC 13047              Electroporation 2       LB
Escherichia coli MG1655                      Electroporation 2       LB
Pseudomonas aeruginosa PAO1                  Electroporation 1       LB
Pseudomonas putida KT2440                    Electroporation 1       LB
Pseudomonas syringae pv. syringae IBSBF 281  Electroporation 1       LB
Rhodopseudomonas palustris                   Conjugation             LB
Ruegeria sp. TM1040                          Electroporation 2 •     ½ YTSS
Salmonella enterica Typhimurium              Electroporation 2       LB
Sulfitobacter sp. EE-36                      Electroporation 2 •     ½ YTSS
Xanthomonas campestris ATCC 33913            Electroporation 2       LB

1 Cells were taken from an overnight growth and made electrocompetent
2 Cells were subcultured from an overnight growth and made electrocompetent when in mid-log phase
Electroporated with 1.8 kV, 1 pulse
• Electroporated with 2.2 kV, 1 pulse

Table 4-4. Promoter categorization. The number of promoters categorized as high-, mid-, and low-expression sequences and the total number of promoters in the library screened in each species are listed below. The full sequences of promoters in each category are available in Object 4-1.

Strain                                       High  Mid  Low  Total
Acinetobacter baylyi ADP1                    6     5    15   26
Agrobacterium fabrum C58                     12    9    8    29
Burkholderia thailandensis E264              15    11   10   36
DSM Massilia niabensis                       5     4    19   28
Ensifer meliloti SM1021                      14    14   14   42
Enterobacter cloacae ATCC 13047              16    9    9    34
Escherichia coli MG1655                      9     17   17   43
Pseudomonas aeruginosa PAO1                  17    17   9    43
Pseudomonas putida KT2440                    19    9    13   41
Pseudomonas syringae pv. syringae IBSBF 281  7     8    20   35
Rhodopseudomonas palustris                   7     10   10   27
Ruegeria sp. TM1040                          8     1    16   25
Salmonella enterica Typhimurium              13    7    8    28
Sulfitobacter sp. EE-36                      12    8    12   32
Xanthomonas campestris ATCC 33913            6     6    3    15
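The counts in Table 4-4 can be checked for internal consistency and summarized with a short script. The sketch below transcribes the table (reading digits that the layout splits, such as "1 9", as single numbers) and confirms that the high, mid, and low counts add up to the stated total for every strain. The variable and function names are assumptions introduced for illustration.

```python
# (high, mid, low, total) per strain, transcribed from Table 4-4.
TABLE_4_4 = {
    "Acinetobacter baylyi ADP1": (6, 5, 15, 26),
    "Agrobacterium fabrum C58": (12, 9, 8, 29),
    "Burkholderia thailandensis E264": (15, 11, 10, 36),
    "DSM Massilia niabensis": (5, 4, 19, 28),
    "Ensifer meliloti SM1021": (14, 14, 14, 42),
    "Enterobacter cloacae ATCC 13047": (16, 9, 9, 34),
    "Escherichia coli MG1655": (9, 17, 17, 43),
    "Pseudomonas aeruginosa PAO1": (17, 17, 9, 43),
    "Pseudomonas putida KT2440": (19, 9, 13, 41),
    "Pseudomonas syringae pv. syringae IBSBF 281": (7, 8, 20, 35),
    "Rhodopseudomonas palustris": (7, 10, 10, 27),
    "Ruegeria sp. TM1040": (8, 1, 16, 25),
    "Salmonella enterica Typhimurium": (13, 7, 8, 28),
    "Sulfitobacter sp. EE-36": (12, 8, 12, 32),
    "Xanthomonas campestris ATCC 33913": (6, 6, 3, 15),
}


def fraction_high(counts):
    """Fraction of a library that fell into the high-expression category."""
    high, _mid, _low, total = counts
    return high / total


# Strains whose category counts do not add up to the listed total.
inconsistent = [
    strain
    for strain, (high, mid, low, total) in TABLE_4_4.items()
    if high + mid + low != total
]
print(inconsistent)

for strain, counts in TABLE_4_4.items():
    print(f"{strain}: {fraction_high(counts):.0%} high-expression")
```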

Figure 4-1. Phylogenetic tree of species included in the study. This study includes species from the Alpha-, Beta-, and Gammaproteobacteria classes. Ensifer meliloti is referred to by its synonym Sinorhizobium meliloti.

Figure 4-2. Range of expression from promoter libraries screened in each species. Data are from the late stationary phase of growth and include 92 promoters in each library. Library picking is described in Materials and Methods. The black horizontal line indicates the fluorescence of wild-type cells.

Figure 4-3. WebLogos of core hexamers from high-expression promoter groups. Logos are grouped into species from the Gamma- (right), Alpha- (top left), and Betaproteobacteria (bottom left). The -10 hexamer is in the left column of each group and the -35 hexamer is at the right. The number of sequences used to make each logo is listed in Table 4-4.

Figure 4-4. Logos of non-consensus -10 elements from high-activity promoters in the Alphaproteobacteria. The left column shows logos representing conservation in bits, and the right column shows probability logos in which residues are scaled relative to the statistical significance of each residue at each position, with log10(p-value) on the y-axis. Apart from the conserved thymine at position -7, little conservation is apparent across this Proteobacterial class.
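For readers unfamiliar with conservation measured in bits, the quantity plotted in a standard sequence logo can be sketched as follows: for each position, the information content is the maximum entropy for DNA (2 bits) minus the observed Shannon entropy of the bases at that position. The Python below is a minimal illustration that omits any small-sample correction that logo software may apply; the function names and the example hexamers are hypothetical and are not sequences from this study.

```python
import math
from collections import Counter


def information_bits(column):
    """Information content, in bits, of one alignment column of DNA bases."""
    counts = Counter(column.upper())
    total = sum(counts.values())
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return math.log2(4) - entropy


def logo_heights(sequences):
    """Per-position information content for equal-length aligned sequences."""
    return [information_bits("".join(col)) for col in zip(*sequences)]


# Hypothetical -10 hexamers used only to exercise the functions.
hexamers = ["TATAAT", "TAGAAT", "CATGAT", "GATCAT"]
heights = logo_heights(hexamers)
print([round(h, 3) for h in heights])
```

A fully conserved position scores 2 bits and a position with all four bases equally represented scores 0 bits, which is why a weakly conserved position appears nearly flat in a logo.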

Figure 4-5. Violin plots of GC content of the full spacer region for 15 species. Each colored pair of plots represents the spread of GC content percentage in each species. In each pair of plots, the first represents data from the high-expression promoter group and the second from the low-expression group. The number and spread of data points used to make the plots are represented by open circles within each plot.
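The quantity summarized in these plots, the GC percentage of the spacer between the -35 and -10 hexamers, can be computed as in the sketch below. The promoter shown is a hypothetical construct (consensus hexamers flanking an arbitrary 17-nucleotide spacer), and the function names and index convention are assumptions made for illustration rather than the analysis code used in this work.

```python
def gc_percent(seq):
    """Percentage of G and C bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def spacer(promoter, end_35, start_10):
    """Slice out the spacer between the -35 and -10 hexamers.

    end_35 is the index just past the -35 hexamer and start_10 is the
    index of the first base of the -10 hexamer (0-based, half-open).
    """
    return promoter[end_35:start_10]


# Hypothetical promoter: -35 hexamer + 17-nt spacer + -10 hexamer.
example_spacer = "GCGCGCATATATGCGCA"
promoter = "TTGACA" + example_spacer + "TATAAT"

region = spacer(promoter, 6, 6 + len(example_spacer))
print(len(region), round(gc_percent(region), 1))
```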

Figure 4-6. Violin plots of GC content of the proximal spacer region for 15 species. Each colored pair of plots represents the spread of GC content percentage in each species. In each pair of plots, the first represents data from the high-expression promoter group and the second from the low-expression group. The number and spread of data points used to make the plots are represented by open circles within each plot.

Figure 4-7. High-expressing promoters without clear core elements. Promoter sequences in P. syringae and S. enterica possessed no discernible -10 or -35 elements yet achieved very high levels of expression. Potential core hexamers are highlighted in purple (-35 hexamer) and blue (-10 hexamer).

Figure 4-8. Promoter transferability screens. Selected library promoters were retested in three species. In each pair of bars on each bar graph, the first bar represents expression from the target species, where the promoter was rescreened, and the second represents expression from the source host, where the promoter was first screened. Promoters are named based on their source host: Pa (P. aeruginosa), Pp (P. putida), Ec (E. coli), Bt (B. thailandensis), Su (S. sp. EE-36), Mn (M. niabensis), Ag (A. fabrum), Ru (R. sp. TM1040). Data are average RFU expression at exponential and stationary phase; the standard deviation of triplicates is shown on the graphs.
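The summary statistics named in this caption, the mean and standard deviation of triplicate RFU readings, can be computed with the Python standard library as sketched below. The readings are hypothetical, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the caption does not state which estimator was used.

```python
from statistics import mean, stdev


def summarize(replicates):
    """Return (mean, sample standard deviation) for replicate RFU readings."""
    if len(replicates) < 2:
        raise ValueError("need at least two replicates for a standard deviation")
    return mean(replicates), stdev(replicates)


# Hypothetical triplicate readings for one promoter in one host.
readings = [1200.0, 1350.0, 1275.0]
average, spread = summarize(readings)
print(average, spread)
```

`statistics.pstdev` would give the population standard deviation instead, which is smaller for the same three readings.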

Figure 4-9. Sequences included in promoter transferability experiments. Promoter names are listed at the left and reference the species where the promoter was first screened, as listed in Figure 4-8. The full promoter sequence is shown with the -10 and -35 regions highlighted in blue. Promoter expression as RFU from screening in the source species is included in the right column.

Figure 4-10. Phylogenetic tree constructed from housekeeping σ factor sequences. The tree represents the phylogenetic relationship of the housekeeping σ factors for the Alpha-, Beta-, and Gammaproteobacteria species included in this work, except for M. niabensis, as no RpoD protein sequence is available for this species. For S. sp. EE-36, the closely related S. strain NAS-14.1 is used in its place.

CHAPTER 5
CONCLUSIONS AND FUTURE DIRECTIONS

The work presented here has greatly expanded the genetic tools available for gene expression in the Proteobacteria. With the development of our broad-host-range plasmid toolbox, characterized inducible expression systems are available in non-model bacteria where few options existed before. The toolbox, which includes 12 inducible systems screened across nine Proteobacteria, expedites the development of reliable genetic systems in these hosts and potentially many more. Researchers working in Burkholderia, Pseudomonas, Agrobacterium, and other Proteobacterial species can implement these tools directly or modify and optimize the systems to fit their needs. Additionally, we developed a standardized workflow for the assembly of these inducible systems into broad-host-range plasmids, increasing portability into new hosts and reproducibility across experiments and laboratory settings.

We then further defined the capabilities of our plasmid toolbox by leveraging the high output of the inducible systems for protein overproduction in E. coli. We characterized these systems across the parameters most important to overexpression workflows, including dynamic range of expression, tunability, system stability, and functionality in multi-plasmid experiments. Further, we demonstrated that our inducible systems could improve yields of a value-added end product when compared to a previously designed metabolic pathway. While achieving levels of expression that were often as high as or higher than T7-regulated expression, our systems were also more stable and more predictable. Overall, our plasmids proved to be attractive alternatives to the widely used pET and Duet vectors that utilize the T7 system, yielding high-level expression without many of the associated drawbacks.

Finally, we investigated the fundamental elements of promoter activity across Alpha-, Beta-, and Gammaproteobacteria through constitutive promoter libraries. Large-scale screens characterized the activity of the library in each of 15 species, and small-scale promoter swapping experiments measured activity from selected promoters transferred between species. We demonstrated that promoters with -10 and -35 element sequences that are similar to E. coli consensus hexamers were likely to be highly active, though there were important sequence differences in mid-level and low-activity promoters. Promoters with lower activity may be necessary to match physiological levels of a native system during complementation studies or to balance metabolic flux when engineering synthetic metabolic circuits (166, 169, 207). This work established collections of constitutive promoters with activity levels spanning 3–5 orders of magnitude in three classes of Proteobacteria.

Although σ70 proteins are highly similar across Proteobacteria, particularly in the regions interacting with core promoter elements (81, 87, 95, 395), recent studies have highlighted distinct differences in conserved motif sequences (122, 389). For example, many Betaproteobacteria promoters lack a TTG sequence in the -35 hexamer and many Alphaproteobacteria promoters lack a thymine residue at position -7, both thought to be widely conserved (389, 410). Alphaproteobacteria promoters recognized by σ70 also have more variable spacer region lengths and more diverse -10 elements compared to canonical E. coli promoters (390, 402). These differences in core promoter sequences among Proteobacteria and how they relate to promoter function are worth investigating experimentally to determine the sequence determinants of promoter function across species. In E. coli, promoters with core elements perfectly matching consensus can have lower activity because the RNAP holoenzyme binds the promoter too tightly and cannot transition to promoter clearance efficiently (78, 91, 102). Comprehensive promoter screens across different Proteobacteria would further our understanding of the sequence threshold for high activity without impeded promoter clearance in non-model bacteria.

While our broad-host-range plasmid toolbox study involved the most comprehensive screening of expression systems across species that we are aware of (51), commonly used promoter-regulator pairs have been screened in some of the same species by other groups (153, 206, 210, 259, 337, 344, 345). Inducible systems native to E. coli usually retain functionality when moved to another host; however, expression levels differ and are frequently unpredictable (208, 344, 356, 357). In our study, variability in output among species did not appear to follow phylogeny (51). For example, though some induction profiles in the Alphaproteobacteria were similar, behavior across promoter-regulator pairs in Pseudomonas aeruginosa and Pseudomonas putida was surprisingly different (51). While variable expression among closely related species resembles what is seen with some constitutive promoters (389, 390, 402, 418), no study has investigated this explicitly.

Expression from an inducible system involves more variables than expression from a constitutive promoter. While the regulator protein, its promoter, inducing concentrations, and induction timing are important for an inducible system, the rate of expression from a constitutive promoter depends on the sequence of the promoter and the growth of the cell (68, 419). Though σ70 in different Proteobacteria correspond with phylogenetic class (94, 95). Therefore, there may be a σ70-dependent promoters and their expression rate may be simpler to predict. In our constitutive promoter study, 15–43 promoters were characterized in each of the species included in that work. This dataset is valuable to the scientific community in providing promoter sequences with defined expression levels in three classes of Proteobacteria. However, the study did not generate enough data to systematically investigate the sequence elements necessary for promoter function across species. To do this, a much larger dataset of promoters characterized across many species is necessary.

Large-scale library screens across multiple species require standardized genetic tools to screen thousands of promoter sequences effectively. In practice, this requires that the promoter library be easily portable into different species for testing. For this to be true, the species under study should be capable of being transformed, and the library itself should be easily transferable into and between hosts. The work included herein provides tools and workflows to the scientific community that support both goals. Our system for the standardized assembly of broad-host-range vectors facilitates the efficient construction of plasmid libraries that are compatible across many species of Proteobacteria. Further, our workflows for transforming and screening diverse Proteobacteria lay the groundwork for larger-scale experiments that systematically dissect promoter sequences as they relate to activity across species.

Predicting the activity of a promoter from its sequence remains a challenge due to the complex nonlinear interactions among sequence elements (65). Even short promoters comprise a vast sequence space to be explored, with 4^50 possible sequences in a 50-nucleotide promoter (132). Analysis of promoter activity from datasets generated by screening large libraries across multiple species will likely surpass the capabilities of the partial least squares modeling that has been used previously (420, 421). Rather, these large and complex datasets of interacting elements are ideal for artificial intelligence-based (AI-based) predictive modeling. AI has been used effectively in predicting promoter activity in E. coli (63, 65, 132), distinguishing between productive and abortive promoters (422), and predicting transcriptional start sites (54). Machine learning algorithms such as XGBoost, random forest regressor, and AdaBoost, in addition to neural network deep learning models, have been tested and compared on promoter sequence-activity datasets. Through these studies, researchers have developed platforms to optimize promoter strength prediction models and design de novo promoter elements in E. coli, furthering our understanding of cooperative interactions among promoter elements (63, 65, 132).

AI predictive modeling has not yet been applied to promoter prediction across species. Existing machine learning algorithms that have been implemented for these purposes in E. coli, particularly neural networks, can be used as foundational models to begin analyzing data from more diverse species. Here, models could be utilized to predict promoter activity both within a species and across species. It would be -10 element and the GC content of the spacer region along with parameters for core promoter elements, UP elements, and spacer length, as no model has included all of these together (54, 65, 71). For example, a model trained on data gathered from five Alphaproteobacteria could be implemented in predicting promoter behavior in a sixth Alphaproteobacteria species where no characterized promoters yet exist. While it is likely that promoters with -10 and -35 element sequences at or near the E. coli consensus are highly active in Proteobacteria, complex interactions among all the elements of a promoter contribute to overall expression, and these interactions have not been elucidated across Proteobacterial species. Indeed, our data demonstrated that the presence or absence of consensus core hexamers was not consistently related to high or low expression, respectively (Object 4-1). Whether a phylogenetic relationship exists among promoter sequences yielding high, medium, or low activity across Proteobacterial classes has not yet been thoroughly investigated.

Overall, genetic tools with predictable behavior are necessary to work effectively in bacteria. The identification and characterization of active promoters in different Proteobacteria will accelerate the study of non-model species and facilitate the design of reliable genetic systems in previously under- and unutilized hosts. Systematic and comprehensive screens of promoter activity across species will not only provide tools for community use but also add to the datasets available for predictive modeling. The work described herein provides toolsets to control gene expression across diverse species of Proteobacteria, enabling the construction of reliable genetic systems in non-model bacteria. Continuation of this work utilizing larger libraries and predictive modeling will provide valuable insight into the portability of promoters across species and facilitate research into promoter sequence-function relationships in diverse bacteria.
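To make the proposed cross-species modeling more concrete, the sketch below shows two preparatory steps such a study might use: one-hot encoding of promoter sequences as model input, and a leave-one-species-out split in which a model is trained on several species and evaluated on a held-out one. It also computes the size of the 50-nucleotide sequence space mentioned above. Everything here is an illustrative assumption: the function names, species labels, sequences, and activity values are hypothetical, and this is not the encoding or validation scheme of any cited study.

```python
BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA sequence as a flat list of 0/1 values, four per base."""
    encoded = []
    for base in seq.upper():
        if base not in BASES:
            raise ValueError(f"unexpected base: {base!r}")
        encoded.extend(1 if base == b else 0 for b in BASES)
    return encoded


def leave_one_species_out(records, held_out):
    """Split (species, sequence, activity) records by species.

    Records from `held_out` form the test set; all others form the
    training set, mimicking prediction in an uncharacterized host.
    """
    train = [r for r in records if r[0] != held_out]
    test = [r for r in records if r[0] == held_out]
    return train, test


# Number of distinct sequences of length 50 over a four-letter alphabet.
sequence_space = 4 ** 50
print(sequence_space)

# Hypothetical records: (species label, hexamer, log10 activity).
records = [
    ("species_A", "TTGACA", 3.2),
    ("species_B", "TTGACT", 2.1),
    ("species_C", "TAGACA", 0.4),
]
train, test = leave_one_species_out(records, "species_C")
features = [one_hot(seq) for _, seq, _ in train]
print(len(train), len(test), len(features[0]))
```

Holding out an entire species, rather than random promoters, is the design choice that tests transfer to a host with no characterized promoters, which is the scenario described above.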

LIST OF REFERENCES

1. Mukherji,S. and Van Oudenaarden,A. (2009) Synthetic biology: Understanding biological design from synthetic circuits. Nat. Rev. Genet., 10, 859–871.
2. Wang,Y.H., Wei,K.Y. and Smolke,C.D. (2013) Synthetic biology: Advancing the design of diverse genetic systems. Annu. Rev. Chem. Biomol. Eng., 4, 69–102.
3. Endy,D. (2005) Foundations for engineering biology. Nature, 438, 449–453.
4. Kitney,R. and Freemont,P. (2012) Synthetic biology – The state of play. FEBS Lett., 586, 2029–2036.
5. Del Vecchio,D. (2015) Modularity, context-dependence, and insulation in engineered biological circuits. Trends Biotechnol., 33, 111–119.
6. Purnick,P.E.M. and Weiss,R. (2009) The second wave of synthetic biology: From modules to systems. Nat. Rev. Mol. Cell Biol., 10, 410–422.
7. Clifton,K.P., Jones,E.M., Paudel,S., Marken,J.P., Monette,C.E., Halleran,A.D., Epp,L. and Saha,M.S. (2018) The genetic insulator RiboJ increases expression of insulated genes. J. Biol. Eng., 12.
8. Cardinale,S. and Arkin,A.P. (2012) Contextualizing context for synthetic biology – identifying causes of failure of synthetic biological systems. Biotechnol. J., 7, 856–866.
9. Costello,A. and Badran,A.H. (2021) Synthetic Biological Circuits within an Orthogonal Central Dogma. Trends Biotechnol., 39, 59–71.
10. Rao,C. V. (2012) Expanding the synthetic biology toolbox: Engineering orthogonal regulators of gene expression. Curr. Opin. Biotechnol., 23, 689–694.
11. Davis,J.H., Rubin,A.J. and Sauer,R.T. (2011) Design, construction and characterization of a set of insulated bacterial promoters. Nucleic Acids Res., 39, 1131–1141.
12. Kelly,J.R., Rubin,A.J., Davis,J.H., Ajo-Franklin,C.M., Cumbers,J., Czar,M.J., de Mora,K., Glieberman,A.L., Monie,D.D. and Endy,D. (2009) Measuring the activity of BioBrick promoters using an in vivo reference standard. J. Biol. Eng., 3, 4.
13. Canton,B., Labno,A. and Endy,D. (2008) Refinement and standardization of synthetic biological parts and devices. Nat. Biotechnol., 26, 787–793.
14. Mutalik,V.K., Guimaraes,J.C., Cambray,G., Lam,C., Christoffersen,M.J., Mai,Q.A., Tran,A.B., Paull,M., Keasling,J.D., Arkin,A.P., et al. (2013) Precise and reliable gene expression via standard transcription and translation initiation elements. Nat. Methods, 10, 354–360.

15. Guido,N.J., Wang,X., Adalsteinsson,D., McMillen,D., Hasty,J., Cantor,C.R., Elston,T.C. and Collins,J.J. (2006) A bottom-up approach to gene regulation. Nature, 439, 856–860.
16. Mutschler,H., Robinson,T., Tang,T.Y.D. and Wegner,S. (2019) Special Issue on Bottom-Up Synthetic Biology. ChemBioChem, 20, 2533–2534.
17. Knight,T. (2005) Idempotent Vector Design for Standard Assembly of Biobricks Standard Biobrick Sequence Interface. MIT Synth. Biol. Work. Gr.
18. Wu,G.C., Goler,J.A., Anderson,J.C., Keasling,J.D., Leguia,M., Arkin,A.P. and Dueber,J.E. (2010) BglBricks: A flexible standard for biological part assembly. J. Biol. Eng., 10.1186/1754-1611-4-1.
19. Kærn,M., Blake,W.J. and Collins,J.J. (2003) The Engineering of Gene Regulatory Networks. Annu. Rev. Biomed. Eng., 5, 179–206.
20. Voigt,C.A. (2012) Synthetic biology. ACS Synth. Biol., 1, 1–2.
21. Nora,L.C., Westmann,C.A., Guazzaroni,M.E., Siddaiah,C., Gupta,V.K. and Silva-Rocha,R. (2019) Recent advances in plasmid-based tools for establishing novel microbial chassis. Biotechnol. Adv., 37, 107433.
22. Adams,B.L. (2016) The Next Generation of Synthetic Biology Chassis: Moving Synthetic Biology from the Laboratory to the Field. ACS Synth. Biol., 5, 1328–1330.
23. Gold,L. (1990) Expression of heterologous proteins in Escherichia coli. Methods Enzymol., 185, 11–14.
24. Hewitt,L. and McDonnell,J.M. (2004) Screening and optimizing protein production in E. coli. Methods Mol. Biol., 278, 1–16.
25. Brooks,S.M. and Alper,H.S. (2021) Applications, challenges, and needs for employing synthetic biology beyond the lab. Nat. Commun., 12, 1390.
26. Russo,E. (2003) Special Report: the birth of biotechnology. Nature, 421, 456–457.
27. Kinch,M.S. (2015) An overview of FDA-approved biologics medicines. Drug Discov. Today, 20, 393–398.
28. Dunbar,C.E., High,K.A., Joung,J.K., Kohn,D.B., Ozawa,K. and Sadelain,M. (2018) Gene therapy comes of age. Science, 359, eaan4672.
29. El Karoui,M., Hoyos-Flight,M. and Fletcher,L. (2019) Future trends in synthetic biology – a report. Front. Bioeng. Biotechnol., 7, 175.
30. P Teixeira,A. and Fussenegger,M. (2019) Engineering mammalian cells for disease diagnosis and treatment. Curr. Opin. Biotechnol., 55, 87–94.

31. François,J.M., Lachaux,C. and Morin,N. (2020) Synthetic Biology Applied to Carbon Conservative and Carbon Dioxide Recycling Pathways. Front. Bioeng. Biotechnol., 7, 446.
32. Batista,M.B. and Dixon,R. (2019) Manipulating nitrogen regulation in diazotrophic bacteria for agronomic benefit. Biochem. Soc. Trans., 47, 603–614.
N. Biotechnol., 27, 478–481.
34. Khan,S., Afzal,M., Iqbal,S. and Khan,Q.M. (2013) Plant-bacteria partnerships for the remediation of hydrocarbon contaminated soils. Chemosphere, 90, 1317–1332.
35. Kitagawa,W., Takami,S., Miyauchi,K., Masai,E., Kamagata,Y., Tiedje,J.M. and Fukuda,M. (2002) Novel 2,4-dichlorophenoxyacetic acid degradation genes from oligotrophic Bradyrhizobium sp. strain HW13 isolated from a pristine environment. J. Bacteriol., 184, 509–518.
36. Wood,D.W., Setubal,J.C., Kaul,R., Monks,D.E., Kitajima,J.P., Okura,V.K., Zhou,Y., Chen,L., Wood,G.E., Almeida,J., et al. (2001) The genome of the natural genetic engineer Agrobacterium tumefaciens C58. Science, 294, 2317–2323.
37. Thompson,M.G., Moore,W.M., Hummel,N.F.C., Pearson,A.N., Barnum,C.R., Scheller,H. V. and Shih,P Primed for Synthetic Biology. BioDesign Res., 2020, 8189219.
38. Gong,T., Xu,X., Dang,Y., Kong,A., Wu,Y., Liang,P., Wang,S., Yu,H., Xu,P. and Yang,C. (2018) An engineered Pseudomonas putida can simultaneously degrade organophosphates, pyrethroids and carbamates. Sci. Total Environ., 628–629, 1258–1265.
39. Nikel,P.I., Martínez-García,E. and De Lorenzo,V. (2014) Biotechnological domestication of pseudomonads using synthetic biology. Nat. Rev. Microbiol., 12, 368–379.
40. Nikel,P.I. and de Lorenzo,V. (2018) Pseudomonas putida as a functional chassis for industrial biocatalysis: From native biochemistry to trans-metabolism. Metab. Eng., 50, 142–155.
41. Mertz,J.E. and Davis,R.W. (1972) Cleavage of DNA by R1 restriction endonuclease generates cohesive ends. Proc. Natl. Acad. Sci. U. S. A., 69, 3370–3374.
42. Loenen,W.A.M., Dryden,D.T.F., Raleigh,E.A., Wilson,G.G. and Murray,N.E. (2014) Highlights of the DNA cutters: A short history of the restriction enzymes. Nucleic Acids Res., 42, 3–19.
43. Shetty,R.P., Endy,D. and Knight,T.F. (2008) Engineering BioBrick vectors from BioBrick parts. J. Biol. Eng., 2, 1–12.

44. Silva Rocha,R., Martínez García,E., Calles,B., Chavarría,M., Arce Rodríguez,A., De Las Heras,A., Páez Espino,A.D., Durante Rodríguez,G., Kim,J., Nikel,P.I., et al. (2013) The Standard European Vector Architecture (SEVA): A coherent platform for the analysis and deployment of complex prokaryotic phenotypes. Nucleic Acids Res., 41, 666–675.
45. Jajesniak,P. and Wong,T.S. (2015) QuickStep Cloning: A sequence independent, ligation free method for rapid construction of recombinant plasmids. J. Biol. Eng., 9.
46. Valenzuela Ortega,M. and French,C. (2021) Joint universal modular plasmids (JUMP): a flexible vector platform for synthetic biology. Synth. Biol., 6.
47. Engler,C., Kandzia,R. and Marillonnet,S. (2008) A one pot, one step, precision cloning method with high throughput capability. PLoS One, 3, e3647.
48. Weber,E., Engler,C., Gruetzner,R., Werner,S. and Marillonnet,S. (2011) A modular cloning system for standardized assembly of multigene constructs. PLoS One, 6, e16765.
49. Quan,J. and Tian,J. (2009) Circular Polymerase Extension Cloning of Complex Gene Libraries and Pathways. PLoS One, 4, e6441.
50. Gibson,D.G., Young,L., Chuang,R.Y., Venter,J.C., Hutchison,C.A. and Smith,H.O. (2009) Enzymatic assembly of DNA molecules up to several hundred kilobases. Nat. Methods, 6, 343–345.
51. Schuster,L.A. and Reisch,C.R. (2021) A plasmid toolbox for controlled gene expression across the Proteobacteria. Nucleic Acids Res., 49, 7189–7202.
52. Mahr,R. and Frunzke,J. (2016) Transcription factor based biosensors in biotechnology: current state and future prospects. Appl. Microbiol. Biotechnol., 100, 79–90.
53. Shimada,T., Yamazaki,Y., Tanaka,K. and Ishihama,A. (2014) The whole set of constitutive promoters recognized by RNA polymerase RpoD holoenzyme of Escherichia coli. PLoS One, 9.
54. La Fleur,T., Hossain,A. and Salis,H.M. (2021) Automated Model Predictive Design of Synthetic Promoters to Control Transcriptional Profiles in Bacteria. bioRxiv.
55. Yu,T.C., Liu,W.L., Brinck,M.S., Davis,J.E., Shek,J., Bower,G., Einav,T., Insigne,K.D., Phillips,R., Kosuri,S., et al. (2021) Multiplexed characterization of rationally designed promoter architectures deconstructs combinatorial logic for IPTG inducible systems. Nat. Commun., 12, 325.
56. Elmore,J.R., Furches,A., Wolff,G.N., Gorday,K. and Guss,A.M. (2017) Development of a high efficiency integration system and promoter library for rapid modification of Pseudomonas putida KT2440. Metab. Eng. Commun., 5, 1–8.

57. Jin,L., Nawab,S., Xia,M., Ma,X. and Huo,Y.X. (2019) Context dependency of synthetic minimal promoters in driving gene expression: a case study. Microb. Biotechnol., 12, 1476–1486.
58. Köbbing,S., Blank,L.M. and Wierckx,N. (2020) Characterization of Context Dependent Effects on Synthetic Promoters. Front. Bioeng. Biotechnol., 8, 551.
59. de Boer,H.A., Comstock,L.J. and Vasser,M. (1983) The tac promoter: a functional hybrid derived from the trp and lac promoters. Proc. Natl. Acad. Sci. U. S. A., 80, 21–25.
60. Shi,T., Zhang,L., Liang,M., Wang,W., Wang,K., Jiang,Y., Liu,J., He,X., Yang,Z., Chen,H., et al. (2021) Screening and engineering of high activity promoter elements through transcriptomics and red fluorescent protein visualization in Rhodobacter sphaeroides. Synth. Syst. Biotechnol., 6, 335–342.
61. Liow,L.T., Go,M.D.K. and Yew,W.S. (2019) Characterisation of Constitutive Promoters from the Anderson library in Chromobacterium violaceum ATCC 12472. Eng. Biol., 3, 57–66.
62. Bienick,M.S., Young,K.W., Klesmith,J.R., Detwiler,E.E., Tomek,K.J. and Whitehead,T.A. (2014) The interrelationship between promoter strength, gene expression, and growth rate. PLoS One, 9.
63. Zhao,M., Yuan,Z., Wu,L., Zhou,S. and Deng,Y. (2022) Precise Prediction of Promoter Strength Based on a De Novo Synthetic Promoter Library Coupled with Machine Learning. ACS Synth. Biol., 11, 92–102.
64. Georgi,C., Buerger,J., Hillen,W. and Berens,C. (2012) Promoter strength driving TetR determines the regulatory properties of tet controlled expression systems. PLoS One, 7.
65. Urtecho,G., Tripp,A.D., Insigne,K.D., Kim,H. and Kosuri,S. (2019) Systematic Encoded Multiplexed Reporter Assay in Escherichia coli. Biochemistry, 58, 1539–1551.
66. Koebmann,B.J., Westerhoff,H.V., Snoep,J.L., Nilsson,D. and Jensen,P.R. (2002) The glycolytic flux in Escherichia coli is controlled by the demand for ATP. J. Bacteriol., 184, 3909–3916.
67. Du,J., Yuan,Y., Si,T., Lian,J. and Zhao,H. (2012) Customized optimization of metabolic pathways by combinatorial transcriptional engineering. Nucleic Acids Res., 40, e142.
68. Meyer,A.J., Segall Shapiro,T.H., Glassey,E., Zhang,J. and Voigt,C.A. (2019) molecule sensors. Nat. Chem. Biol., 15, 196–204.

69. Liu,X., Gupta,S.T.P., Bhimsaria,D., Reed,J.L., Rodríguez Martínez,J.A., Ansari,A.Z. and Raman,S. (2019) De novo design of programmable inducible promoters. Nucleic Acids Res., 47, 10452–10463.
70. Anderson,J.C. Promoters/Catalog/Anderson. Regist. Stand. Biol. Parts.
71. Hossain,A., Lopez,E., Halper,S.M., Cetnar,D.P., Reis,A.C., Strickland,D., Klavins,E. and Salis,H.M. (2020) Automated design of thousands of nonrepetitive parts for engineering stable genetic systems. Nat. Biotechnol., 38, 1466–1475.
72. Helmann,J.D. (2019) Where to begin? Sigma factors and the selectivity of transcription initiation in bacteria. Mol. Microbiol., 112, 335–347.
73. Mejía Almonte,C., Busby,S.J.W., Wade,J.T., van Helden,J., Arkin,A.P., Stormo,G.D., Eilbeck,K., Palsson,B.O., Galagan,J.E. and Collado Vides,J. (2020) Redefining fundamental concepts of transcription initiation in bacteria. Nat. Rev. Genet., 21, 699–714.
74. Burgess,R.R. (1969) Separation and characterization of the subunits of ribonucleic acid polymerase. J. Biol. Chem., 244, 6168–6176.
75. Murakami,K.S. and Darst,S.A. (2003) Bacterial RNA polymerases: The wholo story. Curr. Opin. Struct. Biol., 13, 31–39.
76. Burgess,R.R., Travers,A.A., Dunn,J.J. and Bautz,E.K.F. (1969) Factor stimulating transcription by RNA polymerase. Nature, 221, 43–46.
77. Murakami,K.S., Masuda,S. and Darst,S.A. (2002) Structural basis of transcription initiation: an RNA polymerase holoenzyme DNA complex. Science, 296, 1280–1284.
78. Meng,C.A., Fazal,F.M. and Block,S.M. (2017) Real time observation of polymerase promoter contact remodeling during transcription initiation. Nat. Commun., 8, 1178.
79. Paget,M.S.B. and Helmann,J.D. (1986) The sigma70 family of sigma factors. Genome Biol., 4, 203.
80. Gruber,T.M. and Gross,C.A. (2003) Multiple Sigma Subunits and the Partitioning of Bacterial Transcription Space. Annu. Rev. Microbiol., 57, 441–466.
81. Chen,J., Boyaci,H. and Campbell,E.A. (2021) Diverse and unified mechanisms of transcription initiation in bacteria. Nat. Rev. Microbiol., 19, 95–109.
82. Keilty,S. and Rosenberg,M. (1987) Constitutive function of a positively regulated promoter reveals new sequences essential for activity. J. Biol. Chem., 262, 6389–6395.

83. Barne,K.A., Bown,J.A. and Minchin,S.D. (1997) Region 2.5 of the Escherichia coli EMBO J., 16, 4034–4040.
84. Ruff,E.F., Thomas Record,M. and Artsimovitch,I. (2015) Initial events in bacterial transcription initiation. Biomolecules, 5, 1035–1062.
85. Estrem,S.T., Gaal,T., Ross,W. and Gourse,R.L. (1998) Identification of an UP element consensus sequence for bacterial promoters. Proc. Natl. Acad. Sci. U. S. A., 95, 9761–9766.
86. Coulombe,B. and Burton,Z.F. (1999) DNA Bending and Wrapping around RNA Microbiol. Mol. Biol. Rev., 63, 457–478.
87. Feklistov,A. and Darst,S.A. (2011) Structural basis for promoter 10 element Cell, 147, 1257–1269.
88. Karpen,M.E. and deHaseth,P.L. (2015) Base flipping in open complex formation at bacterial promoters. Biomolecules, 5, 668–678.
template DNA strand contacts during the final step of transcription initiation. J. Mol. Biol., 350, 930–937.
90. Davis,C.A., Bingman,C.A., Landick,R., Record,M.T. and Saecker,R.M. (2007) Real time footprinting of DNA in the first kinetically significant intermediate in open complex formation by Escherichia coli RNA polymerase. Proc. Natl. Acad. Sci. U. S. A., 104, 7833–7838.
91. Feklístov,A., Sharon,B.D., Darst,S.A. and Gross,C.A. (2014) Bacterial sigma factors: A historical, structural, and genomic perspective. Annu. Rev. Microbiol., 68, 357–376.
92. Sanderson,A., Mitchell,J.E., Minchin,S.D. and Busby,S.J.W. (2003) Substitutions in the Esch 10 elements at promoters. FEBS Lett., 544, 199–205.
93. Shultzaberger,R.K., Chen,Z., Lewis,K.A. and Schneider,T.D. (2007) Anatomy of Nucleic Acids Res., 35, 771–788.
conservation and evolutionary relationships. J. Bacteriol., 174, 3843–3849.
95. Gruber,T.M. and Bryant,D.A. (1997) Molecular systematic studies of eubacteria, usin type sigma factors of group 1 and group 2. J. Bacteriol., 179, 1734–1747.

96. Hawley,D.K. and Mcclure,W.R. (1983) Compilation and analysis of Escherichia coli promoter DNA sequences. Nucleic Acids Res., 11, 2237–2255.
97. Klein,C.A., Teufel,M., Weile,C.J. and Sobetzko,P. (2021) The bacterial promoter spacer modulates promoter strength and timing by length, TG motifs and DNA supercoiling sensitivity. Sci. Rep., 11, 24399.
98. Yuzenkova,Y., Tadigotla,V.R., Severinov,K. and Zenkin,N. (2011) A new basal promoter element recognized by RNA polymerase core enzyme. EMBO J., 30, 3766–3775.
99. Beutel,B.A. and Record,M.T. (1990) E.coli promoter spacer regions contain nonrandom sequences which correlate to spacer length. Nucleic Acids Res., 18, 3597–3603.
100. Aoyama,T., Takanami,M., Ohtsuka,E., Taniyama,Y., Marumoto,R., Sato,H. and Ikehara,M. (1983) Essential structure of E. coli promoter effect of spacer length between the two consensus sequences on promoter function. Nucleic Acids Res., 11, 5855–5864.
101. Wang,J.C. (1979) Helical repeat of DNA in solution. Proc. Natl. Acad. Sci. U. S. A., 76, 200–203.
102. Hook Barnard, India, G.; Hinton, Deborah,M. (2007) Transcription Initiation by Mix and Match Elements: Flexibility for Polymerase Binding to Bacterial Promoters. Gene Regul. Syst. Bio.
103. Dombroski,A.J., Johnson,B.D., Lonetto,M. and Gross,C.A. (1996) The sigma subunit of Escherichia coli RNA polymerase senses promoter spacing. Proc. Natl. Acad. Sci. U. S. A., 93, 8858–8862.
104. Browning,D.F. and Busby,S.J.W. (2004) The regulation of bacterial transcription initiation. Nat. Rev. Microbiol., 2, 57–65.
105. Naryshkin,N., Revyakin,A., Kim,Y., Mekler,V. and Ebright,R.H. (2000) Structural organization of the RNA polymerase promoter open complex. Cell, 101, 601–611.
106. Ross,W., Ernst,A. and Gourse,R.L. (2001) Fine structure of E. coli RNA polymerase groove. Genes Dev., 15, 491–506.
107. Gaal,T., Ross,W., Blatter,E.E., Tang,H., Jia,X., Krishnan,V.V., Assa Munt,N., Ebright,R.H. and Gourse,R.L. (1996) DNA RNA polymerase: Novel DNA binding domain architecture. Genes Dev., 10, 16–26.
108. Ross,W., Gosink,K.K., Salomon,J., Igarashi,K., Zou,C., Ishihama,A., Severinov,K. and Gourse,R.L. (1993) A third recognition element in bacterial promoters: DNA Science, 262, 1407–1413.

109. Mitchell,J.E., Zheng,D., Busby,S.J.W. and Minchin,S.D. (2003) Identification and Nucleic Acids Res., 31, 4689–4695.
110. Graña,D., Gardella,T. and Susskind,M. (1988) The Effects of Mutations in the Ant Promoter of Phage P22 Depend on Context. Genetics, 120, 319–327.
111. Jensen,D. and Galburt, Eric,A. (2021) The Context Dependent Influence of Promoter Sequence Motifs on Transcription Initiation Kinetics and Regulation. J. Bacteriol., 203, e00512-20.
112. Michalowski,C.B., Short,M.D. and Little,J.W. (2004) Sequence tolerance of the J. Bacteriol., 186, 7988–7999.
113. Kumar,A., Malloch,R.A., Fujita,N., Smillie,D.A., Ishihama,A. and Hayward,R.S. (1993) The minus 35 recognition region of Escherichia coli sigma 70 is inessential J. Mol. Biol., 232, 406–418.
114. Thouvenot,B., Charpentier,B. and Branlant,C. (2004) The strong efficiency of the Escherichia coli gapA P1 promoter depends on a complex combination of functional determinants. Biochem. J., 383, 371–382.
115. Yona,A.H., Alm,E.J. and Gore,J. (2018) Random sequences rapidly evolve into de novo promoters. Nat. Commun., 9, 1530.
116. Wang,J., Zhai,H., Rexida,R., Shen,Y., Hou,J. and Bao,X. (2018) Developing synthetic hybrid promoters to increase constitutive or diauxic shift induced expression in Saccharomyces cerevisiae. FEMS Yeast Res., 18.
117. Henderson,K.L., Felth,L.C., Molzahn,C.M., Shkel,I., Wang,S., Chhabra,M., Ruff,E.F., Bieter,L., Kraft,J.E. and Record,M.T. (2017) Mechanism of transcription initiation and promoter escape by E. coli RNA polymerase. Proc. Natl. Acad. Sci. U. S. A., 114, E3032–E3040.
118. Ross,W., Aiyar,S.E., Salomon,J. and Gourse,R.L. (1998) Escherichia coli promoters with up elements of different strengths: Modular structure of bacterial promoters. J. Bacteriol., 180, 5375–5383.
119. Ojangu,E.L., Tover,A., Teras,R. and Kivisaar,M. (2000) Effects of combination of different 10 hexamers and downstream sequences on stationary phase specific dependent transcription in Pseudomonas putida. J. Bacteriol., 182, 6707–6713.
120. Kumar,A., Grimes,B., Logan,M., Wedgwood,S., Williamson,H. and Hayward,R.S. (1995) A hybrid sigma subunit directs RNA polymerase to a hybrid promoter in Escherichia coli. J. Mol. Biol., 246, 563–571.

121. Paget,M.S. (2015) Bacterial sigma factors and anti sigma factors: Structure, function and distribution. Biomolecules, 5, 1245–1265.
122. Santillán,O., Ramírez Romero,M.A., Lozano,L., Checa,A., Encarnación,S.M. and Dávila,G. (2016) Region 4 of Rhizobium etli primary sigma factor (SigA) confers transcriptional laxity in Escherichia coli. Front. Microbiol., 7, 1078.
123. Tanaka,K. and Takahashi,H. (1991) Cloning and analysis of the gene (rpoDA) for BBA Gene Struct. Expr., 1089, 113–119.
124. Tripathi,L., Zhang,Y. and Lin,Z. (2014) Bacterial sigma factors as targets for engineered or synthetic transcriptional control. Front. Bioeng. Biotechnol., 2, 33.
125. Bervoets,I., Van Brempt,M., Van Nerom,K., Van Hove,B., Maertens,J., De Mey,M. and Charlier,D. (2018) A sigma factor toolbox for orthogonal gene expression in Escherichia coli. Nucleic Acids Res., 46, 2133–2144.
126. Studier,F.W. and Moffatt,B.A. (1986) Use of bacteriophage T7 RNA polymerase to direct selective high level expression of cloned genes. J. Mol. Biol., 189, 113–130.
127. Heyduk,E. and Heyduk,T. (2018) DNA template sequence control of bacterial RNA polymerase escape from the promoter. Nucleic Acids Res., 46, 4469–4486.
128. Liu,M., Tolstorukov,M., Zhurkin,V., Garges,S. and Adhya,S. (2004) A mutant spacer sequence between 35 and 10 elements makes the P lac promoter hyperactive and cAMP receptor protein independent. Proc. Natl. Acad. Sci. U. S. A., 101, 6911–6916.
High resolution analysis of DNA regulatory elements by synthetic saturation mutagenesis. Nat. Biotechnol., 27, 1173–1175.
130. Inoue,F. and Ahituv,N. (2015) Decoding enhancers using massively parallel reporter assays. Genomics, 106, 159–164.
131. Einav,T. and Phillips,R. (2019) How the avidity of polymerase binding to the 35/ 10 promoter sites affects gene expression. Proc. Natl. Acad. Sci. U. S. A., 116, 13340–13345.
132. Wang,Y., Wang,H., Wei,L., Li,S., Liu,L. and Wang,X. (2020) Synthetic promoter design in Escherichia coli based on a deep generative network. Nucleic Acids Res., 48, 6403–6412.
133. Zhou,D. and Yang,R. (2006) Global analysis of gene transcription regulation in prokaryotes. Cell. Mol. Life Sci., 63, 2260–2290.

134. Lewis,M., Chang,G., Horton,N.C., Kercher,M.A., Pace,H.C., Schumacher,M.A., Brennan,R.G. and Lu,P. (1996) Crystal structure of the lactose operon repressor and its complexes with DNA and inducer. Science, 271, 1247–1254.
135. Hillen,W. and Berens,C. (1994) Mechanisms Underlying Expression of TN10 Encoded Tetracycline Resistance. Annu. Rev. Microbiol., 48, 345–69.
136. Brautaset,T., Lale,R. and Valla,S. (2009) Positively regulated bacterial expression systems. Microb. Biotechnol., 2, 15–30.
137. Adhya,S. and Garges,S. (1990) Positive Control. J. Biol. Chem., 265, 10797–10800.
138. Barnard,A., Wolfe,A. and Busby,S. (2004) Regulation at complex bacterial promoters: How bacteria use different promoter organizations to produce different regulatory outcomes. Curr. Opin. Microbiol., 7, 102–108.
139. Miksch,G. and Dobrowolski,P. (1995) Growth phase dependent induction of stationary phase promoters of Escherichia coli in different gram negative bacteria. J. Bacteriol., 177, 5374–5378.
140. Connell,N., Han,Z., Moreno,F. and Kolter,R. (1987) An E. coli promoter induced by the cessation of growth. Mol. Microbiol., 1, 195–201.
141. Aldea,M., Garrido,T., Hernandez Chico,C., Vicente,M. and Kushner,S.R. (1989) Induction of a growth phase dependent promoter triggers transcription of bolA, an Escherichia coli morphogene. EMBO J., 8, 3923–3931.
142. Cuthbertson,L. and Nodwell,J.R. (2013) The TetR Family of Regulators. Microbiol. Mol. Biol. Rev., 77, 440–475.
143. Grkovic,S., Brown,M.H., Skurray,R.A., Repressor,A. and Subtilis,B. (2006) Regulation of Bacterial Drug Export Systems. Microbiol. Mol. Biol. Rev., 66, 671–701.
144. Miller,M.B. and Bassler,B.L. (2001) Quorum Sensing in Bacteria. Annu. Rev. Microbiol., 55, 165–199.
145. Fuqua, Clay; Parsek, Matthew R; Greenberg,E.P. (2001) Regulation of Gene Expression by Cell to Cell Communication: Acyl Homoserine Lactone Quorum Sensing. Annu. Rev. Genet., 35, 439–68.
146. Whitehead,N.A., Barnard,A.M.L., Slater,H., Simpson,N.J.L. and Salmond,G.P.C. (2001) Quorum sensing in Gram negative bacteria. FEMS Microbiol. Rev., 25.
147. Aidelberg,G., Towbin,B.D., Rothschild,D., Dekel,E., Bren,A. and Alon,U. (2014) Hierarchy of non glucose sugars in Escherichia coli. BMC Syst. Biol., 8, 133.

148. Görke,B. and Stülke,J. (2008) Carbon catabolite repression in bacteria: Many ways to make the most out of nutrients. Nat. Rev. Microbiol., 6, 613–624.
149. Brückner,R. and Titgemeyer,F. (2002) Carbon catabolite repression in bacteria: Choice of the carbon source and autoregulatory limitation of sugar utilization. FEMS Microbiol. Lett., 209, 141–148.
150. Hogan,A.M., Jeffers,K.R., Palacios,A. and Cardona,S.T. (2021) Improved Dynamic Range of a Rhamnose Inducible Promoter for Gene Expression in Burkholderia spp. Appl. Environ. Microbiol., 87, e00647-21.
151. Kent,R. and Dixon,N. (2020) Contemporary Tools for Regulating Gene Expression in Bacteria. Trends Biotechnol., 38, 316–333.
152. Thanbichler,M., Iniesta,A.A. and Shapiro,L. (2007) A comprehensive set of plasmids for vanillate and xylose inducible gene expression in Caulobacter crescentus. Nucleic Acids Res., 35, e137.
153. Murin,C.D., Segal,K., Bryksin,A. and Matsumura,I. (2012) Expression vectors for Acinetobacter baylyi ADP1. Appl. Environ. Microbiol., 78, 280–283.
154. Kim,N.M., Sinnott,R.W. and Sandoval,N.R. (2020) Transcription factor based biosensors and inducible systems in non model bacteria: current progress and future directions. Curr. Opin. Biotechnol., 64, 39–46.
155. Vidal,L., Pinsach,J., Striedner,G., Caminal,G. and Ferrer,P. (2008) Development of an antibiotic free plasmid selection system based on glycine auxotrophy for recombinant protein overproduction in Escherichia coli. J. Biotechnol., 134, 127–136.
156. Fiedler,M. and Skerra,A. (2001) proBA complementation of an auxotrophic E. coli strain improves plasmid stability and expression yield during fermenter production of a recombinant antibody fragment. Gene, 274, 111–118.
157. Ramsey,M.E., Hackett,K.T., Kotha,C. and Dillard,J.P. (2012) New complementation constructs for inducible and constitutive gene expression in Neisseria gonorrhoeae and Neisseria meningitidis. Appl. Environ. Microbiol., 78, 3068–3078.
158. Choi,K.H., Mima,T., Casart,Y., Rholl,D., Kumar,A., Beacham,I.R. and Schweizer,H.P. (2008) Genetic tools for select agent compliant manipulation of Burkholderia pseudomallei. Appl. Environ. Microbiol., 74, 1064–1075.
159. Wagner,S., Klepsch,M.M., Schlegel,S., Appel,A., Draheim,R., Tarry,M., Högbom,M., Van Wijk,K.J., Slotboom,D.J., Persson,J.O., et al. (2008) Tuning Escherichia coli for membrane protein overexpression. Proc. Natl. Acad. Sci. U. S. A., 105, 14371–14376.

160. Thomas,M.D. and Van Tilburg,A. (2000) Overexpression of foreign proteins using the vibrio fischeri lux control system. Methods Enzymol., 305, 315–329.
161. Wong,C.F., Rahman,R.N.Z.R.A., Basri,M. and Salleh,A.B. (2017) Construction of new genetic tools as alternatives for protein overexpression in Escherichia coli and pseudomonas aeruginosa. Iran. J. Biotechnol., 15, 194–200.
162. Chatzivasileiou,A.O., Ward,V., Edgar,S.M. and Stephanopoulos,G. (2019) Two step pathway for isoprenoid synthesis. Proc. Natl. Acad. Sci. U. S. A., 116, 506–511.
163. Temme,K., Hill,R., Segall Shapiro,T.H., Moser,F. and Voigt,C.A. (2012) Modular control of multiple pathways using engineered orthogonal T7 polymerases. Nucleic Acids Res., 40, 8773–8781.
164. Jones,J.A., Vernacchio,V.R., Lachance,D.M., Lebovich,M., Fu,L., Shirke,A.N., Schultz,V.L., Cress,B., Linhardt,R.J. and Koffas,M.A.G.G. (2015) ePathOptimize: A combinatorial approach for transcriptional balancing of metabolic pathways. Sci. Rep., 5, 11301.
165. Ou,J., Wang,L., Ding,X., Du,J., Zhang,Y., Chen,H. and Xu,A. (2004) Stationary phase protein overproduction is a fundamental capability of Escherichia coli. Biochem. Biophys. Res. Commun., 314, 174–180.
166. Dhamankar,H., Tarasova,Y., Martin,C.H. and Prather,K.L.J. (2014) Engineering E. coli for the biosynthesis of 3 hydroxy butyrolactone (3HBL) and 3,4 dihydroxybutyric acid (3,4 DHBA) as value added chemicals from glucose as a sole carbon source. Metab. Eng., 25, 72–81.
167. Gupta,A., Reizman,I.M.B., Reisch,C.R. and Prather,K.L.J.J. (2017) Dynamic regulation of metabolic flux in engineered bacteria using a pathway independent quorum sensing circuit. Nat. Biotechnol., 35, 273–279.
168. McNerney,M.P., Watstein,D.M. and Styczynski,M.P. (2015) Precision metabolic engineering: The design of responsive, selective, and controllable metabolic systems. Metab. Eng., 31, 123–131.
169. Jones,K.L., Kim,S.W. and Keasling,J.D. (2000) Low copy plasmids can perform as well as or better than high copy plasmids for metabolic engineering of bacteria. Metab. Eng., 2, 328–338.
170. Guzman,L.M., Belin,D., Carson,M.J. and Beckwith,J. (1995) Tight regulation, modulation, and high level expression by vectors containing the arabinose P(BAD) promoter. J. Bacteriol., 177, 4121–4130.

171. Kim,S.K., Lee,D.H., Kim,O.C., Kim,J.F. and Yoon,S.H. (2017) Tunable Control of an Escherichia coli Expression System for the Overproduction of Membrane Proteins by Titrated Expression of a Mutant lac Repressor. ACS Synth. Biol., 6, 1766–1773.
172. Abil,Z., Ellefson,J.W., Gollihar,J.D., Watkins,E. and Ellington,A.D. (2017) Compartmentalized partnered replication for the directed evolution of genetic parts and circuits. Nat. Protoc., 12, 2493–2512.
173. Kato,Y. (2020) Extremely low leakage expression systems using dual transcriptional translational control for toxic protein production. Int. J. Mol. Sci., 21.
strategy as a tool for optimization of inducible promoters. Microb. Cell Fact., 17, 40.
175. Gatti Lafranconi,P., Dijkman,W.P., Devenish,S.R.A. and Hollfelder,F. (2013) A single mutation in the core domain of the lac repressor reduces leakiness. Microb. Cell Fact., 12, 67.
176. Blount,B.A., Weenink,T., Vasylechko,S. and Ellis,T. (2012) Rational diversification of a promoter providing fine tuned expression and orthogonal regulation for synthetic biology. PLoS One, 7, e33279.
177. Jacob,F. and Monod,J. (1961) Genetic regulatory mechanisms in the synthesis of proteins. J. Mol. Biol., 3, 318–356.
178. Gilbert,W. and Maxam,A. (1973) The nucleotide sequence of the lac operator. Proc. Natl. Acad. Sci. U. S. A., 70, 3581–3584.
179. Phillips,K.N., Widmann,S., Lai,H.Y., Nguyen,J., Ray,J.C.J., Balázsi,G. and Cooper,T.F. (2019) Diversity in lac operon regulation among diverse escherichia coli isolates depends on the broader genetic background but is not explained by genetic relatedness. MBio, 10.
180. Eames,M. and Kortemme,T. (2012) Cost Benefit Tradeoffs in Engineered lac Operons. Science, 339, 60 83.
181. Schlax,P.J., Capp,M.W. and Record,M.T. (1995) Inhibition of transcription initiation by lac repressor. J. Mol. Biol., 245, 331–350.
182. Notley mcrobb,L., Death,A. and Ferenci,T. (2006) The relationship between external glucose concentration and cAMP levels inside. Biochem. J.
183. Monod,J. (1942) Recherches sur la croissance des cultures bactériennes.
184. Browning,D.F., Godfrey,R.E., Richards,K.L., Robinson,C. and Busby,S.J.W. (2019) Exploitation of the Escherichia coli lac operon promoter for controlled recombinant protein production. Biochem. Soc. Trans., 47, 755–763.

185. Overton,T.W. (2014) Recombinant protein production in bacterial hosts. Drug Discov. Today, 19, 590–601.
186. Kammerer,W., Deuschle,U., Gentz,R. and Bujard,H. (1986) Functional dissection of Escherichia coli promoters: information in the transcribed region is involved in late steps of the overall process. EMBO J., 5, 2995–3000.
187. Lutz,R. and Bujard,H. (1997) Independent and tight regulation of transcriptional units in Escherichia coli via the LacR/O, the TetR/O and AraC/I1 I2 regulatory elements. Nucleic Acids Res., 25, 1203–1210.
188. Silverstone,A.E., Arditti,R.R. and Magasanik,B. (1970) Catabolite insensitive revertants of lac promoter mutants. Proc. Natl. Acad. Sci. U. S. A., 66, 773–779.
189. Arditti,R.R., Scaife,J.G. and Beckwith,J.R. (1968) The nature of mutants in the lac promoter region. J. Mol. Biol., 38, 421–426.
190. Politi,N., Pasotti,L., Zucca,S., Casanova,M., Micoli,G., Cusella De Angelis,M.G. and Magni,P. (2014) Half life measurements of chemical inducers for recombinant gene expression. J. Biol. Eng., 8, 5.
191. Fernández Castané,A., Caminal,G. and López Santín,J. (2012) Direct measurements of IPTG enable analysis of the induction behavior of E. coli in high cell density cultures. Microb. Cell Fact., 11, 58.
192. Fritz,G., Megerle,J.A., Westermayer,S.A., Brick,D., Heermann,R., Jung,K., Rädler,J.O. and Gerland,U. (2014) Single cell kinetics of phenotypic switching in the arabinose utilization system of E. coli. PLoS One, 9.
193. Stoner,C. and Schleif,R. (1983) The araE low affinity l arabinose transport promoter. Cloning, sequence, transcription start site and DNA binding sites of regulatory proteins. J. Mol. Biol., 171, 369–381.
194. Khlebnikov,A., Datsenko,K.A., Skaug,T., Wanner,B.L. and Keasling,J.D. (2001) Homogeneous expression of the PBAD promoter in Escherichia coli by constitutive expression of the low affinity high capacity araE transporter. Microbiology, 147, 3241–3247.
195. Hendrickson,W., Flaherty,C. and Molz,L. (1992) Sequence elements in the Escherichia coli araFGH promoter. J. Bacteriol., 174, 6862–6871.
196. Hahn,S. and Schleif,R. (1983) In vivo regulation of the Escherichia coli araC promoter. J. Bacteriol., 155, 593–600.
197. Tobin,J.F. and Schleif,R.F. (1987) Positive regulation of the Escherichia coli l rhamnose operon is mediated by the products of tandemly repeated regulatory genes. J. Mol. Biol., 196, 789–799.

198. Wickstrum,J.R., Santangelo,T.J. and Egan,S.M. (2005) Cyclic AMP receptor protein and RhaR synergistically activate transcription from the L rhamnose responsive rhaSR promoter in Escherichia coli. J. Bacteriol., 187, 6708–6718.
199. Marschall,L., Sagmeister,P. and Herwig,C. (2016) Tunable recombinant protein expression in E. coli: enabler for continuous processing? Appl. Microbiol. Biotechnol., 100, 5719–5728.
200. Afroz,T., Biliouris,K., Boykin,K.E., Kaznessis,Y. and Beisel,C.L. (2015) Trade offs in Engineering Sugar Utilization Pathways for Titratable Control. ACS Synth. Biol., 4, 141–149.
201. Afroz,T., Biliouris,K., Kaznessis,Y. and Beisel,C.L. (2014) Bacterial sugar utilization gives rise to distinct single cell behaviours. Mol. Microbiol., 93, 1093–1103.
202. E. M. Ozbudak, M. Thattai, H. N. Lim, B. I. Shraiman and Oudenaarden,A. Van (2004) Multistability in the lactose utilization network of Escherichia coli. Nature, 427, 737–740.
203. Hjelm,A., Karyolaimos,A., Zhang,Z., Rujas,E., Vikström,D., Slotboom,D.J. and De Gier,J.W. (2017) Tailoring Escherichia coli for the l Rhamnose PBAD Promoter Based Production of Membrane and Secretory Proteins. ACS Synth. Biol., 6, 985–994.
204. Bi,C., Su,P., Müller,J., Yeh,Y.C., Chhabra,S.R., Beller,H.R., Singer,S.W. and Hillson,N.J. (2013) Development of a broad host synthetic biology toolbox for ralstonia eutropha and its application to engineering hydrocarbon biofuel production. Microb. Cell Fact., 12, 107.
205. Lefebre,M.D. and Valvano,M.A. (2002) Construction and evaluation of plasmid vectors optimized for constitutive and regulated gene expression in Burkholderia cepacia complex isolates. Appl. Environ. Microbiol., 68, 5956–5964.
206. Prior,J.E., Lynch,M.D. and Gill,R.T. (2010) Broad host range vectors for protein expression across Gram negative hosts. Biotechnol. Bioeng., 106, 326–332.
207. Fricke,P.M., Link,T., Gätgens,J., Sonntag,C., Otto,M., Bott,M. and Polen,T. (2020) A tunable l arabinose inducible expression plasmid for the acetic acid bacterium Gluconobacter oxydans. Appl. Microbiol. Biotechnol., 104, 9267–9282.
208. Jeske,M. and Altenbuchner,J. (2010) The Escherichia coli rhamnose promoter rhaPBAD is in Pseudomonas putida KT2440 independent of Crp cAMP activation. Appl. Microbiol. Biotechnol., 85, 1923–1933.
209. Kelly,C.L., Taylor,G.M., Hitchcock,A., Torres Méndez,A. and Heap,J.T. (2018) A Rhamnose Inducible System for Precise and Temporal Control of Gene Expression in Cyanobacteria. ACS Synth. Biol., 7, 1056–1066.

210. Qiu,D., Damron,F.H., Mima,T., Schweizer,H.P. and Yu,H.D. (2008) PBAD based shuttle vectors for functional analysis of toxic and highly regulated genes in Pseudomonas and Burkholderia spp. and other bacteria. Appl. Environ. Microbiol., 74, 7422–7426.
211. Lundstrom,K. (2007) Structural genomics and drug discovery: Molecular Pharmacology. J. Cell. Mol. Med., 11, 224–238.
212. Narayanan,A., Ridilla,M. and Yernool,D.A. (2011) Restrained expression, a method to overproduce toxic membrane proteins by exploiting operator repressor interactions. Protein Sci., 20, 51–61.
213. Du Plessis,D.J.F., Nouwen,N. and Driessen,A.J.M. (2011) The Sec translocase. Biochim. Biophys. Acta Biomembr., 1808, 851–865.
214. Nannenga,B.L. and Baneyx,F. (2011) Enhanced expression of membrane proteins in E. coli with a PBAD promoter mutant: synergies with chaperone pathway engineering strategies. Microb. Cell Fact., 10, 105.
215. Garg,N., Manchanda,G. and Kumar,A. (2014) Bacterial quorum sensing: Circuits and applications. Antonie van Leeuwenhoek, Int. J. Gen. Mol. Microbiol., 105, 289–305.
216. Fuqua,W.C., Winans,S.C. and Greenberg,E.P. (1994) Quorum sensing in bacteria: the LuxR LuxI family of cell density responsive transcriptional regulators. J. Bacteriol., 176, 269–275.
217. Fuqua,C. and Greenberg,E.P. (2002) Listening in on bacteria: Acyl homoserine lactone signalling. Nat. Rev. Mol. Cell Biol., 3, 685–695.
218. Chapon Hervé,V., Akrim,M., Latifi,A., Williams,P., Lazdunski,A. and Bally,M. (1997) Regulation of the xcp secretion pathway by multiple quorum sensing modulons in Pseudomonas aeruginosa. Mol. Microbiol., 24, 1169–1178.
219. Passador,L., Cook,J.M., Gambello,M.J., Rust,L. and Iglewski,B.H. (1993) Expression of Pseudomonas aeruginosa Virulence Genes Requires Cell to Cell Communication. Science, 260, 1127–1130.
220. Brint,J.M. and Ohman,D.E. (1995) Synthesis of multiple exoproducts in Pseudomonas aeruginosa is under the control of RhlR RhlI, another set of regulators in strain PAO1 with homology to the autoinducer responsive LuxR LuxI family. J. Bacteriol., 177, 7155–7163.
221. Bainton,N.J., Stead,P., Chhabra,S.R., Bycroft,B.W., Salmond,G.P.C., Stewart,G.S.A.B. and Williams,P. (1992) N (3 Oxohexanoyl) L homoserine lactone regulates carbapenem antibiotic production in Erwinia carotovora. Biochem. J., 288, 997–1004.

222. Williams,P., Bainton,N.J., Swift,S., Chhabra,S.R., Winson,M.K., Stewart,G.S.A.B., Salmond,G.P.C. and Bycroft,B.W. (1992) Small molecule-mediated density-dependent control of gene expression in prokaryotes: Bioluminescence and the biosynthesis of carbapenem antibiotics. FEMS Microbiol. Lett., 100, 161-167.
223. Zhang,L., Murphy,P.J., Kerr,A. and Tate,M.E. (1993) Agrobacterium conjugation and gene regulation by N-acyl-L-homoserine lactones. Nature, 362, 446-448.
224. Dunny,G.M., Brown,B.L. and Clewell,D.B. (1978) Induced cell aggregation and mating in Streptococcus faecalis: evidence for a bacterial sex pheromone. Proc. Natl. Acad. Sci. U. S. A., 75, 3479-3483.
225. Davies,D.G., Parsek,M.R., Pearson,J.P., Iglewski,B.H., Costerton,J.W. and Greenberg,E.P. (1998) The involvement of cell-to-cell signals in the development of a bacterial biofilm. Science, 280, 295-298.
226. Henke,J.M. and Bassler,B.L. (2004) Bacterial social engagements. Trends Cell Biol., 14, 648-656.
227. Nealson,K.H. and Hastings,J.W. (1979) Bacterial bioluminescence: Its control and ecological significance. Microbiol. Rev., 43, 496-518.
228. Graf,J. and Ruby,E.G. (1998) Host-derived amino acids support the proliferation of symbiotic bacteria. Proc. Natl. Acad. Sci. U. S. A., 95, 1818-1822.
229. Ruby,E.G. and McFall-Ngai,M.J. (1992) A squid that glows in the night: development of an animal-bacterial mutualism. J. Bacteriol., 174, 4865-4870.
230. Ruby,E.G. and Nealson,K.H. (1976) Symbiotic Association of Photobacterium Fischeri with the Marine Luminous Fish Monocentris Japonica: A Model of Symbiosis Based on Bacterial Studies. Biol. Bull., 151, 574-586.
231. Montgomery,M.K. and McFall-Ngai,M. (1994) Bacterial symbionts induce host organ morphogenesis during early postembryonic development of the squid Euprymna scolopes. Development, 120, 1719-1729.
232. Visick,K.L., Foster,J., Doino,J., McFall-Ngai,M. and Ruby,E.G. (2000) Vibrio fischeri lux genes play an important role in colonization and development of the host light organ. J. Bacteriol., 182, 4578-4586.
233. Engebrecht,J., Nealson,K. and Silverman,M. (1983) Bacterial bioluminescence: Isolation and genetic analysis of functions from Vibrio fischeri. Cell, 32, 773-781.
234. Kaplan,H.B. and Greenberg,E.P. (1985) Diffusion of autoinducer is involved in regulation of the Vibrio fischeri luminescence system. J. Bacteriol., 163, 1210-1214.

235. Engebrecht,J.A. and Silverman,M. (1984) Identification of genes and gene products necessary for bacterial bioluminescence. Proc. Natl. Acad. Sci. U. S. A., 81, 4154-4158.
236. Dunn,D.K., Michaliszyn,G.A., Bogacki,I.G. and Meighen,E.A. (1973) Conversion of Aldehyde to Acid in the Bacterial Bioluminescent Reaction. Biochemistry, 12, 4911-4918.
237. Boylan,M., Graham,A.F. and Meighen,E.A. (1985) Functional identification of the fatty acid reductase components encoded in the luminescence operon of Vibrio fischeri. J. Bacteriol., 163, 1186-1190.
238. Zenno,S. and Saigo,K. (1994) Identification of the genes encoding NAD(P)H-flavin oxidoreductases that are similar in sequence to Escherichia coli fre in four species of luminous bacteria: Photorhabdus luminescens, Vibrio fischeri, Vibrio harveyi, and Vibrio orientalis. J. Bacteriol., 176, 3544-3551.
239. Chu,T., Huang,Y., Hou,M., Wang,Q., Xiao,J., Liu,Q. and Zhang,Y. (2015) In vivo programmed gene expression based on artificial quorum networks. Appl. Environ. Microbiol., 81, 4984-4992.
240. Tamsir,A., Tabor,J.J. and Voigt,C.A. (2011) Robust multicellular computing using Nature, 469, 212-215.
241. Chopra,I. and Roberts,M. (2001) Tetracycline Antibiotics: Mode of Action, Applications, Molecular Biology, and Epidemiology of Bacterial Resistance. Microbiol. Mol. Biol. Rev., 65, 232-260.
242. Beck,C.F., Mutzel,R., Barbe,J. and Muller,W. (1982) A multifunctional gene (tetR) controls Tn10-encoded tetracycline resistance. J. Bacteriol., 150, 633-642.
243. Bertrand,K.P., Postle,K., Wray,L.V. and Reznikoff,W.S. (1983) Overlapping divergent promoters control expression of Tn10 tetracycline resistance. Gene, 23, 149-156.
244. Wray,L.V., Jorgensen,R.A. and Reznikoff,W.S. (1981) Identification of the tetracycline resistance promoter and repressor in transposon Tn10. J. Bacteriol., 147, 297-304.
245. Bertram,R. and Hillen,W. (2008) The application of Tet repressor in prokaryotic gene regulation and expression. Microb. Biotechnol., 1, 2-16.
246. Bertrand,K.P. and Lenski,R.E. (1989) Effects of carriage and expression of the Tn10 tetracycline resistance operon on the fitness of Escherichia coli K12. Mol. Biol. Evol., 6.

247. Lederer,T., Kintrup,M., Takahashi,M., Sum,P.E., Ellestad,G.A. and Hillen,W. (1996) Tetracycline analogs affecting binding to Tn10-encoded Tet repressor trigger the same mechanism of induction. Biochemistry, 35, 7439-7446.
248. Heravi,K.M., Watzlawick,H. and Altenbuchner,J. (2015) Development of an anhydrotetracycline-inducible expression system for expression of a neopullulanase in B. subtilis. Plasmid, 82, 35-42.
249. Stanton,B.C., Nielsen,A.A.K., Tamsir,A., Clancy,K., Peterson,T. and Voigt,C.A. (2014) Genomic mining of prokaryotic repressors for orthogonal logic gates. Nat. Chem. Biol., 10, 99-105.
250. Gossen,M., Freundlieb,S., Bender,G., Müller,G., Hillen,W. and Bujard,H. (1995) Transcriptional Activation by Tetracyclines in Mammalian Cells. Science, 268, 1766-1769.
251. Stebbins,M.J., Urlinger,S., Byrne,G., Bello,B., Hillen,W. and Yin,J.C.P. (2001) Tetracycline-inducible systems for Drosophila. Proc. Natl. Acad. Sci. U. S. A., 98, 10775-10780.
252. Merino,E., Jensen,R.A. and Yanofsky,C. (2008) Evolution of bacterial trp operons and their regulation. Curr. Opin. Microbiol., 11, 78-86.
253. Brosius,J., Erfle,M. and Storella,J. (1985) Spacing of the -10 and -35 regions in the tac promoter. Effect on its in vivo activity. J. Biol. Chem., 260, 3539-3541.
254. Bagdasarian,M.M., Amann,E., Lurz,R., Rückert,B. and Bagdasarian,M. (1983) Activity of the hybrid trp-lac (tac) promoter of Escherichia coli in Pseudomonas putida. Construction of broad-host-range, controlled-expression vectors. Gene, 26, 273-282.
255. Tegel,H., Ottosson,J. and Hober,S. (2011) Enhancing the protein production levels in Escherichia coli with a strong promoter. FEBS J., 278, 729-739.
256. Royo,J.L., Manyani,H., Cebolla,A. and Santero,E. (2005) A new generation of vectors with increased induction ratios by overimposing a second regulatory level by attenuation. Nucleic Acids Res., 33, e169.
257. Segall-Shapiro,T.H., Meyer,A.J., Ellington,A.D., Sontag,E.D. and Voigt,C.A. (2014) nted T7 RNA polymerase. Mol. Syst. Biol., 10, 742.
258. Lee,K.H., Park,J.H., Kim,T.Y., Kim,H.U. and Lee,S.Y. (2007) Systems metabolic engineering of Escherichia coli for L-threonine production. Mol. Syst. Biol., 3.

259. Biggs,B.W., Bedore,S.R., Arvay,E., Huang,S., Subramanian,H., McIntyre,E.A., Duscent-Maitland,V.C., Neidle,E.L., Tyo,K.E.J.J., Duscent-Maitland,C.V., et al. (2020) Development of a genetic toolset for the highly engineerable and metabolically versatile Acinetobacter baylyi ADP1. Nucleic Acids Res., 48, 5169-5182.
260. Ruffing,A.M. (2014) Improved free fatty acid production in cyanobacteria with Synechococcus sp. PCC 7002 as host. Front. Bioeng. Biotechnol., 2, 17.
261. Sektas,M. and Szybalski,W. (1998) Tightly controlled two-stage expression vectors employing the FLP/FRT-mediated inversion of cloned genes. Appl. Biochem. Biotechnol. Part B Mol. Biotechnol., 9, 17-24.
262. Passaris,I., Tadesse,W.M., Gayán,E. and Aertsen,A. (2019) Construction and validation of the Tn5-PLtetO-1-msfGFP transposon as a tool to probe protein expression and localization. J. Microbiol. Methods, 161, 56-62.
263. Silva,J.P.N., Lopes,S.V., Grilo,D.J. and Hensel,Z. (2019) Plasmids for Independently Tunable, Low-Noise Expression of Two Genes. mSphere, 4, e00340-19.
264. Hanning,G. and Makrides,S.C. (1998) Strategies for optimizing heterologous protein expression in Escherichia coli. Trends Biotechnol., 16, 54-60.
265. Lee,T.S., Krupa,R.A., Zhang,F., Hajimorad,M., Holtz,W.J., Prasad,N., Lee,S.K. and Keasling,J.D. (2011) BglBrick vectors and datasheets: A synthetic biology platform for gene expression. J. Biol. Eng., 5, 15-17.
266. Kandhavelu,M., Lloyd-Price,J., Gupta,A., Muthukrishnan,A.B., Yli-Harja,O. and Ribeiro,A.S. (2012) Regulation of mean and noise of the in vivo kinetics of transcription under the control of the lac/ara-1 promoter. FEBS Lett., 586, 3870-3875.
267. Mäkelä,J., Kandhavelu,M., Oliveira,S.M.D., Chandraseelan,J.G., Lloyd-Price,J., Peltonen,J., Yli-Harja,O. and Ribeiro,A.S. (2013) In vivo single-molecule kinetics of activation and subsequent activity of the arabinose promoter. Nucleic Acids Res., 41, 6544-6552.
268. Kandhavelu,M., Mannerström,H., Gupta,A., Häkkinen,A., Lloyd-Price,J., Yli-Harja,O. and Ribeiro,A.S. (2011) In vivo kinetics of transcription initiation of the lar promoter in Escherichia coli. Evidence for a sequential mechanism with two rate-limiting steps. BMC Syst. Biol., 5, 149.
269. Dong,H., Nilsson,L. and Kurland,C.G. (1995) Gratuitous overexpression of genes in Escherichia coli leads to growth inhibition and ribosome destruction. J. Bacteriol., 177, 1497-1504.

270. Dubendorf,J.W. and Studier,F.W. (1991) Controlling basal expression in an inducible T7 expression system by blocking the target T7 promoter with lac repressor. J. Mol. Biol., 219, 45-59.
271. Jeong,H., Kim,H.J. and Lee,S.J. (2015) Complete genome sequence of Escherichia coli strain BL21. Genome Announc., 3, e00134-15.
272. Jeong,H., Barbe,V., Lee,C.H., Vallenet,D., Yu,D.S., Choi,S.H., Couloux,A., Lee,S.W., Yoon,S.H., Cattolico,L., et al. (2009) Genome Sequences of Escherichia coli B strains REL606 and BL21(DE3). J. Mol. Biol., 394, 644-652.
273. Studier,F.W. (2005) Protein production by auto-induction in high-density shaking cultures. Protein Expr. Purif., 41, 207-234.
274. Studier,W., Rosenberg,A.H., Dunn,J.J. and Dubendorff,J.W. (1990) Use of T7 RNA polymerase to direct expression of cloned genes. Methods Enzymol., 185, 60-89.
275. Chamberlin,M., Mcgrath,J. and Waskell,L. (1970) New RNA polymerase from Escherichia coli infected with bacteriophage T7. Nature, 228, 227-231.
276. Borkotoky,S. and Murali,A. (2018) The highly efficient T7 RNA polymerase: A wonder macromolecule in biological realm. Int. J. Biol. Macromol., 118, 49-56.
277. Shis,D.L. and Bennett,M.R. (2014) Synthetic biology: the many facets of T7 RNA polymerase. Mol. Syst. Biol., 10, 745.
278. Kumar,S., Jain,K.K., Bhardwaj,K.N., Chakraborty,S. and Kuhad,R.C. (2015) Multiple genes in a single host: Cost-effective production of bacterial laccase (cotA), pectate lyase (pel), and endoxylanase (xyl) by simultaneous expression and cloning in single vector in E. coli. PLoS One, 10, e0144379.
279. Zhang,J., Weng,H., Zhou,Z., Du,G. and Kang,Z. (2019) Engineering of multiple modular pathways for high-yield production of 5-aminolevulinic acid in Escherichia coli. Bioresour. Technol., 274, 353-360.
280. Kurnasov,O., Goral,V., Colabroy,K., Gerdes,S., Anantha,S., Osterman,A. and Begley,T.P. (2003) NAD Biosynthesis: Identification of the Tryptophan to Quinolinate Pathway in Bacteria. Chem. Biol., 10, 1195-1204.
281. Buck,B., Zamoon,J., Kirby,T.L., DeSilva,T.M., Karim,C., Thomas,D. and Veglia,G. (2003) Overexpression, purification, and characterization of recombinant Ca-ATPase regulators for high-resolution solution and solid-state NMR studies. Protein Expr. Purif., 30, 253-261.
282. Grisshammer,R. (2006) Understanding recombinant expression of membrane proteins. Curr. Opin. Biotechnol., 17, 337-340.

283. Wagner,S., Baarst,L., Ytterberg,A.J., Klussmerer,A., Wagner,C.S., Nord,O., Nygren,P.Å., Van Wijks,K.J. and De Gier,J.W. (2007) Consequences of membrane protein overexpression in Escherichia coli. Mol. Cell. Proteomics, 6, 1527-1550.
284. Wagner,S., Bader,M.L., Drew,D. and de Gier,J.W. (2006) Rationalizing membrane protein overexpression. Trends Biotechnol., 24, 364-371.
285. Pan,S.H. and Malcolm,B.A. (2000) Reduced background expression and improved plasmid stability with pET vectors in BL21 (DE3). Biotechniques, 29, 1234-1238.
286. Terpe,K. (2006) Overview of bacterial expression systems for heterologous protein production: From molecular and biochemical fundamentals to commercial systems. Appl. Microbiol. Biotechnol., 72, 211-222.
287. Tan,S.I. and Ng,I.S. (2020) New Insight into Plasmid-Driven T7 RNA Polymerase in Escherichia coli and Use as a Genetic Amplifier for a Biosensor. ACS Synth. Biol., 9, 613-622.
288. Angius,F., Ilioaia,O., Amrani,A., Suisse,A., Rosset,L., Legrand,A., Abou-Hamdan,A., Uzan,M., Zito,F. and Miroux,B. (2018) A novel regulation mechanism of the T7 RNA polymerase based expression system improves overproduction and folding of membrane proteins. Sci. Rep., 8, 8572.
289. Li,Z. and Rinas,U. (2020) Recombinant protein production associated growth inhibition results mainly from transcription and not from translation. Microb. Cell Fact., 19, 83.
290. Dvorak,P., Chrast,L., Nikel,P.I., Fedr,R., Soucek,K., Sedlackova,M., Chaloupkova,R., Lorenzo,V., Prokop,Z. and Damborsky,J. (2015) Exacerbation of substrate toxicity by IPTG in Escherichia coli BL21(DE3) carrying a synthetic metabolic pathway. Microb. Cell Fact., 14, 201.
291. Iost,I., Guillerez,J. and Dreyfus,M. (1992) Bacteriophage T7 RNA polymerase travels far ahead of ribosomes in vivo. J. Bacteriol., 174, 619-622.
292. Sun,X.M., Zhang,Z.X., Wang,L.R., Wang,J.G., Liang,Y., Yang,H.F., Tao,R.S., Jiang,Y., Yang,J.J. and Yang,S. (2021) Downregulation of T7 RNA polymerase transcription enhances pET-based recombinant protein production in Escherichia coli BL21 (DE3) by suppressing autolysis. Biotechnol. Bioeng., 118, 153-163.
293. Miroux,B. and Walker,J.E. (1996) Over-production of proteins in Escherichia coli: mutant hosts that allow synthesis of some membrane proteins and globular proteins at high levels. J. Mol. Biol., 260, 289-298.
294. Kwon,S.K., Kim,S.K., Lee,D.H. and Kim,J.F. (2015) Comparative genomics and experimental evolution of Escherichia coli BL21(DE3) strains reveal the landscape of toxicity escape from membrane protein overproduction. Sci. Rep., 5, 16076.

295. Alfasi,S., Sevastsyanovich,Y., Zaffaroni,L., Griffiths,L., Hall,R. and Cole,J. (2011) Use of GFP fusions for the isolation of Escherichia coli strains for improved production of different target recombinant proteins. J. Biotechnol., 156, 11-21.
296. Schlegel,S., Genevaux,P. and de Gier,J.W. (2015) De-convoluting the Genetic Adaptations of E. coli C41(DE3) in Real Time Reveals How Alleviating Protein Production Stress Improves Yields. Cell Rep., 10, 1758-1766.
297. Vethanayagan,J.G. and Flower,A.M. (2005) Decreased gene expression from T7 promoters may be due to impaired production of active T7 RNA polymerase. Microb. Cell Fact., 4, 3.
298. Li,Z.J., Zhang,Z.X., Xu,Y., Shi,T.Q., Ye,C., Sun,X.M. and Huang,H. (2022) CRISPR-Based Construction of a BL21 (DE3)-Derived Variant Strain Library to Rapidly Improve Recombinant Protein Production. ACS Synth. Biol., 11, 343-352.
299. Studier,F.W. (1991) Use of bacteriophage T7 lysozyme to improve an inducible T7 expression system. J. Mol. Biol., 219, 37-44.
300. Spehr,V., Frahm,D. and Meyer,T.F. (2000) Improvement of the T7 expression system by the use of T7 lysozyme. Gene, 257, 259-267.
301. Wycuff,D.R. and Matthews,K.S. (2000) Generation of an AraC-araBAD promoter-regulated T7 expression system. Anal. Biochem., 277, 67-73.
302. Kar,S. and Ellington,A.D. (2018) Construction of synthetic T7 RNA polymerase expression systems. Methods, 143, 110-120.
303. Anilionyte,O., Liang,H., Ma,X., Yang,L. and Zhou,K. (2018) Short, auto-inducible promoters for well-controlled protein expression in Escherichia coli. Appl. Microbiol. Biotechnol., 102, 7007-7015.
304. Kushwaha,M. and Salis,H.M. (2015) A portable expression resource for engineering cross-species genetic circuits and pathways. Nat. Commun., 6.
305. Kim,J., Quijano,J.F., Kim,J., Yeung,E. and Murray,R.M. (2021) Synthetic logic circuits using RNA aptamer against T7 RNA polymerase. Biotechnol. J., 10.1002/biot.202000449.
306. Shis,D.L. and Bennett,M.R. (2013) Library of synthetic transcriptional AND gates built with split T7 RNA polymerase mutants. Proc. Natl. Acad. Sci. U. S. A., 110, 5028-5033.
307. Han,T., Chen,Q. and Liu,H. (2017) Engineered Photoactivatable Genetic Switches Based on the Bacterium Phage T7 RNA Polymerase. ACS Synth. Biol., 6, 357-366.

308. Riley,L.A. and Guss,A.M. (2021) Approaches to genetic tool development for rapid domestication of non-model microorganisms. Biotechnol. Biofuels, 14, 1-17.
309. Lammens,E.M., Nikel,P.I. and Lavigne,R. (2020) Exploring the synthetic biology potential of bacteriophages for engineering non-model bacteria. Nat. Commun., 11, 5294.
310. Teh,M.Y., Ooi,K.H., Danny Teo,S.X., Bin Mansoor,M.E., Shaun Lim,W.Z. and Tan,M.H. (2019) An Expanded Synthetic Biology Toolkit for Gene Expression Control in Acetobacteraceae. ACS Synth. Biol., 8, 708-723.
311. Whitford,C.M., Cruz-Morales,P., Keasling,J.D. and Weber,T. (2021) The design-build-test-learn cycle for metabolic engineering of streptomycetes. Essays Biochem., 65, 261-275.
312. Schuster,L.A. and Reisch,C.R. (2022) Plasmids for controlled and tunable high-level expression in E. coli. Appl. Environ. Microbiol., In press.
313. Dumon-Seignovert,L., Cariot,G. and Vuillard,L. (2004) The toxicity of recombinant proteins in Escherichia coli: A comparison of overexpression in BL21(DE3), C41(DE3), and C43(DE3). Protein Expr. Purif., 37, 203-206.
314. Yang,S., Liu,Q., Zhang,Y., Du,G., Chen,J. and Kang,Z. (2018) Construction and Characterization of Broad-Spectrum Promoters for Synthetic Biology. ACS Synth. Biol., 7, 287-291.
315. Gronenborn,B. (1976) Overproduction of phage Lambda repressor under control of the lac promotor of Escherichia coli. MGG Mol. Gen. Genet., 148, 243-250.
316. Yanofsky,C., Platt,T., Crawford,I.P., Nichols,B.P., Christie,G.E., Horowitz,H., Vancleemput,M. and Wu,A.M. (1981) The complete nucleotide sequence of the tryptophan operon of Escherichia coli. Nucleic Acids Res., 9, 6647-6668.
317. Amann,E., Ochs,B. and Abel,K.J. (1988) Tightly regulated tac promoter vectors useful for the expression of unfused and fused proteins in Escherichia coli. Gene, 69, 301-315.
318. Haldimann,A., Daniels,L.L. and Wanner,B.L. (1998) Use of new methods for construction of tightly regulated arabinose and rhamnose promoter fusions in studies of the Escherichia coli phosphate regulon. J. Bacteriol., 180, 1277-1286.
319. Guiziou,S., Sauveplane,V., Chang,H.J., Clerté,C., Declerck,N., Jules,M. and Bonnet,J. (2016) A part toolbox to tune genetic expression in Bacillus subtilis. Nucleic Acids Res., 44, 7495-7508.
320. Meisner,J. and Goldberg,J.B. (2016) The Escherichia coli rhaSR-PrhaBAD inducible promoter system allows tightly controlled gene expression over a wide range in Pseudomonas aeruginosa. Appl. Environ. Microbiol., 82, 6715-6727.

321. Eggeling,L., Bott,M. and Marienhagen,J. (2015) Novel screening methods biosensors. Curr. Opin. Biotechnol., 35, 30-36.
322. and Bennett,M.R. (2018) Tuning the dynamic range of bacterial promoters regulated by ligand-inducible transcription factors. Nat. Commun., 9, 64.
323. Volke,D.C., Turlin,J., Mol,V. and Nikel,P.I. (2019) Physical decoupling of XylS/Pm regulatory elements and conditional proteolysis enable precise control of gene expression in Pseudomonas putida. Microb. Biotechnol., 13, 222-232.
324. Giacalone,M.J., Gentile,A.M., Lovitt,B.T., Berkley,N.L., Gunderson,C.W. and Surber,M.W. (2006) Toxic protein expression in Escherichia coli using a rhamnose-based tightly regulated and tunable promoter system. Biotechniques, 40, 355-364.
325. Sivashanmugam,A., Murray,V., Cui,C., Zhang,Y., Wang,J. and Li,Q. (2009) Practical protocols for production of very high yields of recombinant proteins using Escherichia coli. Protein Sci., 18, 936-948.
326. Mostafavi,M., Lewis,J.C., Saini,T., Bustamante,J.A., Gao,I.T., Tran,T.T., King,S.N., Huang,Z. and Chen,J.C. (2014) Analysis of a taurine-dependent promoter in Sinorhizobium meliloti that offers tight modulation of gene expression. BMC Microbiol., 14, 295.
327. Keasling,J.D. (1999) Gene expression tools for the metabolic engineering of bacteria. Trends Biotechnol., 10.1016/S0167-7799(99)01376-1.
328. Razo-Mejia,M., Barnes,S.L., Belliveau,N.M., Chure,G., Einav,T., Lewis,M. and Phillips,R. (2018) Tuning Transcriptional Regulation through Signaling: A Predictive Theory of Allosteric Induction. Cell Syst., 6, 456-469.e10.
329. Hicks,M., Bachmann,T.T. and Wang,B. (2020) Synthetic Biology Enables Programmable Cell-Based Biosensors. ChemPhysChem, 21, 132-144.
330. Cook,T.B., Rand,J.M., Nurani,W., Courtney,D.K., Liu,S.A. and Pfleger,B.F. (2018) Genetic tools for reliable gene expression and recombineering in Pseudomonas putida. J. Ind. Microbiol. Biotechnol., 45, 517-527.
331. Cao,Y., Song,M., Li,F., Li,C., Lin,X., Chen,Y., Chen,Y., Xu,J., Ding,Q. and Song,H. (2019) A Synthetic Plasmid Toolkit for Shewanella oneidensis MR-1. Front. Microbiol., 10, 410.
332. Moore,S.J., Lai,H.E., Kelwick,R.J.R., Chee,S.M., Bell,D.J., Polizzi,K.M. and Freemont,P.S. (2016) EcoFlex: A Multifunctional MoClo Kit for E. coli Synthetic Biology. ACS Synth. Biol., 5, 1033-1181.

333. Decoene,T., De Paepe,B., Maertens,J., Coussement,P., Peters,G., De Maeseneire,S.L. and De Mey,M. (2018) Standardization in synthetic biology: an engineering discipline coming of age. Crit. Rev. Biotechnol., 38, 647-656.
334. Sengupta,A., Pakrasi,H.B. and Wangikar,P.P. (2018) Recent advances in synthetic biology of cyanobacteria. Appl. Microbiol. Biotechnol., 102, 5457-5471.
335. Cress,B.F., Jones,J.A., Kim,D.C., Leitz,Q.D., Englaender,J.A., Collins,S.M., Linhardt,R.J. and Koffas,M.A.G. (2016) Rapid generation of CRISPR/dCas9-regulated, orthogonally repressible hybrid T7-lac promoters for modular, tuneable control of metabolic pathway fluxes in Escherichia coli. Nucleic Acids Res., 44, 4472-4485.
336. Lou,C., Stanton,B., Chen,Y.J., Munsky,B. and Voigt,C.A. (2012) Ribozyme-based insulator parts buffer synthetic circuits from genetic context. Nat. Biotechnol., 30, 1137-1142.
337. Khan,S.R., Gaines,J., Roop,R.M. and Farrand,S.K. (2008) Broad-host-range expression vectors with tightly regulated promoters and their use to examine the influence of TraR and TraM expression on Ti plasmid quorum sensing. Appl. Environ. Microbiol., 74, 5053-5062.
338. Stabb,E.V. and Ruby,E.G. (2002) RP4-Based Plasmids for Conjugation between Escherichia coli and Members of the Vibrionaceae. Methods Enzymol., 358, 413-426.
339. Cormack,B.P., Valdivia,R.H. and Falkow,S. (1996) FACS-optimized mutants of the green fluorescent protein (GFP). Gene, 173, 33-38.
340. Wannier,T.M., Gillespie,S.K., Hutchins,N., Scott McIsaac,R., Wu,S.Y., Shen,Y., Campbell,R.E., Brown,K.S. and Mayo,S.L. (2018) Monomerization of far-red fluorescent proteins. Proc. Natl. Acad. Sci. U. S. A., 115, E11294-E11301.
341. Shcherbo,D., Murphy,C.S., Ermakova,G.V., Solovieva,E.A., Chepurnykh,T.V., Shcheglov,A.S., Verkhusha,V.V., Pletnev,V.Z., Hazelwood,K.L., Roche,P.M., et al. (2009) Far-red fluorescent tags for protein imaging in living tissues. Biochem. J., 418, 567-574.
342. Campbell,R.E., Tour,O., Palmer,A.E., Steinbach,P.A., Baird,G.S., Zacharias,D.A. and Tsien,R.Y. (2002) A monomeric red fluorescent protein. Proc. Natl. Acad. Sci. U. S. A., 99, 7877-7882.
343. Barber,C.E., Tang,J.L., Feng,J.X., Pan,M.Q., Wilson,T.J.G., Slater,H., Dow,J.M., Williams,P. and Daniels,M.J. (1997) A novel regulatory system required for pathogenicity of Xanthomonas campestris is mediated by a small diffusible signal molecule. Mol. Microbiol., 24, 555-566.

344. Calero,P., Jensen,S.I. and Nielsen,A.T. (2016) Broad-Host-Range ProUSER Vectors Enable Fast Characterization of Inducible Promoters and Optimization of p-Coumaric Acid Production in Pseudomonas putida KT2440. ACS Synth. Biol., 5, 741-753.
345. Newman,J.R. and Fuqua,C. (1999) Broad-host-range expression vectors that carry the L-arabinose-inducible Escherichia coli araBAD promoter and the araC regulator. Gene, 227, 197-203.
346. Brewster,R.C., Weinert,F.M., Garcia,H.G., Song,D., Rydenfelt,M. and Phillips,R. (2014) The transcription factor titration effect dictates level of gene expression. Cell, 156, 1312-1323.
347. Sung,K.L., Chou,H.H., Pfleger,B.F., Newman,J.D., Yoshikuni,Y. and Keasling,J.D. (2007) Directed evolution of AraC for improved compatibility of arabinose- and lactose-inducible promoters. Appl. Environ. Microbiol., 73, 5711-5715.
348. Hecht,A., Endy,D., Salit,M. and Munson,M.S. (2016) When Wavelengths Collide: Bias in Cell Abundance Measurements Due to Expressed Fluorescent Proteins. ACS Synth. Biol., 5, 1024-1027.
349. Cui,L. and Bikard,D. (2016) Consequences of Cas9 cleavage in the chromosome of Escherichia coli. Nucleic Acids Res., 44, 4243-4251.
350. Solomon,K.V., Moon,T.S., Ma,B., Sanders,T.M. and Prather,K.L.J.J. (2013) Tuning primary metabolism for heterologous pathway productivity. ACS Synth. Biol., 2, 126-135.
351. Lalwani,M.A., Ip,S.S., Carrasco-López,C., Day,C., Zhao,E.M., Kawabe,H. and Avalos,J.L. (2020) Optogenetic control of the lac operon for bacterial chemical and protein production. Nat. Chem. Biol., 17, 71-79.
352. Moon,T.S., Lou,C., Tamsir,A., Stanton,B.C. and Voigt,C.A. (2012) Genetic programs constructed from layered logic gates in single cells. Nature, 491, 249-253.
353. Hawkins,A.C., Arnold,F.H., Stuermer,R., Hauer,B. and Leadbetter,J.R. (2007) Directed evolution of Vibrio fischeri LuxR for improved response to butanoyl-homoserine lactone. Appl. Environ. Microbiol., 73, 5775-5781.
354. Zhang,J.J., Tang,X., Zhang,M., Nguyen,D. and Moore,B.S. (2017) Broad-host-range expression reveals native and host regulatory elements that influence heterologous antibiotic production in Gram-negative bacteria. MBio, 8, e01291-17.
355. Wegerer,A., Sun,T. and Altenbuchner,J. (2008) Optimization of an E. coli L-rhamnose-inducible expression vector: Test of various genetic module combinations. BMC Biotechnol., 8.

356. Graf,N. and Altenbuchner,J. (2013) Functional characterization and application of a tightly regulated MekR/P mekA expression system in Escherichia coli and Pseudomonas putida. Appl. Microbiol. Biotechnol., 97, 8239-8251.
357. Lieder,S., Nikel,P.I., de Lorenzo,V. and Takors,R. (2015) Genome reduction boosts heterologous gene expression in Pseudomonas putida. Microb. Cell Fact., 14, 23.
358. Goodner,B., Hinkle,G., Gattung,S., Miller,N., Blanchard,M., Qurollo,B., Goldman,B.S., Cao,Y., Askenazi,M., Halling,C., et al. (2001) Genome sequence of the plant pathogen and biotechnology agent Agrobacterium tumefaciens C58. Science, 294, 2323-2328.
359. Gallagher,L.A., Ramage,E., Patrapuvich,R., Weiss,E., Brittnacher,M. and Manoil,C. (2013) Sequence-defined transposon mutant library of Burkholderia thailandensis. MBio, 4, e00604-13.
360. Yu,Y., Kim,H.S., Hui,H.C., Chi,H.L., Siew,H.S., Lin,D., Derr,A., Engels,R., DeShazer,D., Birren,B., et al. (2006) Genomic patterns of pathogen evolution revealed by comparison of Burkholderia pseudomallei, the causative agent of melioidosis, to avirulent Burkholderia thailandensis. BMC Microbiol., 6.
361. de Berardinis,V., Durot,M., Weissenbach,J. and Salanoubat,M. (2009) Acinetobacter baylyi ADP1 as a model for metabolic system biology. Curr. Opin. Microbiol., 12, 568-576.
362. Peters,J.M., Colavin,A., Shi,H., Czarny,T.L., Larson,M.H., Wong,S., Hawkins,J.S., Lu,C.H.S.S., Koo,B.M., Marta,E., et al. (2016) A comprehensive, CRISPR-based functional analysis of essential genes in bacteria. Cell, 165, 1493-1506.
363. Tan,S.Z., Reisch,C.R. and Prather,K.L.J. (2018) A Robust CRISPRi Gene Repression System in Pseudomonas. J. Bacteriol., 200, JB.00575-17.
364. Liu,Y., Wan,X. and Wang,B. (2019) Engineered CRISPRa enables programmable eukaryote-like gene activation in bacteria. Nat. Commun., 10, 3693.
365. Gormley,E.P. and Davies,J. (1991) Transfer of plasmid RSF1010 by conjugation from Escherichia coli to Streptomyces lividans and Mycobacterium smegmatis. J. Bacteriol., 173, 6705-8.
366. Geissendörfer,M. and Hillen,W. (1990) Regulated expression of heterologous genes in Bacillus subtilis using the Tn10-encoded tet regulatory elements. Appl. Microbiol. Biotechnol., 33, 657-663.
367. Ben-Samoun,K., Leblon,G. and Reyes,O. (1999) Positively regulated expression of the Escherichia coli araBAD promoter in Corynebacterium glutamicum. FEMS Microbiol. Lett., 174, 125-130.

368. Tolia,N.H. and Joshua-Tor,L. (2006) Strategies for protein coexpression in Escherichia coli. Nat. Methods, 3, 55-64.
369. Rosenberg,A.H., Lade,B.N., Dao-shan,C., Lin,S.W., Dunn,J.J. and Studier,F.W. (1987) Vectors for selective expression of cloned DNAs by T7 RNA polymerase. Gene, 56, 125-135.
370. Blazeck,J. and Alper,H.S. (2013) Promoter engineering: Recent advances in controlling transcription at the most fundamental level. Biotechnol. J., 8, 46-58.
371. Held,D., Yaeger,K. and Novy,R. (2003) New coexpression vectors for expanded compatibilities in E. coli. Innovations.
372. Liang,X., Li,C., Wang,W. and Li,Q. (2018) Integrating T7 RNA Polymerase and Its Cognate Transcriptional Units for a Host-Independent and Stable Expression System in Single Plasmid. ACS Synth. Biol., 7, 1424-1435.
373. Rosano,G.L., Morales,E.S. and Ceccarelli,E.A. (2019) New tools for recombinant protein production in Escherichia coli: A 5-year update. Protein Sci., 28, 1412-1422.
374. Lozano Terol,G., Gallego Jara,J., Sola Martínez,R.A., Martínez Vivancos,A., Cánovas Díaz,M. and de Diego Puente,T. (2021) Impact of the Expression System on Recombinant Protein Production in Escherichia coli BL21. Front. Microbiol., 12, 682001.
375. Clomburg,J.M. and Gonzalez,R. (2013) Anaerobic fermentation of glycerol: A platform for renewable fuels and chemicals. Trends Biotechnol., 31, 20-28.
376. Balzer,S., Kucharova,V., Megerle,J., Lale,R., Brautaset,T. and Valla,S. (2013) A comparative analysis of the properties of regulated promoter systems commonly used for recombinant gene expression in Escherichia coli. Microb. Cell Fact., 12, 26.
377. Gawin,A., Valla,S. and Brautaset,T. (2017) The XylS/Pm regulator/promoter system and its use in fundamental studies of bacterial gene expression, recombinant protein production and metabolic engineering. Microb. Biotechnol., 10, 702-718.
378. Figurski,D.H. and Helinski,D.R. (1979) Replication of an origin-containing derivative of plasmid RK2 dependent on a plasmid function provided in trans. Proc. Natl. Acad. Sci. U. S. A., 76, 1648-1652.
379. Kues,U. and Stahl,U. (1989) Replication of plasmids in gram-negative bacteria. Microbiol. Rev., 53, 491-516.

380. Hjelm,A. (2015) High level production of membrane proteins in E. coli BL21(DE3) by omitting the inducer IPTG. Microb. Cell Fact., 14, 142.
381. Kuipers,G., Karyolaimos,A., Zhang,Z., Ismail,N., Trinco,G., Vikström,D., Slotboom,D.J. and de Gier,J.W. (2017) The tunable pReX expression vector enables optimizing the T7-based production of membrane and secretory proteins in E. coli. Microb. Cell Fact., 16, 226.
382. Randall,L.L., Topping,T.B., Smith,V.F., Diamond,D.L. and Hardy,S.J.S. (1998) SecB: A chaperone from Escherichia coli. Methods Enzymol., 290, 444-459.
383. Kuderová,A., Nanak,E., Truksa,M. and Brzobohatý,B. (1999) Use of rifampicin in T7 RNA polymerase-driven expression of a plant enzyme: Rifampicin improves yield and assembly. Protein Expr. Purif., 16, 405-409.
384. Ohuchi,S., Mori,Y. and Nakamura,Y. (2012) Evolution of an inhibitory RNA aptamer against T7 RNA polymerase. FEBS Open Bio, 2, 203-207.
385. Zhao,H., Zhang,H.M., Chen,X., Li,T., Wu,Q., Ouyang,Q. and Chen,G.Q. (2017) Novel T7-like expression systems used for Halomonas. Metab. Eng., 39, 128-140.
386. Liang,T., Sun,J., Ju,S., Su,S., Yang,L. and Wu,J. (2021) Construction of T7-Like Expression System in Pseudomonas putida KT2440 to Enhance the Heterologous Expression Level. Front. Chem., 9, 664967.
387. Troeschel,S.C., Thies,S., Link,O., Real,C.I., Knops,K., Wilhelm,S., Rosenau,F. and Jaeger,K.E. (2012) Novel broad-host-range shuttle vectors for expression in Escherichia coli, Bacillus subtilis and Pseudomonas putida. J. Biotechnol., 161, 71-79.
388. Jacques,P.É., Rodrigue,S., Gaudreau,L., Goulet,J. and Brzezinski,R. (2006) Detection of prokaryotic promoters from the genomic distribution of hexanucleotide pairs. BMC Bioinformatics, 7, 423.
389. Myers,K.S., Noguera,D.R. and Donohue,T.J. (2021) Promoter Architecture Differences among Alphaproteobacteria and Other Bacterial Taxa. mSystems, 10.1128/msystems.00526-21.
390. Ramírez-Romero,M.A., Masulis,I., Cevallos,M.A., González,V. and Dávila,G. Nucleic Acids Res., 34, 1470-1480.
391. Browning,D.F. and Busby,S.J.W. (2016) Local and global regulation of transcription initiation in bacteria. Nat. Rev. Microbiol., 14, 638-650.


392. Segall-Shapiro,T.H., Sontag,E.D. and Voigt,C.A. (2018) Engineered promoters enable constant gene expression at any copy number in bacteria. Nat. Biotechnol., 36, 352–358.
393. Voigt,C.A. (2006) Genetic parts to program bacteria. Curr. Opin. Biotechnol., 17, 548–557.
394. Burgess,R.R. and Anthony,L. (2001) How sigma docks to RNA polymerase and what sigma does. Curr. Opin. Microbiol., 4, 126–131.
395. Campbell,E.A., Muzzin,O., Chlenov,M., Sun,J.L., Olson,C.A., Weinman,O., Trester-Zedlitz,M.L. and Darst,S.A. (2002) Structure of the bacterial RNA Mol. Cell, 9, 527–539.
396. Schaller,H., Gray,C. and Herrmann,K. (1975) Nucleotide sequence of an RNA polymerase binding site from the DNA of bacteriophage fd. Proc. Natl. Acad. Sci. U. S. A., 72, 737–741.
397. Pribnow,D. (1975) Bacteriophage T7 early promoters: Nucleotide sequences of two RNA polymerase binding sites. J. Mol. Biol., 99.
398. Potvin,E., Sanschagrin,F. and Levesque,R.C. (2008) Sigma factors in Pseudomonas aeruginosa. FEMS Microbiol. Rev., 32, 38–55.
399. DeHaseth,P.L., Zupancic,M.L. and Record,M.T. (1998) RNA polymerase-promoter interactions: The comings and goings of RNA polymerase. J. Bacteriol., 180, 3019–3025.
400. McLean,B.W., Wiseman,S.L. and Kropinski,A.M. (1997) Functional analysis of sigma 70 consensus promoters in Pseudomonas aeruginosa and Escherichia coli. Can. J. Microbiol., 43, 981–985.
401. Rangwala,S.H., Fuchs,R.L., Drahos,D.J. and Olins,P.O. (1991) Broad host range vector for efficient expression of foreign genes in gram-negative bacteria. Nat. Biotechnol., 9, 477–479.
402. MacLellan,S.R., MacLean,A.M. and Finan,T.M. (2006) Promoter prediction in the rhizobia. Microbiology, 152, 1751–1763.
403. Huang,C.H., Shen,C.R., Li,H., Sung,L.Y., Wu,M.Y. and Hu,Y.C. (2016) CRISPR interference (CRISPRi) for gene regulation and succinate production in cyanobacterium S. elongatus PCC 7942. Microb. Cell Fact., 15, 196.
404. Luka,S., Patriarca,E.J., Riccio,A., Iaccarino,M. and Defez,R. (1996) Cloning of the rpoD analog from Rhizobium etli: sigA of R. etli is growth phase regulated. J. Bacteriol., 178, 7138–7143.


405. Peck,M.C., Gaal,T., Fisher,R.F., Gourse,R.L. and Long,S.R. (2002) The RNA polymerase subunits from Escherichia coli and function in basal and activated transcription both in vivo and in vitro. J. Bacteriol., 184, 3808–3814.
406. Crooks,G., Hon,G., Chandonia,J. and Brenner,S. (2004) WebLogo: a sequence logo generator. Genome Res., 14, 1188–1190.
407. Wu,X. and Bartel,D.P. (2017) KpLogo: Positional k-mer analysis reveals hidden specificity in biological sequences. Nucleic Acids Res., 45, W534–W538.
408. Guo,Y. and Gralla,J.D. (1998) Promoter opening via a DNA fork junction binding activity. Proc. Natl. Acad. Sci. U. S. A., 95, 11655–11660.
409. Fenton,M.S. and Gralla,J.D. (2003) Roles for inhibitory interactions in the use of the J. Biol. Chem., 278, 39669–39674.
410. Henry,K.K., Ross,W., Myers,K.S., Lemmer,K.C., Vera,J.M., Landick,R., Donohue,T.J. and Gourse,R.L. (2020) A majority of Rhodobacter sphaeroides promoters lack a crucial RNA polymerase recognition feature, enabling coordinated transcription activation. Proc. Natl. Acad. Sci. U. S. A., 117, 29658–29668.
411. Zwir,I., Latifi,T., Perez,J.C., Huang,H. and Groisman,E.A. (2012) The promoter architectural landscape of the Salmonella PhoP regulon. Mol. Microbiol., 84, 463–485.
412. Gomes,A.L.C., Johns,N.I., Yang,A., Velez-Cortes,F., Smillie,C.S., Smith,M.B., Alm,E.J. and Wang,H.H. (2020) Genome and sequence determinants governing the expression of horizontally acquired DNA in bacteria. ISME J., 14, 2347–2357.
413. Markley,A.L., Begemann,M.B., Clarke,R.E., Gordon,G.C. and Pfleger,B.F. (2015) Synthetic Biology Toolbox for Controlling Gene Expression in the Cyanobacterium Synechococcus sp. strain PCC 7002. ACS Synth. Biol., 4, 595–603.
414. Tang,H., Wu,Y., Deng,J., Chen,N., Zheng,Z., Wei,Y., Luo,X. and Keasling,J.D. (2020) Promoter architecture and promoter engineering in Saccharomyces cerevisiae. Metabolites, 10, 320.
415. Kim,S.K., Yoon,P.K., Kim,S.J., Woo,S.G., Rha,E., Lee,H., Yeom,S.J., Kim,H., Lee,D.H. and Lee,S.G. (2020) CRISPR interference-mediated gene regulation in Pseudomonas putida KT2440. Microb. Biotechnol., 13, 210–221.
416. Yi,Y.C. and Ng,I.-S. (2020) Establishment of toolkit and T7RNA polymerase/promoter system in Shewanella oneidensis MR-1. J. Taiwan Inst. Chem. Eng., 109, 8–14.
417. Iglewski,B.H. (1996) Medical Microbiology 4th edition. Baron,S. (ed) University of Texas Medical Branch at Galveston, Galveston (TX).


418. Yang,Y., Shen,W., Huang,J., Li,R., Xiao,Y., Wei,H., Chou,Y.C., Zhang,M., Himmel,M.E., Chen,S., et al. (2019) Prediction and characterization of promoters and ribosomal binding sites of Zymomonas mobilis in system biology era. Biotechnol. Biofuels, 12, 52.
419. Keren,L., Zackay,O., Lotan-Pompan,M., Barenholz,U., Dekel,E., Sasson,V., Aidelberg,G., Bren,A., Zeevi,D., Weinberger,A., et al. (2013) Promoters maintain their relative activity levels under different growth conditions. Mol. Syst. Biol., 9, 701.
420. De Mey,M., Maertens,J., Lequeux,G.J., Soetaert,W.K. and Vandamme,E.J. (2007) Construction and model-based analysis of a promoter library for E. coli: An indispensable tool for metabolic engineering. BMC Biotechnol., 7, 34.
421. Liu,D., Mao,Z., Guo,J., Wei,L., Ma,H., Tang,Y., Chen,T., Wang,Z. and Zhao,X. (2018) Construction, Model-Based Analysis, and Characterization of a Promoter Library for Fine-Tuned Gene Expression in Bacillus subtilis. ACS Synth. Biol., 7, 1785–1797.
422. Lloréns-Rico,V., Lluch-Senar,M. and Serrano,L. (2015) Distinguishing between productive and abortive promoters using a random forest classifier in Mycoplasma pneumoniae. Nucleic Acids Res., 43, 3442–3453.


BIOGRAPHICAL SKETCH

Layla Angela Schuster was born in Phoenix, Arizona in 1992. She moved from Arizona to Michigan as a child, living there for almost 10 years before her parents moved her to Florida because they hated the snow. She graduated from the University of Florida in 2013 with a B.S. in microbiology and cell science and then, for something completely different, she ran off to Madrid, Spain to teach English in a Spanish high school for a few years. She then came back to the U.S., buckled down, and started a job as a lab tech at UF, where she learned many of the skills that would be invaluable to her as a graduate student. In 2016, she joined the lab of Dr. Reisch where she happily exerted control of gene expression in many species over the next four years. During that time, Layla developed a deep appreciation for the field of synthetic biology and looks forward to exerting control over bacteria for many years to come.