A User’s Guide to the Arabidopsis T-DNA Insertional Mutant Collections
The T-DNA insertion mutant libraries provide general access to alleles for gene functional studies
Mutagenesis has been a central tool for studying the genetics underlying biological traits, as phenotypic analysis of mutants provides a direct method to measure a gene’s contributions to biochemical, cellular, tissue and organ characteristics. In a mutant genotype where a polymorphism alters a single gene’s functional output, the isolated activity of that gene in vivo can be assessed by phenotypic comparison to the wild-type parental genotype. Furthermore, eukaryotic organisms harboring multiple mutations, often generated by sexual hybridization between single mutants, are valuable for characterization of more complex interactions such as epistasis, functional overlap, and sub-functionalization. Though biological assignment of gene function has always depended heavily upon phenotypic analysis of mutants, currently, only ~12% of Arabidopsis gene function assignments are based on in vivo characterization (1). Furthermore, while as many as 60% of Arabidopsis genes do have some inferred function, these characterizations are often based on relationships such as sequence homology to better-characterized genes and, thus, these inferred functions may often be incomplete or even inaccurate (1). With recent advances in genomic-scale tools and methods, we are beginning to see a rapid increase in the scope and quality of inferred gene functions (1, 2), but as computer models of genetic networks develop more complicated predictions about specific interactions, further characterization of mutant alleles by phenotyping will likely be required to support and extend the models (1, 3). Moreover, as a primary goal of plant research is crop improvement, mutant analysis will likely always be important as a tool for examining the in planta effects of the alteration of a gene function.
While new methods for targeted mutagenesis such as CRISPR and TALENs are being developed, concerns related to specificity and off-target effects still need to be resolved before these methods become standard laboratory techniques (4–6). Even when robust eukaryotic genome editing tools allow the average laboratory to inexpensively generate custom alleles, the availability of mutants for a new target gene may still be limited by organism-specific features such as transformability and lifespan. Thus, even with facile editing tools, if large numbers of genes need to be tested, which is likely to be the case as gene functional predictions improve, access to mutant alleles could become a bottleneck for confirmation and further characterization of predictions. One solution to the problem of immediate mutant allele access for any gene is the creation of very large collections of sequence-indexed insertion lines for an organism. A sequence-indexed mutant collection typically consists of several hundred thousand individual lines in which the precise genomic location of the mutation or mutations in each line has been determined by DNA sequencing. As some portion of the synthetic polymorphisms will be in or proximal to genes, these mutations commonly result in the loss or disruption of gene function. By creating a very large population of individually sequenced mutants, gene disruption alleles can be identified for almost all genes in an organism (7). Due to the value of such a resource, this approach has been applied to create sequence-indexed mutant collections in several organisms including mouse (8), zebrafish (9), Drosophila (10), and Arabidopsis (reviewed in (11)).
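The reasoning behind "a very large population" can be made concrete with a small calculation. The sketch below assumes, purely for illustration, that each insertion lands independently and uniformly at random in the genome, so that the chance of at least one insertion in a target of a given size is 1 - (1 - p)^n. Real insertion distributions may deviate from this idealization, and the genome and gene sizes used here are round illustrative numbers, not values taken from the text.

```python
import math


def hit_probability(gene_bp: float, genome_bp: float, n_insertions: int) -> float:
    """Probability that at least one of n independent, uniformly placed
    insertions falls inside a target of gene_bp bases (idealized model)."""
    per_insertion = gene_bp / genome_bp
    # 1 - (1 - p)^n, written with log1p/expm1 for accuracy when p is tiny
    return -math.expm1(n_insertions * math.log1p(-per_insertion))


def insertions_needed(gene_bp: float, genome_bp: float, target_probability: float) -> int:
    """Smallest n for which hit_probability reaches target_probability
    (target_probability must lie strictly between 0 and 1)."""
    per_insertion = gene_bp / genome_bp
    return math.ceil(math.log1p(-target_probability) / math.log1p(-per_insertion))


if __name__ == "__main__":
    GENOME_BP = 125_000_000  # illustrative round number (assumption)
    for gene_bp in (500, 2_000, 5_000):  # illustrative target sizes (assumption)
        n = insertions_needed(gene_bp, GENOME_BP, 0.95)
        print(f"target of {gene_bp} bp: about {n} insertions for a 95% chance of a hit")
```

Under this model, small genes require far more insertions than large ones to reach the same probability of disruption, which is one way to see why collections of this kind need to be so large.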
Agrobacterium tumefaciens transfer-DNA (T-DNA)-induced insertion mutant collections in Arabidopsis thaliana, created in the late 1990s and early 2000s as an international effort to saturate the gene-space with mutations, have been a particularly important resource for plant biology (11). The high gene-space coverage in these collections is in part due to the relative ease with which T-DNA insertional mutagenesis can be used to create large sequence-indexed collections in plants. The insertion of a T-DNA fragment into a plant host genome is a consequence of a natural transformation process in which an Agrobacterium infection results in the transfer of a DNA fragment flanked by 25 bp border sequences (the T-DNA) from a heavily modified tumor-inducing (Ti) plasmid into the infected plant’s genome (12). Highly efficient T-DNA transformation protocols are available for Arabidopsis (13), and because the T-DNA inserts randomly (7) and is an effective gene-disrupting mutagen, the generation of the large mutant populations required for gene-space coverage is possible. Furthermore, because the T-DNA insert contains a known DNA sequence, primers designed from the left border (LB) of the T-DNA can be used to isolate the genomic/T-DNA sequence junction in a high-throughput fashion. The genomic portion of this sequence, commonly known as the flanking sequence tag (FST), can be mapped to the genome to precisely identify the chromosomal insert location for many individual lines.
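The idea of an FST can be illustrated with a toy example. The sketch below is not the pipeline used to index any of the collections; it only shows the logic of trimming the known T-DNA sequence from a junction read and placing the remaining genomic portion on a reference. The function names, the border sequence, and the reference string are all hypothetical, and exact string matching stands in for the sequence alignment tools a real analysis would use.

```python
from typing import Optional, Tuple

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an upper-case A/C/G/T string."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_fst(read: str, border_end: str, min_len: int = 12) -> Optional[str]:
    """Return the part of the read that follows the known border sequence,
    or None if the border is absent or the remainder is too short."""
    idx = read.find(border_end)
    if idx == -1:
        return None
    fst = read[idx + len(border_end):]
    return fst if len(fst) >= min_len else None


def map_fst(fst: str, reference: str) -> Optional[Tuple[int, str]]:
    """Place the FST on the reference by exact match on either strand.
    Returns (0-based reference position of the FST base adjacent to the
    T-DNA, strand), or None if there is no exact match."""
    pos = reference.find(fst)
    if pos != -1:
        return pos, "+"
    pos = reference.find(reverse_complement(fst))
    if pos != -1:
        # On the minus strand the first FST base is the last matched reference base
        return pos + len(fst) - 1, "-"
    return None


if __name__ == "__main__":
    # Made-up sequences for illustration only (assumptions, not real data)
    reference = "ACGTTGCAAGCTTGGATCCATGCGTACGTTAGCCTAGGCATTACG"
    border_end = "GGTCAATTCGGA"  # hypothetical, not the border of any real vector
    read = "TTAC" + border_end + reference[20:36]

    fst = extract_fst(read, border_end)
    print(fst, map_fst(fst, reference) if fst else None)
```

Reads that lack the border sequence, or that leave too little genomic sequence to place, return None in this sketch rather than a guessed location.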
Insertional indexing of large populations of T-DNA transformant lines has been used to achieve mutant allele coverage for the majority of Arabidopsis genes. In this chapter we will describe these Arabidopsis thaliana T-DNA insertional mutant collections with a particular focus on the mutant collections in the Columbia (Col-0) accession: the SALK, GABI-Kat, SAIL, and WISC lines (14–17). These collections, generated by several laboratories including ours, together comprise over 260,000 individual mutant lines and represent potential disruption mutants for most Arabidopsis thaliana genes. In addition to these four Col-0 collections, T-DNA insertional mutants are also available in other backgrounds, such as the FLAG collection lines in Wassilewskija (WS) (18), as are Arabidopsis transposon-insertion collections, such as the CSHL and RIKEN lines in the Landsberg erecta and Nossen accessions, respectively (19, 20). However, as the Col-0 T-DNA collections have been the most heavily utilized, we will primarily focus on the SALK, SAIL, GABI-Kat and WISC lines, which we will collectively refer to here as the T-DNA collection.
The T-DNA collection has been used as a resource for thousands of published studies addressing highly varied questions in plant biology (11). This extensive use of the Col-0 T-DNA collection is in part due to the fact that the Arabidopsis thaliana Col-0 accession has been a primary model in plant research for several decades, and is currently the only Arabidopsis accession with a high-quality reference genome (21). The heavy use of this collection may also be attributable to the ease with which data and seed material from the collection can be accessed by researchers. Our website, T-DNA Express (http://signal.salk.edu/cgi-bin/tdnaexpress), is the primary portal to the T-DNA line information and includes search and analysis tools (iSect) to assist in the design of experiments that effectively leverage this collection. Seed lines are also easily obtained, and can be ordered directly for a small charge from seed repositories: the Arabidopsis Biological Resource Center (ABRC: https://abrc.osu.edu/) for the U.S. and Canada, and the European Arabidopsis Stock Centre (NASC: http://arabidopsis.info/) for Europe. Requested mutant lines generally ship to researchers within days of placing an order. In just the past five years, over 800,000 lines from the T-DNA insertion line collections have been shipped from ABRC alone (Debbie Crist, personal communication).