Dissemination of Data

Dissemination of results to the scientific community, including a public website

The MCSG has developed and maintains a public website (www.mcsg.anl.gov). This interactive interface to the MCSG database lists the targets and their status and provides information about deposited structures (NIH Table). Several MCSG resources are linked to the website and are available to the public. Examples include the ProFunc Server, SESAMI: Target Selection, the Target Database, Protein Structure Classification, the GPSS surface analysis server, a database of 3D enzyme active site templates, the Protein-Protein Interaction Server, summaries and structural analyses of PDB data files, Comparative Protein Structure Modeling, the Express Primer Tool and others. Because of the change in the PSI-2 target selection set, we have added Pfam annotation and appropriate web links to all our structures in the PDB.

Dissemination of results and data. The MCSG has distributed to qualified researchers:

  1. Materials - expression vectors; expression clones; purified proteins;
  2. Experimentally determined protein structures and homology models computed from MCSG structures;
  3. Protocols - protein expression/purification protocols; gene cloning and protein expression/purification methodologies; crystallization strategies and protocols; X-ray crystallography structure determination methods;
  4. Computer programs and access to MCSG databases and resources.

Collaborative effort to study protein function with scientists outside the MCSG (international)

The MCSG has engaged a significant number of scientists to study the function of proteins whose structures were determined at the Center. Such interactions have led to many publications. Recent examples include:

  1. Collaboration with the lab of A. Yakunin to explore the enzymology of MCSG proteins.
  2. Collaborations with researchers on the function of specific proteins: J. Mekalanos (Harvard Medical School), R. White (Virginia Polytechnic Institute and State University), D. Sanders (University of Saskatchewan), M. James (University of Alberta), O. Dontsova (Moscow State University), and many others.

Structural genomics projects proposed by the scientific community

Proteins involved in cell fusion (coordinated by Dr. William Snell, University of Texas, SWMC)

In nature, cell fusion is a normal process involved in sexual reproduction, tissue formation, and immune response. Sexual reproduction in the green alga Chlamydomonas is regulated by environmental conditions and by cell-cell interactions. Recent studies have identified new molecular events that underlie signal transduction during Chlamydomonas fertilization. Several cell-surface proteins relevant to sexual cell fusion have been identified, including Fus1 and generative cell specific-1 protein. At the MCSG we have performed bioinformatics analyses of genes and coding-sequence homologues from several genomes, collected all necessary cDNA, and initiated cloning and protein expression. All 186 gene constructs were cloned and evaluated for expression and solubility. Proteins from all 44 constructs that expressed soluble protein were purified and screened for crystallization. The evaluation of crystallization results is in progress. Second-round solubility screening as MBP fusion proteins is also in progress. All protein expression constructs have been shipped back to collaborators to generate antibodies.

Kinases involved in stress response - myosin light chain kinase (MLCK) – collaboration with Drs. Skip Garcia and Steven Dudek (Univ. of Chicago)

Activation of Na(+)-nutrient co-transport leads to increased tight junction permeability in intestinal absorptive (villus) enterocytes. This regulation requires myosin II regulatory light chain (MLC) phosphorylation, which is mediated by MLC kinase (MLCK). MLCK1, which is localized to the perijunctional actomyosin ring, is the isoform responsible for tight junction regulation in absorptive enterocytes. MLCK1 is a large (~200 kDa) protein, and the full-length protein can be expressed in a baculovirus system. We have purified the full-length protein and screened it in our HTP system, producing crystals of MLCK1 that were poorly ordered. Complete bioinformatics analyses have been performed, and several constructs have been designed and cloned into the pMCSG19 vector. These constructs are being screened for protein expression.


The MCSG PSI-2 program includes training in state-of-the-art research and technology developed at the Center.

  1. Training undergraduate and graduate students (UVa, WashU, NU, ANL/UofC), student-teacher teams, and postdoctoral fellows:
    1. Graduate students - summer workshop in 2006 (for 12 graduate students from UofC) at Argonne;
    2. Undergraduate co-operative students (9 in summer 2006, 10 in summer 2007);
    3. FAST teams (faculty/undergraduate student teams) funded by NSF. In the past 2 years we have hosted four FAST teams (11 students and 4 faculty members). All of these students were women, members of minority groups, or both;
    4. Postdocs (11 in 2006, 8 in 2007, distributed among several sites) training in high throughput structural biology and bioinformatics;
  2. ACA summer school in the past two years (~10 trainees/year) – synchrotron data collection and structure determination;
  3. Science Careers in Search of Women – presentations and shows for high school students on high throughput structural biology (~60 students);
  4. Active interaction with several large and small NIH-funded programs – examples include GLRCE and MWRCE (focus on drug discovery);
  5. Hosted visiting scientists to work on collaborative projects;
  6. Met with genome sequencing centers to seek alignment of interests;
  7. Interaction with industry to actively develop new instrumentation, robotic workstations and products (for example, a new minimal bacterial growth medium is commercially available);
  8. Presentations on MCSG subject matter;
  9. Writing reviews, book chapters and manuals (for example, preparing a joint paper for Nature Methods describing best practices from SG efforts worldwide);
  10. Co-organizing conferences and workshops.

MCSG Scientific Community Activities

The MCSG uses a variety of communication tools to promote and stimulate interaction with the local and national scientific community (MCSG website, meeting presentations, workshops, student training, summer school). The PI, all co-PIs and senior staff promote the Protein Structure Initiative effort and support strong collaboration with the research community. We are developing models for different types of interaction with researchers across academia, research institutes and industry. We believe that the MCSG can offer multiple levels of expertise and technology to research groups in a wide range of disciplines. We have held, and plan to continue holding, workshops and training sessions to promote technology and structural genomics science. Our interaction with the community involves research, training of personnel, and sharing of protocols, procedures, equipment and tools. We also engage actively in collaborations that extend research done at the MCSG into functional studies and studies of the most challenging systems.

Our collaborations can be divided into several categories. Protein function-centered collaborations aim to better communicate scientific advances through the primary literature. They can involve protein crystallographers, non-PSI scientists and PSI scientists from other centers. In the past two years these collaborations resulted in 14 publications, including papers in Science, Nature and PNAS, describing a novel virulence factor, a membrane protein and enzymes.

We have also developed what can be described as strategic collaborations. These aim to launch scientific programs that are uniquely suited to the scale and throughput with which the MCSG can operate. These collaborations involve researchers with large-scale interests, institutions and consortia. Some examples are listed below:

  1. Great Lakes Regional Center of Excellence - Essential genes and virulence factors in Gram-positive human bacterial pathogens.
    The focus of this project is on proteins from human bacterial pathogens that represent major health threats. Many of these proteins are of biomedical interest. Bioinformatic analysis has identified numerous candidates for essential genes and virulence factors in Gram-positive bacteria. Because of their importance and potential as drug targets, determining their structures and confirming their essentiality and/or role in virulence is important. Several papers have been published as a result of this collaboration. Some of these projects yield drug targets; for example, Sortase B from B. anthracis, an MCSG target in PSI-1, has been targeted for inhibitors and used to visualize interactions of inhibitors with the enzyme. This work uses MCSG HTP technologies and was recently published (Maresso et al. 2007, JBC). In addition, the crystal structure of CapD from B. anthracis, a major virulence factor, has been determined in complex with a small peptide.
  2. University of British Columbia and University of Toronto - Identification and structural characterization of antibiotic-recognizing TetR transcription factors from Streptomyces and ligand-binding domains of 200 prioritized transcription factors.
    The regulation of transcription is a highly complex process that depends upon a number of factors, most notably the presence of DNA binding proteins and small ligands, as well as local DNA structure. Transcription regulators control many aspects of cell metabolism, growth, transport across membranes, survival and differentiation and are therefore related to pathogenicity and human disease. Transcription factors are activated by a small signal molecule from inside or outside the cell. They often bind upstream from a gene to either enhance or repress transcription of a gene or set of genes by assisting or blocking RNA polymerase binding. Transcription factors are typically modular proteins and have one or more effector domains (DNA binding & activator domain). They frequently function in conjunction with other transcription factors that may also bind directly or indirectly to the DNA promoter region. At the MCSG, we have systematically studied structures of a number of bacterial transcription factors. Currently, structures of over 115 DNA and RNA binding proteins have been determined, including over 29 TetR-family regulators as well as several IclR and MarR regulators.
  3. Loyola University Medical Center - Interactions of spore coat proteins in Bacillus
    The formation of bacterial spores is very poorly understood, yet critical to understanding the cell cycle and pathogenicity of bacteria such as B. anthracis, B. subtilis, B. cereus and other organisms. Spore formation appears to be a very complex process and involves the interaction of at least 80-100 gene products. Adam Driks' laboratory has identified several interacting spore coat proteins. We have attempted the co-expression of these proteins using our LIC duet system. The complex between CotE and CotS has been produced and characterized, and small crystals have been obtained.
  4. TIGR - Proteins from bacterial and plant genomes
  5. UCSD - 40 prioritized virulence factors from P. syringae pv. tomato DC3000
  6. Midwest Regional Center of Excellence - Proteins from pathogenic viruses
  7. Structural Genomics Consortium - Structure determination using HKL3000 of SGC targets and transfer/sharing of technology (e.g. chemical screens to MCSG, reductive methylation strategy to SGC)
  8. Enzyme Genomics group at University of Toronto - 20 bacterial haloacid dehalogenases
  9. McMaster University - 200 conserved proteins from alpha proteobacteria
  10. NCBI/University of Toronto - the microbial RNA silencing system (CASS)

MCSG Scientific Community Activities

  1. Interactions with other PSI centers and participation in the PSI Network activities.
    Collaboration with large production centers includes target selection (BIG4), coordination of efforts, and sharing of procedures (NYSGRC) and expression vectors (CESG). We have established active collaborations for structure determination using NMR. For example, some proteins that failed to crystallize in the MCSG pipeline were sent to NESG and CESG for NMR work. We have provided a list of membrane protein targets for technology centers. We have established a collaboration with the ISFI technology center to develop chaperones for protein co-crystallization and to apply protein surface entropy reduction to improve protein crystallization. We collaborate with ATCG3D on improving gene synthesis and microfluidic crystallization.
  2. MCSG Biomedical Themes.
    MCSG is pursuing several larger projects with significant biomedical implications. These projects involve close interactions with collaborators from outside of the PSI, who provide functional and genetic studies that are not supported by the PSI-2 program.
    1. Cloning, expression and characterization of all stress proteins in C. elegans, H. sapiens and yeast – a collaboration with Drs. R. Morimoto (Northwestern University) and E. Craig (Univ. of Wisconsin, Madison)
      Molecular chaperones, inducible by heat shock and a variety of other stresses, have critical roles in protein homeostasis, balancing cell stress with adaptation, survival, and cell death mechanisms. In transformed cells and tumors, chaperones are frequently overexpressed, with constitutive activation of the heat shock transcription factor HSF1 implicated in tumor formation. Numerous human diseases, including neurodegenerative diseases, are associated with the chronic expression of misfolded and aggregation-prone proteins. The objective of this project is to determine structures of all stress proteins and their important complexes. So far we have selected 66 stress proteins identified through sequence and biochemical analysis and designed 384 constructs for expression. cDNA samples were collected and we designed an experiment in which the 384 constructs were cloned into two vectors: (i) pMCSG7, a standard LIC MCSG vector for protein expression, and (ii) pBH31SA, a vector that secretes proteins into the periplasmic space. We processed all 384 clones and tested them for expression and solubility. The pMCSG7 vector was superior to the secretion vector. All clones were sequenced, and cloning was repeated with new template DNA for constructs with errors. So far we have obtained 158 promising soluble clones (41%). Analysis suggests that ~40% of the constructs, combined across the two vectors, produce soluble proteins, although some groups of constructs did not express any protein or expressed it only at a very low level. We have purified all proteins and attempted crystallization. Several crystals were obtained and we determined 8 structures. In parallel, Morimoto's and Craig's laboratories identified several interacting partners, and we will attempt co-crystallization of complexes.
    2. HP1 proteins involved in cancer - collaboration with Dr. Raul Urrutia (Mayo Clinic Cancer Center)
      Heterochromatin Protein 1 (HP1) was first discovered as a dominant suppressor of position-effect variegation and a major component of heterochromatin. The HP1 family is evolutionarily conserved, with members in fungi, plants and animals but not prokaryotes. The N-terminal chromodomain binds methylated lysine 9 of histone H3, causing transcriptional repression. The highly conserved C-terminal chromoshadow domain enables dimerization and also serves as a docking site for proteins involved in a wide variety of nuclear functions, from transcription to nuclear architecture. In addition, it appears that HP1 proteins have diverse roles in the nucleus, including the regulation of euchromatic genes. Mammalian heterochromatic proteins HP1alpha, HP1beta and the pan-nuclear HP1gamma are considered 'gatekeepers' of methyl-K9-H3-mediated silencing. The existence of an HP1-mediated 'silencing subcode' that underlies the instructions of the histone code has been suggested. We have expressed, purified and crystallized human HP1alpha and HP1gamma. Small crystals of both proteins were obtained; however, only the HP1gamma crystals were of X-ray quality. They diffracted to 1.8 Å resolution, and a native dataset was collected. We are in the process of preparing heavy atom derivatives (both Se-Met and regular heavy atom soaks) for structure determination.
    3. Nuclear receptors – collaboration with Dr. G. Greene (Univ. of Chicago).
      Nuclear receptors modulate transcription through ligand-mediated recruitment of transcriptional co-regulator proteins. The structural connection between ligand and co-regulator is mediated by a molecular switch. The dynamics of this switch are thought to underlie ligand specificity in nuclear receptor signaling, but the details of this control mechanism have remained elusive. Estrogen receptor (ER) serves as a model system to study ligand-mediated interactions with co-regulator proteins. Two papers were published in 2007.
    4. Insulin-degrading enzyme (IDE) is an evolutionarily conserved zinc-metalloprotease crucial for the clearance of insulin, amyloid-β and amylin - a collaboration with Drs. Marsha Rosner and Wei-Jen Tang (Univ. of Chicago).
      Insulin-degrading enzyme, a Zn2+-metalloprotease, is involved in the clearance of insulin and amyloid-beta. The structures of human IDE in complex with four substrates (insulin B chain, amyloid-beta peptide (1-40), amylin and glucagon) have been determined using the MCSG HTP pipeline and the results were published in Nature in 2006. IDE selectively uses the size and charge distribution of the substrate-binding cavity to entrap structurally diverse polypeptides. The enclosed substrate undergoes conformational changes to form beta-sheets with two discrete regions of IDE for its degradation.