
2019 Model Metrics Challenge

last update: July 28, 2020


Goals

Identify metrics most suitable for evaluating and comparing fit of atomic coordinate models into cryo-EM maps for specimens in the 1.5-4.0 Å reported overall resolution range.

Specific metrics for review:

  1. Model geometry (including Ramachandran statistics, rotamers, clashes, EMRinger, CaBLAM)
  2. Overall fit of model into map density, per residue and per atom
  3. Domain or secondary-structure element fit
  4. Resolvability at the residue or atom level
  5. Recommended practice for optimizing atomic displacement parameters (B-factors)

Targets

There are two target specimens for this challenge. 

  • Human Heavy-chain Apoferritin: a series of three maps is provided (#1 - #3), which differ only in the number of particles used for reconstruction.  These maps were chosen so that different metrics can be carefully compared/contrasted at different resolutions.
  • Horse Liver Alcohol Dehydrogenase: one map is provided (#4). This structure has the extra challenge of fitting a ligand as well as the protein chain.

Target Map Download: You can use this rsync script. Alternatively, you can download individual maps from EMDR atlas pages (click on EMDB id in the table below, select "download" tab).

T0101. Human Apoferritin
  EMDB entry: EMD-20026
  Reported resolution: 1.8 Å
  Sharpened/masked map: emd_20026.map
  Unsharpened/unmasked map: emd_20026_additional_2.map
  Single protomer map: emd_20026_additional_1.map
  Half-maps: emd_20026_half_map_1.map, emd_20026_half_map_2.map

T0102. Human Apoferritin
  EMDB entry: EMD-20027
  Reported resolution: 2.3 Å
  Sharpened/masked map: emd_20027.map
  Unsharpened/unmasked map: emd_20027_additional_1.map
  Single protomer map: emd_20027_additional_2.map
  Half-maps: emd_20027_half_map_1.map, emd_20027_half_map_2.map

T0103. Human Apoferritin
  EMDB entry: EMD-20028
  Reported resolution: 3.1 Å
  Sharpened/masked map: emd_20028.map
  Unsharpened/unmasked map: emd_20028_additional_1.map
  Single protomer map: emd_20028_additional_2.map
  Half-maps: emd_20028_half_map_1.map, emd_20028_half_map_2.map

T0104. Horse Liver Alcohol Dehydrogenase
  EMDB entry: EMD-0406
  Reported resolution: 2.9 Å
  Sharpened/masked map: emd_0406.map
  Unsharpened/unmasked map: emd_0406_additional.map
  Single protomer map: n/a
  Half-maps: emd_0406_half_map_1.map, emd_0406_half_map_2.map

The single protomer map identifies the required position for chain A in submitted models.

Human Apoferritin (T0101-T0103)
  Primary citation: unpublished
  Reference models: 3ajo (Xray), 2fha (Xray)
  Imposed map symmetry: Octahedral (O)
  Specimen MW: 21 kDa x 24-fold = 504 kDa
  Map contributors: Kaiming Zhang, Greg Pintilie, Shanshan Li, Wah Chiu

Horse Liver Alcohol Dehydrogenase (T0104)
  Primary citation: Herzik et al, 2019
  Reference models: 6nbb (EM), 2jhf (Xray)
  Imposed map symmetry: Cyclic (C2)
  Specimen MW: 40 kDa x 2-fold = 80 kDa
  Map contributors: Mark Herzik, Mengyu Wu, Gabe Lander

Reference models: models in bold will be used as references in analysis pipeline.

 

Modelling Instructions

  • Ab initio modelling is encouraged but not required (in deposition you will need to describe your modelling process including any starting model).
  • Regardless of the modelling method used, submitted models should be as complete and as accurate as possible (i.e., close to publication-ready).
  • For the apoferritin targets, use separate modelling processes for each (do not "cross" or "daisy-chain" datasets).
  • Fitting to either the unsharpened/unmasked map or one of the half-maps is strongly encouraged.
  • Submission in mmCIF format is strongly encouraged.

Human heavy chain Apoferritin (#1-3)

Deposit: Single subunit, chain A, with position given by single protomer map

Chain residue numbering starts with 1

Clarification: T=1, T=2, A=3, S=4, T=5, S=6, etc.

Symmetry Matrices (center at x=y=z=109.2 Angstroms)

Full Sequence (Uniprot P02794)

TTASTSQVRQNYHQDSEAAINRQINLELYASYVYLSMSYYFDRDDVALKNFAKYFLHQSHEEREHAEKLMKLQNQRGGRI
FLQDIKKPDCDDWESGLNAMECALHLEKNVNQSLLELHKLATDKNDPHLCDFIETHYLNEQVKAIKELGDHVTNLRKMGA
PESGLAEYLFDKHTLGDSDNES
Note (May 23): 3ajo follows this sequence exactly; 2fha has variation K86Q. We will accept models with either sequence but the above sequence is preferred.
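
The numbering rule and the K86Q note can be checked mechanically before deposition. The following is a minimal sketch, assuming the model's residues have already been read into a dictionary of residue number to one-letter code; the function names and the three-residue model fragment are hypothetical and only illustrate the idea.

```python
# Sequence copied from the "Full Sequence" listing for the apoferritin targets.
APOFERRITIN_SEQ = (
    "TTASTSQVRQNYHQDSEAAINRQINLELYASYVYLSMSYYFDRDDVALKNFAKYFLHQSHEEREHAEKLMKLQNQRGGRI"
    "FLQDIKKPDCDDWESGLNAMECALHLEKNVNQSLLELHKLATDKNDPHLCDFIETHYLNEQVKAIKELGDHVTNLRKMGA"
    "PESGLAEYLFDKHTLGDSDNES"
)


def number_residues(sequence, start=1):
    """Map residue number -> one-letter code, numbering from `start`."""
    return {start + i: aa for i, aa in enumerate(sequence)}


def find_mismatches(model, reference, allowed=()):
    """Return (number, model_aa, reference_aa) for residues that differ.

    `model` maps residue number -> one-letter code for the residues that
    were actually built; `allowed` lists accepted (number, aa) substitutions.
    """
    problems = []
    for num, aa in sorted(model.items()):
        ref = reference.get(num)  # None if the number is outside the sequence
        if aa != ref and (num, aa) not in allowed:
            problems.append((num, aa, ref))
    return problems


reference = number_residues(APOFERRITIN_SEQ)

# Hypothetical model fragment: residues 85-87 carrying the 2fha variation K86Q.
model = {85: "I", 86: "Q", 87: "K"}

print(find_mismatches(model, reference))                       # K86Q is reported
print(find_mismatches(model, reference, allowed={(86, "Q")}))  # K86Q is accepted
```

Residues that were not built are simply absent from `model`, so an incomplete chain does not produce false mismatches.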

Horse liver Alcohol Dehydrogenase (#4)

Deposit: either a single subunit (chain A) or the dimer (chains A and B). Chain positions should match those in 6nbb.

Chain residue numbering starts with 1

Clarification: S=1, T=2, A=3, G=4, K=5, V=6, etc.

For the associated NAD ligand, use the same chain ID as the protein and residue number 401.

Symmetry Matrices (2-fold at x=y=142.976 Angstroms)

Full Sequence (Uniprot P00327)

STAGKVIKCKAAVLWEEKKPFSIEEVEVAPPKAHEVRIKMVATGICRSDDHVVSGTLVTPLPVIAGHEAAGIVESIGEGV
TTVRPGDKVIPLFTPQCGKCRVCKHPEGNFCLKNDLSMPRGTMQDGTSRFTCRGKPIHHFLGTSTFSQYTVVDEISVAKI
DAASPLEKVCLIGCGFSTGYGSAVKVAKVTQGSTCAVFGLGGVGLSVIMGCKAAGAARIIGVDINKDKFAKAKEVGATEC
VNPQDYKKPIQEVLTEMSNGGVDFSFEVIGRLDTMVTALSCCQEAYGVSVIVGVPPDSQNLSMNPMLLLSGRTWKGAIFG
GFKSKDSVPKLVADFMAKKFALDPLITHVLPFEKINEGFDLLRSGESIRTILTF
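
For orientation, the 2-fold relationship between chains A and B can be sketched in a few lines. This is only an illustration: it assumes the 2-fold axis runs parallel to z through x=y=142.976 Å, which is our reading of the instruction above, and the function name is hypothetical. The symmetry matrices provided with the challenge remain the authoritative definition.

```python
def c2_mate(atoms, cx=142.976, cy=142.976):
    """Rotate (x, y, z) points by 180 degrees about an axis parallel to z.

    The axis is assumed to pass through (cx, cy); z is left unchanged.
    """
    return [(2 * cx - x, 2 * cy - y, z) for x, y, z in atoms]


# Hypothetical chain A coordinates in Angstroms (two atoms only).
chain_a = [(150.0, 140.0, 120.5), (151.2, 141.3, 121.0)]
chain_b = c2_mate(chain_a)
print(chain_b)
```

Applying the operation twice returns the original coordinates, which is a quick sanity check for any 2-fold operator.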

 

Process

  1. Participant teleconference to review the process and gather recommendations for deposition data collection and the automated model comparison pipeline (data collection and analysis from the previous round are summarized here and here).
  2. Participants prepare/upload their best models for each target (team approach to modelling is welcome). 
  3. Initial (blinded) analyses of deposited models will be performed via the automated model comparison pipeline (guided by recommendations in step #1).
  4. The participant panel will meet to review the results at the June face-to-face meeting and recommend next steps.

 

Timeline

17 April  Open Challenge
26 April, 11 AM US ET Participant Teleconference
 1 May  Deposition form opens to collect participant models
28 May, 3 PM US ET (originally 25 May)  Deadline for depositing models
29 May (originally 27 May)  Deposited models and metadata made available for assessment (blinded)
 6 June  Results of model-compare pipeline made available
 13-15 June  Participant/Assessor Face-to-Face Meeting

 

FAQ

Q1: Alcohol Dehydrogenase Target: Why are the half-maps (512x512x512) so much larger than the primary deposited map (368x368x368) for EMD-0406? Answered by Mark Herzik: The half maps are unfiltered and completely unmodified from RELION’s output. We did not think it was necessary for purveyors of the EMDB to download a ~0.5 GB map of alcohol dehydrogenase when most of the voxels (512x512x512 box size) would be just averaged noise. We used the larger box size during processing, despite the small mass of ADH, to prevent aliasing and CTF delocalization issues. It’s unclear what the established recommendations are in this regard, but we think it makes things easier for the user.
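
The "~0.5 GB" figure in this answer follows from the box size. A minimal sketch of the arithmetic, assuming 32-bit floating-point voxels and a 1024-byte MRC header with no extended header (both are assumptions about the files, not statements from the answer):

```python
def map_size_bytes(box, bytes_per_voxel=4, header_bytes=1024):
    """Approximate size of a cubic map file with `box` voxels per edge."""
    return box ** 3 * bytes_per_voxel + header_bytes


half_map = map_size_bytes(512)   # unmodified RELION half-map box
primary = map_size_bytes(368)    # cropped primary map box

print(half_map, primary)
print(round(half_map / primary, 2))  # size ratio between the two boxes
```

Under these assumptions the 512-voxel box comes to about 537 MB and the 368-voxel box to about 199 MB, so cropping reduces the download by roughly a factor of 2.7.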

Q2: It seems that the half-maps for EMD-20026 (1.8 Å) are better than the full maps based on CC values to a docked model (same model for all maps) [the other apoferritin entries have better CC value for full map vs half-map].  A digest of the subsequent discussion is provided here.
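
The question does not say how the CC values were computed. As a point of reference only, one common definition is the Pearson correlation between the voxel values of the experimental map and a map simulated from the docked model. The sketch below assumes both maps are already on the same grid and flattened into equal-length lists; the function name and the toy densities are hypothetical.

```python
import math


def map_correlation(density_a, density_b):
    """Pearson correlation between two equal-length lists of voxel values."""
    n = len(density_a)
    if n == 0 or n != len(density_b):
        raise ValueError("maps must be non-empty and sampled on the same grid")
    mean_a = sum(density_a) / n
    mean_b = sum(density_b) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in zip(density_a, density_b))
    var_a = sum((a - mean_a) ** 2 for a in density_a)
    var_b = sum((b - mean_b) ** 2 for b in density_b)
    if var_a == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant map")
    return cov / math.sqrt(var_a * var_b)


# Toy voxel values standing in for a model-derived map and an experimental map.
model_map = [0.0, 0.2, 0.9, 1.0, 0.3, 0.0]
full_map = [0.1, 0.3, 0.8, 1.1, 0.2, 0.0]
print(round(map_correlation(model_map, full_map), 3))
```

Because Pearson correlation is invariant to linear scaling, differences between sharpened, unsharpened, and half-maps under this definition reflect the shape of the density rather than its absolute scale. Masked or thresholded variants of the CC can rank maps differently.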

Q3: Will fully automated models be accepted or should we go through and correct errors?  In the challenge phase we want to collect "close to final" models (with errors identified/fixed as much as possible).

Q4: What is the goal of this challenge?  For challengers, it is to build the best quality model possible given the map data. For assessors, it is to decide what metrics are best for comparing models.

Q5: Should we develop new methods for this challenge?  We anticipate that everyone will make use of existing methods for modelling and assessment in this "short-timeline" round.

Q6: How many models can a modelling team submit? There is no limitation. Teams may submit multiple models per map.

Q7: What buffer was used to prepare the apoferritin sample?  Answered by Kaiming Zhang (May 14): 50 mM TrisHCl, pH 7.5, 150 mM NaCl.

Q8: How can I access the single protomer map to find the reference position for chain A (apoferritin target)? (added May 21).  The file name of the "additional" single protomer map included with the EMDB entry is listed in the target table.

Q9: Why are we asked to only submit a single chain model for the apoferritin targets? (added May 26).  This simplifies analysis. We will be able to create/analyze full complexes in a consistent way across all submissions using the symmetry matrices provided in the instructions.
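
To illustrate how a full complex can be generated from one chain, the sketch below enumerates the 24 proper rotations of the octahedral group and applies them about the stated center (x=y=z=109.2 Å). It assumes the 4-fold axes coincide with the map's x, y, and z axes, which is an assumption on our part. The symmetry matrices supplied in the instructions are authoritative, and all function names here are hypothetical.

```python
from itertools import permutations, product


def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def octahedral_rotations():
    """Signed permutation matrices with determinant +1 (24 in total)."""
    rotations = []
    for perm in permutations(range(3)):
        for signs in product((1, -1), repeat=3):
            m = [[0, 0, 0] for _ in range(3)]
            for row in range(3):
                m[row][perm[row]] = signs[row]
            if det3(m) == 1:  # keep proper rotations, drop reflections
                rotations.append(m)
    return rotations


def apply_about_center(rotation, point, center=109.2):
    """Rotate one (x, y, z) point about (center, center, center)."""
    shifted = [c - center for c in point]
    rotated = [sum(rotation[i][j] * shifted[j] for j in range(3)) for i in range(3)]
    return tuple(r + center for r in rotated)


def build_complex(chain_a, center=109.2):
    """Return one copy of `chain_a` per symmetry operation."""
    return [[apply_about_center(r, atom, center) for atom in chain_a]
            for r in octahedral_rotations()]


# Hypothetical single-atom "chain" in Angstroms.
copies = build_complex([(120.0, 130.0, 140.0)])
print(len(copies))
```

Using one fixed set of operators for every submission is what makes the assembled complexes directly comparable across modelling groups.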

Q10: Can you tell me my modeller group id?  These will be revealed near the end of the face-to-face meeting and posted here. 

 

EMDataResource Validation Challenges are supported by NIH National Institute of General Medical Sciences

Please send your challenge questions, comments and feedback to challenges@emdataresource.org
