-
Notifications
You must be signed in to change notification settings - Fork 0
Expand file tree
/
Copy pathserial_crystallography.html
More file actions
275 lines (237 loc) · 23.6 KB
/
serial_crystallography.html
File metadata and controls
275 lines (237 loc) · 23.6 KB
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
<!DOCTYPE html>
<html lang="en" data-content_root="./">
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" /><meta name="viewport" content="width=device-width, initial-scale=1" />
<title>Processing serial synchrotron crystallography (SSX) datasets with xia2 — xia2 documentation</title>
<link rel="stylesheet" type="text/css" href="_static/pygments.css?v=03e43079" />
<link rel="stylesheet" type="text/css" href="_static/basic.css?v=b08954a9" />
<link rel="stylesheet" type="text/css" href="_static/alabaster.css?v=27fed22d" />
<link rel="stylesheet" href="_static/xia2.css" type="text/css" />
<script src="_static/documentation_options.js?v=5929fcd5"></script>
<script src="_static/doctools.js?v=fd6eb6e6"></script>
<script src="_static/sphinx_highlight.js?v=6ffebe34"></script>
<link rel="index" title="Index" href="genindex.html" />
<link rel="search" title="Search" href="search.html" />
<link rel="next" title="Overcoming problems with the detector geometry for SSX data" href="xia2-ssx-geometry-refinement.html" />
<link rel="prev" title="Multi-crystal data reduction with xia2.multiplex" href="multiplex-multi-crystal.html" />
<link rel="stylesheet" href="_static/custom.css" type="text/css" />
</head><body>
<!-- <div class="logoheader container"> -->
<!-- <a href="index.html"> -->
<!-- <img class="logoheader" alt="DIALS" src="_static/dials_header.png" /> -->
<!-- </a> -->
<!-- </div> -->
<div class="document">
<div class="documentwrapper">
<div class="bodywrapper">
<div class="body" role="main">
<section id="processing-serial-synchrotron-crystallography-ssx-datasets-with-xia2">
<h1>Processing serial synchrotron crystallography (SSX) datasets with xia2<a class="headerlink" href="#processing-serial-synchrotron-crystallography-ssx-datasets-with-xia2" title="Link to this heading">¶</a></h1>
<p>xia2 is able to processes serial synchrotron data through a SSX
pipeline built upon data analysis programs from DIALS.</p>
<p>To process serial data from images to a merged MTZ file, the
minimal recommended example command is:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">xia2</span><span class="o">.</span><span class="n">ssx</span> <span class="n">template</span><span class="o">=../</span><span class="n">lyso_1_</span><span class="c1">###.cbf space_group=P43212 \</span>
<span class="n">unit_cell</span><span class="o">=</span><span class="mf">79.1</span><span class="p">,</span><span class="mf">79.1</span><span class="p">,</span><span class="mf">38.2</span><span class="p">,</span><span class="mi">90</span><span class="p">,</span><span class="mi">90</span><span class="p">,</span><span class="mi">90</span> <span class="n">reference</span><span class="o">../</span><span class="n">lyso</span><span class="o">.</span><span class="n">mtz</span>
</pre></div>
</div>
<p>i.e. the minimal required information is the location of the images, the expected
space group and unit cell. A suitable reference file, containing a reference set of
intensities or pdb model (<code class="samp docutils literal notranslate"><span class="pre">.mtz,</span> <span class="pre">.cif,</span> <span class="pre">.pdb</span></code>) is also recommended to enable data
reduction using a reference.
Typically, a reference geometry and mask should also be provided, as described below.</p>
<p><strong>Table of contents of xia2 SSX processing documentation</strong></p>
<div class="line-block">
<div class="line"><a class="reference internal" href="#dataintegration"><span class="std std-ref">Data Integration</span></a></div>
<div class="line"><a class="reference internal" href="#datareduction"><span class="std std-ref">Data Reduction</span></a></div>
<div class="line"><a class="reference internal" href="#geometryrefinement"><span class="std std-ref">Overcoming difficulties in geometry refinement</span></a></div>
<div class="line"><a class="reference internal" href="#mergingingroups"><span class="std std-ref">Merging in groups</span></a></div>
<div class="line"><a class="reference internal" href="#exportingunmerged"><span class="std std-ref">Exporting unmerged data</span></a></div>
</div>
<section id="data-integration">
<span id="dataintegration"></span><h2>Data Integration<a class="headerlink" href="#data-integration" title="Link to this heading">¶</a></h2>
<p>To integrate data, the images must be provided, plus the space group to use for
processing and an estimate of the unit cell.
If the space group or unit cell are not known, the pipeline can be initially
run without these; spotfinding and indexing will be performed, and the most
populous unit cell clusters will be presented to the user.</p>
<p>The location of the image data must be specified by using one of the options
<code class="samp docutils literal notranslate"><span class="pre">image=</span></code>, <code class="samp docutils literal notranslate"><span class="pre">template=</span></code> or <code class="samp docutils literal notranslate"><span class="pre">directory=</span></code> (multiple instances of
one option is allowed). Here <code class="samp docutils literal notranslate"><span class="pre">image</span></code> corresponds to an image file
(e.g. <code class="samp docutils literal notranslate"><span class="pre">.cbf</span></code> or <code class="samp docutils literal notranslate"><span class="pre">.h5</span></code>), <code class="samp docutils literal notranslate"><span class="pre">template</span></code> is a file template for a
sequence of images (e.g. for files <code class="samp docutils literal notranslate"><span class="pre">lyso_1_001.cbf</span></code> to <code class="samp docutils literal notranslate"><span class="pre">lyso_1_900.cbf</span></code>, the matching
template is <code class="samp docutils literal notranslate"><span class="pre">lyso_1_###.cbf</span></code>), and <code class="samp docutils literal notranslate"><span class="pre">directory</span></code> is a path to a
directory containing image files.
For data in h5 format, the option <code class="samp docutils literal notranslate"><span class="pre">image=/path/to/image_master.h5</span></code> is recommended.
For cbf data, the file template option is recommended, e.g.:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">xia2</span><span class="o">.</span><span class="n">ssx</span> <span class="n">template</span><span class="o">=../</span><span class="n">lyso_1_</span><span class="c1">###.cbf space_group=P43212 \</span>
<span class="n">unit_cell</span><span class="o">=</span><span class="mf">79.1</span><span class="p">,</span><span class="mf">79.1</span><span class="p">,</span><span class="mf">38.2</span><span class="p">,</span><span class="mi">90</span><span class="p">,</span><span class="mi">90</span><span class="p">,</span><span class="mi">90</span>
</pre></div>
</div>
<p>The overall sequence of the data integration part of the pipeline is as follows.
In the absence of an explicitly provided reference geometry, spotfinding and
indexing will be run on batches of 1000 images at a time, until a threshold number
of crystals have been indexed (250 by default, as specified by the
<code class="samp docutils literal notranslate"><span class="pre">geometry_refinement.n_crystals</span></code> parameter). Then joint refinement of the
detector geometry is run to determine an improved geometry model. Following this,
the refined geometry is used to perform spotfinding, indexing and integration of
all images, with feedback provided on batches of 1000 images at a time. This batch
size can be adjusted with the <code class="samp docutils literal notranslate"><span class="pre">batch_size</span></code> parameter. The number of available
processes will be determined automatically (optionally a value for <code class="samp docutils literal notranslate"><span class="pre">nproc</span></code> can be given),
and parallel processing will be performed within the batch.</p>
<p>A DIALS reference geometry file (<code class="samp docutils literal notranslate"><span class="pre">refined.expt</span></code>) can be provided as input
with the option <code class="samp docutils literal notranslate"><span class="pre">reference_geometry=</span></code>, which will be used instead of
performing a joint refinement on the data. A mask file created from the DIALS
image viewer can also be provided with the option <code class="samp docutils literal notranslate"><span class="pre">mask=</span></code>, which will be
used in spotfinding and integration. A few other common options to set are the
<code class="samp docutils literal notranslate"><span class="pre">max_lattices</span></code> parameter, which determines the number of lattices to search
for on each image in indexing, and <code class="samp docutils literal notranslate"><span class="pre">d_min</span></code> which controls the resolution
limit for spotfinding and integration.</p>
<p>To see the full list of options and their descriptions, run <code class="samp docutils literal notranslate"><span class="pre">xia2.ssx</span> <span class="pre">-ce2</span> <span class="pre">-a1</span></code>.
Change the number after <code class="samp docutils literal notranslate"><span class="pre">-ce</span></code> to a value from 0 to 3 to see different
“expert levels” of program parameters. Note that a phil options file can be
provided for each of the DIALS programs, to allow further customisation of the
options for the individual programs. Additionally, stepwise processing can be
performed by running the program multiple times with the option <code class="samp docutils literal notranslate"><span class="pre">steps=find_spots</span></code>,
then <code class="samp docutils literal notranslate"><span class="pre">steps=index</span></code> and finally <code class="samp docutils literal notranslate"><span class="pre">steps=integrate</span></code>.</p>
</section>
<section id="data-reduction">
<span id="datareduction"></span><h2>Data Reduction<a class="headerlink" href="#data-reduction" title="Link to this heading">¶</a></h2>
<p>Following data integration, data reduction (reindexing, scaling and merging) will
be performed. The data reduction can be run separately to the full pipeline through
the command <code class="samp docutils literal notranslate"><span class="pre">xia2.ssx_reduce</span></code>, taking integrated data as input, e.g.:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">xia2</span><span class="o">.</span><span class="n">ssx_reduce</span> <span class="o">../</span><span class="n">xia2_ssx</span><span class="o">/</span><span class="n">batch_</span><span class="o">*/</span><span class="n">integrated</span><span class="o">*.</span><span class="p">{</span><span class="n">expt</span><span class="p">,</span><span class="n">refl</span><span class="p">}</span>
</pre></div>
</div>
<p>To run only the data integration without reduction, use the option
<code class="samp docutils literal notranslate"><span class="pre">steps=find_spots+index+integrate</span></code> (i.e. omit <code class="samp docutils literal notranslate"><span class="pre">+reduce</span></code>) when running <code class="samp docutils literal notranslate"><span class="pre">xia2.ssx</span></code>.</p>
<p>The data reduction process consists of unit cell filtering, followed by indexing
ambiguity resolution in batches (if ambiguities are possible due to lattice
and space group symmetries), followed by scaling and merging.</p>
<p>If a reference dataset/PDB model is
provided with the option <code class="samp docutils literal notranslate"><span class="pre">reference=</span></code>, then reindexing and scaling is performed
in parallel in batches of at least <code class="samp docutils literal notranslate"><span class="pre">reduction_batch_size</span></code> crystals, using intensities
extracted/generated from the reference as a reference when reindexing and scaling.
It is recommended to use a high-quality reference set of intensities in preference to genering
a set of intensities from a PDB model, to give a higher accuracy. If generating intensities
from a PDB model, the default bulk solvent parameters (<code class="samp docutils literal notranslate"><span class="pre">k_sol</span></code> and <code class="samp docutils literal notranslate"><span class="pre">b_sol</span></code>) should
be adjusted to suitable values.
If there is no reference given, the scaling is not performed in parallel. Other important
options are setting <code class="samp docutils literal notranslate"><span class="pre">anomalous=True/False</span></code> and specifying a <code class="samp docutils literal notranslate"><span class="pre">d_min</span></code> value.
To evaluate the success of indexing ambiguity resolution, it is important to inspect
the html output from dials.cosym jobs in the <code class="samp docutils literal notranslate"><span class="pre">data_reduction\reindex</span></code> folder.
To see the full list of data reduction parameters and their descriptions,
run <code class="samp docutils literal notranslate"><span class="pre">xia2.ssx_reduce</span> <span class="pre">-ce3</span> <span class="pre">-a2</span></code>. The output of the data reduction pipeline
is a merged MTZ file which can be taken onwards for structure determination.</p>
<p><code class="docutils literal notranslate"><span class="pre">xia2.ssx_reduce</span></code> also supports quick remerging of already processed data by using the option
<code class="docutils literal notranslate"><span class="pre">steps=merge</span></code>. This can be useful for aggregating multiple processing jobs processed
using the same reference model, or generating MTZ files with specified resolution
cutoffs e.g.:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">xia2</span><span class="o">.</span><span class="n">ssx_reduce</span> <span class="o">../</span><span class="n">xia2_ssx_reduce</span><span class="o">/</span><span class="n">DataFiles</span><span class="o">/</span><span class="n">scaled</span><span class="o">*.</span><span class="p">{</span><span class="n">expt</span><span class="p">,</span><span class="n">refl</span><span class="p">}</span> <span class="n">steps</span><span class="o">=</span><span class="n">merge</span> <span class="n">d_min</span><span class="o">=</span><span class="mf">1.8</span>
<span class="n">xia2</span><span class="o">.</span><span class="n">ssx_reduce</span> <span class="n">steps</span><span class="o">=</span><span class="n">merge</span> <span class="o">../</span><span class="p">{</span><span class="n">chip1</span><span class="p">,</span><span class="n">chip2</span><span class="p">}</span><span class="o">/</span><span class="n">DataFiles</span><span class="o">/</span><span class="n">scaled</span><span class="o">*.</span><span class="p">{</span><span class="n">expt</span><span class="p">,</span><span class="n">refl</span><span class="p">}</span>
</pre></div>
</div>
</section>
<section id="overcoming-difficulties-in-geometry-refinement">
<span id="geometryrefinement"></span><h2>Overcoming difficulties in geometry refinement<a class="headerlink" href="#overcoming-difficulties-in-geometry-refinement" title="Link to this heading">¶</a></h2>
<p>The first step of processing with <code class="samp docutils literal notranslate"><span class="pre">xia2.ssx</span></code> is to improve the detector geometry in order to
improve the indexing rate. The starting assumption is that the initial geometry is good enough to
successfully index a fraction of the images. When this is not the case, it will be necessary to
change some of the program parameters. Examples and suggestions of how to tackle trickier
processing cases are described in the link below.</p>
<div class="toctree-wrapper compound">
<ul>
<li class="toctree-l1"><a class="reference internal" href="xia2-ssx-geometry-refinement.html">Overcoming problems with the detector geometry</a><ul>
<li class="toctree-l2"><a class="reference internal" href="xia2-ssx-geometry-refinement.html#setting-the-detector-distance-beam-centre">Setting the detector distance/beam centre</a></li>
<li class="toctree-l2"><a class="reference internal" href="xia2-ssx-geometry-refinement.html#relaxing-restraints-in-refinement">Relaxing restraints in refinement</a></li>
<li class="toctree-l2"><a class="reference internal" href="xia2-ssx-geometry-refinement.html#refining-multi-panel-detectors">Refining multi-panel detectors</a></li>
</ul>
</li>
</ul>
</div>
</section>
<section id="merging-in-groups">
<span id="mergingingroups"></span><h2>Merging in groups<a class="headerlink" href="#merging-in-groups" title="Link to this heading">¶</a></h2>
<p><code class="docutils literal notranslate"><span class="pre">xia2.ssx_reduce</span></code> also supports merging in groups to handle more complex experiments such
as dose series experiments. This is described in more detail in the links below.</p>
<div class="toctree-wrapper compound">
<ul>
<li class="toctree-l1"><a class="reference internal" href="xia2-ssx-dose-series.html">Dose-series experiments and merging in groups</a><ul>
<li class="toctree-l2"><a class="reference internal" href="xia2-ssx-dose-series.html#dose-series-the-dose-series-repeat-option">Dose series - the <em>dose_series_repeat</em> option</a></li>
<li class="toctree-l2"><a class="reference internal" href="xia2-ssx-dose-series.html#dose-series-using-a-grouping-yml-file">Dose series - using a <em>grouping.yml</em> file</a></li>
<li class="toctree-l2"><a class="reference internal" href="xia2-ssx-dose-series.html#generalised-merge-grouping-on-metadata">Generalised merge grouping on metadata</a></li>
</ul>
</li>
</ul>
</div>
</section>
<section id="exporting-unmerged-data">
<span id="exportingunmerged"></span><h2>Exporting unmerged data<a class="headerlink" href="#exporting-unmerged-data" title="Link to this heading">¶</a></h2>
<p>Merged data (in MTZ format) is the standard output of <code class="docutils literal notranslate"><span class="pre">xia2.ssx</span></code>, however unmerged scaled data files
(in mmCIF format) can be generated using the tools from DIALS (note that this requires a
DIALS version later than v3.20).</p>
<p>If data were reduced with a reference, there may be more than one set of scaled reflection and experiment files.
In this case, these must first be combined with <code class="docutils literal notranslate"><span class="pre">dials.combine_experiments</span></code>, before using <code class="docutils literal notranslate"><span class="pre">dials.export</span></code> to
export to mmcif format:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">dials</span><span class="o">.</span><span class="n">combine_experiments</span> <span class="n">data_reduction</span><span class="o">/</span><span class="n">scale</span><span class="o">/</span><span class="n">scaled</span><span class="o">*.</span><span class="p">{</span><span class="n">expt</span><span class="p">,</span><span class="n">refl</span><span class="p">}</span>
<span class="n">dials</span><span class="o">.</span><span class="n">export</span> <span class="n">combined</span><span class="o">.*</span> <span class="nb">format</span><span class="o">=</span><span class="n">mmcif</span>
</pre></div>
</div>
<p>If data were not reduced with a reference, then one can just use <code class="docutils literal notranslate"><span class="pre">dials.export</span></code>:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">dials</span><span class="o">.</span><span class="n">export</span> <span class="n">data_reduction</span><span class="o">/</span><span class="n">scale</span><span class="o">/</span><span class="n">scaled</span><span class="o">.</span><span class="p">{</span><span class="n">expt</span><span class="p">,</span><span class="n">refl</span><span class="p">}</span> <span class="nb">format</span><span class="o">=</span><span class="n">mmcif</span>
</pre></div>
</div>
<p>mmCIF is a standardised format that is able to describe unmerged diffraction data, and the output scaled.cif file conforms to the v5
standard <a class="reference external" href="https://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v50.dic/Index/">https://mmcif.wwpdb.org/dictionaries/mmcif_pdbx_v50.dic/Index/</a>. Note that during export to mmcif, the overall scale of the data
can change to avoid large negative intensity values, which may be present (with comparably large sigmas) for data scaled against a reference.
The unmerged mmcif data file can be understood with the gemmi program. For example, it can be converted to unmerged MTZ with gemmi and sorted with CCP4’s sortmtz for
further analysis:</p>
<div class="highlight-default notranslate"><div class="highlight"><pre><span></span><span class="n">gemmi</span> <span class="n">cif2mtz</span> <span class="n">scaled</span><span class="o">.</span><span class="n">cif</span> <span class="n">scaled</span><span class="o">.</span><span class="n">mtz</span>
<span class="n">echo</span> <span class="n">H</span> <span class="n">K</span> <span class="n">L</span> <span class="n">M</span><span class="o">/</span><span class="n">ISYM</span> <span class="o">|</span><span class="n">sortmtz</span> <span class="n">HKLIN</span> <span class="n">scaled</span> <span class="n">HKLOUT</span> <span class="nb">sorted</span>
</pre></div>
</div>
</section>
</section>
</div>
</div>
</div>
<div class="sphinxsidebar" role="navigation" aria-label="Main">
<div class="sphinxsidebarwrapper">
<p class="logo">
<a href="index.html">
<img class="logo" src="_static/xia2-new-logo.gif" alt="Logo" />
</a>
</p>
<h3>Navigation</h3>
<ul class="current">
<li class="toctree-l1"><a class="reference internal" href="quick_start.html">Getting started</a></li>
<li class="toctree-l1"><a class="reference internal" href="using_xia2.html">Using xia2</a></li>
<li class="toctree-l1"><a class="reference internal" href="installation.html">Installation</a></li>
<li class="toctree-l1"><a class="reference internal" href="introductory_example.html">Introductory example</a></li>
<li class="toctree-l1"><a class="reference internal" href="insulin_tutorial.html">Insulin tutorial</a></li>
<li class="toctree-l1"><a class="reference internal" href="multi_crystal.html">Multi-crystal data</a></li>
<li class="toctree-l1"><a class="reference internal" href="multiplex-multi-crystal.html">Multi-crystal data reduction (xia2.multiplex)</a></li>
<li class="toctree-l1 current"><a class="current reference internal" href="#">Serial-crystallography</a></li>
<li class="toctree-l1"><a class="reference internal" href="program_output.html">Program output</a></li>
<li class="toctree-l1"><a class="reference internal" href="parameters.html">Parameters</a></li>
<li class="toctree-l1"><a class="reference internal" href="comments.html">Comments</a></li>
<li class="toctree-l1"><a class="reference internal" href="background.html">History</a></li>
<li class="toctree-l1"><a class="reference internal" href="acknowledgements.html">Acknowledgements</a></li>
<li class="toctree-l1"><a class="reference internal" href="release_notes.html">Release notes</a></li>
<li class="toctree-l1"><a class="reference internal" href="license.html">License</a></li>
</ul>
</div>
</div>
<div class="clearer"></div>
</div>
<div class="footer container">
<a href="http://www.diamond.ac.uk/"><img class="logofooter" alt="Diamond" src="_static/diamond_logo.png" /></a>
<a href="http://dials.diamond.ac.uk/"><img class="logofooter" alt="DIALS" src="_static/dials_logo_fx_transparent_bg.png" /></a>
<a href="http://www.ccp4.ac.uk/"><img class="logofooter" alt="CCP4" src="_static/CCP4-logo-plain.png" /></a>
</div>
<div class="footer">
©2020, Diamond Light Source.
</div>
</body>
</html>