lXtractor.protocols package

lXtractor.protocols.superpose module

A sandbox module that encapsulates high-level operations built on lXtractor’s core functionality.

class lXtractor.protocols.superpose.SuperposeOutput(ID_fix, ID_mob, RmsdSuperpose, Distance, Transformation)

Bases: tuple

Distance: Any

Alias for field number 3

ID_fix: str

Alias for field number 0

ID_mob: str

Alias for field number 1

RmsdSuperpose: float

Alias for field number 2

Transformation: tuple[ndarray, ndarray, ndarray]

Alias for field number 4
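The field layout above can be illustrated with a stand-in namedtuple. This is a minimal sketch that only mirrors the documented field order; the real class is the one defined in lXtractor.protocols.superpose, and the IDs and values below are hypothetical placeholders.

```python
from collections import namedtuple

# Stand-in mirroring the documented field order (not the library's own class).
SuperposeOutput = namedtuple(
    "SuperposeOutput",
    ["ID_fix", "ID_mob", "RmsdSuperpose", "Distance", "Transformation"],
)

# Hypothetical values; the transformation is documented as a tuple of three arrays.
out = SuperposeOutput("fixed_id", "mobile_id", 1.25, None, (None, None, None))

# Fields are reachable by name or by the index given in "Alias for field number N".
rmsd_by_name = out.RmsdSuperpose
rmsd_by_index = out[2]

# Being a tuple, an output can also be unpacked positionally.
id_fix, id_mob, rmsd, distance, transformation = out
```

Access by name is usually more readable than positional indexing when filtering or sorting many outputs.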

lXtractor.protocols.superpose.align_and_superpose_pair(pair, dist_fn, skip_aln_if_match)[source]

Use sequence alignment to subset each chain structure in pair to the commonly aligned residues and to the atoms shared within each aligned residue pair. Then use superpose_pair() to superpose the atom arrays of the subsetted chain structures.

Parameters:
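  • pair (tuple[tuple[str, ChainStructure, AtomArray | None], tuple[str, ChainStructure, AtomArray | None]]) – A pair of staged inputs. A staged input is a tuple with an identifier, a chain structure to superpose, and an optional atom array for the dist_fn.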

  • dist_fn (Callable[[AtomArray, AtomArray], Any] | None) – An optional distance function accepting two positional args: “fixed” atom array and superposed atom array.

  • skip_aln_if_match (str) – Passed to lXtractor.core.chain.subset_to_matching().

Returns:

A tuple with id_fixed, id_mobile, the RMSD of the superposed atoms, the calculated distance, and the transformation matrices.

Return type:

tuple[str, str, float, Any, tuple[ndarray, ndarray, ndarray]]

lXtractor.protocols.superpose.superpose_pair(pair, dist_fn)[source]

Superpose already prepared AtomArray objects and calculate the RMSD between them. Both atom arrays must have the same number of atoms.

Parameters:
  • pair (tuple[tuple[str, AtomArray, AtomArray | None], tuple[str, AtomArray, AtomArray | None]]) – A pair of staged inputs. A staged input is a tuple with an identifier, an atom array to superpose, and an optional atom array for the dist_fn.

  • dist_fn (Callable[[AtomArray, AtomArray], Any] | None) – An optional distance function accepting two positional args: “fixed” atom array and superposed atom array.

Returns:

A tuple with id_fixed, id_mobile, the RMSD of the superposed atoms, the calculated distance, and the transformation matrices.

Return type:

tuple[str, str, float, Any, tuple[ndarray, ndarray, ndarray]]
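The shape of a staged pair and of a dist_fn callable can be sketched as follows. This is an illustration under stated assumptions: the atom arrays are replaced by simple stand-in objects, coordinates are assumed to be exposed through a `coord` attribute (as in biotite's AtomArray), and `centroid_distance` is a hypothetical example function, not part of lXtractor. The sketch does not call superpose_pair() itself.

```python
from math import dist
from types import SimpleNamespace


def centroid(atoms):
    """Mean position of the atoms, read from the `coord` attribute."""
    coords = [tuple(float(v) for v in xyz) for xyz in atoms.coord]
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))


def centroid_distance(fixed, superposed):
    """A candidate dist_fn: two positional args, any return value."""
    return dist(centroid(fixed), centroid(superposed))


# Stand-ins for atom arrays (hypothetical; real inputs would be AtomArray objects).
atoms_fix = SimpleNamespace(coord=[(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)])
atoms_mob = SimpleNamespace(coord=[(0.0, 3.0, 0.0), (2.0, 5.0, 0.0)])

# A pair of staged inputs: (identifier, atoms to superpose, optional atoms for dist_fn).
pair = (("fix_id", atoms_fix, atoms_fix), ("mob_id", atoms_mob, atoms_mob))

# Apply the distance function to the third element of each staged input.
d = centroid_distance(pair[0][2], pair[1][2])
```

Because the return type of dist_fn is Any, the function may return a scalar, an array, or any other object; it ends up in the Distance field of the output.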

lXtractor.protocols.superpose.superpose_pairwise(fixed, mobile=None, selection_superpose=(None, None), selection_dist=None, dist_fn=None, *, strict=True, map_name=None, exclude_hydrogen=False, skip_aln_if_match='len', verbose=False, num_proc=1, **kwargs)[source]

Superpose pairs of structures. Two modes are available:

1. strict=True – potentially faster and more memory-efficient, and friendlier to parallelization. In this mode, after selection using the provided positions and atoms, the number of atoms in each fixed and mobile structure must match exactly.

2. strict=False – a “flexible” protocol. In this case, after the selection of atoms, there are two additional steps:

   1. Sequence alignment between the selected subsets. It’s guaranteed to produce the same number of residues in fixed and mobile, which may be fewer than the initially selected number (see subset_to_matching()).

   2. Following this, subset each pair of residues between fixed and mobile to a common list of atoms (see filter_to_common_atoms()).

As a result, the “flexible” mode may be suitable for distantly related structures, while the “strict” mode may be used whenever it’s guaranteed that the selection will produce the same sets of atoms between fixed and mobile.
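The two extra steps of the “flexible” mode can be sketched in plain Python. This is a simplified illustration, not the library's implementation: chains are modeled as lists of (one-letter residue code, set of atom names), and difflib.SequenceMatcher stands in for a real sequence alignment, so only identical residues are paired here. The helper names `matching_pairs` and `flexible_subset` are hypothetical.

```python
from difflib import SequenceMatcher


def matching_pairs(seq_fix, seq_mob):
    """Index pairs (i, j) of residues matched by the stand-in alignment."""
    sm = SequenceMatcher(None, seq_fix, seq_mob, autojunk=False)
    pairs = []
    for block in sm.get_matching_blocks():
        for k in range(block.size):
            pairs.append((block.a + k, block.b + k))
    return pairs


def flexible_subset(fixed, mobile):
    """Step 1: keep matched residues. Step 2: keep atoms common to each pair."""
    seq_fix = "".join(res for res, _ in fixed)
    seq_mob = "".join(res for res, _ in mobile)
    out_fix, out_mob = [], []
    for i, j in matching_pairs(seq_fix, seq_mob):
        common = sorted(fixed[i][1] & mobile[j][1])
        out_fix.append((fixed[i][0], common))
        out_mob.append((mobile[j][0], common))
    return out_fix, out_mob


backbone = {"N", "CA", "C", "O"}
fixed = [("G", backbone), ("A", backbone | {"CB"}), ("S", backbone | {"CB", "OG"})]
# The mobile alanine lacks CB, so CB is dropped from that residue pair.
mobile = [("A", backbone), ("S", backbone | {"CB", "OG"}), ("K", backbone | {"CB"})]

sub_fix, sub_mob = flexible_subset(fixed, mobile)
n_atoms_fix = sum(len(atoms) for _, atoms in sub_fix)
n_atoms_mob = sum(len(atoms) for _, atoms in sub_mob)
```

After both steps the two subsets contain the same residues and the same atoms per residue, which is the precondition that the “strict” mode expects the selection alone to satisfy.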

See also

lXtractor.util.structure.filter_selection_extended() – used to apply the selections.

Parameters:
  • fixed (Iterable[ChainStructure]) – An iterable over chain structures that won’t be moved.

  • mobile (Iterable[ChainStructure] | None) – An iterable over chain structures to superpose onto the fixed ones. If None, the combinations of the fixed structures are used instead.

  • selection_superpose (tuple[Sequence[int] | None, Sequence[Sequence[str]] | Sequence[str] | None] | Callable[[ChainStructure], AtomArray]) – A tuple with (residue positions, atom names) to select atoms for superposition, which will be applied to each fixed and mobile structure. If (None, None), all positions and atoms are used. Alternatively, a selector function accepting a chain structure and returning an atom array. If strict is False, the selected atom array is converted to a chain structure.

  • selection_dist (tuple[Sequence[int] | None, Sequence[Sequence[str]] | Sequence[str] | None] | Callable[[ChainStructure], AtomArray] | None) – Same as selection_superpose. In addition, accepts None to indicate an empty selection, in which case, dist_fn should also be None.

  • dist_fn (Callable[[AtomArray, AtomArray], Any] | None) – An optional distance function applied to a pair of superposed atom arrays, possibly different from the arrays selected for superposition, which is controlled via selection_dist.

  • map_name (str | None) – The name of a mapping used for positions in both selection arguments. If provided, it must exist within the Seq of each fixed and mobile structure. A good candidate is a mapping to a reference sequence or Alignment.

  • exclude_hydrogen (bool) – Exclude all hydrogen atoms during selection.

  • strict (bool) – Enable/disable the “strict” protocol. See the explanation above.

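  • skip_aln_if_match (str) – Skip the sequence alignment if this field matches between the fixed and mobile structures. Relevant to the “flexible” protocol, where the alignment takes place; see subset_to_matching().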

  • verbose (bool) – Display a progress bar.

  • num_proc (int) – The number of parallel processes. Large selections may consume a lot of RAM when processed in parallel, so increase this value with caution.

  • kwargs – Passed to ProcessPoolExecutor.map(). Useful for controlling chunksize and timeout parameters.

Returns:

A generator of SuperposeOutput namedtuples, each containing the IDs of the superposed objects, the RMSD between the superposed structures, the distance function output, and the transformation matrices.

Return type:

Generator[SuperposeOutput, None, None]
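Consuming the returned generator might look like the sketch below. It rests on assumptions: `best_hits` and `run` are hypothetical helpers, the selection `(None, ["CA"])` is only one possible choice of (positions, atom names), and the import is deferred so that the sketch loads without lXtractor installed. The demonstration at the bottom uses stand-in records with the documented field names rather than real superposition results.

```python
from collections import namedtuple


def best_hits(results):
    """For each fixed ID, keep the output with the lowest RMSD."""
    best = {}
    for out in results:
        current = best.get(out.ID_fix)
        if current is None or out.RmsdSuperpose < current.RmsdSuperpose:
            best[out.ID_fix] = out
    return best


def run(fixed, mobile):
    """Superpose with the flexible protocol and summarize (requires lXtractor)."""
    from lXtractor.protocols.superpose import superpose_pairwise

    results = superpose_pairwise(
        fixed, mobile, selection_superpose=(None, ["CA"]), strict=False
    )
    # The generator is consumed lazily, one output at a time.
    return best_hits(results)


# Stand-in records with the documented field names (hypothetical values).
Record = namedtuple(
    "Record", ["ID_fix", "ID_mob", "RmsdSuperpose", "Distance", "Transformation"]
)
records = (
    Record("A", "X", 2.0, None, None),
    Record("A", "Y", 0.5, None, None),
    Record("B", "X", 1.5, None, None),
)
summary = best_hits(iter(records))
```

Since the result is a generator, it can be iterated only once; collect the outputs into a list first if several passes are needed.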