pyKVFinder.pyKVFinderResults

class pyKVFinder.pyKVFinderResults(cavities: ndarray, surface: ndarray, depths: ndarray | None, scales: ndarray | None, volume: Dict[str, float], area: Dict[str, float], max_depth: Dict[str, float] | None, avg_depth: Dict[str, float] | None, avg_hydropathy: Dict[str, float] | None, residues: Dict[str, List[List[str]]], frequencies: Dict[str, Dict[str, Dict[str, int]]] | None, _vertices: ndarray, _step: float | int, _input: str | Path | None = None, _ligand: str | Path | None = None)

A class containing pyKVFinder detection and characterization results.

Parameters:

cavities (numpy.ndarray) –
Cavity points in the 3D grid (cavities[nx][ny][nz]). Cavities array has integer labels in each position, that are:
- -1: bulk points;
- 0: biomolecule points;
- 1: empty space points;
- >=2: cavity points.
The empty space points are regions that do not meet the chosen volume cutoff to be considered a cavity.
surface (numpy.ndarray) –
Surface points in the 3D grid (surface[nx][ny][nz]). Surface array has integer labels in each position, that are:
- -1: bulk points;
- 0: biomolecule or empty space points;
- >=2: surface points.
The empty space points are regions that do not meet the chosen volume cutoff to be considered a cavity.
depths (numpy.ndarray, optional) – A numpy.ndarray with depth of cavity points (depths[nx][ny][nz]).
scales (numpy.ndarray, optional) – A numpy.ndarray with hydrophobicity scale value mapped at surface points (scales[nx][ny][nz]).
volume (Dict[str, float]) – A dictionary with volume of each detected cavity.
area (Dict[str, float]) – A dictionary with area of each detected cavity.
max_depth (Dict[str, float], optional) – A dictionary with maximum depth of each detected cavity.
avg_depth (Dict[str, float], optional) – A dictionary with average depth of each detected cavity.
avg_hydropathy (Dict[str, float], optional) – A dictionary with average hydropathy for each detected cavity and the range of the hydrophobicity scale (min, max).
residues (Dict[str, List[List[str]]]) – A dictionary with a list of interface residues for each detected cavity.
frequencies (Dict[str, Dict[str, Dict[str, int]]], optional) – A dictionary with frequencies of residues and classes of residues for each detected cavity.
_vertices (numpy.ndarray) – A numpy.ndarray or a list with the xyz coordinates of the vertices (origin, X-axis, Y-axis, Z-axis).
_step (float) – Grid spacing (Å).
_input (Union[str, pathlib.Path], optional) – A path to input PDB or XYZ file, by default None.
_ligand (Union[str, pathlib.Path], optional) – A path to ligand PDB or XYZ file, by default None.

cavities

Cavity points in the 3D grid (cavities[nx][ny][nz]). Cavities array has integer labels in each position, that are:

- -1: bulk points;
- 0: biomolecule points;
- 1: empty space points;
- >=2: cavity points.

The empty space points are regions that do not meet the chosen volume cutoff to be considered a cavity.

Type:: numpy.ndarray
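
As a minimal sketch of how these labels can be read, the snippet below builds a small synthetic grid (not real pyKVFinder output) and counts the points in each category with NumPy. The grid contents and the `count_labels` helper are hypothetical and only serve as an illustration.

```python
import numpy as np


def count_labels(cavities: np.ndarray) -> dict:
    """Count grid points per label category (hypothetical helper)."""
    return {
        "bulk": int((cavities == -1).sum()),
        "biomolecule": int((cavities == 0).sum()),
        "empty_space": int((cavities == 1).sum()),
        "cavity": int((cavities >= 2).sum()),
    }


# Synthetic 3x3x3 grid: every point starts as bulk (-1)
grid = np.full((3, 3, 3), -1, dtype=int)
grid[0, :, :] = 0   # a slab of nine biomolecule points
grid[1, 0, 0] = 1   # one empty space point (below the volume cutoff)
grid[1, 1, :] = 2   # three points belonging to the first cavity
grid[2, 2, 2] = 3   # one point belonging to a second cavity

counts = count_labels(grid)

# Distinct cavity labels present in the grid
cavity_labels = np.unique(grid[grid >= 2]).tolist()
```

The same boolean masks (for example `grid == 2`) can be used to select the points of a single cavity.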

surface

Surface points in the 3D grid (surface[nx][ny][nz]). Surface array has integer labels in each position, that are:

- -1: bulk points;
- 0: biomolecule or empty space points;
- >=2: surface points.

The empty space points are regions that do not meet the chosen volume cutoff to be considered a cavity.

Type:: numpy.ndarray

depths

A numpy.ndarray with depth of cavity points (depths[nx][ny][nz]).

Type:: numpy.ndarray, optional

scales

A numpy.ndarray with hydrophobicity scale value mapped at surface points (scales[nx][ny][nz]).

Type:: numpy.ndarray, optional

ncav

Number of cavities.

Type:: int

volume

A dictionary with volume of each detected cavity.

Type:: Dict[str, float]

area

A dictionary with area of each detected cavity.

Type:: Dict[str, float]

max_depth

A dictionary with maximum depth of each detected cavity.

Type:: Dict[str, float], optional

avg_depth

A dictionary with average depth of each detected cavity.

Type:: Dict[str, float], optional
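
The per-cavity depth summaries can be related to the grid arrays with a short sketch. The arrays below are synthetic, the `depth_summary` helper is hypothetical, and the result is keyed by integer label for simplicity (the attributes above are keyed by strings); this is an illustration of the idea, not the library's implementation.

```python
import numpy as np

# Synthetic 1x1x5 grids (not real pyKVFinder output)
cavities = np.array([[[-1, 2, 2, 3, 0]]])
depths = np.array([[[0.0, 1.0, 3.0, 2.5, 0.0]]])


def depth_summary(cavities: np.ndarray, depths: np.ndarray):
    """Return (max_depth, avg_depth) dictionaries keyed by cavity label."""
    max_depth, avg_depth = {}, {}
    for label in np.unique(cavities[cavities >= 2]):
        # Depth values of all grid points carrying this cavity label
        values = depths[cavities == label]
        max_depth[int(label)] = float(values.max())
        avg_depth[int(label)] = float(values.mean())
    return max_depth, avg_depth


max_depth, avg_depth = depth_summary(cavities, depths)
```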

avg_hydropathy

A dictionary with average hydropathy for each detected cavity and the range of the hydrophobicity scale (min, max).

Type:: Dict[str, float], optional

residues

A dictionary with a list of interface residues for each detected cavity.

Type:: Dict[str, List[List[str]]]

frequencies

A dictionary with frequencies of residues and classes of residues for each detected cavity.

Type:: Dict[str, Dict[str, Dict[str, int]]], optional

_vertices

A numpy.ndarray or a list with the xyz coordinates of the vertices (origin, X-axis, Y-axis, Z-axis).

Type:: numpy.ndarray

_step

Grid spacing (Å).

Type:: float
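
To illustrate how the vertices and the grid spacing locate a grid point in space, the sketch below converts a grid index into Cartesian coordinates. It assumes the vertices are ordered as origin, X-axis, Y-axis, Z-axis and that the box edges are mutually orthogonal; the `grid_to_xyz` helper and the example values are hypothetical and may differ from how pyKVFinder performs this mapping internally.

```python
import numpy as np


def grid_to_xyz(index, vertices, step):
    """Map a grid index (i, j, k) to xyz coordinates (hypothetical helper)."""
    vertices = np.asarray(vertices, dtype=float)
    origin = vertices[0]
    # Edge vectors from the origin to the X-, Y- and Z-axis vertices
    edges = vertices[1:] - origin
    # Unit vectors along each box edge (one per row)
    units = edges / np.linalg.norm(edges, axis=1, keepdims=True)
    # Walk i, j and k steps along the respective unit vectors
    return origin + step * (np.asarray(index, dtype=float) @ units)


# Hypothetical box: origin, X-axis, Y-axis and Z-axis vertices
vertices = [
    [1.0, 2.0, 3.0],
    [11.0, 2.0, 3.0],
    [1.0, 12.0, 3.0],
    [1.0, 2.0, 13.0],
]

point = grid_to_xyz((1, 2, 3), vertices, step=0.5)
```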

_input

A path to input PDB or XYZ file, by default None.

Type:: Union[str, pathlib.Path], optional

_ligand

A path to ligand PDB or XYZ file, by default None.

Type:: Union[str, pathlib.Path], optional

export(output: str | Path = 'cavity.pdb', nthreads: int | None = None) → str | None

Exports cavity (H) and surface (HA) points to a PDB-formatted file with an optional variable (B) in the B-factor column, and hydropathy to a PDB-formatted file in the B-factor column at surface points (HA).

Parameters:

output (Union[str, pathlib.Path], optional) – A path to PDB file for writing cavities, by default cavity.pdb.
nthreads (int, optional) – Number of threads, by default None. If None, the number of threads is os.cpu_count() - 1.

Returns:

A raw string with the PDB-formatted file.

Return type:

Optional[str]

Note

The cavity nomenclature is based on the integer label. The cavity marked with 2, the first integer corresponding to a cavity, is KAA, the cavity marked with 3 is KAB, the cavity marked with 4 is KAC and so on.
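
The naming rule in the note above can be sketched as a small function. The `cavity_name` helper is hypothetical, and the behaviour after KAZ (continuing with KBA) is an assumption drawn from the "and so on" in the note, not something the note states.

```python
def cavity_name(label: int) -> str:
    """Convert an integer cavity label (>= 2) to a name such as 'KAA'."""
    if label < 2:
        raise ValueError("cavity labels start at 2")
    index = label - 2
    if index >= 26 * 26:
        raise ValueError("label is outside the range K[A-Z][A-Z]")
    # Assumption: the last letter cycles A-Z first, then the middle letter advances
    second = chr(ord("A") + index // 26)
    third = chr(ord("A") + index % 26)
    return "K" + second + third


names = [cavity_name(label) for label in (2, 3, 4)]
```

Here `names` holds the three names given in the note: KAA, KAB and KAC.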

Example

>>> import os
>>> import pyKVFinder
>>> pdb = os.path.join(os.path.dirname(pyKVFinder.__file__), 'data', 'tests', '1FMO.pdb')
>>> results = pyKVFinder.run_workflow(pdb)
>>> results.export()

export_all(fn: str | Path = 'results.toml', output: str | Path = 'cavity.pdb', include_frequencies_pdf: bool = False, pdf: str | Path = 'barplots.pdf', nthreads: int | None = None) → None

Exports cavities and characterization to PDB-formatted files, writes file paths and characterization to a TOML-formatted file, and optionally plot bar charts of frequencies (residues and classes of residues) in a PDF file.

Parameters:

fn (Union[str, pathlib.Path], optional) – A path to TOML-formatted file for writing file paths and cavity characterization (volume, area and interface residues) per cavity detected, by default results.toml.
output (Union[str, pathlib.Path], optional) – A path to PDB file for writing cavities, by default cavity.pdb.
include_frequencies_pdf (bool, optional) – Whether to plot frequencies (residues and classes of residues) to PDF file, by default False.
pdf (Union[str, pathlib.Path], optional) – A path to a PDF file, by default barplots.pdf.
nthreads (int, optional) – Number of threads, by default None. If None, the number of threads is os.cpu_count() - 1.

Note

The cavity nomenclature is based on the integer label. The cavity marked with 2, the first integer corresponding to a cavity, is KAA, the cavity marked with 3 is KAB, the cavity marked with 4 is KAC and so on.

Note

The classes of residues are:

Aliphatic apolar (R1): Alanine, Glycine, Isoleucine, Leucine, Methionine, Valine.
Aromatic (R2): Phenylalanine, Tryptophan, Tyrosine.
Polar Uncharged (R3): Asparagine, Cysteine, Glutamine, Proline, Serine, Threonine.
Negatively charged (R4): Aspartate, Glutamate.
Positively charged (R5): Arginine, Histidine, Lysine.
Non-standard (RX): Non-standard residues.
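
As a rough illustration of the classification listed above, the sketch below counts residue classes for a list of three-letter residue names. The `class_frequencies` helper and the example list are hypothetical; only the class table itself is taken from the note.

```python
from collections import Counter

# Residue classes as listed in the note above
RESIDUE_CLASSES = {
    "R1": ("ALA", "GLY", "ILE", "LEU", "MET", "VAL"),
    "R2": ("PHE", "TRP", "TYR"),
    "R3": ("ASN", "CYS", "GLN", "PRO", "SER", "THR"),
    "R4": ("ASP", "GLU"),
    "R5": ("ARG", "HIS", "LYS"),
}

# Reverse lookup: residue name -> class
CLASS_OF = {
    name: cls for cls, names in RESIDUE_CLASSES.items() for name in names
}


def class_frequencies(residue_names):
    """Count residues per class; unknown names fall into RX."""
    return dict(
        Counter(CLASS_OF.get(name.upper(), "RX") for name in residue_names)
    )


# Hypothetical interface residues of one cavity (MSE is non-standard)
example = ["LEU", "VAL", "PHE", "GLU", "LYS", "SER", "MSE"]
freq = class_frequencies(example)
```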

Example

>>> import os
>>> import pyKVFinder
>>> pdb = os.path.join(os.path.dirname(pyKVFinder.__file__), 'data', 'tests', '1FMO.pdb')
>>> results = pyKVFinder.run_workflow(pdb)
>>> results.export_all()

Alternatively, set the include_frequencies_pdf flag to True to also plot the bar charts of the frequencies in a PDF file.

>>> results.export_all(include_frequencies_pdf=True)

plot_frequencies(pdf: str | Path = 'barplots.pdf')

Plot bar charts of frequencies (residues and classes of residues) in a PDF file.

Parameters:

pdf (Union[str, pathlib.Path], optional) – A path to a PDF file, by default barplots.pdf.

Note

The cavity nomenclature is based on the integer label. The cavity marked with 2, the first integer corresponding to a cavity, is KAA, the cavity marked with 3 is KAB, the cavity marked with 4 is KAC and so on.

Note

The classes of residues are:

Aliphatic apolar (R1): Alanine, Glycine, Isoleucine, Leucine, Methionine, Valine.
Aromatic (R2): Phenylalanine, Tryptophan, Tyrosine.
Polar Uncharged (R3): Asparagine, Cysteine, Glutamine, Proline, Serine, Threonine.
Negatively charged (R4): Aspartate, Glutamate.
Positively charged (R5): Arginine, Histidine, Lysine.
Non-standard (RX): Non-standard residues.

Example

>>> import os
>>> import pyKVFinder
>>> pdb = os.path.join(os.path.dirname(pyKVFinder.__file__), 'data', 'tests', '1FMO.pdb')
>>> results = pyKVFinder.run_workflow(pdb)
>>> results.plot_frequencies()

write(fn: str | Path = 'results.toml', output: str | Path | None = None) → None

Writes file paths and cavity characterization to a TOML-formatted file.

Parameters:

fn (Union[str, pathlib.Path], optional) – A path to TOML-formatted file for writing file paths and cavity characterization (volume, area, depth and interface residues) per cavity detected, by default results.toml.
output (Union[str, pathlib.Path], optional) – A path to a cavity PDB file, by default None.

Note

The cavity nomenclature is based on the integer label. The cavity marked with 2, the first integer corresponding to a cavity, is KAA, the cavity marked with 3 is KAB, the cavity marked with 4 is KAC and so on.

Example

>>> import os
>>> import pyKVFinder
>>> pdb = os.path.join(os.path.dirname(pyKVFinder.__file__), 'data', 'tests', '1FMO.pdb')
>>> results = pyKVFinder.run_workflow(pdb)
>>> results.write()