mstk.topology.Molecule¶
- class mstk.topology.Molecule(name='UNK')¶
A molecule is defined as atoms and the connectivity between them.
The term molecule here does not strictly mean a chemical molecule. Some atoms may not be connected to any other atoms in the same molecule. However, there cannot be bonds connecting atoms that belong to different molecules. Drude particles and virtual sites are also considered atoms. All bonds, angles, dihedrals and impropers should be defined explicitly.
- Parameters:
name (str) –
- id¶
Index of this molecule in the topology. -1 means the information has not been updated by the topology yet.
- Type:
int
- name¶
Name of the molecule, not necessarily unique
- Type:
str
Methods
__init__([name])
add_angle(atom1, atom2, atom3[, check_existence]) – Add an angle between three atoms.
add_atom(atom[, residue, index, update_topology]) – Add an atom to this molecule.
add_bond(atom1, atom2[, order, check_existence]) – Add a bond between two atoms.
add_dihedral(atom1, atom2, atom3, atom4[, ...]) – Add a dihedral between four atoms.
add_improper(atom1, atom2, atom3, atom4[, ...]) – Add an improper between four atoms.
add_residue(name, atoms[, refresh_residues]) – Put a group of atoms into a new residue.
from_rdmol(rdmol[, name]) – Initialize a molecule from an RDKit Mol object.
from_smiles(smiles) – Initialize a molecule from a SMILES string.
generate_angle_dihedral_improper([dihedral, ...]) – Generate angles, dihedrals and impropers from bonds. The existing angles, dihedrals and impropers will be removed first. Atoms and bonds involving Drude particles will be ignored.
generate_conformers([n_conformer]) – Generate several conformers with RDKit.
generate_drude_particles(ff[, type_drude, ...]) – Generate Drude particles from DrudeTerms in force field.
generate_virtual_sites(ff[, update_topology]) – Generate virtual sites from VirtualSiteTerms in force field.
get_12_13_14_pairs() – Retrieve all the 1-2, 1-3 and 1-4 pairs based on the bond information.
get_adjacency_matrix()
get_distance_matrix([max_bond])
get_drude_pairs() – Retrieve all the Drude dipole pairs belonging to this molecule.
get_sub_molecule(indexes[, deepcopy]) – Extract a substructure from this molecule by indexes of atoms.
get_virtual_site_pairs() – Retrieve all the virtual site pairs belonging to this molecule.
guess_connectivity_from_ff(ff[, bond_limit, ...]) – Guess bonds, angles, dihedrals and impropers from force field.
is_similar_to(other) – Check if this molecule is similar to another molecule.
merge(molecules[, deepcopy]) – Merge several molecules into a single molecule.
refresh_residues([update_topology]) – Remove empty residues and update the id_in_mol attribute of each residue in this molecule.
remove_atom(atom[, update_topology]) – Remove an atom and all the bonds connected to the atom from this molecule.
remove_atoms(atoms[, update_topology]) – Remove multiple atoms and all the bonds connected to these atoms from this molecule.
remove_connectivity(connectivity) – Remove a connectivity (bond, angle, dihedral or improper) from this molecule.
remove_drude_particles([update_topology]) – Remove all Drude particles and the bonds belonging to them.
remove_non_polar_hydrogens([update_topology]) – Remove single-coordinated hydrogen atoms bonded to C and Si atoms.
remove_residue(residue[, refresh_residues]) – Remove a residue from this molecule, and put the relevant atoms into the default residue.
remove_virtual_sites([update_topology]) – Remove all virtual sites.
set_positions(positions) – Set the positions of all atoms in this molecule.
split([consecutive]) – Split the molecule into smaller pieces based on bond network.
Split the molecule into smaller pieces.
Attributes
angles – List of angles belonging to this molecule
atoms – List of atoms belonging to this molecule
bonds – List of bonds belonging to this molecule
dihedrals – List of dihedrals belonging to this molecule
has_position – Whether or not all the atoms in the molecule have positions
impropers – List of impropers belonging to this molecule
n_angle – Number of angles belonging to this molecule
n_atom – Number of atoms belonging to this molecule
n_bond – Number of bonds belonging to this molecule
n_dihedral – Number of dihedrals belonging to this molecule
n_improper – Number of impropers belonging to this molecule
positions – Positions of all the atoms in this molecule
rdmol – The rdkit.Chem.Mol object associated with this molecule.
residues – All the residues in this molecule
topology – The topology this molecule belongs to
- property name¶
- static from_smiles(smiles)¶
Initialize a molecule from a SMILES string.
RDKit is used to parse the SMILES string. Hydrogen atoms will be created, and the positions of all atoms will be generated automatically. The SMILES string may end with the name of the molecule, e.g. ‘CCCC butane’.
- Parameters:
smiles (str) –
- Returns:
molecule
- Return type:
Molecule
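The "SMILES followed by a name" input format mentioned for from_smiles can be pictured with a small stand-alone sketch. The helper split_smiles_and_name is hypothetical and not part of mstk. It assumes the name is separated from the SMILES by whitespace, and it simply returns None when no name is present, since the fallback name is not specified here.

```python
def split_smiles_and_name(text):
    """Split 'CCCC butane' into ('CCCC', 'butane').

    Hypothetical helper for illustration only; not mstk's parser.
    Returns (smiles, None) when no name follows the SMILES.
    """
    parts = text.strip().split(maxsplit=1)
    if not parts:
        raise ValueError('empty SMILES string')
    smiles = parts[0]
    name = parts[1].strip() if len(parts) > 1 else None
    return smiles, name


print(split_smiles_and_name('CCCC butane'))
print(split_smiles_and_name('O'))
```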
- static from_rdmol(rdmol, name=None)¶
Initialize a molecule from an RDKit Mol object. If the RDKit Mol has conformers, the positions of the first conformer will be assigned to the atoms.
- Parameters:
rdmol (rdkit.Chem.Mol) –
name (str) – The name of the molecule. If not provided, the formula will be used as the name.
- Returns:
molecule
- Return type:
Molecule
- property rdmol¶
The rdkit.Chem.Mol object associated with this molecule.
It is required by the ZftTyper typing engine, which performs SMARTS matching on the molecule. The rdmol attribute will be assigned if the molecule is initialized from SMILES or from an RDKit Mol object. If it is not available, an RDKit molecule will be constructed from atoms and bonds. In that case, the positions will not be preserved.
- Returns:
rdmol
- Return type:
rdkit.Chem.Mol
- generate_conformers(n_conformer=1)¶
Generate several conformers with RDKit.
The positions will be generated from elements and bonds only. Chiral centers will not be respected.
- Parameters:
n_conformer (int) – How many conformers to generate
- Returns:
molecules – Each conformer will be an independent molecule object
- Return type:
list of Molecule
- add_atom(atom, residue=None, index=None, update_topology=True)¶
Add an atom to this molecule.
The id_in_mol attribute of all atoms will be updated after insertion.
TODO Make residue assignment more robust
- Parameters:
atom (Atom) –
residue (Residue, Optional) – Add the atom to this residue. Make sure the residue belongs to this molecule. For performance reasons, this is not checked. If set to None, the atom will be added to the default residue.
index (int, Optional) – If None, the new atom will be the last atom. Otherwise, it will be inserted in front of the index-th atom.
update_topology (bool) – If True, the topology this molecule belongs to will update its atom list and assign id for all atoms and residues. Otherwise, you have to re-initialize the topology manually so that the topological information is correct.
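The effect of the index argument and of the id_in_mol renumbering can be sketched without mstk. ToyAtom and insert_atom below are hypothetical stand-ins, and the sketch assumes that index is zero-based.

```python
class ToyAtom:
    """Stand-in for an atom; only the fields needed for this illustration."""

    def __init__(self, name):
        self.name = name
        self.id_in_mol = -1  # -1 until the atom is placed in a molecule


def insert_atom(atoms, atom, index=None):
    """Append the atom, or insert it in front of position `index`, then renumber."""
    if index is None:
        atoms.append(atom)
    else:
        atoms.insert(index, atom)
    # Every atom gets a fresh id_in_mol after the insertion
    for i, a in enumerate(atoms):
        a.id_in_mol = i


atoms = []
for name in ('C1', 'C2', 'C3'):
    insert_atom(atoms, ToyAtom(name))
insert_atom(atoms, ToyAtom('H1'), index=1)  # lands in front of the atom at index 1
print([(a.name, a.id_in_mol) for a in atoms])
```

Because every insertion renumbers all atoms, adding many atoms one by one costs time proportional to the square of the atom count in this toy version.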
- remove_atom(atom, update_topology=True)¶
Remove an atom and all the bonds connected to the atom from this molecule.
The atom will also be removed from its residue. The id_in_mol attribute of all atoms will be updated after removal. The angles, dihedrals and impropers involving this atom are left untouched. Therefore, you may want to call generate_angle_dihedral_improper to refresh the connectivity.
- Parameters:
atom (Atom) –
update_topology (bool) – If update_topology is True, the topology this molecule belongs to will update its atom list and assign id for all atoms and residues. Otherwise, you have to re-init the topology manually so that the topological information is correct.
- remove_atoms(atoms, update_topology=True)¶
Remove multiple atoms and all the bonds connected to these atoms from this molecule.
The atoms will also be removed from their residues. The id_in_mol attribute of all atoms will be updated after removal. The angles, dihedrals and impropers involving these atoms are left untouched. Therefore, you may want to call generate_angle_dihedral_improper to refresh the connectivity.
- Parameters:
atoms (list of Atom) –
update_topology (bool) – If update_topology is True, the topology this molecule belongs to will update its atom list and assign id for all atoms and residues. Otherwise, you have to re-init the topology manually so that the topological information is correct.
- remove_non_polar_hydrogens(update_topology=True)¶
Remove single-coordinated hydrogen atoms bonded to C and Si atoms
- Parameters:
update_topology (bool) – If update_topology is True, the topology this molecule belongs to will update its atom list and assign id for all atoms and residues. Otherwise, you have to re-init the topology manually so that the topological information is correct.
- Returns:
ids_removed – The ids of the atoms removed
- Return type:
list of int
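The selection rule (a hydrogen with exactly one bond whose partner is C or Si) can be written as a short stand-alone sketch. The function find_non_polar_hydrogens and its plain-list inputs are hypothetical and only illustrate the criterion; they are not mstk's implementation.

```python
def find_non_polar_hydrogens(symbols, bonds):
    """Return indexes of H atoms with exactly one bond, whose partner is C or Si."""
    neighbours = {i: [] for i in range(len(symbols))}
    for i, j in bonds:
        neighbours[i].append(j)
        neighbours[j].append(i)
    return [
        i for i, sym in enumerate(symbols)
        if sym == 'H'
        and len(neighbours[i]) == 1
        and symbols[neighbours[i][0]] in ('C', 'Si')
    ]


# Methanol: C0, O1, H2..H4 on the carbon, H5 on the oxygen
symbols = ['C', 'O', 'H', 'H', 'H', 'H']
bonds = [(0, 1), (0, 2), (0, 3), (0, 4), (1, 5)]
print(find_non_polar_hydrogens(symbols, bonds))  # the hydroxyl hydrogen (5) is kept
```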
- add_residue(name, atoms, refresh_residues=True)¶
Put a group of atoms into a new residue. These atoms will be removed from their old residues.
Make sure that these atoms belong to this molecule. For performance reasons, this is not checked.
- Parameters:
name (str) –
atoms (list of Atom) –
refresh_residues (bool) – If True, the residue list of this molecule will be updated. The id and id_in_mol for all residues will be assigned. This operation can be slow. If you have a lot of residues to add, it is more efficient to set it to False, and call refresh_residues manually after all residues are added.
- Returns:
residue
- Return type:
Residue
- remove_residue(residue, refresh_residues=True)¶
Remove a residue from this molecule, and put the relevant atoms into the default residue.
Make sure that this residue belongs to this molecule. For performance reasons, this is not checked.
- Parameters:
residue (Residue) –
refresh_residues (bool) – If True, the residue list of this molecule will be updated. The id and id_in_mol for all residues will be assigned. This operation can be slow. If you have a lot of residues to remove, it is more efficient to set it to False, and call refresh_residues manually after all residues are removed.
- refresh_residues(update_topology=True)¶
Remove empty residues and update the id_in_mol attribute of each residue in this molecule.
- Parameters:
update_topology (bool) – If True, the topology this molecule belongs to will assign id for all residues
- add_bond(atom1, atom2, order=0, check_existence=False)¶
Add a bond between two atoms.
Make sure that both atoms belong to this molecule. For performance reasons, this is not checked.
- add_angle(atom1, atom2, atom3, check_existence=False)¶
Add an angle between three atoms.
The second atom is the central atom. Make sure that all three atoms belong to this molecule. For performance reasons, this is not checked.
- add_dihedral(atom1, atom2, atom3, atom4, check_existence=False)¶
Add a dihedral between four atoms.
Make sure that all four atoms belong to this molecule. For performance reasons, this is not checked.
- add_improper(atom1, atom2, atom3, atom4, check_existence=False)¶
Add an improper between four atoms.
The first atom is the central atom. Make sure that all four atoms belong to this molecule. For performance reasons, this is not checked.
- remove_connectivity(connectivity)¶
Remove a connectivity (bond, angle, dihedral or improper) from this molecule.
Make sure that this connectivity belongs to this molecule. For performance reasons, this is not checked.
Note that when a bond is removed, the relevant angles, dihedrals and impropers are still there. You may want to call generate_angle_dihedral_improper to refresh the connectivity.
# TODO This operation is slow. A batch version is required for better performance
- is_similar_to(other)¶
Check if this molecule is similar to another molecule.
The two molecules must contain the same number of atoms. Corresponding atoms must have the same atom symbol, type and charge. The bonds must also be the same. Angles, dihedrals and impropers are not considered.
- Parameters:
other (Molecule) –
- Returns:
is
- Return type:
bool
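The comparison criteria can be summarized in a stand-alone sketch. The function is_similar and its tuple-based inputs are hypothetical. The charge tolerance and the treatment of bonds as unordered index pairs are assumptions of this sketch, not documented behavior.

```python
import math


def is_similar(atoms_a, bonds_a, atoms_b, bonds_b, charge_tol=1e-6):
    """atoms_*: list of (symbol, type, charge); bonds_*: list of (i, j) index pairs."""
    if len(atoms_a) != len(atoms_b):
        return False
    for (sym_a, type_a, q_a), (sym_b, type_b, q_b) in zip(atoms_a, atoms_b):
        if sym_a != sym_b or type_a != type_b:
            return False
        if not math.isclose(q_a, q_b, abs_tol=charge_tol):
            return False

    def normalize(bonds):
        # A bond (1, 0) is the same bond as (0, 1)
        return {tuple(sorted(b)) for b in bonds}

    # Angles, dihedrals and impropers are deliberately not compared
    return normalize(bonds_a) == normalize(bonds_b)


water = [('O', 'OW', -0.834), ('H', 'HW', 0.417), ('H', 'HW', 0.417)]
print(is_similar(water, [(0, 1), (0, 2)], water, [(1, 0), (2, 0)]))
```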
- get_adjacency_matrix()¶
- get_distance_matrix(max_bond=None)¶
- property n_atom¶
Number of atoms belonging to this molecule
- Returns:
n
- Return type:
int
- property n_bond¶
Number of bonds belonging to this molecule
- Returns:
n
- Return type:
int
- property n_angle¶
Number of angles belonging to this molecule
- Returns:
n
- Return type:
int
- property n_dihedral¶
Number of dihedrals belonging to this molecule
- Returns:
n
- Return type:
int
- property n_improper¶
Number of impropers belonging to this molecule
- Returns:
n
- Return type:
int
- property n_residue¶
- property dihedrals¶
List of dihedrals belonging to this molecule
- Returns:
dihedrals
- Return type:
list of Dihedral
- property impropers¶
List of impropers belonging to this molecule
- Returns:
impropers
- Return type:
list of Improper
- property has_position¶
Whether or not all the atoms in the molecule have positions
- Returns:
has
- Return type:
bool
- property positions¶
Positions of all the atoms in this molecule
- Returns:
positions
- Return type:
array_like
- set_positions(positions)¶
Set the positions of all atoms in this molecule
- Parameters:
positions (array_like) –
- get_drude_pairs()¶
Retrieve all the Drude dipole pairs belonging to this molecule
- Returns:
pairs – [(parent, drude)]
- Return type:
list of tuple of Atom
- get_virtual_site_pairs()¶
Retrieve all the virtual site pairs belonging to this molecule
- Returns:
pairs – [(parent, atom_virtual_site)]
- Return type:
list of tuple of Atom
- get_12_13_14_pairs()¶
Retrieve all the 1-2, 1-3 and 1-4 pairs based on the bond information.
The pairs only concern real atoms. Drude particles will be ignored.
- Returns:
pairs12 (list of tuple of Atom)
pairs13 (list of tuple of Atom)
pairs14 (list of tuple of Atom)
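How 1-2, 1-3 and 1-4 pairs follow from the bond list can be shown with a stand-alone sketch that works on atom indexes. The function pairs_12_13_14 is hypothetical. It assumes that a pair already listed at a shorter separation (for example inside a small ring) is not repeated at a longer one, which may differ from what mstk does.

```python
from itertools import combinations


def pairs_12_13_14(n_atoms, bonds):
    """Derive 1-2, 1-3 and 1-4 index pairs from a list of (i, j) bonds."""
    neigh = [set() for _ in range(n_atoms)]
    for i, j in bonds:
        neigh[i].add(j)
        neigh[j].add(i)

    pairs12 = {tuple(sorted(b)) for b in bonds}

    # 1-3: two different neighbours of the same central atom
    pairs13 = set()
    for centre in range(n_atoms):
        for i, k in combinations(sorted(neigh[centre]), 2):
            pairs13.add((i, k))

    # 1-4: the outer atoms of every i-j-k-l chain around a central bond j-k
    pairs14 = set()
    for j, k in pairs12:
        for i in neigh[j] - {k}:
            for l in neigh[k] - {j}:
                if i != l:
                    pairs14.add(tuple(sorted((i, l))))

    # Assumption: keep each pair only at its shortest separation
    pairs13 -= pairs12
    pairs14 -= pairs12 | pairs13
    return sorted(pairs12), sorted(pairs13), sorted(pairs14)


# Carbon backbone of butane: 0-1-2-3
print(pairs_12_13_14(4, [(0, 1), (1, 2), (2, 3)]))
```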
- generate_angle_dihedral_improper(dihedral=True, improper=True)¶
Generate angles, dihedrals and impropers from bonds. The existing angles, dihedrals and impropers will be removed first. Atoms and bonds involving Drude particles will be ignored.
- Parameters:
dihedral (bool) – Whether or not to generate dihedrals based on bonds
improper (bool) – Whether or not to generate impropers based on bonds
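A stand-alone sketch of how such terms can be enumerated from bonds is given below. It follows the ordering conventions stated for add_angle (central atom second) and add_improper (central atom first). The function name is hypothetical, and the rule that an improper is generated for every atom with exactly three neighbours is an assumption of this sketch, not a statement about mstk.

```python
from itertools import combinations


def angles_dihedrals_impropers(n_atoms, bonds):
    """Enumerate angles, dihedrals and impropers as tuples of atom indexes."""
    neigh = [set() for _ in range(n_atoms)]
    for i, j in bonds:
        neigh[i].add(j)
        neigh[j].add(i)

    # Angle (i, c, k): c is the central atom
    angles = [(i, c, k) for c in range(n_atoms)
              for i, k in combinations(sorted(neigh[c]), 2)]

    # Dihedral (i, j, k, l): built around each central bond j-k
    dihedrals = []
    for j, k in sorted(tuple(sorted(b)) for b in bonds):
        for i in sorted(neigh[j] - {k}):
            for l in sorted(neigh[k] - {j}):
                if i != l:
                    dihedrals.append((i, j, k, l))

    # Improper (c, a1, a2, a3): assumed only for atoms with three neighbours
    impropers = [(c, *sorted(neigh[c])) for c in range(n_atoms)
                 if len(neigh[c]) == 3]
    return angles, dihedrals, impropers


# Formaldehyde: C0 bonded to O1, H2 and H3
print(angles_dihedrals_impropers(4, [(0, 1), (0, 2), (0, 3)]))
```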
- guess_connectivity_from_ff(ff, bond_limit=0.25, bond_tolerance=0.025, angle_tolerance=None, pbc='', cell=None)¶
Guess bonds, angles, dihedrals and impropers from force field.
It requires that atom types are defined and positions are available. The distance between nearby atoms will be calculated. If it is smaller than bond_limit, it will be compared with the equilibrium length in the FF. The bond will be added if a BondTerm is found in the FF and the deviation is smaller than bond_tolerance. Angles will then be constructed from bonds. If angle_tolerance is None, all angles will be added. If angle_tolerance is set (in degrees), an AngleTerm must be provided for these angles, and an angle will be added only if the deviation between the angle and its equilibrium value in the FF is smaller than angle_tolerance. Dihedrals and impropers will be constructed from bonds and added if the relevant terms are present in the FF.
PBC is supported for determining bonds across the periodic cell. This is useful for simulating infinite structures. pbc can be ‘’, ‘x’, ‘y’, ‘xy’, ‘xz’, ‘xyz’, which selects the boundaries across which bonds are checked. cell must also be provided if pbc is not ‘’.
TODO Add support for triclinic cell
- Parameters:
ff (ForceField) –
bond_limit (float) –
bond_tolerance (float) –
angle_tolerance (float) –
pbc (str) –
cell (UnitCell) –
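The bond-guessing step can be illustrated with a stand-alone sketch. The function guess_bonds is hypothetical, and the dictionary eq_lengths is a toy stand-in for the BondTerms of a force field. The sketch assumes an orthogonal cell given as three edge lengths and the minimum-image convention along the periodic axes, with all lengths in the same unit. Angles, dihedrals and impropers are left out.

```python
import math


def guess_bonds(positions, types, eq_lengths, bond_limit=0.25,
                bond_tolerance=0.025, pbc='', cell=None):
    """Return (i, j) pairs whose distance matches a known equilibrium length.

    positions: list of (x, y, z); types: list of atom type names;
    eq_lengths: {(type_a, type_b): length} with the type names sorted;
    cell: (lx, ly, lz) of an orthogonal box, required when pbc is not ''.
    """
    bonds = []
    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            delta = [b - a for a, b in zip(positions[i], positions[j])]
            for axis, label in enumerate('xyz'):
                if label in pbc:
                    # Minimum image: wrap the separation into [-L/2, L/2]
                    delta[axis] -= cell[axis] * round(delta[axis] / cell[axis])
            dist = math.sqrt(sum(d * d for d in delta))
            if dist >= bond_limit:
                continue  # too far apart to be considered at all
            length = eq_lengths.get(tuple(sorted((types[i], types[j]))))
            if length is not None and abs(dist - length) < bond_tolerance:
                bonds.append((i, j))
    return bonds


positions = [(0.05, 0.0, 0.0), (1.903, 0.0, 0.0)]
types = ['C', 'C']
eq_lengths = {('C', 'C'): 0.153}  # illustrative value only
print(guess_bonds(positions, types, eq_lengths))
print(guess_bonds(positions, types, eq_lengths, pbc='x', cell=(2.0, 2.0, 2.0)))
```

In the second call the two atoms sit on opposite sides of the x boundary, so the bond is only found when periodicity along x is taken into account.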
- generate_drude_particles(ff, type_drude='DP_', seed=1, update_topology=True)¶
Generate Drude particles from DrudeTerms in force field.
The atom types should have been defined already. A Drude particle will not be generated if the DrudeTerm for its atom type cannot be found in the FF. Note that the existing Drude particles will be removed before generating. The mass defined in the DrudeTerm will be transferred from the parent atom to the Drude particle. The Drude charge will be calculated from the DrudeTerm and transferred from the parent atom to the Drude particle. Bonds between parents and Drude particles will be generated and added to the topology. If the AtomType and VdwTerm for the generated Drude particles are not found in the FF, these terms will be created and added to the FF.
- Parameters:
ff (ForceField) –
type_drude (str) –
seed (int) –
update_topology (bool) –
- remove_drude_particles(update_topology=True)¶
Remove all Drude particles and the bonds belonging to them.
The charges and masses carried by Drude particles will be transferred back to the parent atoms.
- Parameters:
update_topology (bool) –
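The mass and charge bookkeeping between a parent atom and its Drude particle can be sketched as follows. The helpers attach_drude and detach_drude and the numbers used are hypothetical. In particular, the Drude mass and charge are simply passed in here, whereas generate_drude_particles derives them from the DrudeTerm.

```python
import math


def attach_drude(parent, drude_mass, drude_charge):
    """Move mass and charge from the parent atom onto a new Drude particle."""
    parent['mass'] -= drude_mass
    parent['charge'] -= drude_charge
    return {'mass': drude_mass, 'charge': drude_charge}


def detach_drude(parent, drude):
    """Give the mass and charge of the Drude particle back to its parent."""
    parent['mass'] += drude['mass']
    parent['charge'] += drude['charge']


parent = {'mass': 12.011, 'charge': -0.18}  # illustrative values only
drude = attach_drude(parent, drude_mass=0.4, drude_charge=-1.2)

# The totals of the parent-Drude pair are unchanged by the transfer
assert math.isclose(parent['mass'] + drude['mass'], 12.011)
assert math.isclose(parent['charge'] + drude['charge'], -0.18)

detach_drude(parent, drude)
print(parent)
```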
- generate_virtual_sites(ff, update_topology=True)¶
Generate virtual sites from VirtualSiteTerms in force field.
The atom types should have been defined already. Note that the existing virtual sites will be removed before generating. The charges will not be assigned by this method. Therefore, assign_charge_from_ff should be called to assign the charges on virtual sites.
Currently, only TIP4PSiteTerm has been implemented.
TODO Support other virtual site terms
- Parameters:
ff (ForceField) –
update_topology (bool) –
- remove_virtual_sites(update_topology=True)¶
Remove all virtual sites.
- Parameters:
update_topology (bool) –
- get_sub_molecule(indexes, deepcopy=True)¶
Extract a substructure from this molecule by indexes of atoms.
The substructure will not contain any bond, angle, dihedral or improper that connects atoms in the substructure with the remaining parts. Residue information will be reconstructed.
TODO Fix performance issue
- Parameters:
indexes (list of int) – The atoms in the substructure will be in the same order as in indexes
deepcopy (bool) – If set to False, the atoms and connections in the substructure will be the same objects as the atoms and connections in this molecule. The data structure of this molecule will then be messed up, and it should not be accessed later.
- Returns:
substructure
- Return type:
Molecule
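The two rules stated for get_sub_molecule (atoms follow the order of indexes, and bonds reaching outside the selection are dropped) can be sketched on plain lists. The function sub_structure is hypothetical and handles bonds only.

```python
def sub_structure(symbols, bonds, indexes):
    """Return (symbols, bonds) of the selection, renumbered in the order of `indexes`."""
    new_id = {old: new for new, old in enumerate(indexes)}
    sub_symbols = [symbols[i] for i in indexes]
    # Keep a bond only if both of its atoms are part of the selection
    sub_bonds = [(new_id[i], new_id[j]) for i, j in bonds
                 if i in new_id and j in new_id]
    return sub_symbols, sub_bonds


# Heavy atoms of ethanol: C0-C1-O2; select O first, then the carbon bonded to it
print(sub_structure(['C', 'C', 'O'], [(0, 1), (1, 2)], [2, 1]))
```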
- static merge(molecules, deepcopy=True)¶
Merge several molecules into a single molecule.
- Parameters:
molecules (list of Molecule) –
deepcopy (bool) – If True, the molecules will be deep-copied and remain intact after the merge. Otherwise, the atoms of the merged molecule and of the original molecules will be the same objects, and the original molecules will become unusable.
- Returns:
merged
- Return type:
Molecule
- split(consecutive=False)¶
Split the molecule into smaller pieces based on bond network.
The atoms in each piece will preserve their original order. However, atoms at the end of the original molecule may end up in an early piece, so that the order of atoms across all the pieces differs from the original order. To avoid this, set consecutive to True. In this case, it will make sure that all atoms in earlier pieces have smaller atom ids than atoms in later pieces.
Residue information will be reconstructed for each piece.
- Parameters:
consecutive (bool) –
- Returns:
molecules
- Return type:
list of Molecule
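The ordering issue that the consecutive option addresses can be reproduced with a stand-alone sketch that splits atom indexes into bonded pieces. The functions split_by_bonds and is_consecutive are hypothetical. Ordering the pieces by their first atom is an assumption, and the sketch only detects a non-consecutive result without trying to reproduce how mstk handles consecutive=True.

```python
def split_by_bonds(n_atoms, bonds):
    """Return the bonded pieces; atoms inside each piece keep ascending order."""
    neigh = [set() for _ in range(n_atoms)]
    for i, j in bonds:
        neigh[i].add(j)
        neigh[j].add(i)
    seen = set()
    pieces = []
    for start in range(n_atoms):
        if start in seen:
            continue
        # Depth-first walk over everything reachable from `start`
        stack, piece = [start], []
        seen.add(start)
        while stack:
            atom = stack.pop()
            piece.append(atom)
            for nb in neigh[atom]:
                if nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        pieces.append(sorted(piece))
    return pieces


def is_consecutive(pieces):
    """True if concatenating the pieces reproduces the original atom order."""
    flat = [a for piece in pieces for a in piece]
    return flat == sorted(flat)


pieces = split_by_bonds(5, [(0, 4), (1, 2)])
print(pieces, is_consecutive(pieces))  # atom 4 ends up in the first piece
```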