Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add documentation and added improved inertia alignment #122

Merged
merged 44 commits into from
Jan 9, 2025
Merged
Show file tree
Hide file tree
Changes from 38 commits
Commits
Show all changes
44 commits
Select commit Hold shift + click to select a range
63bf137
Added example for Inertia re-alignment
charnley Nov 29, 2024
4952753
Ignore notebook leftovers
charnley Nov 29, 2024
cbb3f89
Correct package
charnley Nov 29, 2024
4e351f7
Update README.rst
charnley Nov 29, 2024
819ca80
Update README.rst
charnley Dec 2, 2024
3f689e2
Update README.rst
charnley Dec 2, 2024
aafb216
Starting to do notes
charnley Dec 13, 2024
55c3463
Added interactive slider
charnley Dec 17, 2024
714e793
Checkpoint
charnley Jan 2, 2025
a0d2213
Iteration
charnley Jan 2, 2025
40007aa
Update README.rst
charnley Jan 2, 2025
78a3e91
Update README.rst
charnley Jan 2, 2025
f987f79
Update README.rst
charnley Jan 2, 2025
ab520f5
Update README.rst
charnley Jan 2, 2025
c5297f8
Update README.rst
charnley Jan 2, 2025
f37d166
More pictures
charnley Jan 2, 2025
25110ec
Update README.rst
charnley Jan 2, 2025
65334be
Iteration
charnley Jan 3, 2025
b5e4c3b
Update README.rst
charnley Jan 3, 2025
f77ed46
Update README.rst
charnley Jan 3, 2025
d9e953f
Update README.rst
charnley Jan 3, 2025
eba625d
Update README.rst
charnley Jan 3, 2025
b246349
Iteration
charnley Jan 4, 2025
ad83eb1
Merge branch 'master' into charnley/doc
charnley Jan 4, 2025
3ff568e
Figures
charnley Jan 4, 2025
3da31a3
Update README.rst
charnley Jan 4, 2025
95e816b
Clean up notebooks
charnley Jan 4, 2025
7ec2bc8
Added inertia test and fixed some stuff
charnley Jan 4, 2025
51a16dd
Delete tests/resources/examples/test.xyz
charnley Jan 4, 2025
2bd6764
Update README.rst
charnley Jan 5, 2025
a561a93
Delete example.py
charnley Jan 5, 2025
3a2ef0b
remove old images
charnley Jan 5, 2025
c0f791c
Update README.rst
charnley Jan 5, 2025
a1a739a
Update README.rst
charnley Jan 5, 2025
c249fe8
Update README.rst
charnley Jan 5, 2025
bbbed38
Update README.rst
charnley Jan 5, 2025
67ef723
Move to docs
charnley Jan 5, 2025
234dbf4
Update README.rst
charnley Jan 6, 2025
51e5885
Feedback from Hagen
charnley Jan 9, 2025
0179219
Types
charnley Jan 9, 2025
0db3536
Added better --help information
charnley Jan 9, 2025
1ae244a
Update README.rst
charnley Jan 9, 2025
303d1a2
Update README.rst
charnley Jan 9, 2025
b760b29
fixed f-string for older python
charnley Jan 9, 2025
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
1 change: 1 addition & 0 deletions .gitignore
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@
.vim
env
*.sqlite3
*.ipynb_checkpoints

*.py[cod]

Expand Down
2 changes: 1 addition & 1 deletion .pre-commit-config.yaml
Original file line number Diff line number Diff line change
Expand Up @@ -70,7 +70,7 @@ repos:
hooks:
- id: mypy
name: checking types
entry: mypy
entry: mypy --explicit-package-bases
language: system
types: [ python ]

Expand Down
3 changes: 3 additions & 0 deletions Makefile
Original file line number Diff line number Diff line change
Expand Up @@ -47,6 +47,9 @@ build:
upload:
${python} -m twine upload ./dist/*

start-jupyter:
${python} -m jupyterlab

## Version

version:
Expand Down
205 changes: 134 additions & 71 deletions README.rst
Original file line number Diff line number Diff line change
@@ -1,90 +1,82 @@
Calculate Root-mean-square deviation (RMSD) of Two Molecules Using Rotation
===========================================================================

The root-mean-square deviation (RMSD) is calculated, using Kabsch algorithm
(1976) or Quaternion algorithm (1991) for rotation, between two Cartesian
coordinates in either ``.xyz`` or ``.pdb`` format, resulting in the minimal
RMSD.
The root mean Square Deviation (RMSD) is the most common metric for measuring structural similarity between two structures. It is typically used in molecular biology, chemistry, and computational chemistry.

For more information please read RMSD_ and `Kabsch algorithm`_.
However, the result can become misleadingly large unless the input data has pre-optimized translation and rotation of the molecules in question.
This solution will perform this optimization before calculating optimal (minimal) RMSD values.

.. _RMSD: http://en.wikipedia.org/wiki/Root-mean-square_deviation
.. _Kabsch algorithm: http://en.wikipedia.org/wiki/Kabsch_algorithm

Motivation
----------
Additionally, if the atoms in the molecules are not correctly ordered, optimal rotation is impossible to achieve.
This tool utilizes several ways to solve this problem.

You have molecule A and B and want to calculate the structural difference
between those two. If you just calculate the RMSD_ straight-forward you might
get a too big of a value as seen below. You would need to first recenter the
two molecules and then rotate them unto each other to get the true minimal
RMSD. This is what this script does.
For more details, see below and read RMSD_ and `Kabsch algorithm`_.

========== =========== ==========
No Changes Re-centered Rotated
---------- ----------- ----------
|begin| |translate| |rotate|
========== =========== ==========
RMSD 2.50 RMSD 1.07 RMSD 0.25
========== =========== ==========
.. contents:: Overview
:depth: 1

.. |begin| image:: https://raw.githubusercontent.com/charnley/rmsd/master/img/plot_beginning.png
.. |translate| image:: https://raw.githubusercontent.com/charnley/rmsd/master/img/plot_translated.png
.. |rotate| image:: https://raw.githubusercontent.com/charnley/rmsd/master/img/plot_rotated.png


Citation
--------
Installation
============

- **Implementation**:
Calculate Root-mean-square deviation (RMSD) of Two Molecules Using Rotation, GitHub,
http://github.com/charnley/rmsd, <git commit hash or version number>
The easiest is to get the program via ``pip``.

- **Kabsch algorithm**:
Kabsch W., 1976,
A solution for the best rotation to relate two sets of vectors,
Acta Crystallographica, A32:922-923,
doi: http://dx.doi.org/10.1107/S0567739476001873
.. code-block:: bash

- **Quaternion algorithm**:
Michael W. Walker and Lejun Shao and Richard A. Volz, 1991,
Estimating 3-D location parameters using dual number quaternions, CVGIP: Image Understanding, 54:358-367,
doi: http://dx.doi.org/10.1016/1049-9660(91)90036-o
pip install rmsd

Please cite this project when using it for scientific publications.
There is only one Python file, so you can also download `calculate_rmsd.py` and put it in your bin folder.

.. code-block:: bash

Installation
------------
wget -O calculate_rmsd https://raw.githubusercontent.com/charnley/rmsd/master/rmsd/calculate_rmsd.py
chmod +x calculate_rmsd

Easiest is to get the program vis PyPi under the package name ``rmsd``,
Details
=======

.. code-block:: bash
To calculate the structural difference between two molecules, you might initially compute the RMSD directly (**Figure 1.a**).
However, this straightforward approach could give you a misleadingly large value.
To get the true minimal RMSD, you must adjust for translation (**Figure 1.b**) and rotation (**Figure 1.c**). This process aligns the two molecules best, ensuring the RMSD accurately reflects their structural similarity after optimal alignment.

pip install rmsd
.. list-table::
:header-rows: 1

* - 1.a
- 1.b
- 1.c

or download the project from GitHub via
* - |fig1.a|
- |fig1.b|
- |fig1.c|

.. code-block:: bash
* - RMSD = 2.8
- RMSD = 0.8
- RMSD = 0.2

git clone https://github.com/charnley/rmsd
Atom reordering methods can be used in cases where the atoms in the two molecules are not in the same order (**Figure 2.a**).
These algorithms find the optimal mapping of atoms between the two structures to minimize RMSD.

Each method has limitations because finding the best atom mapping depends on properly aligning structures.
This is usually done by comparing atom-pair distances. If the molecules are aligned, using the Hungarian_ cost minimization for atom distance works well.
If not, you can align the Inertia_ eigenvectors (**Figure 2.b**) as an approximation to align the molecules.
Or, use atomic descriptors (**Figure 2.c**), independent of the coordinate system, to reorder the atoms. Note that all reordering methods have limitations and drawbacks, and the actual order might not be found.

There is only one Python file, so you can also download `calculate_rmsd.py` and
put it in your bin folder.
.. list-table::
:header-rows: 1

.. code-block:: bash
* - 2.a
- 2.b
- 2.c

wget -O calculate_rmsd https://raw.githubusercontent.com/charnley/rmsd/master/rmsd/calculate_rmsd.py
chmod +x calculate_rmsd
* - |fig2.a|
- |fig2.b|
- |fig2.c|

Usage examples
--------------
==============

Use ``calculate_rmsd --help`` to see all the features. Usage is pretty straight
forward, call ``calculate_rmsd`` with two structures in either ``.xyz`` or
``.pdb``. In this example Ethane has the exact same structure, but is
``.pdb``. In this example, Ethane has the same structure but is
translated in space, so the RMSD should be zero.

.. code-block:: bash
Expand All @@ -99,33 +91,104 @@ visual comparison. The output will be in XYZ format.

calculate_rmsd --no-hydrogen --print tests/ethane.xyz tests/ethane_mini.xyz

If the atoms are scrambled and not aligned you can use the ``--reorder``
argument which will align the atoms from structure B unto A. Use
``--reorder-method`` to select what method for reordering. Choose between
Hungarian_ (default), distance (very approximate) and brute force (slow).
If the atoms are scrambled and not aligned, you can use the ``--reorder``
argument, which will align the atoms from structure B onto A.

.. _Hungarian: https://en.wikipedia.org/wiki/Hungarian_algorithm
Use ``--reorder-method`` to select the reordering method.
Choose between
Inertia_ aligned Hungarian_ distance ``inertia-hungarian`` (default),
Hungarian_ distance ``hungarian`` (if the structure is already aligned),
sorted distance ``distance``,
atomic representation ``qml``,
and brute force ``brute`` (for reference, don't use this).
More details on which to use in ``--help``.

.. code-block:: bash

calculate_rmsd --reorder tests/water_16.xyz tests/water_16_idx.xyz

If you want to run multiple calculations simultaneously, it's best not to rely solely on the script.
Instead, you can use GNU Parallel to handle this efficiently. For example, compare all ``ethane_*`` molecules using two cores and print one file and the RMSD per line.
Bash is good for stuff like that.

It is also possible to use RMSD as a library in other scripts, see
``example.py`` and ``tests/*`` for example usage.

.. code-block:: bash

Problems?
---------
find tests/resources -name "ethane_*xyz" | parallel -j2 "echo -n '{} ' && calculate_rmsd --reorder --no-hydrogen tests/resources/ethane.xyz {}"

Submit issues or pull requests on GitHub.
It is also possible to use RMSD as a library in other scripts; see ``tests/*`` for example usage.

Known problems
==============

A note on PDB
-------------
Found a bug? Submit issues or pull requests on GitHub.

Protein Data Bank format (PDB) is column-based; however, countless examples of non-standard ``.pdb`` files exist.
**Note on PDB format.** Protein Data Bank format (PDB) is column-based; however, countless examples of non-standard ``.pdb`` files exist.
We try to read them, but if you have trouble reading the file, check if the file format is compliant with PDB.
For example, some hydrogens are noted as ``HG11``, which we assume is not mercury.

- https://www.wwpdb.org/documentation/file-format-content/format33/sect9.html#ATOM

Citation
========

Please cite this project when using it for scientific publications. And cite the relevant methods implemented.

**Implementation**:
Calculate Root-mean-square deviation (RMSD) of Two Molecules Using Rotation, GitHub,
http://github.com/charnley/rmsd, <git commit hash or version number>


.. list-table::
:header-rows: 1

* - Method
- Argument
- Citation

* - **Kabsch**
- ``--rotation-method kabsch`` (Default)
- Wolfgang Kabsch (1976),
Acta Crystallographica, A32:922-923

http://dx.doi.org/10.1107/S0567739476001873

* - **Quaternion**
- ``--rotation-method quaternion``
- Walker, Shao & Volz (1991),
CVGIP: Image Understanding, 54:358-367,

http://dx.doi.org/10.1016/1049-9660(91)90036-o

* - **Distance Hungarian Assignment**
- ``--reorder-method inertia-hungarian`` (Default)
- Crouse (2016). Vol. 52, Issue 4, pp. 1679–1696, IEEE.

http://dx.doi.org/10.1109/TAES.2016.140952

* - **FCHL19**
- ``--reorder-method qml``
- Christensen et al (2020), J. Chem. Phys. 152, 044107

https://doi.org/10.1063/1.5126701

References
==========

- http://en.wikipedia.org/wiki/Root-mean-square_deviation
- http://en.wikipedia.org/wiki/Kabsch_algorithm
- https://en.wikipedia.org/wiki/Hungarian_algorithm
- https://docs.scipy.org/doc/scipy/reference/generated/scipy.optimize.linear_sum_assignment.html

.. _RMSD: http://en.wikipedia.org/wiki/Root-mean-square_deviation
.. _Kabsch algorithm: http://en.wikipedia.org/wiki/Kabsch_algorithm
.. _Hungarian: https://en.wikipedia.org/wiki/Hungarian_algorithm
.. _Inertia: https://en.wikipedia.org/wiki/Moment_of_inertia


.. |fig1.a| image:: https://raw.githubusercontent.com/charnley/rmsd/refs/heads/charnley/doc/docs/figures/fig_rmsd_nothing.png
.. |fig1.b| image:: https://raw.githubusercontent.com/charnley/rmsd/refs/heads/charnley/doc/docs/figures/fig_rmsd_recentered.png
.. |fig1.c| image:: https://raw.githubusercontent.com/charnley/rmsd/refs/heads/charnley/doc/docs/figures/fig_rmsd_rotated.png

.. |fig2.a| image:: https://raw.githubusercontent.com/charnley/rmsd/refs/heads/charnley/doc/docs/figures/fig_reorder_problem.png
.. |fig2.b| image:: https://raw.githubusercontent.com/charnley/rmsd/refs/heads/charnley/doc/docs/figures/fig_reorder_inertia.png
.. |fig2.c| image:: https://raw.githubusercontent.com/charnley/rmsd/refs/heads/charnley/doc/docs/figures/fig_reorder_qml.png
Binary file added docs/figures/fig_reorder_inertia.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/figures/fig_reorder_problem.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/figures/fig_reorder_qml.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/figures/fig_rmsd_nothing.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/figures/fig_rmsd_recentered.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
Binary file added docs/figures/fig_rmsd_rotated.png
Loading
Sorry, something went wrong. Reload?
Sorry, we cannot display this file.
Sorry, this file is invalid so it cannot be displayed.
29 changes: 29 additions & 0 deletions docs/notebooks/coord_funcs.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,29 @@
import numpy as np
from numpy import ndarray


def get_rotation_matrix(degrees: float) -> ndarray:
"""https://en.wikipedia.org/wiki/Rotation_matrix"""

radians = degrees * np.pi / 180.0

r11 = np.cos(radians)
r12 = -np.sin(radians)
r21 = np.sin(radians)
r22 = np.cos(radians)

R = np.array([[r11, r12], [r21, r22]])

return R


def degree2radiant(degrees):
return degrees * np.pi / 180.0


def rotate_coord(angle, coord, axis=[0, 1]):
U = get_rotation_matrix(angle)
_xy = np.dot(coord[:, axis], U)
_coord = np.array(coord, copy=True)
_coord[:, axis] = _xy
return _coord
Loading
Loading