Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Linear and efficient neighbour finder #393

Merged
merged 17 commits into from
Feb 16, 2024
Merged

Conversation

pfebrer
Copy link
Contributor

@pfebrer pfebrer commented Nov 2, 2021

This PR implements a linear scaling algorithm to find neighbours.

The algorithm

It is mostly inspired by the method described here. I don't know if it is the same way SIESTA does it.

Basically you divide the whole unit cell into 3D bins and you "store" each atom in the corresponding bin. Each bin is as big as the maximum distance that you have to look for neighbours. E.g. if the maximum radius is 1.5 Ang, then the bins need to be of size 3 Ang (6 Ang if you are looking for sphere overlaps.). That ensures that when you look for the neighbors of a given atom/position you only need to go one bin away in each direction. Therefore, you just need to explore 2**3 = 8 bins for each atom/position instead of all the atoms in the structure.

The concept is the same as what ASE uses in their neighbour list (https://wiki.fysik.dtu.dk/ase/ase/neighborlist.html). The implementation is quite different though and you will see this in the benchmarks (below).

How to use it

Currently, you have to import the class NeighFinder:

from sisl.geom import NeighFinder

Then instantiate it with a geometry, which will do the organization of the bins. The parameters R and sphere_overlap need to be provided here as well, since the size of the bins depends on them.

geom = sisl.geom.graphene_nanoribbon(5).tile(3, 0)
finder = NeighFinder(geom, R=1.5, sphere_overlap=False)

Once instantiated, you will be able to query the neighbour finder as much as you want. Although the most efficient way to query it is all at once. There are two methods to find neighbours: find_neighbours and find_all_unique_pairs.

The first one accepts atoms as an argument to find only the neighbours for a subset.

finder.find_neighbours(atoms=[10, 13])
[array([12,  8, 13], dtype=int32), array([16, 11, 10], dtype=int32)]

You can also choose to return the neighbours as pairs if this is the way you need them, which is faster (splitting to get the above result is actually just an extra internal step):

finder.find_neighbours(atoms=[10,13], as_pairs=True)
array([[10, 12],
       [10,  8],
       [10, 13],
       [13, 16],
       [13, 11],
       [13, 10]], dtype=int32)

Also, if you want just all unique pairs of neighbouring atoms you can call find_all_unique_pairs, which will save time and memory.

finder.find_all_unique_pairs()
array([[ 0,  2],
       [ 0,  3],
       [ 1,  4],
       [ 1,  3],
       [ 2,  5],
       [ 3,  6],
       [ 4,  7],
       [ 5,  8],
       [ 6,  9],
       [ 6,  8],
       [ 7,  9],
       [ 8, 10],
       [ 9, 11],
       [10, 12],
       [10, 13],
       [11, 14],
       [11, 13],
       [12, 15],
       [13, 16],
       [14, 17],
       [15, 18],
       [16, 19],
       [16, 18],
       [17, 19],
       [18, 20],
       [19, 21],
       [20, 22],
       [20, 23],
       [21, 24],
       [21, 23],
       [22, 25],
       [23, 26],
       [24, 27],
       [25, 28],
       [26, 29],
       [26, 28],
       [27, 29]], dtype=int32)

Benchmarks

Here I benchmark the implementation against ASE and a trivial close search.

The quadratic search is just a loop like:

def quadratic(geom):
    neighs = []
    for at in geom:
        neighs.append(geom.close(at, R=1.5))

The ASE call looks like:

from ase.neighborlist import neighbor_list

neighbours = neighbor_list("ij", geom.to.ase(), 1.5, self_interaction=False)

and this implementation call is:

from sisl.geom import NeighFinder

neighbours = NeighFinder(geom, R=1.5, sphere_overlap=False).find_neighbours(None, self_interaction=False, ...)
# or the same but calling find_all_unique_pairs
See the full script
from ase.neighborlist import neighbor_list
from sisl.geom import NeighFinder, fcc, graphene_nanoribbon

import timeit
from collections import defaultdict

def quadratic(geom):
    neighs = []
    for at in geom:
        neighs.append(geom.close(at, R=1.5))

what = "fcc"
times = defaultdict(list)
na = []

# Iterate over multiple tiles
for tiles in range(1, 10, 1):
    print(tiles)
    
    if what == "fcc":
        geom = fcc(1.5, "C").tile(3*tiles, 1).tile(3*tiles, 2).tile(3*tiles, 0)
    elif what == "ribbon":
        geom = graphene_nanoribbon(9).tile(tiles, 0)
    
    na.append(geom.na)
    
    # Compute with ASE
    ase_time = timeit.timeit(lambda: neighbor_list("ij", geom.to.ase(), 1.5, self_interaction=False), number=1)
    times["ASE"].append(ase_time)
    
    # Compute with this implementation (different variants)
    my_time = timeit.timeit(lambda: NeighFinder(geom, R=1.5).find_neighbours(None, as_pairs=True), number=1)
    times["THIS (PAIRS)"].append(my_time)
    
    my_time = timeit.timeit(lambda: NeighFinder(geom, R=1.5).find_neighbours(None, as_pairs=False), number=1)
    times["THIS (NEIGHS)"].append(my_time)
    
    my_time = timeit.timeit(lambda: NeighFinder(geom, R=1.5).find_all_unique_pairs(), number=1)
    times["THIS (UNIQUE PAIRS)"].append(my_time)
    
    # Compute with quadratic search
    quadratic_time = timeit.timeit(lambda: quadratic(geom), number=1)
    times["QUADRATIC"].append(quadratic_time)
    
# Plotting
import plotly.graph_objs as go

fig = go.Figure()
for key, time in times.items():
    fig.add_scatter(
        x=na, y=time, name=key
    )

fig.update_layout(
    xaxis_title="Number of atoms",
    yaxis_title="Time (s)",
    title=f"Finding neighbours on {what}",
    yaxis_showgrid=True,
    xaxis_showgrid=True
)

First, I compare the calculation of neighbors for an fcc system, which I tile uniformly in all directions

Against quadratic implementation:

fcc_vs_quad

From as little as 40 atoms, this implementation is faster. After that, the quadratic implementation blows up. This is even more critical if the system has periodic boundary conditions. With nsc=[3,3,3]:

image

Note that the implementation in this PR only needs to calculate some offsets if there are pbc and we are at the border of the cell, but still the search is only performed on 8 bins.

Note also that this implementation can scale to huge systems:
image

It works even better in sparse systems, where the number of candidate neighbors for each atom is much lower since some bins don't contain atoms:

image

Now, comparing with ASE, I have found that their algorithm scales pretty bad with the cell size:
image

This is (I think), because they store the location of each atom in a dense array of shape (nbins, max_atoms_per_bin). I followed the implementation described here. The usage of memory seems to be much more optimized. In fact, I have checked that this implementation almost keeps constant time if we increase the cell while keeping the number of atoms constant, while ASE's scales linearly with it:

image

Work to do

The current implementation only finds neighbors in the unit cell. As I mentioned before, one only needs to apply offsets if need it. I already compute the supercell indices of the bins that are searched, so it shouldn't be much of a problem.

However, I have some doubts:

  • When should we look for neighbors in neighboring cells? In principle, periodic conditions in sisl are indicated by nsc > 1. However, the user may not know beforehand if there are inter-cell interactions for a given radius. Should we require nsc to be 3 anyway?
  • Should we return atom indices in supercell indices? If so, that depends on nsc. So the user would need to set nsc themselves or receive the new values of nsc for them to use the indices properly.

Also if the neighbor search spans more than 1 neighboring cell (nsc > 3), I will make it compute neighbors with the quadratic algorithm.

Cheers!

@zerothi
Copy link
Owner

zerothi commented Nov 2, 2021

could you just share your benchmark script? For completeness sake ;)

@pfebrer
Copy link
Contributor Author

pfebrer commented Nov 2, 2021

Done! I updated the comment, it is hidden in the "See the full script" collapsible at the top of the benchmarks.

@zerothi
Copy link
Owner

zerothi commented Nov 2, 2021

Just a comment. In your ase timing, I think you are also timing the conversion of the sisl object to the ase object. Try and move this conversion out and see if this changes the picture. I get quite good timings with ase all across.

@pfebrer
Copy link
Contributor Author

pfebrer commented Nov 2, 2021

Hmm I moved the conversion out of the timing and I get almost the same. Could you share which systems are you testing? When I began testing, I only tiled the fcc in one direction and that gave similar timings between ASE and this implementation. But when I started tiling all dimensions the ASE implementation started to give very high timings.

@pfebrer
Copy link
Contributor Author

pfebrer commented Nov 2, 2021

Hmm I moved the conversion out of the timing and I get almost the same. Could you share which systems are you testing? When I began testing, I only tiled the fcc in one direction and that gave similar timings between ASE and this implementation. But when I started tiling all dimensions the ASE implementation started to give very high timings.

As I said, I think the problem is that the ASE implementation scales with cell size (see time vs cell size in the benchmarks)

@zerothi
Copy link
Owner

zerothi commented Nov 2, 2021

Hmm I moved the conversion out of the timing and I get almost the same. Could you share which systems are you testing? When I began testing, I only tiled the fcc in one direction and that gave similar timings between ASE and this implementation. But when I started tiling all dimensions the ASE implementation started to give very high timings.

I fucked it up ;) Nevertheless moving it outside is appropriate :)

@pfebrer
Copy link
Contributor Author

pfebrer commented Nov 3, 2021

I incorporated the distance filtering inside the fortran loop (previously it was done a posteriori in python) and now it is even more efficient. I kept the old ones to compare for now (dashed lines in this plot):

image

Now one can scale up the tiled fcc even more (compare with previous fcc plot which showed up to 1.6M atoms):

image

In fact I could keep making it larger if it wasn't because it results in an integer overflow 😅

@zerothi
Copy link
Owner

zerothi commented Nov 3, 2021

This is amazing!! I'll have a look asap, but I'll try and get #388 done first.

@pfebrer
Copy link
Contributor Author

pfebrer commented Nov 3, 2021

Ok! I'm going to keep optimizing some things now that I know everything works fine :)

@pfebrer
Copy link
Contributor Author

pfebrer commented Nov 4, 2021

I added (optional) periodic boundary conditions. For now, what is done is to return the supercell indices of the neighbors (in three extra columns).

Also, I added a find_close method to find neighbors of space coordinates instead of atoms. I guess it can be useful for grids (?).

@pfebrer
Copy link
Contributor Author

pfebrer commented Dec 14, 2021

This is almost ready, the only two things that need to be done are:

  • Implement the edge case of needing a supercell bigger than 3 in some direction.
  • Decide about the API:
    • Should we merge all the functionality in a single method?
    • Probably add option to return bond length directly and coordinate shifts instead of supercell indices.
    • Should this method replace the Geometry.find_close method?

@zerothi
Copy link
Owner

zerothi commented Dec 14, 2021

Thanks, I still plan on getting to this very interesting and useful thing! :)

I would however, probably change it to a Cython code, since fortran does not play nice with C-arrays (it may incur copies that are not necessary. So I first need to understand what you are doing, and then subsequently change it a bit. :)

@pfebrer
Copy link
Contributor Author

pfebrer commented Dec 14, 2021

Isn't there a way to check if copies are made? I tried to do it and didn't get any warning, but probably it is just because I did it wrong 😅

@zerothi
Copy link
Owner

zerothi commented Dec 14, 2021

Isn't there a way to check if copies are made? I tried to do it and didn't get any warning, but probably it is just because I did it wrong sweat_smile

pip3 install . --f2py-report-copy

should suffice, however, I would still like to use C instead of fortran. Simply because of compatibility issues going forward. numpy distutils will be discontinued in 3.12, and then I need to change to pythran (most likely). However, Cython won't require anything. So keeping it like this would be ideal.

@pfebrer
Copy link
Contributor Author

pfebrer commented Feb 22, 2022

I'm trying to move this to cython 👍

@codecov
Copy link

codecov bot commented Feb 22, 2022

Codecov Report

Attention: 208 lines in your changes are missing coverage. Please review.

Comparison is base (09d1388) 86.14% compared to head (7f9b4dc) 85.78%.
Report is 7 commits behind head on main.

Files Patch % Lines
src/sisl/geom/_neighbors/_operations.py 0.00% 194 Missing ⚠️
src/sisl/geom/_neighbors/_finder.py 93.02% 12 Missing ⚠️
src/sisl/utils/misc.py 88.88% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #393      +/-   ##
==========================================
- Coverage   86.14%   85.78%   -0.37%     
==========================================
  Files         364      367       +3     
  Lines       43511    43904     +393     
==========================================
+ Hits        37483    37662     +179     
- Misses       6028     6242     +214     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@pfebrer
Copy link
Contributor Author

pfebrer commented Feb 23, 2022

I have a Cython question. Imagine I have the geometry's coordinates in an array called xyz, and for a given atom I want to store the three components in another array called say atom_xyz. Doing this:

cdef np.ndarray[np.float_64, ndim=2] xyz
cdef np.ndarray[np.float_64] atom_xyz

atom_xyz[:] = xyz[3, :]

Cython shows python interaction and it slows down the code. Is there any way of avoiding the loop:

for i in range(3):
    atom_xyz[i] = xyz[3, i]

without hurting performance?

@zerothi
Copy link
Owner

zerothi commented Feb 23, 2022

I have a Cython question. Imagine I have the geometry's coordinates in an array called xyz, and for a given atom I want to store the three components in another array called say atom_xyz. Doing this:

cdef np.ndarray[np.float_64, ndim=2] xyz
cdef np.ndarray[np.float_64] atom_xyz

atom_xyz[:] = xyz[3, :]

Cython shows python interaction and it slows down the code. Is there any way of avoiding the loop:

for i in range(3):
    atom_xyz[i] = xyz[3, i]

without hurting performance?

You should generally avoid using np.ndarray in cython code, rather you should use memory views. And if they are temporary arrays use an allocated array.

Also, I guess you mean:

atom_xyz[:] = xyz[ia, :]

no?

@zerothi
Copy link
Owner

zerothi commented Feb 23, 2022

also, try and use cython --annotate that will give you a html page with some indications on where you can optimize ;)

@pfebrer
Copy link
Contributor Author

pfebrer commented Feb 23, 2022

Yes, I was already doing that! I got that there was python interaction in this part, but I didn't know that with a memoryview you could do this. I've changed it now and all good!

@pfebrer
Copy link
Contributor Author

pfebrer commented Feb 23, 2022

Ok, everything is working fine and as fast as in fortran now for the find_neighbours method :)

newplot - 2022-02-23T104204 560

I just have to do the same for the rest of methods and that would be it. This afternoon I'll work on it and the PR will be ready to discuss the details and merge! Any particular suggestion?

@zerothi
Copy link
Owner

zerothi commented Feb 23, 2022

Great job! I'll probably take over and clean things up, also so I know what's happening, I want to learn this one ;)

@zerothi
Copy link
Owner

zerothi commented Feb 23, 2022

Just let me know when I should take over! :0

@pfebrer
Copy link
Contributor Author

pfebrer commented Feb 23, 2022

Ok, you can look at it now, everything is implemented 👍

@zerothi
Copy link
Owner

zerothi commented Feb 25, 2022

Ok some comments:

  1. when you overwrite nsc in the geometry, you'll also change the supercell indices it returns, this will break in some cases, it needs to be fixed in another way
  2. Why do you only need to look vertically and horizontally for neighbouring bins? What about the diagonals? I.e. an atom at the bin corner could connect to an atom along the diagonal? So effectively one should look in D**3 bins, no? Perhaps your tests does not catch these corner cases?
  3. Would it make sense to allow lower dimensionality searches, e.g. projection all atoms onto a plane and search only in this plane? It seems trivially generalized.

I am still working on it, so don't push anything ;)

@pfebrer
Copy link
Contributor Author

pfebrer commented Feb 25, 2022

  1. Can you make me remember which part of the code do you mean? I can't recall. I don't return converted atomic indices and the supercell indices as I use them and return them have three components (x,y,z), so I don't understand how this could get affected (?)

  2. I do check the diagonals. For each atom/position I create a cube of 8 bins with potential neighbors (in get_search_indices). Could you draw a sketch of which bins I'm missing? I can't see it.

  3. Off the top of my head I don't know if this is easily generalizable, because depending on the direction of the plane, the created grid of bins would be useless. So perhaps it would be easier to create a new projected geometry and run the algorithm with that one. Which cases do you have in mind?

@zerothi zerothi mentioned this pull request Jan 23, 2024
10 tasks
src/sisl/geom/__init__.py Fixed Show fixed Hide fixed
@pfebrer
Copy link
Contributor Author

pfebrer commented Feb 14, 2024

Do you have a plan for making Geometry.close use the finder?

pfebrer and others added 13 commits February 15, 2024 11:03
Signed-off-by: Nick Papior <[email protected]>
future.annotations requires Cython>=3

Signed-off-by: Nick Papior <[email protected]>
changed class and argument names to make them simpler.

Also changed default of overlap to False.
The internal R is now an array, regardless, but
it can have either ndim 0 or 1, depending on
whether it is an array or not.

Signed-off-by: Nick Papior <[email protected]>
This is not to diminish other contributors.
Pol has made tremendous contributions.
This is a delicate decision, and is in no
way a discouragement of other contributions made.

Signed-off-by: Nick Papior <[email protected]>
Signed-off-by: Nick Papior <[email protected]>
Signed-off-by: Nick Papior <[email protected]>
@@ -43,6 +43,7 @@

"""
from ._composite import *
from ._neighbors import *

Check notice

Code scanning / CodeQL

'import *' may pollute namespace Note

Import pollutes the enclosing namespace, as the imported module
sisl.geom._neighbors
does not define '__all__'.
In some cases (when the bin-size is big) the
maximum pairs it would allow could be quite big.
When the bins has many atoms, the error in the
actual neighbors can be significant, leading
to memory problems.

Now the finder uses a maximum of 200MB of initial
neighbor lists. This results in incremental re-allocations
which is a bit more heavy, but it reduces the chances of
running out of memory.

One can change the default memory by changing the
class variable: Neighbor.memory accordingly.

Added it to the documentation.

Signed-off-by: Nick Papior <[email protected]>
@zerothi
Copy link
Owner

zerothi commented Feb 16, 2024

Do you have a plan for making Geometry.close use the finder?

Yes, however I think it would be best to figure out the #553 problem in a better way. Perhaps nsc should be a post-analysis problem that is used to cut the resulting arrays, and then work from that.

src/sisl/utils/misc.py Fixed Show fixed Hide fixed
@pfebrer
Copy link
Contributor Author

pfebrer commented Feb 16, 2024

Ok! Also, can we make bin_size accept a tuple with one element for each dimension?

@zerothi zerothi merged commit 04307fe into zerothi:main Feb 16, 2024
6 of 8 checks passed
@pfebrer
Copy link
Contributor Author

pfebrer commented Feb 16, 2024

Thank you! One less thing to worry about hehe

@zerothi
Copy link
Owner

zerothi commented Feb 16, 2024

Thanks for contribution!

zerothi added a commit that referenced this pull request Feb 23, 2024
This follows #393 naming convention (US-style).

Signed-off-by: Nick Papior <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants