The counter-intuitive rise of Python in scientific computing
The writer of this code is Matthew Kennel. In keeping with the publication the place it was launched (again in 2004!) (https://arxiv.org/pdf/physics/0408067.pdf) it’s offered below phrases of the Academic Free License. I’ve finished some refactoring of it earlier than in a non-public fork.
The Scipy library has a helpful kd-tree which is described briefly of their Nature Strategies paper (see Fig. 3 in https://www.nature.com/articles/s41592-019-0686-2). The newest model is definitely written in C++:
The benchmark is to assemble a tree of n = 10000 uniformly distributed factors, and question the closest neighbors for r = 1000 random vectors in m = 3, 8, and 16 dimensions. The precise benchmark setup will be discovered right here: https://github.com/scipy/scipy/blob/10c8c8de491a607378d6dca7602b0a3ed440a4b4/benchmarks/benchmarks/spatial.py#L110-L139
I’ve repeated this check with the kdtree2
code from Matthew Kennel:
program kdtree_benchmark
use kdtree2_module
implicit none
sort(kdtree2), pointer :: tree
actual(kdkind), dimension(:,:), allocatable :: knowledge, queries
sort(kdtree2_result) :: outcomes(1)
actual :: t0, t1, t2
integer, parameter :: m(3) = [3, 8, 16]
integer, parameter :: n = 10000, r = 1000
integer :: idxs(r), idx
integer :: i
do i = 1, 3
write(*,*) "Dimension = ", m(i)
allocate(knowledge(m(i),n))
name random_number(knowledge)
allocate(queries(m(i),r))
name random_number(queries)
! Populate tree
name cpu_time(t0)
tree => kdtree2_create(knowledge,type=.true.,rearrange=.true.)
name cpu_time(t1)
! Question random vectors
do idx = 1, r
name kdtree2_n_nearest(tp=tree,qv=queries(:,idx),nn=1,outcomes=outcomes)
idxs(idx) = outcomes(1)%idx
finish do
name cpu_time(t2)
write(*,*) "Tree construct (s): ", t1 - t0
write(*,*) "Tree question (s): ", t2 - t1
name kdtree2_destroy(tree)
deallocate(knowledge)
deallocate(queries)
finish do
finish program
Operating this system on a >5 yr outdated ThinkPad T530 with an Intel® Core™ i5-3320M CPU:
$ gfortran -O3 src/kdtree2_precision_module.f90 src/kdtree2_priority_queue_module.f90 src/kdtree2_module.f90 checks/time.f90 -o time
$ ./time
Dimension = 3
Tree construct (s): 3.41600017E-03
Tree question (s): 1.05800014E-03
Dimension = 8
Tree construct (s): 4.14900016E-03
Tree question (s): 1.39079997E-02
Dimension = 16
Tree construct (s): 5.63300028E-03
Tree question (s): 0.218334988
The 16 yr outdated Fortran code delivers related efficiency to the newest Scipy model!
Had Bob from the weblog publish identified about it, he may have instructed these Python kids…