Mie Performance and Jitting
Scott Prahl
Feb 2025
Unfortunately, switching between jit and non-jit during runtime is too complicated when combined with numba caching.
[4]:
import os
import tempfile
import numpy as np
os.environ["MIEPYTHON_USE_JIT"] = "1" # Set to "0" to disable JIT
os.environ["NUMBA_CACHE_DIR"] = tempfile.gettempdir()
import miepython as mie
Size Parameters
We will use %timeit to see speeds for unjitted code, then jitted code
[6]:
ntests = 6
m = 1.5
N = np.logspace(0, 3, ntests, dtype=int)
result = np.zeros(ntests)
resultj = np.zeros(ntests)
for i in range(ntests):
x = np.linspace(0.1, 20, N[i])
a = %timeit -o qext, qsca, qback, g = mie.efficiencies_mx(m,x)
result[i] = a.best
# mie.use_numba(True) # Ensure the JIT backend is used
# for i in range(ntests):
# x = np.linspace(0.1, 20, N[i])
# a = %timeit -o qext, qsca, qback, g = mie.efficiencies_mx(m,x)
# resultj[i] = a.best
# improvement = result / resultj
# plt.loglog(N, resultj, ":r")
# plt.loglog(N, result, ":b")
# plt.loglog(N, resultj, "or", label="jit")
# plt.loglog(N, result, "ob", label="no jit")
# plt.legend()
# plt.xlabel("Number of sphere sizes calculated")
# plt.ylabel("Execution Time")
# plt.title("Jit improvement is %d to %dX" % (np.min(improvement), np.max(improvement)))
# plt.show()
1.71 μs ± 23.7 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
5.62 μs ± 156 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
25.4 μs ± 790 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
105 μs ± 558 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
407 μs ± 3.32 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
1.63 ms ± 46 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
Embedded spheres
[8]:
ntests = 6
mwater = 4 / 3 # rough approximation
m = 1.0
mm = m / mwater
r = 500 # nm
N = np.logspace(0, 3, ntests, dtype=int)
result = np.zeros(ntests)
resultj = np.zeros(ntests)
for i in range(ntests):
lambda0 = np.linspace(300, 800, N[i]) # also in nm
xx = 2 * np.pi * r * mwater / lambda0
a = %timeit -o qext, qsca, qback, g = mie.efficiencies_mx(mm,xx)
result[i] = a.best
# mie.use_numba(True) # Ensure the JIT backend is used
# for i in range(ntests):
# lambda0 = np.linspace(300, 800, N[i]) # also in nm
# xx = 2 * np.pi * r * mwater / lambda0
# a = %timeit -o qext, qsca, qback, g = mie.efficiencies_mx(mm,xx)
# resultj[i] = a.best
# improvement = result / resultj
# plt.loglog(N, resultj, ":r")
# plt.loglog(N, result, ":b")
# plt.loglog(N, resultj, "or", label="jit")
# plt.loglog(N, result, "ob", label="no jit")
# plt.legend()
# plt.xlabel("Number of Wavelengths Calculated")
# plt.ylabel("Execution Time")
# plt.title("Jit improvement is %d to %dX" % (np.min(improvement), np.max(improvement)))
# plt.show()
2.85 μs ± 111 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
5.81 μs ± 75.1 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
25.2 μs ± 270 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
103 μs ± 1.11 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
405 μs ± 1.84 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
1.64 ms ± 28.3 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
Testing efficiencies
Another high level function that should be sped up by jitting.
[9]:
ntests = 6
m_sphere = 1.0
n_water = 4 / 3
d = 1000 # nm
N = np.logspace(0, 3, ntests, dtype=int)
result = np.zeros(ntests)
resultj = np.zeros(ntests)
for i in range(ntests):
lambda0 = np.linspace(300, 800, N[i]) # also in nm
a = %timeit -o qext, qsca, qback, g = mie.efficiencies(m_sphere, d, lambda0, n_water)
result[i] = a.best
# mie.use_numba(True) # Ensure the JIT backend is used
# for i in range(ntests):
# lambda0 = np.linspace(300, 800, N[i]) # also in nm
# a = %timeit -o qext, qsca, qback, g = mie.efficiencies(m_sphere, d, lambda0, n_water)
# resultj[i] = a.best
# improvement = result / resultj
# plt.loglog(N, resultj, ":r")
# plt.loglog(N, result, ":b")
# plt.loglog(N, resultj, "or", label="jit")
# plt.loglog(N, result, "ob", label="no jit")
# plt.legend()
# plt.xlabel("Number of Wavelengths Calculated")
# plt.ylabel("Execution Time")
# plt.title("Jit improvement is %d to %dX" % (np.min(improvement), np.max(improvement)))
# plt.show()
3.61 μs ± 27.9 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
6.88 μs ± 60.8 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
26.3 μs ± 167 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
106 μs ± 2.52 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
407 μs ± 4.93 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
1.61 ms ± 14.3 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
Scattering Phase Function
[10]:
ntests = 6
m = 1.5
x = np.pi / 3
N = np.logspace(0, 3, ntests, dtype=int)
result = np.zeros(ntests)
resultj = np.zeros(ntests)
for i in range(ntests):
theta = np.linspace(-180, 180, N[i])
mu = np.cos(theta / 180 * np.pi)
a = %timeit -o s1, s2 = mie.S1_S2(m,x,mu)
result[i] = a.best
# for i in range(ntests):
# theta = np.linspace(-180, 180, N[i])
# mu = np.cos(theta / 180 * np.pi)
# a = %timeit -o s1, s2 = mie.S1_S2(m,x,mu)
# resultj[i] = a.best
# improvement = result / resultj
# plt.loglog(N, resultj, ":r")
# plt.loglog(N, result, ":b")
# plt.loglog(N, resultj, "or", label="jit")
# plt.loglog(N, result, "ob", label="no jit")
# plt.legend()
# plt.xlabel("Number of Angles Calculated")
# plt.ylabel("Execution Time")
# plt.title("Jit improvement is %d to %dX" % (np.min(improvement), np.max(improvement)))
# plt.show()
4.35 μs ± 100 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
4.12 μs ± 41.3 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
5.19 μs ± 64.3 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
8.71 μs ± 129 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
23.2 μs ± 264 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
81.7 μs ± 877 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
And finally, as function of sphere size
[11]:
ntests = 6
m = 1.5 - 0.1j
x = np.logspace(0, 3, ntests)
result = np.zeros(ntests)
resultj = np.zeros(ntests)
theta = np.linspace(-180, 180)
mu = np.cos(theta / 180 * np.pi)
for i in range(ntests):
a = %timeit -o s1, s2 = mie.S1_S2(m,x[i],mu)
result[i] = a.best
# mie.use_numba(True) # Ensure the JIT backend is used
# for i in range(ntests):
# a = %timeit -o s1, s2 = mie_S1_S2(m,x[i],mu)
# resultj[i] = a.best
# improvement = result / resultj
# plt.loglog(N, resultj, ":r")
# plt.loglog(N, result, ":b")
# plt.loglog(N, resultj, "or", label="jit")
# plt.loglog(N, result, "ob", label="no jit")
# plt.legend()
# plt.xlabel("Sphere Size Parameter")
# plt.ylabel("Execution Time")
# plt.title("Jit improvement is %d to %dX" % (np.min(improvement), np.max(improvement)))
# plt.show()
7.86 μs ± 100 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
12.9 μs ± 89.9 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
20.2 μs ± 23.5 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
42.9 μs ± 2.61 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
133 μs ± 1.75 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
504 μs ± 19 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
[ ]: