Mie Performance and Jitting

Scott Prahl

Feb 2025

Unfortunately, switching between jit and non-jit during runtime is too complicated when combined with numba caching.

[4]:
import os
import tempfile
import numpy as np

os.environ["MIEPYTHON_USE_JIT"] = "1"  # Set to "0" to disable JIT
os.environ["NUMBA_CACHE_DIR"] = tempfile.gettempdir()

import miepython as mie

Size Parameters

We will use %timeit to see speeds for unjitted code, then jitted code

[6]:
ntests = 6

m = 1.5
N = np.logspace(0, 3, ntests, dtype=int)
result = np.zeros(ntests)
resultj = np.zeros(ntests)

for i in range(ntests):
    x = np.linspace(0.1, 20, N[i])
    a = %timeit -o qext, qsca, qback, g = mie.efficiencies_mx(m,x)
    result[i] = a.best

# mie.use_numba(True)  # Ensure the JIT backend is used
# for i in range(ntests):
#     x = np.linspace(0.1, 20, N[i])
#     a = %timeit -o qext, qsca, qback, g = mie.efficiencies_mx(m,x)
#     resultj[i] = a.best

# improvement = result / resultj
# plt.loglog(N, resultj, ":r")
# plt.loglog(N, result, ":b")
# plt.loglog(N, resultj, "or", label="jit")
# plt.loglog(N, result, "ob", label="no jit")
# plt.legend()
# plt.xlabel("Number of sphere sizes calculated")
# plt.ylabel("Execution Time")
# plt.title("Jit improvement is %d to %dX" % (np.min(improvement), np.max(improvement)))
# plt.show()
1.71 μs ± 23.7 ns per loop (mean ± std. dev. of 7 runs, 1,000,000 loops each)
5.62 μs ± 156 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
25.4 μs ± 790 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
105 μs ± 558 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
407 μs ± 3.32 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
1.63 ms ± 46 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

Embedded spheres

[8]:
ntests = 6
mwater = 4 / 3  # rough approximation
m = 1.0
mm = m / mwater
r = 500  # nm

N = np.logspace(0, 3, ntests, dtype=int)
result = np.zeros(ntests)
resultj = np.zeros(ntests)

for i in range(ntests):
    lambda0 = np.linspace(300, 800, N[i])  # also in nm
    xx = 2 * np.pi * r * mwater / lambda0
    a = %timeit -o qext, qsca, qback, g = mie.efficiencies_mx(mm,xx)
    result[i] = a.best

# mie.use_numba(True)  # Ensure the JIT backend is used
# for i in range(ntests):
#     lambda0 = np.linspace(300, 800, N[i])  # also in nm
#     xx = 2 * np.pi * r * mwater / lambda0
#     a = %timeit -o qext, qsca, qback, g = mie.efficiencies_mx(mm,xx)
#     resultj[i] = a.best

# improvement = result / resultj
# plt.loglog(N, resultj, ":r")
# plt.loglog(N, result, ":b")
# plt.loglog(N, resultj, "or", label="jit")
# plt.loglog(N, result, "ob", label="no jit")
# plt.legend()
# plt.xlabel("Number of Wavelengths Calculated")
# plt.ylabel("Execution Time")
# plt.title("Jit improvement is %d to %dX" % (np.min(improvement), np.max(improvement)))
# plt.show()
2.85 μs ± 111 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
5.81 μs ± 75.1 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
25.2 μs ± 270 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
103 μs ± 1.11 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
405 μs ± 1.84 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
1.64 ms ± 28.3 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

Testing efficiencies

Another high level function that should be sped up by jitting.

[9]:
ntests = 6
m_sphere = 1.0
n_water = 4 / 3
d = 1000  # nm
N = np.logspace(0, 3, ntests, dtype=int)
result = np.zeros(ntests)
resultj = np.zeros(ntests)

for i in range(ntests):
    lambda0 = np.linspace(300, 800, N[i])  # also in nm
    a = %timeit -o qext, qsca, qback, g = mie.efficiencies(m_sphere, d, lambda0, n_water)
    result[i] = a.best

# mie.use_numba(True)  # Ensure the JIT backend is used
# for i in range(ntests):
#     lambda0 = np.linspace(300, 800, N[i])  # also in nm
#     a = %timeit -o qext, qsca, qback, g = mie.efficiencies(m_sphere, d, lambda0, n_water)
#     resultj[i] = a.best

# improvement = result / resultj
# plt.loglog(N, resultj, ":r")
# plt.loglog(N, result, ":b")
# plt.loglog(N, resultj, "or", label="jit")
# plt.loglog(N, result, "ob", label="no jit")
# plt.legend()
# plt.xlabel("Number of Wavelengths Calculated")
# plt.ylabel("Execution Time")
# plt.title("Jit improvement is %d to %dX" % (np.min(improvement), np.max(improvement)))
# plt.show()
3.61 μs ± 27.9 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
6.88 μs ± 60.8 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
26.3 μs ± 167 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
106 μs ± 2.52 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
407 μs ± 4.93 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
1.61 ms ± 14.3 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)

Scattering Phase Function

[10]:
ntests = 6
m = 1.5
x = np.pi / 3

N = np.logspace(0, 3, ntests, dtype=int)
result = np.zeros(ntests)
resultj = np.zeros(ntests)

for i in range(ntests):
    theta = np.linspace(-180, 180, N[i])
    mu = np.cos(theta / 180 * np.pi)
    a = %timeit -o s1, s2 = mie.S1_S2(m,x,mu)
    result[i] = a.best

# for i in range(ntests):
#     theta = np.linspace(-180, 180, N[i])
#     mu = np.cos(theta / 180 * np.pi)
#     a = %timeit -o s1, s2 = mie.S1_S2(m,x,mu)
#     resultj[i] = a.best

# improvement = result / resultj
# plt.loglog(N, resultj, ":r")
# plt.loglog(N, result, ":b")
# plt.loglog(N, resultj, "or", label="jit")
# plt.loglog(N, result, "ob", label="no jit")
# plt.legend()
# plt.xlabel("Number of Angles Calculated")
# plt.ylabel("Execution Time")
# plt.title("Jit improvement is %d to %dX" % (np.min(improvement), np.max(improvement)))
# plt.show()
4.35 μs ± 100 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
4.12 μs ± 41.3 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
5.19 μs ± 64.3 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
8.71 μs ± 129 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
23.2 μs ± 264 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
81.7 μs ± 877 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)

And finally, as function of sphere size

[11]:
ntests = 6
m = 1.5 - 0.1j
x = np.logspace(0, 3, ntests)
result = np.zeros(ntests)
resultj = np.zeros(ntests)

theta = np.linspace(-180, 180)
mu = np.cos(theta / 180 * np.pi)

for i in range(ntests):
    a = %timeit -o s1, s2 = mie.S1_S2(m,x[i],mu)
    result[i] = a.best

# mie.use_numba(True)  # Ensure the JIT backend is used
# for i in range(ntests):
#     a = %timeit -o s1, s2 = mie_S1_S2(m,x[i],mu)
#     resultj[i] = a.best

# improvement = result / resultj
# plt.loglog(N, resultj, ":r")
# plt.loglog(N, result, ":b")
# plt.loglog(N, resultj, "or", label="jit")
# plt.loglog(N, result, "ob", label="no jit")
# plt.legend()
# plt.xlabel("Sphere Size Parameter")
# plt.ylabel("Execution Time")
# plt.title("Jit improvement is %d to %dX" % (np.min(improvement), np.max(improvement)))
# plt.show()
7.86 μs ± 100 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
12.9 μs ± 89.9 ns per loop (mean ± std. dev. of 7 runs, 100,000 loops each)
20.2 μs ± 23.5 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
42.9 μs ± 2.61 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
133 μs ± 1.75 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
504 μs ± 19 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
[ ]: