Mie Performance and Jitting

Scott Prahl

Jan 2026

Switching between jitted and non-jitted code at runtime is too complicated when combined with numba caching, so each run must be configured before miepython is imported.
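One way to keep the two configurations apart is to launch each benchmark in its own interpreter with the environment variable already set. The following is a minimal sketch, not part of miepython: the helper name `run_with_jit_flag` is hypothetical, and the value "1" for enabling the JIT is an assumption (this notebook only shows "0" for disabling it). The snippet that is run here just echoes the variable; a real benchmark would import miepython instead.

```python
import os
import subprocess
import sys


def run_with_jit_flag(code, use_jit):
    """Run `code` in a fresh interpreter with MIEPYTHON_USE_JIT preset.

    Hypothetical helper; "1" as the enabling value is an assumption.
    """
    env = dict(os.environ)
    env["MIEPYTHON_USE_JIT"] = "1" if use_jit else "0"
    done = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return done.stdout.strip()


# Placeholder snippet; a real benchmark would import miepython here.
snippet = "import os; print(os.environ['MIEPYTHON_USE_JIT'])"
jit_on = run_with_jit_flag(snippet, use_jit=True)
jit_off = run_with_jit_flag(snippet, use_jit=False)
print(jit_on, jit_off)
```

Because each call starts a new process, the setting is fixed before any import happens and the two runs cannot interfere with each other.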

The results from my computer for the first test are indicative of the speedups. For arrays, the jitted version that uses numba is roughly 30 to 37 times faster than the non-jitted version and 50 to 65 times faster than the same code running under JupyterLite; for a single value the gains are smaller (about 18x and 26x). (Python 3.12.12, numba 0.63.1, pyodide-kernel 0.6.1)

Array size	JIT (µs)	Non-JIT (µs)	JupyterLite (µs)
1	1.8	32.3	46.8
3	4.9	151	246
15	22.2	790	1290
63	90.9	3320	5350
251	347	12800	21300
1000	1360	49000	88000
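The speedup at each array size follows directly from the table. This short sketch copies the timings above into arrays and divides them; the variable names are chosen here for illustration only.

```python
import numpy as np

# Timings copied from the table above, in microseconds.
sizes = np.array([1, 3, 15, 63, 251, 1000])
jit_us = np.array([1.8, 4.9, 22.2, 90.9, 347, 1360])
nonjit_us = np.array([32.3, 151, 790, 3320, 12800, 49000])
lite_us = np.array([46.8, 246, 1290, 5350, 21300, 88000])

# How many times faster the jitted code is than each alternative.
speedup_vs_nonjit = nonjit_us / jit_us
speedup_vs_lite = lite_us / jit_us

for n, a, b in zip(sizes, speedup_vs_nonjit, speedup_vs_lite):
    print(f"{n:5d}  {a:5.1f}x  {b:5.1f}x")
```

The ratios grow with array size and then level off, which is consistent with a fixed per-call overhead that matters most for very small inputs.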

[1]:

import os
import sys
import tempfile
import numpy as np

if sys.platform == "emscripten":
    import piplite

    await piplite.install("miepython")
    os.environ["MIEPYTHON_USE_JIT"] = "0"  # jupyterlite cannot use numba
else:
    os.environ["MIEPYTHON_USE_JIT"] = "0"  # Set to "0" to disable JIT
    os.environ["NUMBA_CACHE_DIR"] = tempfile.gettempdir()

import miepython as mie


def print_header():
    if sys.platform == "emscripten":
        print("JupyterLite results:")
    elif os.environ["MIEPYTHON_USE_JIT"] == "0":
        print("Non-jitted results:")
    else:
        print("Jitted results:")

Size Parameters

We will use %timeit to measure speeds for the unjitted code, then for the jitted code.
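`%timeit` is an IPython magic, so it is not available in a plain Python script. A rough equivalent can be built from the standard `timeit` module, as sketched below. The helper `best_time` is hypothetical, and `np.sin` is only a stand-in workload; a real comparison would call the miepython function being timed.

```python
import timeit

import numpy as np


def best_time(func, repeat=7):
    """Return the fastest per-call time in seconds, similar to `a.best`."""
    timer = timeit.Timer(func)
    loops, _ = timer.autorange()  # loop count whose total time is >= 0.2 s
    totals = timer.repeat(repeat=repeat, number=loops)
    return min(totals) / loops


x = np.linspace(0.1, 20, 1000)
t = best_time(lambda: np.sin(x))  # np.sin is a stand-in workload
print(f"{t * 1e6:.2f} µs per call")
```

Taking the minimum over the repeats mirrors the `best` attribute of the object returned by `%timeit -o`, which is what the cells below store in `result`.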

[2]:

ntests = 6

m = 1.5
N = np.logspace(0, 3, ntests, dtype=int)
result = np.zeros(ntests)
resultj = np.zeros(ntests)

print_header()
for i in range(ntests):
    x = np.linspace(0.1, 20, N[i])
    a = %timeit -o qext, qsca, qback, g = mie.efficiencies_mx(m,x)
    result[i] = a.best

Non-jitted results:
28.6 μs ± 504 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
142 μs ± 3.04 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
775 μs ± 16.1 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
3.22 ms ± 67.3 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
12.5 ms ± 284 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
51.9 ms ± 1.58 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

Embedded spheres

[3]:

ntests = 6
mwater = 4 / 3  # rough approximation
m = 1.0
mm = m / mwater
r = 500  # nm

N = np.logspace(0, 3, ntests, dtype=int)
result = np.zeros(ntests)
resultj = np.zeros(ntests)

print_header()
for i in range(ntests):
    lambda0 = np.linspace(300, 800, N[i])  # also in nm
    xx = 2 * np.pi * r * mwater / lambda0
    a = %timeit -o qext, qsca, qback, g = mie.efficiencies_mx(mm,xx)
    result[i] = a.best

62.7 μs ± 839 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
167 μs ± 2.25 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
846 μs ± 9.37 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
3.58 ms ± 70.4 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
13.9 ms ± 164 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
55.7 ms ± 1.01 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
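For a sphere embedded in a medium, the cell above divides the sphere index by the medium index and evaluates the size parameter with the wavelength inside the medium, x = 2πr·n_medium/λ0. A small worked example with three wavelengths makes the numbers concrete; the names `n_sphere`, `n_water`, and `m_rel` are chosen here for readability and correspond to `m`, `mwater`, and `mm` above.

```python
import numpy as np

n_water = 4 / 3   # rough index of water, as in the cell above
n_sphere = 1.0    # sphere index, as in the cell above
r = 500           # sphere radius in nm
lambda0 = np.array([300.0, 500.0, 800.0])  # vacuum wavelengths in nm

m_rel = n_sphere / n_water             # relative refractive index
x = 2 * np.pi * r * n_water / lambda0  # size parameter in the medium

print(m_rel)
print(x)
```

At 500 nm the wavelength equals the radius, so the size parameter reduces to 2π·(4/3), about 8.38. Shorter wavelengths give larger size parameters.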

Testing `efficiencies`

`efficiencies` is another high-level function that should be sped up by jitting.

[4]:

ntests = 6
m_sphere = 1.0
n_water = 4 / 3
d = 1000  # nm
N = np.logspace(0, 3, ntests, dtype=int)
result = np.zeros(ntests)
resultj = np.zeros(ntests)

print_header()
for i in range(ntests):
    lambda0 = np.linspace(300, 800, N[i])  # also in nm
    a = %timeit -o qext, qsca, qback, g = mie.efficiencies(m_sphere, d, lambda0, n_water)
    result[i] = a.best

66.4 μs ± 470 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
175 μs ± 1.52 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
851 μs ± 13.1 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
3.48 ms ± 24.1 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
13.8 ms ± 120 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
55.1 ms ± 1.3 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

Scattering Phase Function

[ ]:

ntests = 6
m = 1.5
x = np.pi / 3

N = np.logspace(0, 3, ntests, dtype=int)
result = np.zeros(ntests)
resultj = np.zeros(ntests)

print_header()
for i in range(ntests):
    theta = np.linspace(-180, 180, N[i])
    mu = np.cos(theta / 180 * np.pi)
    a = %timeit -o s1, s2 = mie.S1_S2(m,x,mu)
    result[i] = a.best

And finally, as a function of sphere size.

[ ]:

ntests = 6
m = 1.5 - 0.1j
x = np.logspace(0, 3, ntests)
result = np.zeros(ntests)
resultj = np.zeros(ntests)

theta = np.linspace(-180, 180)
mu = np.cos(theta / 180 * np.pi)

print_header()
for i in range(ntests):
    a = %timeit -o s1, s2 = mie.S1_S2(m,x[i],mu)
    result[i] = a.best