Mie Performance and Jitting

Scott Prahl

Jan 2026

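Switching between jitted and non-jitted code at runtime is too complicated when combined with numba caching. Each run must therefore be set up beforehand: the MIEPYTHON_USE_JIT environment variable is set before miepython is imported, and the notebook is rerun for each configuration.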
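
One way to keep each configuration isolated is to start a fresh Python process per run, with the flag already in its environment. The sketch below is a hypothetical helper, not part of miepython: `run_with_jit_flag` and the probe string are illustrative names, and the value "1" for enabling the JIT is an assumption (the notebook only documents "0" as the value that disables it).

```python
import os
import subprocess
import sys


def run_with_jit_flag(code, use_jit):
    """Run `code` in a fresh Python process with MIEPYTHON_USE_JIT preset.

    Hypothetical helper: "1" as the enabling value is an assumption.
    """
    env = os.environ.copy()
    env["MIEPYTHON_USE_JIT"] = "1" if use_jit else "0"
    completed = subprocess.run(
        [sys.executable, "-c", code],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout.strip()


# Stand-in for a benchmark script: it only echoes the flag it was started with.
probe = "import os; print(os.environ['MIEPYTHON_USE_JIT'])"
flag_off = run_with_jit_flag(probe, use_jit=False)
flag_on = run_with_jit_flag(probe, use_jit=True)
print(flag_off, flag_on)
```

Because the child process sees the variable before any import happens, the order problem described above does not arise. This approach does not apply under JupyterLite, where subprocesses are unavailable.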

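The results from my computer for the first test (size parameters) indicate the speedups to expect. For arrays of three or more elements, the jitted version that uses numba is about 30 to 37x faster than the non-jitted version and about 50 to 65x faster than the same code running under JupyterLite. For a single element the gains are smaller, about 18x and 26x. (Python 3.12.12, numba 0.63.1, pyodide-kernel 0.6.1)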

Array size   JIT (µs)   Non-JIT (µs)   JupyterLite (µs)
         1        1.8           32.3               46.8
         3        4.9            151                246
        15       22.2            790               1290
        63       90.9           3320               5350
       251        347          12800              21300
      1000       1360          49000              88000
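
The speedup factors follow directly from the table by dividing each column by the JIT column. The short sketch below does that arithmetic with the tabulated values; the variable names are illustrative only.

```python
# Timings copied from the table above, in microseconds.
sizes = [1, 3, 15, 63, 251, 1000]
jit_us = [1.8, 4.9, 22.2, 90.9, 347, 1360]
nonjit_us = [32.3, 151, 790, 3320, 12800, 49000]
lite_us = [46.8, 246, 1290, 5350, 21300, 88000]

# Ratio of each slower configuration to the jitted one.
speedup_vs_nonjit = [n / j for n, j in zip(nonjit_us, jit_us)]
speedup_vs_lite = [l / j for l, j in zip(lite_us, jit_us)]

for size, s_non, s_lite in zip(sizes, speedup_vs_nonjit, speedup_vs_lite):
    print(f"{size:5d}  non-JIT/JIT = {s_non:5.1f}   JupyterLite/JIT = {s_lite:5.1f}")
```

The single-element case shows the smallest ratios, and the ratios level off once the array has more than a handful of elements.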

[1]:
import os
import sys
import tempfile
import numpy as np

if sys.platform == "emscripten":
    import piplite

    await piplite.install("miepython")
    os.environ["MIEPYTHON_USE_JIT"] = "0"  # jupyterlite cannot use numba
else:
    os.environ["MIEPYTHON_USE_JIT"] = "0"  # Set to "0" to disable JIT
    os.environ["NUMBA_CACHE_DIR"] = tempfile.gettempdir()

import miepython as mie


def print_header():
    if sys.platform == "emscripten":
        print("JupyterLite results:")
    elif os.environ["MIEPYTHON_USE_JIT"] == "0":
        print("Non-jitted results:")
    else:
        print("Jitted results:")

Size Parameters

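We will use %timeit to measure speeds. Each run of the notebook times a single configuration (jitted, non-jitted, or JupyterLite), which print_header() reports before the timings.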

[2]:
ntests = 6

m = 1.5
N = np.logspace(0, 3, ntests, dtype=int)
result = np.zeros(ntests)
resultj = np.zeros(ntests)

print_header()
for i in range(ntests):
    x = np.linspace(0.1, 20, N[i])
    a = %timeit -o qext, qsca, qback, g = mie.efficiencies_mx(m,x)
    result[i] = a.best
Non-jitted results:
28.6 μs ± 504 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
142 μs ± 3.04 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
775 μs ± 16.1 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
3.22 ms ± 67.3 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
12.5 ms ± 284 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
51.9 ms ± 1.58 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
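
Dividing these timings by the array size gives a rough per-element cost. The sketch below uses the mean values printed above and the six array sizes listed in the table at the top of the notebook; it is only arithmetic on this one run, not a general claim about other machines.

```python
# Array sizes from the table at the top of the notebook.
sizes = [1, 3, 15, 63, 251, 1000]

# Mean non-jitted timings printed above, converted to microseconds.
mean_us = [28.6, 142, 775, 3220, 12500, 51900]

per_element_us = [t / n for t, n in zip(mean_us, sizes)]

for n, t in zip(sizes, per_element_us):
    print(f"{n:5d} elements: {t:5.1f} µs per element")
```

In this run the cost settles near 50 µs per element for arrays of three or more elements, so the non-jitted time grows roughly linearly with the array size.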

Embedded spheres

[3]:
ntests = 6
mwater = 4 / 3  # rough approximation
m = 1.0
mm = m / mwater
r = 500  # nm

N = np.logspace(0, 3, ntests, dtype=int)
result = np.zeros(ntests)
resultj = np.zeros(ntests)

print_header()
for i in range(ntests):
    lambda0 = np.linspace(300, 800, N[i])  # also in nm
    xx = 2 * np.pi * r * mwater / lambda0
    a = %timeit -o qext, qsca, qback, g = mie.efficiencies_mx(mm,xx)
    result[i] = a.best
62.7 μs ± 839 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
167 μs ± 2.25 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
846 μs ± 9.37 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
3.58 ms ± 70.4 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
13.9 ms ± 164 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
55.7 ms ± 1.01 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
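
The cell above converts a sphere embedded in water into the two inputs that mie.efficiencies_mx expects: the refractive index relative to the medium and the size parameter computed with the wavelength in the medium. A minimal sketch of that conversion for one wavelength follows; `embedded_sphere_inputs` is a hypothetical helper name, not a miepython function.

```python
import math


def embedded_sphere_inputs(m_sphere, radius, lambda0, n_medium):
    """Return (relative index, size parameter) for a sphere in a medium.

    radius and lambda0 (vacuum wavelength) must share the same length unit.
    """
    m_relative = m_sphere / n_medium
    x = 2 * math.pi * radius * n_medium / lambda0
    return m_relative, x


n_water = 4 / 3  # same rough value as in the cell above
m_rel, x = embedded_sphere_inputs(1.0, 500, 500, n_water)
print(f"relative index = {m_rel:.3f}, size parameter = {x:.3f}")
```

With r = 500 nm and a vacuum wavelength of 500 nm, the relative index is 0.75 and the size parameter is about 8.38.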

Testing efficiencies

mie.efficiencies is another high-level function that should be sped up by jitting.

[4]:
ntests = 6
m_sphere = 1.0
n_water = 4 / 3
d = 1000  # nm
N = np.logspace(0, 3, ntests, dtype=int)
result = np.zeros(ntests)
resultj = np.zeros(ntests)

print_header()
for i in range(ntests):
    lambda0 = np.linspace(300, 800, N[i])  # also in nm
    a = %timeit -o qext, qsca, qback, g = mie.efficiencies(m_sphere, d, lambda0, n_water)
    result[i] = a.best
66.4 μs ± 470 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
175 μs ± 1.52 μs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
851 μs ± 13.1 μs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)
3.48 ms ± 24.1 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
13.8 ms ± 120 μs per loop (mean ± std. dev. of 7 runs, 100 loops each)
55.1 ms ± 1.3 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)

Scattering Phase Function

[ ]:
ntests = 6
m = 1.5
x = np.pi / 3

N = np.logspace(0, 3, ntests, dtype=int)
result = np.zeros(ntests)
resultj = np.zeros(ntests)

print_header()
for i in range(ntests):
    theta = np.linspace(-180, 180, N[i])
    mu = np.cos(theta / 180 * np.pi)
    a = %timeit -o s1, s2 = mie.S1_S2(m,x,mu)
    result[i] = a.best
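
The loop above passes mu, the cosine of the scattering angle, rather than the angle itself. The sketch below reproduces the degree-to-cosine conversion in plain Python for a small grid; `mu_grid` is an illustrative name and mimics np.linspace only for this purpose.

```python
import math


def mu_grid(n_points, start_deg=-180.0, stop_deg=180.0):
    """Cosines of n_points angles evenly spaced from start_deg to stop_deg."""
    if n_points == 1:
        # Like np.linspace, a single point sits at the start of the interval.
        return [math.cos(math.radians(start_deg))]
    step = (stop_deg - start_deg) / (n_points - 1)
    return [math.cos(math.radians(start_deg + k * step)) for k in range(n_points)]


mu = mu_grid(5)  # angles -180, -90, 0, 90, 180 degrees
print([round(value, 3) for value in mu])
```

Every value lies in [-1, 1], and because the cosine is even, the grid from -180 to 180 degrees visits each value of mu twice. With a single point, the only angle evaluated is -180 degrees, that is mu = -1.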

And finally, as a function of sphere size

[ ]:
ntests = 6
m = 1.5 - 0.1j
x = np.logspace(0, 3, ntests)
result = np.zeros(ntests)
resultj = np.zeros(ntests)

theta = np.linspace(-180, 180)
mu = np.cos(theta / 180 * np.pi)

print_header()
for i in range(ntests):
    a = %timeit -o s1, s2 = mie.S1_S2(m,x[i],mu)
    result[i] = a.best
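
The %timeit magic is only available in IPython and Jupyter. Outside a notebook, a comparable best-of-seven measurement can be sketched with the standard timeit module. This is an approximation under stated assumptions: `best_time_per_call` and `stand_in_workload` are hypothetical names, the workload is a placeholder for a call such as mie.S1_S2(m, x, mu), and the loop count is fixed here whereas %timeit chooses it automatically.

```python
import timeit


def best_time_per_call(func, repeat=7, number=1000):
    """Return the smallest per-call time in seconds over `repeat` runs."""
    timer = timeit.Timer(func)
    run_totals = timer.repeat(repeat=repeat, number=number)
    return min(run_totals) / number


def stand_in_workload():
    # Placeholder for the function being benchmarked.
    return sum(k * k for k in range(100))


best = best_time_per_call(stand_in_workload)
print(f"best: {best * 1e6:.2f} µs per call")
```

Taking the minimum over the repeated runs plays the same role as the a.best value stored in result[i] in the cells above.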