Developer Guide: Vectorized Batch Fit Engine (VBF)¶
This document provides a deep dive into the inner workings, architecture, and performance characteristics of the Vectorized Batch Fit Engine (VBF) located in spectroview/fit_engine/.
1. Why the VBF Engine Is Much Faster¶
The legacy fit engines operate on a per-spectrum basis. For a hyperspectral map containing thousands of spectra, this approach introduces significant overhead:
- Python Function Call Overhead: Calling the objective function and Jacobian estimator thousands of times per iteration.
- Finite-Difference Jacobians: Approximating the Jacobian numerically requires 2 * K (where K is the number of parameters) additional function evaluations per iteration, per spectrum.
- Sequential Execution: Even with multiprocessing, the overhead of serialization and inter-process communication creates bottlenecks.
The Vectorized Batch Fit Engine (VBF Engine) achieves massive speedups (~10× to 15× faster) through the following core principles:
- All-at-Once Optimization: It optimizes all \(N\) spectra simultaneously. The parameter matrices, data arrays, and residuals are manipulated as large 2D or 3D tensors.
- Vectorized Operations (BLAS/LAPACK): By framing the problem as tensors, the heavy lifting is offloaded to highly optimized C/Fortran libraries.
- Matrix multiplications and transpositions for the normal equations (\(J^T J\) and \(J^T r\)) are performed using
np.einsum. - The linear systems for all spectra are solved in a single call to
np.linalg.solve, which dispatches to LAPACK.
- Matrix multiplications and transpositions for the normal equations (\(J^T J\) and \(J^T r\)) are performed using
- Analytical Jacobians: Instead of estimating derivatives numerically, the engine uses exact analytical formulas for common peak shapes (
Lorentzian,Gaussian,PseudoVoigt). This eliminates the \(2K\) extra evaluations entirely. - No Spatial Propagation: Unlike older map-fitting approaches that used spiral traversal to propagate guesses from neighbor to neighbor (forcing sequential execution), the
VBF Engineinitializes all pixels independently using amplitude scaling, allowing purely parallel tensor math. - Variable Length Support: Spectra with different lengths (e.g., differently cropped) are padded with zeros to fit into a uniform 2D tensor, while a boolean
weightsmask ensures padded regions are ignored during optimization. - Expression Support: Supports complex mathematical relationships between parameters across the batch by evaluating mathematical constraints symbolically before mapping to free parameters.
2. Performance Optimizations¶
Recent optimizations have achieved a 2.2× to 5× speedup across various datasets:
- Template Model Application (11.9× speedup):
_prepare_fit_model_template()pre-computes Python Models and bounds sanitization once. Per-spectrum application only clones the lightweightparam_hintsdictionary. - Batch Preprocessing (3.0× speedup):
_batch_preprocess()computes range masks and baseline indices once for all spectra sharing the same x-axis, using vectorized subtraction instead of per-spectrum calls. - Adaptive Batched Solver (up to 5× speedup): The solver dynamically chooses between NumPy's batched
np.linalg.solve(best for large N) and SciPy'scho_solve(Cholesky decomposition, best for small N / large K, exploiting symmetric positive-definite normal equations). - Mean-Based Convergence Criteria: The convergence check uses
mean(|Δp| / |p|) < xtolrather thanmax(), preventing a single slowly converging parameter in multi-peak models from stalling the entire batch. - Vectorized Write-Back (3.5× speedup):
build_results_batch()evaluates all best-fit curves and computes R² simultaneously across all spectra using a single_to_full()expansion. - Zero-Weight Early Exit: Spectra identified as purely noise (all weights zero) are marked as converged instantly, saving all Levenberg-Marquardt iterations.
3. Code Logic and Core Implementation Principles¶
The engine implements a Batched Levenberg-Marquardt algorithm.
3.1. The Mathematics of Batched LM¶
For \(N\) spectra, each with \(M\) wavelength points and \(K\) free parameters:
- Evaluate Model: \(\mathbf{Y}_{pred} = f(\mathbf{x}, \mathbf{p})\), returning an \((N, M)\) tensor.
- Calculate Residuals: \(\mathbf{r} = \mathbf{W} \circ (\mathbf{Y}_{pred} - \mathbf{Y}_{data})\), returning an \((N, M)\) tensor.
- Calculate Jacobian: \(\mathbf{J} = \frac{\partial f}{\partial \mathbf{p}}\), returning an \((N, M, K)\) tensor.
- Normal Equations: Assemble \(J^T J\) (size \(N \times K \times K\)) and \(J^T r\) (size \(N \times K\)).
- Damping (Marquardt step): Add a damping factor \(\lambda_i\) to the diagonal of \(J^T J\) for each spectrum \(i\).
- Solve: Solve \((J^T J + \lambda \text{diag}(J^T J)) \delta \mathbf{p} = -J^T r\) for all \(N\) spectra simultaneously.
- Evaluate Step: Update \(\mathbf{p} \leftarrow \mathbf{p} + \delta \mathbf{p}\) (with projection to bounds) and evaluate the new cost. Adjust \(\lambda\) per spectrum based on success/failure.
3.2. Independent Convergence¶
Even though the math is batched, each spectrum converges independently. The optimizer uses a boolean mask (active = ~converged) to skip Jacobian calculations and linear solves for spectra that have already reached the tolerance limits, progressively speeding up the later iterations.
4. Folder and Class Structure¶
graph TD
TFT["VBFthread"] -->|"instantiates"| TE["VBFengine"]
TE -->|"builds"| EV["VBFevaluator"]
TE -->|"calls"| OPT["batched_LM()"]
EV -->|"routes to"| MOD["models.py"]
OPT -->|"evaluate / jacobian"| EV
| Module | Class / Function | Responsibility |
|---|---|---|
vbf_thread.py |
VBFthread |
QThread wrapper. Supports batched modes. Emits progress_changed and timings_ready signals. Sets 8 MB stack on macOS to prevent LAPACK segfaults. |
vbf_engine.py |
VBFengine |
Public API orchestrator. Manages the 8-step pipeline: apply model → preprocess → extract matrices → build p0 → build weights → optimize → write back results. Records step-level timings. |
evaluator.py |
VBFevaluator |
Bridge between the dictionary-based fit_model and the flat tensor API. Parses peak definitions, manages free/fixed parameter indexing, evaluates expressions, routes to correct batched model functions, builds FitResult objects. |
optimizer.py |
batched_levenberg_marquardt() |
Pure numerical optimizer. Solves N independent least-squares problems simultaneously using np.einsum. Uses an adaptive solver (cho_solve or np.linalg.solve) depending on matrix size. GUI-agnostic. |
models.py |
batched_*() functions |
Vectorized peak shape functions and their analytical Jacobians. Contains the BATCHED_MODELS registry. Includes numerical_jacobian() fallback. |
scalar_models.py |
FitResult, ParamValue, scalar functions |
Lightweight result classes compatible with legacy interfaces. Scalar peak functions used as fallbacks when no batched implementation exists. Contains PEAK_MODEL_REGISTRY. |
5. Processing Pipeline / Execution Flow¶
When a user triggers a fit (e.g., via the "Fit" button), the following pipeline executes:
sequenceDiagram
participant VM as ViewModel
participant TFT as FitThread
participant TE as Engine
participant EV as Evaluator
VM->>TFT: start()
TFT->>TE: fit_spectra()
TE->>EV: from_fit_model()
TE->>TE: preprocess + extract matrices
TE->>TE: build p0 + weights
TE->>TE: batched_LM()
TE->>EV: write_back_to_spectrum()
TE-->>TFT: fit_results
TFT-->>VM: finished + timings
5.1. Step-by-Step Details¶
Step 1 — Model Application (apply_model_to_spectra=True):
Instead of a slow per-spectrum deepcopy, _prepare_fit_model_template() creates a highly optimized template. It builds Model objects and sanitizes parameter bounds exactly once. Then, _apply_template_to_spectrum() merely copies the lightweight param_hints dictionary. When False (Spectra workspace with per-spectrum models), spectra are grouped by model signature and processed as separate batches.
Step 2 — Evaluator Construction:
VBFevaluator.from_fit_model() parses the model. For each peak:
- Looks up the model name in BATCHED_MODELS (fast path) or PEAK_MODEL_REGISTRY (scalar fallback).
- Assigns a sequential prefix (m01_, m02_, ...) and extracts param_hints.
- Builds _free_idx / _fixed_idx mappings between optimized free parameters and the full parameter set.
- Parses expression strings (e.g., m01_fwhm = m02_fwhm) and maps dependencies.
Step 3 — Preprocessing:
_batch_preprocess() uses a batched strategy. For spectra sharing the same x-axis (e.g., hyperspectral maps), it computes the range crop mask and evaluates the baseline indexing once. This mask and baseline are then rapidly applied via NumPy array slicing and subtraction to all spectra. For complex modes (like arPLS) or variable lengths, it falls back to per-spectrum spectrum.preprocess().
Step 4 — Data Extraction:
Spectra with different lengths are zero-padded to max_M (the longest spectrum). The weights matrix is built by _build_fit_weights(), which handles:
| Weight Rule | Condition | Effect |
|---|---|---|
| Negative masking | fit_negative=False |
w[y < 0] = 0 |
| Outlier masking | fit_outliers=False |
w[outlier_positions] = 0 |
| Noise threshold | coef_noise > 0 |
w[smoothed_y < noise_level] = 0 |
| Zero-padding | Padded region | w[M_s:] = 0 (automatic) |
Step 5 — Initial Parameters (p0):
- First fit (
apply_model_to_spectra=True):build_p0_matrix()tiles the model's initial values and scales each spectrum's amplitudes proportionally to the actual data intensity at the peak center position. The ratio is clamped to[0.01, 100]to prevent extreme initial guesses. - Re-fit (
apply_model_to_spectra=False):extract_p0_from_spectrum()reads the existing fittedparam_hintsvalues from each spectrum (warm start).
After p0 construction, apply_noise_threshold() zeros out amplitude and FWHM for peaks located in regions below the noise floor.
Step 6 — Optimization: See Section 3 for the LM algorithm details.
Step 7 — Result Writeback:
evaluator.build_results_batch() uses a vectorized approach. It performs a single _to_full() expansion for the entire \(N \times K\) parameter matrix, calls evaluate() once to generate all best-fit curves, and calculates \(R^2\) via array operations. The data is then packaged into FitResult objects and assigned to spectrum.result_fit.
6. The VBFevaluator in Detail¶
The VBFevaluator is the most complex class in the engine. It serves as the bridge between the flexible, dictionary-based world of the GUI and the rigid, flat-tensor world of the optimizer.
6.1. Parameter Space Mapping¶
fit_model dict VBFevaluator Optimizer
┌─────────────────┐ ┌────────────────────────────┐ ┌──────────────┐
│ peak_models: │ │ _param_names: │ │ │
│ "0": │ │ ["m01_ampli", │ │ p_free │
│ Gaussian: │ ──► │ "m01_fwhm", │ ──► │ (N, K_free) │
│ ampli: .. │ │ "m01_x0", │ │ │
│ fwhm: .. │ │ "m02_ampli", ...] │ │ │
│ x0: .. │ │ │ │ │
│ "1": │ │ _free_idx: [0, 1, 2, 3] │ │ │
│ Lorentzian: │ │ _fixed_idx: [4] │ │ │
│ ampli: .. │ │ _fixed_values: [0.5] │ │ │
│ ... │ │ │ │ │
└─────────────────┘ └────────────────────────────┘ └──────────────┘
6.2. Expression Support¶
Parameters can reference other parameters via mathematical expressions (e.g., m01_fwhm = m02_fwhm or m01_x0 + 10). The evaluator handles this in _to_full():
- Parameters with expressions are marked as fixed (not optimized directly).
- During
_to_full(), expressions are evaluated using Python'seval()with a restricted namespace containing all parameter names,np, and common math functions. - A multi-pass resolution loop handles expression chains (e.g.,
a = b,b = c) by retrying failed evaluations until all dependencies are resolved. - The Jacobian accounts for expressions via the chain rule: a numerical
J_exprmatrix is computed by perturbing each free parameter and observing how the full parameter vector changes, then the true Jacobian isJ_full @ J_expr.
6.3. Model Routing¶
The evaluator's evaluate() and jacobian() methods iterate over all registered peaks and sum their contributions:
def evaluate(self, x, p_free):
p_full = self._to_full(p_free) # (N, K_total)
Y = np.zeros((N, M))
for model_name, slc, eval_fn, jac_fn, has_jac in self._peaks:
Y += eval_fn(x, p_full[:, slc]) # Each peak adds its contribution
return Y
For the Jacobian, if a peak has an analytical Jacobian (has_jac=True), it is used directly. Otherwise, numerical_jacobian() is called as a fallback with central differences and relative perturbation scaling.
7. Batched Peak Models and Analytical Jacobians¶
7.1. Tensor Conventions¶
All batched functions follow the same signature:
def batched_shape(x, params):
"""
x: (M,) shared axis OR (N, M) per-spectrum axis
params: (N, n_p) parameter matrix
Returns: (N, M) predicted values
"""
def batched_shape_jac(x, params):
"""Returns: (N, M, n_p) Jacobian tensor"""
7.2. Registered Models (BATCHED_MODELS)¶
All peak models have vectorized batched implementations with analytical Jacobians, ensuring maximum performance for every model type:
| Model | Parameters | Formula |
|---|---|---|
Gaussian |
ampli, fwhm, x0 |
a · exp(-4 ln(2) · (x-x0)2 / w2) |
Lorentzian |
ampli, fwhm, x0 |
a / [1 + 4(x-x0)2 / w2] |
PseudoVoigt |
ampli, fwhm, x0, alpha |
α · G + (1-α) · L |
GaussianAsym |
ampli, fwhm_l, fwhm_r, x0 |
Piecewise Gaussian with left/right FWHM |
LorentzianAsym |
ampli, fwhm_l, fwhm_r, x0 |
Piecewise Lorentzian with left/right FWHM |
Fano |
ampli, fwhm, x0, q |
a · (q + ε)2 / (1 + ε2), ε = 2(x-x0)/w |
DecaySingleExp |
A, tau, B |
A · e-x/τ + B |
DecayBiExp |
A1, tau1, A2, tau2, B |
A1 · e-x/τ1 + A2 · e-x/τ2 + B |
7.3. Numerical Jacobian Fallback¶
For future custom models registered only in PEAK_MODEL_REGISTRY (without a batched implementation), numerical_jacobian() uses central differences with relative perturbation as a fallback:
h = max(|param| * eps, eps) # Scale step to parameter magnitude
J[:,:,k] = (f(p+h) - f(p-h)) / 2h # Central difference
This is ~2K times slower than analytical Jacobians per iteration but ensures correctness for any model shape.
8. The VBFthread¶
8.1. Two Operating Modes¶
graph TD
TFT["VBFthread"] --> Check{"batches?"}
Check -->|"Yes"| B["_run_batched()"]
Check -->|"No"| S["_run_single()"]
B --> G1["Group 1"]
B --> G2["Group 2"]
B --> G3["Group 3"]
S --> All["All spectra"]
Single-model mode (Maps workspace / Apply Fit Model):
All spectra share one fit_model. The engine processes them in one tensor batch.
Batched mode (Spectra workspace / individual models):
When spectra have different peak configurations, the ViewModel groups them by model signature (same number and types of peaks). Each group is processed sequentially through the engine, but spectra within each group are optimized in parallel.
8.2. macOS Stack Size¶
The thread sets an 8 MB stack size on macOS (setStackSize(8 * 1024 * 1024)) because:
- macOS defaults
QThreadstack to 512 KB. np.linalg.solvedispatches to LAPACK, which allocates workspace arrays on the stack.- For large K (many parameters), the stack allocation can exceed 512 KB, causing segfaults.
8.3. Signals¶
| Signal | Payload | Purpose |
|---|---|---|
progress_changed |
(current, total, percent, elapsed) |
Updates progress bar in the View |
timings_ready |
str |
Formatted timing breakdown for console/debug |
9. Optimization Parameters and Adjustments¶
The engine behavior can be tuned via the fit_params dictionary passed to fit_spectra().
9.1. Key Parameters¶
max_ite(default: 200): The maximum number of Levenberg-Marquardt iterations. Increasing this might help extremely difficult spectra converge but will increase total execution time.xtol(default: 1e-4): The relative tolerance for the parameter step size \(\delta p\). Convergence is reached when the mean relative change across all parameters (\(\operatorname{mean}(|\delta p| / |p|)\)) is less thanxtol. This mean-based criterion ensures that a single slowly oscillating parameter does not artificially delay convergence for the entire spectrum.ftol(default: 1e-4): The relative tolerance for the cost function (sum of squared residuals). If the relative change in the cost is less thanftol, the spectrum is considered converged.fit_negative(default:False): Whether to include negative intensity values in the fit. WhenFalse, negative points receive zero weight.fit_outliers(default:False): Whether to include statistical outlier points. WhenFalse, outliers detected byspectrum.calculate_outliers()receive zero weight.coef_noise(default: 1): Noise coefficient multiplier. When > 0, activates noise-based weight masking and peak suppression. See Section 10 for a detailed explanation.
9.2. Tuning for Performance vs. Accuracy¶
- Fast Mapping: For rapid previews, you can increase
xtolandftolto1e-3or1e-2. The optimizer will exit much earlier, providing a rough fit in a fraction of the time. - Precision Fitting: For publication-quality results, decrease
xtolandftolto1e-5or1e-6. - Handling "Stuck" Spectra: The optimizer tracks
consecutive_rejects. If a spectrum's cost fails to improve for 15 consecutive iterations (despite damping adjustments), it is marked as converged (stuck) to prevent it from holding back the rest of the batch. This threshold (MAX_REJECTSinoptimizer.py) can be adjusted if needed.
9.3. Damping Schedule¶
The per-spectrum damping factor \(\lambda\) is initialized at 1e-2 and adjusted after each iteration:
| Outcome | λ Adjustment | Effect |
|---|---|---|
| Cost improved | λ /= 3.0 (floor 1e-10) |
Trust the Gauss-Newton direction more |
| Cost worsened | λ *= 2.5 (ceiling 1e10) |
Shift toward gradient descent |
10. Noise Level Estimation and Noise Threshold¶
The coef_noise parameter controls a noise-aware filtering system that can significantly improve both the performance and precision of fitting, especially on hyperspectral maps where many pixels may contain weak or absent peaks.
10.1. How Noise Level Is Estimated (detect_noise_level)¶
The noise level is estimated by the native detect_noise_level() function. It quantifies the high-frequency oscillation amplitude of the spectrum — i.e., the typical point-to-point noise — while being robust to real spectral features.
def detect_noise_level(y):
delta = np.diff(y) # δ[i] = y[i+1] − y[i]
delta1, delta2 = delta[:-1], delta[1:] # adjacent difference pairs
mask = np.sign(delta1) * np.sign(delta2) == -1 # sign-alternating (zigzag)
ampli_noise = np.median(np.abs(delta1[mask] - delta2[mask]) / 2)
return ampli_noise
Step-by-step logic:
| Step | Operation | Purpose |
|---|---|---|
| 1 | np.diff(y) |
Computes consecutive differences \(\delta_i = y_{i+1} - y_i\) |
| 2 | Pair adjacent deltas | Creates overlapping pairs \((\delta_i, \delta_{i+1})\) |
| 3 | Sign-alternation mask | Selects only zigzag points where the signal goes up-then-down or down-then-up on consecutive steps — the signature of random noise |
| 4 | Half peak-to-peak | For each zigzag, \(\|\delta_i - \delta_{i+1}\| / 2\) measures the half-amplitude of the oscillation |
| 5 | Median | Takes the median over all zigzag points, making the estimate robust to outliers and real peaks |
Why this works: Real spectral peaks are broad features that create sustained positive or negative runs in the difference vector. They are excluded by the sign-alternation mask (step 3). Only rapid up-down oscillations characteristic of noise pass through, so the estimate cleanly separates noise from signal even if 30–50% of the spectrum contains strong peaks.
10.2. How coef_noise Activates Noise Thresholding¶
When coef_noise > 0, the engine computes a noise level threshold:
This threshold activates two complementary mechanisms:
10.2.1. Mechanism A — Weight Masking (_build_fit_weights)¶
During weight matrix construction in vbf_engine.py, a 5-point moving average smooths the spectrum, and any data point where the smoothed signal falls below the noise level is excluded from the fit by setting its weight to zero:
ymean = np.convolve(y, np.ones(5) / 5.0, mode='same') # 5-point moving average
noise_level = coef_noise * ampli_noise
w[ymean < noise_level] = 0.0 # zero weight → ignored by optimizer
The optimizer's residual calculation \(\mathbf{r} = \mathbf{W} \circ (\mathbf{Y}_{pred} - \mathbf{Y}_{data})\) naturally ignores these masked points, so the fit focuses only on regions with meaningful signal.
10.2.2. Mechanism B — Peak Suppression (apply_noise_threshold)¶
In the VBFevaluator, any peak whose center position (x0) falls in a below-threshold region has its amplitude and shape parameters suppressed, and its positional parameters restored to the initial guess:
for each peak:
x0_val = peak center position
if ymean[at x0] < noise_level:
ampli = 0.0 # force amplitude to zero
fwhm = 0.0 # force width to zero
# Additionally, x0 and other shape parameters are restored
# to p0_matrix (initial guesses) to prevent random fluctuation mapping
This runs twice during the pipeline:
- Before optimization (Step 5 in the pipeline) — sets a clean initial guess, preventing the optimizer from trying to fit noise fluctuations as peaks.
- After optimization (Step 7) — cleans up any peaks that may have drifted into noise regions during the LM iterations.
10.3. Performance and Precision Benefits¶
10.3.1. Performance¶
| Aspect | How It Helps |
|---|---|
| Faster convergence | Zeroed-out peaks in noise regions start and stay at zero, effectively reducing the active parameter count |
| Fewer wasted iterations | Without noise masking, the optimizer spends many iterations fitting random noise with tiny ghost peaks |
| Better initial guess | Pre-zeroing noise-region peaks brings \(p_0\) closer to the true solution, so LM converges in fewer iterations |
10.3.2. Precision¶
| Aspect | How It Helps |
|---|---|
| Eliminates ghost peaks | In maps, some pixels have weak or absent peaks. Without thresholding, noise fluctuations are fitted as tiny false peaks, producing artifacts in parameter maps |
| Stabilizes correlated parameters | Near the noise floor, amplitude, width, and position become highly correlated — small noise perturbations cause large parameter swings. Suppression avoids this instability |
| Prevents cross-talk | A ghost peak in a noise region can "steal" intensity from a real neighboring peak, biasing its fitted amplitude. Suppression prevents this |
| Cleaner R² values | Masked noise regions do not contribute to the R² calculation, so the reported goodness-of-fit reflects only meaningful signal regions |
10.4. Practical Example¶
Consider a Raman map with 3 defined peaks, where for some pixels peak #2 sits in a flat baseline region:
Without coef_noise (= 0):
Peak #2 → ampli = 0.3 (noise artifact), fwhm = 2.1 (meaningless), x0 = 520.3 (drifted)
Peak #1 → ampli = 18.2 (slightly biased — peak #2 stealing intensity)
With coef_noise = 1:
Peak #2 → ampli = 0.0, fwhm = 0.0 (correctly suppressed)
Peak #1 → ampli = 19.1 (accurate, no cross-talk)
10.5. Choosing the Right coef_noise Value¶
| Value | Behavior |
|---|---|
0 |
Disabled — all peaks are fitted everywhere, no noise masking |
0.5 – 1.0 |
Conservative — only suppresses peaks/regions well below the noise floor |
1.0 – 2.0 |
Moderate — good default for most datasets |
3.0 – 5.0 |
Aggressive — may suppress real weak peaks; use with caution |
> 5.0 |
Very aggressive — only strong, unambiguous peaks survive |
The default value in spectroview/__init__.py is 1. The Settings UI exposes it as a spin box with range [0, 100] and step 0.5.
11. Adding New Peak Models¶
11.1. Fast Path: Analytical Jacobian (Recommended)¶
To add a new peak shape with maximum performance:
Step 1 — Define the batched evaluation function in models.py:
def batched_newshape(x, params):
"""
x: (M,) or (N, M)
params: (N, n_p) where columns are [param1, param2, ...]
Returns: (N, M)
"""
p1 = params[:, 0:1] # (N, 1) — broadcasts over M
p2 = params[:, 1:2]
if x.ndim == 1:
dx = x[None, :] - p2 # (N, M)
else:
dx = x - p2
return p1 * some_function(dx)
Step 2 — Derive and define the analytical Jacobian:
def batched_newshape_jac(x, params):
"""Returns: (N, M, n_p) — partial derivatives w.r.t. each parameter."""
# ... compute intermediates ...
N, M = result.shape
J = np.empty((N, M, n_p))
J[:, :, 0] = d_result_d_p1
J[:, :, 1] = d_result_d_p2
return J
Step 3 — Register in the BATCHED_MODELS dictionary:
BATCHED_MODELS = {
# ... existing ...
"NewShape": (batched_newshape, batched_newshape_jac, ["param1", "param2", ...]),
}
11.2. Slow Path: Scalar Fallback¶
If deriving an analytical Jacobian is impractical:
Step 1 — Define a scalar function in scalar_models.py:
def newshape(x, param1, param2):
"""x is a 1D array, params are scalars. Returns 1D array."""
return param1 * some_function(x - param2)
Step 2 — Register in PEAK_MODEL_REGISTRY:
The evaluator will automatically wrap it via _make_batched_scalar() and use numerical_jacobian(). This is functional but significantly slower (~10–50× per peak per iteration).
11.3. Registration in the Application¶
After adding the model to the engine, register it in spectroview/__init__.py:
PEAK_MODELS = [
"Gaussian", "Lorentzian", "PseudoVoigt",
# ... existing ...
"NewShape", # Add here to appear in the UI dropdown
]
12. Timing and Diagnostics¶
The VBFengine records wall-clock timings for each step in self.timings:
Step 1 - apply_model: 0.012s
Step 2 - preprocess: 0.045s
Step 3 - build p0: 0.003s
Step 4 - batch fit: 1.234s (0.6 ms/spectrum, 1950/2000 converged)
Step 5 - write_back: 0.089s
These timings are emitted via VBFthread.timings_ready and printed to the console. They are invaluable for diagnosing performance bottlenecks:
- If Step 1 dominates → too many spectra to apply the model to; consider caching.
- If Step 4 dominates → normal; this is the actual optimization.
- If Step 4 shows low convergence → check initial guesses, bounds, or model suitability.
- If Step 5 dominates → many spectra with complex write-back; usually negligible.
12.1. R² Computation¶
The goodness-of-fit metric R² is computed during build_result():
Where \(\bar{y}_w\) is the weighted mean. When weights are present, only non-zero-weight points contribute to both the numerator and denominator, ensuring that masked regions (negative values, outliers, padding) do not artificially inflate or deflate the reported quality.