API Reference
Public API
NaNTracker.nantrack — Function
nantrack(model)
Wrap trackable leaf layers of model with NaNCheck for forward and backward NaN detection. Returns a structurally identical model that throws DomainError (including the layer's KeyPath) at the first NaN.
The function uses Functors.fmap_with_path to walk the model tree and only wraps layers for which trackable returns true. Already-wrapped NaNCheck nodes are left unchanged (safe to call twice).
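The walk described above can be illustrated with a toy version in plain Julia. This is only a sketch under stated assumptions: it uses nested NamedTuples instead of Functors.fmap_with_path, a tuple of keys instead of a KeyPath, and the names ToyDense, ToyCheck, toy_trackable, and toy_wrap are hypothetical stand-ins that are not part of NaNTracker.

```julia
# Hypothetical stand-ins; none of these names come from NaNTracker or Functors.
struct ToyDense
    weight::Matrix{Float64}
end

struct ToyCheck{L}
    path::Tuple   # keys leading to the wrapped layer
    layer::L
end

# Predicate playing the role of `trackable`: only ToyDense leaves are wrapped.
toy_trackable(path, layer) = false
toy_trackable(path, ::ToyDense) = true

# Rebuild the tree recursively, passing down the path of keys seen so far.
toy_wrap(x::ToyCheck, path=()) = x   # already wrapped: leave unchanged
toy_wrap(x::NamedTuple, path=()) =
    NamedTuple{keys(x)}(map(k -> toy_wrap(x[k], (path..., k)), keys(x)))
toy_wrap(x, path=()) = toy_trackable(path, x) ? ToyCheck(path, x) : x

model = (encoder = (dense = ToyDense(ones(2, 2)), act = tanh),
         head = ToyDense(zeros(1, 2)))
wrapped = toy_wrap(model)
rewrapped = toy_wrap(wrapped)   # second call does not double-wrap
```

In this sketch the activation function tanh is passed through untouched, and the second call returns the existing wrappers instead of nesting them, mirroring the two properties stated above.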
Stats tracking
Call enable_stats!() before a training step to record the norm and maxabs of every activation and gradient at each checked layer. When a NaN is detected, the recent trajectory is printed automatically. Query stats at any time with dump_stats() or recent_stats().
See also nanuntrack, trackable, enable_stats!.
NaNTracker.nanuntrack — Function
nanuntrack(model)
Strip all NaNCheck wrappers, restoring the original model.
NaNTracker.trackable — Function
trackable(::KeyPath, layer) :: Bool
Predicate that decides whether layer should be wrapped with NaNCheck. Returns true for common Flux leaf layers (Dense, Embedding, LayerNorm, Scale, Conv).
Functions are not wrapped. Pure functions (relu, swish, identity, etc.) have no parameters and cannot introduce NaN through weights. Wrapping them breaks GPU broadcasting (the NaNCheck wrapper is not isbits, which CUDA kernels require) and interferes with libraries like Onion that store activation functions as struct fields and broadcast them over GPU arrays.
Extend for your own leaf layers:
NaNTracker.trackable(::KeyPath, ::MyCustomLeaf) = true
NaNTracker.NaNCheck — Type
NaNCheck{P,L}
Thin wrapper around a Flux layer that checks for NaN on every forward and backward pass. P is the path type (KeyPath), L is the wrapped layer type.
This struct:
- Has no custom rrule: the inner layer is differentiated normally by whatever AD backend is active.
- Forwards getproperty for unknown fields to the wrapped layer, making it transparent to code that accesses layer internals (e.g. .weight).
- Is registered with Functors.@functor (not Flux.@layer) so that fmap / Optimisers.update! can reach the trainable parameters inside layer.
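The getproperty forwarding and the forward-pass check can be sketched in plain Julia as follows. This is an illustrative sketch, not NaNCheck's actual source: SketchDense and SketchCheck are hypothetical names, the path is a plain tuple and not a KeyPath, and the backward-pass check and Functors registration are omitted.

```julia
# Hypothetical stand-in for a parametrised layer.
struct SketchDense
    weight::Matrix{Float64}
    bias::Vector{Float64}
end
(d::SketchDense)(x) = d.weight * x .+ d.bias

# Hypothetical wrapper: P is the path type, L the wrapped layer type.
struct SketchCheck{P,L}
    path::P
    layer::L
end

# Own fields are read with getfield; every other name is forwarded to the layer.
function Base.getproperty(c::SketchCheck, name::Symbol)
    if name === :path || name === :layer
        return getfield(c, name)
    end
    return getproperty(getfield(c, :layer), name)
end

# Forward pass only: check the input and the output, then return the output.
function (c::SketchCheck)(x)
    path = getfield(c, :path)
    any(isnan, x) && throw(DomainError(path, "NaN in forward input"))
    y = getfield(c, :layer)(x)
    any(isnan, y) && throw(DomainError(path, "NaN in forward output"))
    return y
end

good = SketchCheck((:head,), SketchDense(ones(1, 2), zeros(1)))
bad = SketchCheck((:head,), SketchDense(fill(NaN, 1, 2), zeros(1)))
```

Here good.weight reaches the inner layer's weight matrix through the forwarded getproperty, and calling bad on any input throws a DomainError that carries the path.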
Stats Tracking
NaNTracker.enable_stats! — Function
enable_stats!(; capacity=1000)
Turn on activation/gradient stats collection. Each forward input, forward output, and gradient flowing through a NaNCheck layer records norm, maxabs, and NaN/Inf flags into a ring buffer.
Note: On GPU this introduces sync points (scalar transfers) at every checked layer. Use for debugging only.
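A fixed-capacity ring buffer of per-tensor statistics, as described above, might look roughly like this. It is a sketch only: SketchStats, SketchRing, and record! are hypothetical names, the fields of the real StatsEntry are not given in this reference, and the input array is assumed to be non-empty.

```julia
using LinearAlgebra: norm

# Hypothetical record; the real StatsEntry layout is not documented here.
struct SketchStats
    path::String
    norm::Float64
    maxabs::Float64
    has_nan::Bool
    has_inf::Bool
end

mutable struct SketchRing
    entries::Vector{SketchStats}
    capacity::Int
    next::Int   # slot that the next record overwrites once the buffer is full
end
SketchRing(capacity::Int) = SketchRing(SketchStats[], capacity, 1)

# Each reduction below yields one host scalar; on a GPU array each would force a sync.
function record!(ring::SketchRing, path::String, x::AbstractArray)
    entry = SketchStats(path, norm(x), maximum(abs, x), any(isnan, x), any(isinf, x))
    if length(ring.entries) < ring.capacity
        push!(ring.entries, entry)
    else
        ring.entries[ring.next] = entry   # overwrite the oldest entry
    end
    ring.next = mod1(ring.next + 1, ring.capacity)
    return entry
end

ring = SketchRing(2)
record!(ring, "a", [3.0, -4.0])
record!(ring, "b", [1.0, NaN])
record!(ring, "c", [Inf])   # buffer is full, so this replaces the entry for "a"
```

The four reductions per recorded tensor show why the note above restricts stats collection to debugging: every one of them transfers a scalar back to the host.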
NaNTracker.disable_stats! — Function
Turn off stats collection and release the buffer.
NaNTracker.clear_stats! — Function
Clear recorded stats without disabling collection.
NaNTracker.recent_stats — Function
recent_stats(; n=50, path_contains="")
Return recent StatsEntry records. Optionally filter by path substring. Returns an empty vector when stats are disabled.
NaNTracker.dump_stats — Function
dump_stats(; n=50, path_contains="", io=stderr)
Print recent stats entries to io. Useful for inspecting activation/gradient magnitudes during training without waiting for a NaN.
Example
enable_stats!()
# ... run one training step ...
dump_stats(path_contains="attention") # show only attention layers
clear_stats!() # reset for next step
Internal
These are not exported but can be extended.
NaNTracker.hasnan — Function
hasnan(x) :: Bool
Check whether x contains any NaN values. Dispatches on type so it works for arrays, scalars, tuples, and falls back to false for anything else.
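The dispatch pattern described above can be sketched with a few methods in plain Julia. The name hasnan_sketch is hypothetical, and the exact set of methods NaNTracker defines is an assumption; only real-valued scalars, real-valued arrays, and tuples are covered here.

```julia
# Hypothetical re-creation of the dispatch pattern; not NaNTracker's own methods.
hasnan_sketch(x::Real) = isnan(x)                      # scalars
hasnan_sketch(x::AbstractArray{<:Real}) = any(isnan, x) # numeric arrays
hasnan_sketch(x::Tuple) = any(hasnan_sketch, x)         # tuples, checked recursively
hasnan_sketch(x) = false                                # anything else

nested = (1.0, ([0.0], NaN32))
```

Supporting another container type in this sketch would mean adding one more method, which is the same mechanism that makes the unexported hasnan extensible.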