nifreeze.model.gpr module¶
Derivations from scikit-learn for Gaussian Processes.
- nifreeze.model.gpr.BOUNDS_A: tuple[float, float] = (0.1, 2.35)¶
The limits for the parameter a (angular distance in rad).
- nifreeze.model.gpr.BOUNDS_LAMBDA: tuple[float, float] = (0.001, 1000)¶
The limits for the parameter λ (signal scaling factor).
- nifreeze.model.gpr.CONFIGURABLE_OPTIONS: Mapping[str, set] = {'CG': {'disp', 'gtol', 'maxiter'}, 'Nelder-Mead': {'adaptive', 'disp', 'fatol', 'maxiter'}}¶
A mapping from optimizer names to the option set they allow.
Add new optimizers to this mapping, together with the options they accept (see the sketch below).
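For illustration, registering an additional optimizer might look like the following sketch; the 'Powell' entry and its option names are hypothetical and not part of the module:

# Hypothetical extension of the mapping; only 'CG' and 'Nelder-Mead' exist upstream.
CONFIGURABLE_OPTIONS = {
    "CG": {"disp", "gtol", "maxiter"},
    "Nelder-Mead": {"adaptive", "disp", "fatol", "maxiter"},
    "Powell": {"disp", "maxiter", "xtol"},  # illustrative new optimizer and its options
}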
- class nifreeze.model.gpr.DiffusionGPR(self, kernel: 'Kernel | None' = None, *, alpha: 'float' = 0.5, optimizer: 'str | Callable | None' = 'fmin_l_bfgs_b', n_restarts_optimizer: 'int' = 0, copy_X_train: 'bool' = True, normalize_y: 'bool' = True, n_targets: 'int | None' = None, random_state: 'int | None' = None, eval_gradient: 'bool' = True, tol: 'float | None' = None, disp: 'bool | int | None' = None, maxiter: 'int | None' = None, ftol: 'float | None' = None, gtol: 'float | None' = None, adaptive: 'bool | int | None' = None, fatol: 'float | None' = None)[source]¶
Bases:
GaussianProcessRegressor
A Gaussian process (GP) regressor specialized for nifreeze.
This specialization of the default GP regressor is created to allow the following extended behaviors:
Pacify Scikit-learn’s estimator parameter checker so that optimizers other than the default L-BFGS-B (fmin_l_bfgs_b) may be given by name (as a string).
Enable custom optimizer options. See scipy.optimize.minimize for the available options; please note that only a few of them are currently supported.
In the future, this specialization would be the right place to implement hyperparameter optimization through cross-validation and similar strategies.
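As a minimal usage sketch (not part of the documented examples), the regressor can be instantiated with one of the kernels below and a named optimizer, forwarding options listed in CONFIGURABLE_OPTIONS; the toy arrays and hyperparameter values are illustrative only, and convergence on such random data is not guaranteed:

import numpy as np
from nifreeze.model.gpr import DiffusionGPR, SphericalKriging

rng = np.random.default_rng(1234)
X = rng.standard_normal((30, 3))                      # toy diffusion gradient directions
X /= np.linalg.norm(X, axis=1, keepdims=True)
y = rng.standard_normal((30, 100))                    # toy DWI signal (one column per voxel)

gpr = DiffusionGPR(
    kernel=SphericalKriging(beta_a=1.38, beta_l=0.5),
    alpha=0.5,
    optimizer="Nelder-Mead",                          # optimizer given by name (a string)
    maxiter=200,                                      # Nelder-Mead options from CONFIGURABLE_OPTIONS
    fatol=1e-3,
)
y_pred = gpr.fit(X, y).predict(X[:5])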
In principle, Scikit-Learn’s implementation normalizes the training data as in [Andersson15] (see FSL’s source code). From their paper (p. 167, end of first column):
Typically one just subtracts the mean (\(\bar{\mathbf{f}}\)) from \(\mathbf{f}\) and then adds it back to \(f^{*}\), which is analogous to what is often done in “traditional” regression.
Finally, the parameter \(\sigma^2\) maps onto Scikit-learn’s alpha of the regressor. Because it is not a parameter of the kernel, hyperparameter selection through gradient descent with analytical gradient calculations would not work (the derivative of the kernel w.r.t. alpha is zero). This might have been overlooked in [Andersson15], or else they actually did not use analytical gradient descent:
A note on optimisation
It is suggested, for example in Rasmussen and Williams (2006), that an optimisation method that uses derivative information should be used when finding the hyperparameters that maximise Eq. (12). The reason for that is that such methods typically use fewer steps, and when the cost of calculating the derivatives is small/moderate compared to calculating the function itself (as is the case for Eq. (12)) then execution time can be much shorter. However, we found that for the multi-shell case a heuristic optimisation method such as the Nelder-Mead simplex method (Nelder and Mead, 1965) was frequently better at avoiding local maxima. Hence, that was the method we used for all optimisations in the present paper.
Multi-shell regression (TODO). For multi-shell modeling, the kernel \(k(\textbf{x}, \textbf{x'})\) is updated following Eq. (14) in [Andersson15].
\[k(\textbf{x}, \textbf{x'}) = C_{\theta}(\mathbf{g}, \mathbf{g'}; a) C_{b}(|b - b'|; \ell)\]
where \(C_{b}\) is based on the log of the b-values ratio, a measure of distance along the b-direction, according to Eq. (15):
\[C_{b}(b, b'; \ell) = \exp\left( - \frac{(\log b - \log b')^2}{2 \ell^2} \right),\]
\(b\) and \(b'\) being the b-values, \(\mathbf{g}\) and \(\mathbf{g'}\) the unit diffusion-encoding gradient vectors of the shells, and \({a, \ell}\) some hyperparameters.
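A minimal sketch of Eq. (15), assuming strictly positive b-values; the helper name log_b_covariance is hypothetical and not part of this module:

import numpy as np

def log_b_covariance(b: np.ndarray, b_prime: np.ndarray, ell: float) -> np.ndarray:
    """Evaluate C_b of Eq. (15) for every pair of b-values."""
    log_dist = np.log(b)[:, None] - np.log(b_prime)[None, :]
    return np.exp(-(log_dist ** 2) / (2 * ell ** 2))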
The full GP regression kernel \(\mathbf{K}\) is then updated for a 2-shell case as follows (Eq. (16) in [Andersson15]):
\[\begin{split}\begin{equation} \mathbf{K} = \left[ \begin{matrix} \lambda C_{\theta}(\theta (\mathbf{G}_{1}); a) + \sigma_{1}^{2} \mathbf{I} & \lambda C_{\theta}(\theta (\mathbf{G}_{2}, \mathbf{G}_{1}); a) C_{b}(b_{2}, b_{1}; \ell) \\ \lambda C_{\theta}(\theta (\mathbf{G}_{1}, \mathbf{G}_{2}); a) C_{b}(b_{1}, b_{2}; \ell) & \lambda C_{\theta}(\theta (\mathbf{G}_{2}); a) + \sigma_{2}^{2} \mathbf{I} \\ \end{matrix} \right] \end{equation}\end{split}\]
References
[Andersson15] Andersson, J. L. R., & Sotiropoulos, S. N. (2015). Non-parametric representation and prediction of single- and multi-shell diffusion-weighted MRI data using Gaussian processes. NeuroImage, 122, 166–176.
- class nifreeze.model.gpr.ExponentialKriging(self, beta_a: 'float' = 0.01, beta_l: 'float' = 2.0, a_bounds: 'tuple[float, float]' = (0.1, 2.35), l_bounds: 'tuple[float, float]' = (0.001, 1000))[source]¶
Bases:
Kernel
A scikit-learn kernel for DWI signals.
- Parameters:
- __call__(X: np.ndarray, Y: np.ndarray | None = None, eval_gradient: bool = False) np.ndarray | tuple[np.ndarray, np.ndarray] [source]¶
Return the kernel K(X, Y) and optionally its gradient.
- Parameters:
- Returns:
- diag(X: numpy.ndarray) numpy.ndarray [source]¶
Returns the diagonal of the kernel k(X, X).
The result of this method is identical to np.diag(self(X)); however, it can be evaluated more efficiently since only the diagonal is evaluated.
- property hyperparameter_a: sklearn.gaussian_process.kernels.Hyperparameter¶
- property hyperparameter_l: sklearn.gaussian_process.kernels.Hyperparameter¶
- nifreeze.model.gpr.LBFGS_CONFIGURABLE_OPTIONS = {'disp', 'ftol', 'gtol', 'maxiter'}¶
The set of extended options that can be set on the default L-BFGS-B optimizer.
- nifreeze.model.gpr.NONGRADIENT_METHODS = {'Nelder-Mead'}¶
The set of optimization methods that do not admit analytical gradients.
- nifreeze.model.gpr.SUPPORTED_OPTIMIZERS = {'CG', 'Nelder-Mead', 'fmin_l_bfgs_b'}¶
A set of supported optimizers (automatically created).
- class nifreeze.model.gpr.SphericalKriging(self, beta_a: 'float' = 1.38, beta_l: 'float' = 0.5, a_bounds: 'tuple[float, float]' = (0.1, 2.35), l_bounds: 'tuple[float, float]' = (0.001, 1000))[source]¶
Bases:
Kernel
A scikit-learn kernel for DWI signals.
- Parameters:
- __call__(X: np.ndarray, Y: np.ndarray | None = None, eval_gradient: bool = False) np.ndarray | tuple[np.ndarray, np.ndarray] [source]¶
Return the kernel K(X, Y) and optionally its gradient.
- Parameters:
- Returns:
- diag(X: numpy.ndarray) numpy.ndarray [source]¶
Returns the diagonal of the kernel k(X, X).
The result of this method is identical to np.diag(self(X)); however, it can be evaluated more efficiently since only the diagonal is evaluated.
- property hyperparameter_a: sklearn.gaussian_process.kernels.Hyperparameter¶
- property hyperparameter_l: sklearn.gaussian_process.kernels.Hyperparameter¶
- nifreeze.model.gpr.compute_pairwise_angles(X: np.ndarray, Y: np.ndarray | None = None, closest_polarity: bool = True, dense_output: bool = True) np.ndarray [source]¶
Compute pairwise angles across diffusion gradient encoding directions.
Following [Andersson15], it computes the smallest of the angles between each pair of directions if closest_polarity is True, i.e.,
\[\theta(\mathbf{g}, \mathbf{g'}) = \arccos(|\langle \mathbf{g}, \mathbf{g'} \rangle|)\]
- Parameters:
X ({array-like, sparse matrix} of shape (n_samples_X, n_features)) – Input data.
Y ({array-like, sparse matrix} of shape (n_samples_Y, n_features), optional) – Input data. If None, the output will be the pairwise angles between all samples in X.
dense_output (bool, optional) – Whether to return dense output even when the input is sparse. If False, the output is sparse if both input arrays are sparse.
closest_polarity (bool, optional) – True to consider the smallest of the two angles between the crossing lines resulting from reversing each vector pair.
- Returns:
Pairwise angles across diffusion gradient encoding directions.
- Return type:
np.ndarray
Examples
>>> X = np.asarray([(1.0, -1.0), (0.0, 0.0), (0.0, 0.0)]).T
>>> compute_pairwise_angles(X, closest_polarity=False)[0, 1]
3.1415...
>>> X = np.asarray([(1.0, -1.0), (0.0, 0.0), (0.0, 0.0)]).T
>>> Y = np.asarray([(1.0, -1.0), (0.0, 0.0), (0.0, 0.0)]).T
>>> compute_pairwise_angles(X, Y, closest_polarity=False)[0, 1]
3.1415...
>>> X = np.asarray([(1.0, -1.0), (0.0, 0.0), (0.0, 0.0)]).T
>>> compute_pairwise_angles(X)[0, 1]
0.0
- nifreeze.model.gpr.exponential_covariance(theta: numpy.ndarray, a: float) numpy.ndarray [source]¶
Compute the exponential covariance for given distances and scale parameter.
Implements \(C_{\theta}\), following Eq. (9) in [Andersson15]:
\[\begin{equation} C(\theta) = e^{-\theta/a} \,\, \text{for} \, 0 \leq \theta \leq \pi, \end{equation}\]
\(\theta\) being computed as:
\[\theta(\mathbf{g}, \mathbf{g'}) = \arccos(|\langle \mathbf{g}, \mathbf{g'} \rangle|)\]
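A small usage sketch combining compute_pairwise_angles and exponential_covariance; the gradient directions and the value a = 1.0 are illustrative:

import numpy as np
from nifreeze.model.gpr import compute_pairwise_angles, exponential_covariance

g = np.asarray([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]])  # one direction per row
theta = compute_pairwise_angles(g)        # smallest angles (closest_polarity=True by default)
C = exponential_covariance(theta, 1.0)    # Eq. (9): exp(-theta / a)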
- nifreeze.model.gpr.spherical_covariance(theta: numpy.ndarray, a: float) numpy.ndarray [source]¶
Compute the spherical covariance for given distances and scale parameter.
Implements \(C_{\theta}\), following Eq. (10) in [Andersson15]:
\[\begin{split}\begin{equation} C(\theta) = \begin{cases} 1 - \frac{3 \theta}{2 a} + \frac{\theta^3}{2 a^3} & \textnormal{if} \; \theta \leq a \\ 0 & \textnormal{if} \; \theta > a \end{cases} \end{equation}\end{split}\]
\(\theta\) being computed as:
\[\theta(\mathbf{g}, \mathbf{g'}) = \arccos(|\langle \mathbf{g}, \mathbf{g'} \rangle|)\]
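A small sketch evaluating Eq. (10) directly on a few angular distances; the input values and a = 1.38 are illustrative:

import numpy as np
from nifreeze.model.gpr import spherical_covariance

theta = np.asarray([0.0, 0.5, 1.0, 2.0])  # angular distances in radians
C = spherical_covariance(theta, 1.38)     # zero wherever theta > a, per the piecewise form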