Scoring
Proper scoring rules for posterior predictive evaluation.
trade_study.score(metric, predictions, truth, *, alpha=None, level=0.95)
Compute a scalar scoring rule.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| metric | str | One of "crps", "wis", "interval", "energy", "rmse", "mae", "coverage", "brier". | required |
| predictions | NDArray[floating[Any]] | Model predictions (ensemble members, quantiles, etc.). | required |
| truth | NDArray[floating[Any]] | Known ground truth values. | required |
| alpha | float \| NDArray[floating[Any]] \| None | Significance level for interval-based scores. | None |
| level | float | Nominal coverage level for coverage metric. | 0.95 |
Returns:
| Type | Description |
|---|---|
| float | Scalar score value. |
Raises:
| Type | Description |
|---|---|
| ValueError | If the metric name is not recognized. |
Source code in src/trade_study/_scoring.py
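To make the "crps" metric concrete, the following is a minimal NumPy sketch of the standard ensemble CRPS estimator, E|X - y| - 0.5 E|X - X'|, averaged over observations. It is not taken from `trade_study`: the function name `crps_ensemble` is hypothetical, and the `(n_obs, n_members)` layout for the predictions is an assumption, since this page does not state the shape `score` expects.

```python
import numpy as np


def crps_ensemble(members: np.ndarray, truth: np.ndarray) -> float:
    """Mean ensemble CRPS (hypothetical helper, not part of trade_study).

    Assumes members has shape (n_obs, n_members) and truth has shape (n_obs,).
    """
    members = np.asarray(members, dtype=float)
    truth = np.asarray(truth, dtype=float)

    # First term: mean absolute error of each member against the truth.
    abs_error = np.abs(members - truth[:, None]).mean(axis=1)

    # Second term: half the mean absolute difference over all member pairs.
    pairwise = np.abs(members[:, :, None] - members[:, None, :])
    spread = 0.5 * pairwise.mean(axis=(1, 2))

    # Average the per-observation scores into a single scalar.
    return float((abs_error - spread).mean())


# One observation, two members at 0 and 2, truth at 1.
print(crps_ensemble(np.array([[0.0, 2.0]]), np.array([1.0])))  # 0.5
```

With a single ensemble member the spread term vanishes and the score reduces to the mean absolute error, which is a quick sanity check for any CRPS implementation. The pairwise term needs memory proportional to n_obs × n_members², so large ensembles may call for a sorted-sample formulation instead.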
trade_study.coverage_curve(posteriors, truth, levels=None)
Compute empirical coverage across nominal levels.
Parameters:
| Name | Type | Description | Default |
|---|---|---|---|
| posteriors | NDArray[floating[Any]] | Posterior samples, shape (n_obs, n_samples). | required |
| truth | NDArray[floating[Any]] | True values, shape (n_obs,). | required |
| levels | NDArray[floating[Any]] \| None | Nominal coverage levels (default: 0.05 to 0.99). | None |
Returns:
| Type | Description |
|---|---|
| tuple[NDArray[floating[Any]], NDArray[floating[Any]]] | Tuple of (nominal_levels, empirical_coverage). |
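
As an illustration of what an empirical coverage curve measures, here is a self-contained NumPy sketch that uses the shapes documented above. It is an assumption-laden stand-in rather than the library's implementation: the name `coverage_curve_sketch` is hypothetical, the intervals are taken to be central (equal-tailed) sample quantiles, and the default grid of 20 evenly spaced levels between 0.05 and 0.99 is a guess, since the page gives only the endpoints.

```python
import numpy as np


def coverage_curve_sketch(posteriors, truth, levels=None):
    """Empirical coverage of central intervals (hypothetical helper).

    posteriors: shape (n_obs, n_samples); truth: shape (n_obs,).
    """
    posteriors = np.asarray(posteriors, dtype=float)
    truth = np.asarray(truth, dtype=float)
    if levels is None:
        # Assumed grid; only the 0.05 and 0.99 endpoints are documented.
        levels = np.linspace(0.05, 0.99, 20)
    levels = np.asarray(levels, dtype=float)

    empirical = np.empty_like(levels)
    for i, level in enumerate(levels):
        # Central interval: cut (1 - level) / 2 of the mass from each tail.
        tail = (1.0 - level) / 2.0
        lower = np.quantile(posteriors, tail, axis=1)
        upper = np.quantile(posteriors, 1.0 - tail, axis=1)
        # Fraction of observations whose truth lies inside the interval.
        empirical[i] = np.mean((truth >= lower) & (truth <= upper))
    return levels, empirical


# Four observations sharing the same uniform-like posterior on [0, 1].
samples = np.tile(np.linspace(0.0, 1.0, 101), (4, 1))
nominal, observed = coverage_curve_sketch(
    samples, np.array([0.5, 0.5, 0.9, 2.0]), levels=np.array([0.5, 0.9])
)
print(nominal, observed)
```

For a well-calibrated posterior the empirical coverage tracks the nominal level, so plotting one against the other should stay close to the diagonal. Values below the diagonal suggest intervals that are too narrow, and values above it suggest intervals that are too wide.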