Quo vadis statistical evaluation measures?
Statistical evaluation measures are foundational for assessing the quality, performance, and reliability of models and hypotheses. Today, a wide range of metrics is in use, and the appropriate choice depends on context (see the overview of research questions). Using and interpreting these measures correctly matters now and will continue to matter in the future.
We focus in particular on archetypal measures (such as the p-value) and on measures for evaluating probabilistic predictions of binary outcomes (such as the Brier score).
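To make the second kind of measure concrete, here is a minimal Python sketch of the Brier score under its standard definition: the mean squared difference between the predicted probabilities and the observed 0/1 outcomes. The function name `brier_score` and the example numbers are illustrative assumptions, not taken from any particular study.

```python
def brier_score(probs, outcomes):
    """Mean squared difference between forecast probabilities and 0/1 outcomes.

    Lower is better: 0.0 is a perfect forecast, and always predicting 0.5
    yields 0.25 regardless of the outcomes.
    """
    if len(probs) != len(outcomes) or len(probs) == 0:
        raise ValueError("probs and outcomes must be non-empty and of equal length")
    return sum((p - y) ** 2 for p, y in zip(probs, outcomes)) / len(probs)


# Illustrative (made-up) forecasts and observed binary outcomes.
forecasts = [0.9, 0.2, 0.7, 0.5]
observed = [1, 0, 0, 1]
score = brier_score(forecasts, observed)
print(f"Brier score: {score:.4f}")
```

The score rewards forecasts that are both confident and correct: the third forecast (0.7 for an event that did not occur) contributes most of the error in this toy example.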