mcnemar_tables: contingency tables for McNemar's test and Cochran's Q test

Function to compute 2x2 contingency tables for McNemar's test and Cochran's Q test

from mlxtend.evaluate import mcnemar_tables

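Overview

Contingency tables

A 2x2 contingency table, as used in McNemar's test (mlxtend.evaluate.mcnemar), is a useful aid for comparing two different models. In contrast to a typical confusion matrix, this table compares two models to each other rather than showing the false positives, true positives, false negatives, and true negatives of a single model's predictions.

For instance, given that two models have accuracies of 99.7% and 99.6%, a 2x2 contingency table can provide further insights for model selection.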

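In both figure A and figure B, the predictive accuracies of the two models are as follows:

Model 1 accuracy: 9,960 / 10,000 = 99.6%
Model 2 accuracy: 9,970 / 10,000 = 99.7%

Now, in figure A, we can see that model 2 got 11 predictions right that model 1 got wrong. Vice versa, model 1 got 1 prediction right that model 2 got wrong. Thus, based on this 11:1 ratio, we may conclude that model 2 performs substantially better than model 1. However, in figure B, the ratio is 25:15, which is less conclusive about which model is the better one to choose.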
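To make the 11:1 versus 25:15 comparison concrete, the following minimal sketch computes McNemar's statistic from the two discordant counts alone. It assumes the continuity-corrected chi-squared form with one degree of freedom and, as an alternative for small counts, a two-sided exact binomial p-value. The helper names mcnemar_chi2 and mcnemar_exact are hypothetical and not part of mlxtend; in practice, mlxtend.evaluate.mcnemar would be the natural tool.

```python
import math


def mcnemar_chi2(b, c):
    """Continuity-corrected McNemar statistic and p-value (hypothetical helper)."""
    chi2 = (abs(b - c) - 1) ** 2 / (b + c)
    # Survival function of a chi-squared variable with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def mcnemar_exact(b, c):
    """Two-sided exact binomial p-value (hypothetical helper)."""
    n, k = b + c, min(b, c)
    # Probability of observing k or fewer successes under Binomial(n, 0.5)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Discordant counts quoted in the text: figure A (11 vs. 1), figure B (25 vs. 15)
for name, (b, c) in {"A": (11, 1), "B": (25, 15)}.items():
    chi2, p = mcnemar_chi2(b, c)
    print(f"figure {name}: chi2={chi2:.3f}, p={p:.4f}, "
          f"exact p={mcnemar_exact(b, c):.4f}")
```

Under these assumptions, the 11:1 split gives a statistic of 6.75 with a p-value below 0.01, whereas the 25:15 split gives 2.025 with a p-value above 0.1, which matches the intuition described above.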

References

McNemar, Quinn, 1947. "Note on the sampling error of the difference between correlated proportions or percentages". Psychometrika. 12 (2): 153–157.
Edwards, A. L., 1948. "Note on the 'correction for continuity' in testing the significance of the difference between correlated proportions". Psychometrika. 13 (3): 185–187. doi:10.1007/BF02289261.
https://en.wikipedia.org/wiki/McNemar%27s_test

Example 1 - Single 2x2 contingency table

import numpy as np
from mlxtend.evaluate import mcnemar_tables

y_true = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])

y_mod0 = np.array([0, 1, 0, 0, 0, 1, 1, 0, 0, 0])
y_mod1 = np.array([0, 0, 1, 1, 0, 1, 1, 0, 0, 0])

tb = mcnemar_tables(y_true, 
                    y_mod0, 
                    y_mod1)

tb

{'model_0 vs model_1': array([[ 4.,  1.],
        [ 2.,  3.]])}

To visualize (and better interpret) the contingency table with matplotlib, we can use the checkerboard_plot function:

from mlxtend.plotting import checkerboard_plot
import matplotlib.pyplot as plt

brd = checkerboard_plot(tb['model_0 vs model_1'],
                        figsize=(3, 3),
                        fmt='%d',
                        col_labels=['model 2 wrong', 'model 2 right'],
                        row_labels=['model 1 wrong', 'model 1 right'])
plt.show()


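Example 2 - Multiple 2x2 contingency tables

If more than two models are provided as input to the mcnemar_tables function, a 2x2 contingency table is created for each pair of models: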

import numpy as np
from mlxtend.evaluate import mcnemar_tables

y_true = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])

y_mod0 = np.array([0, 1, 0, 0, 0, 1, 1, 0, 0, 0])
y_mod1 = np.array([0, 0, 1, 1, 0, 1, 1, 0, 0, 0])
y_mod2 = np.array([0, 0, 1, 1, 0, 1, 1, 0, 1, 0])

tb = mcnemar_tables(y_true, 
                    y_mod0, 
                    y_mod1,
                    y_mod2)

for key, value in tb.items():
    print(key, '\n', value, '\n')

model_0 vs model_1 
 [[ 4.  1.]
 [ 2.  3.]]

model_0 vs model_2 
 [[ 4.  2.]
 [ 2.  2.]]

model_1 vs model_2 
 [[ 5.  1.]
 [ 0.  4.]]

API

mcnemar_tables(y_target, *y_model_predictions)

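Compute multiple 2x2 contingency tables for McNemar's test or Cochran's Q test.

Parameters

y_target : array-like, shape=[n_samples]

True class labels as a 1D NumPy array.

y_model_predictions : array-like, shape=[n_samples]

Predicted class labels for a model; one such array is passed per model.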

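Returns

tables : dict

Dictionary of NumPy arrays with shape=[2, 2]. Each dictionary key names the two models being compared, based on the order in which the models were passed as *y_model_predictions. The number of dictionary entries equals the number of pairwise combinations among the m models, i.e., "m choose 2".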
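As a small illustration of the key naming and the "m choose 2" count, the sketch below lists the keys one would expect for a given number of models. The helper expected_table_keys is hypothetical and not part of mlxtend; it only assumes the 'model_i vs model_j' pattern seen in the example outputs on this page.

```python
from itertools import combinations
from math import comb


def expected_table_keys(n_models):
    """Key names in the 'model_i vs model_j' pattern, one per unordered pair.

    Hypothetical helper for illustration; not an mlxtend function.
    """
    return [f"model_{i} vs model_{j}"
            for i, j in combinations(range(n_models), 2)]


keys = expected_table_keys(3)
print(keys)
# ['model_0 vs model_1', 'model_0 vs model_2', 'model_1 vs model_2']

# The number of keys matches "m choose 2", here for m = 5
print(len(expected_table_keys(5)) == comb(5, 2))  # True
```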

For example, the following target array (containing the true labels) and 3 models
- y_true = np.array([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])
- y_mod0 = np.array([0, 1, 0, 0, 0, 1, 1, 0, 0, 0])
- y_mod1 = np.array([0, 0, 1, 1, 0, 1, 1, 0, 0, 0])
- y_mod2 = np.array([0, 1, 1, 1, 0, 1, 0, 0, 0, 0])
would result in the following dictionary:

{'model_0 vs model_1': array([[ 4., 1.], [ 2., 3.]]), 'model_0 vs model_2': array([[ 3., 0.], [ 3., 4.]]), 'model_1 vs model_2': array([[ 3., 0.], [ 2., 5.]])}

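Each array is structured as follows:
- tb[0, 0]: number of samples that both models predicted correctly
- tb[0, 1]: number of samples that model a got right and model b got wrong
- tb[1, 0]: number of samples that model b got right and model a got wrong
- tb[1, 1]: number of samples that both models predicted incorrectly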
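The four cells can be counted by hand, which helps when checking a table against this description. The following pure-Python sketch is not the mlxtend implementation; the function contingency_table is a hypothetical helper that follows the cell layout listed above literally. Note that the example outputs shown on this page place the two off-diagonal counts in the opposite positions, so it is worth verifying the orientation against the installed mlxtend version.

```python
def contingency_table(y_true, y_a, y_b):
    """2x2 agreement table for two models, in the cell layout listed above.

    Hypothetical helper for illustration; not an mlxtend function.
    """
    tb = [[0, 0], [0, 0]]
    for t, a, b in zip(y_true, y_a, y_b):
        a_ok, b_ok = (a == t), (b == t)
        if a_ok and b_ok:
            tb[0][0] += 1  # both models right
        elif a_ok:
            tb[0][1] += 1  # only model a right
        elif b_ok:
            tb[1][0] += 1  # only model b right
        else:
            tb[1][1] += 1  # both models wrong
    return tb


y_true = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
y_mod0 = [0, 1, 0, 0, 0, 1, 1, 0, 0, 0]
y_mod1 = [0, 0, 1, 1, 0, 1, 1, 0, 0, 0]

tb = contingency_table(y_true, y_mod0, y_mod1)
print(tb)  # [[4, 2], [1, 3]]
```

The first row sums to the number of samples model a classified correctly (6 here), and the four cells together sum to the number of samples.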

Examples

For usage examples, please see
https://mlxtend.cn/mlxtend/user_guide/evaluate/mcnemar_tables/
