Computing and Plotting Sigma Charts - 空想家的博客

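The complete approach follows: how to compute sigma, plus a front-end ECharts plotting example.

1. Computing sigma

Assume:

y_true: array of ground-truth values (NumPy array or list)
y_pred: array of predicted values (from the epoch with the lowest loss)

Python implementation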

import numpy as np

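def calculate_sigma(y_true, y_pred):
    """
    Return the standard deviation of the residuals (sigma) together with
    the per-sample residuals.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)

    # Per-sample residuals
    residuals = y_true - y_pred

    # Population standard deviation of the residuals (ddof=0)
    sigma = float(np.std(residuals))

    return sigma, residuals.tolist()

# Example
y_true = [3.0, 5.0, 7.5, 10.0]
y_pred = [2.8, 5.2, 7.0, 9.8]

sigma, residuals = calculate_sigma(y_true, y_pred)
print("sigma:", sigma)          # about 0.249
print("residuals:", residuals)  # per-sample residuals, used for the scatter plot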

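Return values:

sigma: overall standard deviation of the residuals; can be drawn as horizontal reference lines
residuals: per-sample residuals; used as the y-axis data of the chart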
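
`np.std` uses the population formula (`ddof=0`) by default. The sketch below, which is an illustration and not part of the original script, reports the sample estimate (`ddof=1`) and the mean residual as well, since reference lines drawn around 0 only make sense when that mean is close to zero. The helper name `describe_residuals` is hypothetical.

```python
import numpy as np

def describe_residuals(y_true, y_pred):
    """Return the mean, population std (ddof=0) and sample std (ddof=1) of the residuals."""
    residuals = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return {
        "mean": float(residuals.mean()),
        "sigma_population": float(residuals.std(ddof=0)),
        "sigma_sample": float(residuals.std(ddof=1)),
    }

stats = describe_residuals([3.0, 5.0, 7.5, 10.0], [2.8, 5.2, 7.0, 9.8])
print(stats)
```

With only four samples the two estimates differ noticeably; with thousands of samples the difference is negligible.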

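2. Front-end ECharts example

The following is a scatter plot of residuals against sample index, with ±sigma reference lines. The lines are drawn around 0, which assumes the mean residual is close to zero.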

<div id="sigmaChart" style="width: 800px; height: 400px;"></div>

<script src="https://cdn.jsdelivr.net/npm/echarts/dist/echarts.min.js"></script>
<script>
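    // Assume the backend returns the following data (matches the Python example above)
    const residuals = [0.2, -0.2, 0.5, 0.2]; // per-sample residuals
    const sigma = 0.25; // standard deviation of the residuals

    // x axis: sample index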
    const xData = residuals.map((_, index) => index + 1);

    const chartDom = document.getElementById('sigmaChart');
    const myChart = echarts.init(chartDom);

    const option = {
        title: { text: 'Sigma residual plot (training)' },
        tooltip: { trigger: 'axis' },
        xAxis: {
            type: 'category',
            data: xData,
            name: 'Sample index'
        },
        yAxis: {
            type: 'value',
            name: 'Residual'
        },
        series: [
            {
                name: 'Residual',
                type: 'scatter',
                data: residuals,
                symbolSize: 8,
                itemStyle: {
                    color: '#5470C6'
                }
            },
            // +sigma reference line
            {
                name: '+sigma',
                type: 'line',
                data: Array(xData.length).fill(sigma),
                lineStyle: { type: 'dashed', color: '#91cc75' },
                symbol: 'none'
            },
            // -sigma reference line
            {
                name: '-sigma',
                type: 'line',
                data: Array(xData.length).fill(-sigma),
                lineStyle: { type: 'dashed', color: '#ee6666' },
                symbol: 'none'
            }
        ]
    };

    myChart.setOption(option);
</script>

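3. Possible extensions

To plot a per-epoch sigma curve over the course of training, store the sigma computed at each epoch in an array and draw it as a line chart in ECharts.
To distinguish the training set from the validation set, plot the two groups of residuals separately.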
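
A minimal sketch of both extensions, assuming each epoch yields a `(y_true, y_pred)` pair per split. The helper `sigma_by_split`, the split names, and the toy numbers are all made up for illustration.

```python
import numpy as np

def sigma_by_split(splits):
    """Map each split name to the standard deviation of its residuals.

    splits: dict such as {"train": (y_true, y_pred), "val": (y_true, y_pred)}
    """
    out = {}
    for name, (y_true, y_pred) in splits.items():
        residuals = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
        out[name] = float(np.std(residuals))
    return out

# Two toy epochs; real code would collect these inside the training loop
epoch_outputs = [
    {"train": ([1, 2, 3, 4], [1.5, 1.5, 3.5, 3.5]), "val": ([1, 2], [2, 1])},
    {"train": ([1, 2, 3, 4], [1.1, 1.9, 3.1, 3.9]), "val": ([1, 2], [1.5, 1.5])},
]

history = {"train": [], "val": []}
for splits in epoch_outputs:
    for name, sigma in sigma_by_split(splits).items():
        history[name].append(sigma)

print(history)  # one sigma list per split, ready for two line series
```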

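The rest of this part details which data to compute, how the data formats are defined and what they mean, and how to handle large datasets.

1. Data to compute

Basic concepts

Ground truth (y_true): the true labels of the test/validation set.
Prediction (y_pred): the predictions at a given epoch (usually the one with the lowest loss).
Residual: residual = y_true - y_pred, the prediction error of each sample.
σ (sigma): the standard deviation of the residuals, measuring how widely the prediction errors are spread around their mean.

Computation steps

Compute the residual of each sample: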
```
residuals = y_true - y_pred
```
Compute the overall sigma:
```
sigma = np.std(residuals)
```
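Optional: per-bin statistics
- For large sample counts, compute the mean and standard deviation of the residuals per bin (see the large-dataset part).

2. Data format design

Single-epoch residual plot (scatter + reference lines)

Used to show the residual distribution at the lowest-loss epoch.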

{
  "xData": [1, 2, 3, 4, 5],
  "residuals": [0.2, -0.1, 0.5, -0.3, 0.1],
  "sigma": 0.25
}

xData: sample index or timestamp
residuals: the corresponding residual values
sigma: overall standard deviation (used for the reference lines)
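
As a sketch of how this payload could be produced with only the standard library: the function name `build_scatter_payload` and the rounding to four decimals are assumptions, not part of the original design.

```python
import json
import statistics

def build_scatter_payload(y_true, y_pred, ndigits=4):
    """Build the {xData, residuals, sigma} dictionary described above."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    return {
        "xData": list(range(1, len(residuals) + 1)),              # 1-based sample index
        "residuals": [round(r, ndigits) for r in residuals],
        "sigma": round(statistics.pstdev(residuals), ndigits),   # population std
    }

payload = build_scatter_payload([3.0, 5.0, 7.5, 10.0], [2.8, 5.2, 7.0, 9.8])
print(json.dumps(payload))
```

Rounding on the server keeps the JSON small and hides floating-point noise such as 0.20000000000000018.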

Multi-epoch sigma curve (loss + sigma monitoring)

To show the sigma of every epoch (for example, to judge convergence), the format can be:

{
  "epochs": [1, 2, 3, 4, 5],
  "sigmaList": [0.8, 0.6, 0.4, 0.35, 0.3],
  "lossList": [1.2, 0.9, 0.5, 0.4, 0.3]
}

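epochs: training epochs
sigmaList: standard deviation of the residuals at each epoch
lossList: loss at each epoch (for a dual-axis chart or to locate the best epoch)

3. Handling large datasets

With a large sample count (more than about 50,000 points), drawing every point in the browser becomes sluggish, so the data needs to be downsampled or aggregated.

3.1 Back-end downsampling

Keep only every step-th point: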

def downsample(residuals, step=10):
    return residuals[::step]
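
Fixed-stride sampling can skip an isolated spike, which is often exactly the point a residual plot should show. One alternative, sketched here under the assumption that the extremes matter more than uniform spacing, keeps the minimum and maximum of each bucket. The name `downsample_minmax` and its parameters are hypothetical.

```python
def downsample_minmax(residuals, n_buckets=500):
    """Keep (index, value) of the min and max in each bucket so spikes survive."""
    n = len(residuals)
    if n <= 2 * n_buckets:
        return list(enumerate(residuals))  # already small enough
    size = -(-n // n_buckets)  # ceiling division
    kept = []
    for start in range(0, n, size):
        bucket = range(start, min(start + size, n))
        lo = min(bucket, key=lambda i: residuals[i])
        hi = max(bucket, key=lambda i: residuals[i])
        for i in sorted({lo, hi}):  # one point if min and max coincide
            kept.append((i, residuals[i]))
    return kept

data = [0.0] * 10_000
data[1234] = 9.0  # a single spike that data[::10] would miss
points = downsample_minmax(data, n_buckets=100)
print(len(points), (1234, 9.0) in points)
```

The function returns index/value pairs, so the front end can plot them on a value axis without losing the original sample positions.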

3.2 Per-bin statistics (recommended)

Aggregate the data into bins and show summary statistics instead of every point:

import numpy as np

def aggregate_residuals(residuals, bin_size=100):
    # Ceiling division so the final, possibly shorter, bin is not dropped
    n_bins = -(-len(residuals) // bin_size)
    agg_data = []
    for i in range(n_bins):
        segment = residuals[i*bin_size:(i+1)*bin_size]
        agg_data.append({
            "mean": float(np.mean(segment)),
            "std": float(np.std(segment)),
            "start": i*bin_size,
            "end": min((i+1)*bin_size, len(residuals))
        })
    return agg_data

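On the front end, draw the per-bin mean curve with a band of ±1 standard deviation around it; this is easier to read and renders quickly.

4. Recommended workflow

During training:
- Save y_pred and the loss at every epoch.
- Compute and record sigma.
After training:
- Find the epoch with the lowest loss.
- Take the residuals and sigma of that epoch.
Front-end display:
- Small datasets: scatter plot + sigma reference lines.
- Large datasets: per-bin statistics chart or downsampled curve.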
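
The workflow above can be sketched as a small tracker that records loss and sigma per epoch and keeps only the residuals of the best epoch so far, instead of storing every y_pred. The class `SigmaTracker` is a hypothetical illustration, not the original implementation.

```python
import numpy as np

class SigmaTracker:
    """Record loss and sigma per epoch; keep residuals of the lowest-loss epoch."""

    def __init__(self, y_true):
        self.y_true = np.asarray(y_true, dtype=float)
        self.loss_list, self.sigma_list = [], []
        self.best_epoch = None        # 0-based index
        self.best_residuals = None

    def update(self, y_pred, loss):
        residuals = self.y_true - np.asarray(y_pred, dtype=float)
        self.loss_list.append(float(loss))
        self.sigma_list.append(float(np.std(residuals)))
        if self.best_epoch is None or loss < self.loss_list[self.best_epoch]:
            self.best_epoch = len(self.loss_list) - 1
            self.best_residuals = residuals

y_true = [1.0, 2.0, 3.0]
epoch_preds = [[2.0, 3.0, 4.0], [1.1, 2.1, 2.9], [1.5, 2.5, 3.5]]

tracker = SigmaTracker(y_true)
for y_pred in epoch_preds:
    mse = float(np.mean((np.array(y_true) - np.array(y_pred)) ** 2))
    tracker.update(y_pred, mse)

print(tracker.best_epoch, tracker.sigma_list)
```

In this toy run the first epoch has sigma 0 even though its loss is 1.0, because every prediction is off by the same amount. The standard deviation ignores a constant bias, so it may be worth tracking the mean residual alongside sigma.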

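The following is a complete version, consisting of a Python data-computation script and ECharts visualizations (with support for large datasets).

1. Python data-computation script

The script does the following:

Computes and records sigma at every training epoch.
Finds the epoch with the lowest loss.
Computes the residuals of that epoch and downsamples/aggregates them.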

import numpy as np

def calculate_sigma(y_true, y_pred):
    """
    Compute the residuals and their standard deviation.
    """
    residuals = np.array(y_true) - np.array(y_pred)
    sigma = float(np.std(residuals))
    return sigma, residuals.tolist()

def find_best_epoch(loss_list):
    """
    Return the 0-based index of the epoch with the lowest loss.
    """
    return int(np.argmin(loss_list))

def aggregate_residuals(residuals, bin_size=100):
    """
    Aggregate the residuals into bins to reduce the amount of data the front end draws.
    """
    agg_data = []
    # Ceiling division so the final, possibly shorter, bin is not dropped
    n_bins = -(-len(residuals) // bin_size)
    for i in range(n_bins):
        segment = residuals[i*bin_size:(i+1)*bin_size]
        agg_data.append({
            "mean": float(np.mean(segment)),
            "std": float(np.std(segment)),
            "start": i*bin_size,
            "end": min((i+1)*bin_size, len(residuals))
        })
    return agg_data

# ======= Simulated training run =======

epochs = 50
samples = 5000

# Simulated ground truth
y_true = np.linspace(0, 10, samples)

# Simulated predictions and loss for each epoch
y_preds = [y_true + np.random.normal(0, 1/(ep+1), samples) for ep in range(epochs)]
loss_list = [np.mean((y_true - y_pred)**2) for y_pred in y_preds]

# Record sigma for each epoch
sigma_list = []
for y_pred in y_preds:
    sigma, _ = calculate_sigma(y_true, y_pred)
    sigma_list.append(sigma)

# Find the epoch with the lowest loss (0-based index; "epochs" below is 1-based)
best_epoch = find_best_epoch(loss_list)
best_y_pred = y_preds[best_epoch]

# Residuals of the best epoch
_, residuals = calculate_sigma(y_true, best_y_pred)

# Downsample / aggregate
agg_residuals = aggregate_residuals(residuals, bin_size=100)

# Data returned to the front end
result = {
    "epochs": list(range(1, epochs+1)),
    "lossList": loss_list,
    "sigmaList": sigma_list,
    "bestEpochResiduals": residuals,
    "sigma": sigma_list[best_epoch],
    "aggregatedResiduals": agg_residuals
}

print(result.keys())  # dict_keys(['epochs', 'lossList', 'sigmaList', 'bestEpochResiduals', 'sigma', 'aggregatedResiduals'])

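Returned fields

epochs: array of training epochs.
lossList: loss at each epoch.
sigmaList: sigma at each epoch.
bestEpochResiduals: all residuals (raw data) of the best epoch (lowest loss).
sigma: overall standard deviation at the best epoch.
aggregatedResiduals: aggregated residuals for visualizing large datasets.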
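
When this dictionary is sent to the browser it has to be serialized to JSON, and NumPy arrays or scalars such as `np.float32` and `np.int64` are not accepted by `json.dumps` on their own. A minimal sketch of a fallback converter follows; the name `to_jsonable` and the demo values are assumptions.

```python
import json
import numpy as np

def to_jsonable(obj):
    """Fallback for json.dumps: convert NumPy scalars and arrays to plain Python."""
    if isinstance(obj, np.generic):
        return obj.item()
    if isinstance(obj, np.ndarray):
        return obj.tolist()
    raise TypeError(f"Not JSON serializable: {type(obj)!r}")

demo = {
    "sigma": np.float32(0.25),
    "bestEpoch": np.int64(3),
    "residuals": np.array([0.2, -0.1]),
}
text = json.dumps(demo, default=to_jsonable)
print(text)
```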

2. Front-end ECharts visualization

2.1 Multi-epoch sigma & loss curves

Shows how training converges.

<div id="sigmaLossChart" style="width: 800px; height: 400px;"></div>
<script src="https://cdn.jsdelivr.net/npm/echarts/dist/echarts.min.js"></script>
<script>
const epochs = [1,2,3,4,5]; // returned by Python
const sigmaList = [0.8,0.6,0.4,0.35,0.3];
const lossList = [1.2,0.9,0.5,0.4,0.3];

const chartDom = document.getElementById('sigmaLossChart');
const myChart = echarts.init(chartDom);
const option = {
    title: { text: 'Sigma & Loss curves' },
    tooltip: { trigger: 'axis' },
    legend: { data: ['Sigma', 'Loss'] },
    xAxis: { type: 'category', data: epochs, name: 'Epoch' },
    yAxis: [
        { type: 'value', name: 'Sigma' },
        { type: 'value', name: 'Loss' }
    ],
    series: [
        {
            name: 'Sigma',
            type: 'line',
            yAxisIndex: 0,
            data: sigmaList
        },
        {
            name: 'Loss',
            type: 'line',
            yAxisIndex: 1,
            data: lossList
        }
    ]
};
myChart.setOption(option);
</script>

2.2 Residual plots (optional aggregation for large datasets)

Raw scatter plot (small datasets)

<div id="residualChart" style="width: 800px; height: 400px;"></div>
<script>
const residuals = [0.2, -0.1, 0.5, -0.3]; // returned by Python
const sigma = 0.25;

const xData = residuals.map((_, i) => i+1);
const myChart2 = echarts.init(document.getElementById('residualChart'));
const option2 = {
    title: { text: 'Residual scatter plot' },
    xAxis: { type: 'category', data: xData, name: 'Sample index' },
    yAxis: { type: 'value', name: 'Residual' },
    series: [
        { type: 'scatter', data: residuals },
        { type: 'line', data: Array(xData.length).fill(sigma), lineStyle:{type:'dashed'}, symbol:'none' },
        { type: 'line', data: Array(xData.length).fill(-sigma), lineStyle:{type:'dashed'}, symbol:'none' }
    ]
};
myChart2.setOption(option2);
</script>

Aggregated band chart with standard deviation (large datasets)

<div id="aggResidualChart" style="width: 800px; height: 400px;"></div>
<script>
const aggregatedResiduals = [
    {start:0,end:100,mean:0.1,std:0.3},
    {start:100,end:200,mean:0.05,std:0.25}
]; // returned by Python

// Label each bin by its sample range
const xData3 = aggregatedResiduals.map(d => `${d.start}-${d.end}`);
const meanData = aggregatedResiduals.map(d => d.mean);
const lowerBound = aggregatedResiduals.map(d => d.mean - d.std);
// Height of the band (upper - lower); it is stacked on top of lowerBound
const bandHeight = aggregatedResiduals.map(d => 2 * d.std);

const myChart3 = echarts.init(document.getElementById('aggResidualChart'));
const option3 = {
    title: { text: 'Aggregated residual band chart' },
    xAxis: { type: 'category', data: xData3 },
    yAxis: { type: 'value' },
    series: [
        {
            type: 'line',
            data: meanData,
            name: 'Mean residual'
        },
        {
            // Invisible baseline at mean - std
            type: 'line',
            data: lowerBound,
            lineStyle: { opacity: 0 },
            stack: 'conf',
            stackStrategy: 'all', // stack regardless of sign (ECharts 5.3.3+)
            symbol: 'none'
        },
        {
            // Shaded band from mean - std up to mean + std
            type: 'line',
            data: bandHeight,
            lineStyle: { opacity: 0 },
            areaStyle: { color: 'rgba(91,155,213,0.3)' },
            stack: 'conf',
            stackStrategy: 'all',
            symbol: 'none'
        }
    ]
};
myChart3.setOption(option3);
</script>

The following version supports two residual definitions, absolute and percentage, and lets the front end switch between them.

1. Python computation script

import numpy as np

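def calculate_residuals(y_true, y_pred, use_percentage=False):
    """
    Compute the residuals and their standard deviation.
    :param y_true: array of ground-truth values
    :param y_pred: array of predicted values
    :param use_percentage: if True, return residuals as a percentage of y_true
    :return: sigma, residuals (list)
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)

    if use_percentage:
        # Avoid division by zero. Note that the 1e-8 substitute turns any
        # sample with y_true == 0 into a huge percentage, which inflates sigma;
        # filter such samples out beforehand if they can occur.
        safe_y_true = np.where(y_true == 0, 1e-8, y_true)
        residuals = ((y_true - y_pred) / safe_y_true) * 100
    else:
        residuals = y_true - y_pred

    sigma = float(np.std(residuals))
    return sigma, residuals.tolist()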


def aggregate_residuals(residuals, bin_size=100):
    """
    Aggregate the residuals into bins to reduce the amount of data the front end draws.
    """
    agg_data = []
    # Ceiling division so the final, possibly shorter, bin is not dropped
    n_bins = -(-len(residuals) // bin_size)
    for i in range(n_bins):
        segment = residuals[i*bin_size:(i+1)*bin_size]
        agg_data.append({
            "mean": float(np.mean(segment)),
            "std": float(np.std(segment)),
            "start": i*bin_size,
            "end": min((i+1)*bin_size, len(residuals))
        })
    return agg_data


# ===== Simulated data =====
samples = 5000
y_true = np.linspace(1, 100, samples)  # starts at 1 to avoid zeros
y_pred = y_true + np.random.normal(0, 5, samples)

# Absolute residuals
sigma_abs, residuals_abs = calculate_residuals(y_true, y_pred, use_percentage=False)
agg_abs = aggregate_residuals(residuals_abs)

# Percentage residuals
sigma_pct, residuals_pct = calculate_residuals(y_true, y_pred, use_percentage=True)
agg_pct = aggregate_residuals(residuals_pct)

# Output for the front end
result = {
    "absolute": {
        "sigma": sigma_abs,
        "residuals": residuals_abs,
        "aggregatedResiduals": agg_abs
    },
    "percentage": {
        "sigma": sigma_pct,
        "residuals": residuals_pct,
        "aggregatedResiduals": agg_pct
    }
}

print(result.keys())  # dict_keys(['absolute', 'percentage'])

2. Switching display modes in ECharts

HTML & JS

<div>
    <button onclick="switchMode('absolute')">Absolute residuals</button>
    <button onclick="switchMode('percentage')">Percentage residuals</button>
</div>
<div id="residualChart" style="width: 800px; height: 400px;"></div>

<script src="https://cdn.jsdelivr.net/npm/echarts/dist/echarts.min.js"></script>
<script>
const chartDom = document.getElementById('residualChart');
const myChart = echarts.init(chartDom);

// Mock data standing in for the backend response
const chartData = {
    absolute: {
        residuals: [0.5, -0.2, 1.2, -0.8],
        sigma: 0.7
    },
    percentage: {
        residuals: [1.5, -0.4, 2.1, -1.6],
        sigma: 1.3
    }
};

function renderChart(mode) {
    const residuals = chartData[mode].residuals;
    const sigma = chartData[mode].sigma;
    const xData = residuals.map((_, i) => i + 1);

    const yAxisName = mode === 'absolute' ? 'Residual' : 'Residual (%)';

    const option = {
        title: { text: mode === 'absolute' ? 'Absolute residual plot' : 'Percentage residual plot' },
        xAxis: { type: 'category', data: xData },
        yAxis: { type: 'value', name: yAxisName },
        series: [
            { type: 'scatter', data: residuals },
            { type: 'line', data: Array(xData.length).fill(sigma), lineStyle:{type:'dashed'}, symbol:'none' },
            { type: 'line', data: Array(xData.length).fill(-sigma), lineStyle:{type:'dashed'}, symbol:'none' }
        ]
    };
    myChart.setOption(option);
}

function switchMode(mode) {
    renderChart(mode);
}

// Show absolute residuals by default
renderChart('absolute');
</script>

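3. Notes

Back end: computes and returns both the absolute and the percentage data in one response.
Front end: provides buttons to switch modes, avoiding repeated requests.
Aggregated data: can be extended into a mean + standard-deviation band chart (large-dataset optimization).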
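
One possible refinement of the percentage mode, sketched here as an assumption and not as the original behaviour: mark samples whose true value is (near) zero as NaN instead of dividing by a tiny constant, and divide by the absolute value so the sign of the residual does not flip for negative targets. The name `percentage_residuals` and the `eps` threshold are hypothetical.

```python
import numpy as np

def percentage_residuals(y_true, y_pred, eps=1e-12):
    """Percentage residuals with near-zero targets masked out as NaN."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    valid = np.abs(y_true) > eps
    pct = np.full(y_true.shape, np.nan)
    pct[valid] = (y_true[valid] - y_pred[valid]) / np.abs(y_true[valid]) * 100
    sigma = float(np.nanstd(pct))  # std over the valid samples only
    return sigma, pct, int((~valid).sum())

sigma_pct, pct, n_skipped = percentage_residuals([0.0, 10.0, -20.0], [1.0, 9.0, -22.0])
print(sigma_pct, pct, n_skipped)
```

NaN is not valid JSON, so the masked entries would need to be dropped or converted to null before the data is sent to the front end.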

This part covers a statistical control chart with ±1σ, ±2σ, and ±3σ bands (similar to the control charts used in quality management).

1. Computing the sigma bands

Formula for the residual bands:

\text{Upper}_k = k \times \sigma,\quad \text{Lower}_k = -k \times \sigma \quad (k = 1,2,3)

Python example:

import numpy as np

def calculate_sigma_bands(y_true, y_pred, use_percentage=False):
    y_true = np.array(y_true)
    y_pred = np.array(y_pred)
    safe_y_true = np.where(y_true == 0, 1e-8, y_true)

    residuals = ((y_true - y_pred) / safe_y_true) * 100 if use_percentage else (y_true - y_pred)
    sigma = float(np.std(residuals))

    # Upper and lower limits for 1σ, 2σ, 3σ (centered on 0, i.e. assuming zero-mean residuals)
    bands = {
        "1sigma": {"upper": sigma, "lower": -sigma},
        "2sigma": {"upper": 2*sigma, "lower": -2*sigma},
        "3sigma": {"upper": 3*sigma, "lower": -3*sigma},
    }
    return residuals.tolist(), sigma, bands

2. ECharts 3σ chart example

Runnable HTML

Save as sigma_chart.html and open it in a browser:

<!DOCTYPE html>
<html>
<head>
    <meta charset="UTF-8">
    <title>3σ Control Chart</title>
    <script src="https://cdn.jsdelivr.net/npm/echarts/dist/echarts.min.js"></script>
</head>
<body>
    <div id="sigmaChart" style="width: 900px; height: 400px;"></div>

    <script>
    const residuals = [0.5, -0.2, 1.2, -0.8, 0.4, -0.1, 0.7];  // residuals
    const sigma = 0.7;  // standard deviation

    const xData = residuals.map((_, i) => i + 1);

    const option = {
        title: { text: '3σ Control Chart' },
        tooltip: { trigger: 'axis' },
        xAxis: { type: 'category', data: xData, name: 'Sample index' },
        yAxis: { type: 'value', name: 'Residual' },
        series: [
            { type: 'scatter', data: residuals, name: 'Residual', symbolSize: 6 },

            // 1σ lines
            { type: 'line', data: Array(xData.length).fill(sigma), lineStyle:{color:'#91cc75',type:'dashed'}, symbol:'none', name:'+1σ' },
            { type: 'line', data: Array(xData.length).fill(-sigma), lineStyle:{color:'#91cc75',type:'dashed'}, symbol:'none', name:'-1σ' },

            // 2σ lines
            { type: 'line', data: Array(xData.length).fill(2*sigma), lineStyle:{color:'#fac858',type:'dashed'}, symbol:'none', name:'+2σ' },
            { type: 'line', data: Array(xData.length).fill(-2*sigma), lineStyle:{color:'#fac858',type:'dashed'}, symbol:'none', name:'-2σ' },

            // 3σ lines
            { type: 'line', data: Array(xData.length).fill(3*sigma), lineStyle:{color:'#ee6666',type:'dashed'}, symbol:'none', name:'+3σ' },
            { type: 'line', data: Array(xData.length).fill(-3*sigma), lineStyle:{color:'#ee6666',type:'dashed'}, symbol:'none', name:'-3σ' },
        ]
    };

    const myChart = echarts.init(document.getElementById('sigmaChart'));
    myChart.setOption(option);
    </script>
</body>
</html>

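3. Features

Shows dashed reference lines at ±1σ, ±2σ, and ±3σ.
A residual outside ±3σ is a candidate outlier: for normally distributed residuals only about 0.27% of points are expected to fall that far out.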
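
The 3σ rule can be sketched with the standard library alone. The helper names `expected_outside` and `flag_outliers` are hypothetical, and unlike the chart above this sketch measures distance from the mean residual, not from 0.

```python
import math

def expected_outside(k):
    """Probability that a normal variable falls outside ±k sigma."""
    return 1.0 - math.erf(k / math.sqrt(2.0))

def flag_outliers(residuals, k=3.0):
    """Return the indices of residuals more than k sigma away from the mean."""
    n = len(residuals)
    mean = sum(residuals) / n
    sigma = math.sqrt(sum((r - mean) ** 2 for r in residuals) / n)
    return [i for i, r in enumerate(residuals) if abs(r - mean) > k * sigma]

residuals = [0.1, -0.1] * 50 + [5.0]   # 100 small residuals and one large one
flagged = flag_outliers(residuals)
print(flagged, round(expected_outside(3), 4))
```

A large outlier inflates sigma itself, so with very few samples a single extreme point may not exceed its own 3σ limit; the rule works best on reasonably large residual sets.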

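Next, a Sigma chart in which:

x axis: the absolute difference as a percentage of the actual value, i.e.

\text{percentage\_error} = \frac{|y_{\text{pred}} - y_{\text{true}}|}{|y_{\text{true}}|} \times 100

y axis: the probability density of that percentage error (estimated with a histogram).
Sigma lines: the ±nσ range (n can be 3, 4, 5, and so on), based on the mean and standard deviation of the error distribution.

Complete runnable code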

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

# Example data: ground truth and predictions
np.random.seed(42)
y_true = np.random.uniform(50, 150, 1000)   # ground truth
y_pred = y_true + np.random.normal(0, 10, 1000)  # predictions

# Absolute percentage error
percentage_error = np.abs(y_pred - y_true) / np.abs(y_true) * 100

# Sigma chart plotting function
def plot_sigma_chart(errors, sigma_level=3, bins=20):
    """
    Plot a Sigma chart (histogram plus probability density curve).
    :param errors: array of percentage errors
    :param sigma_level: sigma range (e.g. 3, 4, 5)
    :param bins: number of histogram bins
    """
    # Mean and standard deviation
    mu, sigma = np.mean(errors), np.std(errors)

    # Histogram statistics
    counts, bin_edges = np.histogram(errors, bins=bins, density=True)
    bin_centers = 0.5 * (bin_edges[1:] + bin_edges[:-1])

    # Gaussian fit curve
    x = np.linspace(0, np.max(errors), 500)
    pdf = norm.pdf(x, mu, sigma)

    # Plot
    plt.figure(figsize=(10, 6))
    plt.bar(bin_centers, counts, width=(bin_edges[1]-bin_edges[0]),
            alpha=0.6, color='skyblue', label='Error distribution (histogram)')
    plt.plot(x, pdf, 'r-', lw=2, label='Normal fit')

    # Dashed lines at ±nσ (one legend entry for the pair)
    for n in [-sigma_level, sigma_level]:
        plt.axvline(mu + n * sigma, color='green', linestyle='--',
                    label=f'±{sigma_level}σ range' if n > 0 else None)

    # Mean line
    plt.axvline(mu, color='black', linestyle='-', label='Mean')

    # Labels and title
    plt.xlabel('Percentage error (%)')
    plt.ylabel('Probability density')
    plt.title(f'{sigma_level}-Sigma chart (percentage error distribution)')
    plt.legend()
    plt.grid(True)
    plt.show()

# Draw the 3-Sigma chart
plot_sigma_chart(percentage_error, sigma_level=3, bins=20)

# Switch to 4-Sigma or 5-Sigma
# plot_sigma_chart(percentage_error, sigma_level=4, bins=20)
# plot_sigma_chart(percentage_error, sigma_level=5, bins=20)

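Code notes

percentage_error: the absolute difference between prediction and ground truth, as a percentage of the ground truth.
Histogram: np.histogram computes the distribution; density=True normalizes it to a probability density.
Normal curve: drawn with scipy.stats.norm.pdf. Because absolute errors are non-negative and right-skewed, the normal curve is only a rough fit, and the lower line at μ - nσ can fall below 0.
Sigma range: dashed lines mark ±nσ on the chart.
The bins parameter controls the number of histogram bars (around 20 is a reasonable default).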
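
Because absolute errors are one-sided, an alternative to mean ± nσ lines is to place thresholds at the empirical percentiles that match the usual two-sided normal coverage (68.27%, 95.45%, 99.73%). This is a sketch of that swapped-in technique, not the method used in the code above; `one_sided_thresholds` and the `COVERAGE` table are assumptions.

```python
import numpy as np

# Two-sided normal coverage for 1, 2 and 3 sigma, in percent
COVERAGE = {1: 68.27, 2: 95.45, 3: 99.73}

def one_sided_thresholds(abs_errors, levels=(1, 2, 3)):
    """Absolute-error thresholds containing the same share of samples as ±k sigma."""
    abs_errors = np.asarray(abs_errors, dtype=float)
    return {k: float(np.percentile(abs_errors, COVERAGE[k])) for k in levels}

rng = np.random.default_rng(0)
signed = rng.normal(0.0, 2.0, 200_000)       # signed errors with sigma = 2
thresholds = one_sided_thresholds(np.abs(signed))
print(thresholds)
```

For normally distributed signed errors these thresholds land close to 1, 2 and 3 times the sigma of the signed errors, which makes them easier to interpret than limits derived from the mean and standard deviation of the absolute values.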

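A variant that keeps the sign of the error:

x axis: relative error between prediction and actual value, as a percentage: ((predicted - actual) / actual) * 100
y axis: probability density of that error (normalized histogram)
Extra: range lines for several σ levels (e.g. 1σ, 2σ, 3σ, 4σ, 5σ)

Complete code (runs as-is)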

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

# Simulated data (replace with real data in practice)
np.random.seed(42)
actual = np.random.uniform(50, 150, 1000)  # actual values
predicted = actual + np.random.normal(0, 10, 1000)  # predictions

# Relative error in percent (signed)
relative_error_percent = (predicted - actual) / actual * 100

# Histogram data
num_bins = 20  # number of histogram bins, adjustable
counts, bin_edges = np.histogram(relative_error_percent, bins=num_bins, density=True)

# Center of each bin
bin_centers = 0.5 * (bin_edges[:-1] + bin_edges[1:])

# Estimate the mean and standard deviation of the error distribution
mu, sigma = np.mean(relative_error_percent), np.std(relative_error_percent)

# Normal distribution curve
x = np.linspace(min(relative_error_percent), max(relative_error_percent), 500)
pdf = norm.pdf(x, mu, sigma)

# Plot
plt.figure(figsize=(10, 6))

# Histogram
plt.bar(bin_centers, counts, width=bin_edges[1]-bin_edges[0], alpha=0.6, color='skyblue', edgecolor='black', label='Error distribution')

# Normal curve
plt.plot(x, pdf, 'r-', lw=2, label=f'Normal fit\nμ={mu:.2f}%, σ={sigma:.2f}%')

# Multi-level sigma dashed lines
for s in [1, 2, 3, 4, 5]:  # adjust the sigma levels as needed
    left, right = mu - s * sigma, mu + s * sigma
    plt.axvline(left, color='gray', linestyle='--', linewidth=1)
    plt.axvline(right, color='gray', linestyle='--', linewidth=1)
    plt.text(right, max(counts)*0.9, f'{s}σ', rotation=90, verticalalignment='top')

# Labels and legend
plt.xlabel('Prediction error (%)')
plt.ylabel('Probability density')
plt.title('Prediction error distribution with multiple σ ranges')
plt.legend()
plt.grid(alpha=0.3)
plt.tight_layout()
plt.show()

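Main features

x axis: percentage error (signed, not absolute).
y axis: probability density, normalized by density=True.
Adjustable sigma levels: edit for s in [1, 2, 3, 4, 5]:.
Histogram + normal fit curve + multi-level σ dashed lines.

The following is a complete example that draws an adjustable sigma control chart from existing x and y data, supporting 3σ, 4σ, 5σ, or any other level:

Complete Python code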

import numpy as np
import matplotlib.pyplot as plt

def plot_sigma_chart(x, y, sigma_level=3):
    """
    Draw a sigma control chart from existing x and y data.

    Args:
        x: independent variable (list or NumPy array)
        y: dependent variable (list or NumPy array)
        sigma_level: sigma level (e.g. 3 means 3σ, 4 means 4σ)
    """
    # Convert to NumPy arrays for computation
    x = np.array(x)
    y = np.array(y)

    # Mean and standard deviation
    mean_y = np.mean(y)
    std_y = np.std(y)

    # Upper and lower limits
    upper_limit = mean_y + sigma_level * std_y
    lower_limit = mean_y - sigma_level * std_y

    # Start plotting
    plt.figure(figsize=(10, 6))
    plt.plot(x, y, 'bo-', label='Data')  # data points

    # Mean line
    plt.axhline(y=mean_y, color='green', linestyle='-', linewidth=2, label='Mean')

    # Upper and lower limit lines
    plt.axhline(y=upper_limit, color='red', linestyle='--', linewidth=1.5, label=f'+{sigma_level}σ')
    plt.axhline(y=lower_limit, color='red', linestyle='--', linewidth=1.5, label=f'-{sigma_level}σ')

    # Title and labels
    plt.title(f'Sigma Chart ({sigma_level}σ)')
    plt.xlabel('X')
    plt.ylabel('Y')
    plt.legend()
    plt.grid(True)
    plt.show()

# Example: assume you already have training data x and y
x = np.arange(1, 21)  # index of the training data
y = np.random.normal(loc=50, scale=5, size=20)  # simulated, normally distributed training results

# Draw the 3σ control chart
plot_sigma_chart(x, y, sigma_level=3)

# Change sigma_level to draw a 4σ or 5σ chart
# plot_sigma_chart(x, y, sigma_level=4)
# plot_sigma_chart(x, y, sigma_level=5)

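Code features

Self-contained: runs as-is with the bundled example data; replace x and y with your own.
Adjustable sigma: switch between 3σ, 4σ, and 5σ via sigma_level.
Suitable for linear-regression training results: pass in the data from model training directly.

The following is a ready-to-run Python script that draws an N-Sigma control chart of linear-regression prediction residuals, adjustable to 3σ, 4σ, 5σ, and so on:

Complete code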

import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

# ========== Data preparation ==========
np.random.seed(42)
X = np.linspace(0, 10, 100).reshape(-1, 1)  # feature
y = 3 * X.flatten() + 5 + np.random.normal(0, 2, 100)  # linear relationship + noise

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# ========== Train the linear regression model ==========
model = LinearRegression()
model.fit(X_train, y_train)

# Predict
y_pred = model.predict(X_test)

# ========== Compute residuals ==========
residuals = y_test - y_pred
mean_residual = np.mean(residuals)
std_residual = np.std(residuals)

# ========== Tunable parameter: sigma range ==========
sigma_level = 3  # can be changed to 4, 5, etc.

# Control limits
upper_limit = mean_residual + sigma_level * std_residual
lower_limit = mean_residual - sigma_level * std_residual

# ========== Draw the N-Sigma control chart ==========
plt.figure(figsize=(10, 6))
plt.plot(residuals, marker='o', linestyle='-', label='Residuals')
plt.axhline(mean_residual, color='green', linestyle='--', label='Mean')
plt.axhline(upper_limit, color='red', linestyle='--', label=f'+{sigma_level}σ')
plt.axhline(lower_limit, color='red', linestyle='--', label=f'-{sigma_level}σ')

# Mark points outside ±Nσ
outliers = np.where((residuals > upper_limit) | (residuals < lower_limit))[0]
plt.scatter(outliers, residuals[outliers], color='red', s=80, edgecolors='black', label='Outliers')

plt.title(f'{sigma_level}-Sigma Control Chart for Regression Residuals')
plt.xlabel('Sample Index')
plt.ylabel('Residual')
plt.legend()
plt.grid(True)
plt.show()

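Result

The horizontal axis is the test-sample index.
The vertical axis is the residual.
The green dashed line is the mean residual; the red dashed lines are the ±Nσ control limits.
Points outside ±Nσ are marked in red.

The following ready-to-run example draws a 3-Sigma chart for normally distributed linear-regression residuals. It includes:

A bar chart (histogram of percentage errors)
A cumulative distribution curve
Dashed lines at the 1σ, 2σ, and 3σ positions

Python code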

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

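# ================================
# 1. Generate example data (linear-regression residuals)
# ================================
np.random.seed(42)
# Assume the true errors are normal with mean 0 and standard deviation 0.1
errors = np.random.normal(loc=0, scale=0.1, size=1000)

# Convert to absolute percentage errors (relative to a reference value of 1)
abs_errors_pct = np.abs(errors) * 100

# ================================
# 2. Mean and standard deviation
# ================================
mean = np.mean(abs_errors_pct)
std = np.std(abs_errors_pct)

# ================================
# 3. Histogram + cumulative distribution curve
# ================================
fig, ax1 = plt.subplots(figsize=(10, 6))

# --- Histogram ---
counts, bins, patches = ax1.hist(
    abs_errors_pct, bins=20, density=True, alpha=0.6, color='skyblue', label='Error histogram'
)

# --- Normal fit ---
# Absolute errors are non-negative and right-skewed, so this is only a rough approximation
x = np.linspace(0, max(abs_errors_pct) * 1.1, 500)
pdf = norm.pdf(x, mean, std)  # probability density function
cdf = norm.cdf(x, mean, std)  # cumulative distribution function

# --- Probability density curve ---
ax1.plot(x, pdf, 'r-', lw=2, label='Normal fit (PDF)')

# --- CDF on a secondary axis: it ranges over 0..1, a different scale from the density ---
ax2 = ax1.twinx()
ax2.plot(x, cdf, 'g-', lw=2, label='Cumulative distribution (CDF)')
ax2.set_ylabel('Cumulative probability')
ax2.set_ylim(0, 1.05)

# ================================
# 4. Dashed lines at 1σ, 2σ, 3σ
# ================================
for i in range(1, 4):
    ax1.axvline(mean + i * std, color='gray', linestyle='--', lw=1)
    ax1.text(mean + i * std, max(pdf)*0.8, f'+{i}σ', rotation=90)
    # Absolute errors cannot be negative, so skip lower lines that fall below 0
    if mean - i * std >= 0:
        ax1.axvline(mean - i * std, color='gray', linestyle='--', lw=1)
        ax1.text(mean - i * std, max(pdf)*0.8, f'-{i}σ', rotation=90)

# ================================
# 5. Styling
# ================================
ax1.set_title('3-Sigma error distribution')
ax1.set_xlabel('Absolute percentage error (%)')
ax1.set_ylabel('Probability density')
# Merge the legends of both axes
h1, l1 = ax1.get_legend_handles_labels()
h2, l2 = ax2.get_legend_handles_labels()
ax1.legend(h1 + h2, l1 + l2)
ax1.grid(True, linestyle='--', alpha=0.5)
fig.tight_layout()
plt.show()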

Running it

Copy the code into sigma_plot.py or a Jupyter Notebook.
Make sure the dependencies are installed:
```
pip install numpy matplotlib scipy
```
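Running it displays a chart with the histogram, the normal fit curve, and the cumulative distribution curve.

The following is a reusable Python function: pass in residual data and it produces a distribution plot following the 3σ rule, optionally with a normal fit curve and a cumulative distribution curve (CDF).

Reusable function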

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

def plot_residuals_distribution(
    residuals,
    bins=30,
    show_fit=True,
    show_cdf=False,
    title="Residuals Distribution with 3σ Rule"
):
    """
    绘制残差分布图，包含 3σ 区间，可选正态拟合曲线和CDF曲线。

    参数：
        residuals (array-like): 残差数据
        bins (int): 直方图的柱数
        show_fit (bool): 是否显示正态分布拟合曲线
        show_cdf (bool): 是否显示累积分布曲线
        title (str): 图表标题
    """
    residuals = np.array(residuals)
    mean = np.mean(residuals)
    std = np.std(residuals)

    # 绘制直方图
    fig, ax1 = plt.subplots(figsize=(8, 5))
    n, bins_hist, patches = ax1.hist(residuals, bins=bins, density=True, alpha=0.6, color='skyblue')

    # 绘制正态拟合曲线
    if show_fit:
        x = np.linspace(min(residuals), max(residuals), 200)
        y = norm.pdf(x, mean, std)
        ax1.plot(x, y, 'r-', lw=2, label="Normal Fit")

    # 标记 1σ、2σ、3σ 区间
    for i, color in zip([1, 2, 3], ['g', 'orange', 'r']):
        ax1.axvline(mean + i*std, color=color, linestyle='--', label=f'+{i}σ')
        ax1.axvline(mean - i*std, color=color, linestyle='--', label=f'-{i}σ')

    ax1.set_title(title)
    ax1.set_xlabel("Residual")
    ax1.set_ylabel("Density")
    ax1.legend()

    # 可选：绘制累积分布曲线
    if show_cdf:
        ax2 = ax1.twinx()
        sorted_res = np.sort(residuals)
        cdf = np.arange(1, len(residuals)+1) / len(residuals)
        ax2.plot(sorted_res, cdf, 'k-', lw=1.5, label='CDF')
        ax2.set_ylabel("CDF")
        ax2.legend(loc='lower right')

    plt.show()

Usage example

# Example residual data
residuals = np.random.normal(0, 2, 1000)

# Histogram + fit curve + 3σ only
plot_residuals_distribution(residuals)

# Histogram + fit curve + CDF
plot_residuals_distribution(residuals, show_cdf=True)

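The following complete Python example meets these requirements:

x axis: the difference as a percentage of the actual value (can be positive or negative).
y axis: probability density.
Chart: bar chart (histogram) + normal fit curve + dashed lines for the σ ranges (1σ up to the chosen sigma_level, e.g. 3, 4, or 5).
Histogram bins: 20 by default, adjustable.

Complete example code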

import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import norm

# Example data
np.random.seed(42)
actual = np.random.normal(100, 10, 1000)  # actual values
predicted = actual + np.random.normal(0, 5, 1000)  # predictions

# Percentage difference (signed, no absolute value)
diff_percentage = (predicted - actual) / actual * 100

# Sigma chart plotting function
def plot_sigma_hist(diff_percentage, bins=20, sigma_level=3):
    """
    Plot a histogram with sigma lines.
    :param diff_percentage: array of percentage differences
    :param bins: number of histogram bins
    :param sigma_level: sigma range (e.g. 3 means ±3σ)
    """
    mean = np.mean(diff_percentage)
    std = np.std(diff_percentage)

    # Histogram; density=True makes the y axis a probability density
    counts, bin_edges, _ = plt.hist(diff_percentage, bins=bins, density=True,
                                    alpha=0.6, color='skyblue', edgecolor='black',
                                    label='Percentage difference histogram')

    # Smooth x values for the normal curve
    x = np.linspace(min(diff_percentage), max(diff_percentage), 1000)
    pdf = norm.pdf(x, mean, std)  # normal probability density function

    plt.plot(x, pdf, 'r-', linewidth=2, label='Normal fit')

    # Sigma dashed lines
    for i in range(1, sigma_level + 1):
        lower = mean - i * std
        upper = mean + i * std
        plt.axvline(lower, color='gray', linestyle='--', linewidth=1)
        plt.axvline(upper, color='gray', linestyle='--', linewidth=1)
        plt.text(upper, max(pdf) * 0.9, f'+{i}σ', rotation=0, fontsize=8)
        plt.text(lower, max(pdf) * 0.9, f'-{i}σ', rotation=0, fontsize=8)

    plt.xlabel("Percentage difference (%)")
    plt.ylabel("Probability density")
    plt.title(f"Percentage difference histogram with ±{sigma_level}σ range")
    plt.legend()
    plt.grid(alpha=0.3)
    plt.show()

# Call the function
plot_sigma_hist(diff_percentage, bins=20, sigma_level=4)

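Code notes

diff_percentage: computed directly as (predicted - actual) / actual * 100, keeping the sign.
Histogram: drawn with plt.hist; density=True makes the y axis a probability density instead of a count.
Sigma dashed lines: drawn at ±1σ, ±2σ, … from the mean and standard deviation; the number of levels is set by sigma_level.
Fit curve: scipy.stats.norm.pdf draws the normal curve matching the data.

The following converts the Matplotlib version to ECharts, covering the percentage difference, the probability calculation, the bar chart, and the dashed sigma lines.

1. Data computation (Python)

First process the data in Python and produce JSON suitable for ECharts on the front end.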

import numpy as np
import json

# Generate example data
np.random.seed(42)
actual = np.random.normal(100, 10, 500)  # actual values
predicted = actual + np.random.normal(0, 5, 500)  # predictions

# Signed percentage difference (no absolute value)
diff_percent = (predicted - actual) / actual * 100

# Histogram data
num_bins = 20
counts, bin_edges = np.histogram(diff_percent, bins=num_bins, density=True)

# x coordinates: the midpoint of each bin
x_data = [(bin_edges[i] + bin_edges[i+1]) / 2 for i in range(len(bin_edges)-1)]

# Sigma curves
sigma_levels = [1, 2, 3, 4, 5]  # choose as needed
mean = np.mean(diff_percent)
std = np.std(diff_percent)

sigma_lines = {}
for s in sigma_levels:
    # Theoretical normal density restricted to mean ± s*std
    x_line = np.linspace(mean - s * std, mean + s * std, 200)
    y_line = (1/(std*np.sqrt(2*np.pi))) * np.exp(-0.5*((x_line-mean)/std)**2)
    sigma_lines[s] = {"x": x_line.tolist(), "y": y_line.tolist()}

# Export as JSON for the front end (the integer keys of sigma_lines become strings)
chart_data = {
    "x_data": [round(float(x), 3) for x in x_data],
    "bar_data": [round(float(c), 6) for c in counts],
    "sigma_lines": sigma_lines
}

with open("chart_data.json", "w") as f:
    json.dump(chart_data, f, indent=2)

The output file chart_data.json looks like:

{
  "x_data": [-20.4, -18.3, ..., 21.7],
  "bar_data": [0.0012, 0.0035, ...],
  "sigma_lines": {
    "1": {"x": [...], "y": [...]},
    "2": {"x": [...], "y": [...]},
    ...
  }
}

2. ECharts rendering (HTML + JS)

<!DOCTYPE html>
<html>
<head>
    <meta charset="UTF-8">
    <title>Sigma Chart</title>
    <script src="https://cdn.jsdelivr.net/npm/echarts@5/dist/echarts.min.js"></script>
    <style>
        #main {width: 900px; height: 600px;}
    </style>
</head>
<body>
<div id="main"></div>
<script>
fetch('chart_data.json') // load the data generated by Python
  .then(response => response.json())
  .then(data => {
      var chartDom = document.getElementById('main');
      var myChart = echarts.init(chartDom);

      var option = {
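          title: { text: 'Percentage difference distribution (Sigma chart)' },
          tooltip: { trigger: 'axis' },
          // Value axis: the sigma curves below are [x, y] pairs with numeric x,
          // which a category axis would not position correctly
          xAxis: { 
              type: 'value',
              name: 'Percentage difference (%)'
          },
          yAxis: {
              type: 'value',
              name: 'Probability density'
          },
          series: [
              {
                  name: 'Probability distribution',
                  type: 'bar',
                  data: data.x_data.map((x, i) => [x, data.bar_data[i]]),
                  barWidth: '60%',
                  itemStyle: { color: '#5470C6' }
              },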
              ...Object.keys(data.sigma_lines).map(s => ({
                  name: s + 'σ',
                  type: 'line',
                  data: data.sigma_lines[s].x.map((x, i) => [x, data.sigma_lines[s].y[i]]),
                  lineStyle: { type: 'dashed' },
                  smooth: true
              }))
          ]
      };

      myChart.setOption(option);
  });
</script>
</body>
</html>

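Features

bar_data is shown as the bar chart (probability distribution).
sigma_lines draws, as dashed lines, the theoretical normal curve over each sigma range.
Edit sigma_levels to change which ranges are included (4σ, 5σ, and so on).

The following optimized version ensures that:

There is a single main curve (the normal fit).
Dashed lines mark the different σ ranges.
The tooltip shows the cumulative probability of each range.

1. Python-side data computation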

import numpy as np
import json
from scipy.stats import norm

# Simulated data
np.random.seed(42)
actual = np.random.normal(100, 10, 500)
predicted = actual + np.random.normal(0, 5, 500)

# Percentage difference
diff_percent = (predicted - actual) / actual * 100

# Histogram data
num_bins = 20
counts, bin_edges = np.histogram(diff_percent, bins=num_bins, density=True)
x_data = [(bin_edges[i] + bin_edges[i+1]) / 2 for i in range(len(bin_edges)-1)]

# Normal fit
mean = np.mean(diff_percent)
std = np.std(diff_percent)
x_line = np.linspace(mean - 4*std, mean + 4*std, 200)
y_line = norm.pdf(x_line, mean, std)

# Probability within each sigma range (fixed for a normal fit: about 0.6827, 0.9545, 0.9973, ...)
sigma_levels = [1, 2, 3, 4, 5]
sigma_lines = []
sigma_probs = []
for s in sigma_levels:
    left = mean - s * std
    right = mean + s * std
    prob = norm.cdf(right, mean, std) - norm.cdf(left, mean, std)
    sigma_lines.append(left)
    sigma_lines.append(right)
    sigma_probs.append({"sigma": s, "probability": round(prob, 4)})

chart_data = {
    "x_data": [round(x, 3) for x in x_data],
    "bar_data": [round(c, 6) for c in counts],
    "curve": {"x": x_line.tolist(), "y": y_line.tolist()},
    "sigma_lines": sigma_lines,
    "sigma_probs": sigma_probs
}

with open("chart_data.json", "w") as f:
    json.dump(chart_data, f, indent=2)
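
Because each interval is symmetric around the mean, the values stored in sigma_probs do not depend on the fitted mean or std: for a normal distribution, P(|X - mean| < k*std) = erf(k / sqrt(2)). The following is a minimal standard-library sketch for cross-checking those numbers; sigma_coverage is a hypothetical helper name and is not part of the script above.

```python
import math

def sigma_coverage(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2))

# Coverage for 1, 2 and 3 sigma, rounded the same way as sigma_probs
coverage = {k: round(sigma_coverage(k), 4) for k in (1, 2, 3)}
print(coverage)
```

If the probabilities written to chart_data.json differ noticeably from these, the interval bounds passed to norm.cdf are the first thing to check.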

2. ECharts front-end HTML

<!DOCTYPE html>
<html>
<head>
    <meta charset="UTF-8">
    <title>Sigma Chart</title>
    <script src="https://cdn.jsdelivr.net/npm/echarts@5/dist/echarts.min.js"></script>
    <style>
        #main {width: 900px; height: 600px;}
    </style>
</head>
<body>
<div id="main"></div>
<script>
fetch('chart_data.json')
  .then(response => response.json())
  .then(data => {
      var chartDom = document.getElementById('main');
      var myChart = echarts.init(chartDom);

      var option = {
          title: { text: 'Probability Distribution of Percentage Difference (Sigma Chart)' },
          tooltip: {
              trigger: 'axis',
              formatter: function(params) {
                  let info = '';
                  params.forEach(p => {
                      // every series here uses [x, y] pairs, so the density is value[1]
                      info += `${p.seriesName}: ${Number(p.value[1]).toFixed(4)}<br/>`;
                  });
                  info += '<br/>Sigma interval probability:<br/>';
                  data.sigma_probs.forEach(sp => {
                      info += `${sp.sigma}σ: ${sp.probability}<br/>`;
                  });
                  return info;
              }
          },
          xAxis: { type: 'value', name: 'Percentage difference (%)' },
          yAxis: { type: 'value', name: 'Probability density' },
          series: [
              {
                  name: 'Probability distribution',
                  type: 'bar',
                  data: data.x_data.map((x,i)=>[x, data.bar_data[i]]),
                  barWidth: '60%',
                  itemStyle: { color: '#5470C6' }
              },
              {
                  name: 'Fitted curve',
                  type: 'line',
                  smooth: true,
                  data: data.curve.x.map((x, i) => [x, data.curve.y[i]]),
                  lineStyle: { width: 2, color: 'red' },
                  // markLine must live inside a series; at the top level of the option it is ignored
                  markLine: {
                      symbol: 'none',
                      lineStyle: { type: 'dashed', color: 'black' },
                      data: data.sigma_lines.map(v => ({ xAxis: v }))
                  }
              }
          ]
      };

      myChart.setOption(option);
  });
</script>
</body>
</html>

Improvements

A single red fitted curve.
Dashed lines (markLine) mark the different σ intervals.
The tooltip lists the cumulative probability of every σ interval.

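Yes, a bucketing optimization is needed. Without it:

Too many data points degrade ECharts rendering performance (stuttering, or even a browser crash).
The error Cannot read properties of undefined (reading 'x') usually means that some values in the JSON generated by Python are missing or inconsistently formatted (for example x or y is None or NaN, or a list index is wrong).

Bucketing approach

Bucketed sampling: split the large data set into a fixed number of buckets and keep only the mean or median of each bucket.
Resolution control: keep at most about 500 to 1000 points so the browser's rendering capacity is not exceeded.
Data format validation: make sure every point in the JSON has both x and y, and that both are numeric.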
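
The validation step can be sketched on the Python side as follows. This is a minimal example under assumptions: clean_points is a hypothetical helper name, and the points are taken to be dicts with x and y keys, matching the format the front end reads. Python's json module writes NaN as a bare NaN token by default, which is not valid JSON, so the sketch also passes allow_nan=False to fail early.

```python
import json
import math

def clean_points(points):
    """Keep only points whose x and y are real, finite numbers."""
    cleaned = []
    for p in points:
        x, y = p.get("x"), p.get("y")
        ok = all(
            isinstance(v, (int, float)) and not isinstance(v, bool) and math.isfinite(v)
            for v in (x, y)
        )
        if ok:
            cleaned.append({"x": float(x), "y": float(y)})
    return cleaned

raw = [
    {"x": 0.0, "y": 0.4},
    {"x": 1.0, "y": float("nan")},  # NaN would be serialized as invalid JSON
    {"x": 2.0},                      # y is missing
    {"x": None, "y": 0.1},           # x is None
]
points = clean_points(raw)

# allow_nan=False makes json.dumps raise instead of emitting NaN or Infinity
payload = json.dumps(points, allow_nan=False)
print(payload)
```

Filtering before serialization keeps the front-end check as a second line of defence rather than the only one.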

Python-side computation with downsampling

import numpy as np
import json

def generate_sigma_data(mean, std, size=10000, max_points=800):
    x = np.linspace(mean - 4*std, mean + 4*std, size)
    y = 1/(std * np.sqrt(2*np.pi)) * np.exp(-0.5*((x-mean)/std)**2)
    
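    # Stride downsampling: keep every step-th point so that at most max_points remain
    if size > max_points:
        step = -(-size // max_points)  # ceiling division
        x = x[::step]
        y = y[::step]
    
    data = [{"x": float(xi), "y": float(yi)} for xi, yi in zip(x, y)]
    return json.dumps(data, ensure_ascii=False)

# Save the output as sigma_data.json, the file the front end fetches
with open("sigma_data.json", "w") as f:
    f.write(generate_sigma_data(0, 1, size=50000))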
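
The function above thins the curve by keeping every step-th point. The bucket-mean variant described in the approach (one averaged point per bucket) could look like the sketch below. It uses plain lists and only the standard library; bucket_mean is a hypothetical name, and averaging is an assumption (the median would work the same way).

```python
def bucket_mean(xs, ys, max_points):
    """Split the points into at most max_points consecutive buckets and average each one."""
    n = len(xs)
    if n <= max_points:
        return list(xs), list(ys)
    out_x, out_y = [], []
    for b in range(max_points):
        # Integer arithmetic spreads the n points evenly over the buckets
        start = b * n // max_points
        end = (b + 1) * n // max_points
        size = end - start
        out_x.append(sum(xs[start:end]) / size)
        out_y.append(sum(ys[start:end]) / size)
    return out_x, out_y

xs = list(range(10))
ys = [2 * v for v in xs]
bx, by = bucket_mean(xs, ys, max_points=5)
print(bx)
print(by)
```

Averaging smooths noise but slightly flattens sharp peaks, so for a smooth analytic curve plain stride sampling is usually enough.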

Front-end index.html: load the JSON and draw with ECharts

<!DOCTYPE html>
<html>
<head>
  <meta charset="UTF-8">
  <title>Sigma Curve</title>
  <script src="https://cdn.jsdelivr.net/npm/echarts@5"></script>
</head>
<body>
  <div id="main" style="width:100%;height:500px;"></div>
  <script>
    fetch('sigma_data.json') // the Python output saved under this name
      .then(response => response.json())
      .then(data => {
        // Drop points whose x or y is missing or not a finite number
        data = data.filter(p => Number.isFinite(p.x) && Number.isFinite(p.y));

        var chart = echarts.init(document.getElementById('main'));
        chart.setOption({
          title: { text: 'Sigma Curve' },
          tooltip: { trigger: 'axis' },
          xAxis: { type: 'value' },
          yAxis: { type: 'value' },
          series: [{
            type: 'line',
            smooth: true,
            data: data.map(p => [p.x, p.y]),
            markLine: {
              symbol: 'none',
              lineStyle: { type: 'dashed' },
              data: [
                { xAxis: -1 }, // -1σ
                { xAxis: 1 },  // +1σ
                { xAxis: -2 },
                { xAxis: 2 }
              ]
            }
          }]
        });
      })
      .catch(err => console.error(err));
  </script>
</body>
</html>

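Key optimizations

Downsampling avoids performance problems.
A uniform JSON structure: make sure x and y are both floats and never empty.
ECharts markLine draws the dashed lines that mark the sigma intervals.

In ECharts, the vertical sigma lines can be drawn with markLine or with an extra series (type: 'line' with a dashed style), combined with tooltip and label to show the probability values. A complete example follows:

1. Front-end ECharts configuration example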

var chart = echarts.init(document.getElementById('main'));

var xData = [...]; // computed and returned by Python
var yData = [...]; // probability density curve data

chart.setOption({
    title: { text: 'Sigma Curve Example' },
    tooltip: {
        trigger: 'axis',
        formatter: function (params) {
            let point = params[0]; // the point on the curve
            return `x: ${point.value[0].toFixed(2)}<br/>Density: ${point.value[1].toFixed(4)}`;
        }
    },
    xAxis: { type: 'value', name: 'X' },
    yAxis: { type: 'value', name: 'Probability density' },
    series: [
        {
            name: 'Sigma curve',
            type: 'line',
            data: xData.map((x, i) => [x, yData[i]]),
            smooth: true,
            lineStyle: { color: 'blue' },
            markLine: {
                symbol: 'none',
                lineStyle: {
                    type: 'dashed',
                    color: 'red'
                },
                label: {
                    formatter: function (params) {
                        // custom label for the vertical line
                        return `Sigma: ${params.value}`;
                    }
                },
                data: [
                    { xAxis: -1 }, // -1 sigma
                    { xAxis: 1 },  // 1 sigma
                    { xAxis: 2 },  // 2 sigma
                ]
            }
        }
    ]
});

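2. Key points

Vertical sigma lines
- Added through series.markLine.data.
- Each object {xAxis: value} represents one vertical line.
Probability hint
- tooltip.formatter reads the density value of the current curve point.
- Fixed text can also be shown through markPoint or markLine.label.
Dashed style
- lineStyle.type = 'dashed'.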

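3. Combining with the Python computation

The JSON returned by Python should contain:

{
  "xData": [...],
  "yData": [...],
  "sigmaLines": [-1, 1, 2]
}

On the front end, sigmaLines replaces markLine.data directly (one {xAxis: value} object per entry).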
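
As an illustration of how such a payload could be produced, here is a minimal sketch that uses only the standard library. The function name build_payload, the 81-point grid, and the ±4 standard deviation range are assumptions made for the example; only the three key names come from the JSON above.

```python
import json
from statistics import NormalDist

def build_payload(mean=0.0, std=1.0, n=81, sigma_lines=(-1, 1, 2)):
    """Build the xData / yData / sigmaLines structure expected by the front end."""
    dist = NormalDist(mean, std)
    lo, hi = mean - 4 * std, mean + 4 * std
    xs = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    return {
        "xData": [round(x, 4) for x in xs],
        "yData": [round(dist.pdf(x), 6) for x in xs],
        # positions of the vertical lines in data coordinates
        "sigmaLines": [mean + k * std for k in sigma_lines],
    }

payload = build_payload()
print(json.dumps(payload)[:80])
```

Sending sigmaLines in data coordinates (rather than as multiples of σ) lets the front end pass the values to markLine unchanged.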

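ECharts' markLine, markPoint, and graphic components can show the probability values directly between the corresponding σ regions, without relying on the tooltip.

Approach: annotating values on the chart with graphic

Core idea

Draw the distribution curve with a line or scatter series.
Draw the dashed σ boundary lines (±1σ, ±2σ, ±3σ) with markLine.
Use graphic to show the probability values at fixed positions on the chart, independent of the mouse.

Example code (probabilities shown directly)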

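// x is the midpoint of each band between adjacent σ lines; prob is that band's probability
const sigmaValues = [
    {x: -0.5, prob: '34.1%'},
    {x: 0.5, prob: '34.1%'},
    {x: -1.5, prob: '13.6%'},
    {x: 1.5, prob: '13.6%'},
    {x: -2.5, prob: '2.1%'},
    {x: 2.5, prob: '2.1%'},
];

const option = {
    title: { text: 'Normal Distribution and Sigma Intervals' },
    tooltip: { show: false }, // the tooltip is not used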
    xAxis: { type: 'value', min: -4, max: 4 },
    yAxis: { type: 'value' },
    series: [
        {
            name: 'Normal Curve',
            type: 'line',
            smooth: true,
            data: generateNormalDistribution(), // placeholder: [x, y] pairs computed by Python and passed in
            markLine: {
                symbol: 'none',
                lineStyle: { type: 'dashed', color: '#999' },
                data: [
                    {xAxis: -1}, {xAxis: 1},
                    {xAxis: -2}, {xAxis: 2},
                    {xAxis: -3}, {xAxis: 3}
                ]
            }
        }
    ],
    graphic: sigmaValues.map(item => ({
        type: 'text',
        left: `${(item.x + 4) * 100 / 8}%`, // maps x in [-4, 4] to 0%..100% of the width (approximate: ignores grid margins)
        top: '40%',
        style: {
            text: item.prob,
            fill: '#000',
            font: '14px sans-serif',
            textAlign: 'center'
        }
    }))
};

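Key notes

graphic displays the values directly on the chart; left and top can be derived from the x and y coordinates by proportional conversion.
If the coordinates need to be computed dynamically, myChart.convertToPixel() can update the positions after the chart has rendered.
The Python side only needs to supply the x coordinates and probability values for sigmaValues.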
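
A minimal sketch of how the Python side could derive those values, using only the standard library. It labels each band between consecutive σ lines with its own probability and places the label at the band's midpoint; the midpoint placement and the name band_labels are assumptions made for illustration.

```python
from statistics import NormalDist

def band_labels(max_sigma=3):
    """Label each band between consecutive sigma lines with its probability."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    labels = []
    for k in range(1, max_sigma + 1):
        # probability of landing between (k-1) sigma and k sigma on one side
        prob = std_normal.cdf(k) - std_normal.cdf(k - 1)
        text = f"{prob * 100:.1f}%"
        mid = k - 0.5
        labels.append({"x": -mid, "prob": text})
        labels.append({"x": mid, "prob": text})
    return labels

sigma_values = band_labels()
for item in sigma_values:
    print(item)
```

For a distribution with a different mean and standard deviation, the x values would be rescaled as mean + x * std while the probabilities stay the same.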

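Optional optimizations

Downsampling: reduce the number of curve points on the Python side so the JSON does not grow too large.
Automatic coordinate mapping: listen for chart.on('finished') and use pixel mapping to adjust the text positions.

The vertical lines can be drawn with markLine, and the graphic component can then render text directly inside the chart, avoiding the tooltip. Example code follows (assuming the normal distribution curve data already exists):

1. Computing the probability intervals and positions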

import numpy as np
from scipy.stats import norm

mu, sigma = 0, 1  # mean and standard deviation
x = np.linspace(-4, 4, 500)
y = norm.pdf(x, mu, sigma)

# sigma intervals
sigmas = [1, 2, 3]
prob_values = []
for s in sigmas:
    left, right = mu - s * sigma, mu + s * sigma
    prob = norm.cdf(right, mu, sigma) - norm.cdf(left, mu, sigma)
    prob_values.append({"sigma": s, "prob": round(prob * 100, 2), "pos": right})

2. Generating the ECharts data

chart_data = {
    "xAxis": {"type": "value"},
    "yAxis": {"type": "value"},
    "series": [{
        "type": "line",
        "data": list(map(list, zip(x.tolist(), y.tolist()))),
        "smooth": True,
        # markLine belongs inside the series; ECharts ignores it at the top level
        "markLine": {
            "symbol": "none",
            "lineStyle": {"type": "dashed"},
            "data": [
                {"xAxis": mu - s * sigma} for s in sigmas
            ] + [
                {"xAxis": mu + s * sigma} for s in sigmas
            ]
        }
    }]
}

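3. Adding text labels (without a tooltip)

// The JSON string serialized from chart_data goes between the quotes
// (JSON.parse throws a SyntaxError on an empty string)
let option = JSON.parse('');

// Draw the text with graphic
option.graphic = [
    
];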

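Key points

convertToPixel converts data coordinates to pixel positions (it has to be called inside myChart.on('finished') to update them).
The graphic component draws the text, instead of the tooltip.
The text can sit at the top or next to the vertical lines; its position is computed.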
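
One way to hand chart_data to the page is to serialize it with json.dumps and embed the result as a JavaScript object literal, which avoids quoting problems inside JSON.parse('...'). The sketch below assumes a hypothetical helper render_option_script and a tiny stand-in chart_data; it is not the template mechanism of the original page.

```python
import json

def render_option_script(chart_data):
    """Return a JS statement that assigns the chart option as an object literal."""
    # json.dumps output is valid JavaScript for plain dicts, lists, numbers and strings.
    # Escaping "</" keeps a string value from closing the surrounding <script> tag.
    literal = json.dumps(chart_data).replace("</", "<\\/")
    return f"let option = {literal};"

chart_data = {
    "xAxis": {"type": "value"},
    "yAxis": {"type": "value"},
    "series": [{"type": "line", "data": [[-1.0, 0.242], [0.0, 0.399]], "smooth": True}],
}
script = render_option_script(chart_data)
print(script)
```

Python's True becomes true in the output, so the option can be passed to setOption without further conversion.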

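The following ECharts approach shows the probability of each sigma region directly on the chart, without relying on the tooltip.

Core idea

Draw the original curve with a line or scatter series;
Draw the vertical σ region lines with markLine;
Place the probability text between the vertical lines with graphic or markPoint.

Example code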

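var mean = 50, std = 10;  // the demo curve below is exp(-(x - 50)^2 / (2 * 10^2))
var sigmaValues = [1, 2, 3];  // 1σ, 2σ, 3σ
var sigmaProbabilities = ['68.27%', '95.45%', '99.73%']; // probability within ±1σ, ±2σ, ±3σ
var sigmaPositions = sigmaValues.map(k => mean + k * std); // x-axis positions of the +kσ lines

var option = {
  title: {
    text: 'Sigma Region Probability Distribution',
    left: 'center'
  },
  xAxis: { type: 'value' },
  yAxis: { type: 'value' },
  series: [
    {
      type: 'line',
      smooth: true,
      data: [...Array(100).keys()].map(x => [x, Math.exp(-(x-50)*(x-50)/200)]),
      markLine: {
        symbol: 'none',
        lineStyle: { color: '#aaa', type: 'dashed' },
        data: sigmaPositions.map(x => ({ xAxis: x }))
      }
    }
  ]
};

myChart.setOption(option);

// graphic positions are in pixels, so the data coordinates are converted
// after the first setOption has created the coordinate system
myChart.setOption({
  graphic: sigmaProbabilities.map((p, i) => {
    var previous = i === 0 ? mean : sigmaPositions[i - 1];
    return {
      type: 'text',
      left: myChart.convertToPixel({ xAxisIndex: 0 }, (sigmaPositions[i] + previous) / 2), // midpoint of the region
      top: 50, // adjust to a suitable height
      style: {
        text: p,
        fill: '#333',
        font: 'bold 14px sans-serif'
      }
    };
  })
});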

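Key points

graphic can draw text at any position, directly on the chart;
left and top take pixel or percent values, so they have to be computed from the axis positions;
Combining this with chart.convertToPixel lets the text follow the coordinate system when it is rescaled.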
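
For a linear value axis, the conversion from a data coordinate to a pixel position is a linear interpolation between the axis range and the pixel range of the plot area. The sketch below only illustrates that arithmetic; data_to_pixel and the layout numbers are hypothetical, and in a real chart ECharts' own convertToPixel should be used because it knows the actual grid layout.

```python
def data_to_pixel(value, axis_min, axis_max, pixel_start, pixel_end):
    """Linearly map a data value on [axis_min, axis_max] to [pixel_start, pixel_end]."""
    fraction = (value - axis_min) / (axis_max - axis_min)
    return pixel_start + fraction * (pixel_end - pixel_start)

# Hypothetical layout: x axis from -4 to 4, plot area from pixel 100 to pixel 900
positions = {x: data_to_pixel(x, -4, 4, 100, 900) for x in (-1, 0, 1)}
print(positions)
```

This also shows why a percentage computed from the container width alone is only approximate: the plot area starts and ends at the grid margins, not at the container edges.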

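Below is a complete ECharts configuration example that shows a normal distribution curve and displays the corresponding probability between the ±1σ, ±2σ, and ±3σ regions (as text labels drawn directly on the chart rather than in a tooltip):

HTML + JavaScript example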

<!DOCTYPE html>
<html>
<head>
    <meta charset="utf-8">
    <title>Normal Distribution Sigma Chart</title>
    <script src="https://cdn.jsdelivr.net/npm/echarts@5/dist/echarts.min.js"></script>
    <style>
        #main {
            width: 1000px;
            height: 600px;
        }
    </style>
</head>
<body>
    <div id="main"></div>
    <script>
        // Generate the normal distribution data
        function gaussian(x, mean, sigma) {
            return (1 / (sigma * Math.sqrt(2 * Math.PI))) *
                Math.exp(-Math.pow(x - mean, 2) / (2 * sigma * sigma));
        }

        let mean = 0;
        let sigma = 1;
        let xData = [];
        let yData = [];
        for (let x = -4; x <= 4; x += 0.1) {
            xData.push(x.toFixed(1));
            yData.push(gaussian(x, mean, sigma));
        }

        // σ region probabilities
        const sigmaLabels = [
            {pos: 0.5, text: '68.27% (±1σ)'},
            {pos: 1.5, text: '95.45% (±2σ)'},
            {pos: 2.5, text: '99.73% (±3σ)'}
        ];

        let chartDom = document.getElementById('main');
        let myChart = echarts.init(chartDom);

        let option = {
            title: {
                text: 'Normal Distribution Curve and Sigma Regions',
                left: 'center'
            },
            xAxis: {
                type: 'value',
                min: -4,
                max: 4,
                name: 'X',
            },
            yAxis: {
                type: 'value',
                name: 'Probability Density'
            },
            series: [
                {
                    type: 'line',
                    data: xData.map((x, i) => [parseFloat(x), yData[i]]),
                    smooth: true,
                    lineStyle: {
                        width: 2,
                        color: '#5470C6'
                    },
                    areaStyle: {
                        opacity: 0.2
                    }
                }
            ],
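            graphic: [] // filled in by updateGraphic() once the coordinate system exists
        };

        myChart.setOption(option);

        // convertToPixel only works after the first setOption, so the graphic
        // elements are built here instead of inside the initial option
        function updateGraphic() {
            const yTop = myChart.convertToPixel({yAxisIndex: 0}, 0.4);
            const yBottom = myChart.convertToPixel({yAxisIndex: 0}, 0);
            myChart.setOption({
                graphic: [
                    // vertical lines at ±1σ, ±2σ, ±3σ
                    ...[1, 2, 3, -1, -2, -3].map(s => {
                        const px = myChart.convertToPixel({xAxisIndex: 0}, mean + s * sigma);
                        return {
                            id: 'sigma-line-' + s,
                            type: 'line',
                            shape: {x1: px, y1: yTop, x2: px, y2: yBottom},
                            style: {stroke: '#FF6B6B', lineWidth: 1.5},
                            silent: true
                        };
                    }),
                    // probability labels
                    ...sigmaLabels.map((s, i) => ({
                        id: 'sigma-label-' + i,
                        type: 'text',
                        left: myChart.convertToPixel({xAxisIndex: 0}, s.pos),
                        top: 100 + s.pos * 20,
                        style: {
                            text: s.text,
                            fill: '#333',
                            font: '14px sans-serif'
                        }
                    }))
                ]
            });
        }
        updateGraphic();

        // Keep the graphic elements aligned when the window is resized
        window.addEventListener('resize', () => {
            myChart.resize();
            updateGraphic();
        });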
    </script>
</body>
</html>

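Key features

Vertical lines: drawn with the graphic component at ±1σ, ±2σ, and ±3σ.
Probability text: drawn directly above the regions with graphic text elements, without relying on the tooltip.
Data generation: the curve is generated with a Gaussian function.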

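If you run into this error:

Cannot read properties of undefined (reading 'queryComponents')

it usually means that a series or graphic element is incompletely defined during rendering, or that the option accesses a component that has not been initialized yet, for example by calling myChart.convertToPixel() before the first setOption().

A complete example that annotates σ region probabilities directly on the chart

The runnable example below uses ECharts graphic elements to label the 1σ, 2σ, and 3σ regions of a Gaussian curve with their probabilities.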

<!DOCTYPE html>
<html style="height: 100%">
<head>
  <meta charset="utf-8">
  <title>ECharts Gaussian Sigma Region Annotation</title>
  <script src="https://cdn.jsdelivr.net/npm/echarts@5/dist/echarts.min.js"></script>
</head>
<body style="height: 100%; margin: 0">
  <div id="main" style="height: 100%"></div>
  <script>
    // Compute the Gaussian density y for a given x
    function gaussian(x, mean = 0, std = 1) {
      return Math.exp(-0.5 * Math.pow((x - mean) / std, 2)) / (std * Math.sqrt(2 * Math.PI));
    }

    const xData = [];
    const yData = [];
    for (let i = -4; i <= 4; i += 0.01) {
      xData.push(i);
      yData.push(gaussian(i));
    }

    // Define the sigma intervals and their probabilities
    const sigmaRegions = [
      { range: [-1, 1], color: 'rgba(255, 215, 0, 0.4)', text: '68.27%' },
      { range: [-2, 2], color: 'rgba(50, 205, 50, 0.3)', text: '95.45%' },
      { range: [-3, 3], color: 'rgba(100, 149, 237, 0.3)', text: '99.73%' }
    ];

    const option = {
      title: { text: 'Gaussian Distribution and σ Region Probabilities', left: 'center' },
      tooltip: { trigger: 'axis' },
      xAxis: { type: 'value', min: -4, max: 4 },
      yAxis: { type: 'value' },
      series: [
        {
          type: 'line',
          data: xData.map((x, i) => [x, yData[i]]),
          smooth: true,
          lineStyle: { width: 2 }
        }
      ],
      graphic: buildGraphic()
    };

    // graphic shapes use pixel coordinates, so they are rebuilt from the current
    // window size (an approximation that ignores the grid margins)
    function buildGraphic() {
      return sigmaRegions.flatMap(region => [
        {
          type: 'rect',
          shape: {
            x: (region.range[0] + 4) / 8 * window.innerWidth,
            y: 0,
            width: (region.range[1] - region.range[0]) / 8 * window.innerWidth,
            height: window.innerHeight
          },
          style: { fill: region.color },
          silent: true
        },
        {
          type: 'text',
          style: {
            text: region.text,
            x: ((region.range[0] + region.range[1]) / 2 + 4) / 8 * window.innerWidth,
            y: window.innerHeight * 0.2,
            textAlign: 'center',
            textVerticalAlign: 'middle',
            font: 'bold 16px Arial',
            fill: '#000'
          }
        }
      ]);
    }

    const chartDom = document.getElementById('main');
    const myChart = echarts.init(chartDom);
    myChart.setOption(option);

    // Recompute the graphic elements on resize so they do not drift out of place
    window.addEventListener('resize', () => {
      myChart.resize();
      myChart.setOption({ graphic: buildGraphic() });
    });
  </script>
</body>
</html>

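Key points

graphic is used to draw rect + text by hand, so the probability values appear directly on the canvas instead of in a tooltip.
The x, y, width, and height in shape have to be computed dynamically from the container's width and height.
flatMap adds the rectangle and text elements of all regions in one pass.

The following runnable ECharts example:

Draws a vertical line at the mean.
Draws vertical lines at ±1σ, ±2σ, and ±3σ.
Labels each vertical line with the probability of the corresponding region, avoiding overlap.

HTML file (runs directly)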

<!DOCTYPE html>
<html>
<head>
    <meta charset="utf-8">
    <title>Gaussian with Sigma Regions</title>
    <script src="https://cdn.jsdelivr.net/npm/echarts/dist/echarts.min.js"></script>
    <style>
        #main {
            width: 900px;
            height: 600px;
            margin: auto;
        }
    </style>
</head>
<body>
    <div id="main"></div>
    <script>
        var chartDom = document.getElementById('main');
        var myChart = echarts.init(chartDom);

        // Generate the normal distribution data
        function gaussian(x, mean, sigma) {
            return Math.exp(-Math.pow(x - mean, 2) / (2 * sigma * sigma)) / (sigma * Math.sqrt(2 * Math.PI));
        }

        var mean = 0;
        var sigma = 1;
        var data = [];
        for (var x = -4; x <= 4; x += 0.05) {
            data.push([x, gaussian(x, mean, sigma)]);
        }

        // Region probabilities
        var sigmaRegions = [
            {sigma: 1, prob: '68.27%'},
            {sigma: 2, prob: '95.45%'},
            {sigma: 3, prob: '99.73%'}
        ];

        // Build the vertical marker lines
        var markLines = [
            {
                xAxis: mean,
                label: { formatter: 'Mean' },
                lineStyle: { color: 'black', type: 'solid' }
            }
        ];
        sigmaRegions.forEach(function(region, index) {
            markLines.push({
                xAxis: mean + region.sigma * sigma,
                label: {
                    formatter: region.prob,
                    // markLine labels are positioned along the line: 'start', 'middle' or 'end'
                    position: index % 2 === 0 ? 'end' : 'start',
                    distance: 12 + index * 5 // avoid overlap
                },
                lineStyle: { color: '#666', type: 'dashed' }
            });
            markLines.push({
                xAxis: mean - region.sigma * sigma,
                label: {
                    formatter: region.prob,
                    position: index % 2 === 0 ? 'start' : 'end',
                    distance: 12 + index * 5
                },
                lineStyle: { color: '#666', type: 'dashed' }
            });
        });

        var option = {
            title: {
                text: 'Normal Distribution with Sigma Regions'
            },
            tooltip: { trigger: 'axis' },
            xAxis: { type: 'value' },
            yAxis: { type: 'value' },
            series: [
                {
                    type: 'line',
                    data: data,
                    smooth: true,
                    markLine: {
                        symbol: 'none',
                        data: markLines
                    }
                }
            ]
        };

        myChart.setOption(option);
    </script>
</body>
</html>

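Features

Each σ line is drawn separately, and distance and position keep the labels from overlapping.
The center line carries the mean label "Mean".
The code runs directly, with no extra dependencies.

Understood. You want a horizontal marker line above each sigma interval to indicate its extent, with the probability labelled at the midpoint. Vertical markLine lines alone cannot do that; the horizontal marker lines need an extra line series or a markLine configuration.

The complete example below implements:

±1σ, ±2σ, ±3σ vertical lines
A horizontal marker line above each ±σ interval
The probability labelled at the midpoint of each interval
Labels that do not overlap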

<!DOCTYPE html>
<html>
<head>
<meta charset="UTF-8">
<title>Sigma Chart with Horizontal Markers</title>
<script src="https://cdn.jsdelivr.net/npm/echarts/dist/echarts.min.js"></script>
</head>
<body>
<div id="sigmaChart" style="width: 900px; height: 450px;"></div>
<script>
const mean = 0;
const std = 1;
const data = [];
const step = 0.1;

// Generate the normal distribution data
for (let x = mean - 4*std; x <= mean + 4*std; x += step) {
    const y = (1 / (std * Math.sqrt(2 * Math.PI))) * Math.exp(-0.5 * ((x - mean)/std)**2);
    data.push([x, y]);
}

const sigmaChart = echarts.init(document.getElementById('sigmaChart'));

// Sigma intervals
const sigmaIntervals = [
    { start: mean - 1*std, end: mean + 1*std, prob: '68.27%' },
    { start: mean - 2*std, end: mean + 2*std, prob: '95.45%' },
    { start: mean - 3*std, end: mean + 3*std, prob: '99.73%' }
];

// Build one line series per horizontal marker
const horizontalLines = sigmaIntervals.map((interval, index) => {
    const y = 0.25 - index*0.05; // y position of the horizontal line, one level lower per interval
    return {
        type: 'line',
        data: [
            [interval.start, y],
            [interval.end, y]
        ],
        lineStyle: { color: 'green', width: 2 },
        markLine: { show: false },
        label: { show: false }
    };
});

// Scatter series that carry the label text of the horizontal markers
const horizontalLabels = sigmaIntervals.map((interval, index) => {
    const y = 0.25 - index*0.05;
    return {
        type: 'scatter',
        data: [{
            value: [(interval.start + interval.end)/2, y],
            label: { show: true, formatter: interval.prob, position: 'top', fontWeight: 'bold' }
        }],
        symbolSize: 0
    };
});

const option = {
    title: { text: 'Sigma Chart with Horizontal Markers' },
    tooltip: { trigger: 'axis' },
    xAxis: { type: 'value', name: 'x' },
    yAxis: { type: 'value', name: 'Probability Density' },
    series: [
        {
            type: 'line',
            smooth: true,
            data: data,
            name: 'Normal Distribution',
            lineStyle: { color: '#5470c6' },
            markLine: {
                symbol: 'none',
                lineStyle: { type: 'dashed', color: 'red' },
                data: [
                    { xAxis: mean - 1*std },
                    { xAxis: mean + 1*std },
                    { xAxis: mean - 2*std },
                    { xAxis: mean + 2*std },
                    { xAxis: mean - 3*std },
                    { xAxis: mean + 3*std }
                ]
            }
        },
        ...horizontalLines,
        ...horizontalLabels
    ]
};

sigmaChart.setOption(option);
</script>
</body>
</html>

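✅ Features:

Vertical lines: ±1σ, ±2σ, ±3σ drawn with markLine.
Horizontal marker lines: solid green lines spanning each sigma interval.
Probability text: shown at the midpoint of each horizontal marker line, without overlap; the y values can be adjusted as needed.
The line chart is smoothed and clearly shows the probability density.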
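
If the interval descriptors are to come from Python rather than being hard-coded in the page, they could be generated along these lines. This is a sketch under assumptions: sigma_intervals is a hypothetical name, and the defaults top=0.25 and gap=0.05 simply mirror the y positions used in the example above, which suit a standard normal curve (peak density about 0.4) and would need rescaling for other standard deviations.

```python
from statistics import NormalDist

def sigma_intervals(mean, std, levels=(1, 2, 3), top=0.25, gap=0.05):
    """Describe each ±k sigma interval: its bounds, coverage label and marker height."""
    std_normal = NormalDist()
    intervals = []
    for index, k in enumerate(levels):
        # coverage of the symmetric interval [mean - k*std, mean + k*std]
        prob = std_normal.cdf(k) - std_normal.cdf(-k)
        intervals.append({
            "start": mean - k * std,
            "end": mean + k * std,
            "prob": f"{prob * 100:.2f}%",
            # each wider interval sits one step lower so the labels do not collide
            "y": round(top - index * gap, 10),
        })
    return intervals

intervals = sigma_intervals(mean=0, std=1)
for item in intervals:
    print(item)
```

The resulting list has the same shape as sigmaIntervals plus a y field, so the front end could map it to the horizontal line and label series without recomputing anything.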
