共计 7085 个字符,预计需要花费 18 分钟才能阅读完成。
目录
- 业务需求与口径定义(PRD → 指标 → 数据字典)
- 数据建模与 Schema(DDL + 约束 + 版本化)
- 关键查询 & 索引设计(SARGable 与计划稳定)
- API 侧 SQL:Keyset 分页、过滤、排序与计数
- 物化汇总与调度(每日指标、留存、分区归档)
- 监控与 SLO(mysqld_exporter + 告警规则)
- 备份与 PITR(脚本化)
- 部署:最小 Docker Compose(可本地跑通)
- 发布与回滚(Expand→Contract、金丝雀与基线)
- 验收与演练清单(含混沌注入)
- 题库:实战问答与参考答案
1) 业务需求与口径定义
核心指标(示例):
- GMV(仅统计
orders.status='paid'); - 订单数 / 买家数 / AOV(GMV/ 订单数);
- 留存(首单 Cohort 的 m1、m3 回购率);
- 渠道维度(
orders.channel)分解。
数据字典(片段):
metric: gmv_daily
source: orders(status='paid')
formula: sum(amount) by date(created_at)
precision: DECIMAL(18,2)
timezone: Asia/Shanghai
refresh: daily 02:00
owner: data@company
口径先行:明确过滤、时区、刷新频率与负责人,避免“多套真相”。
2) 数据建模与 Schema(DDL + 约束 + 版本化)
基础 DDL(合并前几课)
CREATE DATABASE IF NOT EXISTS shop
DEFAULT CHARACTER SET utf8mb4
DEFAULT COLLATE utf8mb4_0900_ai_ci;
USE shop;
-- 用户
CREATE TABLE IF NOT EXISTS users (
user_id BIGINT PRIMARY KEY AUTO_INCREMENT,
name VARCHAR(100) NOT NULL,
email VARCHAR(255) UNIQUE,
email_l VARCHAR(255) GENERATED ALWAYS AS (LOWER(email)) STORED,
is_active TINYINT(1) NOT NULL DEFAULT 1,
created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
CHECK (is_active IN (0,1))
) ENGINE=InnoDB;
-- 商品
CREATE TABLE IF NOT EXISTS products (
product_id BIGINT PRIMARY KEY AUTO_INCREMENT,
title VARCHAR(200) NOT NULL,
price DECIMAL(10,2) NOT NULL CHECK (price>=0),
stock INT NOT NULL DEFAULT 0,
attrs JSON NULL,
color VARCHAR(32) GENERATED ALWAYS AS (JSON_UNQUOTE(JSON_EXTRACT(attrs,'$.color'))) STORED,
created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
INDEX idx_products_color(color),
FULLTEXT KEY ft_title(title)
) ENGINE=InnoDB;
-- 订单主表
CREATE TABLE IF NOT EXISTS orders (
order_id BIGINT PRIMARY KEY AUTO_INCREMENT,
user_id BIGINT NOT NULL,
status ENUM('paid','canceled','refunded') NOT NULL DEFAULT 'paid',
amount DECIMAL(10,2) NOT NULL,
channel VARCHAR(16) NULL,
created_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP,
CONSTRAINT fk_orders_user FOREIGN KEY (user_id) REFERENCES users(user_id),
INDEX idx_orders_user_created(user_id, created_at),
INDEX idx_orders_status_created(status, created_at, order_id)
) ENGINE=InnoDB;
-- 订单明细
CREATE TABLE IF NOT EXISTS order_items (
item_id BIGINT PRIMARY KEY AUTO_INCREMENT,
order_id BIGINT NOT NULL,
product_id BIGINT NOT NULL,
quantity INT NOT NULL CHECK (quantity>0),
unit_price DECIMAL(10,2) NOT NULL,
CONSTRAINT fk_items_order FOREIGN KEY(order_id) REFERENCES orders(order_id),
CONSTRAINT fk_items_product FOREIGN KEY(product_id) REFERENCES products(product_id)
) ENGINE=InnoDB;
-- 价格区间(区间连接)CREATE TABLE IF NOT EXISTS price_history (
product_id BIGINT NOT NULL,
start_ts DATETIME NOT NULL,
end_ts DATETIME NOT NULL,
price DECIMAL(10,2) NOT NULL,
PRIMARY KEY(product_id, start_ts),
KEY idx_price_end(end_ts)
) ENGINE=InnoDB;
-- 每日汇总
CREATE TABLE IF NOT EXISTS metrics_daily (
d DATE PRIMARY KEY,
gmv DECIMAL(18,2) NOT NULL DEFAULT 0
) ENGINE=InnoDB;
版本化(最小实现)
CREATE TABLE IF NOT EXISTS schema_migrations (version VARCHAR(64) PRIMARY KEY,
applied_at DATETIME NOT NULL DEFAULT CURRENT_TIMESTAMP
);
-- 执行器读取 /migrate 目录按文件名顺序执行并记录版本
3) 关键查询 & 索引设计
用户近 30 天订单(命中联合索引,时间谓词可索引化):
SELECT order_id, amount, created_at
FROM orders
WHERE user_id = :uid
AND created_at >= NOW() - INTERVAL 30 DAY
ORDER BY created_at DESC
LIMIT 50;
商品搜索(全文 + 结构化过滤):
SELECT product_id, title, price
FROM products
WHERE MATCH(title) AGAINST(:q IN BOOLEAN MODE)
AND (:color IS NULL OR color=:color)
ORDER BY price ASC
LIMIT 20;
订单 GMV(防倍增):
WITH item_gmv AS (SELECT order_id, SUM(quantity*unit_price) AS gmv
FROM order_items GROUP BY order_id
)
SELECT o.order_id, u.name, g.gmv
FROM orders o
JOIN users u ON o.user_id=u.user_id
JOIN item_gmv g ON o.order_id=g.order_id
WHERE o.status='paid';
计划稳定工具:
- 直方图:
ANALYZE TABLE products UPDATE HISTOGRAM ON color WITH 32 BUCKETS; - 不可见索引灰度:
CREATE INDEX ... INVISIBLE;→ 验证 →ALTER INDEX ... VISIBLE;
4) API 侧 SQL(Keyset 分页 + 计数)
订单列表 Keyset 分页:
-- 首次
SELECT order_id, created_at, amount
FROM orders
WHERE user_id=:uid AND status='paid'
ORDER BY created_at DESC, order_id DESC
LIMIT 20;
-- 下一页(带游标)SELECT order_id, created_at, amount
FROM orders
WHERE user_id=:uid AND status='paid'
AND (created_at, order_id) < (:last_created_at, :last_id)
ORDER BY created_at DESC, order_id DESC
LIMIT 20;
“总数”估算:
- 精确计数:
SELECT COUNT(*) FROM orders WHERE user_id=:uid AND status='paid'; - 近似计数:采样或 HyperLogLog(落到 OLAP/ 缓存层)。
5) 物化汇总与调度
每日 GMV upsert:
INSERT INTO metrics_daily(d, gmv)
SELECT DATE(created_at), SUM(amount)
FROM orders WHERE status='paid' AND created_at >= CURDATE() - INTERVAL 7 DAY
GROUP BY DATE(created_at)
ON DUPLICATE KEY UPDATE gmv=VALUES(gmv);
事件调度器(每日 02:05 跑):
SET GLOBAL event_scheduler=ON;
DELIMITER $$
CREATE EVENT IF NOT EXISTS ev_gmv_daily
ON SCHEDULE EVERY 1 DAY STARTS DATE_ADD(CURDATE(), INTERVAL 1 DAY) + INTERVAL 2 HOUR + INTERVAL 5 MINUTE
DO BEGIN
INSERT INTO metrics_daily(d, gmv)
SELECT DATE(created_at), SUM(amount)
FROM orders WHERE status='paid' AND created_at >= CURDATE() - INTERVAL 7 DAY
GROUP BY DATE(created_at)
ON DUPLICATE KEY UPDATE gmv=VALUES(gmv);
END$$
DELIMITER ;
分区归档(按月)(选做):
CREATE TABLE IF NOT EXISTS orders_p LIKE orders;
ALTER TABLE orders_p
PARTITION BY RANGE (TO_DAYS(created_at)) (PARTITION p2025m09 VALUES LESS THAN (TO_DAYS('2025-10-01')),
PARTITION pmax VALUES LESS THAN MAXVALUE
);
6) 监控与 SLO(Prometheus 规则参考)
SLO 示例:p99 查询 < 200ms、复制延迟 < 2s、连接占比 < 85%。
告警片段:
groups:
- name: mysql.rules
rules:
- alert: MySQLHighLatencyP99
expr: histogram_quantile(0.99, sum(rate(mysql_statement_latency_bucket[5m])) by (le)) > 0.2
for: 10m
- alert: MySQLReplicaLag
expr: mysql_slave_status_seconds_behind_master > 2
for: 2m
- alert: MySQLConnectionsHigh
expr: mysql_global_status_threads_connected / mysql_global_variables_max_connections > 0.85
for: 5m
7) 备份与 PITR(脚本化)
逻辑全量 + binlog 窗口回放:
# backup.sh
set -e
TS=$(date +%F)
mysqldump -h127.0.0.1 -uroot -p$PASS --databases shop \
--single-transaction --routines --triggers | gzip > backups/shop_$TS.sql.gz
# pitr.sh
START="2025-10-11 00:00:00"; STOP="2025-10-11 12:00:00"
mysqlbinlog --read-from-remote-server -h127.0.0.1 -uroot -p$PASS \
--start-datetime="$START" --stop-datetime="$STOP" \
--result-file=redo.sql
mysql -h127.0.0.1 -uroot -p$PASS < redo.sql
生产建议:周全量 + 日增量 ,并安排 季度演练恢复。
8) 部署:最小 Docker Compose
目录结构建议:
capstone/
docker-compose.yml
mysql/
my.cnf
init/
00_schema.sql
01_seed.sql
02_indexes.sql
03_events.sql
monitoring/
prometheus.yml
rules.yml
docker-compose.yml(片段):
version: '3.8'
services:
mysql:
image: mysql:8.0
environment:
MYSQL_ROOT_PASSWORD: Passw0rd!
ports: ["3306:3306"]
volumes:
- ./mysql/my.cnf:/etc/mysql/conf.d/my.cnf:ro
- ./mysql/init:/docker-entrypoint-initdb.d:ro
exporter:
image: prom/mysqld-exporter:latest
environment:
DATA_SOURCE_NAME: exporter:Expo@123@(mysql:3306)/
ports: ["9104:9104"]
depends_on: [mysql]
prometheus:
image: prom/prometheus:latest
volumes:
- ./monitoring/prometheus.yml:/etc/prometheus/prometheus.yml:ro
- ./monitoring/rules.yml:/etc/prometheus/rules.yml:ro
command: ["--config.file=/etc/prometheus/prometheus.yml"]
ports: ["9090:9090"]
depends_on: [exporter]
9) 发布与回滚
策略:
- 架构变更遵循 Expand→Contract;
- 新索引以 INVISIBLE 上线;
- 计划回归保护:保存 EXPLAIN JSON 的
plan_hash与rows_examined; - 金丝雀:1% 流量先用新 plan/ 新索引,观察 p95、错误率;
- 异常回退:
ALTER INDEX ... INVISIBLE/DROP,或启用旧版本镜像。
10) 验收与演练清单
- ✅ Base:DDL 通过、外键有效、初始索引到位;
- ✅ 查询:关键 SQL
EXPLAIN/ANALYZE无全表扫描,rows 预估合理; - ✅ 监控:Exporters 上线,主看板有 p95/p99、慢查询速率、连接占比;
- ✅ 备份:
backup.sh成功出包,pitr.sh在沙箱可回放; - ✅ 事件:
ev_gmv_daily正常出数; - ✅ 压测:sysbench 跑通并把 QPS/p95 写入 baseline;
- ✅ 故障演练:重启 / 断网 / 只读切换 / 复制延迟注入均可恢复且指标达标。
11) 题库:实战问答与参考答案
Q1|为什么我的 WHERE DATE(created_at)='2025-10-01' 很慢?
- 因为对列做函数,失去可索引性。改为区间:
created_at>='2025-10-01' AND created_at<'2025-10-02'。
Q2|如何避免 JOIN 倍增导致 GMV 翻倍?
- 先在明细
GROUP BY得到订单级 GMV,再与订单回连。
Q3|线上大分页耗时长怎么办?
- 用 Keyset 游标;必要时覆盖索引先取主键,再回表。
Q4|新增索引会让写入变慢吗?
- 会。评估读写比例;可在离峰构建、或先 INVISIBLE 验证。
Q5|金丝雀对比怎么做?
- 1% 用户走新索引 / 新查询;比较 p95、错误率、rows_examined、返回行一致性。
Q6|如何把半结构化(JSON)查询跑快?
- 生成列 + 索引,避免直接
JSON_EXTRACT出现在WHERE。
Q7|如何设计“写后读一致”?
- 强制主读或在只读副本 wait-for-GTID 应用到对应事务位点。
Q8|如何验证备份可用?
- 定期在隔离环境恢复,运行健康检查与关键查询回归。
小结
- 以 口径与字典 收敛语义;以 DDL/ 约束 / 版本化 稳固数据;以 SARGable 查询与索引 稳住性能;以 监控 + 备份 + 演练 抵御不确定;以 发布策略 降低变更风险。
正文完

