vibe_coding/.cursor/rules/references/014-database-deep.md
2026-03-05 21:27:11 +08:00
# 014-database.mdc (Deep Reference)
> This file is the archived original detailed specification, read on demand at Tier 3.
---
# 🗄️ Hyperf ORM + MySQL Database Standards
Reference: @docs/architecture/data-model.md
## Core Principles
- Have a backup strategy in place before any write operation
- Use Hyperf Migrations; direct DDL is forbidden
- Prefer soft deletes (`deleted_at`) over hard deletes
- Store sensitive data encrypted
- Production database access is read-only
## Required Fields (every table)
```php
// Hyperf Migration
Schema::create('examples', function (Blueprint $table) {
    $table->bigIncrements('id');
    $table->timestamps();    // created_at, updated_at
    $table->softDeletes();   // deleted_at
    $table->unsignedBigInteger('created_by')->nullable();
    $table->unsignedBigInteger('updated_by')->nullable();
});
```
## Model Standards
```php
<?php

declare(strict_types=1);

namespace App\Model\Production;

use Hyperf\Database\Model\Relations\BelongsTo;
use Hyperf\Database\Model\Relations\HasMany;
use Hyperf\Database\Model\SoftDeletes;
use Hyperf\DbConnection\Model\Model;

class ProductionOrder extends Model
{
    use SoftDeletes;

    protected ?string $table = 'production_orders';

    protected array $fillable = [
        'order_no', 'customer_id', 'platform_id',
        'status', 'total_amount', 'paid_amount',
    ];

    protected array $casts = [
        'total_amount' => 'decimal:2',
        'paid_amount'  => 'decimal:2',
        'source_data'  => 'json',
        'created_at'   => 'datetime',
    ];

    // Define relations so callers can eager-load with with() to prevent N+1
    public function customer(): BelongsTo
    {
        return $this->belongsTo(Customer::class);
    }

    public function subOrders(): HasMany
    {
        return $this->hasMany(ProductionSubOrder::class, 'order_id');
    }
}
```
## Naming Conventions
### Table Name Module Prefix Rule
**Table names must carry their module name as a prefix**, in the form `<module>_<entity_plural>`:
| Module | Prefix | Examples |
|------|------|------|
| Users & permissions | `auth_` | `auth_users`, `auth_roles` |
| Case core | `case_` | `case_cases`, `case_images` |
| Designers | `designer_` | `designer_profiles`, `designer_awards` |
| Operations content | `operation_` | `operation_banners`, `operation_topics` |
| User interaction | `interaction_` | `interaction_favorites`, `interaction_comments` |
| Logging | `log_` | `log_user_logins`, `log_downloads` |
| Security & risk control | `security_` | `security_blacklists`, `security_risk_events` |
| System config | `system_` | `system_configs` |
> Many-to-many pivot tables also take a module prefix: `<module>_<a>_belongs_<b>`, using the module of the primary entity.
### General Naming Conventions
| Type | Rule | Example |
|------|------|------|
| Table name | `<module>_` prefix + snake_case plural | `auth_users`, `case_cases` |
| Column name | snake_case | `created_at` |
| Primary key | `id` | `BIGINT UNSIGNED AUTO_INCREMENT` |
| Foreign key | `<table_singular>_id` | `customer_id` |
| Index | `idx_<table>_<columns>` | `idx_auth_users_status` |
| Unique index | `uk_<table>_<columns>` | `uk_case_cases_code` |
## High-Concurrency Table Design
### Field Type Selection
| Scenario | Recommended Type | Avoid |
|------|---------|------|
| Primary key | `BIGINT UNSIGNED` | `INT` (overflow risk) |
| Money | `DECIMAL(10,2)` | `FLOAT`/`DOUBLE` (precision loss) |
| Status | `VARCHAR(20)` or `TINYINT` | `ENUM` (changes require DDL) |
| Time | `TIMESTAMP` | `DATETIME` (no time zone) |
| JSON data | `JSON` (MySQL 8) | `TEXT` (not indexable) |
| Short text | `VARCHAR(n)` with a precise length | `VARCHAR(255)` as a catch-all |
### New Column Safety Constraints
**Adding a column to a table that already holds data must follow these rules:**
```php
// ✅ Correct: nullable
$table->string('avatar')->nullable();

// ✅ Correct: has a default value
$table->tinyInteger('priority')->default(0);

// ❌ Forbidden: NOT NULL with no default (locks the table and rewrites every row)
$table->string('role'); // NOT NULL without default
```
If a NOT NULL constraint is needed, use the three-step approach:
1. Add the column as nullable
2. Backfill the data (separate migration)
3. Add the NOT NULL constraint (separate migration)
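The three steps can be sketched as three separate migration bodies. This is a sketch, not a prescribed implementation; the table name, column, and backfill value are illustrative:

```php
// Step 1 (migration A): add the column as nullable — safe on a populated table
Schema::table('auth_users', function (Blueprint $table) {
    $table->string('role', 32)->nullable();
});

// Step 2 (migration B): backfill in batches, kept separate from the schema change
Db::table('auth_users')
    ->whereNull('role')
    ->chunkById(1000, function ($users) {
        foreach ($users as $user) {
            // 'member' is a hypothetical default for illustration
            Db::table('auth_users')->where('id', $user->id)->update(['role' => 'member']);
        }
    });

// Step 3 (migration C): tighten the constraint once every row has a value
Schema::table('auth_users', function (Blueprint $table) {
    $table->string('role', 32)->nullable(false)->change();
});
```

Each step ships as its own migration file, so a failed backfill can be retried without re-running the DDL.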
### Index Strategy
```sql
-- Composite index: follows the leftmost-prefix rule
CREATE INDEX idx_orders_status_created ON production_orders(status, created_at);

-- Covering index: all queried columns are in the index, avoiding table lookups
CREATE INDEX idx_orders_cover ON production_orders(status, total_amount, paid_amount);

-- Prefix index: for long string columns
CREATE INDEX idx_orders_remark ON production_orders(remark(20));
```
**Index checklist**:
- [ ] Every foreign key column has an index
- [ ] Columns frequently used in WHERE have indexes
- [ ] ORDER BY columns are covered by an index
- [ ] Multi-column queries use composite indexes
- [ ] No more than 6 indexes per table
### Million-Row Optimization
- Use cursor pagination (`WHERE id > ? LIMIT ?`) instead of `OFFSET`
- Use approximate or cached counts for large-table COUNT
- Use `chunk()` to process bulk operations in batches
- Avoid large transactions; keep each transaction under 1,000 rows
- Cache hot data in Redis to reduce DB pressure
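Cursor pagination from the first bullet can be sketched as follows (a sketch against the `ProductionOrder` model above; the method and parameter names are illustrative):

```php
// Cursor pagination: seek by primary key instead of OFFSET, so a deep page
// costs the same as page 1 regardless of depth.
public function listOrders(int $lastId = 0, int $pageSize = 50): array
{
    $rows = ProductionOrder::query()
        ->select(['id', 'order_no', 'status', 'total_amount']) // no SELECT *
        ->where('id', '>', $lastId)
        ->orderBy('id')
        ->limit($pageSize)
        ->get();

    return [
        'items'       => $rows,
        // The client echoes this back as lastId to fetch the next page
        'next_cursor' => $rows->isNotEmpty() ? $rows->last()->id : null,
    ];
}
```

The trade-off: cursors only support "next page" traversal, which is usually acceptable for infinite-scroll lists and exports.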
## Read/Write Splitting
```php
// config/autoload/databases.php
return [
    'default' => [
        'driver' => 'mysql',
        'read' => [
            'host' => [env('DB_READ_HOST_1'), env('DB_READ_HOST_2')],
        ],
        'write' => [
            'host' => env('DB_WRITE_HOST'),
        ],
        'port' => env('DB_PORT', 3306),
        'database' => env('DB_DATABASE'),
        'username' => env('DB_USERNAME'),
        'password' => env('DB_PASSWORD'),
        'charset' => 'utf8mb4',
        'collation' => 'utf8mb4_unicode_ci',
        'pool' => [
            'min_connections' => 5,
            'max_connections' => 50,
            'connect_timeout' => 10.0,
            'wait_timeout' => 3.0,
            'heartbeat' => -1,
            'max_idle_time' => 60,
        ],
    ],
];
```
## Advanced Query Patterns
### UPSERT (Insert or Update)
```php
// Hyperf model upsert — update the listed columns on conflict
ProductionOrder::query()->upsert(
    [
        ['order_no' => 'PO-001', 'status' => 'active',  'total_amount' => 100.00],
        ['order_no' => 'PO-002', 'status' => 'pending', 'total_amount' => 200.00],
    ],
    ['order_no'],              // conflict key (unique)
    ['status', 'total_amount'] // columns to update on conflict
);

// Equivalent raw SQL:
// INSERT INTO production_orders (order_no, status, total_amount)
// VALUES ('PO-001', 'active', 100.00), ('PO-002', 'pending', 200.00)
// ON DUPLICATE KEY UPDATE status = VALUES(status), total_amount = VALUES(total_amount);
```
Typical uses: external data sync, bulk import, idempotent writes.
### FOR UPDATE SKIP LOCKED (Lock-Free Queue Consumption)
Supported in MySQL 8.0+; suited to custom task queues and distributed job dispatch:
```php
// Atomic: claim one pending job without blocking other workers.
// FOR UPDATE only holds the lock inside a transaction, so both statements
// must run in the same transaction.
Db::transaction(function () use ($workerId) {
    $job = Db::select(
        "SELECT * FROM async_jobs
         WHERE status = 'pending'
         ORDER BY created_at
         LIMIT 1
         FOR UPDATE SKIP LOCKED"
    );
    if ($job) {
        Db::update(
            "UPDATE async_jobs SET status = 'processing', worker_id = ? WHERE id = ?",
            [$workerId, $job[0]->id]
        );
    }
});
```
Benefit: multiple workers consume concurrently without blocking each other; rows already locked are skipped automatically.
### Covering Index (Index-Only Scan)
```sql
-- All queried columns are in the index, so no table lookup is needed
CREATE INDEX idx_orders_cover
    ON production_orders(status, created_at, total_amount, paid_amount);

-- This query is served entirely from the index
SELECT status, total_amount, paid_amount
FROM production_orders
WHERE status = 'active' AND created_at > '2026-01-01';
```
## Anti-Pattern Detection SQL
Run these diagnostic queries periodically to surface latent problems:
```sql
-- 1. Foreign key columns without an index
SELECT TABLE_NAME, COLUMN_NAME, CONSTRAINT_NAME, REFERENCED_TABLE_NAME
FROM information_schema.KEY_COLUMN_USAGE
WHERE REFERENCED_TABLE_NAME IS NOT NULL
  AND TABLE_SCHEMA = DATABASE()
  AND COLUMN_NAME NOT IN (
      SELECT COLUMN_NAME FROM information_schema.STATISTICS
      WHERE TABLE_SCHEMA = DATABASE()
        AND TABLE_NAME = KEY_COLUMN_USAGE.TABLE_NAME
  );

-- 2. Top 10 slow queries (requires performance_schema)
-- Timer columns are in picoseconds; divide by 1e9 for milliseconds
SELECT DIGEST_TEXT AS query,
       COUNT_STAR AS calls,
       ROUND(AVG_TIMER_WAIT / 1000000000, 2) AS avg_ms,
       ROUND(SUM_TIMER_WAIT / 1000000000, 2) AS total_ms
FROM performance_schema.events_statements_summary_by_digest
WHERE SCHEMA_NAME = DATABASE()
  AND AVG_TIMER_WAIT > 500 * 1000000000 -- avg > 500 ms
ORDER BY AVG_TIMER_WAIT DESC
LIMIT 10;

-- 3. Table fragmentation (candidates for OPTIMIZE TABLE)
SELECT TABLE_NAME,
       TABLE_ROWS,
       ROUND(DATA_LENGTH / 1024 / 1024, 2) AS data_mb,
       ROUND(DATA_FREE / 1024 / 1024, 2) AS fragmented_mb
FROM information_schema.TABLES
WHERE TABLE_SCHEMA = DATABASE()
  AND DATA_FREE > 10 * 1024 * 1024 -- > 10MB fragmentation
ORDER BY DATA_FREE DESC;

-- 4. Unused indexes (never read since server start)
SELECT s.TABLE_NAME, s.INDEX_NAME
FROM information_schema.STATISTICS s
LEFT JOIN performance_schema.table_io_waits_summary_by_index_usage p
       ON s.TABLE_SCHEMA = p.OBJECT_SCHEMA
      AND s.TABLE_NAME = p.OBJECT_NAME
      AND s.INDEX_NAME = p.INDEX_NAME
WHERE s.TABLE_SCHEMA = DATABASE()
  AND s.INDEX_NAME != 'PRIMARY'
  AND (p.COUNT_STAR IS NULL OR p.COUNT_STAR = 0);
```
## Forbidden Anti-Patterns
| Anti-pattern | Replacement |
|--------|----------|
| N+1 queries | Eager loading with `with()` |
| Hard deletes | Soft delete + scheduled purge |
| Unindexed foreign keys | Every FK must have an index |
| `SELECT *` | Explicit columns via `select()` |
| Long transactions | Split into small transactions |
| Single-row SQL in a loop | Batch operations with `insert()` |
| Deep pagination with `OFFSET` | Cursor pagination `WHERE id > ?` |
| Adding a NOT NULL column with no default | nullable first → backfill → add constraint |
| Schema + data in one migration | Split into separate migration files |
| Editing already-deployed migrations | Create a new forward migration |
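For example, the loop-insert anti-pattern and its batch replacement (a sketch; the `ProductionItem` payload and chunk size are illustrative):

```php
// ❌ Anti-pattern: one INSERT per iteration — N round-trips, N implicit transactions
foreach ($rows as $row) {
    ProductionItem::create($row);
}

// ✅ Batch insert: one statement per chunk of 500 rows.
// Note: insert() goes through the query builder, so model events and
// automatic timestamps are skipped — include created_at in $rows if needed.
foreach (array_chunk($rows, 500) as $chunk) {
    ProductionItem::insert($chunk);
}
```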
## Migration File Naming
Migration files strictly follow this naming format:
```
YYYY_MM_DD_HHMMSS_description.php
```
| Operation | Example Filename |
|------|----------|
| Create table | `2026_02_24_100000_create_production_orders_table.php` |
| Add column | `2026_02_24_110000_add_payment_status_to_production_orders.php` |
| Add index | `2026_02_24_120000_add_index_status_to_production_orders.php` |
| Modify column | `2026_02_24_130000_modify_amount_column_in_production_orders.php` |
| Drop column | `2026_02_24_140000_drop_legacy_field_from_production_orders.php` |
**Rules**:
- Timestamps are precise to the second, guaranteeing ordering
- `description` is snake_case and must clearly state what the migration does
- Each migration does exactly one thing (one table, or one kind of change)
- Every migration must implement `down()` to support rollback
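A minimal skeleton satisfying these rules, based on the `add_payment_status_to_production_orders` filename above (the column definition and status values are illustrative):

```php
<?php

declare(strict_types=1);

use Hyperf\Database\Migrations\Migration;
use Hyperf\Database\Schema\Blueprint;
use Hyperf\Database\Schema\Schema;

// File: 2026_02_24_110000_add_payment_status_to_production_orders.php
class AddPaymentStatusToProductionOrders extends Migration
{
    public function up(): void
    {
        Schema::table('production_orders', function (Blueprint $table) {
            // nullable, so the column is safe to add on a populated table
            $table->tinyInteger('payment_status')->nullable()
                ->comment('0=unpaid 1=partial 2=paid');
        });
    }

    // down() exactly reverses up(), so migrate:rollback works
    public function down(): void
    {
        Schema::table('production_orders', function (Blueprint $table) {
            $table->dropColumn('payment_status');
        });
    }
}
```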
## Schema Change Process
1. Read `data-model.md` → 2. Design the change → 3. Write the Migration (with `down()` rollback) → 4. Run it in the dev environment → 5. Update the docs
```bash
# Generate a migration
php bin/hyperf.php gen:migration create_orders_table
# Run migrations
php bin/hyperf.php migrate
# Roll back
php bin/hyperf.php migrate:rollback
# Generate a model
php bin/hyperf.php gen:model production_orders
```
## Core Table Relationship Diagram
```mermaid
erDiagram
%% ── User system ──────────────────
users ||--o{ user_roles : "has"
roles ||--o{ user_roles : "has"
roles ||--o{ role_menus : "has"
roles ||--o{ role_depts : "has"
menus ||--o{ role_menus : "has"
departments ||--o{ users : "belongs"
departments ||--o{ role_depts : "has"
departments ||--o{ departments : "parent"
users {
bigint id PK
varchar username UK
varchar password
varchar real_name
bigint dept_id FK
tinyint status
tinyint data_scope
timestamp last_login_at
}
roles {
bigint id PK
varchar name UK
varchar code UK
tinyint data_scope
tinyint status
}
departments {
bigint id PK
varchar name
bigint parent_id FK
int sort_order
tinyint status
}
menus {
bigint id PK
varchar name
varchar path
varchar permission
bigint parent_id FK
tinyint type
int sort_order
}
%% ── Production system ──────────────────
production_orders ||--o{ production_sub_orders : "has"
production_orders ||--o{ production_payments : "has"
production_orders }o--|| customers : "belongs"
production_orders }o--|| platforms : "belongs"
production_sub_orders ||--o{ production_items : "has"
production_orders {
bigint id PK
varchar order_no UK
bigint customer_id FK
bigint platform_id FK
tinyint status
decimal total_amount
decimal paid_amount
json source_data
bigint created_by FK
}
production_sub_orders {
bigint id PK
bigint order_id FK
varchar sub_order_no UK
tinyint status
decimal amount
}
production_payments {
bigint id PK
bigint order_id FK
varchar payment_no UK
decimal amount
tinyint payment_method
tinyint status
}
customers {
bigint id PK
varchar name
varchar company
varchar contact_phone
tinyint level
}
platforms {
bigint id PK
varchar name UK
varchar code UK
tinyint status
}
%% ── Notification system ──────────────────
notifications ||--o{ notification_reads : "has"
users ||--o{ notification_reads : "has"
notifications {
bigint id PK
varchar type
varchar title
text content
json data
bigint sender_id FK
tinyint scope
}
notification_reads {
bigint id PK
bigint notification_id FK
bigint user_id FK
timestamp read_at
}
%% ── Approval workflow ──────────────────
workflows ||--o{ workflow_nodes : "has"
workflow_nodes ||--o{ workflow_records : "has"
users ||--o{ workflow_records : "approves"
workflows {
bigint id PK
varchar name
varchar type
bigint reference_id
tinyint status
}
workflow_nodes {
bigint id PK
bigint workflow_id FK
int step
varchar name
bigint assignee_id FK
tinyint status
}
workflow_records {
bigint id PK
bigint node_id FK
bigint user_id FK
tinyint action
text comment
}
```
## Summary Tables for Pre-Aggregation (Report Optimization)
Heavy statistical queries should not scan detail tables in real time; use a summary table refreshed by a scheduled job:
```sql
-- Summary table, keyed by day and platform
CREATE TABLE order_stats_daily (
    stat_date    DATE NOT NULL,
    platform_id  BIGINT UNSIGNED NOT NULL,
    total_orders INT UNSIGNED DEFAULT 0,
    total_amount DECIMAL(14,2) DEFAULT 0.00,
    avg_amount   DECIMAL(10,2) DEFAULT 0.00,
    updated_at   TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    PRIMARY KEY (stat_date, platform_id)
) ENGINE=InnoDB;
```
```php
// Hyperf Crontab: refresh today's stats at 5 minutes past every hour
#[Crontab(rule: "5 * * * *", name: "AggregateOrderStats")]
class AggregateOrderStatsTask extends AbstractTask
{
    public function execute(): void
    {
        Db::statement("
            INSERT INTO order_stats_daily (stat_date, platform_id, total_orders, total_amount, avg_amount)
            SELECT DATE(created_at), platform_id,
                   COUNT(*), SUM(total_amount), AVG(total_amount)
            FROM production_orders
            WHERE created_at >= CURDATE()
              AND deleted_at IS NULL
            GROUP BY DATE(created_at), platform_id
            ON DUPLICATE KEY UPDATE
                total_orders = VALUES(total_orders),
                total_amount = VALUES(total_amount),
                avg_amount = VALUES(avg_amount)
        ");
    }
}
```
Typical uses: dashboards, report exports, trend analysis. Detail queries still hit the source tables.
## MySQL Table Partitioning (Millions of Rows+)
For tables past a few million rows with a clear time dimension, consider RANGE partitioning:
```sql
-- Monthly partitions (suited to time-series data)
ALTER TABLE operation_logs PARTITION BY RANGE (TO_DAYS(created_at)) (
    PARTITION p202601 VALUES LESS THAN (TO_DAYS('2026-02-01')),
    PARTITION p202602 VALUES LESS THAN (TO_DAYS('2026-03-01')),
    PARTITION p202603 VALUES LESS THAN (TO_DAYS('2026-04-01')),
    PARTITION pmax VALUES LESS THAN MAXVALUE
);
```
**Partitioning prerequisites** (consider it only when all of these hold):
- Row count > 5 million
- Queries almost always filter on a time range
- Efficient archiving/purging of historical data is needed
- The table has no unique-index requirement spanning partitions

**When partitioning is a poor fit**:
- Small tables (partition overhead outweighs the benefit)
- Queries that omit the partition key (scanning all partitions is slower)
- Foreign key constraints are needed across partitions
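Partition rotation can be automated with a scheduled task. A sketch assuming the monthly layout above; the task name, schedule, and hard-coded month are illustrative (real code would compute partition names from the current date):

```php
// Monthly partition maintenance for operation_logs.
// REORGANIZE PARTITION splits pmax so new rows land in a dedicated monthly
// partition; DROP PARTITION removes expired history instantly, with no
// row-by-row DELETE.
#[Crontab(rule: "0 3 1 * *", name: "RotateLogPartitions")]
class RotateLogPartitionsTask extends AbstractTask
{
    public function execute(): void
    {
        // Carve next month's partition out of pmax (example: April 2026)
        Db::statement("
            ALTER TABLE operation_logs REORGANIZE PARTITION pmax INTO (
                PARTITION p202604 VALUES LESS THAN (TO_DAYS('2026-05-01')),
                PARTITION pmax VALUES LESS THAN MAXVALUE
            )
        ");
        // Drop the oldest partition once its retention window has passed
        Db::statement('ALTER TABLE operation_logs DROP PARTITION p202601');
    }
}
```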
## Database Cache Configuration
Cache frequently queried data in Redis to reduce database pressure:
```php
// config/autoload/cache.php
return [
    'default' => [
        'driver' => Hyperf\Cache\Driver\RedisDriver::class,
        'packer' => Hyperf\Codec\Packer\PhpSerializerPacker::class,
        'prefix' => 'cache:',
    ],
];

// Usage examples:
// Annotation-based caching (recommended for simple cases)
#[Cacheable(prefix: 'user', ttl: 3600)]
public function getById(int $id): ?User
{
    return User::find($id);
}

#[CacheEvict(prefix: 'user')]
public function update(int $id, array $data): bool
{
    return User::where('id', $id)->update($data);
}

// Manual caching (for complex cases)
$user = $this->cache->remember("user:{$id}", 3600, fn () => User::find($id));
```
### Cache TTL Strategy
| Data type | TTL | Invalidation |
|---------|-----|---------|
| Dictionaries/config | 7 days | Invalidate on admin change |
| User info | 1 hour | Invalidate on user update |
| Menu tree | 24 hours | Invalidate on menu change |
| List queries | 5 minutes | Bulk-invalidate after writes |
| Statistics/reports | 1 hour | Refreshed by scheduled job |