docs: update protocol docs and clean up stale issue-tracking docs

qzl
2026-03-27 14:05:14 +08:00
parent c592cc7854
commit 47e2aa3eb9
18 changed files with 1499 additions and 778 deletions
@@ -1,50 +0,0 @@
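# Bug: Frontend does not render events from the events endpoint
## Date
- 2026-03-24
## Symptoms
- User report: after the change, the frontend cannot fetch or render events from `/api/v1/agent/runs/{threadId}/events`.
- On the page, the message stream shows no incremental events, or the tool execution status does not update.
## Background
- This change removed dead code paths from the frontend:
  - `ToolRegistry`
  - `RouteNavigationTool`
  - `AiDecisionEngine`
- The current main path is still AG-UI SSE: `AgUiService -> AgUiEvent -> ChatBloc -> HomeChatItemRenderer`
## Impact
- Chat event stream rendering (run status, tool call status, text completion events)
- May affect real-time feedback in the Home chat view
## Initial assessment
- The removed dead code is not on the current main path, so in theory it should not directly prevent SSE events from rendering.
- More likely causes:
  1. The `runId` binding filter drops events (`shouldDispatch` is false)
  2. An exception in the `onEvent` callback stops the stream early
  3. The SSE `data` structure changed and `AgUiEvent.fromJson` fails to parse it
## Key code locations
- `apps/lib/features/chat/data/services/ag_ui_service.dart`
- `apps/lib/features/chat/data/models/ag_ui_event.dart`
- `apps/lib/features/chat/presentation/bloc/chat_bloc.dart`
- `apps/lib/features/home/ui/widgets/home_chat_item_renderer.dart`
## Planned investigation
1. Add temporary diagnostic logging in `_streamEventsFromApi`: `eventType`, `eventRunId`, `expectedRunId`, `shouldDispatch`
2. Catch and log the stack trace thrown by `onEvent` to confirm whether a UI/Bloc handling exception interrupts the stream
3. Capture real SSE frames and check `runId/threadId/type/data` against the parsing model
4. Retest the full `RUN_STARTED -> TOOL_* -> TEXT_MESSAGE_END -> RUN_FINISHED/RUN_ERROR` chain
## Current status
- Status: root cause not yet located
- Priority: high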
@@ -0,0 +1,140 @@
# Repository cache layer abstraction
## Problem description
### Current architecture
```
┌────────────────────────────────────────────┐
│  HybridCacheStore                          │
│  (Memory + Persistent two-level cache)     │
├────────────────────────────────────────────┤
│  CacheEntry<T>                             │
│  (value + fetchedAt timestamp)             │
├────────────────────────────────────────────┤
│  CachePolicy                               │
│  (softTtl / hardTtl / minRefreshInterval)  │
├────────────────────────────────────────────┤
│  CacheInvalidator                          │
│  (unified invalidation management)         │
└────────────────────────────────────────────┘
┌────────────────────────────────────────────┐
│  CalendarRepository                        │ ← duplicated implementation
│  TodoRepository                            │ ← duplicated implementation
│  UsersRepository                           │ ← duplicated implementation
│  ...                                       │
└────────────────────────────────────────────┘
```
### Duplicated logic
| Duplicated logic | Example |
|---------|------|
| Key namespace | `calendar:day:$day`, `todo:list:pending` |
| Cache read logic | `store.read<CacheEntry<...>>(key)` |
| Data conversion | API response → wrapped in `CacheEntry` |
| Refresh logic | `_refreshDayAndRead()` |
| Forced refresh | Handling of the `forceRefresh` parameter |
| Background refresh de-duplication | `_refreshInFlight` map |
### Files involved
- `apps/lib/features/calendar/data/services/calendar_repository.dart`
- `apps/lib/features/todo/data/todo_repository.dart`
- `apps/lib/features/contacts/data/users/users_repository_impl.dart`
- `apps/lib/features/settings/data/services/user_profile_cache_repository.dart`
## Proposed solution
### 1. Extract a `CachedRepository` base class
```dart
abstract class CachedRepository<T, R> {
HybridCacheStore get store;
CacheInvalidator get invalidator;
CachePolicy get policy;
String get namespace; // 'calendar', 'todo', etc.
Future<T> getOrLoad(
String key, {
bool forceRefresh = false,
required Future<R> Function() loader,
});
Future<void> invalidate(String key);
String buildKey(String suffix);
}
```
### 2. Simplified modules
```dart
// CalendarRepository
class CalendarRepository extends CachedRepository<List<ScheduleItemModel>, ScheduleItemModel> {
@override
String get namespace => 'calendar';
Future<List<ScheduleItemModel>> getDayEvents(DateTime date, {bool forceRefresh = false}) {
return getOrLoad(
'day:${_formatDate(date)}',
forceRefresh: forceRefresh,
loader: () => calendarService.getEventsForDay(date),
);
}
String _formatDate(DateTime date) =>
'${date.year}-${date.month.toString().padLeft(2, '0')}-${date.day.toString().padLeft(2, '0')}';
}
// TodoRepository
class TodoRepository extends CachedRepository<List<TodoResponse>, TodoResponse> {
@override
String get namespace => 'todo';
Future<List<TodoResponse>> getPendingTodos({bool forceRefresh = false}) {
return getOrLoad(
'list:pending',
forceRefresh: forceRefresh,
loader: () => api.getPendingTodos(),
);
}
}
```
### 3. Optional: generic cache decorator
```dart
class CachedApiCall<T> {
final HybridCacheStore store;
final CachePolicy policy;
final String key;
final DateTime Function() now;
Future<T> execute(Future<T> Function() loader);
}
```
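## Benefits
| Benefit | Description |
|------|------|
| Less duplicated code | Each Repository drops 60%+ of near-identical logic |
| Consistent cache behavior | Same refresh strategy, key format, and concurrency control |
| Easier maintenance | A bug fix or optimization only needs to change one place |
| Easier testing | The base class can be tested on its own; subclasses inherit it |
## Prerequisites
- The existing `HybridCacheStore`, `CacheEntry`, `CachePolicy`, and `CacheInvalidator` are ready
- No new dependencies are needed
## Status
- [ ] Priority to be evaluated
- [ ] `CachedRepository` base class interface to be designed
- [ ] Pilot on one Repository first
- [ ] Roll out to the other Repositories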
@@ -0,0 +1,90 @@
# AppTheme hard-codes colors and lacks Dark Mode
## Problem description
### 1. Hard-coded colors
`AppTheme` and many components reference the static `AppColors` constants directly instead of `Theme.of(context).colorScheme`:
```dart
// app_theme.dart
appBarTheme: const AppBarTheme(
  backgroundColor: AppColors.background, // hard-coded
  foregroundColor: AppColors.slate900, // hard-coded
),
elevatedButtonTheme: ElevatedButtonThemeData(
  style: ElevatedButton.styleFrom(
    backgroundColor: AppColors.primary, // hard-coded
    foregroundColor: AppColors.primaryForeground,
  ),
),
```
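As a result:
- Colors do not change when the theme switches
- Components cannot respond to the system dark mode
- This violates the Flutter Material Design guidelines
### 2. Missing Dark Mode
`AppTheme` has only a `light` getter and no `dark`: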
```dart
static ThemeData get light => ThemeData(...);
```
`LinksyApp` hard-codes the light theme:
```dart
theme: AppTheme.light,
locale: const Locale('zh'),
```
## Recommended approach
### Colors should come from ThemeData
```dart
// Correct example
appBarTheme: AppBarTheme(
  backgroundColor: Theme.of(context).colorScheme.surface,
  foregroundColor: Theme.of(context).colorScheme.onSurface,
),
// The ColorScheme should be generated as part of ThemeData
colorScheme: ColorScheme.fromSeed(
  seedColor: AppColors.primary,
  brightness: Brightness.light, // or Brightness.dark
),
```
### Support Dark Mode
```dart
class AppTheme {
static ThemeData get light => ThemeData(
brightness: Brightness.light,
colorScheme: ColorScheme.fromSeed(
seedColor: AppColors.primary,
brightness: Brightness.light,
),
);
static ThemeData get dark => ThemeData(
brightness: Brightness.dark,
colorScheme: ColorScheme.fromSeed(
seedColor: AppColors.primary,
brightness: Brightness.dark,
),
);
}
```
## Related files
- `apps/lib/core/theme/app_theme.dart`
- `apps/lib/core/theme/design_tokens.dart`
## Fix priority
**Low** - only light mode exists today, so functionality is not affected
@@ -0,0 +1,43 @@
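# Legacy AuthSessionBootstrapper code should be deleted
## File location
`apps/lib/app/startup/auth_session_bootstrapper.dart`
## Problem description
`AuthSessionBootstrapper` is legacy code that syncs calendar events and notification reminders when a user logs in.
### Code issues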
```dart
Future<void> syncForAuthState(AuthState state) async {
if (state is! AuthAuthenticated) {
_syncedUserId = null;
return;
}
// Fetch 180 days of calendar events and rebuild notification reminders
final events = await _calendarService.getEventsForRange(start, end);
await _notificationService.rebuildUpcomingReminders(events);
...
}
```
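1. **Sync logic has already moved** - `CalendarService` and `LocalNotificationService` should manage their own cache lifecycle; no manual trigger at login is needed
2. **Unreliable in-memory cache** - `_syncedUserId` lives only in memory and is lost when the app restarts
3. **Silent failure** - sync failures are swallowed by `catch (_)`, with no logging and no retry
4. **Hard-coded 180 days** - the time range is not read from configuration
## Resolution
**Delete directly**
- Delete `apps/lib/app/startup/auth_session_bootstrapper.dart`
- After confirming there are no call sites, clean up the `startup/` directory (if it is empty)
## Related files
- `apps/lib/app/startup/auth_session_bootstrapper.dart`
## Fix priority
**Low** - no functional impact for now, but it is technical debt that should be cleaned up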
@@ -0,0 +1,49 @@
# LinksyApp is forced to depend on ChatBloc
## Problem description
`LinksyApp` (app.dart), the application root, is forced to inject `ChatBloc` in its `MultiBlocProvider`:
```dart
return MultiBlocProvider(
providers: [
BlocProvider<AuthBloc>.value(value: authBloc),
BlocProvider<ChatBloc>(
create: (_) => ChatBloc(apiClient: sl<IApiClient>()),
),
],
...
);
```
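As a result:
1. A `ChatBloc` instance is created at application startup (wasted memory)
2. `LinksyApp` has to know that a ChatBloc feature exists
3. This violates the single responsibility principle: the root should handle global configuration only and should not know about specific features
## Root cause
`HomeScreen` is the default home page and needs `ChatBloc` internally. So that it can obtain the bloc through `context.read<ChatBloc>()`, the bloc has to be provided at the root.
## Recommended approach
ChatBloc should be injected on demand at the route level: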
```dart
GoRoute(
  path: '/',
  // go_router builders receive both the BuildContext and the GoRouterState.
  builder: (context, state) => BlocProvider(
    create: (_) => ChatBloc(apiClient: sl<IApiClient>()),
    child: const HomeScreen(),
  ),
)
```
## Related files
- `apps/lib/app/app.dart`
- `apps/lib/features/home/presentation/screens/home_screen.dart`
## Fix priority
**Medium** - works functionally but the architecture is unsound; this is technical debt
@@ -0,0 +1,118 @@
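# main.dart is coupled to the auth module
## Problem description
`main.dart` currently depends directly on `AuthBloc` and `AuthStarted`, which violates the dependency inversion principle.
## Current code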
```dart
// main.dart
import 'features/auth/presentation/bloc/auth_bloc.dart';
import 'features/auth/presentation/bloc/auth_event.dart';
void main() async {
// ...
final authBloc = sl<AuthBloc>();
authBloc.add(AuthStarted());
runApp(LinksyApp(authBloc: authBloc));
}
```
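## Problems
1. **Startup logic is coupled to the auth module**
   - main.dart needs to know that `AuthBloc` exists
   - It needs to know about the `AuthStarted` event
   - It has to trigger the startup event manually
2. **AuthBloc is exposed through 3 layers**
```
main.dart → LinksyApp → createAppRouter → redirect()
```
Every layer passes authBloc along, which is inelegant
## Proposed solution
### 1. Move startup logic down into LinksyApp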
```dart
// main.dart
void main() async {
WidgetsFlutterBinding.ensureInitialized();
await configureDependencies();
await AppConstants.init();
runApp(LinksyApp());
}
// app.dart (LinksyApp)
class LinksyApp extends StatelessWidget {
  const LinksyApp({super.key});

  @override
  Widget build(BuildContext context) {
    // `create` runs once, so AuthStarted is not re-dispatched on every rebuild.
    return BlocProvider<AuthBloc>(
      create: (_) => sl<AuthBloc>()..add(AuthStarted()),
      child: // ...
    );
  }
}
```
### 2. Route guard managed inside LinksyApp
```dart
// app.dart
class LinksyApp extends StatelessWidget {
@override
Widget build(BuildContext context) {
return BlocListener<AuthBloc, AuthState>(
listener: (context, state) {
final router = GoRouter.of(context);
if (state is AuthUnauthenticated) {
router.go(AppRoutes.authLogin);
} else if (state is AuthAuthenticated) {
if (router.matchedLocation == AppRoutes.authLogin) {
router.go(AppRoutes.homeMain);
}
}
},
child: MaterialApp.router(
routerConfig: createAppRouter(),
),
);
}
}
```
### 3. createAppRouter no longer needs the authBloc parameter
```dart
// app_router.dart
GoRouter createAppRouter() {
return GoRouter(
    // No redirect callback any more
    // The route guard is handled centrally by the BlocListener in LinksyApp
);
}
```
## Benefits
| Benefit | Description |
|------|------|
| Decoupling | main.dart has no knowledge of AuthBloc |
| Single responsibility | LinksyApp manages state listening and route navigation in one place |
| Easier testing | main.dart no longer needs a mocked AuthBloc |
## Files involved
- `apps/lib/main.dart`
- `apps/lib/app/app.dart`
- `apps/lib/app/router/app_router.dart`
- `apps/lib/features/auth/presentation/bloc/auth_bloc.dart`
## Status
- [ ] To be fixed
@@ -0,0 +1,85 @@
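# SharedPreferences lacks a unified management model
## Problem description
`SharedPreferences` usage is currently scattered across the codebase, with no unified data model to constrain it:
### Current state
1. **Scattered keys**
   - Defined in `reminder_notification_callbacks.dart`: `'calendar_reminder_pending_notification_responses_v1'`
   - String keys are used directly everywhere, which invites typos and collisions
2. **Repeated instance lookups**
```dart
final prefs = await SharedPreferences.getInstance(); // fetched again on every call
```
3. **Scattered serialization logic**
   - `ReminderNotificationCallbacks` handles JSON serialization/deserialization itself
   - Other modules may duplicate the same logic
4. **Registered but not wrapped**
   - `injection.dart` registers only the `SharedPreferences` instance
   - It is not wrapped in a reusable data access layer
## Impact
- Hard to maintain: keys are scattered, so changes require a global search
- Error-prone: typos are hard to spot
- Duplicated code: serialization logic may be reimplemented in several places
- Poor testability: direct dependency on `SharedPreferences.getInstance()`
## Proposed solution
### 1. Create an `AppPreferences` data model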
```dart
class AppPreferences {
static const String _pendingNotificationsKey =
'calendar_reminder_pending_notification_responses_v1';
final SharedPreferences _prefs;
AppPreferences(this._prefs);
List<NotificationResponse> get pendingNotifications {
final list = _prefs.getStringList(_pendingNotificationsKey) ?? [];
return list.map(_decode).toList();
}
Future<void> setPendingNotifications(List<NotificationResponse> value) {
return _prefs.setStringList(_pendingNotificationsKey, value.map(_encode).toList());
}
  // Other preferences...
}
```
### 2. Register in injection.dart
```dart
final sharedPreferences = await SharedPreferences.getInstance();
sl.registerSingleton<AppPreferences>(AppPreferences(sharedPreferences));
```
### 3. Callers access it through the interface
```dart
// Before
final prefs = await SharedPreferences.getInstance();
await prefs.setStringList(key, value);
// After
sl<AppPreferences>().setPendingNotifications(value);
```
## Files involved
- `apps/lib/app/di/injection.dart` - registration logic
- `apps/lib/features/notification/data/services/reminder_notification_callbacks.dart` - main consumer
- `apps/lib/features/notification/data/services/ios_notification_payload_bridge.dart` - another consumer
## Status
- [ ] To be fixed
@@ -0,0 +1,123 @@
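# Responsibilities of the Service and Repository layers are muddled
## Problem description
Services and repositories such as `CalendarService`, `SettingsUserCache`, and `UserProfileCacheRepository` currently have blurry responsibility boundaries, with a lot of duplicated logic and unnecessary wrapping.
## Problem 1: SettingsUserCache should not exist
### Current structure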
```
SettingsUserCache      UserProfileCacheRepository
┌─────────────────┐    ┌─────────────────────────┐
│ - _cachedUser   │ ←→ │ - HybridCacheStore      │
│ - getProfile()  │    │ - CachePolicy           │
│ - set()         │    │ - getProfile()          │
│ - invalidate()  │    │ - setCached()           │
└─────────────────┘    └─────────────────────────┘
```
### Problems
- `SettingsUserCache` merely wraps `UserProfileCacheRepository` in an in-memory cache
- Their `getProfile()` and `invalidate()` logic is almost identical
- This is redundant wrapping, and the two should be merged
## Problem 2: Duplicated Repository cache logic
### Files involved
- `apps/lib/features/calendar/data/services/calendar_repository.dart`
- `apps/lib/features/settings/data/services/user_profile_cache_repository.dart`
- `apps/lib/features/todo/data/todo_repository.dart`
### Code duplication: 90%
```dart
// CalendarRepository
Future<List<ScheduleItemModel>> getDayEvents({bool forceRefresh = false}) async {
if (forceRefresh) return _refreshDayAndRead(...);
final cached = await store.read<CacheEntry<...>>(key);
if (cached == null) return _refreshDayAndRead(...);
final decision = policy.evaluate(now: now(), fetchedAt: cached.fetchedAt);
if (decision.shouldRefreshInBackground) _refreshInBackground();
if (decision.mustBlockForNetwork || !decision.canUseCached) {
return _refreshDayAndRead(...);
}
return cached.value;
}
// UserProfileCacheRepository
Future<UserResponse> getProfile({bool forceRefresh = false}) async {
if (forceRefresh) return _refreshAndRead();
final cached = await store.read<CacheEntry<...>>(cacheKey);
if (cached == null) return _refreshAndRead();
final decision = policy.evaluate(now: now(), fetchedAt: cached.fetchedAt);
if (decision.shouldRefreshInBackground) _refreshInBackground();
if (decision.mustBlockForNetwork || !decision.canUseCached) {
return _refreshAndRead();
}
return cached.value;
}
```
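The read path that both methods share hinges on `policy.evaluate(...)`. The following is a minimal, language-neutral sketch in Python of how such a decision could work. It assumes stale-while-revalidate semantics (cache younger than `softTtl` is fresh, between `softTtl` and `hardTtl` it is served while a background refresh runs, and past `hardTtl` the caller must wait for the network). These semantics, the snake_case names, and the omission of `minRefreshInterval` are assumptions for illustration, not the project's actual `CachePolicy`.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class CacheDecision:
    """Mirrors the three flags read by the repositories above."""

    can_use_cached: bool
    should_refresh_in_background: bool
    must_block_for_network: bool


@dataclass(frozen=True)
class CachePolicy:
    """Hypothetical policy: only soft and hard TTLs are modeled here."""

    soft_ttl: timedelta
    hard_ttl: timedelta

    def evaluate(self, *, now: datetime, fetched_at: datetime) -> CacheDecision:
        age = now - fetched_at
        if age < self.soft_ttl:
            # Fresh: serve from cache, no refresh needed.
            return CacheDecision(True, False, False)
        if age < self.hard_ttl:
            # Stale but usable: serve from cache and refresh in the background.
            return CacheDecision(True, True, False)
        # Expired: the caller has to wait for the network.
        return CacheDecision(False, False, True)


policy = CachePolicy(soft_ttl=timedelta(minutes=5), hard_ttl=timedelta(hours=1))
fetched_at = datetime(2026, 3, 27, 12, 0, 0)
stale = policy.evaluate(now=fetched_at + timedelta(minutes=10), fetched_at=fetched_at)
```

Because the decision is a pure function of `now` and `fetched_at`, a shared base class could own this branch once and let each Repository supply only its key and loader.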
## Problem 3: Unnecessary lazy initialization in CalendarService
```dart
class CalendarService {
  CalendarApi? _calendarApi;
  // Why lazy-load? `??=` keeps the getter non-nullable under null safety.
  CalendarApi get _api => _calendarApi ??= CalendarApi(_apiClient);
}
```
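`IApiClient` is already passed in, yet the API object is still created lazily, which is unnecessary.
## Problem 4: Unclear layering
| Class | Type | Issue |
|------|------|------|
| `CalendarService` | Service | Depends on the Repository; should be named Repository |
| `UserProfileCacheRepository` | Repository | Name contains Cache, but every Repository has caching |
| `SettingsUserCache` | ??? | In-memory cache layer; should not exist on its own |
| `TodoRepository` | Repository | Correct |
## Target design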
```
Repository layer (pure data + caching)
├── CalendarRepository ← extends CachedRepository
├── UserProfileRepository ← extends CachedRepository
└── TodoRepository ← extends CachedRepository
Service layer (business logic + cross-Repository orchestration)
├── CalendarService ← business orchestration only, no direct API calls
├── NotificationService ← cross-module notification logic
└── ReminderActionExecutor ← cross-module reminder execution
```
## Fix steps
1. **Delete** `SettingsUserCache` and merge it into `UserProfileCacheRepository`
2. **Extract** a `CachedRepository` base class (see `docs/todo/2026-03-27-repository缓存抽象.md`)
3. **Simplify** `CalendarService` by removing the unnecessary lazy loading
4. **Unify naming**
   - Repositories with caching all extend the base class
   - Services only orchestrate business logic and do not handle caching
## Files involved
- `apps/lib/features/calendar/data/services/calendar_service.dart`
- `apps/lib/features/calendar/data/services/calendar_repository.dart`
- `apps/lib/features/settings/data/services/settings_user_cache.dart`
- `apps/lib/features/settings/data/services/user_profile_cache_repository.dart`
- `apps/lib/features/todo/data/todo_repository.dart`
## Status
- [ ] To be fixed
@@ -0,0 +1,34 @@
# Confusing route semantics: root path `/` is defined as the login page
## Problem description
In `app_routes.dart`, the root path `/` is defined as the login page:
```dart
static const authBoot = '/boot';
static const authLogin = '/'; // the root path is the login page
static const homeMain = '/home'; // while the home page sits at /home
```
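As a result:
- It breaks the intuitive expectation that `/` points to the home page
- The root path cannot host the real home page content
- It is semantically inconsistent with `homeMain = '/home'`
## Recommended approach
The root path `/` should be reserved for the home page, and the login page should use its own path such as `/login`: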
```dart
static const authLogin = '/login';
static const homeMain = '/';
```
## Related files
- `apps/lib/app/router/app_routes.dart`
- `apps/lib/app/router/app_router.dart`
## Fix priority
**Low** - works functionally; this is a legacy design issue
@@ -0,0 +1,106 @@
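# Route guard logic is scattered
## Problem description
Route guard logic currently lives in two places, which may lead to inconsistent decisions:
1. `redirect()` in `app_router.dart` - the core guard logic
2. `BlocListener` in `LinksyApp` - a placeholder that is reserved but unused
## Current code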
```dart
// app_router.dart
GoRouter createAppRouter(AuthBloc authBloc) {
return GoRouter(
refreshListenable: GoRouterRefreshStream(authBloc.stream),
redirect: (context, state) {
final authState = authBloc.state;
final isAuthenticated = authState is AuthAuthenticated;
// ... guard decision logic
},
);
}
// app.dart (LinksyApp)
BlocListener<AuthBloc, AuthState>(
listener: (context, state) {
// Handle auth state changes if needed ← reserved but unused
},
)
```
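## Problems
| Problem | Description |
|------|------|
| Scattered logic | The guard lives in `redirect()`, but the BlocListener reserves a second spot |
| Latent risk | Someone may later add logic in both places, causing inconsistency |
| Unclear ownership | It is unclear whether redirect or the BlocListener owns navigation |
## Proposed solution
**Option 1: Centralize the route guard in redirect() (the current approach; keep it but clean up)**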
```dart
// app_router.dart
GoRouter createAppRouter() {
return GoRouter(
refreshListenable: GoRouterRefreshStream(sl<AuthBloc>().stream),
redirect: (context, state) {
// The only guard logic
},
);
}
// LinksyApp - side effects only, no route navigation
BlocListener<AuthBloc, AuthState>(
listener: (context, state) {
// Analytics, toasts, and other side effects
},
)
```
**Option 2: Centralize the route guard in BlocListener**
```dart
// app_router.dart - no redirect any more
GoRouter createAppRouter() {
return GoRouter(
routes: [...],
);
}
// LinksyApp - the only route guard entry point
BlocListener<AuthBloc, AuthState>(
listener: (context, state) {
final router = GoRouter.of(context);
if (state is AuthUnauthenticated) {
if (!isPublicRoute(router.matchedLocation)) {
router.go(AppRoutes.authLogin);
}
} else if (state is AuthAuthenticated) {
if (router.matchedLocation == AppRoutes.authLogin) {
router.go(AppRoutes.homeMain);
}
}
},
)
```
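## Benefits
| Benefit | Description |
|------|------|
| Single responsibility | Navigation decisions are made in one place only |
| Maintainable | Nobody will add logic in the other place by mistake later |
| Clear | Developers know where to change the guard logic |
## Files involved
- `apps/lib/app/app.dart`
- `apps/lib/app/router/app_router.dart`
## Status
- [ ] To be fixed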
@@ -1,470 +0,0 @@
# Agent Run Cancel (Failed Semantics) Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
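**Goal:** Add interruption support to `/api/v1/agent/runs`: once the user triggers cancel, actually stop the running agent flow, end it with `RUN_ERROR(code=RUN_CANCELED)`, and finally persist the session status as `failed`.
**Architecture:** Use a "cooperative cancel + main task interruption" approach: the API layer writes a cancel signal to Redis; inside the worker process the runtime runs a parallel watcher that listens for the signal; on a hit it first calls the active agent's `interrupt()` for a graceful wind-down, then calls `cancel()` on the current run's main task as a hard fallback. The terminal state is persisted uniformly through the `RUN_ERROR` event, reusing the existing `FAILED` session semantics and avoiding a database enum migration.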
**Tech Stack:** FastAPI, TaskIQ, Redis, AgentScope, SQLAlchemy, Flutter, Pytest, Ruff, BasedPyright
---
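### Task 1: Update the protocol docs first (endpoint and event semantics)
**Files:**
- Modify: `docs/protocols/agent/api-endpoints.md`
- Modify: `docs/protocols/agent/sse-events.md`
**Step 1: Add the cancel endpoint contract to the API doc**
Add to the endpoint list in `api-endpoints.md`:
```md
| POST | `/runs/{thread_id}/cancel` | Request cancellation of the specified run |
```
And add a new section that explains:
- Request parameters: `thread_id` + `runId` (query parameter recommended)
- Response: `202 Accepted` + `accepted: true`
- Semantics: only means "the cancel request was received"; it does not guarantee the run has already terminated
**Step 2: Add cancel terminal-state semantics to the SSE doc**
Add to the `RUN_ERROR` section of `sse-events.md`:
```json
{
  "type": "RUN_ERROR",
  "threadId": "...",
  "runId": "...",
  "message": "run canceled by user",
  "code": "RUN_CANCELED"
}
```
And state explicitly:
- `RUN_CANCELED` means the user interrupted the run deliberately; it is not a system error
- At this stage the session still reuses `failed` (backward compatible)
**Step 3: Self-review the docs**
Check that the docs cover all of:
- HTTP behavior
- SSE terminal event
- Compatibility strategy (no new session status introduced)
**Step 4: Commit the doc changes**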
```bash
git add docs/protocols/agent/api-endpoints.md docs/protocols/agent/sse-events.md
git commit -m "docs: define agent run cancel API and RUN_CANCELED error semantics"
```
### Task 2: Wire cancel signal writing from the API layer to the queue layer
**Files:**
- Modify: `backend/src/v1/agent/schemas.py`
- Modify: `backend/src/v1/agent/dependencies.py`
- Modify: `backend/src/v1/agent/service.py`
- Modify: `backend/src/v1/agent/router.py`
- Test: `backend/tests/unit/v1/agent/test_service.py`
- Test: `backend/tests/integration/v1/agent/test_routes.py`
**Step 1: Add the response schema for the cancel endpoint**
Add to `v1/agent/schemas.py`:
```python
class CancelRunResponse(BaseModel):
    model_config = ConfigDict(populate_by_name=True, serialize_by_alias=True)
    thread_id: str = Field(alias="threadId")
    run_id: str = Field(alias="runId")
    accepted: bool
```
**Step 2: Extend the queue protocol interface**
Add to `QueueClientLike`:
```python
async def request_cancel(self, *, thread_id: str, run_id: str, requested_by: str) -> None: ...
```
**Step 3: Implement request_cancel in `TaskiqQueueClient`**
Add in `v1/agent/dependencies.py`:
- Cancel key convention: `agent:cancel:{thread_id}:{run_id}`
- Write the cancel signal with `SET key value EX <ttl>`
- `value` may be a JSON string (containing user_id/timestamp)
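
The three bullets in Step 3 can be sketched as follows. This is a minimal sketch, assuming a redis-py style async client whose `set(name, value, ex=...)` call stores a value with a TTL. `cancel_key`, `InMemoryRedis`, and the 300-second TTL are hypothetical names and values chosen for illustration; the in-memory stand-in exists only so the snippet runs without a Redis server.

```python
from __future__ import annotations

import asyncio
import json
import time


def cancel_key(thread_id: str, run_id: str) -> str:
    """Build the cancel key in the `agent:cancel:{thread_id}:{run_id}` format."""
    return f"agent:cancel:{thread_id}:{run_id}"


class InMemoryRedis:
    """Hypothetical stand-in exposing only the `set`/`exists` calls used here."""

    def __init__(self) -> None:
        self.data: dict[str, tuple[str, int | None]] = {}

    async def set(self, name: str, value: str, ex: int | None = None) -> bool:
        # A real server would expire the key after `ex` seconds; here it is only recorded.
        self.data[name] = (value, ex)
        return True

    async def exists(self, name: str) -> int:
        return int(name in self.data)


async def request_cancel(
    redis,
    *,
    thread_id: str,
    run_id: str,
    requested_by: str,
    ttl_seconds: int = 300,  # assumed TTL, not taken from the plan
) -> None:
    """Write the cancel signal as a JSON payload with a TTL."""
    payload = json.dumps({"user_id": requested_by, "timestamp": time.time()})
    await redis.set(cancel_key(thread_id, run_id), payload, ex=ttl_seconds)


async def _demo() -> tuple[int, dict]:
    redis = InMemoryRedis()
    await request_cancel(redis, thread_id="t1", run_id="r1", requested_by="u1")
    key = cancel_key("t1", "r1")
    return await redis.exists(key), json.loads(redis.data[key][0])


found, stored = asyncio.run(_demo())
```

Keeping the key builder in one function means the writer here and the reader in the task layer cannot drift apart on the key format.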
**Step 4: Add cancel_run to the service layer**
Add a method in `v1/agent/service.py` that:
- Validates the session owner (reusing `get_session_owner + ensure_session_owner`)
- Calls `self._queue.request_cancel(...)`
- Returns the `accepted` result DTO
**Step 5: Add the cancel route to the router**
Add in `v1/agent/router.py`:
```python
@router.post("/runs/{thread_id}/cancel", response_model=CancelRunResponse, status_code=202)
async def cancel_run(...):
    ...
```
Constraints:
- `runId` is required (query parameter recommended)
- A non-owner gets 403
- Invalid parameters get 422
**Step 6: Write service unit tests (red first)**
Add to `test_service.py`:
- The owner can request cancel, and `queue.request_cancel` is called
- A cancel from a non-owner returns 403
**Step 7: Run the unit tests and confirm they fail**
Run: `uv run pytest backend/tests/unit/v1/agent/test_service.py -k cancel -q`
Expected: at least 1 test fails (the new logic is not implemented yet)
**Step 8: Implement the minimal code to make the tests pass**
Complete the implementation per Steps 1-5 and avoid extra refactoring.
**Step 9: Run the tests to verify they pass**
Run: `uv run pytest backend/tests/unit/v1/agent/test_service.py -k cancel -q`
Expected: PASS
**Step 10: Add route integration tests**
Add to `test_routes.py`:
- `POST /api/v1/agent/runs/{thread_id}/cancel?runId=...` returns 202
- Response field aliases are correct (`threadId/runId/accepted`)
**Step 11: Run the route tests**
Run: `uv run pytest backend/tests/integration/v1/agent/test_routes.py -k cancel -q`
Expected: PASS
**Step 12: Commit the API layer changes**
```bash
git add backend/src/v1/agent/schemas.py backend/src/v1/agent/dependencies.py backend/src/v1/agent/service.py backend/src/v1/agent/router.py backend/tests/unit/v1/agent/test_service.py backend/tests/integration/v1/agent/test_routes.py
git commit -m "feat: add agent run cancel endpoint and Redis cancel signal"
```
### Task 3: Add a cancel watcher and graceful interruption to the runtime runner
**Files:**
- Modify: `backend/src/core/agentscope/runtime/runner.py`
- Test: `backend/tests/unit/core/agentscope/runtime/test_runner.py`
**Step 1: Add a cancel_checker parameter to runner.execute**
Update the `execute()` signature:
```python
cancel_checker: Callable[[], Awaitable[bool]] | None = None
```
Keep the default `None` for backward compatibility.
**Step 2: Add an active agent reference and lock**
Add to `AgentScopeRunner.__init__`:
- `self._active_agent: JsonReActAgent | None = None`
- `self._active_agent_lock = asyncio.Lock()`
**Step 3: Manage the active agent lifecycle in `_run_worker_stage`**
Wrap the `agent.reply_json(...)` call:
- before: record `self._active_agent = agent`
- finally: clear the reference
**Step 4: Add a `_watch_cancel_signal` coroutine**
Behavior:
- Call `cancel_checker()` in a loop
- On a hit, first try `await active_agent.interrupt()`
- Then call `run_task.cancel("run canceled by user")`
- Poll interval: `await asyncio.sleep(0.2)`
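
The watcher behavior listed in Step 4 can be sketched with plain `asyncio`. This is a minimal sketch under stated assumptions: `watch_cancel_signal` is written as a free function rather than a runner method, `interrupt` stands in for the active agent's `interrupt()`, and the demo uses a separate long-running task instead of the real `execute()` body.

```python
from __future__ import annotations

import asyncio
from collections.abc import Awaitable, Callable


async def watch_cancel_signal(
    run_task: asyncio.Task,
    cancel_checker: Callable[[], Awaitable[bool]],
    interrupt: Callable[[], Awaitable[None]] | None = None,
    poll_interval: float = 0.2,
) -> None:
    """Poll the cancel signal; on a hit, interrupt gracefully, then hard-cancel."""
    while not run_task.done():
        if await cancel_checker():
            if interrupt is not None:
                try:
                    await interrupt()
                except Exception:
                    # Graceful interruption is best-effort; the cancel below is the fallback.
                    pass
            run_task.cancel("run canceled by user")
            return
        await asyncio.sleep(poll_interval)


async def demo() -> tuple[str, bool]:
    state = {"cancel": False, "interrupted": False}

    async def checker() -> bool:
        return state["cancel"]

    async def interrupt() -> None:
        state["interrupted"] = True

    async def long_run() -> None:
        await asyncio.sleep(10)  # stands in for the agent run

    run_task = asyncio.create_task(long_run())
    watcher = asyncio.create_task(
        watch_cancel_signal(run_task, checker, interrupt, poll_interval=0.01)
    )
    await asyncio.sleep(0.03)
    state["cancel"] = True  # simulate the Redis key appearing
    try:
        await run_task
        outcome = "finished"
    except asyncio.CancelledError:
        outcome = "canceled"
    await watcher
    return outcome, state["interrupted"]


outcome, interrupted = asyncio.run(demo())
```

The loop also exits when the run finishes on its own, so the watcher does not outlive the task it observes.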
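**Step 5: Start and stop the watcher in execute**
- `run_task = asyncio.current_task()`
- If `cancel_checker` is provided, call `create_task(_watch_cancel_signal(...))`
- In `finally`, stop the watcher and `await` it to clean up
**Step 6: Add a cancel gate at the stage boundary (critical)**
After the router finishes and before the worker starts, check `cancel_checker()` once:
- If it returns true, raise `asyncio.CancelledError`
Purpose: prevent the case where "the router has finished but the worker still starts".
**Step 7: Write runner unit tests (red first)**
New test cases:
- After the cancel signal fires, `execute` raises `CancelledError`
- The worker does not continue executing (or is interrupted midway)
**Step 8: Run the runner tests**
Run: `uv run pytest backend/tests/unit/core/agentscope/runtime/test_runner.py -k cancel -q`
Expected: PASS
**Step 9: Commit the runner changes**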
```bash
git add backend/src/core/agentscope/runtime/runner.py backend/tests/unit/core/agentscope/runtime/test_runner.py
git commit -m "feat: add cooperative cancellation watcher to agentscope runner"
```
### Task 4: Handle the CancelledError terminal state in the orchestrator and task worker
**Files:**
- Modify: `backend/src/core/agentscope/runtime/orchestrator.py`
- Modify: `backend/src/core/agentscope/runtime/tasks.py`
- Test: `backend/tests/unit/core/agentscope/runtime/test_orchestrator.py`
- Test: `backend/tests/unit/core/agentscope/runtime/test_tasks.py`
**Step 1: Catch CancelledError separately in the orchestrator**
Add to `orchestrator.run()`:
```python
except asyncio.CancelledError:
    await self._pipeline.emit(... RUN_ERROR code="RUN_CANCELED" ...)
    raise
```
Keep the existing `except Exception` to handle system errors.
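
A minimal sketch of the two-branch handling described in Step 1. Since Python 3.8 `asyncio.CancelledError` derives from `BaseException`, so an existing `except Exception` does not catch it and a dedicated branch is needed. `run_with_cancel_semantics`, `body`, and `emit` are hypothetical stand-ins for `orchestrator.run()`, the run body, and `self._pipeline.emit`.

```python
from __future__ import annotations

import asyncio
from collections.abc import Awaitable, Callable


async def run_with_cancel_semantics(
    body: Callable[[], Awaitable[None]],
    emit: Callable[[dict], Awaitable[None]],
) -> None:
    """Emit exactly one terminal event, then re-raise on failure or cancel."""
    try:
        await body()
    except asyncio.CancelledError:
        # User-initiated cancel: reported through RUN_ERROR with a dedicated code.
        await emit(
            {"type": "RUN_ERROR", "message": "run canceled by user", "code": "RUN_CANCELED"}
        )
        raise
    except Exception as exc:
        # System error: the existing branch, without the cancel code.
        await emit({"type": "RUN_ERROR", "message": str(exc)})
        raise
    else:
        await emit({"type": "RUN_FINISHED"})


async def _demo() -> list[dict]:
    events: list[dict] = []

    async def emit(event: dict) -> None:
        events.append(event)

    async def canceled_body() -> None:
        raise asyncio.CancelledError("run canceled by user")

    try:
        await run_with_cancel_semantics(canceled_body, emit)
    except asyncio.CancelledError:
        pass  # the demo swallows the re-raised cancel to inspect the events
    return events


events = asyncio.run(_demo())
```

Putting `RUN_FINISHED` in the `else` branch keeps a canceled run from emitting a second terminal event.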
**Step 2: Build cancel_checker in the task layer and inject it into runtime.run**
In `tasks.py`:
- Build the key: `agent:cancel:{thread_id}:{run_id}`
- Define `async def cancel_checker() -> bool: return bool(await redis.exists(key))`
- Call `runtime.run(..., cancel_checker=cancel_checker)`
**Step 3: Add resource cleanup in the task layer**
In the `finally` block of `run_agentscope_task`:
- Delete the cancel key or shorten its TTL
- Log (necessary fields only)
**Step 4: Write orchestrator unit tests (red first)**
Verify:
- On `CancelledError`, a `RUN_ERROR` is emitted with `code == "RUN_CANCELED"`
**Step 5: Write tasks unit tests (red first)**
Verify:
- The `cancel_checker` received by the runtime works
- When the key is present, the path that raises `CancelledError` upward holds
**Step 6: Run the tests**
Run: `uv run pytest backend/tests/unit/core/agentscope/runtime/test_orchestrator.py backend/tests/unit/core/agentscope/runtime/test_tasks.py -k cancel -q`
Expected: PASS
**Step 7: Commit the runtime orchestration changes**
```bash
git add backend/src/core/agentscope/runtime/orchestrator.py backend/src/core/agentscope/runtime/tasks.py backend/tests/unit/core/agentscope/runtime/test_orchestrator.py backend/tests/unit/core/agentscope/runtime/test_tasks.py
git commit -m "fix: emit RUN_CANCELED error when run task is interrupted"
```
### Task 5: Regression for event stream and persistence consistency
**Files:**
- Modify: `backend/tests/unit/core/agentscope/events/test_store.py`
- Modify: `backend/tests/unit/core/agentscope/events/test_agui_codec.py`
- Modify: `backend/tests/integration/v1/agent/test_sse_flow_live.py`
**Step 1: Add event store behavior tests**
New assertion:
- When `RUN_ERROR` has `code=RUN_CANCELED`, the session status is still `FAILED`
**Step 2: Add codec tests**
New assertion:
- The `code` field of `RUN_ERROR` is passed through to the wire event correctly
**Step 3: Add an SSE integration test**
Scenario:
- Trigger `/runs`
- Trigger `/runs/{thread_id}/cancel`
- The SSE stream ends with `RUN_ERROR(code=RUN_CANCELED)`
**Step 4: Run the event-related tests**
Run: `uv run pytest backend/tests/unit/core/agentscope/events/test_store.py backend/tests/unit/core/agentscope/events/test_agui_codec.py backend/tests/integration/v1/agent/test_sse_flow_live.py -k "cancel or run_error" -q`
Expected: PASS
**Step 5: Commit the event layer changes**
```bash
git add backend/tests/unit/core/agentscope/events/test_store.py backend/tests/unit/core/agentscope/events/test_agui_codec.py backend/tests/integration/v1/agent/test_sse_flow_live.py
git commit -m "test: cover RUN_CANCELED propagation across store codec and SSE"
```
### Task 6: Full verification and pre-release checks
**Files:**
- Modify: `docs/protocols/agent/api-endpoints.md` (if final fields need to be added)
- Modify: `docs/protocols/agent/sse-events.md` (if final fields need to be added)
**Step 1: Run the affected unit test suites**
Run:
```bash
uv run pytest backend/tests/unit/v1/agent/test_service.py backend/tests/unit/core/agentscope/runtime/test_runner.py backend/tests/unit/core/agentscope/runtime/test_orchestrator.py backend/tests/unit/core/agentscope/runtime/test_tasks.py backend/tests/unit/core/agentscope/events/test_store.py backend/tests/unit/core/agentscope/events/test_agui_codec.py -q
```
Expected: PASS
**Step 2: Run the affected integration test suites**
Run:
```bash
uv run pytest backend/tests/integration/v1/agent/test_routes.py backend/tests/integration/v1/agent/test_sse_flow_live.py -q
```
Expected: PASS
**Step 3: Run static checks**
Run:
```bash
uv run ruff check backend/src backend/tests
uv run basedpyright
```
Expected: PASS (no new lint/type errors)
**Step 4: Manual verification path**
Manual flow:
- Start `/runs`
- Immediately call `/runs/{thread_id}/cancel?runId=...`
- Watch the SSE stream: it should end with `RUN_ERROR(code=RUN_CANCELED)`
- Check the session: `status=failed`
**Step 5: Final commit**
```bash
git add docs/protocols/agent/api-endpoints.md docs/protocols/agent/sse-events.md backend/src/v1/agent/*.py backend/src/core/agentscope/runtime/*.py backend/tests/unit/core/agentscope/runtime/*.py backend/tests/unit/core/agentscope/events/*.py backend/tests/unit/v1/agent/test_service.py backend/tests/integration/v1/agent/test_routes.py backend/tests/integration/v1/agent/test_sse_flow_live.py
git commit -m "feat: support run cancellation with RUN_CANCELED failed semantics"
```
---
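## Risks and rollback
- Risk 1: a cancel key matches by mistake and interrupts the wrong run
  - Mitigation: key granularity is `thread_id + run_id`, with a TTL
- Risk 2: duplicate terminal events on interruption
  - Mitigation: the orchestrator guarantees that CancelledError only takes the `RUN_ERROR` branch and never goes on to emit `RUN_FINISHED`
- Risk 3: Redis polling load rises under high concurrency
  - Mitigation: 200ms polling interval; evaluate a switch to pub/sub later based on concurrency
Rollback strategy:
- Roll back the new cancel interface in `router/service/dependencies`
- Roll back the cancel injection logic in `runner/orchestrator/tasks`
- Leave the original `POST /runs` and SSE flow unchanged
### Task 7: Frontend integration with the cancel API (the "stop generating" button shown after sending goes through real backend cancellation)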
**Files:**
- Modify: `apps/lib/features/chat/data/services/ag_ui_service.dart`
- Modify: `apps/lib/features/chat/presentation/bloc/chat_bloc.dart`
- Modify: `apps/lib/features/chat/data/models/ag_ui_event.dart`
- Modify: `apps/lib/features/home/ui/screens/home_screen_interactions.dart`
- Test: `apps/test/features/chat/data/services/ag_ui_service_test.dart`
- Test: `apps/test/features/chat/presentation/chat_bloc_attachment_sync_test.dart`
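**Step 1: Track the current run identifiers in AgUiService**
Add fields to `AgUiService`:
- `_activeThreadIdForRun: String?`
- `_activeRunId: String?`
Set both fields once `sendMessage` successfully receives the `/runs` response; clear them after the terminal event (`RUN_FINISHED` / `RUN_ERROR`) for the target run arrives.
**Step 2: Upgrade cancelCurrentRun from "close SSE only" to "call backend cancel first, then close the local stream"**
Change `AgUiService.cancelCurrentRun()` to:
1. If `_activeThreadIdForRun` or `_activeRunId` is empty: fall back to the current behavior (close SSE only)
2. Otherwise, first call:
```text
POST /api/v1/agent/runs/{threadId}/cancel?runId={runId}
```
3. After the request succeeds, run `_cancelActiveSseSubscription()` (to avoid holding on to the local connection)
4. Whether or not the backend takes effect immediately, clear the local active run fields to prevent duplicate cancels
Note: this step is what actually connects the "stop button after sending a message" to the backend cancel capability.
**Step 3: Refine error semantics (friendlier frontend display)**
When `chat_bloc.dart` handles `RunErrorEvent`:
- If `errorEvent.code == 'RUN_CANCELED'`, do not show the error text as a failure (leave it empty or show "已停止生成", i.e. "generation stopped")
- Still run `_resetRunState` and finish the tool cards to keep the UI consistent
**Step 4: Keep the existing button entry point; do not change the interaction path**
`_onStopGenerating -> _chatBloc.cancelCurrentRun()` in `home_screen_interactions.dart` is already the right entry point; keep using it.
Only adjust the Toast text strategy:
- Request sent: `已请求停止` ("stop requested")
- Received `RUN_ERROR(code=RUN_CANCELED)`: final state `已停止生成` ("generation stopped")
**Step 5: Write AgUiService tests (red first)**
Add to `ag_ui_service_test.dart`:
- `cancelCurrentRun` calls the new endpoint `/api/v1/agent/runs/{threadId}/cancel`
- The query parameters include `runId`
- The current SSE subscription is closed after the call
**Step 6: Write ChatBloc tests (red first)**
Add to `chat_bloc_attachment_sync_test.dart`:
- After receiving `RunErrorEvent(message: 'run canceled by user', code: 'RUN_CANCELED')`:
  - `isWaitingFirstToken/isStreaming/isCancelling` are all reset
  - The ordinary failure text is not shown (or the canceled-state text is shown; assert according to your final copy strategy)
**Step 7: Run the Flutter tests**
Run:
```bash
flutter test apps/test/features/chat/data/services/ag_ui_service_test.dart apps/test/features/chat/presentation/chat_bloc_attachment_sync_test.dart
```
Expected: PASS
**Step 8: Commit the frontend integration**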
```bash
git add apps/lib/features/chat/data/services/ag_ui_service.dart apps/lib/features/chat/presentation/bloc/chat_bloc.dart apps/lib/features/chat/data/models/ag_ui_event.dart apps/lib/features/home/ui/screens/home_screen_interactions.dart apps/test/features/chat/data/services/ag_ui_service_test.dart apps/test/features/chat/presentation/chat_bloc_attachment_sync_test.dart
git commit -m "feat: wire stop-generating button to backend run cancel API"
```
@@ -1,99 +0,0 @@
# Protocols documentation fix plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
**Goal:** Fix errors in the protocols docs that are inconsistent with the backend implementation
**Architecture:** Edit the markdown files under docs/protocols/ directly so that the docs match the actual model definitions in backend/src/models/
**Tech Stack:** Markdown editing
---
## Fix task list
### Task 1: Fix Memories Protocol - remove the nonexistent `agent_id` field
**Files:**
- Modify: `docs/protocols/models/memory.md`
**Changes:**
- Remove the `agent_id` field from the table in the "Database storage" (数据库存储) section
- This field does not exist in the implementation
---
### Task 2: Fix InboxMessages Protocol - add the missing `group_id` field
**Files:**
- Modify: `docs/protocols/models/inbox-messages.md`
**Changes:**
- Add a `group_id: uuid | null` field to the `InboxMessageResponse` data structure
---
### Task 3: Fix ScheduleItems Protocol - document the `permission` bitmask
**Files:**
- Modify: `docs/protocols/calendar/schedule-items.md`
**Changes:**
- In the description of `ScheduleItemResponse`, add the bitmask semantics of the `permission` field:
  - `1` = view
  - `2` = invite
  - `4` = edit
- Add the same note to `ScheduleItemShareRequest`
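
To make the bitmask semantics concrete, here is a minimal Python sketch. It assumes only the three bit values listed above; `SchedulePermission` and `has_permission` are hypothetical names for illustration and are not taken from the backend models.

```python
from enum import IntFlag


class SchedulePermission(IntFlag):
    """Bit values as listed in the task: 1 = view, 2 = invite, 4 = edit."""

    VIEW = 1
    INVITE = 2
    EDIT = 4


def has_permission(mask: int, required: SchedulePermission) -> bool:
    """Return True when every bit in `required` is set in `mask`."""
    return (mask & int(required)) == int(required)


# Combining flags with `|` produces the integer stored in `permission`.
view_and_edit = SchedulePermission.VIEW | SchedulePermission.EDIT
can_edit = has_permission(int(view_and_edit), SchedulePermission.EDIT)
can_invite = has_permission(int(view_and_edit), SchedulePermission.INVITE)
```

Under this reading, a `permission` value of `5` grants view and edit but not invite, and `7` grants all three.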
---
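### Task 4: Fix Friendships Protocol - document internal statuses
**Files:**
- Modify: `docs/protocols/models/friendships.md`
**Changes:**
- In the description of the `status` field of `FriendRequestResponse`, add a note:
  - `blocked` and `declined` are internal implementation statuses
  - They are mapped to `rejected` when returned externally
  - Explain that this is an implementation detail and that clients should handle all enum values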
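
The mapping described above can be sketched as a small lookup. This is a minimal sketch, assuming only the two internal statuses named in the task; `INTERNAL_TO_PUBLIC_STATUS` and `to_public_status` are hypothetical names, and any other status is assumed to pass through unchanged.

```python
# Internal statuses that are collapsed into one public value.
INTERNAL_TO_PUBLIC_STATUS = {
    "blocked": "rejected",
    "declined": "rejected",
}


def to_public_status(internal_status: str) -> str:
    """Map an internal friendship status to the value returned to clients."""
    # Statuses without an explicit mapping are returned as-is (assumption).
    return INTERNAL_TO_PUBLIC_STATUS.get(internal_status, internal_status)


public_for_blocked = to_public_status("blocked")
public_for_declined = to_public_status("declined")
```

Because two internal values collapse into one, a client cannot recover the internal status from `rejected`, which is why the note asks clients to handle every enum value rather than rely on the mapping.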
---
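### Task 5: Fix Memories Protocol - improve the note on the removed `source` column
**Files:**
- Modify: `docs/protocols/models/memory.md`
**Changes:**
- In the table of the "Database storage" (数据库存储) section, clearly mark the `source` column as removed
- Or add a more prominent "removed fields" note below the table
---
### Task 6: Fix Automation Jobs Protocol - add the `bootstrap_key` field
**Files:**
- Modify: `docs/protocols/models/automation-jobs.md`
**Changes:**
- Add a `bootstrap_key: string | null` field description to the "Canonical Fields" table
- Briefly describe its purpose (bootstrap configuration key)
---
## Execution order
1. Task 1 - Memories: remove agent_id
2. Task 2 - InboxMessages: add group_id
3. Task 3 - ScheduleItems: document permission
4. Task 4 - Friendships: document statuses
5. Task 5 - Memories: improve the source note
6. Task 6 - AutomationJobs: add bootstrap_key
---
## Verification
- Manual check: compare the updated docs against the actual model definitions in backend/src/models/
- Make sure every field described in the docs can be found in the corresponding model file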
@@ -1,29 +0,0 @@
# Backend Schemas Restructure Design
**Goal:** Restructure `backend/src/schemas` into clear domain/shared/enums modules while keeping API contracts in `backend/src/v1/*/schemas.py`.
**Architecture:** Move reusable validation models and enums into `schemas/domain`, `schemas/shared`, and `schemas/enums.py`. Keep versioned request/response contracts in `v1/*/schemas.py` and update imports to explicit module paths. Remove legacy aggregate exports and duplicate/empty schema directories.
**Tech Stack:** Python 3.13, Pydantic v2, Ruff, Pytest.
---
## Approved decisions
- Use a one-shot hard cut.
- Keep API contracts in `backend/src/v1/*/schemas.py`.
- Keep `schemas` as reusable constraints only.
- Remove implicit root re-export usage.
## Target structure
- `backend/src/schemas/enums.py`
- `backend/src/schemas/domain/*.py`
- `backend/src/schemas/shared/*.py`
- `backend/src/v1/*/schemas.py` (unchanged naming and ownership)
## Validation gates
- `uv run ruff check ...`
- `uv run pytest ...` for impacted suites
- `./infra/scripts/dev-migrate.sh migrate`
@@ -1,123 +0,0 @@
# Backend Schemas Restructure Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
**Goal:** Hard-cut refactor backend schema modules into clear domain/shared/enums boundaries while keeping API contracts in `v1/*/schemas.py`.
**Architecture:** Introduce `schemas/enums.py` and move reusable schema models into `schemas/domain` and `schemas/shared`. Update all backend imports to explicit new module paths and remove old schema package wrappers and duplicate directories.
**Tech Stack:** Python 3.13, Pydantic v2, Ruff, Pytest, Alembic migration runner.
---
### Task 1: Create new schema module layout
**Files:**
- Create: `backend/src/schemas/enums.py`
- Create: `backend/src/schemas/domain/automation.py`
- Create: `backend/src/schemas/domain/inbox.py`
- Create: `backend/src/schemas/domain/schedule.py`
- Create: `backend/src/schemas/domain/memory.py`
- Create: `backend/src/schemas/domain/memory_content.py`
- Create: `backend/src/schemas/domain/chat_message.py`
- Create: `backend/src/schemas/domain/chat_session.py`
- Create: `backend/src/schemas/domain/todo.py`
- Create: `backend/src/schemas/domain/invite_code.py`
- Create: `backend/src/schemas/shared/user.py`
**Step 1: Write failing import checks**
```python
def test_new_schema_paths_importable() -> None:
    import schemas.domain.automation  # noqa: F401
```
**Step 2: Run test to verify it fails**
Run: `uv run pytest backend/tests/unit -k new_schema_paths_importable -v`
Expected: FAIL with import error
**Step 3: Implement new modules**
Copy and normalize existing reusable models into new modules.
**Step 4: Run test to verify it passes**
Run: `uv run pytest backend/tests/unit -k new_schema_paths_importable -v`
Expected: PASS
### Task 2: Update all backend imports to new schema paths
**Files:**
- Modify: `backend/src/**/*.py` (affected import lines)
**Step 1: Write failing grep assertions**
Run: `uv run python -c "..."` with assertions for old import patterns.
**Step 2: Verify failures with old paths present**
Run: `uv run ruff check backend/src`
**Step 3: Implement import rewrites**
Replace old paths (`schemas.model_enums`, `schemas.automation`, `schemas.memories.memory_content`, etc.) with new explicit modules.
**Step 4: Verify static checks pass**
Run: `uv run ruff check backend/src`
Expected: PASS
### Task 3: Remove legacy schema wrappers and duplicates
**Files:**
- Delete: `backend/src/schemas/model_enums.py`
- Delete: `backend/src/schemas/automation/__init__.py`
- Delete: `backend/src/schemas/inbox/messages.py`
- Delete: `backend/src/schemas/schedule/items.py`
- Delete: `backend/src/schemas/memories/__init__.py`
- Delete: `backend/src/schemas/memories/memory_content.py`
- Delete: `backend/src/schemas/messages/chat_message.py`
- Delete: `backend/src/schemas/messages/__init__.py`
- Delete: `backend/src/schemas/sessions/chat_session.py`
- Delete: `backend/src/schemas/sessions/__init__.py`
- Delete: `backend/src/schemas/todo/contracts.py`
- Delete: `backend/src/schemas/todo/__init__.py`
- Delete: `backend/src/schemas/user/context.py`
- Delete: `backend/src/schemas/user/__init__.py`
- Delete: `backend/src/schemas/inbox/__init__.py`
- Delete: `backend/src/schemas/invite_codes/__init__.py`
- Modify: `backend/src/schemas/__init__.py`
**Step 1: Remove old modules**
Delete legacy wrappers after all imports are rewritten.
**Step 2: Verify no old imports remain**
Run: `uv run python -c "..."` or grep-based assertion commands.
Expected: zero matches
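One possible shape for such a check is sketched below. It is not the elided command from the plan: `LEGACY_MODULES` and `find_legacy_imports` are hypothetical names, and the module list only repeats the three legacy paths named in Task 2, so a real check would need the full list.

```python
import ast

# Legacy module paths named in Task 2; a real check would list all of them.
LEGACY_MODULES = (
    "schemas.model_enums",
    "schemas.automation",
    "schemas.memories.memory_content",
)


def _is_legacy(module: str) -> bool:
    """True when `module` is a legacy path or a submodule of one."""
    return any(module == old or module.startswith(old + ".") for old in LEGACY_MODULES)


def find_legacy_imports(source: str) -> list[str]:
    """Return every legacy module path imported by the given Python source."""
    hits: list[str] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            hits.extend(a.name for a in node.names if _is_legacy(a.name))
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            if _is_legacy(node.module):
                hits.append(node.module)
            else:
                # Catches `from schemas import model_enums` style imports.
                for alias in node.names:
                    candidate = f"{node.module}.{alias.name}"
                    if _is_legacy(candidate):
                        hits.append(candidate)
    return hits
```

Parsing with `ast` instead of a plain text grep avoids false positives from comments and strings, and `schemas.domain.automation` is not flagged because matching is done on whole dotted path segments.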
### Task 4: Verification and migration
**Files:**
- Verify only
**Step 1: Run quality gates**
Run: `uv run ruff check backend/src`
**Step 2: Run impacted tests**
Run: `uv run pytest backend/tests/unit/v1/automation_jobs backend/tests/unit/v1/schedule_items backend/tests/unit/v1/todo backend/tests/unit/v1/friendships backend/tests/unit/v1/inbox_messages backend/tests/unit/v1/users backend/tests/unit/v1/agent backend/tests/unit/core/agentscope`
**Step 3: Run migration script**
Run: `./infra/scripts/dev-migrate.sh migrate`
**Step 4: Commit**
```bash
git add backend/src/schemas backend/src/v1 backend/src/models backend/src/core docs/plans
git commit -m "refactor: restructure backend schema modules by domain boundaries"
```
@@ -0,0 +1,178 @@
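# Apps General Data Collection and Error Logging System Design (Phase 1)
## 1. Background and Goals
The frontend needs observability once the app has been distributed, to support locating production errors and analyzing key behaviors.
Phase 1 focuses on a "minimum viable yet extensible" general collection system, covering:
- Error log collection and troubleshooting support.
- The time the user opens the app and the session duration.
- Page dwell time.
- Chat input duration and number of sends.
This design prioritizes data quality, privacy and security, stability, and compatibility with later refactoring; it is not tied to the current directory structure or to specific implementation files.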
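## 2. Scope
### 2.1 Included in Phase 1
- Global exception collection (framework exceptions, uncaught async exceptions, exceptions explicitly reported by business code).
- Session lifecycle collection (start, end, duration).
- Page lifecycle collection (enter, leave, dwell time).
- Chat input behavior collection (input started, input submitted).
### 2.2 Not included in Phase 1
- Complex tracking systems (impressions, full clickstream tracking, experiment bucketing).
- A full performance metrics system (FPS, memory, detailed jank breakdown).
- Large-scale event expansion for non-critical business domains.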
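## 3. Approach Decision
The main path is "in-house general collection SDK + self-hosted backend ingestion".
Reasons:
- Key data fields must be tightly tied to the business, which requires a high degree of customization.
- Privacy boundaries and redaction policies need to be strictly enforced.
- The Phase 1 goals are clear and limited, so the cost of building in-house is manageable.
- A third-party crash platform can later be added as a supplement without affecting the main pipeline.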
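## 4. Overall Architecture
The system has four layers:
1. Collection layer: provides a unified intake for event sources such as errors, sessions, pages, and input.
2. Processing layer: handles event normalization, context enrichment, redaction, deduplication, and sampling.
3. Storage layer: handles local queue caching, offline persistence, and capacity control.
4. Reporting layer: handles batched transport, retry on failure, backoff, and state awareness.
Core principles:
- All events must go through a single unified entry point.
- Business code does not call the reporting endpoint directly.
- Reporting failures must not affect the user's main flow.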
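## 5. Event Model
### 5.1 Event categories
- error: exceptions and failure events.
- lifecycle: app/page lifecycle events.
- behavior: key user action events.
### 5.2 Phase 1 standard events
- app_session_started
- app_session_ended
- page_view_started
- page_view_ended
- chat_input_started
- chat_input_submitted
Notes:
- The input "count" is aggregated from chat_input_submitted; no separate event is added.
- App "duration" is computed from the start and end events; no fixed-frequency heartbeat is used.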
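### 5.3 Common field specification
Every event includes:
- event_id: unique event identifier.
- event_name: event name.
- event_type: event category.
- event_time: client-side event time.
- session_id: session identifier.
- user_id_hash: hashed user identifier (nullable).
- app_version: app version.
- platform: platform type.
- route: current page identifier (nullable).
- payload: event-specific fields.
Additional fields for error events:
- error_type, error_message, stacktrace (trimmed according to policy).
- severity (warning/error/fatal).
- fingerprint (aggregation key).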
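The field list above can be sketched as a single event envelope. This is an illustrative model only: the `TelemetryEvent` class, the use of epoch seconds for `event_time`, and the way `error_fingerprint` combines the error type with a top stack frame are all assumptions, since the design does not fix a concrete format.

```python
import hashlib
import time
import uuid
from dataclasses import asdict, dataclass, field
from typing import Any, Optional


@dataclass
class TelemetryEvent:
    """Hypothetical envelope carrying the common fields listed above."""

    event_name: str
    event_type: str  # "error" | "lifecycle" | "behavior"
    session_id: str
    app_version: str
    platform: str
    user_id_hash: Optional[str] = None  # nullable per the spec
    route: Optional[str] = None  # nullable per the spec
    payload: dict[str, Any] = field(default_factory=dict)
    event_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    event_time: float = field(default_factory=time.time)  # epoch seconds (assumption)


def error_fingerprint(error_type: str, top_frame: str) -> str:
    """Stable aggregation key; the inputs chosen here are an assumption."""
    digest = hashlib.sha256(f"{error_type}|{top_frame}".encode("utf-8"))
    return digest.hexdigest()[:16]
```

Calling `asdict(event)` yields a plain dictionary that can be serialized for upload, and identical error type and frame pairs produce the same fingerprint, so repeated errors can be grouped.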
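## 6. Key Metric Definitions
### 6.1 App open time and session duration
- Session start: recorded when the app enters the interactive foreground.
- Session end: recorded when the app moves to the background.
- Duration: session_end_time - session_start_time.
- Abnormal-termination compensation: if a session did not end normally, a compensating end record is written on the next launch and flagged as a compensation event.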
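The duration and compensation rules can be sketched as follows. The design does not say which timestamp a compensated end should use; this sketch assumes a persisted `last_seen_ms` value, and `SessionRecord`, `close_session`, and `compensate_on_launch` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SessionRecord:
    """Hypothetical persisted session state."""

    session_id: str
    started_at_ms: int
    last_seen_ms: int  # last time the app was known to be alive (assumption)
    ended_at_ms: Optional[int] = None
    compensated: bool = False

    @property
    def duration_ms(self) -> Optional[int]:
        """session_end_time - session_start_time, or None while still open."""
        if self.ended_at_ms is None:
            return None
        return self.ended_at_ms - self.started_at_ms


def close_session(record: SessionRecord, now_ms: int) -> None:
    """Normal end: the app moved to the background."""
    record.ended_at_ms = now_ms


def compensate_on_launch(record: SessionRecord) -> None:
    """On next launch, close a session that never ended and flag it."""
    if record.ended_at_ms is None:
        record.ended_at_ms = record.last_seen_ms
        record.compensated = True
```

Flagging compensated sessions lets analysis include or exclude them explicitly, since their end time is an estimate.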
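### 6.2 Page dwell time
- Record page_view_started when a page is entered.
- Record page_view_ended with duration_ms when a page is left.
- Page switches are triggered by a unified route observer mechanism so that the definition stays consistent.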
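### 6.3 Chat input duration and count
- Record chat_input_started when the user first enters the input state.
- Record chat_input_submitted when the user submits the input.
- Input duration: submit_time - input_start_time.
- Input count: aggregated from the number of submit events.
- The input text itself is not collected; only its length and duration are.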
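A minimal aggregation sketch for these two metrics is shown below. It assumes events are dictionaries with `event_name` and a millisecond `event_time`, and that a submit without a preceding start still counts toward the total; both are assumptions, and `summarize_chat_input` is a hypothetical name.

```python
from typing import Any, Iterable


def summarize_chat_input(events: Iterable[dict[str, Any]]) -> dict[str, Any]:
    """Aggregate submit count and per-submit input durations from raw events."""
    started_at = None
    durations_ms = []
    submit_count = 0
    for event in sorted(events, key=lambda e: e["event_time"]):
        if event["event_name"] == "chat_input_started":
            # Only the first start before a submit opens the measurement.
            if started_at is None:
                started_at = event["event_time"]
        elif event["event_name"] == "chat_input_submitted":
            submit_count += 1
            if started_at is not None:
                durations_ms.append(event["event_time"] - started_at)
                started_at = None
    return {"submit_count": submit_count, "durations_ms": durations_ms}
```

Keeping the count independent of the duration pairing matches the note that the count comes from submit events alone.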
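## 7. Privacy and Security
The principle of minimal collection and default redaction must be followed:
- Never collect or report: tokens, passwords, phone numbers, email addresses, chat content, or sensitive ID document information.
- Identifier fields are uniformly hashed or anonymized.
- Error messages and stack traces are trimmed by rule to avoid including sensitive context.
- A field whitelist is used; non-whitelisted fields are not reported by default.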
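The whitelist and hashing rules might look like the sketch below. The whitelist contents, the salt handling, and the 200-character limit are placeholders chosen for illustration, and all function names are hypothetical.

```python
import hashlib
from typing import Any

# Hypothetical whitelist; the real field set is a product decision.
ALLOWED_PAYLOAD_FIELDS = frozenset({"duration_ms", "input_length", "route"})
MAX_ERROR_MESSAGE_CHARS = 200  # placeholder limit


def hash_identifier(raw_id: str, salt: str) -> str:
    """One-way hash so that the raw identifier never leaves the device."""
    return hashlib.sha256(f"{salt}:{raw_id}".encode("utf-8")).hexdigest()


def sanitize_payload(payload: dict[str, Any]) -> dict[str, Any]:
    """Keep only whitelisted keys; everything else is dropped by default."""
    return {k: v for k, v in payload.items() if k in ALLOWED_PAYLOAD_FIELDS}


def trim_error_message(message: str, limit: int = MAX_ERROR_MESSAGE_CHARS) -> str:
    """Cap message length; this alone does not guarantee that secrets are removed."""
    return message[:limit]
```

A deny-by-default filter means a newly added payload field is not reported until someone deliberately adds it to the whitelist. Length trimming only bounds exposure, so pattern-based scrubbing would still be needed for messages that may embed sensitive values.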
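Security baseline:
- The collection system itself must not introduce authentication-bypass or sensitive-data-leak risks.
- The reporting channel must support authentication and replay protection.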
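## 8. Stability and Performance Strategy
- Events are written to a local queue and reported asynchronously; the main thread is not blocked.
- Uploads are batched, with limits on per-request payload size and frequency.
- Failures are retried with exponential backoff; once the retry limit is reached, events are dropped and counted in internal statistics.
- The local queue has a capacity limit and uses ring-buffer overwrite or priority-based eviction.
- On weak networks or offline, reporting may be delayed, and events are resent after recovery.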
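The queue and retry rules above can be sketched as follows. This illustrates the ring-overwrite variant only (not priority eviction) and omits jitter; `EventQueue`, `backoff_delays`, and all parameter values are hypothetical.

```python
from collections import deque
from typing import Any


class EventQueue:
    """Bounded FIFO that overwrites the oldest event when full."""

    def __init__(self, capacity: int) -> None:
        self._items: deque = deque(maxlen=capacity)
        self.dropped = 0  # internal statistic for overwritten events

    def push(self, event: Any) -> None:
        if len(self._items) == self._items.maxlen:
            self.dropped += 1
        self._items.append(event)  # deque discards the oldest item itself

    def take_batch(self, max_batch: int) -> list:
        """Remove and return up to `max_batch` events, oldest first."""
        batch = []
        while self._items and len(batch) < max_batch:
            batch.append(self._items.popleft())
        return batch


def backoff_delays(base_s: float, factor: float, max_retries: int, cap_s: float) -> list:
    """Delay before each retry: base * factor**attempt, capped at `cap_s`."""
    return [min(cap_s, base_s * factor**attempt) for attempt in range(max_retries)]
```

A real uploader would typically add random jitter to each delay so that many clients recovering from the same outage do not retry in lockstep.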
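## 9. Quality Assurance and Verification
Phase 1 acceptance focus:
- Correctness: all six standard events are produced reliably with complete fields.
- Consistency: events of the same kind share one definition and are comparable across versions.
- Security: sensitive-field leak detection passes.
- Stability: enabling collection does not affect the main business flows.
Suggested verification dimensions:
- Consistency of session start/end and the duration calculation.
- Page enter/leave pairing rate.
- Closure rate of the input-start-to-submit chain.
- Effectiveness of error event aggregation (readable after fingerprint deduplication).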
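## 10. Evolution Roadmap
Suggested on-demand extensions for Phase 2:
- Add more key business events (keeping the minimal-set principle).
- Introduce a crash platform as a supplementary channel (fatal/crash only).
- Establish unified query and alerting rules (high-frequency errors, key-path failure rates).
- Strengthen version comparison analysis to support release quality regression decisions.
## 11. Decision Summary
Phase 1 adopts a general collection system and ships quickly around "error logs + three categories of key behavior":
- Session: open time and duration.
- Page: dwell time.
- Chat: input duration and submit count.
Because it does not depend on any specific refactored code structure, this design can serve as a stable observability baseline during the upcoming major refactor.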
@@ -0,0 +1,317 @@
# L10n Cleanup + Stable Error Code + Frontend Text Migration Implementation Plan
> **For Claude:** REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
**Goal:** Remove redundant l10n wrappers, introduce backend stable/mappable error codes for HTTP contracts, and continue frontend hardcoded-text localization migration to zh/en with default zh.
**Architecture:** Keep Flutter UI localization in `lib/l10n` as single source of truth, minimize cross-layer localization coupling, and use backend RFC7807 + `code`/`params` as machine-readable contract. Frontend maps `code -> l10n key` for user-facing messages while preserving fallback behavior.
**Tech Stack:** Flutter gen-l10n, FastAPI (RFC7807), Pydantic models, Dio client error mapping, existing AGENTS/rules constraints.
---
### Task 1: Freeze and baseline current behavior
**Files:**
- Modify: none (read-only task)
- Verify: `apps/lib/**`, `backend/src/**`, `docs/protocols/**`
**Step 1: Snapshot app localization status**
Run: `python scripts/count_cn_literals.py` (or equivalent one-off command)
Expected: baseline count and top files with Chinese literals.
**Step 2: Snapshot backend detail-string usage**
Run: `python scripts/count_http_detail_usage.py` (or equivalent one-off command)
Expected: per-file count of `HTTPException(detail=...)` hotspots.
**Step 3: Capture baseline checks**
Run: `cd apps && flutter analyze`
Expected: no new errors, only known existing warnings/infos.
Run: `cd backend && uv run pytest -q` (or a targeted fast suite if the full run is too slow)
Expected: baseline pass/fail recorded for regression comparison.
---
### Task 2: Refactor l10n structure to remove redundant wrapper responsibilities
**Files:**
- Modify: `apps/lib/app/app.dart`
- Modify: UI files currently importing `apps/lib/core/l10n/l10n.dart`
- Delete/Modify: `apps/lib/core/l10n/l10n.dart` (depending on final outcome)
- Verify: generated files under `apps/lib/l10n/`
**Step 1: Define target rule**
Rule:
- UI layer uses `context.l10n`.
- Non-UI layer does not depend on ad-hoc global locale state.
- If non-UI needs localization, pass already-localized strings in from caller or inject mapper service.
**Step 2: Write failing/static guard checks**
Add temporary grep checks:
- Fail when new code adds `L10n.current` in feature/presentation.
- Fail when `core/l10n/l10n.dart` is reintroduced for convenience access.
**Step 3: Replace call sites incrementally**
For each file:
1. Replace `L10n.current.xxx` with `context.l10n.xxx` where `BuildContext` exists.
2. For cubit/service/form validators, inject message providers or pass messages from UI.
3. Keep behavior unchanged.
**Step 4: Remove locale global mutation path**
In `app.dart`:
- Remove `L10n.setLocale(...)` style side effects.
- Keep Flutter-native delegates + `supportedLocales` + default locale logic.
**Step 5: Delete redundant wrapper (if no remaining valid use case)**
Delete `apps/lib/core/l10n/l10n.dart` only after all references are removed and non-UI strategy is in place.
**Step 6: Verify**
Run: `cd apps && flutter gen-l10n && flutter analyze`
Expected: no errors.
---
### Task 3: Define backend stable error code contract (RFC7807 extension)
**Files:**
- Modify: `backend/src/core/http/response.py`
- Modify: `backend/src/app.py`
- Create: `backend/src/core/http/errors.py`
- Modify: `docs/protocols/agent/api-endpoints.md`
- (optional) Create: `docs/protocols/common/error-contract.md`
**Step 1: Extend problem details schema**
Add fields:
- `code: str | None`
- `params: dict[str, str | int | float | bool] | None`
Preserve RFC7807 required fields and media type.
**Step 2: Introduce unified domain error type**
In `core/http/errors.py`, create exception class carrying:
- http status
- stable error code (UPPER_SNAKE_CASE)
- optional params
- optional internal detail
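A minimal sketch of such an exception type is shown below, assuming it also knows how to render itself as a problem document. `DomainError`, `to_problem`, and the validation regex are hypothetical, and wiring it into FastAPI exception handlers is left out.

```python
import re
from typing import Optional, Union

ParamValue = Union[str, int, float, bool]
_CODE_PATTERN = re.compile(r"^[A-Z][A-Z0-9]*(?:_[A-Z0-9]+)*$")


class DomainError(Exception):
    """Hypothetical unified error carrying status, stable code, params, detail."""

    def __init__(
        self,
        status: int,
        code: str,
        params: Optional[dict[str, ParamValue]] = None,
        detail: Optional[str] = None,
    ) -> None:
        if not _CODE_PATTERN.match(code):
            raise ValueError(f"error code must be UPPER_SNAKE_CASE: {code!r}")
        super().__init__(detail or code)
        self.status = status
        self.code = code
        self.params = params or {}
        self.detail = detail

    def to_problem(self, title: str, instance: str) -> dict:
        """Render an RFC7807-style body extended with `code` and `params`."""
        body = {
            "type": "about:blank",
            "title": title,
            "status": self.status,
            "detail": self.detail or title,
            "code": self.code,
            "instance": instance,
        }
        if self.params:
            body["params"] = dict(self.params)
        return body
```

Validating the code format in the constructor turns a naming mistake into an immediate failure at the raise site instead of a silently unmapped code on the frontend.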
**Step 3: Wire global exception handlers**
In `app.py`:
- Convert domain exceptions to problem+json with `code` and `params`.
- Keep fallback for unknown exceptions.
**Step 4: Define code naming convention**
Examples:
- `AUTH_INVALID_TOKEN`
- `AUTH_TOKEN_EXPIRED`
- `SCHEDULE_ITEM_NOT_FOUND`
- `TODO_TITLE_REQUIRED`
- `FRIENDSHIP_ALREADY_EXISTS`
---
### Task 4: Migrate backend hotspots from free-text detail to stable codes
**Files:**
- Modify: `backend/src/v1/friendships/service.py`
- Modify: `backend/src/v1/schedule_items/service.py`
- Modify: `backend/src/v1/todo/service.py`
- Modify: `backend/src/v1/agent/service.py`
- Modify: `backend/src/v1/users/service.py`
- Modify: `backend/src/v1/memories/service.py`
- Modify: `backend/src/v1/auth/gateway.py`
- Modify: `backend/src/v1/agent/router.py`
- Modify: other files with `HTTPException(detail=...)`
**Step 1: Prioritize by impact**
Order:
1. Auth
2. Agent
3. Todo/Schedule/Friendships
4. Users/Memories
**Step 2: Replace throw sites**
For each detail-based throw:
1. Map to stable `code`.
2. Keep detail only as optional server diagnostic text.
3. Add params when useful (e.g., max size, field, limit).
**Step 3: Preserve backwards compatibility window**
During transition:
- Keep `detail` present.
- Add `code`/`params` immediately.
- Frontend prefers `code`, falls back to existing behavior.
---
### Task 5: Frontend network error mapping to l10n via backend code
**Files:**
- Modify: `apps/lib/core/network/api_exception.dart`
- Create: `apps/lib/core/network/error_code_mapper.dart`
- Modify: call sites currently displaying raw backend detail
- Modify: `apps/lib/l10n/app_zh.arb`, `apps/lib/l10n/app_en.arb`
**Step 1: Parse `code` and `params` from response payload**
In `ApiException.fromDioError`:
- Read RFC7807 + extension fields.
- Keep `statusCode` fallback behavior.
**Step 2: Map `code -> localized message`**
Implement central mapper:
- Input: code/status/params
- Output: the resolved l10n key for the user-facing message
**Step 3: Fallback strategy**
Priority:
1. known code mapping
2. status-based generic mapping
3. safe generic fallback (`request failed` localized)
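The three-level priority can be sketched as a small resolver. The app's mapper is planned in Dart; this Python version only illustrates the lookup order, and every l10n key name and table entry below is hypothetical.

```python
from typing import Optional

# Hypothetical tables; real entries would mirror the shared error code registry.
CODE_TO_L10N_KEY = {
    "TODO_TITLE_REQUIRED": "errorTodoTitleRequired",
    "SCHEDULE_ITEM_NOT_FOUND": "errorScheduleItemNotFound",
}
STATUS_TO_L10N_KEY = {
    401: "errorUnauthorized",
    404: "errorNotFound",
}
GENERIC_L10N_KEY = "errorRequestFailed"


def resolve_l10n_key(code: Optional[str], status: Optional[int]) -> str:
    """Pick the l10n key: known code, then status, then generic fallback."""
    if code is not None and code in CODE_TO_L10N_KEY:
        return CODE_TO_L10N_KEY[code]
    if status is not None and status in STATUS_TO_L10N_KEY:
        return STATUS_TO_L10N_KEY[status]
    return GENERIC_L10N_KEY
```

Returning a key instead of a finished string keeps locale selection in the UI layer, which matches the rule that non-UI code should not depend on global locale state.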
**Step 4: Replace UI direct usage of raw server detail**
Audit and update places where `e.toString()` or backend detail is shown directly.
---
### Task 6: Continue frontend hardcoded text migration (remaining files)
**Files:**
- Modify: `apps/lib/features/settings/presentation/screens/*.dart` (remaining high-count files)
- Modify: `apps/lib/features/calendar/presentation/screens/*.dart`
- Modify: `apps/lib/features/calendar/presentation/widgets/*.dart`
- Modify: `apps/lib/l10n/app_zh.arb`, `apps/lib/l10n/app_en.arb`
**Step 1: Batch by screen group**
Batch A: settings deep pages
Batch B: calendar pages
Batch C: shared/home leftovers
**Step 2: Migrate with key hygiene**
Rules:
- key names are feature-prefixed and stable
- dynamic texts use placeholders, not string concatenation
- avoid duplicate semantic keys
**Step 3: After each batch, run verification**
Run:
- `cd apps && flutter gen-l10n`
- `cd apps && flutter analyze`
Track remaining hardcoded-literal count after each batch.
---
### Task 7: Protocol docs and test updates
**Files:**
- Modify: `docs/protocols/agent/api-endpoints.md`
- Modify/Create: `docs/protocols/common/error-contract.md`
- Modify: backend integration/unit tests asserting only `detail`
- Modify: frontend tests around error display/mapping
**Step 1: Document new error response shape**
Example:
```json
{
  "type": "about:blank",
  "title": "Unprocessable Entity",
  "status": 422,
  "detail": "Validation failed",
  "code": "TODO_TITLE_REQUIRED",
  "params": {"field": "title"},
  "instance": "/api/v1/todo"
}
```
**Step 2: Update tests to assert codes first**
Replace brittle text assertions with:
- `status`
- `code`
- optional `params`
---
### Task 8: Final verification gate
**Files:**
- Verify only
**Step 1: Apps verification**
Run:
- `cd apps && flutter gen-l10n`
- `cd apps && flutter analyze`
**Step 2: Backend verification**
Run:
- `cd backend && uv run ruff check .`
- `cd backend && uv run basedpyright`
- `cd backend && uv run pytest -q`
**Step 3: Cross-contract smoke**
Run targeted API checks ensuring error payload includes `code` for representative modules.
---
### Task 9: Rollout and compatibility
**Files:**
- Modify: release notes/changelog if used
**Step 1: Progressive rollout strategy**
- Phase 1: backend emits both `detail` + `code`
- Phase 2: frontend consumes `code` with fallback
- Phase 3: clean up legacy detail-dependent branches
**Step 2: Monitoring**
- Log unknown/unmapped error codes on frontend
- Add backend metrics for top emitted error codes
---
## Risks and mitigations
- Risk: non-UI code loses localization access after wrapper removal
- Mitigation: inject messages from UI/service boundary; avoid static locale globals.
- Risk: backend code migration is broad (many detail throws)
- Mitigation: staged module-by-module migration + compatibility window.
- Risk: front/back mismatch in error code enum
- Mitigation: shared protocol doc + CI checks for known code list.
## Done criteria
- `apps/lib/core/l10n/l10n.dart` removed or reduced to zero-overlap minimal utility with explicit justification.
- Backend RFC7807 responses include stable `code` (and optional `params`) on migrated endpoints.
- Frontend maps known codes to zh/en l10n; raw detail is no longer primary user-facing string.
- Hardcoded visible Chinese text count in `apps/lib` reduced to agreed threshold or zero for targeted modules.
- Docs and tests updated accordingly.
+16 -7
@@ -269,12 +269,21 @@ WAV audio transcription.
---
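## Common Errors
## Error Conventions (Agent)
The error body in the current implementation is the FastAPI `detail` field:
Errors from Agent routes also follow the unified HTTP error contract; see:
```json
{
  "detail": "..."
}
```
- `docs/protocols/common/http-error-codes.md`
This file only adds examples of Agent-related error codes: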
- `AGENT_RUN_INPUT_INVALID`
- `AGENT_RUN_MESSAGES_INVALID`
- `AGENT_INVALID_LAST_EVENT_ID`
- `AGENT_SSE_CONNECTION_LIMIT`
- `AGENT_ATTACHMENT_EMPTY`
- `AGENT_ATTACHMENT_TOO_LARGE`
- `AGENT_AUDIO_UNSUPPORTED_FORMAT`
- `AGENT_AUDIO_TOO_LARGE`
- `AGENT_AUDIO_EMPTY`
- `AGENT_ASR_UNAVAILABLE`
+200
@@ -0,0 +1,200 @@
# HTTP Error Contract (RFC7807 + Stable Codes)
This document is the single source of truth for backend HTTP error transport format and frontend parsing strategy.
## Response Format
All API errors must use `application/problem+json` and include RFC7807 fields.
```json
{
  "type": "about:blank",
  "title": "Unprocessable Entity",
  "status": 422,
  "detail": "Validation failed",
  "code": "TODO_TITLE_REQUIRED",
  "params": {
    "field": "title"
  },
  "instance": "/api/v1/todo"
}
```
### Field Rules
- `code` (required for business errors): stable machine-readable code (`UPPER_SNAKE_CASE`)
- `params` (optional): key-value values for localized message placeholders
- `detail` (required by RFC7807): human-readable fallback/debug text
## Backend Rules
- Do not rely on free-text `detail` as the only contract.
- New endpoints and new error branches must return stable `code`.
- Existing branches can migrate incrementally but must prefer code-first.
- Keep status semantics unchanged (`400/401/403/404/409/422/429/5xx`).
## Frontend Parsing Rules
- Parse in this order: `code` -> `params` -> `status` -> fallback `detail`.
- User-facing text should come from local l10n mapping by `code`.
- Unknown code fallback:
  1) status-based generic localized message
  2) safe fallback localized message (do not expose raw internals)
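Before any mapping can happen, the client has to extract these fields defensively from the response body. The sketch below shows one way to do that; the real client is Dart, and `ParsedApiError` and `parse_problem` are hypothetical names. It tolerates legacy bodies that only carry `detail` as well as non-JSON bodies.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Optional


@dataclass
class ParsedApiError:
    """Hypothetical parsed view of a problem+json error body."""

    status: Optional[int]
    code: Optional[str]
    params: dict[str, Any] = field(default_factory=dict)
    detail: Optional[str] = None


def parse_problem(body: Optional[str], http_status: Optional[int] = None) -> ParsedApiError:
    """Extract code, params, status, and detail; never raise on bad input."""
    try:
        data = json.loads(body)
    except (ValueError, TypeError):
        data = None
    if not isinstance(data, dict):
        return ParsedApiError(status=http_status, code=None)
    code = data.get("code")
    params = data.get("params")
    status = data.get("status")
    detail = data.get("detail")
    return ParsedApiError(
        status=status if isinstance(status, int) else http_status,
        code=code if isinstance(code, str) else None,
        params=params if isinstance(params, dict) else {},
        detail=detail if isinstance(detail, str) else None,
    )
```

Keeping `detail` on the parsed object is useful for logging, while the user-facing text would still come from the `code` mapping.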
## Error Code Registry (Single Source of Truth)
This section is the canonical registry shared by backend and frontend.
When creating/modifying/deprecating any code, this table must be updated in the same change.
| Code | Domain | HTTP | Meaning |
|---|---|---:|---|
| `AGENT_RUN_INPUT_INVALID` | agent | 422 | Run input payload invalid |
| `AGENT_RUN_MESSAGES_INVALID` | agent | 422 | Run messages contract invalid |
| `AGENT_INVALID_LAST_EVENT_ID` | agent | 422 | SSE Last-Event-ID invalid |
| `AGENT_SSE_CONNECTION_LIMIT` | agent | 429 | SSE connections exceed per-user limit |
| `AGENT_ATTACHMENT_EMPTY` | agent | 422 | Attachment payload empty |
| `AGENT_ATTACHMENT_TOO_LARGE` | agent | 413 | Attachment exceeds allowed size |
| `AGENT_AUDIO_UNSUPPORTED_FORMAT` | agent | 400 | Audio content type/header unsupported |
| `AGENT_AUDIO_TOO_LARGE` | agent | 400 | Audio exceeds allowed size |
| `AGENT_AUDIO_EMPTY` | agent | 400 | Audio payload empty |
| `AGENT_ASR_UNAVAILABLE` | agent | 502 | ASR dependency unavailable |
| `AGENT_FORBIDDEN` | agent | 403 | Current user does not own target thread/session |
| `AGENT_PAYLOAD_INVALID` | agent | 422 | Run payload or forwarded runtime mode is invalid |
| `AGENT_ATTACHMENTS_TOO_MANY` | agent | 422 | Attachments exceed per-message limit |
| `AGENT_SIGNED_IMAGE_URL_INVALID` | agent | 422 | Signed image URL is malformed or unverifiable |
| `AGENT_ATTACHMENT_STORAGE_UNAVAILABLE` | agent | 503 | Attachment storage backend unavailable |
| `AGENT_ATTACHMENT_UNSUPPORTED_TYPE` | agent | 422 | Attachment MIME type is unsupported |
| `AGENT_ATTACHMENT_UPLOAD_FAILED` | agent | 502 | Upload to attachment storage failed |
| `AGENT_ATTACHMENT_BUCKET_INVALID` | agent | 422 | Attachment bucket does not match allowed bucket |
| `AGENT_ATTACHMENT_PATH_SCOPE_INVALID` | agent | 422 | Attachment path is outside allowed user scope |
| `AGENT_SIGNED_URL_GENERATION_FAILED` | agent | 502 | Failed to generate signed URL from storage backend |
| `AGENT_SESSION_ID_INVALID` | agent | 422 | Session ID is not a valid UUID |
| `AGENT_SESSION_NOT_FOUND` | agent | 404 | Agent chat session does not exist |
| `AGENT_USER_ID_INVALID` | agent | 422 | User ID is not a valid UUID |
| `INVALID_BINARY_URL_HOST` | agent | 422 | Signed URL host is invalid |
| `INVALID_BINARY_URL_BUCKET` | agent | 422 | Signed URL bucket is invalid |
| `INVALID_BINARY_URL_PATH_SCOPE` | agent | 422 | Signed URL path scope is invalid |
| `AUTH_SERVICE_UNAVAILABLE` | auth | 503 | Upstream auth service is temporarily unavailable |
| `AUTH_TOO_MANY_REQUESTS` | auth | 429 | Auth operation exceeds request rate limit |
| `AUTH_VERIFICATION_CODE_INVALID` | auth | 401 | OTP verification code is invalid |
| `AUTH_REFRESH_TOKEN_INVALID` | auth | 401 | Refresh token is invalid or expired |
| `AUTH_REFRESH_TOKEN_MISSING` | auth | 401 | Refresh token is missing for logout/refresh |
| `AUTH_USER_NOT_FOUND` | auth | 404 | User lookup by phone returns no match |
| `AUTH_UNAUTHORIZED` | auth | 401 | Authorization header or token is invalid |
| `JWT_VERIFIER_NOT_CONFIGURED` | auth | 503 | JWT verifier configuration is missing |
| `AUTOMATION_JOB_LIMIT_EXCEEDED` | automation_jobs | 400 | User-created automation jobs exceed allowed limit |
| `AUTOMATION_SYSTEM_JOB_MODIFICATION_FORBIDDEN` | automation_jobs | 403 | System bootstrap job cannot be modified |
| `AUTOMATION_JOB_NOT_FOUND` | automation_jobs | 404 | Target automation job does not exist or is not owned by user |
| `AUTOMATION_JOB_STORE_UNAVAILABLE` | automation_jobs | 503 | Automation job persistence unavailable |
| `NOT_FOUND` | runtime/tooling | 404 | Resource/tool target not found |
| `LOOKUP_FAILED` | runtime/tooling | 500 | Lookup or resolution failed |
| `INTERNAL_ERROR` | runtime/tooling | 500 | Internal execution error |
| `MISSING_RUNTIME_ARGS` | runtime/tooling | 400 | Required runtime arguments missing |
| `TOOL_PENDING_APPROVAL` | runtime/tooling | 409 | Tool call awaiting approval |
| `TOOL_REJECTED` | runtime/tooling | 403 | Tool call rejected by policy/user |
| `USER_STORE_UNAVAILABLE` | users | 503 | User storage or database access unavailable |
| `USER_NOT_FOUND` | users | 404 | Requested user profile not found |
| `USER_UPDATE_FIELDS_EMPTY` | users | 400 | Update request contains no writable fields |
| `USER_AVATAR_UNSUPPORTED_TYPE` | users | 422 | Avatar MIME type is unsupported |
| `USER_AVATAR_TOO_LARGE` | users | 413 | Avatar file size exceeds configured limit |
| `USER_AVATAR_EMPTY` | users | 422 | Avatar upload payload is empty |
| `USER_AVATAR_UPLOAD_FAILED` | users | 502 | Upstream storage upload failed |
| `USER_AUTH_LOOKUP_UNAVAILABLE` | users | 503 | Auth/identity phone lookup backend unavailable |
| `TODO_SERVICE_UNAVAILABLE` | todo | 503 | Todo persistence unavailable |
| `TODO_NOT_FOUND` | todo | 404 | Todo item does not exist |
| `TODO_ACCESS_FORBIDDEN` | todo | 403 | Current user cannot operate on target todo |
| `TODO_REORDER_DUPLICATE_ID` | todo | 400 | Reorder payload contains duplicate todo IDs |
| `TODO_STATUS_INVALID` | todo | 400 | Todo status filter value invalid |
| `TODO_PRIORITY_INVALID` | todo | 400 | Todo priority filter value out of range |
| `SCHEDULE_ITEM_INVALID_TIME_RANGE` | schedule_items | 400 | `end_at` must be after `start_at` |
| `SCHEDULE_ITEM_STORE_UNAVAILABLE` | schedule_items | 503 | Schedule item persistence unavailable |
| `SCHEDULE_ITEM_NOT_FOUND` | schedule_items | 404 | Schedule item does not exist |
| `SCHEDULE_ITEM_START_AT_TIMEZONE_REQUIRED` | schedule_items | 400 | `start_at` must include timezone when `end_at` is set |
| `SCHEDULE_ITEM_PAGE_INVALID` | schedule_items | 400 | Pagination `page` must be greater than or equal to 1 |
| `SCHEDULE_ITEM_PAGE_SIZE_INVALID` | schedule_items | 400 | Pagination `page_size` out of allowed range |
| `SCHEDULE_ITEM_SHARE_FORBIDDEN` | schedule_items | 403 | Current user cannot share this schedule item |
| `SCHEDULE_ITEM_SHARE_PERMISSION_EXCEEDED` | schedule_items | 403 | Requested share permission exceeds inviter permission |
| `SCHEDULE_ITEM_SUBSCRIPTION_ALREADY_ACTIVE` | schedule_items | 400 | Recipient already has active subscription |
| `SCHEDULE_ITEM_INVITE_ALREADY_SUBSCRIBED` | schedule_items | 400 | Recipient already accepted calendar invite |
| `SCHEDULE_ITEM_INVITE_ALREADY_PENDING` | schedule_items | 400 | Recipient already has pending calendar invite |
| `SCHEDULE_ITEM_AUTH_LOOKUP_UNAVAILABLE` | schedule_items | 503 | Auth/identity lookup unavailable when sharing |
| `SCHEDULE_ITEM_PENDING_INVITE_NOT_FOUND` | schedule_items | 404 | No pending invitation exists for target item/user |
| `SCHEDULE_ITEM_ACCEPT_SUBSCRIPTION_FAILED` | schedule_items | 503 | Subscription accept flow failed unexpectedly |
| `SCHEDULE_ITEM_REJECT_SUBSCRIPTION_FAILED` | schedule_items | 503 | Subscription reject flow failed unexpectedly |
| `SCHEDULE_ITEM_DATETIME_TIMEZONE_REQUIRED` | schedule_items | 400 | Datetime input must include timezone |
| `SCHEDULE_ITEM_DATETIME_REQUIRED` | schedule_items | 400 | Required datetime input missing |
| `INBOX_MESSAGE_NOT_FOUND` | inbox_messages | 404 | Inbox message does not exist for current user |
| `INBOX_MESSAGE_STORE_UNAVAILABLE` | inbox_messages | 503 | Inbox message persistence unavailable |
| `MEMORIES_USER_NOT_FOUND` | memories | 404 | User memory record does not exist |
| `MEMORIES_WORK_NOT_FOUND` | memories | 404 | Work memory record does not exist |
| `MEMORIES_SERVICE_UNAVAILABLE` | memories | 503 | Memories persistence unavailable |
| `FRIEND_REQUEST_SELF_NOT_ALLOWED` | friendships | 400 | User cannot send friend request to self |
| `FRIEND_ALREADY_ACCEPTED` | friendships | 400 | Users are already friends |
| `FRIEND_REQUEST_BLOCKED` | friendships | 400 | Friend request blocked by relationship status |
| `FRIEND_REQUEST_ALREADY_SENT` | friendships | 400 | Pending friend request already exists |
| `FRIENDSHIP_SERVICE_UNAVAILABLE` | friendships | 503 | Friendship persistence unavailable |
| `FRIEND_REQUEST_NOT_FOUND` | friendships | 404 | Friend request record not found |
| `FRIEND_REQUEST_FORBIDDEN` | friendships | 403 | Current user is not allowed for this friend request action |
| `FRIEND_REQUEST_NOT_PENDING` | friendships | 400 | Friend request is not in pending state |
| `FRIEND_INBOX_MESSAGE_NOT_FOUND` | friendships | 404 | Friend request inbox message not found |
| `FRIENDSHIP_DATA_INVALID` | friendships | 400 | Friendship record is missing required linkage fields |
| `FRIENDSHIP_NOT_FOUND` | friendships | 404 | Friendship record not found |
| `FRIENDSHIP_REMOVE_REQUIRES_ACCEPTED` | friendships | 400 | Only accepted friendships can be removed |
## Registry Coverage Check Script
Use the checker script to ensure this registry and frontend code mapping stay aligned:
```bash
python3 scripts/check_error_code_registry.py
```
Optional arguments:
- `--doc`: custom registry markdown path
- `--mapper`: custom frontend mapper path (default: `apps/lib/core/network/error_code_mapper.dart`)
Output always includes three result groups:
- doc has code but frontend has no mapping
- frontend maps code but doc has no such code
- duplicate codes
Exit code policy:
- `0`: no inconsistency found
- non-`0`: at least one inconsistency found or input path invalid
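The comparison behind those three result groups can be sketched as below. This is not the content of `scripts/check_error_code_registry.py`: the function names and the table-row regex are assumptions, and extracting codes from the Dart mapper is left to the caller.

```python
import re

# Matches registry table rows whose first cell is a backticked code.
_CODE_CELL = re.compile(r"^\|\s*`([A-Z][A-Z0-9_]*)`\s*\|")


def codes_in_registry(markdown: str) -> list[str]:
    """Collect codes from the first column of the registry table, in order."""
    return [m.group(1) for line in markdown.splitlines() if (m := _CODE_CELL.match(line))]


def compare(doc_codes: list[str], mapped_codes: list[str]) -> dict[str, list[str]]:
    """Build the three result groups: doc-only, mapper-only, duplicates."""
    seen: set[str] = set()
    duplicates: list[str] = []
    for code in doc_codes:
        if code in seen and code not in duplicates:
            duplicates.append(code)
        seen.add(code)
    doc_set, mapped_set = set(doc_codes), set(mapped_codes)
    return {
        "doc_only": sorted(doc_set - mapped_set),
        "mapper_only": sorted(mapped_set - doc_set),
        "duplicates": duplicates,
    }


def exit_code(report: dict[str, list[str]]) -> int:
    """0 when every group is empty, 1 otherwise."""
    return 0 if not any(report.values()) else 1
```

A non-zero exit code lets the same check run in CI, which is the mitigation the plan lists for frontend and backend code lists drifting apart.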
## Agent Error Code Set
### Agent
- `AGENT_RUN_INPUT_INVALID`
- `AGENT_RUN_MESSAGES_INVALID`
- `AGENT_INVALID_LAST_EVENT_ID`
- `AGENT_SSE_CONNECTION_LIMIT`
- `AGENT_ATTACHMENT_EMPTY`
- `AGENT_ATTACHMENT_TOO_LARGE`
- `AGENT_AUDIO_UNSUPPORTED_FORMAT`
- `AGENT_AUDIO_TOO_LARGE`
- `AGENT_AUDIO_EMPTY`
- `AGENT_ASR_UNAVAILABLE`
- `AGENT_FORBIDDEN`
- `AGENT_PAYLOAD_INVALID`
- `AGENT_ATTACHMENTS_TOO_MANY`
- `AGENT_SIGNED_IMAGE_URL_INVALID`
- `AGENT_ATTACHMENT_STORAGE_UNAVAILABLE`
- `AGENT_ATTACHMENT_UNSUPPORTED_TYPE`
- `AGENT_ATTACHMENT_UPLOAD_FAILED`
- `AGENT_ATTACHMENT_BUCKET_INVALID`
- `AGENT_ATTACHMENT_PATH_SCOPE_INVALID`
- `AGENT_SIGNED_URL_GENERATION_FAILED`
- `AGENT_SESSION_ID_INVALID`
- `AGENT_SESSION_NOT_FOUND`
- `AGENT_USER_ID_INVALID`
## Compatibility Strategy
- During the transition phase, responses keep `detail` and add `code`/`params`.
- The frontend switches to code-first mapping first; the backend then continues migrating the remaining endpoints.