new-api

mirror of https://github.com/QuantumNous/new-api.git synced 2026-04-19 11:08:37 +00:00

Author	SHA1	Message	Date
comeback01	f04ed7584a	Merge branch 'main' into french-translation	2025-12-20 11:08:07 +01:00
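长安	0a2f12c04e	fix: correct Anthropic channel cache billing ## Problem When an Anthropic channel is called through the `/v1/chat/completions` endpoint with caching enabled, the billing logic wrongly subtracts cache tokens, causing a severe revenue loss (94.5%). ## Root cause Different APIs define `prompt_tokens` differently: - Anthropic API: the `input_tokens` field is already pure input tokens (cache excluded) - OpenAI API: the `prompt_tokens` field includes all tokens (cache included) - OpenRouter API: the `prompt_tokens` field includes all tokens (cache included) The current `postConsumeQuota` function subtracts cache tokens for every channel, which is wrong for Anthropic channels because their `input_tokens` already excludes the cache. ## Fix In the `postConsumeQuota` function in `relay/compatible_handler.go`, add a channel type check: ```go if relayInfo.ChannelType != constant.ChannelTypeAnthropic { baseTokens = baseTokens.Sub(dCacheTokens) } ``` Cache tokens are subtracted only for non-Anthropic channels. ## Impact analysis ### ✅ Unaffected scenarios 1. Calls without caching (all channels) - cache_tokens = 0 - subtracting 0 changes nothing - Result: identical 2. OpenAI/OpenRouter channels + caching - cache is still subtracted (because ChannelType != Anthropic) - Result: identical 3. Anthropic channel + /v1/messages endpoint - uses PostClaudeConsumeQuota (not modified) - Result: completely unaffected ### ✅ Fixed scenario 4. Anthropic channel + /v1/chat/completions + caching - Before the fix: cache wrongly subtracted, causing a 94.5% revenue loss - After the fix: cache not subtracted, billing is correct ## Verification data Using actual record 143509 as an example: \| Item \| Before fix \| After fix \| Difference \| \|------\|--------\|--------\|------\| \| Quota \| 10,489 \| 191,330 \| +180,841 \| \| Cost \| ¥0.020978 \| ¥0.382660 \| +¥0.361682 \| \| Revenue recovered \| - \| - \| +1724.1% \| ## Testing suggestions 1. Test the Anthropic channel + caching scenario 2. Test the OpenAI channel + caching scenario (confirm it is unaffected) 3. Test the no-cache scenario (confirm it is unaffected) ## Related issue Fixes the billing error when an Anthropic channel uses prompt caching.	2025-12-20 14:17:12 +08:00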
CaIon	cc3ba39e72	feat(gin): improve request body handling and error reporting v0.10.1	2025-12-20 13:34:10 +08:00
CaIon	4ee595c448	feat(init): increase MaxRequestBodyMB to enhance request handling	2025-12-20 13:27:55 +08:00
CaIon	d9634ad2d3	feat(channel): add error handling for SaveWithoutKey when channel ID is 0	2025-12-20 13:26:40 +08:00
Seefs	a343ce84ee	Merge pull request #2476 from TinsFox/chore/code-inspector-plugin	2025-12-20 11:04:40 +08:00
Seefs	531dfb2555	docs: document pyroscope env var	2025-12-19 23:16:56 +08:00
TinsFox	e6ec551fbf	chore: add code-inspector-plugin integration	2025-12-19 23:04:53 +08:00
Seefs	5ef7247eac	docs: document pyroscope env var	2025-12-19 23:03:04 +08:00
Seefs	1168ddf9f9	fix: systemname	2025-12-19 22:27:35 +08:00
Seefs	a98aad2501	Merge pull request #2474 from TinsFox/main	2025-12-19 21:39:56 +08:00
TinsFox	97132de2ca	style: add card spacing	2025-12-19 21:00:31 +08:00
Seefs	da24a165d0	fix(gemini): handle minimal reasoning effort budget - Add minimal case to clampThinkingBudgetByEffort to avoid defaulting to full thinking budget	2025-12-18 08:10:46 +08:00
comeback01	f88fc26150	Refine French translations for UI conciseness Updated web/src/i18n/locales/fr.json to improve French translations for the user interface. Removed verbose prefixes like 'Gestion des...' and 'Paramètres de...' to prevent truncation in sidebars and menus. Harmonized terms for consistency (e.g., 'Tâches', 'Journaux', 'Dessins'). Renamed 'Place du marché' to 'Marché des modèles'.	2025-12-17 12:10:36 +01:00
Seefs	b35ae9f693	Merge pull request #2452 from QuantumNous/fix/oom-request-body-limit	2025-12-16 18:21:59 +08:00
t0ng7u	8cb56fc319	🧹 fix: harden request-body size handling and error unwrapping Tighten oversized request handling across relay paths and make error matching reliable. - Align `MAX_REQUEST_BODY_MB` fallback to `32` in request body reader and decompression middleware - Stop ignoring `GetRequestBody` errors in relay retry paths; return consistent 413 on oversized bodies (400 for other read errors) - Add `Unwrap()` to `types.NewAPIError` so `errors.Is/As` can match wrapped underlying errors - `go test ./...` passes	2025-12-16 18:10:00 +08:00
t0ng7u	8e3f9b1faa	🛡️ fix: prevent OOM on large/decompressed requests; skip heavy prompt meta when token count is disabled Clamp request body size (including post-decompression) to avoid memory exhaustion caused by huge payloads/zip bombs, especially with large-context Claude requests. Add a configurable `MAX_REQUEST_BODY_MB` (default `32`) and document it. - Enforce max request body size after gzip/br decompression via `http.MaxBytesReader` - Add a secondary size guard in `common.GetRequestBody` and cache-safe handling - Return 413 Request Entity Too Large on oversized bodies in relay entry - Avoid building large `TokenCountMeta.CombineText` when both token counting and sensitive check are disabled (use lightweight meta for pricing) - Update READMEs (CN/EN/FR/JA) with `MAX_REQUEST_BODY_MB` - Fix a handful of vet/formatting issues encountered during the change - `go test ./...` passes	2025-12-16 17:00:19 +08:00
Seefs	2a511c6ee4	fix: support both system_instruction and systemInstruction naming styles for the system prompt parameter	2025-12-16 13:08:58 +08:00
Calcium-Ion	11593bd3da	Merge pull request #2445 from QuantumNous/feat/token-ip-whitelist-cidr feat(auth): enhance IP restriction handling with CIDR support	2025-12-15 20:14:09 +08:00
CaIon	e16e7d6fb9	feat(auth): refactor IP restriction handling to use clearer variable naming	2025-12-15 20:13:09 +08:00
旃蒙	0217ed2f98	fix(task): fix failure to fetch tasks when a channel is configured with multiple keys	2025-12-15 18:15:35 +08:00
CaIon	39593052b6	feat(auth): enhance IP restriction handling with CIDR support	2025-12-15 17:24:09 +08:00
CaIon	4ea8cbd207	Revert "feat(audio): replace SysLog with logger for improved logging in GetAudioDuration" This reverts commit `e293be0138`. v0.10.1-alpha.8	2025-12-14 00:04:40 +08:00
CaIon	e293be0138	feat(audio): replace SysLog with logger for improved logging in GetAudioDuration	2025-12-13 23:59:58 +08:00
CaIon	9c2483ef48	fix(audio): improve WAV duration calculation with enhanced PCM size handling	2025-12-13 23:57:32 +08:00
CaIon	689c43143b	feat(model_ratio): add default ratios for gpt-4o-mini-tts	2025-12-13 19:14:27 +08:00
CaIon	a2da6a9e90	refactor(channel_select): improve retry logic with reset functionality	2025-12-13 18:09:10 +08:00
Calcium-Ion	7a307e2e99	Merge pull request #2434 from QuantumNous/feat/gpt-4o-mini-tts feat: support gpt tts series model quota calculate v0.10.1-alpha.6	2025-12-13 17:55:16 +08:00
CaIon	7cae4a640b	fix(audio): correct TotalTokens calculation for accurate usage reporting	2025-12-13 17:49:57 +08:00
CaIon	e36e2e1b69	feat(audio): enhance audio request handling with token type detection and streaming support	2025-12-13 17:24:23 +08:00
CaIon	b602843ce1	feat(token): add CrossGroupRetry field to token insertion	2025-12-13 16:45:42 +08:00
CaIon	21fca238bf	refactor(error): replace dto.OpenAIError with types.OpenAIError for consistency	2025-12-13 16:43:57 +08:00
CaIon	c51936e068	refactor(channel_select): enhance retry logic and context key usage for channel selection	2025-12-13 16:43:38 +08:00
Seefs	fcafadc6bb	feat: pyroscope integrate	2025-12-13 13:49:38 +08:00
CaIon	b58fa3debc	fix(helper): improve error handling in FlushWriter and related functions v0.10.1-alpha.5	2025-12-13 13:29:21 +08:00
CaIon	1c167c1068	refactor(auth): replace direct token group setting with context key retrieval v0.10.1-alpha.4	2025-12-13 01:38:12 +08:00
Calcium-Ion	f9b6e4c243	Merge pull request #2430 from QuantumNous/fix/cross-group-retry fix(channel_select): adjust priority retry logic for cross-group v0.10.1-alpha.3	2025-12-13 01:05:40 +08:00
CaIon	b523f6a0ba	fix(channel_select): adjust priority retry logic for cross-group channel selection	2025-12-13 01:04:10 +08:00
Calcium-Ion	30cb224793	Merge pull request #2429 from QuantumNous/feat/xhigh feat(adaptor): add '-xhigh' suffix to reasoning effort options	2025-12-12 22:06:19 +08:00
CaIon	ce6fb95f96	refactor(relay): update channel retrieval to use RelayInfo structure v0.10.1-alpha.2	2025-12-12 22:04:38 +08:00
Calcium-Ion	2ac6a5b02f	Merge pull request #2424 from ion1ze/main fix: correct sender format issues fix #1347	2025-12-12 20:55:22 +08:00
CaIon	50854c17bb	feat(adaptor): add '-xhigh' suffix to reasoning effort options for model parsing	2025-12-12 20:53:48 +08:00
Calcium-Ion	147659fb6e	Merge pull request #2426 from QuantumNous/feat/auto-cross-group-retry feat(token): add cross-group retry option for token processing v0.10.1-alpha.1	2025-12-12 20:45:54 +08:00
Calcium-Ion	e9fb2ccdd1	Merge pull request #2428 from seefs001/fix/health-check fix: health check	2025-12-12 20:45:34 +08:00
Seefs	48a17efade	fix: health check	2025-12-12 20:37:32 +08:00
CaIon	7e1d1350c7	feat: implement cross-group retry functionality and update translations	2025-12-12 18:28:33 +08:00
CaIon	01b4039e96	feat(token): add cross-group retry option for token processing	2025-12-12 17:59:21 +08:00
hackerxiao	8e629a2a11	feat: support fetching the anthropic-format model list using only x-api-key; add comments	2025-12-12 17:27:24 +08:00
zdwy5	e1bee48152	fix: support calling aws via global parameter passthrough or channel parameter passthrough (#2423) * fix: support calling aws via global parameter passthrough or channel parameter passthrough * fix(aws): replace json.Unmarshal with common.Unmarshal for request body processing --------- Co-authored-by: r0 <liangchunlei@01.ai> Co-authored-by: CaIon <i@caion.me>	2025-12-12 17:09:27 +08:00
hackerxiao	2a16c37aab	feat: support fetching the anthropic-format model list using only x-api-key	2025-12-12 16:53:10 +08:00


4955 Commits