1. Async task model redirection (aligned with sync tasks):
- Integrate ModelMappedHelper in RelayTaskSubmit after model name
determination, populating OriginModelName / UpstreamModelName on RelayInfo.
- All task adaptors now send UpstreamModelName to upstream providers:
- Gemini & Vertex: BuildRequestURL uses UpstreamModelName.
- Doubao & Ali: BuildRequestBody conditionally overwrites body.Model.
- Vidu, Kling, Hailuo, Jimeng: convertToRequestPayload accepts RelayInfo
and unconditionally uses info.UpstreamModelName.
- Sora: BuildRequestBody parses JSON and multipart bodies to replace
the "model" field with UpstreamModelName.
- Frontend log visibility: LogTaskConsumption and taskBillingOther now
emit is_model_mapped / upstream_model_name in the "other" JSON field.
- Billing safety: RecalculateTaskQuotaByTokens reads model name from
BillingContext.OriginModelName (via taskModelName) instead of
task.Data["model"], preventing billing leaks from upstream model names.
2. Per-call billing (TaskPricePatches lifecycle):
- Rename TaskBillingContext.ModelName → OriginModelName; add PerCallBilling
bool field, populated from TaskPricePatches at submission time.
- settleTaskBillingOnComplete short-circuits when PerCallBilling is true,
skipping both adaptor adjustments and token-based recalculation.
- Remove ModelName from TaskSubmitResult; use relayInfo.OriginModelName
consistently in controller/relay.go for billing context and logging.
3. Multipart retry boundary mismatch fix:
- Root cause: after Sora (or OpenAI audio) rebuilds a multipart body with a
new boundary and overwrites c.Request.Header["Content-Type"], subsequent
calls to ParseMultipartFormReusable on retry would parse the cached
original body with the wrong boundary, causing "NextPart: EOF".
- Fix: ParseMultipartFormReusable now caches the original Content-Type in
gin context key "_original_multipart_ct" on first call and reuses it for
all subsequent parses, making multipart parsing retry-safe globally.
- Sora adaptor reverted to the standard pattern (direct header set/get),
which is now safe thanks to the root fix.
4. Tests:
- task_billing_test.go: update makeTask to use OriginModelName; add
PerCallBilling settlement tests (skip adaptor adjust, skip token recalc);
add non-per-call adaptor adjustment test with refund verification.
Replace all bare task.Update() (DB.Save) calls with UpdateWithStatus(),
which adds a WHERE status = ? guard to prevent concurrent processes from
overwriting each other's state transitions.
Key changes:
model/task.go:
- Add taskSnapshot struct with Equal() method for change detection
- Add Snapshot() method to capture pre-update state
- Add UpdateWithStatus(fromStatus) using DB.Where().Save() for CAS
semantics with full-struct save (no explicit field listing needed)
model/midjourney.go:
- Add UpdateWithStatus(fromStatus string) with same CAS pattern
service/task_polling.go (updateVideoSingleTask):
- Snapshot before processing upstream response; skip DB write if unchanged
- Terminal transitions (SUCCESS/FAILURE) use UpdateWithStatus CAS:
billing/refund only executes if this process wins the transition
- Non-terminal updates also use UpdateWithStatus to prevent overwriting
a concurrent terminal transition back to IN_PROGRESS
- Defer settleTaskBillingOnComplete to after CAS check (shouldSettle flag)
relay/relay_task.go (tryRealtimeFetch):
- Add snapshot + change detection; use UpdateWithStatus for CAS safety
controller/midjourney.go (UpdateMidjourneyTaskBulk):
- Capture preStatus before mutations; use UpdateWithStatus CAS
- Gate refund (IncreaseUserQuota) on CAS success (won && shouldReturnQuota)
This prevents the multi-instance race condition where:
1. Instance A reads task (IN_PROGRESS), fetches upstream (still IN_PROGRESS)
2. Instance B reads same task, fetches upstream (now SUCCESS), writes SUCCESS
3. Instance A's bare Save() overwrites SUCCESS back to IN_PROGRESS
- Enhanced the RelayTask function to utilize a locked channel when available, allowing for better reuse during retries.
- Updated error handling to ensure proper context setup for the selected channel.
- Clarified comments in ResolveOriginTask regarding channel locking and retry behavior.
- Introduced a new field in TaskRelayInfo to store the locked channel object, improving type safety and reducing import cycles.
- Updated the remix handling in ResolveOriginTask to prioritize extracting OtherRatios from the BillingContext of the original task if available.
- Retained the previous logic for extracting seconds and size from task data as a fallback.
- Improved clarity and maintainability of the remix logic by separating the new and old approaches.
Add three billing lifecycle methods to the TaskAdaptor interface:
- EstimateBilling: compute OtherRatios from user request before pricing
- AdjustBillingOnSubmit: adjust ratios from upstream submit response
- AdjustBillingOnComplete: determine final quota at task terminal state
Introduce BaseBilling as embeddable no-op default for adaptors without
custom billing. Move Sora/Ali OtherRatios logic from shared validation
into per-adaptor EstimateBilling implementations.
Add TaskBillingContext to persist pricing params (model_price, group_ratio,
other_ratios) in task private data for async polling settlement.
Extract RecalculateTaskQuota as a general-purpose delta settlement
function and unify polling billing via settleTaskBillingOnComplete
(adaptor-first, then token-based fallback).
Restructure the task relay system for better separation of concerns:
- Extract task billing into service/task_billing.go with unified settlement flow
- Move task polling loop from controller to service/task_polling.go (supports Suno + video platforms)
- Split RelayTask into fetch/submit paths with dedicated retry logic (taskSubmitWithRetry)
- Add TaskDto, TaskResponse generics, and FetchReq to dto/task.go
- Add taskcommon/helpers.go for shared task adaptor utilities
- Remove controller/task_video.go (logic consolidated into service layer)
- Update all task adaptors (ali, doubao, gemini, hailuo, jimeng, kling, sora, suno, vertex, vidu)
- Simplify frontend task logs to use new TaskDto response format
Clamp request body size (including post-decompression) to avoid memory exhaustion caused by huge payloads/zip bombs, especially with large-context Claude requests. Add a configurable `MAX_REQUEST_BODY_MB` (default `32`) and document it.
- Enforce max request body size after gzip/br decompression via `http.MaxBytesReader`
- Add a secondary size guard in `common.GetRequestBody` and cache-safe handling
- Return **413 Request Entity Too Large** on oversized bodies in relay entry
- Avoid building large `TokenCountMeta.CombineText` when both token counting and sensitive check are disabled (use lightweight meta for pricing)
- Update READMEs (CN/EN/FR/JA) with `MAX_REQUEST_BODY_MB`
- Fix a handful of vet/formatting issues encountered during the change
- `go test ./...` passes
This commit refactors the logging mechanism across the application by replacing direct logger calls with a centralized logging approach using the `common` package. Key changes include:
- Replaced instances of `logger.SysLog` and `logger.FatalLog` with `common.SysLog` and `common.FatalLog` for consistent logging practices.
- Updated resource initialization error handling to utilize the new logging structure, enhancing maintainability and readability.
- Minor adjustments to improve code clarity and organization throughout various modules.
This change aims to streamline logging and improve the overall architecture of the codebase.
This commit introduces a major architectural refactoring to improve quota management, centralize logging, and streamline the relay handling logic.
Key changes:
- **Pre-consume Quota:** Implements a new mechanism to check and reserve user quota *before* making the request to the upstream provider. This ensures more accurate quota deduction and prevents users from exceeding their limits due to concurrent requests.
- **Unified Relay Handlers:** Refactors the relay logic to use generic handlers (e.g., `ChatHandler`, `ImageHandler`) instead of provider-specific implementations. This significantly reduces code duplication and simplifies adding new channels.
- **Centralized Logger:** A new dedicated `logger` package is introduced, and all system logging calls are migrated to use it, moving this responsibility out of the `common` package.
- **Code Reorganization:** DTOs are generalized (e.g., `dalle.go` -> `openai_image.go`) and utility code is moved to more appropriate packages (e.g., `common/http.go` -> `service/http.go`) for better code structure.
Summary
• Migrated all ratio-related sources into `setting/ratio_setting/`
– `model_ratio.go` (renamed from model-ratio.go)
– `cache_ratio.go`
– `group_ratio.go`
• Changed package name to `ratio_setting` and relocated initialization (`ratio_setting.InitRatioSettings()` in main).
• Updated every import & call site:
– Model / cache / completion / image ratio helpers
– Group ratio helpers (`GetGroupRatio*`, `ContainsGroupRatio`, `CheckGroupRatio`, etc.)
– JSON-serialization & update helpers (`*Ratio2JSONString`, `Update*RatioByJSONString`)
• Adjusted controllers, middleware, relay helpers, services and models to reference the new package.
• Removed obsolete `setting` / `operation_setting` imports; added missing `ratio_setting` imports.
• Adopted idiomatic map iteration (`for key := range m`) where value is unused.
• Ran static checks to ensure clean build.
This commit centralises all ratio configuration (model, cache and group) in one cohesive module, simplifying future maintenance and improving code clarity.
- Introduce `SelfUseModeEnabled` setting to allow flexible model ratio configuration
- Update error handling to provide more informative messages when model ratios are not set
- Modify pricing and relay logic to support self-use mode
- Add UI toggle for enabling self-use mode in operation settings
- Implement fallback mechanism for model ratios when self-use mode is enabled
- Add new context keys for user-related information
- Modify user cache and authentication middleware to populate context
- Refactor quota and notification services to use context-based user data
- Remove redundant database queries by leveraging context information
- Update various components to use new context-based user retrieval methods
- Implement user notification settings with email and webhook options
- Add new user settings for quota warning threshold and notification preferences
- Create backend API and database support for user notification configuration
- Enhance frontend personal settings with notification configuration UI
- Support custom notification email and webhook URL
- Add service layer for sending user notifications
- Replaced references to common.GroupRatio and common.UserUsableGroups with corresponding functions from the new setting package across multiple controllers and services.
- Introduced new setting functions for managing group ratios and user usable groups, enhancing code organization and maintainability.
- Updated related functions to ensure consistent behavior with the new setting package integration.
- Added a new 'group' parameter to various logging functions, including RecordConsumeLog, GetAllLogs, and GetUserLogs, to allow for more granular log tracking.
- Updated the logs table component to display group information, improving the visibility of log data.
- Refactored related functions to accommodate the new group parameter, ensuring consistent handling across the application.
- Improved the initialization of the group column for PostgreSQL compatibility.