Files
new-api/service/quota.go
同語 9da3412fde feat: add subscription billing system (#2808)
* ci: create docker automation

*  feat: add subscription billing system with admin management and user purchase flow

Implement a new subscription-based billing model alongside existing metered/per-request billing:

Backend:
- Add subscription plan models (SubscriptionPlan, SubscriptionPlanItem, UserSubscription, etc.)
- Implement CRUD APIs for subscription plan management (admin only)
- Add user subscription queries with support for multiple active/expired subscriptions
- Integrate payment gateways (Stripe, Creem, Epay) for subscription purchases
- Implement pre-consume and post-consume billing logic for subscription quota tracking
- Add billing preference settings (subscription_first, wallet_first, etc.)
- Enhance usage logs with subscription deduction details

Frontend - Admin:
- Add subscription management page with table view and drawer-based edit form
- Match UI/UX style with existing admin pages (redemption codes, users)
- Support enabling/disabling plans, configuring payment IDs, and model quotas
- Add user subscription binding modal in user management

Frontend - Wallet:
- Add subscription plans card with current subscription status display
- Show all subscriptions (active and expired) with remaining days/usage percentage
- Display purchasable plans with pricing cards following SaaS best practices
- Extract purchase modal to separate component matching payment confirm modal style
- Add skeleton loading states with active animation
- Implement billing preference selector in card header
- Handle payment gateway availability based on admin configuration

Frontend - Usage Logs:
- Display subscription deduction details in log entries
- Show step-by-step breakdown of subscription usage (pre-consumed, delta, final, remaining)
- Add subscription deduction tag for subscription-covered requests

*  feat(admin): add user subscription management and refine UI/pagination

Add admin APIs to list/create/invalidate/delete user subscriptions
Add model helpers to fetch all user subscriptions (incl. expired) and support cancel/hard-delete
Wire new admin routes for user subscription operations
Replace “Bind subscription plan” entry with a dedicated User Subscriptions SideSheet in Users table
Use CardTable with responsive layout and working client-side pagination inside the SideSheet
Improve subscription purchase modal empty-gateway state with a Banner notice

*  feat(admin): streamline subscription plan benefits editor with bulk actions

Restore the avatar/icon header for the “Model Benefits” section
Replace scattered controls with a compact toolbar-style workflow
Support multi-select add with a default quota for new items
Add row selection with bulk apply-to-selected / apply-to-all quota updates
Enable delete-selected to manage benefits faster and reduce mistakes

*  fix(subscription): finalize payments, log billing, and clean up dead code

Complete subscription orders by creating a matching top-up record and writing billing logs
Add Epay return handler to verify and finalize browser callbacks
Require Stripe/Creem webhook configuration before starting subscription payments
Show subscription purchases in topup history with clearer labels/methods
Remove unused subscription helper, legacy Creem webhook struct, and unused topup fields
Simplify subscription self API payload to active/all lists only

* 🎨 style: format all code with gofmt and lint:fix

Apply consistent code formatting across the entire codebase using
gofmt and lint:fix tools. This ensures adherence to Go community
standards and improves code readability and maintainability.

Changes include:
- Run gofmt on all .go files to standardize formatting
- Apply lint:fix to automatically resolve linting issues
- Fix code style inconsistencies and formatting violations

No functional changes were made in this commit.

*  feat(subscription): add quota reset periods and admin configuration

- Add reset period fields on subscription plans and user items
- Apply automatic quota resets during pre-consume based on plan schedule
- Expose reset-period configuration in the admin plan editor
- Display reset cadence in subscription cards and purchase modal
- Validate custom reset seconds on plan create/update

*  feat(subscription): harden subscription billing with resets, idempotency, and production-grade stability

Add plan-level quota reset periods and display/reset cadence in admin/UI
Enforce natural reset alignment with background reset task and cleanup job
Make subscription pre-consume/refund idempotent with request-scoped records and retries
Use database time for consistent resets across multi-instance deployments
Harden payment callbacks with locking and idempotent order completion
Record subscription purchases in topup history and billing logs
Optimize subscription queries and add critical composite indexes

*  feat(subscription): cache plan lookups and stabilize pre-consume

Introduce hybrid caches for subscription plans, items, and plan info with explicit
invalidation on admin updates. Streamline pre-consume transactions to reduce
redundant queries while preserving idempotency and reset logic.

* 🐛 fix(subscription): avoid pre-consume lookup noise

Use a RowsAffected check for the idempotency lookup so missing records
no longer surface as "record not found" errors while preserving behavior.

* 🔧 ci: Change workflow trigger to sub branch

Update the Docker image workflow to run on pushes to the sub branch instead of main.

* 💸 chore: Align subscription pricing display with global currency settings
Unify subscription price rendering to use the site-wide currency symbol/rate on the wallet and admin views.
Make subscription plan currency read-only in the editor and force USD on create/update to avoid drift.
Use global currency display type when creating Creem checkout payloads.

* 🔧 chore: Unify subscription plan status toggle with PATCH endpoint

Replace separate enable/disable flows with a single PATCH API that updates the enabled flag.
Update frontend hooks and table actions to call the unified endpoint and keep UI behavior consistent.
Introduce a minimal admin controller handler and route for the status update.

*  feat: Add subscription limits and UI tags consistency

Add per-plan purchase limits with backend enforcement and UI disable states.
Expose limit configuration in admin plan editor and show limits in plan tables/cards.
Refine subscription UI tags with unified badge style and streamlined “My Subscriptions” layout.

* 🎨 style: tag color to white

* 🚀 refactor: Simplify subscription quota to total amount model

Remove per-model subscription items and switch to a single total quota per plan and user subscription. Update billing, reset, and logging flows to operate on total quota, and refactor admin/user UI to configure and display total quota consistently.

* 🚀 chore: Remove duplicate subscription usage percentage display

Keep the usage percentage shown only in the total quota line to avoid redundant “已用 0%” text while preserving remaining days in the summary.

*  feat: Add subscription upgrade group with auto downgrade

*  feat: Update subscription purchase modal display

Show total quota as currency with tooltip for raw quota, hide reset cycle when never, and display upgrade group when configured to match card display rules.

*  feat: Extract quota conversion helpers to shared utils

Move quota display/conversion helpers into web/src/helpers/quota.js and update the subscription plan editor to import and use the shared utilities instead of inline functions.

*  chore: Add upgrade group guidance in subscription editor

Add explanatory helper text under the upgrade group field to clarify automatic group upgrades, rollback conditions, and the expected delay before downgrading takes effect.

* 🔧 chore: remove unused Creem settings state

Drop the unused originInputs state and redundant updates to keep the Creem
settings form state minimal and easier to maintain.

* 🚀 chore: Remove useless action

*  Add full i18n coverage for subscription-related UI across locales

*  feat: harden subscription billing and improve UI consistency

Improve subscription payment safety and data integrity by handling user/URL lookup failures, fixing Stripe subscription mode, persisting quota reset fields, and correcting subscription delta accounting and DB timestamp casting. Refine the UI with stricter custom duration validation, accurate currency rounding, conditional Epay labeling, rollback on preference update failure, and shared subscription formatting helpers plus clearer component naming.

* 🔧 fix: make epay webhook and return flow subscription-aware

Ensure Epay webhook acknowledges success only after order completion, returning fail on processing errors to allow retries. Redirect subscription payment returns to the subscription page instead of top-up for correct user flow.

* 🚦 fix: guard epay return success on order completion
Redirect subscription return flow to failure when order completion fails, preventing false success states after payment verification.

* 🔧 fix: normalize epay error handling and webhook retries

Standardize SubscriptionRequestEpay error responses via ApiErrorMsg for a consistent schema.
Return "fail" on non-success trade statuses in the epay webhook to preserve retry behavior.

* 🧾 fix: persist epay orders before purchase

Create the subscription order before initiating epay payment and expire it if the provider call fails, preventing orphaned transactions and improving reconciliation.

* 🔧 fix: harden epay callbacks and billing fallbacks

Use POST and form parsing for epay notify/return routes, persist epay orders before provider calls with expiry on failure, and ensure notify handlers retry correctly.
Restrict subscription-first fallback to insufficient-subscription errors and log refund failures after retries to avoid silent quota drift.

* 🔧 fix: harden billing flow and sidebar settings

Add missing strings import for subscription fallback checks, log failed subscription refunds after retries, and extend sidebar module settings with a subscription management toggle plus translations.

* 🛡️ fix: fail fast on epay form parse errors

Handle ParseForm errors in epay notify/return handlers by returning fail or redirecting to failure, avoiding unsafe fallback to query parameters.

*  fix: refine Japanese subscription status labels

Adjust Japanese UI wording for active-count labels to read more naturally and consistently.

*  fix: standardize epay success response schema

Return subscription epay pay success responses via ApiSuccess to include the consistent success field and align with error schema.
2026-02-03 17:40:43 +08:00

597 lines
21 KiB
Go
Raw Permalink Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
package service
import (
"errors"
"fmt"
"log"
"math"
"strings"
"time"
"github.com/QuantumNous/new-api/common"
"github.com/QuantumNous/new-api/constant"
"github.com/QuantumNous/new-api/dto"
"github.com/QuantumNous/new-api/logger"
"github.com/QuantumNous/new-api/model"
relaycommon "github.com/QuantumNous/new-api/relay/common"
"github.com/QuantumNous/new-api/setting/ratio_setting"
"github.com/QuantumNous/new-api/setting/system_setting"
"github.com/QuantumNous/new-api/types"
"github.com/bytedance/gopkg/util/gopool"
"github.com/gin-gonic/gin"
"github.com/shopspring/decimal"
)
type TokenDetails struct {
TextTokens int
AudioTokens int
}
type QuotaInfo struct {
InputDetails TokenDetails
OutputDetails TokenDetails
ModelName string
UsePrice bool
ModelPrice float64
ModelRatio float64
GroupRatio float64
}
func hasCustomModelRatio(modelName string, currentRatio float64) bool {
defaultRatio, exists := ratio_setting.GetDefaultModelRatioMap()[modelName]
if !exists {
return true
}
return currentRatio != defaultRatio
}
func calculateAudioQuota(info QuotaInfo) int {
if info.UsePrice {
modelPrice := decimal.NewFromFloat(info.ModelPrice)
quotaPerUnit := decimal.NewFromFloat(common.QuotaPerUnit)
groupRatio := decimal.NewFromFloat(info.GroupRatio)
quota := modelPrice.Mul(quotaPerUnit).Mul(groupRatio)
return int(quota.IntPart())
}
completionRatio := decimal.NewFromFloat(ratio_setting.GetCompletionRatio(info.ModelName))
audioRatio := decimal.NewFromFloat(ratio_setting.GetAudioRatio(info.ModelName))
audioCompletionRatio := decimal.NewFromFloat(ratio_setting.GetAudioCompletionRatio(info.ModelName))
groupRatio := decimal.NewFromFloat(info.GroupRatio)
modelRatio := decimal.NewFromFloat(info.ModelRatio)
ratio := groupRatio.Mul(modelRatio)
inputTextTokens := decimal.NewFromInt(int64(info.InputDetails.TextTokens))
outputTextTokens := decimal.NewFromInt(int64(info.OutputDetails.TextTokens))
inputAudioTokens := decimal.NewFromInt(int64(info.InputDetails.AudioTokens))
outputAudioTokens := decimal.NewFromInt(int64(info.OutputDetails.AudioTokens))
quota := decimal.Zero
quota = quota.Add(inputTextTokens)
quota = quota.Add(outputTextTokens.Mul(completionRatio))
quota = quota.Add(inputAudioTokens.Mul(audioRatio))
quota = quota.Add(outputAudioTokens.Mul(audioRatio).Mul(audioCompletionRatio))
quota = quota.Mul(ratio)
// If ratio is not zero and quota is less than or equal to zero, set quota to 1
if !ratio.IsZero() && quota.LessThanOrEqual(decimal.Zero) {
quota = decimal.NewFromInt(1)
}
return int(quota.Round(0).IntPart())
}
func PreWssConsumeQuota(ctx *gin.Context, relayInfo *relaycommon.RelayInfo, usage *dto.RealtimeUsage) error {
if relayInfo.UsePrice {
return nil
}
userQuota, err := model.GetUserQuota(relayInfo.UserId, false)
if err != nil {
return err
}
token, err := model.GetTokenByKey(strings.TrimPrefix(relayInfo.TokenKey, "sk-"), false)
if err != nil {
return err
}
modelName := relayInfo.OriginModelName
textInputTokens := usage.InputTokenDetails.TextTokens
textOutTokens := usage.OutputTokenDetails.TextTokens
audioInputTokens := usage.InputTokenDetails.AudioTokens
audioOutTokens := usage.OutputTokenDetails.AudioTokens
groupRatio := ratio_setting.GetGroupRatio(relayInfo.UsingGroup)
modelRatio, _, _ := ratio_setting.GetModelRatio(modelName)
autoGroup, exists := common.GetContextKey(ctx, constant.ContextKeyAutoGroup)
if exists {
groupRatio = ratio_setting.GetGroupRatio(autoGroup.(string))
log.Printf("final group ratio: %f", groupRatio)
relayInfo.UsingGroup = autoGroup.(string)
}
actualGroupRatio := groupRatio
userGroupRatio, ok := ratio_setting.GetGroupGroupRatio(relayInfo.UserGroup, relayInfo.UsingGroup)
if ok {
actualGroupRatio = userGroupRatio
}
quotaInfo := QuotaInfo{
InputDetails: TokenDetails{
TextTokens: textInputTokens,
AudioTokens: audioInputTokens,
},
OutputDetails: TokenDetails{
TextTokens: textOutTokens,
AudioTokens: audioOutTokens,
},
ModelName: modelName,
UsePrice: relayInfo.UsePrice,
ModelRatio: modelRatio,
GroupRatio: actualGroupRatio,
}
quota := calculateAudioQuota(quotaInfo)
if userQuota < quota {
return fmt.Errorf("user quota is not enough, user quota: %s, need quota: %s", logger.FormatQuota(userQuota), logger.FormatQuota(quota))
}
if !token.UnlimitedQuota && token.RemainQuota < quota {
return fmt.Errorf("token quota is not enough, token remain quota: %s, need quota: %s", logger.FormatQuota(token.RemainQuota), logger.FormatQuota(quota))
}
err = PostConsumeQuota(relayInfo, quota, 0, false)
if err != nil {
return err
}
logger.LogInfo(ctx, "realtime streaming consume quota success, quota: "+fmt.Sprintf("%d", quota))
return nil
}
func PostWssConsumeQuota(ctx *gin.Context, relayInfo *relaycommon.RelayInfo, modelName string,
usage *dto.RealtimeUsage, extraContent string) {
useTimeSeconds := time.Now().Unix() - relayInfo.StartTime.Unix()
textInputTokens := usage.InputTokenDetails.TextTokens
textOutTokens := usage.OutputTokenDetails.TextTokens
audioInputTokens := usage.InputTokenDetails.AudioTokens
audioOutTokens := usage.OutputTokenDetails.AudioTokens
tokenName := ctx.GetString("token_name")
completionRatio := decimal.NewFromFloat(ratio_setting.GetCompletionRatio(modelName))
audioRatio := decimal.NewFromFloat(ratio_setting.GetAudioRatio(relayInfo.OriginModelName))
audioCompletionRatio := decimal.NewFromFloat(ratio_setting.GetAudioCompletionRatio(modelName))
modelRatio := relayInfo.PriceData.ModelRatio
groupRatio := relayInfo.PriceData.GroupRatioInfo.GroupRatio
modelPrice := relayInfo.PriceData.ModelPrice
usePrice := relayInfo.PriceData.UsePrice
quotaInfo := QuotaInfo{
InputDetails: TokenDetails{
TextTokens: textInputTokens,
AudioTokens: audioInputTokens,
},
OutputDetails: TokenDetails{
TextTokens: textOutTokens,
AudioTokens: audioOutTokens,
},
ModelName: modelName,
UsePrice: usePrice,
ModelRatio: modelRatio,
GroupRatio: groupRatio,
}
quota := calculateAudioQuota(quotaInfo)
totalTokens := usage.TotalTokens
var logContent string
if !usePrice {
logContent = fmt.Sprintf("模型倍率 %.2f,补全倍率 %.2f,音频倍率 %.2f,音频补全倍率 %.2f,分组倍率 %.2f",
modelRatio, completionRatio.InexactFloat64(), audioRatio.InexactFloat64(), audioCompletionRatio.InexactFloat64(), groupRatio)
} else {
logContent = fmt.Sprintf("模型价格 %.2f,分组倍率 %.2f", modelPrice, groupRatio)
}
// record all the consume log even if quota is 0
if totalTokens == 0 {
// in this case, must be some error happened
// we cannot just return, because we may have to return the pre-consumed quota
quota = 0
logContent += fmt.Sprintf("(可能是上游超时)")
logger.LogError(ctx, fmt.Sprintf("total tokens is 0, cannot consume quota, userId %d, channelId %d, "+
"tokenId %d, model %s pre-consumed quota %d", relayInfo.UserId, relayInfo.ChannelId, relayInfo.TokenId, modelName, relayInfo.FinalPreConsumedQuota))
} else {
model.UpdateUserUsedQuotaAndRequestCount(relayInfo.UserId, quota)
model.UpdateChannelUsedQuota(relayInfo.ChannelId, quota)
}
logModel := modelName
if extraContent != "" {
logContent += ", " + extraContent
}
other := GenerateWssOtherInfo(ctx, relayInfo, usage, modelRatio, groupRatio,
completionRatio.InexactFloat64(), audioRatio.InexactFloat64(), audioCompletionRatio.InexactFloat64(), modelPrice, relayInfo.PriceData.GroupRatioInfo.GroupSpecialRatio)
model.RecordConsumeLog(ctx, relayInfo.UserId, model.RecordConsumeLogParams{
ChannelId: relayInfo.ChannelId,
PromptTokens: usage.InputTokens,
CompletionTokens: usage.OutputTokens,
ModelName: logModel,
TokenName: tokenName,
Quota: quota,
Content: logContent,
TokenId: relayInfo.TokenId,
UseTimeSeconds: int(useTimeSeconds),
IsStream: relayInfo.IsStream,
Group: relayInfo.UsingGroup,
Other: other,
})
}
func PostClaudeConsumeQuota(ctx *gin.Context, relayInfo *relaycommon.RelayInfo, usage *dto.Usage) {
useTimeSeconds := time.Now().Unix() - relayInfo.StartTime.Unix()
promptTokens := usage.PromptTokens
completionTokens := usage.CompletionTokens
modelName := relayInfo.OriginModelName
tokenName := ctx.GetString("token_name")
completionRatio := relayInfo.PriceData.CompletionRatio
modelRatio := relayInfo.PriceData.ModelRatio
groupRatio := relayInfo.PriceData.GroupRatioInfo.GroupRatio
modelPrice := relayInfo.PriceData.ModelPrice
cacheRatio := relayInfo.PriceData.CacheRatio
cacheTokens := usage.PromptTokensDetails.CachedTokens
cacheCreationRatio := relayInfo.PriceData.CacheCreationRatio
cacheCreationRatio5m := relayInfo.PriceData.CacheCreation5mRatio
cacheCreationRatio1h := relayInfo.PriceData.CacheCreation1hRatio
cacheCreationTokens := usage.PromptTokensDetails.CachedCreationTokens
cacheCreationTokens5m := usage.ClaudeCacheCreation5mTokens
cacheCreationTokens1h := usage.ClaudeCacheCreation1hTokens
if relayInfo.ChannelType == constant.ChannelTypeOpenRouter {
promptTokens -= cacheTokens
isUsingCustomSettings := relayInfo.PriceData.UsePrice || hasCustomModelRatio(modelName, relayInfo.PriceData.ModelRatio)
if cacheCreationTokens == 0 && relayInfo.PriceData.CacheCreationRatio != 1 && usage.Cost != 0 && !isUsingCustomSettings {
maybeCacheCreationTokens := CalcOpenRouterCacheCreateTokens(*usage, relayInfo.PriceData)
if maybeCacheCreationTokens >= 0 && promptTokens >= maybeCacheCreationTokens {
cacheCreationTokens = maybeCacheCreationTokens
}
}
promptTokens -= cacheCreationTokens
}
calculateQuota := 0.0
if !relayInfo.PriceData.UsePrice {
calculateQuota = float64(promptTokens)
calculateQuota += float64(cacheTokens) * cacheRatio
calculateQuota += float64(cacheCreationTokens5m) * cacheCreationRatio5m
calculateQuota += float64(cacheCreationTokens1h) * cacheCreationRatio1h
remainingCacheCreationTokens := cacheCreationTokens - cacheCreationTokens5m - cacheCreationTokens1h
if remainingCacheCreationTokens > 0 {
calculateQuota += float64(remainingCacheCreationTokens) * cacheCreationRatio
}
calculateQuota += float64(completionTokens) * completionRatio
calculateQuota = calculateQuota * groupRatio * modelRatio
} else {
calculateQuota = modelPrice * common.QuotaPerUnit * groupRatio
}
if modelRatio != 0 && calculateQuota <= 0 {
calculateQuota = 1
}
quota := int(calculateQuota)
totalTokens := promptTokens + completionTokens
var logContent string
// record all the consume log even if quota is 0
if totalTokens == 0 {
// in this case, must be some error happened
// we cannot just return, because we may have to return the pre-consumed quota
quota = 0
logContent += fmt.Sprintf("(可能是上游出错)")
logger.LogError(ctx, fmt.Sprintf("total tokens is 0, cannot consume quota, userId %d, channelId %d, "+
"tokenId %d, model %s pre-consumed quota %d", relayInfo.UserId, relayInfo.ChannelId, relayInfo.TokenId, modelName, relayInfo.FinalPreConsumedQuota))
} else {
model.UpdateUserUsedQuotaAndRequestCount(relayInfo.UserId, quota)
model.UpdateChannelUsedQuota(relayInfo.ChannelId, quota)
}
quotaDelta := quota - relayInfo.FinalPreConsumedQuota
if quotaDelta > 0 {
logger.LogInfo(ctx, fmt.Sprintf("预扣费后补扣费:%s实际消耗%s预扣费%s",
logger.FormatQuota(quotaDelta),
logger.FormatQuota(quota),
logger.FormatQuota(relayInfo.FinalPreConsumedQuota),
))
} else if quotaDelta < 0 {
logger.LogInfo(ctx, fmt.Sprintf("预扣费后返还扣费:%s实际消耗%s预扣费%s",
logger.FormatQuota(-quotaDelta),
logger.FormatQuota(quota),
logger.FormatQuota(relayInfo.FinalPreConsumedQuota),
))
}
if quotaDelta != 0 {
err := PostConsumeQuota(relayInfo, quotaDelta, relayInfo.FinalPreConsumedQuota, true)
if err != nil {
logger.LogError(ctx, "error consuming token remain quota: "+err.Error())
}
}
other := GenerateClaudeOtherInfo(ctx, relayInfo, modelRatio, groupRatio, completionRatio,
cacheTokens, cacheRatio,
cacheCreationTokens, cacheCreationRatio,
cacheCreationTokens5m, cacheCreationRatio5m,
cacheCreationTokens1h, cacheCreationRatio1h,
modelPrice, relayInfo.PriceData.GroupRatioInfo.GroupSpecialRatio)
model.RecordConsumeLog(ctx, relayInfo.UserId, model.RecordConsumeLogParams{
ChannelId: relayInfo.ChannelId,
PromptTokens: promptTokens,
CompletionTokens: completionTokens,
ModelName: modelName,
TokenName: tokenName,
Quota: quota,
Content: logContent,
TokenId: relayInfo.TokenId,
UseTimeSeconds: int(useTimeSeconds),
IsStream: relayInfo.IsStream,
Group: relayInfo.UsingGroup,
Other: other,
})
}
func CalcOpenRouterCacheCreateTokens(usage dto.Usage, priceData types.PriceData) int {
if priceData.CacheCreationRatio == 1 {
return 0
}
quotaPrice := priceData.ModelRatio / common.QuotaPerUnit
promptCacheCreatePrice := quotaPrice * priceData.CacheCreationRatio
promptCacheReadPrice := quotaPrice * priceData.CacheRatio
completionPrice := quotaPrice * priceData.CompletionRatio
cost, _ := usage.Cost.(float64)
totalPromptTokens := float64(usage.PromptTokens)
completionTokens := float64(usage.CompletionTokens)
promptCacheReadTokens := float64(usage.PromptTokensDetails.CachedTokens)
return int(math.Round((cost -
totalPromptTokens*quotaPrice +
promptCacheReadTokens*(quotaPrice-promptCacheReadPrice) -
completionTokens*completionPrice) /
(promptCacheCreatePrice - quotaPrice)))
}
func PostAudioConsumeQuota(ctx *gin.Context, relayInfo *relaycommon.RelayInfo, usage *dto.Usage, extraContent string) {
useTimeSeconds := time.Now().Unix() - relayInfo.StartTime.Unix()
textInputTokens := usage.PromptTokensDetails.TextTokens
textOutTokens := usage.CompletionTokenDetails.TextTokens
audioInputTokens := usage.PromptTokensDetails.AudioTokens
audioOutTokens := usage.CompletionTokenDetails.AudioTokens
tokenName := ctx.GetString("token_name")
completionRatio := decimal.NewFromFloat(ratio_setting.GetCompletionRatio(relayInfo.OriginModelName))
audioRatio := decimal.NewFromFloat(ratio_setting.GetAudioRatio(relayInfo.OriginModelName))
audioCompletionRatio := decimal.NewFromFloat(ratio_setting.GetAudioCompletionRatio(relayInfo.OriginModelName))
modelRatio := relayInfo.PriceData.ModelRatio
groupRatio := relayInfo.PriceData.GroupRatioInfo.GroupRatio
modelPrice := relayInfo.PriceData.ModelPrice
usePrice := relayInfo.PriceData.UsePrice
quotaInfo := QuotaInfo{
InputDetails: TokenDetails{
TextTokens: textInputTokens,
AudioTokens: audioInputTokens,
},
OutputDetails: TokenDetails{
TextTokens: textOutTokens,
AudioTokens: audioOutTokens,
},
ModelName: relayInfo.OriginModelName,
UsePrice: usePrice,
ModelRatio: modelRatio,
GroupRatio: groupRatio,
}
quota := calculateAudioQuota(quotaInfo)
totalTokens := usage.TotalTokens
var logContent string
if !usePrice {
logContent = fmt.Sprintf("模型倍率 %.2f,补全倍率 %.2f,音频倍率 %.2f,音频补全倍率 %.2f,分组倍率 %.2f",
modelRatio, completionRatio.InexactFloat64(), audioRatio.InexactFloat64(), audioCompletionRatio.InexactFloat64(), groupRatio)
} else {
logContent = fmt.Sprintf("模型价格 %.2f,分组倍率 %.2f", modelPrice, groupRatio)
}
// record all the consume log even if quota is 0
if totalTokens == 0 {
// in this case, must be some error happened
// we cannot just return, because we may have to return the pre-consumed quota
quota = 0
logContent += fmt.Sprintf("(可能是上游超时)")
logger.LogError(ctx, fmt.Sprintf("total tokens is 0, cannot consume quota, userId %d, channelId %d, "+
"tokenId %d, model %s pre-consumed quota %d", relayInfo.UserId, relayInfo.ChannelId, relayInfo.TokenId, relayInfo.OriginModelName, relayInfo.FinalPreConsumedQuota))
} else {
model.UpdateUserUsedQuotaAndRequestCount(relayInfo.UserId, quota)
model.UpdateChannelUsedQuota(relayInfo.ChannelId, quota)
}
quotaDelta := quota - relayInfo.FinalPreConsumedQuota
if quotaDelta > 0 {
logger.LogInfo(ctx, fmt.Sprintf("预扣费后补扣费:%s实际消耗%s预扣费%s",
logger.FormatQuota(quotaDelta),
logger.FormatQuota(quota),
logger.FormatQuota(relayInfo.FinalPreConsumedQuota),
))
} else if quotaDelta < 0 {
logger.LogInfo(ctx, fmt.Sprintf("预扣费后返还扣费:%s实际消耗%s预扣费%s",
logger.FormatQuota(-quotaDelta),
logger.FormatQuota(quota),
logger.FormatQuota(relayInfo.FinalPreConsumedQuota),
))
}
if quotaDelta != 0 {
err := PostConsumeQuota(relayInfo, quotaDelta, relayInfo.FinalPreConsumedQuota, true)
if err != nil {
logger.LogError(ctx, "error consuming token remain quota: "+err.Error())
}
}
logModel := relayInfo.OriginModelName
if extraContent != "" {
logContent += ", " + extraContent
}
other := GenerateAudioOtherInfo(ctx, relayInfo, usage, modelRatio, groupRatio,
completionRatio.InexactFloat64(), audioRatio.InexactFloat64(), audioCompletionRatio.InexactFloat64(), modelPrice, relayInfo.PriceData.GroupRatioInfo.GroupSpecialRatio)
model.RecordConsumeLog(ctx, relayInfo.UserId, model.RecordConsumeLogParams{
ChannelId: relayInfo.ChannelId,
PromptTokens: usage.PromptTokens,
CompletionTokens: usage.CompletionTokens,
ModelName: logModel,
TokenName: tokenName,
Quota: quota,
Content: logContent,
TokenId: relayInfo.TokenId,
UseTimeSeconds: int(useTimeSeconds),
IsStream: relayInfo.IsStream,
Group: relayInfo.UsingGroup,
Other: other,
})
}
func PreConsumeTokenQuota(relayInfo *relaycommon.RelayInfo, quota int) error {
if quota < 0 {
return errors.New("quota 不能为负数!")
}
if relayInfo.IsPlayground {
return nil
}
//if relayInfo.TokenUnlimited {
// return nil
//}
token, err := model.GetTokenByKey(relayInfo.TokenKey, false)
if err != nil {
return err
}
if !relayInfo.TokenUnlimited && token.RemainQuota < quota {
return fmt.Errorf("token quota is not enough, token remain quota: %s, need quota: %s", logger.FormatQuota(token.RemainQuota), logger.FormatQuota(quota))
}
err = model.DecreaseTokenQuota(relayInfo.TokenId, relayInfo.TokenKey, quota)
if err != nil {
return err
}
return nil
}
func PostConsumeQuota(relayInfo *relaycommon.RelayInfo, quota int, preConsumedQuota int, sendEmail bool) (err error) {
// 1) Consume from wallet quota OR subscription item
if relayInfo != nil && relayInfo.BillingSource == BillingSourceSubscription {
if relayInfo.SubscriptionId == 0 {
return errors.New("subscription id is missing")
}
delta := int64(quota)
if delta != 0 {
if err := model.PostConsumeUserSubscriptionDelta(relayInfo.SubscriptionId, delta); err != nil {
return err
}
relayInfo.SubscriptionPostDelta += delta
}
} else {
// Wallet
if quota > 0 {
err = model.DecreaseUserQuota(relayInfo.UserId, quota)
} else {
err = model.IncreaseUserQuota(relayInfo.UserId, -quota, false)
}
if err != nil {
return err
}
}
if !relayInfo.IsPlayground {
if quota > 0 {
err = model.DecreaseTokenQuota(relayInfo.TokenId, relayInfo.TokenKey, quota)
} else {
err = model.IncreaseTokenQuota(relayInfo.TokenId, relayInfo.TokenKey, -quota)
}
if err != nil {
return err
}
}
if sendEmail {
if (quota + preConsumedQuota) != 0 {
checkAndSendQuotaNotify(relayInfo, quota, preConsumedQuota)
}
}
return nil
}
func checkAndSendQuotaNotify(relayInfo *relaycommon.RelayInfo, quota int, preConsumedQuota int) {
gopool.Go(func() {
userSetting := relayInfo.UserSetting
threshold := common.QuotaRemindThreshold
if userSetting.QuotaWarningThreshold != 0 {
threshold = int(userSetting.QuotaWarningThreshold)
}
//noMoreQuota := userCache.Quota-(quota+preConsumedQuota) <= 0
quotaTooLow := false
consumeQuota := quota + preConsumedQuota
if relayInfo.UserQuota-consumeQuota < threshold {
quotaTooLow = true
}
if quotaTooLow {
prompt := "您的额度即将用尽"
topUpLink := fmt.Sprintf("%s/console/topup", system_setting.ServerAddress)
// 根据通知方式生成不同的内容格式
var content string
var values []interface{}
notifyType := userSetting.NotifyType
if notifyType == "" {
notifyType = dto.NotifyTypeEmail
}
if notifyType == dto.NotifyTypeBark {
// Bark推送使用简短文本不支持HTML
content = "{{value}},剩余额度:{{value}},请及时充值"
values = []interface{}{prompt, logger.FormatQuota(relayInfo.UserQuota)}
} else if notifyType == dto.NotifyTypeGotify {
content = "{{value}},当前剩余额度为 {{value}},请及时充值。"
values = []interface{}{prompt, logger.FormatQuota(relayInfo.UserQuota)}
} else {
// 默认内容格式适用于Email和Webhook支持HTML
content = "{{value}},当前剩余额度为 {{value}},为了不影响您的使用,请及时充值。<br/>充值链接:<a href='{{value}}'>{{value}}</a>"
values = []interface{}{prompt, logger.FormatQuota(relayInfo.UserQuota), topUpLink, topUpLink}
}
err := NotifyUser(relayInfo.UserId, relayInfo.UserEmail, relayInfo.UserSetting, dto.NewNotify(dto.NotifyTypeQuotaExceed, prompt, content, values))
if err != nil {
common.SysError(fmt.Sprintf("failed to send quota notify to user %d: %s", relayInfo.UserId, err.Error()))
}
}
})
}