Merge pull request #2412 from seefs001/pr-2372

feat: add openai video remix endpoint
Merge pull request #2194 from NoahCodeGG/fix/process_channel_error
2026-04-02 02:50:02 +00:00 · 2025-12-11 23:35:23 +08:00 · 2025-12-11 18:12:06 +08:00 · 2025-12-09 16:57:24 +08:00 · 2025-12-09 14:10:26 +08:00 · 2025-12-09 14:05:30 +08:00
124 changed files with 17675 additions and 1280 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -21,3 +21,4 @@ web/bun.lock

 electron/node_modules
 electron/dist
+data/
--- a/9
+++ b/9
@@ -14,7 +14,7 @@ ENV GO111MODULE=on CGO_ENABLED=0
 ARG TARGETOS
 ARG TARGETARCH
 ENV GOOS=${TARGETOS:-linux} GOARCH=${TARGETARCH:-amd64}
-
+ENV GOEXPERIMENT=greenteagc

 WORKDIR /build

@@ -25,10 +25,11 @@ COPY . .
 COPY --from=builder /build/dist ./web/dist
 RUN go build -ldflags "-s -w -X 'github.com/QuantumNous/new-api/common.Version=$(cat VERSION)'" -o new-api

-FROM alpine
+FROM debian:bookworm-slim

-RUN apk upgrade --no-cache \
-    && apk add --no-cache ca-certificates tzdata \
+RUN apt-get update \
+    && apt-get install -y --no-install-recommends ca-certificates tzdata libasan8 \
+    && rm -rf /var/lib/apt/lists/* \
    && update-ca-certificates

 COPY --from=builder2 /build/new-api /
--- a/README.en.md
+++ b/README.en.md
@@ -238,6 +238,7 @@ docker run --name new-api -d --restart always \
 - `gemini-2.5-flash-nothinking` - Disable thinking mode
 - `gemini-2.5-pro-thinking` - Enable thinking mode
 - `gemini-2.5-pro-thinking-128` - Enable thinking mode with thinking budget of 128 tokens
+- You can also append `-low`, `-medium`, or `-high` to any Gemini model name to request the corresponding reasoning effort (no extra thinking-budget suffix needed).

 </details>

@@ -303,6 +304,7 @@ docker run --name new-api -d --restart always \
 | `SQL_DSN` | Database connection string | - |
 | `REDIS_CONN_STRING` | Redis connection string | - |
 | `STREAMING_TIMEOUT` | Streaming timeout (seconds) | `300` |
+| `STREAM_SCANNER_MAX_BUFFER_MB` | Max per-line buffer (MB) for the stream scanner; increase when upstream sends huge image/base64 payloads | `64` |
 | `AZURE_DEFAULT_API_VERSION` | Azure API version | `2025-04-01-preview` |
 | `ERROR_LOG_ENABLED` | Error log switch | `false` |

--- a/README.fr.md
+++ b/README.fr.md
@@ -234,6 +234,7 @@ docker run --name new-api -d --restart always \
 - `gemini-2.5-flash-nothinking` - Désactiver le mode de pensée
 - `gemini-2.5-pro-thinking` - Activer le mode de pensée
 - `gemini-2.5-pro-thinking-128` - Activer le mode de pensée avec budget de pensée de 128 tokens
+- Vous pouvez également ajouter les suffixes `-low`, `-medium` ou `-high` aux modèles Gemini pour fixer le niveau d’effort de raisonnement (sans suffixe de budget supplémentaire).

 </details>

@@ -299,6 +300,7 @@ docker run --name new-api -d --restart always \
 | `SQL_DSN` | Chaine de connexion à la base de données | - |
 | `REDIS_CONN_STRING` | Chaine de connexion Redis | - |
 | `STREAMING_TIMEOUT` | Délai d'expiration du streaming (secondes) | `300` |
+| `STREAM_SCANNER_MAX_BUFFER_MB` | Taille max du buffer par ligne (Mo) pour le scanner SSE ; à augmenter quand les sorties image/base64 sont très volumineuses (ex. images 4K) | `64` |
 | `AZURE_DEFAULT_API_VERSION` | Version de l'API Azure | `2025-04-01-preview` |
 | `ERROR_LOG_ENABLED` | Interrupteur du journal d'erreurs | `false` |

@@ -438,4 +440,4 @@ Si ce projet vous est utile, bienvenue à nous donner une ⭐️ Étoile！

 <sub>Construit avec ❤️ par QuantumNous</sub>

-</div>
+</div>
--- a/README.ja.md
+++ b/README.ja.md
@@ -243,6 +243,7 @@ docker run --name new-api -d --restart always \
 - `gemini-2.5-flash-nothinking` - 思考モードを無効にする
 - `gemini-2.5-pro-thinking` - 思考モードを有効にする
 - `gemini-2.5-pro-thinking-128` - 思考モードを有効にし、思考予算を128トークンに設定する
+- Gemini モデル名の末尾に `-low` / `-medium` / `-high` を付けることで推論強度を直接指定できます（追加の思考予算サフィックスは不要です）。

 </details>

@@ -308,6 +309,7 @@ docker run --name new-api -d --restart always \
 | `SQL_DSN** | データベース接続文字列 | - |
 | `REDIS_CONN_STRING` | Redis接続文字列 | - |
 | `STREAMING_TIMEOUT` | ストリーミング応答のタイムアウト時間（秒） | `300` |
+| `STREAM_SCANNER_MAX_BUFFER_MB` | ストリームスキャナの1行あたりバッファ上限（MB）。4K画像など巨大なbase64 `data:` ペイロードを扱う場合は値を増加させてください | `64` |
 | `AZURE_DEFAULT_API_VERSION` | Azure APIバージョン | `2025-04-01-preview` |
 | `ERROR_LOG_ENABLED` | エラーログスイッチ | `false` |

--- a/README.md
+++ b/README.md
@@ -239,6 +239,7 @@ docker run --name new-api -d --restart always \
 - `gemini-2.5-flash-nothinking` - 禁用思考模式
 - `gemini-2.5-pro-thinking` - 启用思考模式
 - `gemini-2.5-pro-thinking-128` - 启用思考模式，并设置思考预算为128tokens
+- 也可以直接在 Gemini 模型名称后追加 `-low` / `-medium` / `-high` 来控制思考力度（无需再设置思考预算后缀）

 </details>

@@ -297,15 +298,16 @@ docker run --name new-api -d --restart always \
 <details>
 <summary>常用环境变量配置</summary>

-| 变量名 | 说明 | 默认值 |
-|--------|------|--------|
-| `SESSION_SECRET` | 会话密钥（多机部署必须） | - |
-| `CRYPTO_SECRET` | 加密密钥（Redis 必须） | - |
-| `SQL_DSN` | 数据库连接字符串 | - |
-| `REDIS_CONN_STRING` | Redis 连接字符串 | - |
-| `STREAMING_TIMEOUT` | 流式超时时间（秒） | `300` |
-| `AZURE_DEFAULT_API_VERSION` | Azure API 版本 | `2025-04-01-preview` |
-| `ERROR_LOG_ENABLED` | 错误日志开关 | `false` |
+| 变量名 | 说明                                                           | 默认值 |
+|--------|--------------------------------------------------------------|--------|
+| `SESSION_SECRET` | 会话密钥（多机部署必须）                                                 | - |
+| `CRYPTO_SECRET` | 加密密钥（Redis 必须）                                               | - |
+| `SQL_DSN` | 数据库连接字符串                                                     | - |
+| `REDIS_CONN_STRING` | Redis 连接字符串                                                  | - |
+| `STREAMING_TIMEOUT` | 流式超时时间（秒）                                                    | `300` |
+| `STREAM_SCANNER_MAX_BUFFER_MB` | 流式扫描器单行最大缓冲（MB），图像生成等超大 `data:` 片段（如 4K 图片 base64）需适当调大 | `64` |
+| `AZURE_DEFAULT_API_VERSION` | Azure API 版本                                                 | `2025-04-01-preview` |
+| `ERROR_LOG_ENABLED` | 错误日志开关                                                       | `false` |

 📖 **完整配置：** [环境变量文档](https://docs.newapi.pro/installation/environment-variables)

--- a/common/constants.go
+++ b/common/constants.go
@@ -121,6 +121,9 @@ var BatchUpdateInterval int

 var RelayTimeout int // unit is second

+var RelayMaxIdleConns int
+var RelayMaxIdleConnsPerHost int
+
 var GeminiSafetySetting string

 // https://docs.cohere.com/docs/safety-modes Type; NONE/CONTEXTUAL/STRICT
--- a/common/embed-file-system.go
+++ b/common/embed-file-system.go
@@ -4,6 +4,7 @@ import (
 	"embed"
 	"io/fs"
 	"net/http"
+	"os"

 	"github.com/gin-contrib/static"
 )
@@ -14,7 +15,7 @@ type embedFileSystem struct {
 	http.FileSystem
 }

-func (e embedFileSystem) Exists(prefix string, path string) bool {
+func (e *embedFileSystem) Exists(prefix string, path string) bool {
 	_, err := e.Open(path)
 	if err != nil {
 		return false
@@ -22,12 +23,21 @@ func (e embedFileSystem) Exists(prefix string, path string) bool {
 	return true
 }

+func (e *embedFileSystem) Open(name string) (http.File, error) {
+	if name == "/" {
+		// This will make sure the index page goes to NoRouter handler,
+		// which will use the replaced index bytes with analytic codes.
+		return nil, os.ErrNotExist
+	}
+	return e.FileSystem.Open(name)
+}
+
 func EmbedFolder(fsEmbed embed.FS, targetPath string) static.ServeFileSystem {
 	efs, err := fs.Sub(fsEmbed, targetPath)
 	if err != nil {
 		panic(err)
 	}
-	return embedFileSystem{
+	return &embedFileSystem{
 		FileSystem: http.FS(efs),
 	}
 }
--- a/common/init.go
+++ b/common/init.go
@@ -90,6 +90,8 @@ func InitEnv() {
 	SyncFrequency = GetEnvOrDefault("SYNC_FREQUENCY", 60)
 	BatchUpdateInterval = GetEnvOrDefault("BATCH_UPDATE_INTERVAL", 5)
 	RelayTimeout = GetEnvOrDefault("RELAY_TIMEOUT", 0)
+	RelayMaxIdleConns = GetEnvOrDefault("RELAY_MAX_IDLE_CONNS", 500)
+	RelayMaxIdleConnsPerHost = GetEnvOrDefault("RELAY_MAX_IDLE_CONNS_PER_HOST", 100)

 	// Initialize string variables with GetEnvOrDefaultString
 	GeminiSafetySetting = GetEnvOrDefaultString("GEMINI_SAFETY_SETTING", "BLOCK_NONE")
@@ -114,6 +116,7 @@ func initConstantEnv() {
 	constant.StreamingTimeout = GetEnvOrDefault("STREAMING_TIMEOUT", 300)
 	constant.DifyDebug = GetEnvOrDefaultBool("DIFY_DEBUG", true)
 	constant.MaxFileDownloadMB = GetEnvOrDefault("MAX_FILE_DOWNLOAD_MB", 20)
+	constant.StreamScannerMaxBufferMB = GetEnvOrDefault("STREAM_SCANNER_MAX_BUFFER_MB", 64)
 	// ForceStreamOption 覆盖请求参数，强制返回usage信息
 	constant.ForceStreamOption = GetEnvOrDefaultBool("FORCE_STREAM_OPTION", true)
 	constant.CountToken = GetEnvOrDefaultBool("CountToken", true)
@@ -128,6 +131,8 @@ func initConstantEnv() {
 	constant.GenerateDefaultToken = GetEnvOrDefaultBool("GENERATE_DEFAULT_TOKEN", false)
 	// 是否启用错误日志
 	constant.ErrorLogEnabled = GetEnvOrDefaultBool("ERROR_LOG_ENABLED", false)
+	// 任务轮询时查询的最大数量
+	constant.TaskQueryLimit = GetEnvOrDefault("TASK_QUERY_LIMIT", 1000)

 	soraPatchStr := GetEnvOrDefaultString("TASK_PRICE_PATCH", "")
 	if soraPatchStr != "" {
--- a/common/json.go
+++ b/common/json.go
@@ -23,11 +23,11 @@ func Marshal(v any) ([]byte, error) {
 }

 func GetJsonType(data json.RawMessage) string {
-	data = bytes.TrimSpace(data)
-	if len(data) == 0 {
+	trimmed := bytes.TrimSpace(data)
+	if len(trimmed) == 0 {
 		return "unknown"
 	}
-	firstChar := bytes.TrimSpace(data)[0]
+	firstChar := trimmed[0]
 	switch firstChar {
 	case '{':
 		return "object"
--- a/common/model.go
+++ b/common/model.go
@@ -17,6 +17,13 @@ var (
 		"flux-",
 		"flux.1-",
 	}
+	OpenAITextModels = []string{
+		"gpt-",
+		"o1",
+		"o3",
+		"o4",
+		"chatgpt",
+	}
 )

 func IsOpenAIResponseOnlyModel(modelName string) bool {
@@ -40,3 +47,13 @@ func IsImageGenerationModel(modelName string) bool {
 	}
 	return false
 }
+
+func IsOpenAITextModel(modelName string) bool {
+	modelName = strings.ToLower(modelName)
+	for _, m := range OpenAITextModels {
+		if strings.Contains(modelName, m) {
+			return true
+		}
+	}
+	return false
+}
--- a/common/str.go
+++ b/common/str.go
@@ -3,12 +3,19 @@ package common
 import (
 	"encoding/base64"
 	"encoding/json"
-	"math/rand"
 	"net/url"
 	"regexp"
 	"strconv"
 	"strings"
 	"unsafe"
+
+	"github.com/samber/lo"
+)
+
+var (
+	maskURLPattern    = regexp.MustCompile(`(http|https)://[^\s/$.?#].[^\s]*`)
+	maskDomainPattern = regexp.MustCompile(`\b(?:[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?\.)+[a-zA-Z]{2,}\b`)
+	maskIPPattern     = regexp.MustCompile(`\b(?:\d{1,3}\.){3}\d{1,3}\b`)
 )

 func GetStringIfEmpty(str string, defaultValue string) string {
@@ -19,12 +26,10 @@ func GetStringIfEmpty(str string, defaultValue string) string {
 }

 func GetRandomString(length int) string {
-	//rand.Seed(time.Now().UnixNano())
-	key := make([]byte, length)
-	for i := 0; i < length; i++ {
-		key[i] = keyChars[rand.Intn(len(keyChars))]
+	if length <= 0 {
+		return ""
 	}
-	return string(key)
+	return lo.RandomString(length, lo.AlphanumericCharset)
 }

 func MapToJsonStr(m map[string]interface{}) string {
@@ -170,8 +175,7 @@ func maskHostForPlainDomain(domain string) string {
 // api.openai.com -> ***.***.com
 func MaskSensitiveInfo(str string) string {
 	// Mask URLs
-	urlPattern := regexp.MustCompile(`(http|https)://[^\s/$.?#].[^\s]*`)
-	str = urlPattern.ReplaceAllStringFunc(str, func(urlStr string) string {
+	str = maskURLPattern.ReplaceAllStringFunc(str, func(urlStr string) string {
 		u, err := url.Parse(urlStr)
 		if err != nil {
 			return urlStr
@@ -224,14 +228,12 @@ func MaskSensitiveInfo(str string) string {
 	})

 	// Mask domain names without protocol (like openai.com, www.openai.com)
-	domainPattern := regexp.MustCompile(`\b(?:[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?\.)+[a-zA-Z]{2,}\b`)
-	str = domainPattern.ReplaceAllStringFunc(str, func(domain string) string {
+	str = maskDomainPattern.ReplaceAllStringFunc(str, func(domain string) string {
 		return maskHostForPlainDomain(domain)
 	})

 	// Mask IP addresses
-	ipPattern := regexp.MustCompile(`\b(?:\d{1,3}\.){3}\d{1,3}\b`)
-	str = ipPattern.ReplaceAllString(str, "***.***.***.***")
+	str = maskIPPattern.ReplaceAllString(str, "***.***.***.***")

 	return str
 }
--- a/constant/channel.go
+++ b/constant/channel.go
@@ -180,3 +180,27 @@ func GetChannelTypeName(channelType int) string {
 	}
 	return "Unknown"
 }
+
+type ChannelSpecialBase struct {
+	ClaudeBaseURL string
+	OpenAIBaseURL string
+}
+
+var ChannelSpecialBases = map[string]ChannelSpecialBase{
+	"glm-coding-plan": {
+		ClaudeBaseURL: "https://open.bigmodel.cn/api/anthropic",
+		OpenAIBaseURL: "https://open.bigmodel.cn/api/coding/paas/v4",
+	},
+	"glm-coding-plan-international": {
+		ClaudeBaseURL: "https://api.z.ai/api/anthropic",
+		OpenAIBaseURL: "https://api.z.ai/api/coding/paas/v4",
+	},
+	"kimi-coding-plan": {
+		ClaudeBaseURL: "https://api.kimi.com/coding",
+		OpenAIBaseURL: "https://api.kimi.com/coding/v1",
+	},
+	"doubao-coding-plan": {
+		ClaudeBaseURL: "https://ark.cn-beijing.volces.com/api/coding",
+		OpenAIBaseURL: "https://ark.cn-beijing.volces.com/api/coding/v3",
+	},
+}
--- a/constant/context_key.go
+++ b/constant/context_key.go
@@ -3,8 +3,9 @@ package constant
 type ContextKey string

 const (
-	ContextKeyTokenCountMeta ContextKey = "token_count_meta"
-	ContextKeyPromptTokens   ContextKey = "prompt_tokens"
+	ContextKeyTokenCountMeta  ContextKey = "token_count_meta"
+	ContextKeyPromptTokens    ContextKey = "prompt_tokens"
+	ContextKeyEstimatedTokens ContextKey = "estimated_tokens"

 	ContextKeyOriginalModel    ContextKey = "original_model"
 	ContextKeyRequestStartTime ContextKey = "request_start_time"
--- a/constant/env.go
+++ b/constant/env.go
@@ -3,6 +3,7 @@ package constant
 var StreamingTimeout int
 var DifyDebug bool
 var MaxFileDownloadMB int
+var StreamScannerMaxBufferMB int
 var ForceStreamOption bool
 var CountToken bool
 var GetMediaToken bool
@@ -14,6 +15,7 @@ var NotifyLimitCount int
 var NotificationLimitDurationMinute int
 var GenerateDefaultToken bool
 var ErrorLogEnabled bool
+var TaskQueryLimit int

 // temporary variable for sora patch, will be removed in future
 var TaskPricePatches []string
--- a/constant/task.go
+++ b/constant/task.go
@@ -15,6 +15,7 @@ const (
 	TaskActionTextGenerate      = "textGenerate"
 	TaskActionFirstTailGenerate = "firstTailGenerate"
 	TaskActionReferenceGenerate = "referenceGenerate"
+	TaskActionRemix             = "remixGenerate"
 )

 var SunoModel2Action = map[string]string{
--- a/controller/channel-test.go
+++ b/controller/channel-test.go
@@ -351,7 +351,7 @@ func testChannel(channel *model.Channel, testModel string, endpointType string)
 			newAPIError: types.NewOpenAIError(err, types.ErrorCodeReadResponseBodyFailed, http.StatusInternalServerError),
 		}
 	}
-	info.PromptTokens = usage.PromptTokens
+	info.SetEstimatePromptTokens(usage.PromptTokens)

 	quota := 0
 	if !priceData.UsePrice {
--- a/controller/channel.go
+++ b/controller/channel.go
@@ -11,7 +11,6 @@ import (
 	"github.com/QuantumNous/new-api/constant"
 	"github.com/QuantumNous/new-api/dto"
 	"github.com/QuantumNous/new-api/model"
-	"github.com/QuantumNous/new-api/relay/channel/volcengine"
 	"github.com/QuantumNous/new-api/service"

 	"github.com/gin-gonic/gin"
@@ -166,6 +165,30 @@ func GetAllChannels(c *gin.Context) {
 	return
 }

+func buildFetchModelsHeaders(channel *model.Channel, key string) (http.Header, error) {
+	var headers http.Header
+	switch channel.Type {
+	case constant.ChannelTypeAnthropic:
+		headers = GetClaudeAuthHeader(key)
+	default:
+		headers = GetAuthHeader(key)
+	}
+
+	headerOverride := channel.GetHeaderOverride()
+	for k, v := range headerOverride {
+		str, ok := v.(string)
+		if !ok {
+			return nil, fmt.Errorf("invalid header override for key %s", k)
+		}
+		if strings.Contains(str, "{api_key}") {
+			str = strings.ReplaceAll(str, "{api_key}", key)
+		}
+		headers.Set(k, str)
+	}
+
+	return headers, nil
+}
+
 func FetchUpstreamModels(c *gin.Context) {
 	id, err := strconv.Atoi(c.Param("id"))
 	if err != nil {
@@ -192,10 +215,20 @@ func FetchUpstreamModels(c *gin.Context) {
 	case constant.ChannelTypeAli:
 		url = fmt.Sprintf("%s/compatible-mode/v1/models", baseURL)
 	case constant.ChannelTypeZhipu_v4:
-		url = fmt.Sprintf("%s/api/paas/v4/models", baseURL)
+		if plan, ok := constant.ChannelSpecialBases[baseURL]; ok && plan.OpenAIBaseURL != "" {
+			url = fmt.Sprintf("%s/models", plan.OpenAIBaseURL)
+		} else {
+			url = fmt.Sprintf("%s/api/paas/v4/models", baseURL)
+		}
 	case constant.ChannelTypeVolcEngine:
-		if baseURL == volcengine.DoubaoCodingPlan {
-			url = fmt.Sprintf("%s/v1/models", volcengine.DoubaoCodingPlanOpenAIBaseURL)
+		if plan, ok := constant.ChannelSpecialBases[baseURL]; ok && plan.OpenAIBaseURL != "" {
+			url = fmt.Sprintf("%s/v1/models", plan.OpenAIBaseURL)
+		} else {
+			url = fmt.Sprintf("%s/v1/models", baseURL)
+		}
+	case constant.ChannelTypeMoonshot:
+		if plan, ok := constant.ChannelSpecialBases[baseURL]; ok && plan.OpenAIBaseURL != "" {
+			url = fmt.Sprintf("%s/models", plan.OpenAIBaseURL)
 		} else {
 			url = fmt.Sprintf("%s/v1/models", baseURL)
 		}
@@ -214,14 +247,13 @@ func FetchUpstreamModels(c *gin.Context) {
 	}
 	key = strings.TrimSpace(key)

-	// 获取响应体 - 根据渠道类型决定是否添加 AuthHeader
-	var body []byte
-	switch channel.Type {
-	case constant.ChannelTypeAnthropic:
-		body, err = GetResponseBody("GET", url, channel, GetClaudeAuthHeader(key))
-	default:
-		body, err = GetResponseBody("GET", url, channel, GetAuthHeader(key))
+	headers, err := buildFetchModelsHeaders(channel, key)
+	if err != nil {
+		common.ApiError(c, err)
+		return
 	}
+
+	body, err := GetResponseBody("GET", url, channel, headers)
 	if err != nil {
 		common.ApiError(c, err)
 		return
--- a/controller/relay.go
+++ b/controller/relay.go
@@ -125,13 +125,13 @@ func Relay(c *gin.Context, relayFormat types.RelayFormat) {
 		}
 	}

-	tokens, err := service.CountRequestToken(c, meta, relayInfo)
+	tokens, err := service.EstimateRequestToken(c, meta, relayInfo)
 	if err != nil {
 		newAPIError = types.NewError(err, types.ErrorCodeCountTokenFailed)
 		return
 	}

-	relayInfo.SetPromptTokens(tokens)
+	relayInfo.SetEstimatePromptTokens(tokens)

 	priceData, err := helper.ModelPriceHelper(c, relayInfo, tokens, meta)
 	if err != nil {
@@ -285,7 +285,7 @@ func processChannelError(c *gin.Context, channelError types.ChannelError, err *t
 	logger.LogError(c, fmt.Sprintf("channel error (channel #%d, status code: %d): %s", channelError.ChannelId, err.StatusCode, err.Error()))
 	// 不要使用context获取渠道信息，异步处理时可能会出现渠道信息不一致的情况
 	// do not use context to get channel info, there may be inconsistent channel info when processing asynchronously
-	if service.ShouldDisableChannel(channelError.ChannelId, err) && channelError.AutoBan {
+	if service.ShouldDisableChannel(channelError.ChannelType, err) && channelError.AutoBan {
 		gopool.Go(func() {
 			service.DisableChannel(channelError, err.Error())
 		})
--- a/controller/task.go
+++ b/controller/task.go
@@ -29,7 +29,7 @@ func UpdateTaskBulk() {
 		time.Sleep(time.Duration(15) * time.Second)
 		common.SysLog("任务进度轮询开始")
 		ctx := context.TODO()
-		allTasks := model.GetAllUnFinishSyncTasks(500)
+		allTasks := model.GetAllUnFinishSyncTasks(constant.TaskQueryLimit)
 		platformTask := make(map[constant.TaskPlatform][]*model.Task)
 		for _, t := range allTasks {
 			platformTask[t.Platform] = append(platformTask[t.Platform], t)
@@ -116,9 +116,10 @@ func updateSunoTaskAll(ctx context.Context, channelId int, taskIds []string, tas
 	if adaptor == nil {
 		return errors.New("adaptor not found")
 	}
+	proxy := channel.GetSetting().Proxy
 	resp, err := adaptor.FetchTask(*channel.BaseURL, channel.Key, map[string]any{
 		"ids": taskIds,
-	})
+	}, proxy)
 	if err != nil {
 		common.SysLog(fmt.Sprintf("Get Task Do req error: %v", err))
 		return err
--- a/controller/task_video.go
+++ b/controller/task_video.go
@@ -67,6 +67,7 @@ func updateVideoSingleTask(ctx context.Context, adaptor channel.TaskAdaptor, cha
 	if channel.GetBaseURL() != "" {
 		baseURL = channel.GetBaseURL()
 	}
+	proxy := channel.GetSetting().Proxy

 	task := taskM[taskId]
 	if task == nil {
@@ -76,7 +77,7 @@ func updateVideoSingleTask(ctx context.Context, adaptor channel.TaskAdaptor, cha
 	resp, err := adaptor.FetchTask(baseURL, channel.Key, map[string]any{
 		"task_id": taskId,
 		"action":  task.Action,
-	})
+	}, proxy)
 	if err != nil {
 		return fmt.Errorf("fetchTask failed for task %s: %w", taskId, err)
 	}
--- a/controller/token.go
+++ b/controller/token.go
@@ -142,7 +142,7 @@ func AddToken(c *gin.Context) {
 		common.ApiError(c, err)
 		return
 	}
-	if len(token.Name) > 30 {
+	if len(token.Name) > 50 {
 		c.JSON(http.StatusOK, gin.H{
 			"success": false,
 			"message": "令牌名称过长",
@@ -208,7 +208,7 @@ func UpdateToken(c *gin.Context) {
 		common.ApiError(c, err)
 		return
 	}
-	if len(token.Name) > 30 {
+	if len(token.Name) > 50 {
 		c.JSON(http.StatusOK, gin.H{
 			"success": false,
 			"message": "令牌名称过长",
--- a/controller/video_proxy.go
+++ b/controller/video_proxy.go
@@ -1,6 +1,7 @@
 package controller

 import (
+	"context"
 	"fmt"
 	"io"
 	"net/http"
@@ -10,6 +11,7 @@ import (
 	"github.com/QuantumNous/new-api/constant"
 	"github.com/QuantumNous/new-api/logger"
 	"github.com/QuantumNous/new-api/model"
+	"github.com/QuantumNous/new-api/service"

 	"github.com/gin-gonic/gin"
 )
@@ -75,11 +77,22 @@ func VideoProxy(c *gin.Context) {
 	}

 	var videoURL string
-	client := &http.Client{
-		Timeout: 60 * time.Second,
+	proxy := channel.GetSetting().Proxy
+	client, err := service.GetHttpClientWithProxy(proxy)
+	if err != nil {
+		logger.LogError(c.Request.Context(), fmt.Sprintf("Failed to create proxy client for task %s: %s", taskID, err.Error()))
+		c.JSON(http.StatusInternalServerError, gin.H{
+			"error": gin.H{
+				"message": "Failed to create proxy client",
+				"type":    "server_error",
+			},
+		})
+		return
 	}

-	req, err := http.NewRequestWithContext(c.Request.Context(), http.MethodGet, "", nil)
+	ctx, cancel := context.WithTimeout(c.Request.Context(), 60*time.Second)
+	defer cancel()
+	req, err := http.NewRequestWithContext(ctx, http.MethodGet, "", nil)
 	if err != nil {
 		logger.LogError(c.Request.Context(), fmt.Sprintf("Failed to create request: %s", err.Error()))
 		c.JSON(http.StatusInternalServerError, gin.H{
@@ -117,13 +130,12 @@ func VideoProxy(c *gin.Context) {
 			return
 		}
 		req.Header.Set("x-goog-api-key", apiKey)
-	case constant.ChannelTypeAli:
-		// Video URL is directly in task.FailReason
-		videoURL = task.FailReason
-	default:
-		// Default (Sora, etc.): Use original logic
+	case constant.ChannelTypeOpenAI, constant.ChannelTypeSora:
 		videoURL = fmt.Sprintf("%s/v1/videos/%s/content", baseURL, task.TaskID)
 		req.Header.Set("Authorization", "Bearer "+channel.Key)
+	default:
+		// Video URL is directly in task.FailReason
+		videoURL = task.FailReason
 	}

 	req.URL, err = url.Parse(videoURL)
--- a/controller/video_proxy_gemini.go
+++ b/controller/video_proxy_gemini.go
@@ -35,10 +35,11 @@ func getGeminiVideoURL(channel *model.Channel, task *model.Task, apiKey string)
 		return "", fmt.Errorf("api key not available for task")
 	}

+	proxy := channel.GetSetting().Proxy
 	resp, err := adaptor.FetchTask(baseURL, apiKey, map[string]any{
 		"task_id": task.TaskID,
 		"action":  task.Action,
-	})
+	}, proxy)
 	if err != nil {
 		return "", fmt.Errorf("fetch task failed: %w", err)
 	}
--- a/docs/api/api_auth.md
+++ b/docs/api/api_auth.md
@@ -1,53 +0,0 @@
-# API 鉴权文档
-
-## 认证方式
-
-### Access Token
-
-对于需要鉴权的 API 接口，必须同时提供以下两个请求头来进行 Access Token 认证：
-
-1. **请求头中的 `Authorization` 字段**
-
-    将 Access Token 放置于 HTTP 请求头部的 `Authorization` 字段中，格式如下：
-
-    ```
-    Authorization: <your_access_token>
-    ```
-
-    其中 `<your_access_token>` 需要替换为实际的 Access Token 值。
-
-2. **请求头中的 `New-Api-User` 字段**
-
-    将用户 ID 放置于 HTTP 请求头部的 `New-Api-User` 字段中，格式如下：
-
-    ```
-    New-Api-User: <your_user_id>
-    ```
-
-    其中 `<your_user_id>` 需要替换为实际的用户 ID。
-
-**注意：**
-
-*   **必须同时提供 `Authorization` 和 `New-Api-User` 两个请求头才能通过鉴权。**
-*   如果只提供其中一个请求头，或者两个请求头都未提供，则会返回 `401 Unauthorized` 错误。
-*   如果 `Authorization` 中的 Access Token 无效，则会返回 `401 Unauthorized` 错误，并提示“无权进行此操作，access token 无效”。
-*   如果 `New-Api-User` 中的用户 ID 与 Access Token 不匹配，则会返回 `401 Unauthorized` 错误，并提示“无权进行此操作，与登录用户不匹配，请重新登录”。
-*   如果没有提供 `New-Api-User` 请求头，则会返回 `401 Unauthorized` 错误，并提示“无权进行此操作，未提供 New-Api-User”。
-*   如果 `New-Api-User` 请求头格式错误，则会返回 `401 Unauthorized` 错误，并提示“无权进行此操作，New-Api-User 格式错误”。
-*   如果用户已被禁用，则会返回 `403 Forbidden` 错误，并提示“用户已被封禁”。
-*   如果用户权限不足，则会返回 `403 Forbidden` 错误，并提示“无权进行此操作，权限不足”。
-*   如果用户信息无效，则会返回 `403 Forbidden` 错误，并提示“无权进行此操作，用户信息无效”。
-
-## Curl 示例
-
-假设您的 Access Token 为 `access_token`，用户 ID 为 `123`，要访问的 API 接口为 `/api/user/self`，则可以使用以下 curl 命令：
-
-```bash
-curl -X GET \
-  -H "Authorization: access_token" \
-  -H "New-Api-User: 123" \
-  https://your-domain.com/api/user/self
-```
-
-请将 `access_token`、`123` 和 `https://your-domain.com` 替换为实际的值。
-
--- a/docs/api/web_api.md
+++ b/docs/api/web_api.md
@@ -1,198 +0,0 @@
-# New API – Web 界面后端接口文档
-
-> 本文档汇总了 **New API** 后端提供给前端 Web 界面的全部 REST 接口（不含 *Relay* 相关接口）。
->
-> 接口前缀统一为 `https://<your-domain>`，以下仅列出 **路径**、**HTTP 方法**、**鉴权要求** 与 **功能简介**。
->
-> 鉴权级别说明：
-> * **公开** – 不需要登录即可调用
-> * **用户** – 需携带用户 Token（`middleware.UserAuth`）
-> * **管理员** – 需管理员 Token（`middleware.AdminAuth`）
-> * **Root** – 仅限最高权限 Root 用户（`middleware.RootAuth`）
-
---
-
-## 1. 初始化 / 系统状态
-| 方法 | 路径 | 鉴权 | 说明 |
-|------|------|------|------|
-| GET  | /api/setup | 公开 | 获取系统初始化状态 |
-| POST | /api/setup | 公开 | 完成首次安装向导 |
-| GET  | /api/status | 公开 | 获取运行状态摘要 |
-| GET  | /api/uptime/status | 公开 | Uptime-Kuma 兼容状态探针 |
-| GET  | /api/status/test | 管理员 | 测试后端与依赖组件是否正常 |
-
-## 2. 公共信息
-| 方法 | 路径 | 鉴权 | 说明 |
-|------|------|------|------|
-| GET | /api/models | 用户 | 获取前端可用模型列表 |
-| GET | /api/notice | 公开 | 获取公告栏内容 |
-| GET | /api/about | 公开 | 关于页面信息 |
-| GET | /api/home_page_content | 公开 | 首页自定义内容 |
-| GET | /api/pricing | 可匿名/用户 | 价格与套餐信息 |
-| GET | /api/ratio_config | 公开 | 模型倍率配置（仅公开字段） |
-
-## 3. 邮件 / 身份验证
-| 方法 | 路径 | 鉴权 | 说明 |
-|------|------|------|------|
-| GET | /api/verification | 公开 (限流) | 发送邮箱验证邮件 |
-| GET | /api/reset_password | 公开 (限流) | 发送重置密码邮件 |
-| POST | /api/user/reset | 公开 | 提交重置密码请求 |
-
-## 4. OAuth / 第三方登录
-| 方法 | 路径 | 鉴权 | 说明 |
-|------|------|------|------|
-| GET | /api/oauth/github | 公开 | GitHub OAuth 跳转 |
-| GET | /api/oauth/discord | 公开 | Discord 通用 OAuth 跳转 |
-| GET | /api/oauth/oidc | 公开 | OIDC 通用 OAuth 跳转 |
-| GET | /api/oauth/linuxdo | 公开 | LinuxDo OAuth 跳转 |
-| GET | /api/oauth/wechat | 公开 | 微信扫码登录跳转 |
-| GET | /api/oauth/wechat/bind | 公开 | 微信账户绑定 |
-| GET | /api/oauth/email/bind | 公开 | 邮箱绑定 |
-| GET | /api/oauth/telegram/login | 公开 | Telegram 登录 |
-| GET | /api/oauth/telegram/bind | 公开 | Telegram 账户绑定 |
-| GET | /api/oauth/state | 公开 | 获取随机 state（防 CSRF） |
-
-## 5. 用户模块
-### 5.1 账号注册/登录
-| 方法 | 路径 | 鉴权 | 说明 |
-|------|------|------|------|
-| POST | /api/user/register | 公开 | 注册新账号 |
-| POST | /api/user/login | 公开 | 用户登录 |
-| GET  | /api/user/logout | 用户 | 退出登录 |
-| GET  | /api/user/epay/notify | 公开 | Epay 支付回调 |
-| GET  | /api/user/groups | 公开 | 列出所有分组（无鉴权版） |
-
-### 5.2 用户自身操作 (需登录)
-| 方法 | 路径 | 鉴权 | 说明 |
-|------|------|------|------|
-| GET | /api/user/self/groups | 用户 | 获取自己所在分组 |
-| GET | /api/user/self | 用户 | 获取个人资料 |
-| GET | /api/user/models | 用户 | 获取模型可见性 |
-| PUT | /api/user/self | 用户 | 修改个人资料 |
-| DELETE | /api/user/self | 用户 | 注销账号 |
-| GET | /api/user/token | 用户 | 生成用户级别 Access Token |
-| GET | /api/user/aff | 用户 | 获取推广码信息 |
-| POST | /api/user/topup | 用户 | 余额直充 |
-| POST | /api/user/pay | 用户 | 提交支付订单 |
-| POST | /api/user/amount | 用户 | 余额支付 |
-| POST | /api/user/aff_transfer | 用户 | 推广额度转账 |
-| PUT | /api/user/setting | 用户 | 更新用户设置 |
-
-### 5.3 管理员用户管理
-| 方法 | 路径 | 鉴权 | 说明 |
-|------|------|------|------|
-| GET | /api/user/ | 管理员 | 获取全部用户列表 |
-| GET | /api/user/search | 管理员 | 搜索用户 |
-| GET | /api/user/:id | 管理员 | 获取单个用户信息 |
-| POST | /api/user/ | 管理员 | 创建用户 |
-| POST | /api/user/manage | 管理员 | 冻结/重置等管理操作 |
-| PUT | /api/user/ | 管理员 | 更新用户 |
-| DELETE | /api/user/:id | 管理员 | 删除用户 |
-
-## 6. 站点选项 (Root)
-| 方法 | 路径 | 鉴权 | 说明 |
-|------|------|------|------|
-| GET | /api/option/ | Root | 获取全局配置 |
-| PUT | /api/option/ | Root | 更新全局配置 |
-| POST | /api/option/rest_model_ratio | Root | 重置模型倍率 |
-| POST | /api/option/migrate_console_setting | Root | 迁移旧版控制台配置 |
-
-## 7. 模型倍率同步 (Root)
-| 方法 | 路径 | 鉴权 | 说明 |
-|------|------|------|------|
-| GET | /api/ratio_sync/channels | Root | 获取可同步渠道列表 |
-| POST | /api/ratio_sync/fetch | Root | 从上游拉取倍率 |
-
-## 8. 渠道管理 (管理员)
-| 方法 | 路径 | 说明 |
-|------|------|------|
-| GET | /api/channel/ | 获取渠道列表 |
-| GET | /api/channel/search | 搜索渠道 |
-| GET | /api/channel/models | 查询渠道模型能力 |
-| GET | /api/channel/models_enabled | 查询启用模型能力 |
-| GET | /api/channel/:id | 获取单个渠道 |
-| GET | /api/channel/test | 批量测试渠道连通性 |
-| GET | /api/channel/test/:id | 单个渠道测试 |
-| GET | /api/channel/update_balance | 批量刷新余额 |
-| GET | /api/channel/update_balance/:id | 单个刷新余额 |
-| POST | /api/channel/ | 新增渠道 |
-| PUT | /api/channel/ | 更新渠道 |
-| DELETE | /api/channel/disabled | 删除已禁用渠道 |
-| POST | /api/channel/tag/disabled | 批量禁用标签渠道 |
-| POST | /api/channel/tag/enabled | 批量启用标签渠道 |
-| PUT | /api/channel/tag | 编辑渠道标签 |
-| DELETE | /api/channel/:id | 删除渠道 |
-| POST | /api/channel/batch | 批量删除渠道 |
-| POST | /api/channel/fix | 修复渠道能力表 |
-| GET | /api/channel/fetch_models/:id | 拉取单渠道模型 |
-| POST | /api/channel/fetch_models | 拉取全部渠道模型 |
-| POST | /api/channel/batch/tag | 批量设置渠道标签 |
-| GET | /api/channel/tag/models | 根据标签获取模型 |
-| POST | /api/channel/copy/:id | 复制渠道 |
-
-## 9. Token 管理
-| 方法 | 路径 | 鉴权 | 说明 |
-|------|------|------|------|
-| GET | /api/token/ | 用户 | 获取全部 Token |
-| GET | /api/token/search | 用户 | 搜索 Token |
-| GET | /api/token/:id | 用户 | 获取单个 Token |
-| POST | /api/token/ | 用户 | 创建 Token |
-| PUT | /api/token/ | 用户 | 更新 Token |
-| DELETE | /api/token/:id | 用户 | 删除 Token |
-| POST | /api/token/batch | 用户 | 批量删除 Token |
-
-## 10. 兑换码管理 (管理员)
-| 方法 | 路径 | 说明 |
-|------|------|------|
-| GET | /api/redemption/ | 获取兑换码列表 |
-| GET | /api/redemption/search | 搜索兑换码 |
-| GET | /api/redemption/:id | 获取单个兑换码 |
-| POST | /api/redemption/ | 创建兑换码 |
-| PUT | /api/redemption/ | 更新兑换码 |
-| DELETE | /api/redemption/invalid | 删除无效兑换码 |
-| DELETE | /api/redemption/:id | 删除兑换码 |
-
-## 11. 日志
-| 方法 | 路径 | 鉴权 | 说明 |
-|------|------|------|------|
-| GET | /api/log/ | 管理员 | 获取全部日志 |
-| DELETE | /api/log/ | 管理员 | 删除历史日志 |
-| GET | /api/log/stat | 管理员 | 日志统计 |
-| GET | /api/log/self/stat | 用户 | 我的日志统计 |
-| GET | /api/log/search | 管理员 | 搜索全部日志 |
-| GET | /api/log/self | 用户 | 获取我的日志 |
-| GET | /api/log/self/search | 用户 | 搜索我的日志 |
-| GET | /api/log/token | 公开 | 根据 Token 查询日志（支持 CORS） |
-
-## 12. 数据统计
-| 方法 | 路径 | 鉴权 | 说明 |
-|------|------|------|------|
-| GET | /api/data/ | 管理员 | 全站用量按日期统计 |
-| GET | /api/data/self | 用户 | 我的用量按日期统计 |
-
-## 13. 分组
-| GET | /api/group/ | 管理员 | 获取全部分组列表 |
-
-## 14. Midjourney 任务
-| 方法 | 路径 | 鉴权 | 说明 |
-|------|------|------|------|
-| GET | /api/mj/self | 用户 | 获取自己的 MJ 任务 |
-| GET | /api/mj/ | 管理员 | 获取全部 MJ 任务 |
-
-## 15. 任务中心
-| 方法 | 路径 | 鉴权 | 说明 |
-|------|------|------|------|
-| GET | /api/task/self | 用户 | 获取我的任务 |
-| GET | /api/task/ | 管理员 | 获取全部任务 |
-
-## 16. 账户计费面板 (Dashboard)
-| 方法 | 路径 | 鉴权 | 说明 |
-|------|------|------|------|
-| GET | /dashboard/billing/subscription | 用户 Token | 获取订阅额度信息 |
-| GET | /v1/dashboard/billing/subscription | 同上 | 兼容 OpenAI SDK 路径 |
-| GET | /dashboard/billing/usage | 用户 Token | 获取使用量信息 |
-| GET | /v1/dashboard/billing/usage | 同上 | 兼容 OpenAI SDK 路径 |
-
---
-
-> **更新日期**：2025.07.17
--- a/docs/models/Midjourney.md
+++ b/docs/models/Midjourney.md
@@ -1,82 +0,0 @@
-# Midjourney Proxy API文档
-
-**简介**:Midjourney Proxy API文档
-
-## 接口列表
-支持的接口如下：
-+ [x] /mj/submit/imagine
-+ [x] /mj/submit/change
-+ [x] /mj/submit/blend
-+ [x] /mj/submit/describe
-+ [x] /mj/image/{id} （通过此接口获取图片，**请必须在系统设置中填写服务器地址！！**）
-+ [x] /mj/task/{id}/fetch （此接口返回的图片地址为经过One API转发的地址）
-+ [x] /task/list-by-condition
-+ [x] /mj/submit/action （仅midjourney-proxy-plus支持，下同）
-+ [x] /mj/submit/modal
-+ [x] /mj/submit/shorten
-+ [x] /mj/task/{id}/image-seed
-+ [x] /mj/insight-face/swap （InsightFace）
-
-## 模型列表
-
-### midjourney-proxy支持
-
- mj_imagine (绘图)
- mj_variation (变换)
- mj_reroll (重绘)
- mj_blend (混合)
- mj_upscale (放大)
- mj_describe (图生文)
-
-### 仅midjourney-proxy-plus支持
-
- mj_zoom (比例变焦)
- mj_shorten (提示词缩短)
- mj_modal (窗口提交，局部重绘和自定义比例变焦必须和mj_modal一同添加)
- mj_inpaint (局部重绘提交，必须和mj_modal一同添加)
- mj_custom_zoom (自定义比例变焦，必须和mj_modal一同添加)
- mj_high_variation (强变换)
- mj_low_variation (弱变换)
- mj_pan (平移)
- swap_face (换脸)
-
-## 模型价格设置（在设置-运营设置-模型固定价格设置中设置）
-```json
-{
-  "mj_imagine": 0.1,
-  "mj_variation": 0.1,
-  "mj_reroll": 0.1,
-  "mj_blend": 0.1,
-  "mj_modal": 0.1,
-  "mj_zoom": 0.1,
-  "mj_shorten": 0.1,
-  "mj_high_variation": 0.1,
-  "mj_low_variation": 0.1,
-  "mj_pan": 0.1,
-  "mj_inpaint": 0,
-  "mj_custom_zoom": 0,
-  "mj_describe": 0.05,
-  "mj_upscale": 0.05,
-  "swap_face": 0.05
-}
-```
-其中mj_inpaint和mj_custom_zoom的价格设置为0，是因为这两个模型需要搭配mj_modal使用，所以价格由mj_modal决定。
-
-## 渠道设置
-
-### 对接 midjourney-proxy(plus)
-
-1.
-
-部署Midjourney-Proxy，并配置好midjourney账号等（强烈建议设置密钥），[项目地址](https://github.com/novicezk/midjourney-proxy)
-
-2. 在渠道管理中添加渠道，渠道类型选择**Midjourney Proxy**，如果是plus版本选择**Midjourney Proxy Plus**
-   ，模型请参考上方模型列表
-3. **代理**填写midjourney-proxy部署的地址，例如：http://localhost:8080
-4. 密钥填写midjourney-proxy的密钥，如果没有设置密钥，可以随便填
-
-### 对接上游new api
-
-1. 在渠道管理中添加渠道，渠道类型选择**Midjourney Proxy Plus**，模型请参考上方模型列表
-2. **代理**填写上游new api的地址，例如：http://localhost:3000
-3. 密钥填写上游new api的密钥
--- a/docs/models/Rerank.md
+++ b/docs/models/Rerank.md
@@ -1,62 +0,0 @@
-# Rerank API文档
-
-**简介**:Rerank API文档
-
-## 接入Dify
-模型供应商选择Jina，按要求填写模型信息即可接入Dify。
-
-## 请求方式
-
-Post: /v1/rerank
-
-Request:
-
-```json
-{
-  "model": "jina-reranker-v2-base-multilingual",
-  "query": "What is the capital of the United States?",
-  "top_n": 3,
-  "documents": [
-    "Carson City is the capital city of the American state of Nevada.",
-    "The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean. Its capital is Saipan.",
-    "Washington, D.C. (also known as simply Washington or D.C., and officially as the District of Columbia) is the capital of the United States. It is a federal district.",
-    "Capitalization or capitalisation in English grammar is the use of a capital letter at the start of a word. English usage varies from capitalization in other languages.",
-    "Capital punishment (the death penalty) has existed in the United States since beforethe United States was a country. As of 2017, capital punishment is legal in 30 of the 50 states."
-  ]
-}
-```
-
-Response:
-
-```json
-{
-  "results": [
-    {
-      "document": {
-        "text": "Washington, D.C. (also known as simply Washington or D.C., and officially as the District of Columbia) is the capital of the United States. It is a federal district."
-      },
-      "index": 2,
-      "relevance_score": 0.9999702
-    },
-    {
-      "document": {
-        "text": "Carson City is the capital city of the American state of Nevada."
-      },
-      "index": 0,
-      "relevance_score": 0.67800725
-    },
-    {
-      "document": {
-        "text": "Capitalization or capitalisation in English grammar is the use of a capital letter at the start of a word. English usage varies from capitalization in other languages."
-      },
-      "index": 3,
-      "relevance_score": 0.02800752
-    }
-  ],
-  "usage": {
-    "prompt_tokens": 158,
-    "completion_tokens": 0,
-    "total_tokens": 158
-  }
-}
-```
--- a/docs/models/Suno.md
+++ b/docs/models/Suno.md
@@ -1,44 +0,0 @@
-# Suno API文档
-
-**简介**:Suno API文档
-
-## 接口列表
-支持的接口如下：
-+ [x] /suno/submit/music
-+ [x] /suno/submit/lyrics
-+ [x] /suno/fetch
-+ [x] /suno/fetch/:id
-
-## 模型列表
-
-### Suno API支持
-
- suno_music (自定义模式、灵感模式、续写)
- suno_lyrics (生成歌词)
-
-
-## 模型价格设置（在设置-运营设置-模型固定价格设置中设置）
-```json
-{
-  "suno_music": 0.3,
-  "suno_lyrics": 0.01
-}
-```
-
-## 渠道设置
-
-### 对接 Suno API
-
-1.
-部署 Suno API，并配置好suno账号等（强烈建议设置密钥），[项目地址](https://github.com/Suno-API/Suno-API)
-
-2. 在渠道管理中添加渠道，渠道类型选择**Suno API**
-   ，模型请参考上方模型列表
-3. **代理**填写 Suno API 部署的地址，例如：http://localhost:8080
-4. 密钥填写 Suno API 的密钥，如果没有设置密钥，可以随便填
-
-### 对接上游new api
-
-1. 在渠道管理中添加渠道，渠道类型选择**Suno API**，或任意类型，只需模型包含上方模型列表的模型
-2. **代理**填写上游new api的地址，例如：http://localhost:3000
-3. 密钥填写上游new api的密钥
--- a/docs/openapi/api.json
+++ b/docs/openapi/api.json
--- a/docs/openapi/relay.json
+++ b/docs/openapi/relay.json
--- a/dto/claude.go
+++ b/dto/claude.go
@@ -203,6 +203,9 @@ type ClaudeRequest struct {
 	Stream            bool            `json:"stream,omitempty"`
 	Tools             any             `json:"tools,omitempty"`
 	ContextManagement json.RawMessage `json:"context_management,omitempty"`
+	OutputConfig      json.RawMessage `json:"output_config,omitempty"`
+	OutputFormat      json.RawMessage `json:"output_format,omitempty"`
+	Container         json.RawMessage `json:"container,omitempty"`
 	ToolChoice        any             `json:"tool_choice,omitempty"`
 	Thinking          *Thinking       `json:"thinking,omitempty"`
 	McpServers        json.RawMessage `json:"mcp_servers,omitempty"`
--- a/dto/gemini.go
+++ b/dto/gemini.go
@@ -142,7 +142,7 @@ type GeminiThinkingConfig struct {
 	IncludeThoughts bool `json:"includeThoughts,omitempty"`
 	ThinkingBudget  *int `json:"thinkingBudget,omitempty"`
 	// TODO Conflict with thinkingbudget.
-	ThinkingLevel json.RawMessage `json:"thinkingLevel,omitempty"`
+	ThinkingLevel string `json:"thinkingLevel,omitempty"`
 }

 // UnmarshalJSON allows GeminiThinkingConfig to accept both snake_case and camelCase fields.
@@ -150,9 +150,9 @@ func (c *GeminiThinkingConfig) UnmarshalJSON(data []byte) error {
 	type Alias GeminiThinkingConfig
 	var aux struct {
 		Alias
-		IncludeThoughtsSnake *bool           `json:"include_thoughts,omitempty"`
-		ThinkingBudgetSnake  *int            `json:"thinking_budget,omitempty"`
-		ThinkingLevelSnake   json.RawMessage `json:"thinking_level,omitempty"`
+		IncludeThoughtsSnake *bool  `json:"include_thoughts,omitempty"`
+		ThinkingBudgetSnake  *int   `json:"thinking_budget,omitempty"`
+		ThinkingLevelSnake   string `json:"thinking_level,omitempty"`
 	}

 	if err := common.Unmarshal(data, &aux); err != nil {
@@ -169,7 +169,7 @@ func (c *GeminiThinkingConfig) UnmarshalJSON(data []byte) error {
 		c.ThinkingBudget = aux.ThinkingBudgetSnake
 	}

-	if len(aux.ThinkingLevelSnake) > 0 {
+	if aux.ThinkingLevelSnake != "" {
 		c.ThinkingLevel = aux.ThinkingLevelSnake
 	}

--- a/dto/openai_image.go
+++ b/dto/openai_image.go
@@ -27,8 +27,11 @@ type ImageRequest struct {
 	OutputCompression json.RawMessage `json:"output_compression,omitempty"`
 	PartialImages     json.RawMessage `json:"partial_images,omitempty"`
 	// Stream            bool            `json:"stream,omitempty"`
-	Watermark *bool           `json:"watermark,omitempty"`
-	Image     json.RawMessage `json:"image,omitempty"`
+	Watermark *bool `json:"watermark,omitempty"`
+	// zhipu 4v
+	WatermarkEnabled json.RawMessage `json:"watermark_enabled,omitempty"`
+	UserId           json.RawMessage `json:"user_id,omitempty"`
+	Image            json.RawMessage `json:"image,omitempty"`
 	// 用匿名参数接收额外参数
 	Extra map[string]json.RawMessage `json:"-"`
 }
--- a/dto/openai_request.go
+++ b/dto/openai_request.go
@@ -83,6 +83,7 @@ type GeneralOpenAIRequest struct {
 	// Ali Qwen Params
 	VlHighResolutionImages json.RawMessage `json:"vl_high_resolution_images,omitempty"`
 	EnableThinking         any             `json:"enable_thinking,omitempty"`
+	ChatTemplateKwargs     json.RawMessage `json:"chat_template_kwargs,omitempty"`
 	// ollama Params
 	Think json.RawMessage `json:"think,omitempty"`
 	// baidu v2
--- a/go.mod
+++ b/go.mod
@@ -33,7 +33,7 @@ require (
 	github.com/mewkiz/flac v1.0.13
 	github.com/pkg/errors v0.9.1
 	github.com/pquerna/otp v1.5.0
-	github.com/samber/lo v1.39.0
+	github.com/samber/lo v1.52.0
 	github.com/shirou/gopsutil v3.21.11+incompatible
 	github.com/shopspring/decimal v1.4.0
 	github.com/stripe/stripe-go/v81 v81.4.0
@@ -99,6 +99,7 @@ require (
 	github.com/mitchellh/mapstructure v1.5.0 // indirect
 	github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd // indirect
 	github.com/modern-go/reflect2 v1.0.2 // indirect
+	github.com/ncruces/go-strftime v0.1.9 // indirect
 	github.com/pelletier/go-toml/v2 v2.2.1 // indirect
 	github.com/remyoudompheng/bigfft v0.0.0-20230129092748-24d4a6f8daec // indirect
 	github.com/tidwall/match v1.1.1 // indirect
@@ -110,13 +111,13 @@ require (
 	github.com/x448/float16 v0.8.4 // indirect
 	github.com/yusufpapurcu/wmi v1.2.3 // indirect
 	golang.org/x/arch v0.21.0 // indirect
-	golang.org/x/exp v0.0.0-20240404231335-c0f41cb1a7a0 // indirect
+	golang.org/x/exp v0.0.0-20250620022241-b7579e27df2b // indirect
 	golang.org/x/sys v0.38.0 // indirect
 	golang.org/x/text v0.31.0 // indirect
 	google.golang.org/protobuf v1.34.2 // indirect
 	gopkg.in/yaml.v3 v3.0.1 // indirect
-	modernc.org/libc v1.22.5 // indirect
-	modernc.org/mathutil v1.5.0 // indirect
-	modernc.org/memory v1.5.0 // indirect
-	modernc.org/sqlite v1.23.1 // indirect
+	modernc.org/libc v1.66.10 // indirect
+	modernc.org/mathutil v1.7.1 // indirect
+	modernc.org/memory v1.11.0 // indirect
+	modernc.org/sqlite v1.40.1 // indirect
 )
--- a/go.sum
+++ b/go.sum
@@ -120,6 +120,7 @@ github.com/google/go-tpm v0.9.5/go.mod h1:h9jEsEECg7gtLis0upRBQU+GhYVH6jMjrFxI8u
 github.com/google/gofuzz v1.0.0/go.mod h1:dBl0BpW6vV/+mYPU4Po3pmUjxk6FQPldtuIdl/M65Eg=
 github.com/google/pprof v0.0.0-20221118152302-e6195bd50e26 h1:Xim43kblpZXfIBQsbuBVKCudVG457BR2GZFIz3uw3hQ=
 github.com/google/pprof v0.0.0-20221118152302-e6195bd50e26/go.mod h1:dDKJzRmX4S37WGHujM7tX//fmj1uioxKzKxz3lo4HJo=
+github.com/google/pprof v0.0.0-20250317173921-a4b03ec1a45e h1:ijClszYn+mADRFY17kjQEVQ1XRhq2/JR1M3sGqeJoxs=
 github.com/google/uuid v1.1.2/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=
 github.com/google/uuid v1.6.0 h1:NIvaJDMOsjHA8n1jAhLSgzrAzy1Hgr+hNrb57e+94F0=
 github.com/google/uuid v1.6.0/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=
@@ -193,6 +194,8 @@ github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd/go.mod h1:6dJ
 github.com/modern-go/reflect2 v0.0.0-20180701023420-4b7aa43c6742/go.mod h1:bx2lNnkwVCuqBIxFjflWJWanXIb3RllmbCylyMrvgv0=
 github.com/modern-go/reflect2 v1.0.2 h1:xBagoLtFs94CBntxluKeaWgTMpvLxC4ur3nMaC9Gz0M=
 github.com/modern-go/reflect2 v1.0.2/go.mod h1:yWuevngMOJpCy52FWWMvUC8ws7m/LJsjYzDa0/r8luk=
+github.com/ncruces/go-strftime v0.1.9 h1:bY0MQC28UADQmHmaF5dgpLmImcShSi2kHU9XLdhx/f4=
+github.com/ncruces/go-strftime v0.1.9/go.mod h1:Fwc5htZGVVkseilnfgOVb9mKy6w1naJmn9CehxcKcls=
 github.com/nxadm/tail v1.4.8 h1:nPr65rt6Y5JFSKQO7qToXr7pePgD6Gwiw05lkbyAQTE=
 github.com/nxadm/tail v1.4.8/go.mod h1:+ncqLTQzXmGhMZNUePPaPqPvBxHAIsmXswZKocGu+AU=
 github.com/onsi/ginkgo v1.16.5 h1:8xi0RTUf59SOSfEtZMvwTvXYMzG4gV23XVHOZiXNtnE=
@@ -219,6 +222,8 @@ github.com/rogpeppe/go-internal v1.8.0 h1:FCbCCtXNOY3UtUuHUYaghJg4y7Fd14rXifAYUA
 github.com/rogpeppe/go-internal v1.8.0/go.mod h1:WmiCO8CzOY8rg0OYDC4/i/2WRWAB6poM+XZ2dLUbcbE=
 github.com/samber/lo v1.39.0 h1:4gTz1wUhNYLhFSKl6O+8peW0v2F4BCY034GRpU9WnuA=
 github.com/samber/lo v1.39.0/go.mod h1:+m/ZKRl6ClXCE2Lgf3MsQlWfh4bn1bz6CXEOxnEXnEA=
+github.com/samber/lo v1.52.0 h1:Rvi+3BFHES3A8meP33VPAxiBZX/Aws5RxrschYGjomw=
+github.com/samber/lo v1.52.0/go.mod h1:4+MXEGsJzbKGaUEQFKBq2xtfuznW9oz/WrgyzMzRoM0=
 github.com/shirou/gopsutil v3.21.11+incompatible h1:+1+c1VGhc88SSonWP6foOcLhvnKlUeu/erjjvaPEYiI=
 github.com/shirou/gopsutil v3.21.11+incompatible/go.mod h1:5b4v6he4MtMOwMlS0TUMTu2PcXUg8+E1lC7eC3UO/RA=
 github.com/shopspring/decimal v1.4.0 h1:bxl37RwXBklmTi0C79JfXCEBD1cqqHt0bbgBAGFp81k=
@@ -285,6 +290,8 @@ golang.org/x/crypto v0.45.0 h1:jMBrvKuj23MTlT0bQEOBcAE0mjg8mK9RXFhRH6nyF3Q=
 golang.org/x/crypto v0.45.0/go.mod h1:XTGrrkGJve7CYK7J8PEww4aY7gM3qMCElcJQ8n8JdX4=
 golang.org/x/exp v0.0.0-20240404231335-c0f41cb1a7a0 h1:985EYyeCOxTpcgOTJpflJUwOeEz0CQOdPt73OzpE9F8=
 golang.org/x/exp v0.0.0-20240404231335-c0f41cb1a7a0/go.mod h1:/lliqkxwWAhPjf5oSOIJup2XcqJaw8RGS6k3TGEc7GI=
+golang.org/x/exp v0.0.0-20250620022241-b7579e27df2b h1:M2rDM6z3Fhozi9O7NWsxAkg/yqS/lQJ6PmkyIV3YP+o=
+golang.org/x/exp v0.0.0-20250620022241-b7579e27df2b/go.mod h1:3//PLf8L/X+8b4vuAfHzxeRUl04Adcb341+IGKfnqS8=
 golang.org/x/image v0.23.0 h1:HseQ7c2OpPKTPVzNjG5fwJsOTCiiwS4QdsYi5XU6H68=
 golang.org/x/image v0.23.0/go.mod h1:wJJBTdLfCCf3tiHa1fNxpZmUI4mmoZvwMCPP0ddoNKY=
 golang.org/x/net v0.0.0-20210226172049-e18ecbb05110/go.mod h1:m0MpNAwzfU5UDzcl9v0D8zg8gWTRqZa9RBIspLL5mdg=
@@ -345,9 +352,17 @@ gorm.io/gorm v1.25.2 h1:gs1o6Vsa+oVKG/a9ElL3XgyGfghFfkKA2SInQaCyMho=
 gorm.io/gorm v1.25.2/go.mod h1:L4uxeKpfBml98NYqVqwAdmV1a2nBtAec/cf3fpucW/k=
 modernc.org/libc v1.22.5 h1:91BNch/e5B0uPbJFgqbxXuOnxBQjlS//icfQEGmvyjE=
 modernc.org/libc v1.22.5/go.mod h1:jj+Z7dTNX8fBScMVNRAYZ/jF91K8fdT2hYMThc3YjBY=
+modernc.org/libc v1.66.10 h1:yZkb3YeLx4oynyR+iUsXsybsX4Ubx7MQlSYEw4yj59A=
+modernc.org/libc v1.66.10/go.mod h1:8vGSEwvoUoltr4dlywvHqjtAqHBaw0j1jI7iFBTAr2I=
 modernc.org/mathutil v1.5.0 h1:rV0Ko/6SfM+8G+yKiyI830l3Wuz1zRutdslNoQ0kfiQ=
 modernc.org/mathutil v1.5.0/go.mod h1:mZW8CKdRPY1v87qxC/wUdX5O1qDzXMP5TH3wjfpga6E=
+modernc.org/mathutil v1.7.1 h1:GCZVGXdaN8gTqB1Mf/usp1Y/hSqgI2vAGGP4jZMCxOU=
+modernc.org/mathutil v1.7.1/go.mod h1:4p5IwJITfppl0G4sUEDtCr4DthTaT47/N3aT6MhfgJg=
 modernc.org/memory v1.5.0 h1:N+/8c5rE6EqugZwHii4IFsaJ7MUhoWX07J5tC/iI5Ds=
 modernc.org/memory v1.5.0/go.mod h1:PkUhL0Mugw21sHPeskwZW4D6VscE/GQJOnIpCnW6pSU=
+modernc.org/memory v1.11.0 h1:o4QC8aMQzmcwCK3t3Ux/ZHmwFPzE6hf2Y5LbkRs+hbI=
+modernc.org/memory v1.11.0/go.mod h1:/JP4VbVC+K5sU2wZi9bHoq2MAkCnrt2r98UGeSK7Mjw=
 modernc.org/sqlite v1.23.1 h1:nrSBg4aRQQwq59JpvGEQ15tNxoO5pX/kUjcRNwSAGQM=
 modernc.org/sqlite v1.23.1/go.mod h1:OrDj17Mggn6MhE+iPbBNf7RGKODDE9NFT0f3EwDzJqk=
+modernc.org/sqlite v1.40.1 h1:VfuXcxcUWWKRBuP8+BR9L7VnmusMgBNNnBYGEe9w/iY=
+modernc.org/sqlite v1.40.1/go.mod h1:9fjQZ0mB1LLP0GYrp39oOJXx/I2sxEnZtzCmEQIKvGE=
--- a/middleware/distributor.go
+++ b/middleware/distributor.go
@@ -181,6 +181,10 @@ func getModelRequest(c *gin.Context) (*ModelRequest, bool, error) {
 		}
 		c.Set("platform", string(constant.TaskPlatformSuno))
 		c.Set("relay_mode", relayMode)
+	} else if strings.Contains(c.Request.URL.Path, "/v1/videos/") && strings.HasSuffix(c.Request.URL.Path, "/remix") {
+		relayMode := relayconstant.RelayModeVideoSubmit
+		c.Set("relay_mode", relayMode)
+		shouldSelectChannel = false
 	} else if strings.Contains(c.Request.URL.Path, "/v1/videos") {
 		//curl https://api.openai.com/v1/videos \
 		//  -H "Authorization: Bearer $OPENAI_API_KEY" \
--- a/relay/channel/adapter.go
+++ b/relay/channel/adapter.go
@@ -47,7 +47,7 @@ type TaskAdaptor interface {
 	GetChannelName() string

 	// FetchTask
-	FetchTask(baseUrl, key string, body map[string]any) (*http.Response, error)
+	FetchTask(baseUrl, key string, body map[string]any, proxy string) (*http.Response, error)

 	ParseTaskResult(respBody []byte) (*relaycommon.TaskInfo, error)
 }
--- a/relay/channel/aws/constants.go
+++ b/relay/channel/aws/constants.go
@@ -18,6 +18,7 @@ var awsModelIDMap = map[string]string{
 	"claude-opus-4-1-20250805":   "anthropic.claude-opus-4-1-20250805-v1:0",
 	"claude-sonnet-4-5-20250929": "anthropic.claude-sonnet-4-5-20250929-v1:0",
 	"claude-haiku-4-5-20251001":  "anthropic.claude-haiku-4-5-20251001-v1:0",
+	"claude-opus-4-5-20251101":  "anthropic.claude-opus-4-5-20251101-v1:0",
 	// Nova models
 	"nova-micro-v1:0":   "amazon.nova-micro-v1:0",
 	"nova-lite-v1:0":    "amazon.nova-lite-v1:0",
@@ -76,6 +77,11 @@ var awsModelCanCrossRegionMap = map[string]map[string]bool{
 		"ap": true,
 		"eu": true,
 	},
+	"anthropic.claude-opus-4-5-20251101-v1:0": {
+		"us": true,
+		"ap": true,
+		"eu": true,
+	},
 	"anthropic.claude-haiku-4-5-20251001-v1:0": {
 		"us": true,
 		"ap": true,
--- a/relay/channel/aws/relay-aws.go
+++ b/relay/channel/aws/relay-aws.go
@@ -25,6 +25,17 @@ import (
 	"github.com/aws/smithy-go/auth/bearer"
 )

+// getAwsErrorStatusCode extracts HTTP status code from AWS SDK error
+func getAwsErrorStatusCode(err error) int {
+	// Check for HTTP response error which contains status code
+	var httpErr interface{ HTTPStatusCode() int }
+	if errors.As(err, &httpErr) {
+		return httpErr.HTTPStatusCode()
+	}
+	// Default to 500 if we can't determine the status code
+	return http.StatusInternalServerError
+}
+
 func newAwsClient(c *gin.Context, info *relaycommon.RelayInfo) (*bedrockruntime.Client, error) {
 	var (
 		httpClient *http.Client
@@ -173,7 +184,8 @@ func awsHandler(c *gin.Context, info *relaycommon.RelayInfo, a *Adaptor) (*types

 	awsResp, err := a.AwsClient.InvokeModel(c.Request.Context(), a.AwsReq.(*bedrockruntime.InvokeModelInput))
 	if err != nil {
-		return types.NewOpenAIError(errors.Wrap(err, "InvokeModel"), types.ErrorCodeAwsInvokeError, http.StatusInternalServerError), nil
+		statusCode := getAwsErrorStatusCode(err)
+		return types.NewOpenAIError(errors.Wrap(err, "InvokeModel"), types.ErrorCodeAwsInvokeError, statusCode), nil
 	}

 	claudeInfo := &claude.ClaudeResponseInfo{
@@ -199,7 +211,8 @@ func awsHandler(c *gin.Context, info *relaycommon.RelayInfo, a *Adaptor) (*types
 func awsStreamHandler(c *gin.Context, info *relaycommon.RelayInfo, a *Adaptor) (*types.NewAPIError, *dto.Usage) {
 	awsResp, err := a.AwsClient.InvokeModelWithResponseStream(c.Request.Context(), a.AwsReq.(*bedrockruntime.InvokeModelWithResponseStreamInput))
 	if err != nil {
-		return types.NewOpenAIError(errors.Wrap(err, "InvokeModelWithResponseStream"), types.ErrorCodeAwsInvokeError, http.StatusInternalServerError), nil
+		statusCode := getAwsErrorStatusCode(err)
+		return types.NewOpenAIError(errors.Wrap(err, "InvokeModelWithResponseStream"), types.ErrorCodeAwsInvokeError, statusCode), nil
 	}
 	stream := awsResp.GetStream()
 	defer stream.Close()
@@ -238,7 +251,8 @@ func handleNovaRequest(c *gin.Context, info *relaycommon.RelayInfo, a *Adaptor)

 	awsResp, err := a.AwsClient.InvokeModel(c.Request.Context(), a.AwsReq.(*bedrockruntime.InvokeModelInput))
 	if err != nil {
-		return types.NewError(errors.Wrap(err, "InvokeModel"), types.ErrorCodeChannelAwsClientError), nil
+		statusCode := getAwsErrorStatusCode(err)
+		return types.NewOpenAIError(errors.Wrap(err, "InvokeModel"), types.ErrorCodeAwsInvokeError, statusCode), nil
 	}

 	// 解析Nova响应
--- a/relay/channel/claude/constants.go
+++ b/relay/channel/claude/constants.go
@@ -9,6 +9,7 @@ var ModelList = []string{
 	"claude-3-opus-20240229",
 	"claude-3-haiku-20240307",
 	"claude-3-5-haiku-20241022",
+	"claude-haiku-4-5-20251001",
 	"claude-3-5-sonnet-20240620",
 	"claude-3-5-sonnet-20241022",
 	"claude-3-7-sonnet-20250219",
@@ -21,6 +22,8 @@ var ModelList = []string{
 	"claude-opus-4-1-20250805-thinking",
 	"claude-sonnet-4-5-20250929",
 	"claude-sonnet-4-5-20250929-thinking",
+	"claude-opus-4-5-20251101",
+	"claude-opus-4-5-20251101-thinking",
 }

 var ChannelName = "claude"
--- a/relay/channel/claude/relay-claude.go
+++ b/relay/channel/claude/relay-claude.go
@@ -673,7 +673,7 @@ func HandleStreamResponseData(c *gin.Context, info *relaycommon.RelayInfo, claud
 func HandleStreamFinalResponse(c *gin.Context, info *relaycommon.RelayInfo, claudeInfo *ClaudeResponseInfo, requestMode int) {

 	if requestMode == RequestModeCompletion {
-		claudeInfo.Usage = service.ResponseText2Usage(c, claudeInfo.ResponseText.String(), info.UpstreamModelName, info.PromptTokens)
+		claudeInfo.Usage = service.ResponseText2Usage(c, claudeInfo.ResponseText.String(), info.UpstreamModelName, info.GetEstimatePromptTokens())
 	} else {
 		if claudeInfo.Usage.PromptTokens == 0 {
 			//上游出错
@@ -734,10 +734,7 @@ func HandleClaudeResponseData(c *gin.Context, info *relaycommon.RelayInfo, claud
 		return types.WithClaudeError(*claudeError, http.StatusInternalServerError)
 	}
 	if requestMode == RequestModeCompletion {
-		completionTokens := service.CountTextToken(claudeResponse.Completion, info.OriginModelName)
-		claudeInfo.Usage.PromptTokens = info.PromptTokens
-		claudeInfo.Usage.CompletionTokens = completionTokens
-		claudeInfo.Usage.TotalTokens = info.PromptTokens + completionTokens
+		claudeInfo.Usage = service.ResponseText2Usage(c, claudeResponse.Completion, info.UpstreamModelName, info.GetEstimatePromptTokens())
 	} else {
 		claudeInfo.Usage.PromptTokens = claudeResponse.Usage.InputTokens
 		claudeInfo.Usage.CompletionTokens = claudeResponse.Usage.OutputTokens
--- a/relay/channel/cloudflare/relay_cloudflare.go
+++ b/relay/channel/cloudflare/relay_cloudflare.go
@@ -74,7 +74,7 @@ func cfStreamHandler(c *gin.Context, info *relaycommon.RelayInfo, resp *http.Res
 	if err := scanner.Err(); err != nil {
 		logger.LogError(c, "error_scanning_stream_response: "+err.Error())
 	}
-	usage := service.ResponseText2Usage(c, responseText, info.UpstreamModelName, info.PromptTokens)
+	usage := service.ResponseText2Usage(c, responseText, info.UpstreamModelName, info.GetEstimatePromptTokens())
 	if info.ShouldIncludeUsage {
 		response := helper.GenerateFinalUsageResponse(id, info.StartTime.Unix(), info.UpstreamModelName, *usage)
 		err := helper.ObjectData(c, response)
@@ -105,7 +105,7 @@ func cfHandler(c *gin.Context, info *relaycommon.RelayInfo, resp *http.Response)
 	for _, choice := range response.Choices {
 		responseText += choice.Message.StringContent()
 	}
-	usage := service.ResponseText2Usage(c, responseText, info.UpstreamModelName, info.PromptTokens)
+	usage := service.ResponseText2Usage(c, responseText, info.UpstreamModelName, info.GetEstimatePromptTokens())
 	response.Usage = *usage
 	response.Id = helper.GetResponseID(c)
 	jsonResponse, err := json.Marshal(response)
@@ -142,10 +142,6 @@ func cfSTTHandler(c *gin.Context, info *relaycommon.RelayInfo, resp *http.Respon
 	c.Writer.WriteHeader(resp.StatusCode)
 	_, _ = c.Writer.Write(jsonResponse)

-	usage := &dto.Usage{}
-	usage.PromptTokens = info.PromptTokens
-	usage.CompletionTokens = service.CountTextToken(cfResp.Result.Text, info.UpstreamModelName)
-	usage.TotalTokens = usage.PromptTokens + usage.CompletionTokens
-
+	usage := service.ResponseText2Usage(c, cfResp.Result.Text, info.UpstreamModelName, info.GetEstimatePromptTokens())
 	return nil, usage
 }
--- a/relay/channel/cohere/relay-cohere.go
+++ b/relay/channel/cohere/relay-cohere.go
@@ -165,7 +165,7 @@ func cohereStreamHandler(c *gin.Context, info *relaycommon.RelayInfo, resp *http
 		}
 	})
 	if usage.PromptTokens == 0 {
-		usage = service.ResponseText2Usage(c, responseText, info.UpstreamModelName, info.PromptTokens)
+		usage = service.ResponseText2Usage(c, responseText, info.UpstreamModelName, info.GetEstimatePromptTokens())
 	}
 	return usage, nil
 }
@@ -225,9 +225,9 @@ func cohereRerankHandler(c *gin.Context, resp *http.Response, info *relaycommon.
 	}
 	usage := dto.Usage{}
 	if cohereResp.Meta.BilledUnits.InputTokens == 0 {
-		usage.PromptTokens = info.PromptTokens
+		usage.PromptTokens = info.GetEstimatePromptTokens()
 		usage.CompletionTokens = 0
-		usage.TotalTokens = info.PromptTokens
+		usage.TotalTokens = info.GetEstimatePromptTokens()
 	} else {
 		usage.PromptTokens = cohereResp.Meta.BilledUnits.InputTokens
 		usage.CompletionTokens = cohereResp.Meta.BilledUnits.OutputTokens
--- a/relay/channel/dify/relay-dify.go
+++ b/relay/channel/dify/relay-dify.go
@@ -246,7 +246,7 @@ func difyStreamHandler(c *gin.Context, info *relaycommon.RelayInfo, resp *http.R
 	})
 	helper.Done(c)
 	if usage.TotalTokens == 0 {
-		usage = service.ResponseText2Usage(c, responseText, info.UpstreamModelName, info.PromptTokens)
+		usage = service.ResponseText2Usage(c, responseText, info.UpstreamModelName, info.GetEstimatePromptTokens())
 	}
 	usage.CompletionTokens += nodeToken
 	return usage, nil
--- a/relay/channel/gemini/adaptor.go
+++ b/relay/channel/gemini/adaptor.go
@@ -137,6 +137,8 @@ func (a *Adaptor) GetRequestURL(info *relaycommon.RelayInfo) (string, error) {
 			info.UpstreamModelName = strings.TrimSuffix(info.UpstreamModelName, "-thinking")
 		} else if strings.HasSuffix(info.UpstreamModelName, "-nothinking") {
 			info.UpstreamModelName = strings.TrimSuffix(info.UpstreamModelName, "-nothinking")
+		} else if baseModel, level := parseThinkingLevelSuffix(info.UpstreamModelName); level != "" {
+			info.UpstreamModelName = baseModel
 		}
 	}

--- a/relay/channel/gemini/relay-gemini-native.go
+++ b/relay/channel/gemini/relay-gemini-native.go
@@ -5,7 +5,6 @@ import (
 	"net/http"

 	"github.com/QuantumNous/new-api/common"
-	"github.com/QuantumNous/new-api/constant"
 	"github.com/QuantumNous/new-api/dto"
 	"github.com/QuantumNous/new-api/logger"
 	relaycommon "github.com/QuantumNous/new-api/relay/common"
@@ -70,12 +69,7 @@ func NativeGeminiEmbeddingHandler(c *gin.Context, resp *http.Response, info *rel
 		println(string(responseBody))
 	}

-	usage := &dto.Usage{
-		PromptTokens: info.PromptTokens,
-		TotalTokens:  info.PromptTokens,
-	}
-
-	common.SetContextKey(c, constant.ContextKeyLocalCountTokens, true)
+	usage := service.ResponseText2Usage(c, "", info.UpstreamModelName, info.GetEstimatePromptTokens())

 	if info.IsGeminiBatchEmbedding {
 		var geminiResponse dto.GeminiBatchEmbeddingResponse
--- a/relay/channel/gemini/relay-gemini.go
+++ b/relay/channel/gemini/relay-gemini.go
@@ -19,8 +19,8 @@ import (
 	"github.com/QuantumNous/new-api/relay/helper"
 	"github.com/QuantumNous/new-api/service"
 	"github.com/QuantumNous/new-api/setting/model_setting"
+	"github.com/QuantumNous/new-api/setting/reasoning"
 	"github.com/QuantumNous/new-api/types"
-
 	"github.com/gin-gonic/gin"
 )

@@ -122,6 +122,14 @@ func clampThinkingBudgetByEffort(modelName string, effort string) int {
 	return clampThinkingBudget(modelName, maxBudget)
 }

+func parseThinkingLevelSuffix(modelName string) (string, string) {
+	base, level, ok := reasoning.TrimEffortSuffix(modelName)
+	if !ok {
+		return modelName, ""
+	}
+	return base, level
+}
+
 func ThinkingAdaptor(geminiRequest *dto.GeminiChatRequest, info *relaycommon.RelayInfo, oaiRequest ...dto.GeneralOpenAIRequest) {
 	if model_setting.GetGeminiSettings().ThinkingAdapterEnabled {
 		modelName := info.UpstreamModelName
@@ -178,6 +186,12 @@ func ThinkingAdaptor(geminiRequest *dto.GeminiChatRequest, info *relaycommon.Rel
 					ThinkingBudget: common.GetPointer(0),
 				}
 			}
+		} else if _, level := parseThinkingLevelSuffix(modelName); level != "" {
+			geminiRequest.GenerationConfig.ThinkingConfig = &dto.GeminiThinkingConfig{
+				IncludeThoughts: true,
+				ThinkingLevel:   level,
+			}
+			info.ReasoningEffort = level
 		}
 	}
 }
@@ -208,6 +222,7 @@ func CovertOpenAI2Gemini(c *gin.Context, textRequest dto.GeneralOpenAIRequest, i

 	adaptorWithExtraBody := false

+	// patch extra_body
 	if len(textRequest.ExtraBody) > 0 {
 		if !strings.HasSuffix(info.UpstreamModelName, "-nothinking") {
 			var extraBody map[string]interface{}
@@ -239,6 +254,39 @@ func CovertOpenAI2Gemini(c *gin.Context, textRequest dto.GeneralOpenAIRequest, i
 						}
 					}
 				}
+
+				// check error param name like imageConfig, should be image_config
+				if _, hasErrorParam := googleBody["imageConfig"]; hasErrorParam {
+					return nil, errors.New("extra_body.google.imageConfig is not supported, use extra_body.google.image_config instead")
+				}
+
+				if imageConfig, ok := googleBody["image_config"].(map[string]interface{}); ok {
+					// check error param name like aspectRatio, should be aspect_ratio
+					if _, hasErrorParam := imageConfig["aspectRatio"]; hasErrorParam {
+						return nil, errors.New("extra_body.google.image_config.aspectRatio is not supported, use extra_body.google.image_config.aspect_ratio instead")
+					}
+					// check error param name like imageSize, should be image_size
+					if _, hasErrorParam := imageConfig["imageSize"]; hasErrorParam {
+						return nil, errors.New("extra_body.google.image_config.imageSize is not supported, use extra_body.google.image_config.image_size instead")
+					}
+
+					// convert snake_case to camelCase for Gemini API
+					geminiImageConfig := make(map[string]interface{})
+					if aspectRatio, ok := imageConfig["aspect_ratio"]; ok {
+						geminiImageConfig["aspectRatio"] = aspectRatio
+					}
+					if imageSize, ok := imageConfig["image_size"]; ok {
+						geminiImageConfig["imageSize"] = imageSize
+					}
+
+					if len(geminiImageConfig) > 0 {
+						imageConfigBytes, err := common.Marshal(geminiImageConfig)
+						if err != nil {
+							return nil, fmt.Errorf("failed to marshal image_config: %w", err)
+						}
+						geminiRequest.GenerationConfig.ImageConfig = imageConfigBytes
+					}
+				}
 			}
 		}
 	}
@@ -412,9 +460,68 @@ func CovertOpenAI2Gemini(c *gin.Context, textRequest dto.GeneralOpenAIRequest, i
 				if part.Text == "" {
 					continue
 				}
-				parts = append(parts, dto.GeminiPart{
-					Text: part.Text,
-				})
+				// check markdown image ![image](data:image/jpeg;base64,xxxxxxxxxxxx)
+				// 使用字符串查找而非正则，避免大文本性能问题
+				text := part.Text
+				hasMarkdownImage := false
+				for {
+					// 快速检查是否包含 markdown 图片标记
+					startIdx := strings.Index(text, "![")
+					if startIdx == -1 {
+						break
+					}
+					// 找到 ](
+					bracketIdx := strings.Index(text[startIdx:], "](data:")
+					if bracketIdx == -1 {
+						break
+					}
+					bracketIdx += startIdx
+					// 找到闭合的 )
+					closeIdx := strings.Index(text[bracketIdx+2:], ")")
+					if closeIdx == -1 {
+						break
+					}
+					closeIdx += bracketIdx + 2
+
+					hasMarkdownImage = true
+					// 添加图片前的文本
+					if startIdx > 0 {
+						textBefore := text[:startIdx]
+						if textBefore != "" {
+							parts = append(parts, dto.GeminiPart{
+								Text: textBefore,
+							})
+						}
+					}
+					// 提取 data URL (从 "](" 后面开始，到 ")" 之前)
+					dataUrl := text[bracketIdx+2 : closeIdx]
+					imageNum += 1
+					if constant.GeminiVisionMaxImageNum != -1 && imageNum > constant.GeminiVisionMaxImageNum {
+						return nil, fmt.Errorf("too many images in the message, max allowed is %d", constant.GeminiVisionMaxImageNum)
+					}
+					format, base64String, err := service.DecodeBase64FileData(dataUrl)
+					if err != nil {
+						return nil, fmt.Errorf("decode markdown base64 image data failed: %s", err.Error())
+					}
+					imgPart := dto.GeminiPart{
+						InlineData: &dto.GeminiInlineData{
+							MimeType: format,
+							Data:     base64String,
+						},
+					}
+					if shouldAttachThoughtSignature {
+						imgPart.ThoughtSignature = json.RawMessage(strconv.Quote(thoughtSignatureBypassValue))
+					}
+					parts = append(parts, imgPart)
+					// 继续处理剩余文本
+					text = text[closeIdx+1:]
+				}
+				// 添加剩余文本或原始文本（如果没有找到 markdown 图片）
+				if !hasMarkdownImage {
+					parts = append(parts, dto.GeminiPart{
+						Text: part.Text,
+					})
+				}
 			} else if part.Type == dto.ContentTypeImageURL {
 				imageNum += 1

@@ -484,6 +591,17 @@ func CovertOpenAI2Gemini(c *gin.Context, textRequest dto.GeneralOpenAIRequest, i
 			}
 		}

+		// 如果需要附加签名但还没有附加（没有 tool_calls 或 tool_calls 为空），
+		// 则在第一个文本 part 上附加 thoughtSignature
+		if shouldAttachThoughtSignature && !signatureAttached && len(parts) > 0 {
+			for i := range parts {
+				if parts[i].Text != "" {
+					parts[i].ThoughtSignature = json.RawMessage(strconv.Quote(thoughtSignatureBypassValue))
+					break
+				}
+			}
+		}
+
 		content.Parts = parts

 		// there's no assistant role in gemini and API shall vomit if Role is not user or model
@@ -1011,7 +1129,7 @@ func geminiStreamHandler(c *gin.Context, info *relaycommon.RelayInfo, resp *http
 	if usage.CompletionTokens <= 0 {
 		str := responseText.String()
 		if len(str) > 0 {
-			usage = service.ResponseText2Usage(c, responseText.String(), info.UpstreamModelName, info.PromptTokens)
+			usage = service.ResponseText2Usage(c, responseText.String(), info.UpstreamModelName, info.GetEstimatePromptTokens())
 		} else {
 			usage = &dto.Usage{}
 		}
@@ -1184,11 +1302,7 @@ func GeminiEmbeddingHandler(c *gin.Context, info *relaycommon.RelayInfo, resp *h
 	// Google has not yet clarified how embedding models will be billed
 	// refer to openai billing method to use input tokens billing
 	// https://platform.openai.com/docs/guides/embeddings#what-are-embeddings
-	usage := &dto.Usage{
-		PromptTokens:     info.PromptTokens,
-		CompletionTokens: 0,
-		TotalTokens:      info.PromptTokens,
-	}
+	usage := service.ResponseText2Usage(c, "", info.UpstreamModelName, info.GetEstimatePromptTokens())
 	openAIResponse.Usage = *usage

 	jsonResponse, jsonErr := common.Marshal(openAIResponse)
--- a/relay/channel/minimax/tts.go
+++ b/relay/channel/minimax/tts.go
@@ -163,7 +163,7 @@ func handleTTSResponse(c *gin.Context, resp *http.Response, info *relaycommon.Re
 	}

 	usage = &dto.Usage{
-		PromptTokens:     info.PromptTokens,
+		PromptTokens:     info.GetEstimatePromptTokens(),
 		CompletionTokens: 0,
 		TotalTokens:      int(minimaxResp.ExtraInfo.UsageCharacters),
 	}
--- a/relay/channel/moonshot/adaptor.go
+++ b/relay/channel/moonshot/adaptor.go
@@ -6,6 +6,7 @@ import (
 	"io"
 	"net/http"

+	channelconstant "github.com/QuantumNous/new-api/constant"
 	"github.com/QuantumNous/new-api/dto"
 	"github.com/QuantumNous/new-api/relay/channel"
 	"github.com/QuantumNous/new-api/relay/channel/claude"
@@ -44,6 +45,16 @@ func (a *Adaptor) Init(info *relaycommon.RelayInfo) {
 }

 func (a *Adaptor) GetRequestURL(info *relaycommon.RelayInfo) (string, error) {
+	baseURL := info.ChannelBaseUrl
+	if specialPlan, ok := channelconstant.ChannelSpecialBases[baseURL]; ok {
+		if info.RelayFormat == types.RelayFormatClaude {
+			return fmt.Sprintf("%s/v1/messages", specialPlan.ClaudeBaseURL), nil
+		}
+		if info.RelayFormat == types.RelayFormatOpenAI {
+			return fmt.Sprintf("%s/chat/completions", specialPlan.OpenAIBaseURL), nil
+		}
+	}
+
 	switch info.RelayFormat {
 	case types.RelayFormatClaude:
 		return fmt.Sprintf("%s/anthropic/v1/messages", info.ChannelBaseUrl), nil
--- a/relay/channel/openai/adaptor.go
+++ b/relay/channel/openai/adaptor.go
@@ -306,10 +306,11 @@ func (a *Adaptor) ConvertOpenAIRequest(c *gin.Context, info *relaycommon.RelayIn
 			request.Temperature = nil
 		}

+		// gpt-5系列模型适配 归零不再支持的参数
 		if strings.HasPrefix(info.UpstreamModelName, "gpt-5") {
-			if info.UpstreamModelName != "gpt-5-chat-latest" {
-				request.Temperature = nil
-			}
+			request.Temperature = nil
+			request.TopP = 0 // oai 的 top_p 默认值是 1.0，但是为了 omitempty 属性直接不传，这里显式设置为 0
+			request.LogProbs = false
 		}

 		// 转换模型推理力度后缀
--- a/relay/channel/openai/helper.go
+++ b/relay/channel/openai/helper.go
@@ -5,7 +5,6 @@ import (
 	"strings"

 	"github.com/QuantumNous/new-api/common"
-	"github.com/QuantumNous/new-api/constant"
 	"github.com/QuantumNous/new-api/dto"
 	"github.com/QuantumNous/new-api/logger"
 	relaycommon "github.com/QuantumNous/new-api/relay/common"
@@ -19,26 +18,10 @@ import (
 	"github.com/gin-gonic/gin"
 )

-// HandleStreamFormat processes a streaming response payload according to the provided RelayInfo and forwards it to the appropriate format-specific handler.
-//
-// It increments info.SendResponseCount, optionally converts OpenRouter "reasoning" fields to "reasoning_content" when the channel is OpenRouter and OpenRouterConvertToOpenAI is enabled, and then dispatches the (possibly modified) JSON string to the handler for the configured RelayFormat (OpenAI, Claude, or Gemini). It returns any error produced by the selected handler or nil if no handler is invoked.
+// 辅助函数
 func HandleStreamFormat(c *gin.Context, info *relaycommon.RelayInfo, data string, forceFormat bool, thinkToContent bool) error {
 	info.SendResponseCount++

-	// OpenRouter reasoning 字段转换：reasoning -> reasoning_content
-	// 仅当启用转换为OpenAI兼容格式时执行
-	if info.ChannelType == constant.ChannelTypeOpenRouter && info.ChannelOtherSettings.OpenRouterConvertToOpenAI {
-		var streamResponse dto.ChatCompletionsStreamResponse
-		if err := common.Unmarshal(common.StringToByteSlice(data), &streamResponse); err == nil {
-			convertOpenRouterReasoningFieldsStream(&streamResponse)
-			// 重新序列化为JSON
-			newData, err := common.Marshal(streamResponse)
-			if err == nil {
-				data = string(newData)
-			}
-		}
-	}
-
 	switch info.RelayFormat {
 	case types.RelayFormatOpenAI:
 		return sendStreamData(c, info, data, forceFormat, thinkToContent)
@@ -270,26 +253,9 @@ func HandleFinalResponse(c *gin.Context, info *relaycommon.RelayInfo, lastStream
 	}
 }

-// sendResponsesStreamData sends a non-empty data chunk for the given stream response to the client.
-// If data is empty, it returns without sending anything.
 func sendResponsesStreamData(c *gin.Context, streamResponse dto.ResponsesStreamResponse, data string) {
 	if data == "" {
 		return
 	}
 	helper.ResponseChunkData(c, streamResponse, data)
 }
-
-// convertOpenRouterReasoningFieldsStream converts each choice's `Delta` in a streaming ChatCompletions response
-// by normalizing any `reasoning` fields into `reasoning_content`.
-// It applies ConvertReasoningField to every choice's Delta and is a no-op if `response` is nil or has no choices.
-func convertOpenRouterReasoningFieldsStream(response *dto.ChatCompletionsStreamResponse) {
-	if response == nil || len(response.Choices) == 0 {
-		return
-	}
-
-	// 遍历所有choices，对每个Delta使用统一的泛型函数进行转换
-	for i := range response.Choices {
-		choice := &response.Choices[i]
-		ConvertReasoningField(&choice.Delta)
-	}
-}
--- a/relay/channel/openai/reasoning_converter.go
+++ b/relay/channel/openai/reasoning_converter.go
@@ -1,35 +0,0 @@
-package openai
-
-// ReasoningHolder 定义一个通用的接口，用于操作包含reasoning字段的结构体
-type ReasoningHolder interface {
-	// 获取reasoning字段的值
-	GetReasoning() string
-	// 设置reasoning字段的值
-	SetReasoning(reasoning string)
-	// 获取reasoning_content字段的值
-	GetReasoningContent() string
-	// 设置reasoning_content字段的值
-	SetReasoningContent(reasoningContent string)
-}
-
-// ConvertReasoningField 通用的reasoning字段转换函数
-// 将reasoning字段的内容移动到reasoning_content字段
-// ConvertReasoningField moves the holder's reasoning into its reasoning content and clears the original reasoning field.
-// If GetReasoning returns an empty string, the holder is unchanged. When clearing, types that implement SetReasoningToNil()
-// will have that method invoked; otherwise SetReasoning("") is used.
-func ConvertReasoningField[T ReasoningHolder](holder T) {
-	reasoning := holder.GetReasoning()
-	if reasoning != "" {
-		holder.SetReasoningContent(reasoning)
-	}
-	
-	// 使用类型断言来智能清理reasoning字段
-	switch h := any(holder).(type) {
-	case interface{ SetReasoningToNil() }:
-		// 流式响应：指针类型，设为nil
-		h.SetReasoningToNil()
-	default:
-		// 非流式响应：值类型，设为空字符串
-		holder.SetReasoning("")
-	}
-}
--- a/relay/channel/openai/relay-openai.go
+++ b/relay/channel/openai/relay-openai.go
@@ -183,7 +183,7 @@ func OaiStreamHandler(c *gin.Context, info *relaycommon.RelayInfo, resp *http.Re
 	}

 	if !containStreamUsage {
-		usage = service.ResponseText2Usage(c, responseTextBuilder.String(), info.UpstreamModelName, info.PromptTokens)
+		usage = service.ResponseText2Usage(c, responseTextBuilder.String(), info.UpstreamModelName, info.GetEstimatePromptTokens())
 		usage.CompletionTokens += toolCount * 7
 	}

@@ -194,25 +194,6 @@ func OaiStreamHandler(c *gin.Context, info *relaycommon.RelayInfo, resp *http.Re
 	return usage, nil
 }

-// OpenaiHandler processes an upstream OpenAI-like HTTP response, normalizes or infers token usage,
-// optionally converts OpenRouter reasoning fields to OpenAI-compatible `reasoning_content`, adapts
-// the response to the configured relay format (OpenAI, Claude, or Gemini), writes the final body
-// to the client, and returns the computed usage.
-//
-// It will:
-// - Handle OpenRouter enterprise wrapper responses when the channel is OpenRouter Enterprise.
-// - Unmarshal the upstream body into an internal simple response and, when configured,
-//   convert OpenRouter `reasoning` fields into `reasoning_content`.
-// - If usage prompt tokens are missing, infer completion tokens by counting tokens in choices
-//   (falling back to per-choice text token counting) and set Prompt/Completion/Total tokens.
-// - Apply channel-specific post-processing to usage (cached token adjustments).
-// - Depending on RelayFormat and channel settings, inject updated usage into the body,
-//   reserialize the converted simple response when ForceFormat is enabled or when OpenRouter
-//   conversion was applied, or convert the response to Claude/Gemini formats.
-// - Write the final response body to the client via a graceful copy helper.
-//
-// Returns the final usage (possibly inferred or modified) or a NewAPIError describing any failure
-// encountered while reading, parsing, or transforming the upstream response.
 func OpenaiHandler(c *gin.Context, info *relaycommon.RelayInfo, resp *http.Response) (*dto.Usage, *types.NewAPIError) {
 	defer service.CloseResponseBodyGracefully(resp)

@@ -245,12 +226,6 @@ func OpenaiHandler(c *gin.Context, info *relaycommon.RelayInfo, resp *http.Respo
 		return nil, types.NewOpenAIError(err, types.ErrorCodeBadResponseBody, http.StatusInternalServerError)
 	}

-	// OpenRouter reasoning 字段转换：reasoning -> reasoning_content
-	// 仅当启用转换为OpenAI兼容格式时执行（修改现有无条件转换）
-	if info.ChannelType == constant.ChannelTypeOpenRouter && info.ChannelOtherSettings.OpenRouterConvertToOpenAI {
-		convertOpenRouterReasoningFields(&simpleResponse)
-	}
-
 	if oaiError := simpleResponse.GetOpenAIError(); oaiError != nil && oaiError.Type != "" {
 		return nil, types.WithOpenAIError(*oaiError, resp.StatusCode)
 	}
@@ -270,9 +245,9 @@ func OpenaiHandler(c *gin.Context, info *relaycommon.RelayInfo, resp *http.Respo
 			}
 		}
 		simpleResponse.Usage = dto.Usage{
-			PromptTokens:     info.PromptTokens,
+			PromptTokens:     info.GetEstimatePromptTokens(),
 			CompletionTokens: completionTokens,
-			TotalTokens:      info.PromptTokens + completionTokens,
+			TotalTokens:      info.GetEstimatePromptTokens() + completionTokens,
 		}
 		usageModified = true
 	}
@@ -296,13 +271,6 @@ func OpenaiHandler(c *gin.Context, info *relaycommon.RelayInfo, resp *http.Respo
 				return nil, types.NewError(err, types.ErrorCodeBadResponseBody)
 			}
 		} else {
-			// 对于 OpenRouter，仅在执行转换后重新序列化
-			if info.ChannelType == constant.ChannelTypeOpenRouter && info.ChannelOtherSettings.OpenRouterConvertToOpenAI {
-				responseBody, err = common.Marshal(simpleResponse)
-				if err != nil {
-					return nil, types.NewError(err, types.ErrorCodeBadResponseBody)
-				}
-			}
 			break
 		}
 	case types.RelayFormatClaude:
@@ -368,8 +336,8 @@ func OpenaiTTSHandler(c *gin.Context, resp *http.Response, info *relaycommon.Rel
 	// and can be terminated directly.
 	defer service.CloseResponseBodyGracefully(resp)
 	usage := &dto.Usage{}
-	usage.PromptTokens = info.PromptTokens
-	usage.TotalTokens = info.PromptTokens
+	usage.PromptTokens = info.GetEstimatePromptTokens()
+	usage.TotalTokens = info.GetEstimatePromptTokens()
 	for k, v := range resp.Header {
 		c.Writer.Header().Set(k, v[0])
 	}
@@ -415,7 +383,7 @@ func OpenaiSTTHandler(c *gin.Context, resp *http.Response, info *relaycommon.Rel
 	}

 	usage := &dto.Usage{}
-	usage.PromptTokens = info.PromptTokens
+	usage.PromptTokens = info.GetEstimatePromptTokens()
 	usage.CompletionTokens = 0
 	usage.TotalTokens = usage.PromptTokens + usage.CompletionTokens
 	return nil, usage
@@ -704,10 +672,6 @@ func applyUsagePostProcessing(info *relaycommon.RelayInfo, usage *dto.Usage, res
 	}
 }

-// extractCachedTokensFromBody extracts a cached token count from a JSON response body.
-// It looks for cached token values in the following fields (in order): `usage.prompt_tokens_details.cached_tokens`,
-// `usage.cached_tokens`, and `usage.prompt_cache_hit_tokens`. It returns the first found value and `true`;
-// if none are present or the body cannot be parsed, it returns 0 and `false`.
 func extractCachedTokensFromBody(body []byte) (int, bool) {
 	if len(body) == 0 {
 		return 0, false
@@ -738,18 +702,3 @@ func extractCachedTokensFromBody(body []byte) (int, bool) {
 	}
 	return 0, false
 }
-
-// convertOpenRouterReasoningFields 转换OpenRouter响应中的reasoning字段为reasoning_content
-// convertOpenRouterReasoningFields converts OpenRouter-style `reasoning` fields into `reasoning_content` for every choice's message in the provided OpenAITextResponse.
-// It modifies the response in place and is a no-op if `response` is nil or contains no choices.
-func convertOpenRouterReasoningFields(response *dto.OpenAITextResponse) {
-	if response == nil || len(response.Choices) == 0 {
-		return
-	}
-
-	// 遍历所有choices，对每个Message使用统一的泛型函数进行转换
-	for i := range response.Choices {
-		choice := &response.Choices[i]
-		ConvertReasoningField(&choice.Message)
-	}
-}
--- a/relay/channel/openai/relay_responses.go
+++ b/relay/channel/openai/relay_responses.go
@@ -141,7 +141,7 @@ func OaiResponsesStreamHandler(c *gin.Context, info *relaycommon.RelayInfo, resp
 	}

 	if usage.PromptTokens == 0 && usage.CompletionTokens != 0 {
-		usage.PromptTokens = info.PromptTokens
+		usage.PromptTokens = info.GetEstimatePromptTokens()
 	}

 	usage.TotalTokens = usage.PromptTokens + usage.CompletionTokens
--- a/relay/channel/palm/adaptor.go
+++ b/relay/channel/palm/adaptor.go
@@ -81,7 +81,7 @@ func (a *Adaptor) DoResponse(c *gin.Context, resp *http.Response, info *relaycom
 	if info.IsStream {
 		var responseText string
 		err, responseText = palmStreamHandler(c, resp)
-		usage = service.ResponseText2Usage(c, responseText, info.UpstreamModelName, info.PromptTokens)
+		usage = service.ResponseText2Usage(c, responseText, info.UpstreamModelName, info.GetEstimatePromptTokens())
 	} else {
 		usage, err = palmHandler(c, info, resp)
 	}
--- a/relay/channel/palm/relay-palm.go
+++ b/relay/channel/palm/relay-palm.go
@@ -121,13 +121,8 @@ func palmHandler(c *gin.Context, info *relaycommon.RelayInfo, resp *http.Respons
 		}, resp.StatusCode)
 	}
 	fullTextResponse := responsePaLM2OpenAI(&palmResponse)
-	completionTokens := service.CountTextToken(palmResponse.Candidates[0].Content, info.UpstreamModelName)
-	usage := dto.Usage{
-		PromptTokens:     info.PromptTokens,
-		CompletionTokens: completionTokens,
-		TotalTokens:      info.PromptTokens + completionTokens,
-	}
-	fullTextResponse.Usage = usage
+	usage := service.ResponseText2Usage(c, palmResponse.Candidates[0].Content, info.UpstreamModelName, info.GetEstimatePromptTokens())
+	fullTextResponse.Usage = *usage
 	jsonResponse, err := common.Marshal(fullTextResponse)
 	if err != nil {
 		return nil, types.NewError(err, types.ErrorCodeBadResponseBody)
@@ -135,5 +130,5 @@ func palmHandler(c *gin.Context, info *relaycommon.RelayInfo, resp *http.Respons
 	c.Writer.Header().Set("Content-Type", "application/json")
 	c.Writer.WriteHeader(resp.StatusCode)
 	service.IOCopyBytesGracefully(c, resp, jsonResponse)
-	return &usage, nil
+	return usage, nil
 }
--- a/relay/channel/task/ali/adaptor.go
+++ b/relay/channel/task/ali/adaptor.go
@@ -393,7 +393,7 @@ func (a *TaskAdaptor) DoResponse(c *gin.Context, resp *http.Response, info *rela
 }

 // FetchTask 查询任务状态
-func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any) (*http.Response, error) {
+func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any, proxy string) (*http.Response, error) {
 	taskID, ok := body["task_id"].(string)
 	if !ok {
 		return nil, fmt.Errorf("invalid task_id")
@@ -408,7 +408,11 @@ func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any) (*http

 	req.Header.Set("Authorization", "Bearer "+key)

-	return service.GetHttpClient().Do(req)
+	client, err := service.GetHttpClientWithProxy(proxy)
+	if err != nil {
+		return nil, fmt.Errorf("new proxy http client failed: %w", err)
+	}
+	return client.Do(req)
 }

 func (a *TaskAdaptor) GetModelList() []string {
--- a/relay/channel/task/doubao/adaptor.go
+++ b/relay/channel/task/doubao/adaptor.go
@@ -146,7 +146,7 @@ func (a *TaskAdaptor) DoResponse(c *gin.Context, resp *http.Response, info *rela
 }

 // FetchTask fetch task status
-func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any) (*http.Response, error) {
+func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any, proxy string) (*http.Response, error) {
 	taskID, ok := body["task_id"].(string)
 	if !ok {
 		return nil, fmt.Errorf("invalid task_id")
@@ -163,7 +163,11 @@ func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any) (*http
 	req.Header.Set("Content-Type", "application/json")
 	req.Header.Set("Authorization", "Bearer "+key)

-	return service.GetHttpClient().Do(req)
+	client, err := service.GetHttpClientWithProxy(proxy)
+	if err != nil {
+		return nil, fmt.Errorf("new proxy http client failed: %w", err)
+	}
+	return client.Do(req)
 }

 func (a *TaskAdaptor) GetModelList() []string {
--- a/relay/channel/task/gemini/adaptor.go
+++ b/relay/channel/task/gemini/adaptor.go
@@ -200,7 +200,7 @@ func (a *TaskAdaptor) GetChannelName() string {
 }

 // FetchTask fetch task status
-func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any) (*http.Response, error) {
+func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any, proxy string) (*http.Response, error) {
 	taskID, ok := body["task_id"].(string)
 	if !ok {
 		return nil, fmt.Errorf("invalid task_id")
@@ -223,7 +223,11 @@ func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any) (*http
 	req.Header.Set("Accept", "application/json")
 	req.Header.Set("x-goog-api-key", key)

-	return service.GetHttpClient().Do(req)
+	client, err := service.GetHttpClientWithProxy(proxy)
+	if err != nil {
+		return nil, fmt.Errorf("new proxy http client failed: %w", err)
+	}
+	return client.Do(req)
 }

 func (a *TaskAdaptor) ParseTaskResult(respBody []byte) (*relaycommon.TaskInfo, error) {
--- a/relay/channel/task/hailuo/adaptor.go
+++ b/relay/channel/task/hailuo/adaptor.go
@@ -110,7 +110,7 @@ func (a *TaskAdaptor) DoResponse(c *gin.Context, resp *http.Response, info *rela
 	return hResp.TaskID, responseBody, nil
 }

-func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any) (*http.Response, error) {
+func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any, proxy string) (*http.Response, error) {
 	taskID, ok := body["task_id"].(string)
 	if !ok {
 		return nil, fmt.Errorf("invalid task_id")
@@ -126,7 +126,11 @@ func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any) (*http
 	req.Header.Set("Accept", "application/json")
 	req.Header.Set("Authorization", "Bearer "+key)

-	return service.GetHttpClient().Do(req)
+	client, err := service.GetHttpClientWithProxy(proxy)
+	if err != nil {
+		return nil, fmt.Errorf("new proxy http client failed: %w", err)
+	}
+	return client.Do(req)
 }

 func (a *TaskAdaptor) GetModelList() []string {
--- a/relay/channel/task/jimeng/adaptor.go
+++ b/relay/channel/task/jimeng/adaptor.go
@@ -210,7 +210,7 @@ func (a *TaskAdaptor) DoResponse(c *gin.Context, resp *http.Response, info *rela
 }

 // FetchTask fetch task status
-func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any) (*http.Response, error) {
+func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any, proxy string) (*http.Response, error) {
 	taskID, ok := body["task_id"].(string)
 	if !ok {
 		return nil, fmt.Errorf("invalid task_id")
@@ -251,7 +251,11 @@ func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any) (*http
 			return nil, errors.Wrap(err, "sign request failed")
 		}
 	}
-	return service.GetHttpClient().Do(req)
+	client, err := service.GetHttpClientWithProxy(proxy)
+	if err != nil {
+		return nil, fmt.Errorf("new proxy http client failed: %w", err)
+	}
+	return client.Do(req)
 }

 func (a *TaskAdaptor) GetModelList() []string {
--- a/relay/channel/task/kling/adaptor.go
+++ b/relay/channel/task/kling/adaptor.go
@@ -199,7 +199,7 @@ func (a *TaskAdaptor) DoResponse(c *gin.Context, resp *http.Response, info *rela
 }

 // FetchTask fetch task status
-func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any) (*http.Response, error) {
+func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any, proxy string) (*http.Response, error) {
 	taskID, ok := body["task_id"].(string)
 	if !ok {
 		return nil, fmt.Errorf("invalid task_id")
@@ -228,7 +228,11 @@ func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any) (*http
 	req.Header.Set("Authorization", "Bearer "+token)
 	req.Header.Set("User-Agent", "kling-sdk/1.0")

-	return service.GetHttpClient().Do(req)
+	client, err := service.GetHttpClientWithProxy(proxy)
+	if err != nil {
+		return nil, fmt.Errorf("new proxy http client failed: %w", err)
+	}
+	return client.Do(req)
 }

 func (a *TaskAdaptor) GetModelList() []string {
--- a/relay/channel/task/sora/adaptor.go
+++ b/relay/channel/task/sora/adaptor.go
@@ -5,8 +5,10 @@ import (
 	"fmt"
 	"io"
 	"net/http"
+	"strings"

 	"github.com/QuantumNous/new-api/common"
+	"github.com/QuantumNous/new-api/constant"
 	"github.com/QuantumNous/new-api/dto"
 	"github.com/QuantumNous/new-api/model"
 	"github.com/QuantumNous/new-api/relay/channel"
@@ -67,11 +69,30 @@ func (a *TaskAdaptor) Init(info *relaycommon.RelayInfo) {
 	a.apiKey = info.ApiKey
 }

+func validateRemixRequest(c *gin.Context) *dto.TaskError {
+	var req struct {
+		Prompt string `json:"prompt"`
+	}
+	if err := common.UnmarshalBodyReusable(c, &req); err != nil {
+		return service.TaskErrorWrapperLocal(err, "invalid_request", http.StatusBadRequest)
+	}
+	if strings.TrimSpace(req.Prompt) == "" {
+		return service.TaskErrorWrapperLocal(fmt.Errorf("field prompt is required"), "invalid_request", http.StatusBadRequest)
+	}
+	return nil
+}
+
 func (a *TaskAdaptor) ValidateRequestAndSetAction(c *gin.Context, info *relaycommon.RelayInfo) (taskErr *dto.TaskError) {
+	if info.Action == constant.TaskActionRemix {
+		return validateRemixRequest(c)
+	}
 	return relaycommon.ValidateMultipartDirect(c, info)
 }

 func (a *TaskAdaptor) BuildRequestURL(info *relaycommon.RelayInfo) (string, error) {
+	if info.Action == constant.TaskActionRemix {
+		return fmt.Sprintf("%s/v1/videos/%s/remix", a.baseURL, info.OriginTaskID), nil
+	}
 	return fmt.Sprintf("%s/v1/videos", a.baseURL), nil
 }

@@ -125,7 +146,7 @@ func (a *TaskAdaptor) DoResponse(c *gin.Context, resp *http.Response, _ *relayco
 }

 // FetchTask fetch task status
-func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any) (*http.Response, error) {
+func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any, proxy string) (*http.Response, error) {
 	taskID, ok := body["task_id"].(string)
 	if !ok {
 		return nil, fmt.Errorf("invalid task_id")
@@ -140,7 +161,11 @@ func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any) (*http

 	req.Header.Set("Authorization", "Bearer "+key)

-	return service.GetHttpClient().Do(req)
+	client, err := service.GetHttpClientWithProxy(proxy)
+	if err != nil {
+		return nil, fmt.Errorf("new proxy http client failed: %w", err)
+	}
+	return client.Do(req)
 }

 func (a *TaskAdaptor) GetModelList() []string {
--- a/relay/channel/task/suno/adaptor.go
+++ b/relay/channel/task/suno/adaptor.go
@@ -132,7 +132,7 @@ func (a *TaskAdaptor) GetChannelName() string {
 	return ChannelName
 }

-func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any) (*http.Response, error) {
+func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any, proxy string) (*http.Response, error) {
 	requestUrl := fmt.Sprintf("%s/suno/fetch", baseUrl)
 	byteBody, err := json.Marshal(body)
 	if err != nil {
@@ -153,11 +153,11 @@ func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any) (*http
 	req = req.WithContext(ctx)
 	req.Header.Set("Content-Type", "application/json")
 	req.Header.Set("Authorization", "Bearer "+key)
-	resp, err := service.GetHttpClient().Do(req)
+	client, err := service.GetHttpClientWithProxy(proxy)
 	if err != nil {
-		return nil, err
+		return nil, fmt.Errorf("new proxy http client failed: %w", err)
 	}
-	return resp, nil
+	return client.Do(req)
 }

 func actionValidate(c *gin.Context, sunoRequest *dto.SunoSubmitReq, action string) (err error) {
--- a/relay/channel/task/vertex/adaptor.go
+++ b/relay/channel/task/vertex/adaptor.go
@@ -12,7 +12,6 @@ import (

 	"github.com/QuantumNous/new-api/common"
 	"github.com/QuantumNous/new-api/model"
-
 	"github.com/gin-gonic/gin"

 	"github.com/QuantumNous/new-api/constant"
@@ -121,7 +120,11 @@ func (a *TaskAdaptor) BuildRequestHeader(c *gin.Context, req *http.Request, info
 		return fmt.Errorf("failed to decode credentials: %w", err)
 	}

-	token, err := vertexcore.AcquireAccessToken(*adc, "")
+	proxy := ""
+	if info != nil {
+		proxy = info.ChannelSetting.Proxy
+	}
+	token, err := vertexcore.AcquireAccessToken(*adc, proxy)
 	if err != nil {
 		return fmt.Errorf("failed to acquire access token: %w", err)
 	}
@@ -147,13 +150,40 @@ func (a *TaskAdaptor) BuildRequestBody(c *gin.Context, info *relaycommon.RelayIn
 			body.Parameters["storageUri"] = v
 		}
 		if v, ok := req.Metadata["sampleCount"]; ok {
-			body.Parameters["sampleCount"] = v
+			if i, ok := v.(int); ok {
+				body.Parameters["sampleCount"] = i
+			}
+			if f, ok := v.(float64); ok {
+				body.Parameters["sampleCount"] = int(f)
+			}
 		}
 	}
 	if _, ok := body.Parameters["sampleCount"]; !ok {
 		body.Parameters["sampleCount"] = 1
 	}

+	if body.Parameters["sampleCount"].(int) <= 0 {
+		return nil, fmt.Errorf("sampleCount must be greater than 0")
+	}
+
+	// if req.Duration > 0 {
+	// 	body.Parameters["durationSeconds"] = req.Duration
+	// } else if req.Seconds != "" {
+	// 	seconds, err := strconv.Atoi(req.Seconds)
+	// 	if err != nil {
+	// 		return nil, errors.Wrap(err, "convert seconds to int failed")
+	// 	}
+	// 	body.Parameters["durationSeconds"] = seconds
+	// }
+
+	info.PriceData.OtherRatios = map[string]float64{
+		"sampleCount": float64(body.Parameters["sampleCount"].(int)),
+	}
+
+	// if v, ok := body.Parameters["durationSeconds"]; ok {
+	// 	info.PriceData.OtherRatios["durationSeconds"] = float64(v.(int))
+	// }
+
 	data, err := json.Marshal(body)
 	if err != nil {
 		return nil, err
@@ -190,7 +220,7 @@ func (a *TaskAdaptor) GetModelList() []string { return []string{"veo-3.0-generat
 func (a *TaskAdaptor) GetChannelName() string { return "vertex" }

 // FetchTask fetch task status
-func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any) (*http.Response, error) {
+func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any, proxy string) (*http.Response, error) {
 	taskID, ok := body["task_id"].(string)
 	if !ok {
 		return nil, fmt.Errorf("invalid task_id")
@@ -223,7 +253,7 @@ func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any) (*http
 	if err := json.Unmarshal([]byte(key), adc); err != nil {
 		return nil, fmt.Errorf("failed to decode credentials: %w", err)
 	}
-	token, err := vertexcore.AcquireAccessToken(*adc, "")
+	token, err := vertexcore.AcquireAccessToken(*adc, proxy)
 	if err != nil {
 		return nil, fmt.Errorf("failed to acquire access token: %w", err)
 	}
@@ -235,7 +265,11 @@ func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any) (*http
 	req.Header.Set("Accept", "application/json")
 	req.Header.Set("Authorization", "Bearer "+token)
 	req.Header.Set("x-goog-user-project", adc.ProjectID)
-	return service.GetHttpClient().Do(req)
+	client, err := service.GetHttpClientWithProxy(proxy)
+	if err != nil {
+		return nil, fmt.Errorf("new proxy http client failed: %w", err)
+	}
+	return client.Do(req)
 }

 func (a *TaskAdaptor) ParseTaskResult(respBody []byte) (*relaycommon.TaskInfo, error) {
--- a/relay/channel/task/vidu/adaptor.go
+++ b/relay/channel/task/vidu/adaptor.go
@@ -188,7 +188,7 @@ func (a *TaskAdaptor) DoResponse(c *gin.Context, resp *http.Response, info *rela
 	return vResp.TaskId, responseBody, nil
 }

-func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any) (*http.Response, error) {
+func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any, proxy string) (*http.Response, error) {
 	taskID, ok := body["task_id"].(string)
 	if !ok {
 		return nil, fmt.Errorf("invalid task_id")
@@ -204,7 +204,11 @@ func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any) (*http
 	req.Header.Set("Accept", "application/json")
 	req.Header.Set("Authorization", "Token "+key)

-	return service.GetHttpClient().Do(req)
+	client, err := service.GetHttpClientWithProxy(proxy)
+	if err != nil {
+		return nil, fmt.Errorf("new proxy http client failed: %w", err)
+	}
+	return client.Do(req)
 }

 func (a *TaskAdaptor) GetModelList() []string {
--- a/relay/channel/tencent/relay-tencent.go
+++ b/relay/channel/tencent/relay-tencent.go
@@ -105,7 +105,7 @@ func tencentStreamHandler(c *gin.Context, info *relaycommon.RelayInfo, resp *htt
 		data = strings.TrimPrefix(data, "data:")

 		var tencentResponse TencentChatResponse
-		err := json.Unmarshal([]byte(data), &tencentResponse)
+		err := common.Unmarshal([]byte(data), &tencentResponse)
 		if err != nil {
 			common.SysLog("error unmarshalling stream response: " + err.Error())
 			continue
@@ -130,7 +130,7 @@ func tencentStreamHandler(c *gin.Context, info *relaycommon.RelayInfo, resp *htt

 	service.CloseResponseBodyGracefully(resp)

-	return service.ResponseText2Usage(c, responseText, info.UpstreamModelName, info.PromptTokens), nil
+	return service.ResponseText2Usage(c, responseText, info.UpstreamModelName, info.GetEstimatePromptTokens()), nil
 }

 func tencentHandler(c *gin.Context, info *relaycommon.RelayInfo, resp *http.Response) (*dto.Usage, *types.NewAPIError) {
--- a/relay/channel/vertex/adaptor.go
+++ b/relay/channel/vertex/adaptor.go
@@ -17,6 +17,7 @@ import (
 	relaycommon "github.com/QuantumNous/new-api/relay/common"
 	"github.com/QuantumNous/new-api/relay/constant"
 	"github.com/QuantumNous/new-api/setting/model_setting"
+	"github.com/QuantumNous/new-api/setting/reasoning"
 	"github.com/QuantumNous/new-api/types"

 	"github.com/gin-gonic/gin"
@@ -39,6 +40,7 @@ var claudeModelMap = map[string]string{
 	"claude-opus-4-20250514":     "claude-opus-4@20250514",
 	"claude-opus-4-1-20250805":   "claude-opus-4-1@20250805",
 	"claude-sonnet-4-5-20250929": "claude-sonnet-4-5@20250929",
+	"claude-opus-4-5-20251101":   "claude-opus-4-5@20251101",
 }

 const anthropicVersion = "vertex-2023-10-16"
@@ -180,6 +182,8 @@ func (a *Adaptor) GetRequestURL(info *relaycommon.RelayInfo) (string, error) {
 				info.UpstreamModelName = strings.TrimSuffix(info.UpstreamModelName, "-thinking")
 			} else if strings.HasSuffix(info.UpstreamModelName, "-nothinking") {
 				info.UpstreamModelName = strings.TrimSuffix(info.UpstreamModelName, "-nothinking")
+			} else if baseModel, level, ok := reasoning.TrimEffortSuffix(info.UpstreamModelName); ok && level != "" {
+				info.UpstreamModelName = baseModel
 			}
 		}

--- a/relay/channel/volcengine/adaptor.go
+++ b/relay/channel/volcengine/adaptor.go
@@ -13,6 +13,7 @@ import (
 	channelconstant "github.com/QuantumNous/new-api/constant"
 	"github.com/QuantumNous/new-api/dto"
 	"github.com/QuantumNous/new-api/relay/channel"
+	"github.com/QuantumNous/new-api/relay/channel/claude"
 	"github.com/QuantumNous/new-api/relay/channel/openai"
 	relaycommon "github.com/QuantumNous/new-api/relay/common"
 	"github.com/QuantumNous/new-api/relay/constant"
@@ -23,11 +24,8 @@ import (
 )

 const (
-	contextKeyTTSRequest          = "volcengine_tts_request"
-	contextKeyResponseFormat      = "response_format"
-	DoubaoCodingPlan              = "doubao-coding-plan"
-	DoubaoCodingPlanClaudeBaseURL = "https://ark.cn-beijing.volces.com/api/coding"
-	DoubaoCodingPlanOpenAIBaseURL = "https://ark.cn-beijing.volces.com/api/coding/v3"
+	contextKeyTTSRequest     = "volcengine_tts_request"
+	contextKeyResponseFormat = "response_format"
 )

 type Adaptor struct {
@@ -39,6 +37,10 @@ func (a *Adaptor) ConvertGeminiRequest(*gin.Context, *relaycommon.RelayInfo, *dt
 }

 func (a *Adaptor) ConvertClaudeRequest(c *gin.Context, info *relaycommon.RelayInfo, req *dto.ClaudeRequest) (any, error) {
+	if _, ok := channelconstant.ChannelSpecialBases[info.ChannelBaseUrl]; ok {
+		adaptor := claude.Adaptor{}
+		return adaptor.ConvertClaudeRequest(c, info, req)
+	}
 	adaptor := openai.Adaptor{}
 	return adaptor.ConvertClaudeRequest(c, info, req)
 }
@@ -238,11 +240,12 @@ func (a *Adaptor) GetRequestURL(info *relaycommon.RelayInfo) (string, error) {
 	if baseUrl == "" {
 		baseUrl = channelconstant.ChannelBaseURLs[channelconstant.ChannelTypeVolcEngine]
 	}
+	specialPlan, hasSpecialPlan := channelconstant.ChannelSpecialBases[baseUrl]

 	switch info.RelayFormat {
 	case types.RelayFormatClaude:
-		if baseUrl == DoubaoCodingPlan {
-			return fmt.Sprintf("%s/v1/messages", DoubaoCodingPlanClaudeBaseURL), nil
+		if hasSpecialPlan && specialPlan.ClaudeBaseURL != "" {
+			return fmt.Sprintf("%s/v1/messages", specialPlan.ClaudeBaseURL), nil
 		}
 		if strings.HasPrefix(info.UpstreamModelName, "bot") {
 			return fmt.Sprintf("%s/api/v3/bots/chat/completions", baseUrl), nil
@@ -251,8 +254,8 @@ func (a *Adaptor) GetRequestURL(info *relaycommon.RelayInfo) (string, error) {
 	default:
 		switch info.RelayMode {
 		case constant.RelayModeChatCompletions:
-			if baseUrl == DoubaoCodingPlan {
-				return fmt.Sprintf("%s/chat/completions", DoubaoCodingPlanOpenAIBaseURL), nil
+			if hasSpecialPlan && specialPlan.OpenAIBaseURL != "" {
+				return fmt.Sprintf("%s/chat/completions", specialPlan.OpenAIBaseURL), nil
 			}
 			if strings.HasPrefix(info.UpstreamModelName, "bot") {
 				return fmt.Sprintf("%s/api/v3/bots/chat/completions", baseUrl), nil
@@ -340,6 +343,15 @@ func (a *Adaptor) DoRequest(c *gin.Context, info *relaycommon.RelayInfo, request
 }

 func (a *Adaptor) DoResponse(c *gin.Context, resp *http.Response, info *relaycommon.RelayInfo) (usage any, err *types.NewAPIError) {
+	if info.RelayFormat == types.RelayFormatClaude {
+		if _, ok := channelconstant.ChannelSpecialBases[info.ChannelBaseUrl]; ok {
+			if info.IsStream {
+				return claude.ClaudeStreamHandler(c, resp, info, claude.RequestModeMessage)
+			}
+			return claude.ClaudeHandler(c, resp, info, claude.RequestModeMessage)
+		}
+	}
+
 	if info.RelayMode == constant.RelayModeAudioSpeech {
 		encoding := mapEncoding(c.GetString(contextKeyResponseFormat))
 		if info.IsStream {
--- a/relay/channel/volcengine/protocols.go
+++ b/relay/channel/volcengine/protocols.go
@@ -385,7 +385,7 @@ func (m *Message) writeSessionID(buf *bytes.Buffer) error {
 	}

 	size := len(m.SessionID)
-	if size > math.MaxUint32 {
+	if int64(size) > math.MaxUint32 {
 		return fmt.Errorf("session ID size (%d) exceeds max(uint32)", size)
 	}

@@ -407,7 +407,7 @@ func (m *Message) writeErrorCode(buf *bytes.Buffer) error {

 func (m *Message) writePayload(buf *bytes.Buffer) error {
 	size := len(m.Payload)
-	if size > math.MaxUint32 {
+	if int64(size) > math.MaxUint32 {
 		return fmt.Errorf("payload size (%d) exceeds max(uint32)", size)
 	}

--- a/relay/channel/volcengine/tts.go
+++ b/relay/channel/volcengine/tts.go
@@ -184,9 +184,9 @@ func handleTTSResponse(c *gin.Context, resp *http.Response, info *relaycommon.Re
 	c.Data(http.StatusOK, contentType, audioData)

 	usage = &dto.Usage{
-		PromptTokens:     info.PromptTokens,
+		PromptTokens:     info.GetEstimatePromptTokens(),
 		CompletionTokens: 0,
-		TotalTokens:      info.PromptTokens,
+		TotalTokens:      info.GetEstimatePromptTokens(),
 	}

 	return usage, nil
@@ -284,9 +284,9 @@ func handleTTSWebSocketResponse(c *gin.Context, requestURL string, volcRequest V
 			if msg.Sequence < 0 {
 				c.Status(http.StatusOK)
 				usage = &dto.Usage{
-					PromptTokens:     info.PromptTokens,
+					PromptTokens:     info.GetEstimatePromptTokens(),
 					CompletionTokens: 0,
-					TotalTokens:      info.PromptTokens,
+					TotalTokens:      info.GetEstimatePromptTokens(),
 				}
 				return usage, nil
 			}
@@ -297,9 +297,9 @@ func handleTTSWebSocketResponse(c *gin.Context, requestURL string, volcRequest V

 	c.Status(http.StatusOK)
 	usage = &dto.Usage{
-		PromptTokens:     info.PromptTokens,
+		PromptTokens:     info.GetEstimatePromptTokens(),
 		CompletionTokens: 0,
-		TotalTokens:      info.PromptTokens,
+		TotalTokens:      info.GetEstimatePromptTokens(),
 	}
 	return usage, nil
 }
--- a/relay/channel/xai/text.go
+++ b/relay/channel/xai/text.go
@@ -70,7 +70,7 @@ func xAIStreamHandler(c *gin.Context, info *relaycommon.RelayInfo, resp *http.Re
 	})

 	if !containStreamUsage {
-		usage = service.ResponseText2Usage(c, responseTextBuilder.String(), info.UpstreamModelName, info.PromptTokens)
+		usage = service.ResponseText2Usage(c, responseTextBuilder.String(), info.UpstreamModelName, info.GetEstimatePromptTokens())
 		usage.CompletionTokens += toolCount * 7
 	}

--- a/relay/channel/zhipu_4v/adaptor.go
+++ b/relay/channel/zhipu_4v/adaptor.go
@@ -6,6 +6,7 @@ import (
 	"io"
 	"net/http"

+	channelconstant "github.com/QuantumNous/new-api/constant"
 	"github.com/QuantumNous/new-api/dto"
 	"github.com/QuantumNous/new-api/relay/channel"
 	"github.com/QuantumNous/new-api/relay/channel/claude"
@@ -35,23 +36,39 @@ func (a *Adaptor) ConvertAudioRequest(c *gin.Context, info *relaycommon.RelayInf
 }

 func (a *Adaptor) ConvertImageRequest(c *gin.Context, info *relaycommon.RelayInfo, request dto.ImageRequest) (any, error) {
-	//TODO implement me
-	return nil, errors.New("not implemented")
+	return request, nil
 }

 func (a *Adaptor) Init(info *relaycommon.RelayInfo) {
 }

 func (a *Adaptor) GetRequestURL(info *relaycommon.RelayInfo) (string, error) {
+	baseURL := info.ChannelBaseUrl
+	if baseURL == "" {
+		baseURL = channelconstant.ChannelBaseURLs[channelconstant.ChannelTypeZhipu_v4]
+	}
+	specialPlan, hasSpecialPlan := channelconstant.ChannelSpecialBases[baseURL]
+
 	switch info.RelayFormat {
 	case types.RelayFormatClaude:
-		return fmt.Sprintf("%s/api/anthropic/v1/messages", info.ChannelBaseUrl), nil
+		if hasSpecialPlan && specialPlan.ClaudeBaseURL != "" {
+			return fmt.Sprintf("%s/v1/messages", specialPlan.ClaudeBaseURL), nil
+		}
+		return fmt.Sprintf("%s/api/anthropic/v1/messages", baseURL), nil
 	default:
 		switch info.RelayMode {
 		case relayconstant.RelayModeEmbeddings:
-			return fmt.Sprintf("%s/api/paas/v4/embeddings", info.ChannelBaseUrl), nil
+			if hasSpecialPlan && specialPlan.OpenAIBaseURL != "" {
+				return fmt.Sprintf("%s/embeddings", specialPlan.OpenAIBaseURL), nil
+			}
+			return fmt.Sprintf("%s/api/paas/v4/embeddings", baseURL), nil
+		case relayconstant.RelayModeImagesGenerations:
+			return fmt.Sprintf("%s/api/paas/v4/images/generations", baseURL), nil
 		default:
-			return fmt.Sprintf("%s/api/paas/v4/chat/completions", info.ChannelBaseUrl), nil
+			if hasSpecialPlan && specialPlan.OpenAIBaseURL != "" {
+				return fmt.Sprintf("%s/chat/completions", specialPlan.OpenAIBaseURL), nil
+			}
+			return fmt.Sprintf("%s/api/paas/v4/chat/completions", baseURL), nil
 		}
 	}
 }
@@ -98,6 +115,9 @@ func (a *Adaptor) DoResponse(c *gin.Context, resp *http.Response, info *relaycom
 			return claude.ClaudeHandler(c, resp, info, claude.RequestModeMessage)
 		}
 	default:
+		if info.RelayMode == relayconstant.RelayModeImagesGenerations {
+			return zhipu4vImageHandler(c, resp, info)
+		}
 		adaptor := openai.Adaptor{}
 		return adaptor.DoResponse(c, resp, info)
 	}
--- a/relay/channel/zhipu_4v/constants.go
+++ b/relay/channel/zhipu_4v/constants.go
@@ -1,7 +1,7 @@
 package zhipu_4v

 var ModelList = []string{
-	"glm-4", "glm-4v", "glm-3-turbo", "glm-4-alltools", "glm-4-plus", "glm-4-0520", "glm-4-air", "glm-4-airx", "glm-4-long", "glm-4-flash", "glm-4v-plus",
+	"glm-4", "glm-4v", "glm-3-turbo", "glm-4-alltools", "glm-4-plus", "glm-4-0520", "glm-4-air", "glm-4-airx", "glm-4-long", "glm-4-flash", "glm-4v-plus", "glm-4.6",
 }

 var ChannelName = "zhipu_4v"
--- a/relay/channel/zhipu_4v/image.go
+++ b/relay/channel/zhipu_4v/image.go
@@ -0,0 +1,127 @@
+package zhipu_4v
+
+import (
+	"io"
+	"net/http"
+
+	"github.com/QuantumNous/new-api/common"
+	"github.com/QuantumNous/new-api/dto"
+	"github.com/QuantumNous/new-api/logger"
+	relaycommon "github.com/QuantumNous/new-api/relay/common"
+	"github.com/QuantumNous/new-api/service"
+	"github.com/QuantumNous/new-api/types"
+
+	"github.com/gin-gonic/gin"
+)
+
+type zhipuImageRequest struct {
+	Model            string `json:"model"`
+	Prompt           string `json:"prompt"`
+	Quality          string `json:"quality,omitempty"`
+	Size             string `json:"size,omitempty"`
+	WatermarkEnabled *bool  `json:"watermark_enabled,omitempty"`
+	UserID           string `json:"user_id,omitempty"`
+}
+
+type zhipuImageResponse struct {
+	Created       *int64            `json:"created,omitempty"`
+	Data          []zhipuImageData  `json:"data,omitempty"`
+	ContentFilter any               `json:"content_filter,omitempty"`
+	Usage         *dto.Usage        `json:"usage,omitempty"`
+	Error         *zhipuImageError  `json:"error,omitempty"`
+	RequestID     string            `json:"request_id,omitempty"`
+	ExtendParam   map[string]string `json:"extendParam,omitempty"`
+}
+
+type zhipuImageError struct {
+	Code    string `json:"code"`
+	Message string `json:"message"`
+}
+
+type zhipuImageData struct {
+	Url      string `json:"url,omitempty"`
+	ImageUrl string `json:"image_url,omitempty"`
+	B64Json  string `json:"b64_json,omitempty"`
+	B64Image string `json:"b64_image,omitempty"`
+}
+
+type openAIImagePayload struct {
+	Created int64             `json:"created"`
+	Data    []openAIImageData `json:"data"`
+}
+
+type openAIImageData struct {
+	B64Json string `json:"b64_json"`
+}
+
+func zhipu4vImageHandler(c *gin.Context, resp *http.Response, info *relaycommon.RelayInfo) (*dto.Usage, *types.NewAPIError) {
+	responseBody, err := io.ReadAll(resp.Body)
+	if err != nil {
+		return nil, types.NewOpenAIError(err, types.ErrorCodeReadResponseBodyFailed, http.StatusInternalServerError)
+	}
+	service.CloseResponseBodyGracefully(resp)
+
+	var zhipuResp zhipuImageResponse
+	if err := common.Unmarshal(responseBody, &zhipuResp); err != nil {
+		return nil, types.NewOpenAIError(err, types.ErrorCodeBadResponseBody, http.StatusInternalServerError)
+	}
+
+	if zhipuResp.Error != nil && zhipuResp.Error.Message != "" {
+		return nil, types.WithOpenAIError(types.OpenAIError{
+			Message: zhipuResp.Error.Message,
+			Type:    "zhipu_image_error",
+			Code:    zhipuResp.Error.Code,
+		}, resp.StatusCode)
+	}
+
+	payload := openAIImagePayload{}
+	if zhipuResp.Created != nil && *zhipuResp.Created != 0 {
+		payload.Created = *zhipuResp.Created
+	} else {
+		payload.Created = info.StartTime.Unix()
+	}
+	for _, data := range zhipuResp.Data {
+		url := data.Url
+		if url == "" {
+			url = data.ImageUrl
+		}
+		if url == "" {
+			logger.LogWarn(c, "zhipu_image_missing_url")
+			continue
+		}
+
+		var b64 string
+		switch {
+		case data.B64Json != "":
+			b64 = data.B64Json
+		case data.B64Image != "":
+			b64 = data.B64Image
+		default:
+			_, downloaded, err := service.GetImageFromUrl(url)
+			if err != nil {
+				logger.LogError(c, "zhipu_image_get_b64_failed: "+err.Error())
+				continue
+			}
+			b64 = downloaded
+		}
+
+		if b64 == "" {
+			logger.LogWarn(c, "zhipu_image_empty_b64")
+			continue
+		}
+
+		imageData := openAIImageData{
+			B64Json: b64,
+		}
+		payload.Data = append(payload.Data, imageData)
+	}
+
+	jsonResp, err := common.Marshal(payload)
+	if err != nil {
+		return nil, types.NewError(err, types.ErrorCodeBadResponseBody)
+	}
+
+	service.IOCopyBytesGracefully(c, resp, jsonResp)
+
+	return &dto.Usage{}, nil
+}
--- a/relay/common/override.go
+++ b/relay/common/override.go
@@ -11,6 +11,8 @@ import (
 	"github.com/tidwall/sjson"
 )

+var negativeIndexRegexp = regexp.MustCompile(`\.(-\d+)`)
+
 type ConditionOperation struct {
 	Path           string      `json:"path"`             // JSON路径
 	Mode           string      `json:"mode"`             // full, prefix, suffix, contains, gt, gte, lt, lte
@@ -186,8 +188,7 @@ func checkSingleCondition(jsonStr, contextJSON string, condition ConditionOperat
 }

 func processNegativeIndex(jsonStr string, path string) string {
-	re := regexp.MustCompile(`\.(-\d+)`)
-	matches := re.FindAllStringSubmatch(path, -1)
+	matches := negativeIndexRegexp.FindAllStringSubmatch(path, -1)

 	if len(matches) == 0 {
 		return path
--- a/relay/common/relay_info.go
+++ b/relay/common/relay_info.go
@@ -73,6 +73,11 @@ type ChannelMeta struct {
 	SupportStreamOptions bool // 是否支持流式选项
 }

+type TokenCountMeta struct {
+	//promptTokens int
+	estimatePromptTokens int
+}
+
 type RelayInfo struct {
 	TokenId           int
 	TokenKey          string
@@ -91,7 +96,6 @@ type RelayInfo struct {
 	RelayMode              int
 	OriginModelName        string
 	RequestURLPath         string
-	PromptTokens           int
 	ShouldIncludeUsage     bool
 	DisablePing            bool // 是否禁止向下游发送自定义 Ping
 	ClientWs               *websocket.Conn
@@ -115,6 +119,7 @@ type RelayInfo struct {
 	Request dto.Request

 	ThinkingContentInfo
+	TokenCountMeta
 	*ClaudeConvertInfo
 	*RerankerInfo
 	*ResponsesUsageInfo
@@ -189,7 +194,7 @@ func (info *RelayInfo) ToString() string {
 	fmt.Fprintf(b, "IsPlayground: %t, ", info.IsPlayground)
 	fmt.Fprintf(b, "RequestURLPath: %q, ", info.RequestURLPath)
 	fmt.Fprintf(b, "OriginModelName: %q, ", info.OriginModelName)
-	fmt.Fprintf(b, "PromptTokens: %d, ", info.PromptTokens)
+	fmt.Fprintf(b, "EstimatePromptTokens: %d, ", info.estimatePromptTokens)
 	fmt.Fprintf(b, "ShouldIncludeUsage: %t, ", info.ShouldIncludeUsage)
 	fmt.Fprintf(b, "DisablePing: %t, ", info.DisablePing)
 	fmt.Fprintf(b, "SendResponseCount: %d, ", info.SendResponseCount)
@@ -391,7 +396,6 @@ func genBaseRelayInfo(c *gin.Context, request dto.Request) *RelayInfo {
 		UserEmail:  common.GetContextKeyString(c, constant.ContextKeyUserEmail),

 		OriginModelName: common.GetContextKeyString(c, constant.ContextKeyOriginalModel),
-		PromptTokens:    common.GetContextKeyInt(c, constant.ContextKeyPromptTokens),

 		TokenId:        common.GetContextKeyInt(c, constant.ContextKeyTokenId),
 		TokenKey:       common.GetContextKeyString(c, constant.ContextKeyTokenKey),
@@ -408,6 +412,10 @@ func genBaseRelayInfo(c *gin.Context, request dto.Request) *RelayInfo {
 			IsFirstThinkingContent:  true,
 			SendLastThinkingContent: false,
 		},
+		TokenCountMeta: TokenCountMeta{
+			//promptTokens: common.GetContextKeyInt(c, constant.ContextKeyPromptTokens),
+			estimatePromptTokens: common.GetContextKeyInt(c, constant.ContextKeyEstimatedTokens),
+		},
 	}

 	if info.RelayMode == relayconstant.RelayModeUnknown {
@@ -463,8 +471,16 @@ func GenRelayInfo(c *gin.Context, relayFormat types.RelayFormat, request dto.Req
 	}
 }

-func (info *RelayInfo) SetPromptTokens(promptTokens int) {
-	info.PromptTokens = promptTokens
+//func (info *RelayInfo) SetPromptTokens(promptTokens int) {
+//	info.promptTokens = promptTokens
+//}
+
+func (info *RelayInfo) SetEstimatePromptTokens(promptTokens int) {
+	info.estimatePromptTokens = promptTokens
+}
+
+func (info *RelayInfo) GetEstimatePromptTokens() int {
+	return info.estimatePromptTokens
 }

 func (info *RelayInfo) SetFirstResponseTime() {
--- a/relay/common_handler/rerank.go
+++ b/relay/common_handler/rerank.go
@@ -57,8 +57,8 @@ func RerankHandler(c *gin.Context, info *relaycommon.RelayInfo, resp *http.Respo
 		jinaResp = dto.RerankResponse{
 			Results: jinaRespResults,
 			Usage: dto.Usage{
-				PromptTokens: info.PromptTokens,
-				TotalTokens:  info.PromptTokens,
+				PromptTokens: info.GetEstimatePromptTokens(),
+				TotalTokens:  info.GetEstimatePromptTokens(),
 			},
 		}
 	} else {
--- a/relay/compatible_handler.go
+++ b/relay/compatible_handler.go
@@ -192,9 +192,9 @@ func TextHelper(c *gin.Context, info *relaycommon.RelayInfo) (newAPIError *types
 func postConsumeQuota(ctx *gin.Context, relayInfo *relaycommon.RelayInfo, usage *dto.Usage, extraContent string) {
 	if usage == nil {
 		usage = &dto.Usage{
-			PromptTokens:     relayInfo.PromptTokens,
+			PromptTokens:     relayInfo.GetEstimatePromptTokens(),
 			CompletionTokens: 0,
-			TotalTokens:      relayInfo.PromptTokens,
+			TotalTokens:      relayInfo.GetEstimatePromptTokens(),
 		}
 		extraContent += "（可能是请求出错）"
 	}
--- a/relay/helper/price.go
+++ b/relay/helper/price.go
@@ -99,7 +99,10 @@ func ModelPriceHelper(c *gin.Context, info *relaycommon.RelayInfo, promptTokens
 	// check if free model pre-consume is disabled
 	if !operation_setting.GetQuotaSetting().EnableFreeModelPreConsume {
 		// if model price or ratio is 0, do not pre-consume quota
-		if usePrice {
+		if groupRatioInfo.GroupRatio == 0 {
+			preConsumedQuota = 0
+			freeModel = true
+		} else if usePrice {
 			if modelPrice == 0 {
 				preConsumedQuota = 0
 				freeModel = true
--- a/relay/helper/stream_scanner.go
+++ b/relay/helper/stream_scanner.go
@@ -22,11 +22,18 @@ import (
 )

 const (
-	InitialScannerBufferSize = 64 << 10 // 64KB (64*1024)
-	MaxScannerBufferSize     = 10 << 20 // 10MB (10*1024*1024)
-	DefaultPingInterval      = 10 * time.Second
+	InitialScannerBufferSize    = 64 << 10 // 64KB (64*1024)
+	DefaultMaxScannerBufferSize = 64 << 20 // 64MB (64*1024*1024) default SSE buffer size
+	DefaultPingInterval         = 10 * time.Second
 )

+func getScannerBufferSize() int {
+	if constant.StreamScannerMaxBufferMB > 0 {
+		return constant.StreamScannerMaxBufferMB << 20
+	}
+	return DefaultMaxScannerBufferSize
+}
+
 func StreamScannerHandler(c *gin.Context, resp *http.Response, info *relaycommon.RelayInfo, dataHandler func(data string) bool) {

 	if resp == nil || dataHandler == nil {
@@ -65,6 +72,8 @@ func StreamScannerHandler(c *gin.Context, resp *http.Response, info *relaycommon
 	if common.DebugEnabled {
 		// print timeout and ping interval for debugging
 		println("relay timeout seconds:", common.RelayTimeout)
+		println("relay max idle conns:", common.RelayMaxIdleConns)
+		println("relay max idle conns per host:", common.RelayMaxIdleConnsPerHost)
 		println("streaming timeout seconds:", int64(streamingTimeout.Seconds()))
 		println("ping interval seconds:", int64(pingInterval.Seconds()))
 	}
@@ -95,7 +104,7 @@ func StreamScannerHandler(c *gin.Context, resp *http.Response, info *relaycommon
 		close(stopChan)
 	}()

-	scanner.Buffer(make([]byte, InitialScannerBufferSize), MaxScannerBufferSize)
+	scanner.Buffer(make([]byte, InitialScannerBufferSize), getScannerBufferSize())
 	scanner.Split(bufio.ScanLines)
 	SetEventStreamHeaders(c)

--- a/relay/relay_task.go
+++ b/relay/relay_task.go
@@ -32,7 +32,94 @@ func RelayTaskSubmit(c *gin.Context, info *relaycommon.RelayInfo) (taskErr *dto.
 	if info.TaskRelayInfo == nil {
 		info.TaskRelayInfo = &relaycommon.TaskRelayInfo{}
 	}
+	path := c.Request.URL.Path
+	if strings.Contains(path, "/v1/videos/") && strings.HasSuffix(path, "/remix") {
+		info.Action = constant.TaskActionRemix
+	}
+
+	// 提取 remix 任务的 video_id
+	if info.Action == constant.TaskActionRemix {
+		videoID := c.Param("video_id")
+		if strings.TrimSpace(videoID) == "" {
+			return service.TaskErrorWrapperLocal(fmt.Errorf("video_id is required"), "invalid_request", http.StatusBadRequest)
+		}
+		info.OriginTaskID = videoID
+	}
+
 	platform := constant.TaskPlatform(c.GetString("platform"))
+
+	// 获取原始任务信息
+	if info.OriginTaskID != "" {
+		originTask, exist, err := model.GetByTaskId(info.UserId, info.OriginTaskID)
+		if err != nil {
+			taskErr = service.TaskErrorWrapper(err, "get_origin_task_failed", http.StatusInternalServerError)
+			return
+		}
+		if !exist {
+			taskErr = service.TaskErrorWrapperLocal(errors.New("task_origin_not_exist"), "task_not_exist", http.StatusBadRequest)
+			return
+		}
+		if info.OriginModelName == "" {
+			if originTask.Properties.OriginModelName != "" {
+				info.OriginModelName = originTask.Properties.OriginModelName
+			} else if originTask.Properties.UpstreamModelName != "" {
+				info.OriginModelName = originTask.Properties.UpstreamModelName
+			} else {
+				var taskData map[string]interface{}
+				_ = json.Unmarshal(originTask.Data, &taskData)
+				if m, ok := taskData["model"].(string); ok && m != "" {
+					info.OriginModelName = m
+					platform = originTask.Platform
+				}
+			}
+		}
+		if originTask.ChannelId != info.ChannelId {
+			channel, err := model.GetChannelById(originTask.ChannelId, true)
+			if err != nil {
+				taskErr = service.TaskErrorWrapperLocal(err, "channel_not_found", http.StatusBadRequest)
+				return
+			}
+			if channel.Status != common.ChannelStatusEnabled {
+				taskErr = service.TaskErrorWrapperLocal(errors.New("the channel of the origin task is disabled"), "task_channel_disable", http.StatusBadRequest)
+				return
+			}
+			key, _, newAPIError := channel.GetNextEnabledKey()
+			if newAPIError != nil {
+				taskErr = service.TaskErrorWrapper(newAPIError, "channel_no_available_key", newAPIError.StatusCode)
+				return
+			}
+			common.SetContextKey(c, constant.ContextKeyChannelKey, key)
+			common.SetContextKey(c, constant.ContextKeyChannelType, channel.Type)
+			common.SetContextKey(c, constant.ContextKeyChannelBaseUrl, channel.GetBaseURL())
+			common.SetContextKey(c, constant.ContextKeyChannelId, originTask.ChannelId)
+
+			info.ChannelBaseUrl = channel.GetBaseURL()
+			info.ChannelId = originTask.ChannelId
+			info.ChannelType = channel.Type
+			info.ApiKey = key
+			platform = originTask.Platform
+		}
+
+		// 使用原始任务的参数
+		if info.Action == constant.TaskActionRemix {
+			var taskData map[string]interface{}
+			_ = json.Unmarshal(originTask.Data, &taskData)
+			secondsStr, _ := taskData["seconds"].(string)
+			seconds, _ := strconv.Atoi(secondsStr)
+			if seconds <= 0 {
+				seconds = 4
+			}
+			sizeStr, _ := taskData["size"].(string)
+			if info.PriceData.OtherRatios == nil {
+				info.PriceData.OtherRatios = map[string]float64{}
+			}
+			info.PriceData.OtherRatios["seconds"] = float64(seconds)
+			info.PriceData.OtherRatios["size"] = 1
+			if sizeStr == "1792x1024" || sizeStr == "1024x1792" {
+				info.PriceData.OtherRatios["size"] = 1.666667
+			}
+		}
+	}
 	if platform == "" {
 		platform = GetTaskPlatform(c)
 	}
@@ -94,34 +181,6 @@ func RelayTaskSubmit(c *gin.Context, info *relaycommon.RelayInfo) (taskErr *dto.
 		return
 	}

-	if info.OriginTaskID != "" {
-		originTask, exist, err := model.GetByTaskId(info.UserId, info.OriginTaskID)
-		if err != nil {
-			taskErr = service.TaskErrorWrapper(err, "get_origin_task_failed", http.StatusInternalServerError)
-			return
-		}
-		if !exist {
-			taskErr = service.TaskErrorWrapperLocal(errors.New("task_origin_not_exist"), "task_not_exist", http.StatusBadRequest)
-			return
-		}
-		if originTask.ChannelId != info.ChannelId {
-			channel, err := model.GetChannelById(originTask.ChannelId, true)
-			if err != nil {
-				taskErr = service.TaskErrorWrapperLocal(err, "channel_not_found", http.StatusBadRequest)
-				return
-			}
-			if channel.Status != common.ChannelStatusEnabled {
-				return service.TaskErrorWrapperLocal(errors.New("该任务所属渠道已被禁用"), "task_channel_disable", http.StatusBadRequest)
-			}
-			c.Set("base_url", channel.GetBaseURL())
-			c.Set("channel_id", originTask.ChannelId)
-			c.Request.Header.Set("Authorization", fmt.Sprintf("Bearer %s", channel.Key))
-
-			info.ChannelBaseUrl = channel.GetBaseURL()
-			info.ChannelId = originTask.ChannelId
-		}
-	}
-
 	// build body
 	requestBody, err := adaptor.BuildRequestBody(c, info)
 	if err != nil {
@@ -326,6 +385,7 @@ func videoFetchByIDRespBodyBuilder(c *gin.Context) (respBody []byte, taskResp *d
 		if channelModel.GetBaseURL() != "" {
 			baseURL = channelModel.GetBaseURL()
 		}
+		proxy := channelModel.GetSetting().Proxy
 		adaptor := GetTaskAdaptor(constant.TaskPlatform(strconv.Itoa(channelModel.Type)))
 		if adaptor == nil {
 			return
@@ -333,7 +393,7 @@ func videoFetchByIDRespBodyBuilder(c *gin.Context) (respBody []byte, taskResp *d
 		resp, err2 := adaptor.FetchTask(baseURL, channelModel.Key, map[string]any{
 			"task_id": originTask.TaskID,
 			"action":  originTask.Action,
-		})
+		}, proxy)
 		if err2 != nil || resp == nil {
 			return
 		}
--- a/router/video-router.go
+++ b/router/video-router.go
@@ -9,11 +9,12 @@ import (

 func SetVideoRouter(router *gin.Engine) {
 	videoV1Router := router.Group("/v1")
-	videoV1Router.GET("/videos/:task_id/content", controller.VideoProxy)
 	videoV1Router.Use(middleware.TokenAuth(), middleware.Distribute())
 	{
+		videoV1Router.GET("/videos/:task_id/content", controller.VideoProxy)
 		videoV1Router.POST("/video/generations", controller.RelayTask)
 		videoV1Router.GET("/video/generations/:task_id", controller.RelayTask)
+		videoV1Router.POST("/videos/:video_id/remix", controller.RelayTask)
 	}
 	// openai compatible API video routes
 	// docs: https://platform.openai.com/docs/api-reference/videos/create
--- a/service/convert.go
+++ b/service/convert.go
@@ -201,6 +201,10 @@ func generateStopBlock(index int) *dto.ClaudeResponse {
 }

 func StreamResponseOpenAI2Claude(openAIResponse *dto.ChatCompletionsStreamResponse, info *relaycommon.RelayInfo) []*dto.ClaudeResponse {
+	if info.ClaudeConvertInfo.Done {
+		return nil
+	}
+
 	var claudeResponses []*dto.ClaudeResponse
 	if info.SendResponseCount == 1 {
 		msg := &dto.ClaudeMediaMessage{
@@ -209,7 +213,7 @@ func StreamResponseOpenAI2Claude(openAIResponse *dto.ChatCompletionsStreamRespon
 			Type:  "message",
 			Role:  "assistant",
 			Usage: &dto.ClaudeUsage{
-				InputTokens:  info.PromptTokens,
+				InputTokens:  info.GetEstimatePromptTokens(),
 				OutputTokens: 0,
 			},
 		}
@@ -218,45 +222,117 @@ func StreamResponseOpenAI2Claude(openAIResponse *dto.ChatCompletionsStreamRespon
 			Type:    "message_start",
 			Message: msg,
 		})
-		claudeResponses = append(claudeResponses)
 		//claudeResponses = append(claudeResponses, &dto.ClaudeResponse{
 		//	Type: "ping",
 		//})
 		if openAIResponse.IsToolCall() {
 			info.ClaudeConvertInfo.LastMessagesType = relaycommon.LastMessageTypeTools
+			var toolCall dto.ToolCallResponse
+			if len(openAIResponse.Choices) > 0 && len(openAIResponse.Choices[0].Delta.ToolCalls) > 0 {
+				toolCall = openAIResponse.Choices[0].Delta.ToolCalls[0]
+			} else {
+				first := openAIResponse.GetFirstToolCall()
+				if first != nil {
+					toolCall = *first
+				} else {
+					toolCall = dto.ToolCallResponse{}
+				}
+			}
 			resp := &dto.ClaudeResponse{
 				Type: "content_block_start",
 				ContentBlock: &dto.ClaudeMediaMessage{
-					Id:    openAIResponse.GetFirstToolCall().ID,
+					Id:    toolCall.ID,
 					Type:  "tool_use",
-					Name:  openAIResponse.GetFirstToolCall().Function.Name,
+					Name:  toolCall.Function.Name,
 					Input: map[string]interface{}{},
 				},
 			}
 			resp.SetIndex(0)
 			claudeResponses = append(claudeResponses, resp)
+			// 首块包含工具 delta，则追加 input_json_delta
+			if toolCall.Function.Arguments != "" {
+				claudeResponses = append(claudeResponses, &dto.ClaudeResponse{
+					Index: &info.ClaudeConvertInfo.Index,
+					Type:  "content_block_delta",
+					Delta: &dto.ClaudeMediaMessage{
+						Type:        "input_json_delta",
+						PartialJson: &toolCall.Function.Arguments,
+					},
+				})
+			}
 		} else {

 		}
 		// 判断首个响应是否存在内容（非标准的 OpenAI 响应）
-		if len(openAIResponse.Choices) > 0 && len(openAIResponse.Choices[0].Delta.GetContentString()) > 0 {
+		if len(openAIResponse.Choices) > 0 {
+			reasoning := openAIResponse.Choices[0].Delta.GetReasoningContent()
+			content := openAIResponse.Choices[0].Delta.GetContentString()
+
+			if reasoning != "" {
+				claudeResponses = append(claudeResponses, &dto.ClaudeResponse{
+					Index: &info.ClaudeConvertInfo.Index,
+					Type:  "content_block_start",
+					ContentBlock: &dto.ClaudeMediaMessage{
+						Type:     "thinking",
+						Thinking: common.GetPointer[string](""),
+					},
+				})
+				claudeResponses = append(claudeResponses, &dto.ClaudeResponse{
+					Index: &info.ClaudeConvertInfo.Index,
+					Type:  "content_block_delta",
+					Delta: &dto.ClaudeMediaMessage{
+						Type:     "thinking_delta",
+						Thinking: &reasoning,
+					},
+				})
+				info.ClaudeConvertInfo.LastMessagesType = relaycommon.LastMessageTypeThinking
+			} else if content != "" {
+				claudeResponses = append(claudeResponses, &dto.ClaudeResponse{
+					Index: &info.ClaudeConvertInfo.Index,
+					Type:  "content_block_start",
+					ContentBlock: &dto.ClaudeMediaMessage{
+						Type: "text",
+						Text: common.GetPointer[string](""),
+					},
+				})
+				claudeResponses = append(claudeResponses, &dto.ClaudeResponse{
+					Index: &info.ClaudeConvertInfo.Index,
+					Type:  "content_block_delta",
+					Delta: &dto.ClaudeMediaMessage{
+						Type: "text_delta",
+						Text: common.GetPointer[string](content),
+					},
+				})
+				info.ClaudeConvertInfo.LastMessagesType = relaycommon.LastMessageTypeText
+			}
+		}
+
+		// 如果首块就带 finish_reason，需要立即发送停止块
+		if len(openAIResponse.Choices) > 0 && openAIResponse.Choices[0].FinishReason != nil && *openAIResponse.Choices[0].FinishReason != "" {
+			info.FinishReason = *openAIResponse.Choices[0].FinishReason
+			claudeResponses = append(claudeResponses, generateStopBlock(info.ClaudeConvertInfo.Index))
+			oaiUsage := openAIResponse.Usage
+			if oaiUsage == nil {
+				oaiUsage = info.ClaudeConvertInfo.Usage
+			}
+			if oaiUsage != nil {
+				claudeResponses = append(claudeResponses, &dto.ClaudeResponse{
+					Type: "message_delta",
+					Usage: &dto.ClaudeUsage{
+						InputTokens:              oaiUsage.PromptTokens,
+						OutputTokens:             oaiUsage.CompletionTokens,
+						CacheCreationInputTokens: oaiUsage.PromptTokensDetails.CachedCreationTokens,
+						CacheReadInputTokens:     oaiUsage.PromptTokensDetails.CachedTokens,
+					},
+					Delta: &dto.ClaudeMediaMessage{
+						StopReason: common.GetPointer[string](stopReasonOpenAI2Claude(info.FinishReason)),
+					},
+				})
+			}
 			claudeResponses = append(claudeResponses, &dto.ClaudeResponse{
-				Index: &info.ClaudeConvertInfo.Index,
-				Type:  "content_block_start",
-				ContentBlock: &dto.ClaudeMediaMessage{
-					Type: "text",
-					Text: common.GetPointer[string](""),
-				},
+				Type: "message_stop",
 			})
-			claudeResponses = append(claudeResponses, &dto.ClaudeResponse{
-				Index: &info.ClaudeConvertInfo.Index,
-				Type:  "content_block_delta",
-				Delta: &dto.ClaudeMediaMessage{
-					Type: "text_delta",
-					Text: common.GetPointer[string](openAIResponse.Choices[0].Delta.GetContentString()),
-				},
-			})
-			info.ClaudeConvertInfo.LastMessagesType = relaycommon.LastMessageTypeText
+			info.ClaudeConvertInfo.Done = true
 		}
 		return claudeResponses
 	}
@@ -264,7 +340,7 @@ func StreamResponseOpenAI2Claude(openAIResponse *dto.ChatCompletionsStreamRespon
 	if len(openAIResponse.Choices) == 0 {
 		// no choices
 		// 可能为非标准的 OpenAI 响应，判断是否已经完成
-		if info.Done {
+		if info.ClaudeConvertInfo.Done {
 			claudeResponses = append(claudeResponses, generateStopBlock(info.ClaudeConvertInfo.Index))
 			oaiUsage := info.ClaudeConvertInfo.Usage
 			if oaiUsage != nil {
@@ -288,16 +364,110 @@ func StreamResponseOpenAI2Claude(openAIResponse *dto.ChatCompletionsStreamRespon
 		return claudeResponses
 	} else {
 		chosenChoice := openAIResponse.Choices[0]
-		if chosenChoice.FinishReason != nil && *chosenChoice.FinishReason != "" {
-			// should be done
+		doneChunk := chosenChoice.FinishReason != nil && *chosenChoice.FinishReason != ""
+		if doneChunk {
 			info.FinishReason = *chosenChoice.FinishReason
-			if !info.Done {
-				return claudeResponses
+		}
+
+		var claudeResponse dto.ClaudeResponse
+		var isEmpty bool
+		claudeResponse.Type = "content_block_delta"
+		if len(chosenChoice.Delta.ToolCalls) > 0 {
+			toolCalls := chosenChoice.Delta.ToolCalls
+			if info.ClaudeConvertInfo.LastMessagesType != relaycommon.LastMessageTypeTools {
+				claudeResponses = append(claudeResponses, generateStopBlock(info.ClaudeConvertInfo.Index))
+				info.ClaudeConvertInfo.Index++
+			}
+			info.ClaudeConvertInfo.LastMessagesType = relaycommon.LastMessageTypeTools
+
+			for i, toolCall := range toolCalls {
+				blockIndex := info.ClaudeConvertInfo.Index
+				if toolCall.Index != nil {
+					blockIndex = *toolCall.Index
+				} else if len(toolCalls) > 1 {
+					blockIndex = info.ClaudeConvertInfo.Index + i
+				}
+
+				idx := blockIndex
+				claudeResponses = append(claudeResponses, &dto.ClaudeResponse{
+					Index: &idx,
+					Type:  "content_block_start",
+					ContentBlock: &dto.ClaudeMediaMessage{
+						Id:    toolCall.ID,
+						Type:  "tool_use",
+						Name:  toolCall.Function.Name,
+						Input: map[string]interface{}{},
+					},
+				})
+
+				claudeResponses = append(claudeResponses, &dto.ClaudeResponse{
+					Index: &idx,
+					Type:  "content_block_delta",
+					Delta: &dto.ClaudeMediaMessage{
+						Type:        "input_json_delta",
+						PartialJson: &toolCall.Function.Arguments,
+					},
+				})
+
+				info.ClaudeConvertInfo.Index = blockIndex
+			}
+		} else {
+			reasoning := chosenChoice.Delta.GetReasoningContent()
+			textContent := chosenChoice.Delta.GetContentString()
+			if reasoning != "" || textContent != "" {
+				if reasoning != "" {
+					if info.ClaudeConvertInfo.LastMessagesType != relaycommon.LastMessageTypeThinking {
+						claudeResponses = append(claudeResponses, &dto.ClaudeResponse{
+							Index: &info.ClaudeConvertInfo.Index,
+							Type:  "content_block_start",
+							ContentBlock: &dto.ClaudeMediaMessage{
+								Type:     "thinking",
+								Thinking: common.GetPointer[string](""),
+							},
+						})
+					}
+					info.ClaudeConvertInfo.LastMessagesType = relaycommon.LastMessageTypeThinking
+					claudeResponse.Delta = &dto.ClaudeMediaMessage{
+						Type:     "thinking_delta",
+						Thinking: &reasoning,
+					}
+				} else {
+					if info.ClaudeConvertInfo.LastMessagesType != relaycommon.LastMessageTypeText {
+						if info.ClaudeConvertInfo.LastMessagesType == relaycommon.LastMessageTypeThinking || info.ClaudeConvertInfo.LastMessagesType == relaycommon.LastMessageTypeTools {
+							claudeResponses = append(claudeResponses, generateStopBlock(info.ClaudeConvertInfo.Index))
+							info.ClaudeConvertInfo.Index++
+						}
+						claudeResponses = append(claudeResponses, &dto.ClaudeResponse{
+							Index: &info.ClaudeConvertInfo.Index,
+							Type:  "content_block_start",
+							ContentBlock: &dto.ClaudeMediaMessage{
+								Type: "text",
+								Text: common.GetPointer[string](""),
+							},
+						})
+					}
+					info.ClaudeConvertInfo.LastMessagesType = relaycommon.LastMessageTypeText
+					claudeResponse.Delta = &dto.ClaudeMediaMessage{
+						Type: "text_delta",
+						Text: common.GetPointer[string](textContent),
+					}
+				}
+			} else {
+				isEmpty = true
 			}
 		}
-		if info.Done {
+
+		claudeResponse.Index = &info.ClaudeConvertInfo.Index
+		if !isEmpty && claudeResponse.Delta != nil {
+			claudeResponses = append(claudeResponses, &claudeResponse)
+		}
+
+		if doneChunk || info.ClaudeConvertInfo.Done {
 			claudeResponses = append(claudeResponses, generateStopBlock(info.ClaudeConvertInfo.Index))
-			oaiUsage := info.ClaudeConvertInfo.Usage
+			oaiUsage := openAIResponse.Usage
+			if oaiUsage == nil {
+				oaiUsage = info.ClaudeConvertInfo.Usage
+			}
 			if oaiUsage != nil {
 				claudeResponses = append(claudeResponses, &dto.ClaudeResponse{
 					Type: "message_delta",
@@ -315,83 +485,8 @@ func StreamResponseOpenAI2Claude(openAIResponse *dto.ChatCompletionsStreamRespon
 			claudeResponses = append(claudeResponses, &dto.ClaudeResponse{
 				Type: "message_stop",
 			})
-		} else {
-			var claudeResponse dto.ClaudeResponse
-			var isEmpty bool
-			claudeResponse.Type = "content_block_delta"
-			if len(chosenChoice.Delta.ToolCalls) > 0 {
-				if info.ClaudeConvertInfo.LastMessagesType != relaycommon.LastMessageTypeTools {
-					claudeResponses = append(claudeResponses, generateStopBlock(info.ClaudeConvertInfo.Index))
-					info.ClaudeConvertInfo.Index++
-					claudeResponses = append(claudeResponses, &dto.ClaudeResponse{
-						Index: &info.ClaudeConvertInfo.Index,
-						Type:  "content_block_start",
-						ContentBlock: &dto.ClaudeMediaMessage{
-							Id:    openAIResponse.GetFirstToolCall().ID,
-							Type:  "tool_use",
-							Name:  openAIResponse.GetFirstToolCall().Function.Name,
-							Input: map[string]interface{}{},
-						},
-					})
-				}
-				info.ClaudeConvertInfo.LastMessagesType = relaycommon.LastMessageTypeTools
-				// tools delta
-				claudeResponse.Delta = &dto.ClaudeMediaMessage{
-					Type:        "input_json_delta",
-					PartialJson: &chosenChoice.Delta.ToolCalls[0].Function.Arguments,
-				}
-			} else {
-				reasoning := chosenChoice.Delta.GetReasoningContent()
-				textContent := chosenChoice.Delta.GetContentString()
-				if reasoning != "" || textContent != "" {
-					if reasoning != "" {
-						if info.ClaudeConvertInfo.LastMessagesType != relaycommon.LastMessageTypeThinking {
-							//info.ClaudeConvertInfo.Index++
-							claudeResponses = append(claudeResponses, &dto.ClaudeResponse{
-								Index: &info.ClaudeConvertInfo.Index,
-								Type:  "content_block_start",
-								ContentBlock: &dto.ClaudeMediaMessage{
-									Type:     "thinking",
-									Thinking: common.GetPointer[string](""),
-								},
-							})
-						}
-						info.ClaudeConvertInfo.LastMessagesType = relaycommon.LastMessageTypeThinking
-						// text delta
-						claudeResponse.Delta = &dto.ClaudeMediaMessage{
-							Type:     "thinking_delta",
-							Thinking: &reasoning,
-						}
-					} else {
-						if info.ClaudeConvertInfo.LastMessagesType != relaycommon.LastMessageTypeText {
-							if info.LastMessagesType == relaycommon.LastMessageTypeThinking || info.LastMessagesType == relaycommon.LastMessageTypeTools {
-								claudeResponses = append(claudeResponses, generateStopBlock(info.ClaudeConvertInfo.Index))
-								info.ClaudeConvertInfo.Index++
-							}
-							claudeResponses = append(claudeResponses, &dto.ClaudeResponse{
-								Index: &info.ClaudeConvertInfo.Index,
-								Type:  "content_block_start",
-								ContentBlock: &dto.ClaudeMediaMessage{
-									Type: "text",
-									Text: common.GetPointer[string](""),
-								},
-							})
-						}
-						info.ClaudeConvertInfo.LastMessagesType = relaycommon.LastMessageTypeText
-						// text delta
-						claudeResponse.Delta = &dto.ClaudeMediaMessage{
-							Type: "text_delta",
-							Text: common.GetPointer[string](textContent),
-						}
-					}
-				} else {
-					isEmpty = true
-				}
-			}
-			claudeResponse.Index = &info.ClaudeConvertInfo.Index
-			if !isEmpty {
-				claudeResponses = append(claudeResponses, &claudeResponse)
-			}
+			info.ClaudeConvertInfo.Done = true
+			return claudeResponses
 		}
 	}

@@ -734,12 +829,18 @@ func StreamResponseOpenAI2Gemini(openAIResponse *dto.ChatCompletionsStreamRespon
 	geminiResponse := &dto.GeminiChatResponse{
 		Candidates: make([]dto.GeminiChatCandidate, 0, len(openAIResponse.Choices)),
 		UsageMetadata: dto.GeminiUsageMetadata{
-			PromptTokenCount:     info.PromptTokens,
+			PromptTokenCount:     info.GetEstimatePromptTokens(),
 			CandidatesTokenCount: 0, // 流式响应中可能没有完整的 usage 信息
-			TotalTokenCount:      info.PromptTokens,
+			TotalTokenCount:      info.GetEstimatePromptTokens(),
 		},
 	}

+	if openAIResponse.Usage != nil {
+		geminiResponse.UsageMetadata.PromptTokenCount = openAIResponse.Usage.PromptTokens
+		geminiResponse.UsageMetadata.CandidatesTokenCount = openAIResponse.Usage.CompletionTokens
+		geminiResponse.UsageMetadata.TotalTokenCount = openAIResponse.Usage.TotalTokens
+	}
+
 	for _, choice := range openAIResponse.Choices {
 		candidate := dto.GeminiChatCandidate{
 			Index:         int64(choice.Index),
--- a/service/http_client.go
+++ b/service/http_client.go
@@ -34,12 +34,20 @@ func checkRedirect(req *http.Request, via []*http.Request) error {
 }

 func InitHttpClient() {
+	transport := &http.Transport{
+		MaxIdleConns:        common.RelayMaxIdleConns,
+		MaxIdleConnsPerHost: common.RelayMaxIdleConnsPerHost,
+		ForceAttemptHTTP2:   true,
+	}
+
 	if common.RelayTimeout == 0 {
 		httpClient = &http.Client{
+			Transport:     transport,
 			CheckRedirect: checkRedirect,
 		}
 	} else {
 		httpClient = &http.Client{
+			Transport:     transport,
 			Timeout:       time.Duration(common.RelayTimeout) * time.Second,
 			CheckRedirect: checkRedirect,
 		}
@@ -50,6 +58,14 @@ func GetHttpClient() *http.Client {
 	return httpClient
 }

+// GetHttpClientWithProxy returns the default client or a proxy-enabled one when proxyURL is provided.
+func GetHttpClientWithProxy(proxyURL string) (*http.Client, error) {
+	if proxyURL == "" {
+		return GetHttpClient(), nil
+	}
+	return NewProxyHttpClient(proxyURL)
+}
+
 // ResetProxyClientCache 清空代理客户端缓存，确保下次使用时重新初始化
 func ResetProxyClientCache() {
 	proxyClientLock.Lock()
@@ -84,7 +100,10 @@ func NewProxyHttpClient(proxyURL string) (*http.Client, error) {
 	case "http", "https":
 		client := &http.Client{
 			Transport: &http.Transport{
-				Proxy: http.ProxyURL(parsedURL),
+				MaxIdleConns:        common.RelayMaxIdleConns,
+				MaxIdleConnsPerHost: common.RelayMaxIdleConnsPerHost,
+				ForceAttemptHTTP2:   true,
+				Proxy:               http.ProxyURL(parsedURL),
 			},
 			CheckRedirect: checkRedirect,
 		}
@@ -116,6 +135,9 @@ func NewProxyHttpClient(proxyURL string) (*http.Client, error) {

 		client := &http.Client{
 			Transport: &http.Transport{
+				MaxIdleConns:        common.RelayMaxIdleConns,
+				MaxIdleConnsPerHost: common.RelayMaxIdleConnsPerHost,
+				ForceAttemptHTTP2:   true,
 				DialContext: func(ctx context.Context, network, addr string) (net.Conn, error) {
 					return dialer.Dial(network, addr)
 				},
--- a/service/token_counter.go
+++ b/service/token_counter.go
@@ -1,7 +1,6 @@
 package service

 import (
-	"encoding/json"
 	"errors"
 	"fmt"
 	"image"
@@ -12,7 +11,6 @@ import (
 	"math"
 	"path/filepath"
 	"strings"
-	"sync"
 	"unicode/utf8"

 	"github.com/QuantumNous/new-api/common"
@@ -23,64 +21,8 @@ import (
 	"github.com/QuantumNous/new-api/types"

 	"github.com/gin-gonic/gin"
-	"github.com/tiktoken-go/tokenizer"
-	"github.com/tiktoken-go/tokenizer/codec"
 )

-// tokenEncoderMap won't grow after initialization
-var defaultTokenEncoder tokenizer.Codec
-
-// tokenEncoderMap is used to store token encoders for different models
-var tokenEncoderMap = make(map[string]tokenizer.Codec)
-
-// tokenEncoderMutex protects tokenEncoderMap for concurrent access
-var tokenEncoderMutex sync.RWMutex
-
-func InitTokenEncoders() {
-	common.SysLog("initializing token encoders")
-	defaultTokenEncoder = codec.NewCl100kBase()
-	common.SysLog("token encoders initialized")
-}
-
-func getTokenEncoder(model string) tokenizer.Codec {
-	// First, try to get the encoder from cache with read lock
-	tokenEncoderMutex.RLock()
-	if encoder, exists := tokenEncoderMap[model]; exists {
-		tokenEncoderMutex.RUnlock()
-		return encoder
-	}
-	tokenEncoderMutex.RUnlock()
-
-	// If not in cache, create new encoder with write lock
-	tokenEncoderMutex.Lock()
-	defer tokenEncoderMutex.Unlock()
-
-	// Double-check if another goroutine already created the encoder
-	if encoder, exists := tokenEncoderMap[model]; exists {
-		return encoder
-	}
-
-	// Create new encoder
-	modelCodec, err := tokenizer.ForModel(tokenizer.Model(model))
-	if err != nil {
-		// Cache the default encoder for this model to avoid repeated failures
-		tokenEncoderMap[model] = defaultTokenEncoder
-		return defaultTokenEncoder
-	}
-
-	// Cache the new encoder
-	tokenEncoderMap[model] = modelCodec
-	return modelCodec
-}
-
-func getTokenNum(tokenEncoder tokenizer.Codec, text string) int {
-	if text == "" {
-		return 0
-	}
-	tkm, _ := tokenEncoder.Count(text)
-	return tkm
-}
-
 func getImageToken(fileMeta *types.FileMeta, model string, stream bool) (int, error) {
 	if fileMeta == nil {
 		return 0, fmt.Errorf("image_url_is_nil")
@@ -257,7 +199,7 @@ func getImageToken(fileMeta *types.FileMeta, model string, stream bool) (int, er
 	return tiles*tileTokens + baseTokens, nil
 }

-func CountRequestToken(c *gin.Context, meta *types.TokenCountMeta, info *relaycommon.RelayInfo) (int, error) {
+func EstimateRequestToken(c *gin.Context, meta *types.TokenCountMeta, info *relaycommon.RelayInfo) (int, error) {
 	// 是否统计token
 	if !constant.CountToken {
 		return 0, nil
@@ -375,14 +317,14 @@ func CountRequestToken(c *gin.Context, meta *types.TokenCountMeta, info *relayco
 	for i, file := range meta.Files {
 		switch file.FileType {
 		case types.FileTypeImage:
-			if info.RelayFormat == types.RelayFormatGemini {
-				tkm += 520 // gemini per input image tokens
-			} else {
+			if common.IsOpenAITextModel(info.OriginModelName) {
 				token, err := getImageToken(file, model, info.IsStream)
 				if err != nil {
 					return 0, fmt.Errorf("error counting image token, media index[%d], original data[%s], err: %v", i, file.OriginData, err)
 				}
 				tkm += token
+			} else {
+				tkm += 520
 			}
 		case types.FileTypeAudio:
 			tkm += 256
@@ -399,111 +341,6 @@ func CountRequestToken(c *gin.Context, meta *types.TokenCountMeta, info *relayco
 	return tkm, nil
 }

-func CountTokenClaudeRequest(request dto.ClaudeRequest, model string) (int, error) {
-	tkm := 0
-
-	// Count tokens in messages
-	msgTokens, err := CountTokenClaudeMessages(request.Messages, model, request.Stream)
-	if err != nil {
-		return 0, err
-	}
-	tkm += msgTokens
-
-	// Count tokens in system message
-	if request.System != "" {
-		systemTokens := CountTokenInput(request.System, model)
-		tkm += systemTokens
-	}
-
-	if request.Tools != nil {
-		// check is array
-		if tools, ok := request.Tools.([]any); ok {
-			if len(tools) > 0 {
-				parsedTools, err1 := common.Any2Type[[]dto.Tool](request.Tools)
-				if err1 != nil {
-					return 0, fmt.Errorf("tools: Input should be a valid list: %v", err)
-				}
-				toolTokens, err2 := CountTokenClaudeTools(parsedTools, model)
-				if err2 != nil {
-					return 0, fmt.Errorf("tools: %v", err)
-				}
-				tkm += toolTokens
-			}
-		} else {
-			return 0, errors.New("tools: Input should be a valid list")
-		}
-	}
-
-	return tkm, nil
-}
-
-func CountTokenClaudeMessages(messages []dto.ClaudeMessage, model string, stream bool) (int, error) {
-	tokenEncoder := getTokenEncoder(model)
-	tokenNum := 0
-
-	for _, message := range messages {
-		// Count tokens for role
-		tokenNum += getTokenNum(tokenEncoder, message.Role)
-		if message.IsStringContent() {
-			tokenNum += getTokenNum(tokenEncoder, message.GetStringContent())
-		} else {
-			content, err := message.ParseContent()
-			if err != nil {
-				return 0, err
-			}
-			for _, mediaMessage := range content {
-				switch mediaMessage.Type {
-				case "text":
-					tokenNum += getTokenNum(tokenEncoder, mediaMessage.GetText())
-				case "image":
-					//imageTokenNum, err := getClaudeImageToken(mediaMsg.Source, model, stream)
-					//if err != nil {
-					//	return 0, err
-					//}
-					tokenNum += 1000
-				case "tool_use":
-					if mediaMessage.Input != nil {
-						tokenNum += getTokenNum(tokenEncoder, mediaMessage.Name)
-						inputJSON, _ := json.Marshal(mediaMessage.Input)
-						tokenNum += getTokenNum(tokenEncoder, string(inputJSON))
-					}
-				case "tool_result":
-					if mediaMessage.Content != nil {
-						contentJSON, _ := json.Marshal(mediaMessage.Content)
-						tokenNum += getTokenNum(tokenEncoder, string(contentJSON))
-					}
-				}
-			}
-		}
-	}
-
-	// Add a constant for message formatting (this may need adjustment based on Claude's exact formatting)
-	tokenNum += len(messages) * 2 // Assuming 2 tokens per message for formatting
-
-	return tokenNum, nil
-}
-
-func CountTokenClaudeTools(tools []dto.Tool, model string) (int, error) {
-	tokenEncoder := getTokenEncoder(model)
-	tokenNum := 0
-
-	for _, tool := range tools {
-		tokenNum += getTokenNum(tokenEncoder, tool.Name)
-		tokenNum += getTokenNum(tokenEncoder, tool.Description)
-
-		schemaJSON, err := json.Marshal(tool.InputSchema)
-		if err != nil {
-			return 0, errors.New(fmt.Sprintf("marshal_tool_schema_fail: %s", err.Error()))
-		}
-		tokenNum += getTokenNum(tokenEncoder, string(schemaJSON))
-	}
-
-	// Add a constant for tool formatting (this may need adjustment based on Claude's exact formatting)
-	tokenNum += len(tools) * 3 // Assuming 3 tokens per tool for formatting
-
-	return tokenNum, nil
-}
-
 func CountTokenRealtime(info *relaycommon.RelayInfo, request dto.RealtimeEvent, model string) (int, int, error) {
 	audioToken := 0
 	textToken := 0
@@ -578,31 +415,6 @@ func CountTokenInput(input any, model string) int {
 	return CountTokenInput(fmt.Sprintf("%v", input), model)
 }

-func CountTokenStreamChoices(messages []dto.ChatCompletionsStreamResponseChoice, model string) int {
-	tokens := 0
-	for _, message := range messages {
-		tkm := CountTokenInput(message.Delta.GetContentString(), model)
-		tokens += tkm
-		if message.Delta.ToolCalls != nil {
-			for _, tool := range message.Delta.ToolCalls {
-				tkm := CountTokenInput(tool.Function.Name, model)
-				tokens += tkm
-				tkm = CountTokenInput(tool.Function.Arguments, model)
-				tokens += tkm
-			}
-		}
-	}
-	return tokens
-}
-
-func CountTTSToken(text string, model string) int {
-	if strings.HasPrefix(model, "tts") {
-		return utf8.RuneCountInString(text)
-	} else {
-		return CountTextToken(text, model)
-	}
-}
-
 func CountAudioTokenInput(audioBase64 string, audioFormat string) (int, error) {
 	if audioBase64 == "" {
 		return 0, nil
@@ -625,17 +437,16 @@ func CountAudioTokenOutput(audioBase64 string, audioFormat string) (int, error)
 	return int(duration / 60 * 200 / 0.24), nil
 }

-//func CountAudioToken(sec float64, audioType string) {
-//	if audioType == "input" {
-//
-//	}
-//}
-
-// CountTextToken 统计文本的token数量，仅当文本包含敏感词，返回错误，同时返回token数量
+// CountTextToken 统计文本的token数量，仅OpenAI模型使用tokenizer，其余模型使用估算
 func CountTextToken(text string, model string) int {
 	if text == "" {
 		return 0
 	}
-	tokenEncoder := getTokenEncoder(model)
-	return getTokenNum(tokenEncoder, text)
+	if common.IsOpenAITextModel(model) {
+		tokenEncoder := getTokenEncoder(model)
+		return getTokenNum(tokenEncoder, text)
+	} else {
+		// 非openai模型，使用tiktoken-go计算没有意义，使用估算节省资源
+		return EstimateTokenByModel(model, text)
+	}
 }
--- a/service/token_estimator.go
+++ b/service/token_estimator.go
@@ -0,0 +1,230 @@
+package service
+
+import (
+	"math"
+	"strings"
+	"sync"
+	"unicode"
+)
+
+// Provider 定义模型厂商大类
+type Provider string
+
+const (
+	OpenAI  Provider = "openai"  // 代表 GPT-3.5, GPT-4, GPT-4o
+	Gemini  Provider = "gemini"  // 代表 Gemini 1.0, 1.5 Pro/Flash
+	Claude  Provider = "claude"  // 代表 Claude 3, 3.5 Sonnet
+	Unknown Provider = "unknown" // 兜底默认
+)
+
+// multipliers 定义不同厂商的计费权重
+type multipliers struct {
+	Word       float64 // 英文单词 (每词)
+	Number     float64 // 数字 (每连续数字串)
+	CJK        float64 // 中日韩字符 (每字)
+	Symbol     float64 // 普通标点符号 (每个)
+	MathSymbol float64 // 数学符号 (∑,∫,∂,√等，每个)
+	URLDelim   float64 // URL分隔符 (/,:,?,&,=,#,%) - tokenizer优化好
+	AtSign     float64 // @符号 - 导致单词切分，消耗较高
+	Emoji      float64 // Emoji表情 (每个)
+	Newline    float64 // 换行符/制表符 (每个)
+	Space      float64 // 空格 (每个)
+	BasePad    int     // 基础起步消耗 (Start/End tokens)
+}
+
+var (
+	multipliersMap = map[Provider]multipliers{
+		Gemini: {
+			Word: 1.15, Number: 2.8, CJK: 0.68, Symbol: 0.38, MathSymbol: 1.05, URLDelim: 1.2, AtSign: 2.5, Emoji: 1.08, Newline: 1.15, Space: 0.2, BasePad: 0,
+		},
+		Claude: {
+			Word: 1.13, Number: 1.63, CJK: 1.21, Symbol: 0.4, MathSymbol: 4.52, URLDelim: 1.26, AtSign: 2.82, Emoji: 2.6, Newline: 0.89, Space: 0.39, BasePad: 0,
+		},
+		OpenAI: {
+			Word: 1.02, Number: 1.55, CJK: 0.85, Symbol: 0.4, MathSymbol: 2.68, URLDelim: 1.0, AtSign: 2.0, Emoji: 2.12, Newline: 0.5, Space: 0.42, BasePad: 0,
+		},
+	}
+	multipliersLock sync.RWMutex
+)
+
+// getMultipliers 根据厂商获取权重配置
+func getMultipliers(p Provider) multipliers {
+	multipliersLock.RLock()
+	defer multipliersLock.RUnlock()
+
+	switch p {
+	case Gemini:
+		return multipliersMap[Gemini]
+	case Claude:
+		return multipliersMap[Claude]
+	case OpenAI:
+		return multipliersMap[OpenAI]
+	default:
+		// 默认兜底 (按 OpenAI 的算)
+		return multipliersMap[OpenAI]
+	}
+}
+
+// EstimateToken 计算 Token 数量
+func EstimateToken(provider Provider, text string) int {
+	m := getMultipliers(provider)
+	var count float64
+
+	// 状态机变量
+	type WordType int
+	const (
+		None WordType = iota
+		Latin
+		Number
+	)
+	currentWordType := None
+
+	for _, r := range text {
+		// 1. 处理空格和换行符
+		if unicode.IsSpace(r) {
+			currentWordType = None
+			// 换行符和制表符使用Newline权重
+			if r == '\n' || r == '\t' {
+				count += m.Newline
+			} else {
+				// 普通空格使用Space权重
+				count += m.Space
+			}
+			continue
+		}
+
+		// 2. 处理 CJK (中日韩) - 按字符计费
+		if isCJK(r) {
+			currentWordType = None
+			count += m.CJK
+			continue
+		}
+
+		// 3. 处理Emoji - 使用专门的Emoji权重
+		if isEmoji(r) {
+			currentWordType = None
+			count += m.Emoji
+			continue
+		}
+
+		// 4. 处理拉丁字母/数字 (英文单词)
+		if isLatinOrNumber(r) {
+			isNum := unicode.IsNumber(r)
+			newType := Latin
+			if isNum {
+				newType = Number
+			}
+
+			// 如果之前不在单词中，或者类型发生变化（字母<->数字），则视为新token
+			// 注意：对于OpenAI，通常"version 3.5"会切分，"abc123xyz"有时也会切分
+			// 这里简单起见，字母和数字切换时增加权重
+			if currentWordType == None || currentWordType != newType {
+				if newType == Number {
+					count += m.Number
+				} else {
+					count += m.Word
+				}
+				currentWordType = newType
+			}
+			// 单词中间的字符不额外计费
+			continue
+		}
+
+		// 5. 处理标点符号/特殊字符 - 按类型使用不同权重
+		currentWordType = None
+		if isMathSymbol(r) {
+			count += m.MathSymbol
+		} else if r == '@' {
+			count += m.AtSign
+		} else if isURLDelim(r) {
+			count += m.URLDelim
+		} else {
+			count += m.Symbol
+		}
+	}
+
+	// 向上取整并加上基础 padding
+	return int(math.Ceil(count)) + m.BasePad
+}
+
+// 辅助：判断是否为 CJK 字符
+func isCJK(r rune) bool {
+	return unicode.Is(unicode.Han, r) ||
+		(r >= 0x3040 && r <= 0x30FF) || // 日文
+		(r >= 0xAC00 && r <= 0xD7A3) // 韩文
+}
+
+// 辅助：判断是否为单词主体 (字母或数字)
+func isLatinOrNumber(r rune) bool {
+	return unicode.IsLetter(r) || unicode.IsNumber(r)
+}
+
+// 辅助：判断是否为Emoji字符
+func isEmoji(r rune) bool {
+	// Emoji的Unicode范围
+	// 基本范围：0x1F300-0x1F9FF (Emoticons, Symbols, Pictographs)
+	// 补充范围：0x2600-0x26FF (Misc Symbols), 0x2700-0x27BF (Dingbats)
+	// 表情符号：0x1F600-0x1F64F (Emoticons)
+	// 其他：0x1F900-0x1F9FF (Supplemental Symbols and Pictographs)
+	return (r >= 0x1F300 && r <= 0x1F9FF) ||
+		(r >= 0x2600 && r <= 0x26FF) ||
+		(r >= 0x2700 && r <= 0x27BF) ||
+		(r >= 0x1F600 && r <= 0x1F64F) ||
+		(r >= 0x1F900 && r <= 0x1F9FF) ||
+		(r >= 0x1FA00 && r <= 0x1FAFF) // Symbols and Pictographs Extended-A
+}
+
+// 辅助：判断是否为数学符号
+func isMathSymbol(r rune) bool {
+	// 数学运算符和符号
+	// 基本数学符号：∑ ∫ ∂ √ ∞ ≤ ≥ ≠ ≈ ± × ÷
+	// 上下标数字：² ³ ¹ ⁴ ⁵ ⁶ ⁷ ⁸ ⁹ ⁰
+	// 希腊字母等也常用于数学
+	mathSymbols := "∑∫∂√∞≤≥≠≈±×÷∈∉∋∌⊂⊃⊆⊇∪∩∧∨¬∀∃∄∅∆∇∝∟∠∡∢°′″‴⁺⁻⁼⁽⁾ⁿ₀₁₂₃₄₅₆₇₈₉₊₋₌₍₎²³¹⁴⁵⁶⁷⁸⁹⁰"
+	for _, m := range mathSymbols {
+		if r == m {
+			return true
+		}
+	}
+	// Mathematical Operators (U+2200–U+22FF)
+	if r >= 0x2200 && r <= 0x22FF {
+		return true
+	}
+	// Supplemental Mathematical Operators (U+2A00–U+2AFF)
+	if r >= 0x2A00 && r <= 0x2AFF {
+		return true
+	}
+	// Mathematical Alphanumeric Symbols (U+1D400–U+1D7FF)
+	if r >= 0x1D400 && r <= 0x1D7FF {
+		return true
+	}
+	return false
+}
+
+// 辅助：判断是否为URL分隔符（tokenizer对这些优化较好）
+func isURLDelim(r rune) bool {
+	// URL中常见的分隔符，tokenizer通常优化处理
+	urlDelims := "/:?&=;#%"
+	for _, d := range urlDelims {
+		if r == d {
+			return true
+		}
+	}
+	return false
+}
+
+func EstimateTokenByModel(model, text string) int {
+	// strings.Contains(model, "gpt-4o")
+	if text == "" {
+		return 0
+	}
+
+	model = strings.ToLower(model)
+	if strings.Contains(model, "gemini") {
+		return EstimateToken(Gemini, text)
+	} else if strings.Contains(model, "claude") {
+		return EstimateToken(Claude, text)
+	} else {
+		return EstimateToken(OpenAI, text)
+	}
+}
--- a/service/tokenizer.go
+++ b/service/tokenizer.go
@@ -0,0 +1,63 @@
+package service
+
+import (
+	"sync"
+
+	"github.com/QuantumNous/new-api/common"
+	"github.com/tiktoken-go/tokenizer"
+	"github.com/tiktoken-go/tokenizer/codec"
+)
+
+// tokenEncoderMap won't grow after initialization
+var defaultTokenEncoder tokenizer.Codec
+
+// tokenEncoderMap is used to store token encoders for different models
+var tokenEncoderMap = make(map[string]tokenizer.Codec)
+
+// tokenEncoderMutex protects tokenEncoderMap for concurrent access
+var tokenEncoderMutex sync.RWMutex
+
+func InitTokenEncoders() {
+	common.SysLog("initializing token encoders")
+	defaultTokenEncoder = codec.NewCl100kBase()
+	common.SysLog("token encoders initialized")
+}
+
+func getTokenEncoder(model string) tokenizer.Codec {
+	// First, try to get the encoder from cache with read lock
+	tokenEncoderMutex.RLock()
+	if encoder, exists := tokenEncoderMap[model]; exists {
+		tokenEncoderMutex.RUnlock()
+		return encoder
+	}
+	tokenEncoderMutex.RUnlock()
+
+	// If not in cache, create new encoder with write lock
+	tokenEncoderMutex.Lock()
+	defer tokenEncoderMutex.Unlock()
+
+	// Double-check if another goroutine already created the encoder
+	if encoder, exists := tokenEncoderMap[model]; exists {
+		return encoder
+	}
+
+	// Create new encoder
+	modelCodec, err := tokenizer.ForModel(tokenizer.Model(model))
+	if err != nil {
+		// Cache the default encoder for this model to avoid repeated failures
+		tokenEncoderMap[model] = defaultTokenEncoder
+		return defaultTokenEncoder
+	}
+
+	// Cache the new encoder
+	tokenEncoderMap[model] = modelCodec
+	return modelCodec
+}
+
+func getTokenNum(tokenEncoder tokenizer.Codec, text string) int {
+	if text == "" {
+		return 0
+	}
+	tkm, _ := tokenEncoder.Count(text)
+	return tkm
+}
--- a/service/usage_helpr.go
+++ b/service/usage_helpr.go
@@ -23,8 +23,7 @@ func ResponseText2Usage(c *gin.Context, responseText string, modeName string, pr
 	common.SetContextKey(c, constant.ContextKeyLocalCountTokens, true)
 	usage := &dto.Usage{}
 	usage.PromptTokens = promptTokens
-	ctkm := CountTextToken(responseText, modeName)
-	usage.CompletionTokens = ctkm
+	usage.CompletionTokens = EstimateTokenByModel(modeName, responseText)
 	usage.TotalTokens = usage.PromptTokens + usage.CompletionTokens
 	return usage
 }
--- a/setting/model_setting/global.go
+++ b/setting/model_setting/global.go
@@ -32,7 +32,7 @@ func GetGlobalSettings() *GlobalSettings {
 	return &globalSettings
 }

-// ShouldPreserveThinkingSuffix 判断模型是否配置为保留 thinking/-nothinking 后缀
+// ShouldPreserveThinkingSuffix 判断模型是否配置为保留 thinking/-nothinking/-low/-high/-medium 后缀
 func ShouldPreserveThinkingSuffix(modelName string) bool {
 	target := strings.TrimSpace(modelName)
 	if target == "" {
--- a/setting/ratio_setting/cache_ratio.go
+++ b/setting/ratio_setting/cache_ratio.go
@@ -43,6 +43,7 @@ var defaultCacheRatio = map[string]float64{
 	"claude-3-opus-20240229":              0.1,
 	"claude-3-haiku-20240307":             0.1,
 	"claude-3-5-haiku-20241022":           0.1,
+	"claude-haiku-4-5-20251001":           0.1,
 	"claude-3-5-sonnet-20240620":          0.1,
 	"claude-3-5-sonnet-20241022":          0.1,
 	"claude-3-7-sonnet-20250219":          0.1,
@@ -55,6 +56,8 @@ var defaultCacheRatio = map[string]float64{
 	"claude-opus-4-1-20250805-thinking":   0.1,
 	"claude-sonnet-4-5-20250929":          0.1,
 	"claude-sonnet-4-5-20250929-thinking": 0.1,
+	"claude-opus-4-5-20251101":            0.1,
+	"claude-opus-4-5-20251101-thinking":   0.1,
 }

 var defaultCreateCacheRatio = map[string]float64{
@@ -62,6 +65,7 @@ var defaultCreateCacheRatio = map[string]float64{
 	"claude-3-opus-20240229":              1.25,
 	"claude-3-haiku-20240307":             1.25,
 	"claude-3-5-haiku-20241022":           1.25,
+	"claude-haiku-4-5-20251001":           1.25,
 	"claude-3-5-sonnet-20240620":          1.25,
 	"claude-3-5-sonnet-20241022":          1.25,
 	"claude-3-7-sonnet-20250219":          1.25,
@@ -74,6 +78,8 @@ var defaultCreateCacheRatio = map[string]float64{
 	"claude-opus-4-1-20250805-thinking":   1.25,
 	"claude-sonnet-4-5-20250929":          1.25,
 	"claude-sonnet-4-5-20250929-thinking": 1.25,
+	"claude-opus-4-5-20251101":            1.25,
+	"claude-opus-4-5-20251101-thinking":   1.25,
 }

 //var defaultCreateCacheRatio = map[string]float64{}
--- a/setting/ratio_setting/model_ratio.go
+++ b/setting/ratio_setting/model_ratio.go
@@ -7,6 +7,7 @@ import (

 	"github.com/QuantumNous/new-api/common"
 	"github.com/QuantumNous/new-api/setting/operation_setting"
+	"github.com/QuantumNous/new-api/setting/reasoning"
 )

 // from songquanpeng/one-api
@@ -136,6 +137,7 @@ var defaultModelRatio = map[string]float64{
 	"claude-2.1":                                4,     // $8 / 1M tokens
 	"claude-3-haiku-20240307":                   0.125, // $0.25 / 1M tokens
 	"claude-3-5-haiku-20241022":                 0.5,   // $1 / 1M tokens
+	"claude-haiku-4-5-20251001":                 0.5,   // $1 / 1M tokens
 	"claude-3-sonnet-20240229":                  1.5,   // $3 / 1M tokens
 	"claude-3-5-sonnet-20240620":                1.5,
 	"claude-3-5-sonnet-20241022":                1.5,
@@ -143,6 +145,7 @@ var defaultModelRatio = map[string]float64{
 	"claude-3-7-sonnet-20250219-thinking":       1.5,
 	"claude-sonnet-4-20250514":                  1.5,
 	"claude-sonnet-4-5-20250929":                1.5,
+	"claude-opus-4-5-20251101":                  2.5,
 	"claude-3-opus-20240229":                    7.5, // $15 / 1M tokens
 	"claude-opus-4-20250514":                    7.5,
 	"claude-opus-4-1-20250805":                  7.5,
@@ -558,7 +561,7 @@ func getHardcodedCompletionModelRatio(name string) (float64, bool) {

 	if strings.Contains(name, "claude-3") {
 		return 5, true
-	} else if strings.Contains(name, "claude-sonnet-4") || strings.Contains(name, "claude-opus-4") {
+	} else if strings.Contains(name, "claude-sonnet-4") || strings.Contains(name, "claude-opus-4") || strings.Contains(name, "claude-haiku-4") {
 		return 5, true
 	} else if strings.Contains(name, "claude-instant-1") || strings.Contains(name, "claude-2") {
 		return 3, true
@@ -820,6 +823,10 @@ func FormatMatchingModelName(name string) string {
 		name = handleThinkingBudgetModel(name, "gemini-2.5-pro", "gemini-2.5-pro-thinking-*")
 	}

+	if base, _, ok := reasoning.TrimEffortSuffix(name); ok {
+		name = base
+	}
+
 	if strings.HasPrefix(name, "gpt-4-gizmo") {
 		name = "gpt-4-gizmo-*"
 	}
--- a/setting/reasoning/suffix.go
+++ b/setting/reasoning/suffix.go
@@ -0,0 +1,20 @@
+package reasoning
+
+import (
+	"strings"
+
+	"github.com/samber/lo"
+)
+
+var EffortSuffixes = []string{"-high", "-medium", "-low"}
+
+// TrimEffortSuffix -> modelName level(low) exists
+func TrimEffortSuffix(modelName string) (string, string, bool) {
+	suffix, found := lo.Find(EffortSuffixes, func(s string) bool {
+		return strings.HasSuffix(modelName, s)
+	})
+	if !found {
+		return modelName, "", false
+	}
+	return strings.TrimSuffix(modelName, suffix), strings.TrimPrefix(suffix, "-"), true
+}
--- a/web/src/components/auth/LoginForm.jsx
+++ b/web/src/components/auth/LoginForm.jsx
@@ -294,7 +294,7 @@ const LoginForm = () => {
      setGithubButtonDisabled(true);
    }, 20000);
    try {
-      onGitHubOAuthClicked(status.github_client_id);
+      onGitHubOAuthClicked(status.github_client_id, { shouldLogout: true });
    } finally {
      // 由于重定向，这里不会执行到，但为了完整性添加
      setTimeout(() => setGithubLoading(false), 3000);
@@ -309,7 +309,7 @@ const LoginForm = () => {
    }
    setDiscordLoading(true);
    try {
-      onDiscordOAuthClicked(status.discord_client_id);
+      onDiscordOAuthClicked(status.discord_client_id, { shouldLogout: true });
    } finally {
      // 由于重定向，这里不会执行到，但为了完整性添加
      setTimeout(() => setDiscordLoading(false), 3000);
@@ -324,7 +324,12 @@ const LoginForm = () => {
    }
    setOidcLoading(true);
    try {
-      onOIDCClicked(status.oidc_authorization_endpoint, status.oidc_client_id);
+      onOIDCClicked(
+        status.oidc_authorization_endpoint,
+        status.oidc_client_id,
+        false,
+        { shouldLogout: true },
+      );
    } finally {
      // 由于重定向，这里不会执行到，但为了完整性添加
      setTimeout(() => setOidcLoading(false), 3000);
@@ -339,7 +344,7 @@ const LoginForm = () => {
    }
    setLinuxdoLoading(true);
    try {
-      onLinuxDOOAuthClicked(status.linuxdo_client_id);
+      onLinuxDOOAuthClicked(status.linuxdo_client_id, { shouldLogout: true });
    } finally {
      // 由于重定向，这里不会执行到，但为了完整性添加
      setTimeout(() => setLinuxdoLoading(false), 3000);
--- a/web/src/components/auth/RegisterForm.jsx
+++ b/web/src/components/auth/RegisterForm.jsx
@@ -261,7 +261,7 @@ const RegisterForm = () => {
      setGithubButtonDisabled(true);
    }, 20000);
    try {
-      onGitHubOAuthClicked(status.github_client_id);
+      onGitHubOAuthClicked(status.github_client_id, { shouldLogout: true });
    } finally {
      setTimeout(() => setGithubLoading(false), 3000);
    }
@@ -270,7 +270,7 @@ const RegisterForm = () => {
  const handleDiscordClick = () => {
    setDiscordLoading(true);
    try {
-      onDiscordOAuthClicked(status.discord_client_id);
+      onDiscordOAuthClicked(status.discord_client_id, { shouldLogout: true });
    } finally {
      setTimeout(() => setDiscordLoading(false), 3000);
    }
@@ -279,7 +279,12 @@ const RegisterForm = () => {
  const handleOIDCClick = () => {
    setOidcLoading(true);
    try {
-      onOIDCClicked(status.oidc_authorization_endpoint, status.oidc_client_id);
+      onOIDCClicked(
+        status.oidc_authorization_endpoint,
+        status.oidc_client_id,
+        false,
+        { shouldLogout: true },
+      );
    } finally {
      setTimeout(() => setOidcLoading(false), 3000);
    }
@@ -288,7 +293,7 @@ const RegisterForm = () => {
  const handleLinuxDOClick = () => {
    setLinuxdoLoading(true);
    try {
-      onLinuxDOOAuthClicked(status.linuxdo_client_id);
+      onLinuxDOOAuthClicked(status.linuxdo_client_id, { shouldLogout: true });
    } finally {
      setTimeout(() => setLinuxdoLoading(false), 3000);
    }
--- a/web/src/components/layout/SiderBar.jsx
+++ b/web/src/components/layout/SiderBar.jsx
@@ -377,7 +377,6 @@ const SiderBar = ({ onNavigate = () => {} }) => {
      className='sidebar-container'
      style={{
        width: 'var(--sidebar-current-width)',
-        background: 'var(--semi-color-bg-0)',
      }}
    >
      <SkeletonWrapper
--- a/web/src/components/playground/CustomInputRender.jsx
+++ b/web/src/components/playground/CustomInputRender.jsx
@@ -17,12 +17,87 @@ along with this program. If not, see <https://www.gnu.org/licenses/>.
 For commercial licensing, please contact support@quantumnous.com
 */

-import React from 'react';
+import React, { useRef, useEffect, useCallback } from 'react';
+import { Toast } from '@douyinfe/semi-ui';
+import { useTranslation } from 'react-i18next';
+import { usePlayground } from '../../contexts/PlaygroundContext';

 const CustomInputRender = (props) => {
+  const { t } = useTranslation();
+  const { onPasteImage, imageEnabled } = usePlayground();
  const { detailProps } = props;
  const { clearContextNode, uploadNode, inputNode, sendNode, onClick } =
    detailProps;
+  const containerRef = useRef(null);
+
+  const handlePaste = useCallback(async (e) => {
+    const items = e.clipboardData?.items;
+    if (!items) return;
+
+    for (let i = 0; i < items.length; i++) {
+      const item = items[i];
+      
+      if (item.type.indexOf('image') !== -1) {
+        e.preventDefault();
+        const file = item.getAsFile();
+        
+        if (file) {
+          try {
+            if (!imageEnabled) {
+              Toast.warning({
+                content: t('请先在设置中启用图片功能'),
+                duration: 3,
+              });
+              return;
+            }
+
+            const reader = new FileReader();
+            reader.onload = (event) => {
+              const base64 = event.target.result;
+              
+              if (onPasteImage) {
+                onPasteImage(base64);
+                Toast.success({
+                  content: t('图片已添加'),
+                  duration: 2,
+                });
+              } else {
+                Toast.error({
+                  content: t('无法添加图片'),
+                  duration: 2,
+                });
+              }
+            };
+            reader.onerror = () => {
+              console.error('Failed to read image file:', reader.error);
+              Toast.error({
+                content: t('粘贴图片失败'),
+                duration: 2,
+              });
+            };
+            reader.readAsDataURL(file);
+          } catch (error) {
+            console.error('Failed to paste image:', error);
+            Toast.error({
+              content: t('粘贴图片失败'),
+              duration: 2,
+            });
+          }
+        }
+        break;
+      }
+    }
+  }, [onPasteImage, imageEnabled, t]);
+
+  useEffect(() => {
+    const container = containerRef.current;
+    if (!container) return;
+
+    container.addEventListener('paste', handlePaste);
+    return () => {
+      container.removeEventListener('paste', handlePaste);
+    };
+  }, [handlePaste]);

  // 清空按钮
  const styledClearNode = clearContextNode
@@ -57,11 +132,12 @@ const CustomInputRender = (props) => {
  });

  return (
-    <div className='p-2 sm:p-4'>
+    <div className='p-2 sm:p-4' ref={containerRef}>
      <div
        className='flex items-center gap-2 sm:gap-3 p-2 bg-gray-50 rounded-xl sm:rounded-2xl shadow-sm hover:shadow-md transition-shadow'
        style={{ border: '1px solid var(--semi-color-border)' }}
        onClick={onClick}
+        title={t('支持 Ctrl+V 粘贴图片')}
      >
        {/* 清空对话按钮 - 左边 */}
        {styledClearNode}
--- a/web/src/components/playground/CustomRequestEditor.jsx
+++ b/web/src/components/playground/CustomRequestEditor.jsx
@@ -82,7 +82,7 @@ const CustomRequestEditor = ({
      return true;
    } catch (error) {
      setIsValid(false);
-      setErrorMessage(`JSON格式错误: ${error.message}`);
+      setErrorMessage(`${t('JSON格式错误')}: ${error.message}`);
      return false;
    }
  };
@@ -123,14 +123,14 @@ const CustomRequestEditor = ({
        <div className='flex items-center gap-2'>
          <Code size={16} className='text-gray-500' />
          <Typography.Text strong className='text-sm'>
-            自定义请求体模式
+            {t('自定义请求体模式')}
          </Typography.Text>
        </div>
        <Switch
          checked={customRequestMode}
          onChange={handleModeToggle}
-          checkedText='开'
-          uncheckedText='关'
+          checkedText={t('开')}
+          uncheckedText={t('关')}
          size='small'
        />
      </div>
@@ -140,7 +140,7 @@ const CustomRequestEditor = ({
          {/* 提示信息 */}
          <Banner
            type='warning'
-            description='启用此模式后，将使用您自定义的请求体发送API请求，模型配置面板的参数设置将被忽略。'
+            description={t('启用此模式后，将使用您自定义的请求体发送API请求，模型配置面板的参数设置将被忽略。')}
            icon={<AlertTriangle size={16} />}
            className='!rounded-lg'
            closeIcon={null}
@@ -150,21 +150,21 @@ const CustomRequestEditor = ({
          <div>
            <div className='flex items-center justify-between mb-2'>
              <Typography.Text strong className='text-sm'>
-                请求体 JSON
+                {t('请求体 JSON')}
              </Typography.Text>
              <div className='flex items-center gap-2'>
                {isValid ? (
                  <div className='flex items-center gap-1 text-green-600'>
                    <Check size={14} />
                    <Typography.Text className='text-xs'>
-                      格式正确
+                      {t('格式正确')}
                    </Typography.Text>
                  </div>
                ) : (
                  <div className='flex items-center gap-1 text-red-600'>
                    <X size={14} />
                    <Typography.Text className='text-xs'>
-                      格式错误
+                      {t('格式错误')}
                    </Typography.Text>
                  </div>
                )}
@@ -177,7 +177,7 @@ const CustomRequestEditor = ({
                  disabled={!isValid}
                  className='!rounded-lg'
                >
-                  格式化
+                  {t('格式化')}
                </Button>
              </div>
            </div>
@@ -201,7 +201,7 @@ const CustomRequestEditor = ({
            )}

            <Typography.Text className='text-xs text-gray-500 mt-2 block'>
-              请输入有效的JSON格式的请求体。您可以参考预览面板中的默认请求体格式。
+              {t('请输入有效的JSON格式的请求体。您可以参考预览面板中的默认请求体格式。')}
            </Typography.Text>
          </div>
        </>
--- a/Show More
+++ b/Show More
Author	SHA1	Message	Date
Seefs	4e69c98b42	Merge pull request #2412 from seefs001/pr-2372 feat: add openai video remix endpoint	2025-12-11 23:35:23 +08:00
Seefs	ca29fc5702	Merge pull request #2194 from NoahCodeGG/fix/process_channel_error	2025-12-11 18:12:06 +08:00
Calcium-Ion	fca015c6c4	Merge pull request #2397 from seefs001/fix/tool-call-claude fix: try to fix tool call issues	2025-12-09 16:57:24 +08:00
Seefs	23292a5ae9	Merge pull request #2360 from feitianbubu/pr2/fix-price-currency	2025-12-09 14:10:26 +08:00
Calcium-Ion	e346f0bf16	Merge pull request #2398 from seefs001/fix/video-proxy fix: Use channel proxy settings for task query scenarios	2025-12-09 14:05:30 +08:00
Calcium-Ion	cae05c068c	Merge pull request #2396 from seefs001/fix/login fix: Try to fix login error "already logged in" issue	2025-12-09 14:04:48 +08:00
Calcium-Ion	78c10209c0	Merge pull request #2395 from seefs001/fix/siderbar fix: sidebar color overlap	2025-12-09 14:04:26 +08:00
Calcium-Ion	4ffd54c50d	Merge pull request #2394 from seefs001/fix/fetch-model-header-overide fix: fetch upstream models	2025-12-09 14:03:34 +08:00
Calcium-Ion	08466358b2	Merge pull request #2359 from seefs001/fix/qwen-chat-args fix: qwen chat_template_kwargs	2025-12-09 14:01:26 +08:00
Calcium-Ion	5212fbd73d	Merge pull request #2358 from seefs001/fix/regrex-repeat-compile fix: regex repeat compile	2025-12-09 14:01:07 +08:00
Calcium-Ion	b0e120dcab	Merge pull request #2357 from seefs001/feature/go1.25-greengc chore(go): enable greenteagc	2025-12-09 14:00:52 +08:00
Calcium-Ion	9561c7b50f	Merge pull request #2356 from seefs001/feature/zhipiu_4v_image feat: zhipu 4v image generations	2025-12-09 14:00:20 +08:00
Seefs	1cb2b6f882	fix:try to fix tool call issues	2025-12-09 13:55:52 +08:00
Seefs	5889571108	fix: Use channel proxy settings for task query scenarios	2025-12-09 11:15:27 +08:00
Seefs	2e33948842	fix: Add styles only on mobile	2025-12-09 10:46:16 +08:00
Seefs	d1aaa07ad7	fix: Try to fix login error "already logged in" issue	2025-12-08 22:32:45 +08:00
Seefs	ea70c20f8e	fix: sidebar color overlap	2025-12-08 21:25:21 +08:00
Seefs	c7539d11a0	fix: fetch upstream models	2025-12-08 21:14:50 +08:00
Seefs	3ebc713327	Merge pull request #2387 from binorxin/fix-bug fix(go.mod): 更新modernc.org/sqlite依赖项版本	2025-12-08 21:02:18 +08:00
Seefs	72d2a94b0d	Merge pull request #2229 from HynoR/chore/v1 fix: Set default to unsupported value for gpt-5 model series requests	2025-12-08 20:59:30 +08:00
Seefs	12a5c7ce5e	Merge pull request #2368 from oudi/main Increase token name length limit from 30 to 50	2025-12-08 20:48:40 +08:00
Seefs	5eae6a3874	Merge pull request #2375 from FlowerRealm/feat/add-claude-haiku-4-5 feat: add claude-haiku-4-5-20251001 model support	2025-12-08 20:46:02 +08:00
Seefs	7b108a6900	Merge pull request #2388 from FirstMelody/main fix(adaptor): fix reasoning suffix not processing in vertex adapter	2025-12-08 20:45:37 +08:00
borx	3d282ac548	fix(go.mod): 更新modernc.org/sqlite依赖项版本	2025-12-08 01:16:30 +08:00
firstmelody	121746a79e	fix(adaptor): fix reasoning suffix not processing in vertex adapter	2025-12-08 01:12:29 +08:00
FlowerRealm	c3c119a9b4	feat: add claude-haiku-4-5-20251001 model support - Add model to Claude ModelList - Add model ratio (0.5, $1/1M input tokens) - Add completion ratio support (5x, $5/1M output tokens) - Add cache read ratio (0.1, $0.10/1M tokens) - Add cache write ratio (1.25, $1.25/1M tokens) Model specs: - Context window: 200K tokens - Max output: 64K tokens - Release date: October 1, 2025	2025-12-05 18:54:20 +08:00
oudi	6d6e5b3337	Merge pull request #1 from oudi/token-length-patch Increase token name length limit from 30 to 50	2025-12-04 11:21:46 +08:00
oudi	d64205e35a	Increase token name length limit from 30 to 50	2025-12-04 11:18:51 +08:00
CaIon	0b9f6a58bc	feat: 将任务查询数量改为可配置环境变量 TASK_QUERY_LIMIT	2025-12-03 19:27:15 +08:00
feitianbubu	293a5de0f8	feat: update price display use current currency symbol	2025-12-03 10:51:03 +08:00
Seefs	c07347f24f	fix: qwen chat_template_kwargs	2025-12-03 00:47:40 +08:00
Seefs	896e4ac671	fix: regex repeat compile	2025-12-03 00:41:47 +08:00
CaIon	7d1bad1b37	fix(token_counter): correct model name reference in image token estimation	2025-12-03 00:25:05 +08:00
Seefs	8e7be25429	chore(go): enable greenteagc	2025-12-02 23:15:20 +08:00
Seefs	2e37347851	feat: zhipu v4 image generations	2025-12-02 22:56:58 +08:00
CaIon	45556c961f	fix(price): adjust pre-consume quota logic for free models based on group ratio	2025-12-02 22:09:48 +08:00
Calcium-Ion	ffc45a756e	Merge pull request #2344 from seefs001/feature/gemini-thinking-level feat: gemini 3 thinking level gemini-3-pro-preview-high	2025-12-02 21:55:43 +08:00
Calcium-Ion	48635360cd	Merge pull request #2355 from QuantumNous/feat/optimize-token-counter feat: refactor token estimation logic	2025-12-02 21:51:09 +08:00
Calcium-Ion	e7e5cc2c05	Merge pull request #2351 from prnake/fix-max-conns fix: try resolve the high concurrency issue to a single host	2025-12-02 21:44:24 +08:00
CaIon	0c051e968f	feat(token_estimator): add concurrency support for multipliers retrieval	2025-12-02 21:38:58 +08:00
CaIon	f5b409d74f	feat: refactor token estimation logic - Introduced new OpenAI text models in `common/model.go`. - Added `IsOpenAITextModel` function to check for OpenAI text models. - Refactored token estimation methods across various channels to use estimated prompt tokens instead of direct prompt token counts. - Updated related functions and structures to accommodate the new token estimation approach, enhancing overall token management.	2025-12-02 21:34:39 +08:00
Calcium-Ion	509d1f633a	Merge pull request #2353 from QuantumNous/openapi chore: update the relay openapi file	2025-12-02 18:18:35 +08:00
t0ng7u	0c6d890f6e	chore: update the relay openapi file	2025-12-02 18:17:01 +08:00
Papersnake	2f7eebcd10	fix: add ForceAttemptHTTP2	2025-12-02 10:08:58 +08:00
Papersnake	3954feb993	fix: set MaxIdleConnsPerHost to 100	2025-12-02 09:55:03 +08:00
Calcium-Ion	d3ca454c3b	Merge pull request #2348 from QuantumNous/openapi chore: update openapi files	2025-12-02 00:32:17 +08:00
t0ng7u	46aca8fad3	chore: update openapi files	2025-12-01 21:39:09 +08:00
Calcium-Ion	86aeb72549	Merge pull request #2346 from QuantumNous/nano-banana-multi-turn feat(gemini): implement markdown image handling in text processing	2025-12-01 18:42:51 +08:00
CaIon	4dbdbdec1d	feat(gemini): implement markdown image handling in text processing	2025-12-01 17:54:41 +08:00
Seefs	b6a02d8303	feat: gemini 3 thinking level gemini-3-pro-preview-high	2025-12-01 16:40:46 +08:00
CaIon	36a739e777	Remove outdated API documentation for authentication, web API, and models (Midjourney, Rerank, Suno). Add OpenAPI specifications for backend management and relay interfaces.	2025-11-30 21:44:05 +08:00
CaIon	98f92f990a	feat(gemini): add validation and conversion for imageConfig parameters in extra_body	2025-11-30 19:31:08 +08:00
CaIon	3f7ea1fd83	fix(vertex): ensure sampleCount is a positive integer and update OtherRatios	2025-11-30 19:05:33 +08:00
Calcium-Ion	f6e7a2344b	Merge pull request #2340 from QuantumNous/revert-2305-pr/add-gemini-3-pro-image-preview-oai Revert "OAI生图接口支持gemini 3 pro image preview"	2025-11-30 18:50:16 +08:00
Seefs	3257723a55	Revert "OAI生图接口支持gemini 3 pro image preview"	2025-11-30 18:49:18 +08:00
Calcium-Ion	b19b2d62df	Merge pull request #2339 from QuantumNous/revert-2330-pr/fix-nano-banana-err Revert "fix: nano-banana not compatible imageSize"	2025-11-30 18:48:09 +08:00
Calcium-Ion	f9c8624f2c	Merge pull request #2338 from QuantumNous/revert-2321-pr/gemini-image-edit Revert "Gemini Image系列支持图像编辑"	2025-11-30 18:48:01 +08:00
Calcium-Ion	6c8253156b	Merge pull request #2337 from QuantumNous/revert-2315-pr/gemini-veo3.1-i2v Revert "Gemini Veo3.1[AI Studio]增加图生视频支持"	2025-11-30 18:47:50 +08:00
Calcium-Ion	a66b314f5b	Merge pull request #2336 from QuantumNous/revert-2309-pr/fix-gemini-ImageConfig Revert "fix: gemini image correct generationConfig"	2025-11-30 18:47:39 +08:00
Seefs	e29ff0060d	Revert "fix: nano-banana not compatible imageSize"	2025-11-30 18:46:10 +08:00
Seefs	d4a2c2ab54	Revert "Gemini Image系列支持图像编辑"	2025-11-30 18:45:54 +08:00
Seefs	ded463ee57	Revert "Gemini Veo3.1[AI Studio]增加图生视频支持"	2025-11-30 18:45:37 +08:00
Seefs	e337936227	Revert "fix: gemini image correct generationConfig"	2025-11-30 18:45:23 +08:00
Seefs	8d0827cb9e	Merge pull request #2314 from seefs001/fix/i18n-missing fix(i18n): fill missing translations in i18n.	2025-11-30 16:31:52 +08:00
Calcium-Ion	c07331ee21	Merge pull request #2304 from seefs001/fix/claude-missing-field fix: claude request missing field	2025-11-30 16:22:35 +08:00
Calcium-Ion	287a59e2fd	fix: edit vertex key type (#2311 )	2025-11-30 16:21:49 +08:00
Seefs	451c594e34	Merge pull request #2334 from seefs001/feature/glm-coding feat: glm coding plan && kimi coding plan	2025-11-30 16:21:12 +08:00
Calcium-Ion	46a18c4658	Merge pull request #2335 from seefs001/fix/nano-banana-pro-4k fix: nano banana pro 4k(StreamScannerMaxBufferMB env)	2025-11-30 16:20:46 +08:00
Calcium-Ion	d5cb53154f	Merge pull request #2312 from ImogeneOctaviap794/feat/enhance-playground-debugging feat(playground): enhance SSE debugging and add image paste support with i18n	2025-11-30 16:20:39 +08:00
Seefs	2b54e5fc53	Merge pull request #2330 from feitianbubu/pr/fix-nano-banana-err fix: nano-banana not compatible imageSize	2025-11-30 16:18:20 +08:00
Seefs	2520c8b25d	fix: nano banana pro 4k(StreamScannerMaxBufferMB env)	2025-11-30 16:08:25 +08:00
Seefs	590745b846	Merge pull request #2329 from mfzzf/fix/aws-anthropic-http-err-code fix(aws): extract HTTP status code from AWS SDK errors	2025-11-29 15:19:01 +08:00
feitianbubu	77eb536b69	fix: nano-banana not compatible imageSize	2025-11-29 00:58:25 +08:00
jason.mei	c6a8e4c252	fix(aws): simplify HTTP status code extraction from AWS errors	2025-11-28 18:03:53 +08:00
jason.mei	f2e51963dc	fix(aws): extract HTTP status code from AWS SDK errors	2025-11-28 17:43:37 +08:00
IcedTangerine	fa72a27a59	Merge pull request #2324 from feitianbubu/pr/video-download-oai feat: 视频下载和界面预览统一使用OAI标准接口	2025-11-28 17:03:39 +08:00
feitianbubu	2a77453e1a	feat: all video preview use videos/:id/content	2025-11-28 13:11:31 +08:00
IcedTangerine	b47cf4efb3	Merge pull request #2321 from feitianbubu/pr/gemini-image-edit Gemini Image系列支持图像编辑	2025-11-27 18:04:50 +08:00
IcedTangerine	420c6e58f2	Fix defer placement for image file closure	2025-11-27 18:01:34 +08:00
IcedTangerine	4d00dad002	Fix error message formatting in relay_utils.go	2025-11-27 17:59:38 +08:00
IcedTangerine	a0982996a4	Use defer to close image file after opening Ensure image file is closed using defer after opening.	2025-11-27 17:56:59 +08:00
IcedTangerine	36cf515617	Merge pull request #2315 from feitianbubu/pr/gemini-veo3.1-i2v Gemini Veo3.1[AI Studio]增加图生视频支持	2025-11-27 17:24:13 +08:00
feitianbubu	cb5a37abed	feat: gemini image support edit	2025-11-27 16:04:04 +08:00
feitianbubu	f7d6c36032	feat: gemini video veo3.1 add task fail check	2025-11-26 21:56:14 +08:00
feitianbubu	4a367edfde	feat: gemini video veo3.1 add i2v	2025-11-26 21:56:13 +08:00
ImogeneOctaviap794	9140dee70c	feat(playground): enhance SSE debugging and add image paste support with i18n - Add SSEViewer component for interactive SSE message inspection * Display SSE data stream with collapsible panels * Show parsed JSON with syntax highlighting * Display key information badges (content, tokens, finish reason) * Support copy individual or all SSE messages * Show error messages with detailed information - Support Ctrl+V to paste images in chat input * Enable image paste in CustomInputRender component * Auto-detect and add pasted images to image list * Show toast notifications for paste results - Add complete i18n support for 6 languages * Chinese (zh): Complete translations * English (en): Complete translations * Japanese (ja): Add 28 new translations * French (fr): Add 28 new translations * Russian (ru): Add 28 new translations * Vietnamese (vi): Add 32 new translations - Update .gitignore to exclude data directory	2025-11-26 20:40:32 +08:00
Calcium-Ion	95a7749e1d	Merge pull request #2309 from feitianbubu/pr/fix-gemini-ImageConfig fix: gemini image correct generationConfig	2025-11-26 18:46:06 +08:00
Seefs	a25d00bace	fix: edit vertex key type	2025-11-26 18:12:36 +08:00
feitianbubu	ab3cda3202	fix: gemini image correct generationConfig	2025-11-26 15:54:11 +08:00
IcedTangerine	5ac1d02200	Merge pull request #2305 from feitianbubu/pr/add-gemini-3-pro-image-preview-oai OAI生图接口支持gemini 3 pro image preview	2025-11-26 13:35:17 +08:00
feitianbubu	d859872e0d	feat: gemini-3-pro-image-preview add extra param	2025-11-26 12:03:24 +08:00
feitianbubu	bff04514a8	feat: support gemini-3-pro-image-preview via images/generations	2025-11-26 12:03:24 +08:00
Seefs	dab5fad61e	fix: claude request missing field	2025-11-26 02:06:25 +08:00
Seefs	a6a20a2069	Merge pull request #2296 from seefs001/fix/adapter-missing fix: volcengine claude DoResponse	2025-11-25 16:45:14 +08:00
Calcium-Ion	4866b3db13	Merge pull request #2295 from seefs001/fix/adapter-missing fix: volcengine claude DoResponse	2025-11-25 15:54:39 +08:00
Seefs	5060904331	fix: volcengine claude DoResponse	2025-11-25 15:45:31 +08:00
Calcium-Ion	393c2b620c	Merge pull request #2294 from seefs001/fix/adapter-missing fix: volcengine && baidu claude adapter	2025-11-25 15:31:26 +08:00
Seefs	e5e3e0f201	fix: volcengine && baidu claude adapter	2025-11-25 15:06:03 +08:00
Seefs	b3d5fbd9f2	Merge pull request #2282 from amikebzek/claude/analyze-gemini-integration-011nJGemhrPUdqwg3qDvmqVB feat: enable thoughtSignature for non-function-call messages	2025-11-25 14:50:55 +08:00
Seefs	31a652f8e2	Merge pull request #2293 from prnake/claude-opus-4-5 feat: add claude-opus-4-5-20251101	2025-11-25 14:44:57 +08:00
Papersnake	79682dc542	feat: add claude-opus-4-5-20251101	2025-11-25 10:53:01 +08:00
Papersnake	5931d333cb	feat: add claude-opus-4-5-20251101 ratio	2025-11-25 10:49:34 +08:00
Seefs	2f80e3fba1	Merge pull request #2261 from wzxjohn/hotfix/analytic fix: root page does not have analytic code	2025-11-24 14:06:02 +08:00
Seefs	bd9e23ce4e	Merge pull request #2264 from binorxin/main fix: cast size to int64 before comparing with MaxUint32	2025-11-24 14:05:14 +08:00
Claude	25aed08361	feat: enable thoughtSignature for non-function-call messages Previously thoughtSignature was only attached to messages with function calls. This change extends the feature to also attach thoughtSignature to the first text part of assistant/model messages when no tool_calls are present, ensuring compatibility with Gemini thinking models in regular conversation scenarios.	2025-11-24 00:31:20 +00:00
borx	182f3a9b4d	fix: cast size to int64 before comparing with MaxUint32	2025-11-20 23:57:30 +08:00
HynoR	c6125eccb1	fix: Set default to unsupported value for gpt-5 model series requests	2025-11-15 13:28:38 +08:00
NoahCode	138810f19c	fix(channel): update channel identification logic in error processing	2025-11-08 20:33:14 +08:00
wzxjohn	2a62aea46c	fix: typo	2025-10-30 14:21:46 +08:00
wzxjohn	4a0c119140	fix(web): index page does not have analytic	2025-10-30 12:17:51 +08:00