📝 Add docstrings to fix/channel-test-responses-fallback

Docstrings generation was requested by @FlowerRealm.

* https://github.com/QuantumNous/new-api/pull/2501#issuecomment-3686382220

The following files were modified:

* `controller/channel-test.go`
* `relay/helper/valid_request.go`
* `service/error.go`
feat(token): enhance error handling in ValidateUserToken for better clarity
2026-04-18 02:17:28 +00:00 · 2025-12-23 11:56:30 +00:00 · 2025-12-22 18:01:38 +08:00 · 2025-12-21 21:28:35 +08:00 · 2025-12-21 21:18:59 +08:00 · 2025-12-21 21:00:33 +08:00
141 changed files with 17561 additions and 1741 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -16,6 +16,7 @@ new-api
 tiktoken_cache
 .eslintcache
 .gocache
 .gomodcache/
 .cache
 web/bun.lock
--- a/9
+++ b/9
@@ -14,7 +14,7 @@ ENV GO111MODULE=on CGO_ENABLED=0
 ARG TARGETOS
 ARG TARGETARCH
 ENV GOOS=${TARGETOS:-linux} GOARCH=${TARGETARCH:-amd64}
-
+ENV GOEXPERIMENT=greenteagc
 WORKDIR /build
@@ -25,10 +25,11 @@ COPY . .
 COPY --from=builder /build/dist ./web/dist
 RUN go build -ldflags "-s -w -X 'github.com/QuantumNous/new-api/common.Version=$(cat VERSION)'" -o new-api
-FROM alpine
+FROM debian:bookworm-slim
-RUN apk upgrade --no-cache \
+RUN apt-get update \
-    && apk add --no-cache ca-certificates tzdata \
+    && apt-get install -y --no-install-recommends ca-certificates tzdata libasan8 wget \
    && rm -rf /var/lib/apt/lists/* \
    && update-ca-certificates
 COPY --from=builder2 /build/new-api /
--- a/README.en.md
+++ b/README.en.md
@@ -146,7 +146,7 @@ docker run --name new-api -d --restart always \
 🎉 After deployment is complete, visit `http://localhost:3000` to start using!
-📖 For more deployment methods, please refer to [Deployment Guide](https://docs.newapi.pro/installation)
+📖 For more deployment methods, please refer to [Deployment Guide](https://docs.newapi.pro/en/docs/installation)
 ---
@@ -154,7 +154,7 @@ docker run --name new-api -d --restart always \
 <div align="center">
-### 📖 [Official Documentation](https://docs.newapi.pro/) | [![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/QuantumNous/new-api)
+### 📖 [Official Documentation](https://docs.newapi.pro/en/docs) | [![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/QuantumNous/new-api)
 </div>
@@ -162,17 +162,17 @@ docker run --name new-api -d --restart always \
 | Category | Link |
 |------|------|
-| 🚀 Deployment Guide | [Installation Documentation](https://docs.newapi.pro/installation) |
+| 🚀 Deployment Guide | [Installation Documentation](https://docs.newapi.pro/en/docs/installation) |
-| ⚙️ Environment Configuration | [Environment Variables](https://docs.newapi.pro/installation/environment-variables) |
+| ⚙️ Environment Configuration | [Environment Variables](https://docs.newapi.pro/en/docs/installation/config-maintenance/environment-variables) |
-| 📡 API Documentation | [API Documentation](https://docs.newapi.pro/api) |
+| 📡 API Documentation | [API Documentation](https://docs.newapi.pro/en/docs/api) |
-| ❓ FAQ | [FAQ](https://docs.newapi.pro/support/faq) |
+| ❓ FAQ | [FAQ](https://docs.newapi.pro/en/docs/support/faq) |
-| 💬 Community Interaction | [Communication Channels](https://docs.newapi.pro/support/community-interaction) |
+| 💬 Community Interaction | [Communication Channels](https://docs.newapi.pro/en/docs/support/community-interaction) |
 ---
 ## ✨ Key Features
-> For detailed features, please refer to [Features Introduction](https://docs.newapi.pro/wiki/features-introduction)
+> For detailed features, please refer to [Features Introduction](https://docs.newapi.pro/en/docs/guide/wiki/basic-concepts/features-introduction)
 ### 🎨 Core Functions
@@ -201,11 +201,11 @@ docker run --name new-api -d --restart always \
 ### 🚀 Advanced Features
 **API Format Support:**
-- ⚡ [OpenAI Responses](https://docs.newapi.pro/api/openai-responses)
+- ⚡ [OpenAI Responses](https://docs.newapi.pro/en/docs/api/ai-model/chat/openai/create-response)
-- ⚡ [OpenAI Realtime API](https://docs.newapi.pro/api/openai-realtime) (including Azure)
+- ⚡ [OpenAI Realtime API](https://docs.newapi.pro/en/docs/api/ai-model/realtime/create-realtime-session) (including Azure)
-- ⚡ [Claude Messages](https://docs.newapi.pro/api/anthropic-chat)
+- ⚡ [Claude Messages](https://docs.newapi.pro/en/docs/api/ai-model/chat/create-message)
-- ⚡ [Google Gemini](https://docs.newapi.pro/api/google-gemini-chat/)
+- ⚡ [Google Gemini](https://doc.newapi.pro/en/api/google-gemini-chat)
-- 🔄 [Rerank Models](https://docs.newapi.pro/api/jinaai-rerank) (Cohere, Jina)
+- 🔄 [Rerank Models](https://docs.newapi.pro/en/docs/api/ai-model/rerank/create-rerank) (Cohere, Jina)
 **Intelligent Routing:**
 - ⚖️ Channel weighted random
@@ -238,6 +238,7 @@ docker run --name new-api -d --restart always \
 - `gemini-2.5-flash-nothinking` - Disable thinking mode
 - `gemini-2.5-pro-thinking` - Enable thinking mode
 - `gemini-2.5-pro-thinking-128` - Enable thinking mode with thinking budget of 128 tokens
 - You can also append `-low`, `-medium`, or `-high` to any Gemini model name to request the corresponding reasoning effort (no extra thinking-budget suffix needed).
 </details>
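Review note: the suffix convention described in the hunk above (`-nothinking`, `-thinking`, `-thinking-<budget>`, and `-low` / `-medium` / `-high`) can be illustrated with a small parser. This is a minimal sketch only. `ThinkingConfig` and `ParseGeminiModelName` are hypothetical names introduced for illustration, and the precedence between suffixes is an assumption; the actual handling in new-api may differ.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// ThinkingConfig is a hypothetical result type, not taken from new-api.
type ThinkingConfig struct {
	BaseModel string // model name with the suffix stripped
	Thinking  string // "" (unspecified), "on", or "off"
	Budget    int    // thinking budget in tokens; 0 means unspecified
	Effort    string // "", "low", "medium", or "high"
}

// ParseGeminiModelName splits a model name into its base name and the
// thinking options encoded in its suffix. Names without a recognized
// suffix are returned unchanged.
func ParseGeminiModelName(name string) ThinkingConfig {
	cfg := ThinkingConfig{BaseModel: name}

	// Reasoning-effort suffixes: -low, -medium, -high.
	for _, effort := range []string{"low", "medium", "high"} {
		if base, ok := strings.CutSuffix(name, "-"+effort); ok {
			cfg.BaseModel, cfg.Effort = base, effort
			return cfg
		}
	}
	// -nothinking disables thinking mode.
	if base, ok := strings.CutSuffix(name, "-nothinking"); ok {
		cfg.BaseModel, cfg.Thinking = base, "off"
		return cfg
	}
	// -thinking enables thinking mode without an explicit budget.
	if base, ok := strings.CutSuffix(name, "-thinking"); ok {
		cfg.BaseModel, cfg.Thinking = base, "on"
		return cfg
	}
	// -thinking-<budget> enables thinking mode with a token budget.
	const marker = "-thinking-"
	if i := strings.LastIndex(name, marker); i >= 0 {
		if n, err := strconv.Atoi(name[i+len(marker):]); err == nil && n > 0 {
			cfg.BaseModel, cfg.Thinking, cfg.Budget = name[:i], "on", n
		}
	}
	return cfg
}

func main() {
	for _, name := range []string{
		"gemini-2.5-flash-nothinking",
		"gemini-2.5-pro-thinking",
		"gemini-2.5-pro-thinking-128",
		"gemini-2.5-pro-high",
	} {
		fmt.Printf("%s => %+v\n", name, ParseGeminiModelName(name))
	}
}
```

Checking the effort suffixes first keeps the two conventions separate, which matches the README's remark that no extra thinking-budget suffix is needed when an effort level is given.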
@@ -245,16 +246,16 @@ docker run --name new-api -d --restart always \
 ## 🤖 Model Support
-> For details, please refer to [API Documentation - Relay Interface](https://docs.newapi.pro/api)
+> For details, please refer to [API Documentation - Relay Interface](https://docs.newapi.pro/en/docs/api)
 | Model Type | Description | Documentation |
 |---------|------|------|
 | 🤖 OpenAI GPTs | gpt-4-gizmo-* series | - |
-| 🎨 Midjourney-Proxy | [Midjourney-Proxy(Plus)](https://github.com/novicezk/midjourney-proxy) | [Documentation](https://docs.newapi.pro/api/midjourney-proxy-image) |
+| 🎨 Midjourney-Proxy | [Midjourney-Proxy(Plus)](https://github.com/novicezk/midjourney-proxy) | [Documentation](https://doc.newapi.pro/en/api/midjourney-proxy-image) |
-| 🎵 Suno-API | [Suno API](https://github.com/Suno-API/Suno-API) | [Documentation](https://docs.newapi.pro/api/suno-music) |
+| 🎵 Suno-API | [Suno API](https://github.com/Suno-API/Suno-API) | [Documentation](https://doc.newapi.pro/en/api/suno-music) |
-| 🔄 Rerank | Cohere, Jina | [Documentation](https://docs.newapi.pro/api/jinaai-rerank) |
+| 🔄 Rerank | Cohere, Jina | [Documentation](https://docs.newapi.pro/en/docs/api/ai-model/rerank/create-rerank) |
-| 💬 Claude | Messages format | [Documentation](https://docs.newapi.pro/api/anthropic-chat) |
+| 💬 Claude | Messages format | [Documentation](https://docs.newapi.pro/en/docs/api/ai-model/chat/create-message) |
-| 🌐 Gemini | Google Gemini format | [Documentation](https://docs.newapi.pro/api/google-gemini-chat/) |
+| 🌐 Gemini | Google Gemini format | [Documentation](https://doc.newapi.pro/en/api/google-gemini-chat) |
 | 🔧 Dify | ChatFlow mode | - |
 | 🎯 Custom | Supports complete call address | - |
@@ -263,16 +264,16 @@ docker run --name new-api -d --restart always \
 <details>
 <summary>View complete interface list</summary>
-- [Chat Interface (Chat Completions)](https://docs.newapi.pro/api/openai-chat)
+- [Chat Interface (Chat Completions)](https://docs.newapi.pro/en/docs/api/ai-model/chat/openai/create-chat-completion)
-- [Response Interface (Responses)](https://docs.newapi.pro/api/openai-responses)
+- [Response Interface (Responses)](https://docs.newapi.pro/en/docs/api/ai-model/chat/openai/create-response)
-- [Image Interface (Image)](https://docs.newapi.pro/api/openai-image)
+- [Image Interface (Image)](https://docs.newapi.pro/en/docs/api/ai-model/images/openai/v1-images-generations--post)
-- [Audio Interface (Audio)](https://docs.newapi.pro/api/openai-audio)
+- [Audio Interface (Audio)](https://docs.newapi.pro/en/docs/api/ai-model/audio/openai/create-transcription)
-- [Video Interface (Video)](https://docs.newapi.pro/api/openai-video)
+- [Video Interface (Video)](https://docs.newapi.pro/en/docs/api/ai-model/videos/create-video-generation)
-- [Embedding Interface (Embeddings)](https://docs.newapi.pro/api/openai-embeddings)
+- [Embedding Interface (Embeddings)](https://docs.newapi.pro/en/docs/api/ai-model/embeddings/create-embedding)
-- [Rerank Interface (Rerank)](https://docs.newapi.pro/api/jinaai-rerank)
+- [Rerank Interface (Rerank)](https://docs.newapi.pro/en/docs/api/ai-model/rerank/create-rerank)
-- [Realtime Conversation (Realtime)](https://docs.newapi.pro/api/openai-realtime)
+- [Realtime Conversation (Realtime)](https://docs.newapi.pro/en/docs/api/ai-model/realtime/create-realtime-session)
-- [Claude Chat](https://docs.newapi.pro/api/anthropic-chat)
+- [Claude Chat](https://docs.newapi.pro/en/docs/api/ai-model/chat/create-message)
-- [Google Gemini Chat](https://docs.newapi.pro/api/google-gemini-chat/)
+- [Google Gemini Chat](https://doc.newapi.pro/en/api/google-gemini-chat)
 </details>
@@ -304,10 +305,11 @@ docker run --name new-api -d --restart always \
 | `REDIS_CONN_STRING` | Redis connection string | - |
 | `STREAMING_TIMEOUT` | Streaming timeout (seconds) | `300` |
 | `STREAM_SCANNER_MAX_BUFFER_MB` | Max per-line buffer (MB) for the stream scanner; increase when upstream sends huge image/base64 payloads | `64` |
 | `MAX_REQUEST_BODY_MB` | Max request body size (MB, counted **after decompression**; prevents huge requests/zip bombs from exhausting memory). Exceeding it returns `413` | `32` |
 | `AZURE_DEFAULT_API_VERSION` | Azure API version | `2025-04-01-preview` |
 | `ERROR_LOG_ENABLED` | Error log switch | `false` |
-📖 **Complete configuration:** [Environment Variables Documentation](https://docs.newapi.pro/installation/environment-variables)
+📖 **Complete configuration:** [Environment Variables Documentation](https://docs.newapi.pro/en/docs/installation/config-maintenance/environment-variables)
 </details>
@@ -409,10 +411,10 @@ docker run --name new-api -d --restart always \
 | Resource | Link |
 |------|------|
-| 📘 FAQ | [FAQ](https://docs.newapi.pro/support/faq) |
+| 📘 FAQ | [FAQ](https://docs.newapi.pro/en/docs/support/faq) |
-| 💬 Community Interaction | [Communication Channels](https://docs.newapi.pro/support/community-interaction) |
+| 💬 Community Interaction | [Communication Channels](https://docs.newapi.pro/en/docs/support/community-interaction) |
-| 🐛 Issue Feedback | [Issue Feedback](https://docs.newapi.pro/support/feedback-issues) |
+| 🐛 Issue Feedback | [Issue Feedback](https://docs.newapi.pro/en/docs/support/feedback-issues) |
-| 📚 Complete Documentation | [Official Documentation](https://docs.newapi.pro/support) |
+| 📚 Complete Documentation | [Official Documentation](https://docs.newapi.pro/en/docs) |
 ### 🤝 Contribution Guide
@@ -441,7 +443,7 @@ Welcome all forms of contribution!
 If this project is helpful to you, welcome to give us a ⭐️ Star！
-**[Official Documentation](https://docs.newapi.pro/)** • **[Issue Feedback](https://github.com/Calcium-Ion/new-api/issues)** • **[Latest Release](https://github.com/Calcium-Ion/new-api/releases)**
+**[Official Documentation](https://docs.newapi.pro/en/docs)** • **[Issue Feedback](https://github.com/Calcium-Ion/new-api/issues)** • **[Latest Release](https://github.com/Calcium-Ion/new-api/releases)**
 <sub>Built with ❤️ by QuantumNous</sub>
--- a/README.fr.md
+++ b/README.fr.md
@@ -146,7 +146,7 @@ docker run --name new-api -d --restart always \
 🎉 Après le déploiement, visitez `http://localhost:3000` pour commencer à utiliser!
-📖 Pour plus de méthodes de déploiement, veuillez vous référer à [Guide de déploiement](https://docs.newapi.pro/installation)
+📖 Pour plus de méthodes de déploiement, veuillez vous référer à [Guide de déploiement](https://docs.newapi.pro/en/docs/installation)
 ---
@@ -154,7 +154,7 @@ docker run --name new-api -d --restart always \
 <div align="center">
-### 📖 [Documentation officielle](https://docs.newapi.pro/) | [![Demander à DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/QuantumNous/new-api)
+### 📖 [Documentation officielle](https://docs.newapi.pro/en/docs) | [![Demander à DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/QuantumNous/new-api)
 </div>
@@ -162,17 +162,17 @@ docker run --name new-api -d --restart always \
 | Catégorie | Lien |
 |------|------|
-| 🚀 Guide de déploiement | [Documentation d'installation](https://docs.newapi.pro/installation) |
+| 🚀 Guide de déploiement | [Documentation d'installation](https://docs.newapi.pro/en/docs/installation) |
-| ⚙️ Configuration de l'environnement | [Variables d'environnement](https://docs.newapi.pro/installation/environment-variables) |
+| ⚙️ Configuration de l'environnement | [Variables d'environnement](https://docs.newapi.pro/en/docs/installation/config-maintenance/environment-variables) |
-| 📡 Documentation de l'API | [Documentation de l'API](https://docs.newapi.pro/api) |
+| 📡 Documentation de l'API | [Documentation de l'API](https://docs.newapi.pro/en/docs/api) |
-| ❓ FAQ | [FAQ](https://docs.newapi.pro/support/faq) |
+| ❓ FAQ | [FAQ](https://docs.newapi.pro/en/docs/support/faq) |
-| 💬 Interaction avec la communauté | [Canaux de communication](https://docs.newapi.pro/support/community-interaction) |
+| 💬 Interaction avec la communauté | [Canaux de communication](https://docs.newapi.pro/en/docs/support/community-interaction) |
 ---
 ## ✨ Fonctionnalités clés
-> Pour les fonctionnalités détaillées, veuillez vous référer à [Présentation des fonctionnalités](https://docs.newapi.pro/wiki/features-introduction) |
+> Pour les fonctionnalités détaillées, veuillez vous référer à [Présentation des fonctionnalités](https://docs.newapi.pro/en/docs/guide/wiki/basic-concepts/features-introduction) |
 ### 🎨 Fonctions principales
@@ -200,11 +200,11 @@ docker run --name new-api -d --restart always \
 ### 🚀 Fonctionnalités avancées
 **Prise en charge des formats d'API:**
-- ⚡ [OpenAI Responses](https://docs.newapi.pro/api/openai-responses)
+- ⚡ [OpenAI Responses](https://docs.newapi.pro/en/docs/api/ai-model/chat/openai/create-response)
-- ⚡ [OpenAI Realtime API](https://docs.newapi.pro/api/openai-realtime) (y compris Azure)
+- ⚡ [OpenAI Realtime API](https://docs.newapi.pro/en/docs/api/ai-model/realtime/create-realtime-session) (y compris Azure)
-- ⚡ [Claude Messages](https://docs.newapi.pro/api/anthropic-chat)
+- ⚡ [Claude Messages](https://docs.newapi.pro/en/docs/api/ai-model/chat/create-message)
-- ⚡ [Google Gemini](https://docs.newapi.pro/api/google-gemini-chat/)
+- ⚡ [Google Gemini](https://doc.newapi.pro/en/api/google-gemini-chat)
-- 🔄 [Modèles Rerank](https://docs.newapi.pro/api/jinaai-rerank) (Cohere, Jina)
+- 🔄 [Modèles Rerank](https://docs.newapi.pro/en/docs/api/ai-model/rerank/create-rerank) (Cohere, Jina)
 **Routage intelligent:**
 - ⚖️ Sélection aléatoire pondérée des canaux
@@ -234,6 +234,7 @@ docker run --name new-api -d --restart always \
 - `gemini-2.5-flash-nothinking` - Désactiver le mode de pensée
 - `gemini-2.5-pro-thinking` - Activer le mode de pensée
 - `gemini-2.5-pro-thinking-128` - Activer le mode de pensée avec budget de pensée de 128 tokens
 - Vous pouvez également ajouter les suffixes `-low`, `-medium` ou `-high` aux modèles Gemini pour fixer le niveau d’effort de raisonnement (sans suffixe de budget supplémentaire).
 </details>
@@ -241,16 +242,16 @@ docker run --name new-api -d --restart always \
 ## 🤖 Prise en charge des modèles
-> Pour les détails, veuillez vous référer à [Documentation de l'API - Interface de relais](https://docs.newapi.pro/api)
+> Pour les détails, veuillez vous référer à [Documentation de l'API - Interface de relais](https://docs.newapi.pro/en/docs/api)
 | Type de modèle | Description | Documentation |
 |---------|------|------|
 | 🤖 OpenAI GPTs | série gpt-4-gizmo-* | - |
-| 🎨 Midjourney-Proxy | [Midjourney-Proxy(Plus)](https://github.com/novicezk/midjourney-proxy) | [Documentation](https://docs.newapi.pro/api/midjourney-proxy-image) |
+| 🎨 Midjourney-Proxy | [Midjourney-Proxy(Plus)](https://github.com/novicezk/midjourney-proxy) | [Documentation](https://doc.newapi.pro/en/api/midjourney-proxy-image) |
-| 🎵 Suno-API | [Suno API](https://github.com/Suno-API/Suno-API) | [Documentation](https://docs.newapi.pro/api/suno-music) |
+| 🎵 Suno-API | [Suno API](https://github.com/Suno-API/Suno-API) | [Documentation](https://doc.newapi.pro/en/api/suno-music) |
-| 🔄 Rerank | Cohere, Jina | [Documentation](https://docs.newapi.pro/api/jinaai-rerank) |
+| 🔄 Rerank | Cohere, Jina | [Documentation](https://docs.newapi.pro/en/docs/api/ai-model/rerank/create-rerank) |
-| 💬 Claude | Format Messages | [Documentation](https://docs.newapi.pro/api/anthropic-chat) |
+| 💬 Claude | Format Messages | [Documentation](https://docs.newapi.pro/en/docs/api/ai-model/chat/create-message) |
-| 🌐 Gemini | Format Google Gemini | [Documentation](https://docs.newapi.pro/api/google-gemini-chat/) |
+| 🌐 Gemini | Format Google Gemini | [Documentation](https://doc.newapi.pro/en/api/google-gemini-chat) |
 | 🔧 Dify | Mode ChatFlow | - |
 | 🎯 Personnalisé | Prise en charge de l'adresse d'appel complète | - |
@@ -259,16 +260,16 @@ docker run --name new-api -d --restart always \
 <details>
 <summary>Voir la liste complète des interfaces</summary>
-- [Interface de discussion (Chat Completions)](https://docs.newapi.pro/api/openai-chat)
+- [Interface de discussion (Chat Completions)](https://docs.newapi.pro/en/docs/api/ai-model/chat/openai/create-chat-completion)
-- [Interface de réponse (Responses)](https://docs.newapi.pro/api/openai-responses)
+- [Interface de réponse (Responses)](https://docs.newapi.pro/en/docs/api/ai-model/chat/openai/create-response)
-- [Interface d'image (Image)](https://docs.newapi.pro/api/openai-image)
+- [Interface d'image (Image)](https://docs.newapi.pro/en/docs/api/ai-model/images/openai/v1-images-generations--post)
-- [Interface audio (Audio)](https://docs.newapi.pro/api/openai-audio)
+- [Interface audio (Audio)](https://docs.newapi.pro/en/docs/api/ai-model/audio/openai/create-transcription)
-- [Interface vidéo (Video)](https://docs.newapi.pro/api/openai-video)
+- [Interface vidéo (Video)](https://docs.newapi.pro/en/docs/api/ai-model/videos/create-video-generation)
-- [Interface d'incorporation (Embeddings)](https://docs.newapi.pro/api/openai-embeddings)
+- [Interface d'incorporation (Embeddings)](https://docs.newapi.pro/en/docs/api/ai-model/embeddings/create-embedding)
-- [Interface de rerank (Rerank)](https://docs.newapi.pro/api/jinaai-rerank)
+- [Interface de rerank (Rerank)](https://docs.newapi.pro/en/docs/api/ai-model/rerank/create-rerank)
-- [Conversation en temps réel (Realtime)](https://docs.newapi.pro/api/openai-realtime)
+- [Conversation en temps réel (Realtime)](https://docs.newapi.pro/en/docs/api/ai-model/realtime/create-realtime-session)
-- [Discussion Claude](https://docs.newapi.pro/api/anthropic-chat)
+- [Discussion Claude](https://docs.newapi.pro/en/docs/api/ai-model/chat/create-message)
-- [Discussion Google Gemini](https://docs.newapi.pro/api/google-gemini-chat/)
+- [Discussion Google Gemini](https://doc.newapi.pro/en/api/google-gemini-chat)
 </details>
@@ -300,10 +301,11 @@ docker run --name new-api -d --restart always \
 | `REDIS_CONN_STRING` | Chaine de connexion Redis | - |
 | `STREAMING_TIMEOUT` | Délai d'expiration du streaming (secondes) | `300` |
 | `STREAM_SCANNER_MAX_BUFFER_MB` | Taille max du buffer par ligne (Mo) pour le scanner SSE ; à augmenter quand les sorties image/base64 sont très volumineuses (ex. images 4K) | `64` |
 | `MAX_REQUEST_BODY_MB` | Taille maximale du corps de requête (Mo, comptée **après décompression** ; évite les requêtes énormes/zip bombs qui saturent la mémoire). Dépassement ⇒ `413` | `32` |
 | `AZURE_DEFAULT_API_VERSION` | Version de l'API Azure | `2025-04-01-preview` |
 | `ERROR_LOG_ENABLED` | Interrupteur du journal d'erreurs | `false` |
-📖 **Configuration complète:** [Documentation des variables d'environnement](https://docs.newapi.pro/installation/environment-variables)
+📖 **Configuration complète:** [Documentation des variables d'environnement](https://docs.newapi.pro/en/docs/installation/config-maintenance/environment-variables)
 </details>
@@ -403,10 +405,10 @@ docker run --name new-api -d --restart always \
 | Ressource | Lien |
 |------|------|
-| 📘 FAQ | [FAQ](https://docs.newapi.pro/support/faq) |
+| 📘 FAQ | [FAQ](https://docs.newapi.pro/en/docs/support/faq) |
-| 💬 Interaction avec la communauté | [Canaux de communication](https://docs.newapi.pro/support/community-interaction) |
+| 💬 Interaction avec la communauté | [Canaux de communication](https://docs.newapi.pro/en/docs/support/community-interaction) |
-| 🐛 Commentaires sur les problèmes | [Commentaires sur les problèmes](https://docs.newapi.pro/support/feedback-issues) |
+| 🐛 Commentaires sur les problèmes | [Commentaires sur les problèmes](https://docs.newapi.pro/en/docs/support/feedback-issues) |
-| 📚 Documentation complète | [Documentation officielle](https://docs.newapi.pro/support) |
+| 📚 Documentation complète | [Documentation officielle](https://docs.newapi.pro/en/docs) |
 ### 🤝 Guide de contribution
@@ -435,7 +437,7 @@ Bienvenue à toutes les formes de contribution!
 Si ce projet vous est utile, bienvenue à nous donner une ⭐️ Étoile！
-**[Documentation officielle](https://docs.newapi.pro/)** • **[Commentaires sur les problèmes](https://github.com/Calcium-Ion/new-api/issues)** • **[Dernière version](https://github.com/Calcium-Ion/new-api/releases)**
+**[Documentation officielle](https://docs.newapi.pro/en/docs)** • **[Commentaires sur les problèmes](https://github.com/Calcium-Ion/new-api/issues)** • **[Dernière version](https://github.com/Calcium-Ion/new-api/releases)**
 <sub>Construit avec ❤️ par QuantumNous</sub>
--- a/README.ja.md
+++ b/README.ja.md
@@ -146,7 +146,7 @@ docker run --name new-api -d --restart always \
 🎉 デプロイが完了したら、`http://localhost:3000` にアクセスして使用を開始してください！
-📖 その他のデプロイ方法については[デプロイガイド](https://docs.newapi.pro/installation)を参照してください。
+📖 その他のデプロイ方法については[デプロイガイド](https://docs.newapi.pro/ja/docs/installation)を参照してください。
 ---
@@ -154,7 +154,7 @@ docker run --name new-api -d --restart always \
 <div align="center">
-### 📖 [公式ドキュメント](https://docs.newapi.pro/) | [![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/QuantumNous/new-api)
+### 📖 [公式ドキュメント](https://docs.newapi.pro/ja/docs) | [![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/QuantumNous/new-api)
 </div>
@@ -162,17 +162,17 @@ docker run --name new-api -d --restart always \
 | カテゴリ | リンク |
 |------|------|
-| 🚀 デプロイガイド | [インストールドキュメント](https://docs.newapi.pro/installation) |
+| 🚀 デプロイガイド | [インストールドキュメント](https://docs.newapi.pro/ja/docs/installation) |
-| ⚙️ 環境設定 | [環境変数](https://docs.newapi.pro/installation/environment-variables) |
+| ⚙️ 環境設定 | [環境変数](https://docs.newapi.pro/ja/docs/installation/config-maintenance/environment-variables) |
-| 📡 APIドキュメント | [APIドキュメント](https://docs.newapi.pro/api) |
+| 📡 APIドキュメント | [APIドキュメント](https://docs.newapi.pro/ja/docs/api) |
-| ❓ よくある質問 | [FAQ](https://docs.newapi.pro/support/faq) |
+| ❓ よくある質問 | [FAQ](https://docs.newapi.pro/ja/docs/support/faq) |
-| 💬 コミュニティ交流 | [交流チャネル](https://docs.newapi.pro/support/community-interaction) |
+| 💬 コミュニティ交流 | [交流チャネル](https://docs.newapi.pro/ja/docs/support/community-interaction) |
 ---
 ## ✨ 主な機能
-> 詳細な機能については[機能説明](https://docs.newapi.pro/wiki/features-introduction)を参照してください。
+> 詳細な機能については[機能説明](https://docs.newapi.pro/ja/docs/guide/wiki/basic-concepts/features-introduction)を参照してください。
 ### 🎨 コア機能
@@ -202,15 +202,15 @@ docker run --name new-api -d --restart always \
 ### 🚀 高度な機能
 **APIフォーマットサポート:**
-- ⚡ [OpenAI Responses](https://docs.newapi.pro/api/openai-responses)
+- ⚡ [OpenAI Responses](https://docs.newapi.pro/ja/docs/api/ai-model/chat/openai/create-response)
-- ⚡ [OpenAI Realtime API](https://docs.newapi.pro/api/openai-realtime)（Azureを含む）
+- ⚡ [OpenAI Realtime API](https://docs.newapi.pro/ja/docs/api/ai-model/realtime/create-realtime-session)（Azureを含む）
-- ⚡ [Claude Messages](https://docs.newapi.pro/api/anthropic-chat)
+- ⚡ [Claude Messages](https://docs.newapi.pro/ja/docs/api/ai-model/chat/create-message)
-- ⚡ [Google Gemini](https://docs.newapi.pro/api/google-gemini-chat/)
+- ⚡ [Google Gemini](https://doc.newapi.pro/ja/api/google-gemini-chat)
-- 🔄 [Rerankモデル](https://docs.newapi.pro/api/jinaai-rerank)
+- 🔄 [Rerankモデル](https://docs.newapi.pro/ja/docs/api/ai-model/rerank/create-rerank)
-- ⚡ [OpenAI Realtime API](https://docs.newapi.pro/api/openai-realtime)
+- ⚡ [OpenAI Realtime API](https://docs.newapi.pro/ja/docs/api/ai-model/realtime/create-realtime-session)
-- ⚡ [Claude Messages](https://docs.newapi.pro/api/anthropic-chat)
+- ⚡ [Claude Messages](https://docs.newapi.pro/ja/docs/api/ai-model/chat/create-message)
-- ⚡ [Google Gemini](https://docs.newapi.pro/api/google-gemini-chat/)
+- ⚡ [Google Gemini](https://doc.newapi.pro/ja/api/google-gemini-chat)
-- 🔄 [Rerankモデル](https://docs.newapi.pro/api/jinaai-rerank)（Cohere、Jina）
+- 🔄 [Rerankモデル](https://docs.newapi.pro/ja/docs/api/ai-model/rerank/create-rerank)（Cohere、Jina）
 **インテリジェントルーティング:**
 - ⚖️ チャネル重み付けランダム
@@ -243,6 +243,7 @@ docker run --name new-api -d --restart always \
 - `gemini-2.5-flash-nothinking` - 思考モードを無効にする
 - `gemini-2.5-pro-thinking` - 思考モードを有効にする
 - `gemini-2.5-pro-thinking-128` - 思考モードを有効にし、思考予算を128トークンに設定する
 - Gemini モデル名の末尾に `-low` / `-medium` / `-high` を付けることで推論強度を直接指定できます（追加の思考予算サフィックスは不要です）。
 </details>
@@ -250,16 +251,16 @@ docker run --name new-api -d --restart always \
 ## 🤖 モデルサポート
-> 詳細については[APIドキュメント - 中継インターフェース](https://docs.newapi.pro/api)
+> 詳細については[APIドキュメント - 中継インターフェース](https://docs.newapi.pro/ja/docs/api)
 | モデルタイプ | 説明 | ドキュメント |
 |---------|------|------|
 | 🤖 OpenAI GPTs | gpt-4-gizmo-* シリーズ | - |
-| 🎨 Midjourney-Proxy | [Midjourney-Proxy(Plus)](https://github.com/novicezk/midjourney-proxy) | [ドキュメント](https://docs.newapi.pro/api/midjourney-proxy-image) |
+| 🎨 Midjourney-Proxy | [Midjourney-Proxy(Plus)](https://github.com/novicezk/midjourney-proxy) | [ドキュメント](https://doc.newapi.pro/ja/api/midjourney-proxy-image) |
-| 🎵 Suno-API | [Suno API](https://github.com/Suno-API/Suno-API) | [ドキュメント](https://docs.newapi.pro/api/suno-music) |
+| 🎵 Suno-API | [Suno API](https://github.com/Suno-API/Suno-API) | [ドキュメント](https://doc.newapi.pro/ja/api/suno-music) |
-| 🔄 Rerank | Cohere、Jina | [ドキュメント](https://docs.newapi.pro/api/jinaai-rerank) |
+| 🔄 Rerank | Cohere、Jina | [ドキュメント](https://docs.newapi.pro/ja/docs/api/ai-model/rerank/create-rerank) |
-| 💬 Claude | Messagesフォーマット | [ドキュメント](https://docs.newapi.pro/api/suno-music) |
+| 💬 Claude | Messagesフォーマット | [ドキュメント](https://docs.newapi.pro/ja/docs/api/ai-model/chat/create-message) |
-| 🌐 Gemini | Google Geminiフォーマット | [ドキュメント](https://docs.newapi.pro/api/google-gemini-chat/) |
+| 🌐 Gemini | Google Geminiフォーマット | [ドキュメント](https://doc.newapi.pro/ja/api/google-gemini-chat) |
 | 🔧 Dify | ChatFlowモード | - |
 | 🎯 カスタム | 完全な呼び出しアドレスの入力をサポート | - |
@@ -268,16 +269,16 @@ docker run --name new-api -d --restart always \
 <details>
 <summary>完全なインターフェースリストを表示</summary>
-- [チャットインターフェース (Chat Completions)](https://docs.newapi.pro/api/openai-chat)
+- [チャットインターフェース (Chat Completions)](https://docs.newapi.pro/ja/docs/api/ai-model/chat/openai/create-chat-completion)
-- [レスポンスインターフェース (Responses)](https://docs.newapi.pro/api/openai-responses)
+- [レスポンスインターフェース (Responses)](https://docs.newapi.pro/ja/docs/api/ai-model/chat/openai/create-response)
-- [イメージインターフェース (Image)](https://docs.newapi.pro/api/openai-image)
+- [イメージインターフェース (Image)](https://docs.newapi.pro/ja/docs/api/ai-model/images/openai/v1-images-generations--post)
-- [オーディオインターフェース (Audio)](https://docs.newapi.pro/api/openai-audio)
+- [オーディオインターフェース (Audio)](https://docs.newapi.pro/ja/docs/api/ai-model/audio/openai/create-transcription)
-- [ビデオインターフェース (Video)](https://docs.newapi.pro/api/openai-video)
+- [ビデオインターフェース (Video)](https://docs.newapi.pro/ja/docs/api/ai-model/videos/create-video-generation)
-- [エンベッドインターフェース (Embeddings)](https://docs.newapi.pro/api/openai-embeddings)
+- [エンベッドインターフェース (Embeddings)](https://docs.newapi.pro/ja/docs/api/ai-model/embeddings/create-embedding)
-- [再ランク付けインターフェース (Rerank)](https://docs.newapi.pro/api/jinaai-rerank)
+- [再ランク付けインターフェース (Rerank)](https://docs.newapi.pro/ja/docs/api/ai-model/rerank/create-rerank)
-- [リアルタイム対話インターフェース (Realtime)](https://docs.newapi.pro/api/openai-realtime)
+- [リアルタイム対話インターフェース (Realtime)](https://docs.newapi.pro/ja/docs/api/ai-model/realtime/create-realtime-session)
-- [Claudeチャット](https://docs.newapi.pro/api/anthropic-chat)
+- [Claudeチャット](https://docs.newapi.pro/ja/docs/api/ai-model/chat/create-message)
-- [Google Geminiチャット](https://docs.newapi.pro/api/google-gemini-chat/)
+- [Google Geminiチャット](https://doc.newapi.pro/ja/api/google-gemini-chat)
 </details>
@@ -309,10 +310,11 @@ docker run --name new-api -d --restart always \
 | `REDIS_CONN_STRING` | Redis接続文字列 | - |
 | `STREAMING_TIMEOUT` | ストリーミング応答のタイムアウト時間（秒） | `300` |
 | `STREAM_SCANNER_MAX_BUFFER_MB` | ストリームスキャナの1行あたりバッファ上限（MB）。4K画像など巨大なbase64 `data:` ペイロードを扱う場合は値を増加させてください | `64` |
 | `MAX_REQUEST_BODY_MB` | リクエストボディ最大サイズ（MB、**解凍後**に計測。巨大リクエスト/zip bomb によるメモリ枯渇を防止）。超過時は `413` | `32` |
 | `AZURE_DEFAULT_API_VERSION` | Azure APIバージョン | `2025-04-01-preview` |
 | `ERROR_LOG_ENABLED` | エラーログスイッチ | `false` |
-📖 **完全な設定:** [環境変数ドキュメント](https://docs.newapi.pro/installation/environment-variables)
+📖 **完全な設定:** [環境変数ドキュメント](https://docs.newapi.pro/ja/docs/installation/config-maintenance/environment-variables)
 </details>
@@ -412,10 +414,10 @@ docker run --name new-api -d --restart always \
 | リソース | リンク |
 |------|------|
-| 📘 よくある質問 | [FAQ](https://docs.newapi.pro/support/faq) |
+| 📘 よくある質問 | [FAQ](https://docs.newapi.pro/ja/docs/support/faq) |
-| 💬 コミュニティ交流 | [交流チャネル](https://docs.newapi.pro/support/community-interaction) |
+| 💬 コミュニティ交流 | [交流チャネル](https://docs.newapi.pro/ja/docs/support/community-interaction) |
-| 🐛 問題のフィードバック | [問題フィードバック](https://docs.newapi.pro/support/feedback-issues) |
+| 🐛 問題のフィードバック | [問題フィードバック](https://docs.newapi.pro/ja/docs/support/feedback-issues) |
-| 📚 完全なドキュメント | [公式ドキュメント](https://docs.newapi.pro/support) |
+| 📚 完全なドキュメント | [公式ドキュメント](https://docs.newapi.pro/ja/docs) |
 ### 🤝 貢献ガイド
@@ -444,7 +446,7 @@ docker run --name new-api -d --restart always \
 このプロジェクトがあなたのお役に立てたなら、ぜひ ⭐️ スターをください！
-**[公式ドキュメント](https://docs.newapi.pro/)** • **[問題フィードバック](https://github.com/Calcium-Ion/new-api/issues)** • **[最新リリース](https://github.com/Calcium-Ion/new-api/releases)**
+**[公式ドキュメント](https://docs.newapi.pro/ja/docs)** • **[問題フィードバック](https://github.com/Calcium-Ion/new-api/issues)** • **[最新リリース](https://github.com/Calcium-Ion/new-api/releases)**
 <sub>❤️ で構築された QuantumNous</sub>
--- a/README.md
+++ b/README.md
@@ -146,7 +146,7 @@ docker run --name new-api -d --restart always \
 🎉 部署完成后，访问 `http://localhost:3000` 即可使用！
-📖 更多部署方式请参考 [部署指南](https://docs.newapi.pro/installation)
+📖 更多部署方式请参考 [部署指南](https://docs.newapi.pro/zh/docs/installation)
 ---
@@ -154,7 +154,7 @@ docker run --name new-api -d --restart always \
 <div align="center">
-### 📖 [官方文档](https://docs.newapi.pro/) | [![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/QuantumNous/new-api)
+### 📖 [官方文档](https://docs.newapi.pro/zh/docs) | [![Ask DeepWiki](https://deepwiki.com/badge.svg)](https://deepwiki.com/QuantumNous/new-api)
 </div>
@@ -162,17 +162,17 @@ docker run --name new-api -d --restart always \
 | 分类 | 链接 |
 |------|------|
-| 🚀 部署指南 | [安装文档](https://docs.newapi.pro/installation) |
+| 🚀 部署指南 | [安装文档](https://docs.newapi.pro/zh/docs/installation) |
-| ⚙️ 环境配置 | [环境变量](https://docs.newapi.pro/installation/environment-variables) |
+| ⚙️ 环境配置 | [环境变量](https://docs.newapi.pro/zh/docs/installation/config-maintenance/environment-variables) |
-| 📡 接口文档 | [API 文档](https://docs.newapi.pro/api) |
+| 📡 接口文档 | [API 文档](https://docs.newapi.pro/zh/docs/api) |
-| ❓ 常见问题 | [FAQ](https://docs.newapi.pro/support/faq) |
+| ❓ 常见问题 | [FAQ](https://docs.newapi.pro/zh/docs/support/faq) |
-| 💬 社区交流 | [交流渠道](https://docs.newapi.pro/support/community-interaction) |
+| 💬 社区交流 | [交流渠道](https://docs.newapi.pro/zh/docs/support/community-interaction) |
 ---
 ## ✨ 主要特性
-> 详细特性请参考 [特性说明](https://docs.newapi.pro/wiki/features-introduction)
+> 详细特性请参考 [特性说明](https://docs.newapi.pro/zh/docs/guide/wiki/basic-concepts/features-introduction)
 ### 🎨 核心功能
@@ -202,11 +202,11 @@ docker run --name new-api -d --restart always \
 ### 🚀 高级功能
 **API 格式支持：**
- ⚡ [OpenAI Responses](https://docs.newapi.pro/api/openai-responses)
+- ⚡ [OpenAI Responses](https://docs.newapi.pro/zh/docs/api/ai-model/chat/openai/create-response)
- ⚡ [OpenAI Realtime API](https://docs.newapi.pro/api/openai-realtime)（含 Azure）
+- ⚡ [OpenAI Realtime API](https://docs.newapi.pro/zh/docs/api/ai-model/realtime/create-realtime-session)（含 Azure）
- ⚡ [Claude Messages](https://docs.newapi.pro/api/anthropic-chat)
+- ⚡ [Claude Messages](https://docs.newapi.pro/zh/docs/api/ai-model/chat/create-message)
- ⚡ [Google Gemini](https://docs.newapi.pro/api/google-gemini-chat/)
+- ⚡ [Google Gemini](https://doc.newapi.pro/api/google-gemini-chat)
- 🔄 [Rerank 模型](https://docs.newapi.pro/api/jinaai-rerank)（Cohere、Jina）
+- 🔄 [Rerank 模型](https://docs.newapi.pro/zh/docs/api/ai-model/rerank/create-rerank)（Cohere、Jina）
 **智能路由：**
 - ⚖️ 渠道加权随机
@@ -239,6 +239,7 @@ docker run --name new-api -d --restart always \
 - `gemini-2.5-flash-nothinking` - 禁用思考模式
 - `gemini-2.5-pro-thinking` - 启用思考模式
 - `gemini-2.5-pro-thinking-128` - 启用思考模式，并设置思考预算为128tokens
+- 也可以直接在 Gemini 模型名称后追加 `-low` / `-medium` / `-high` 来控制思考力度（无需再设置思考预算后缀）
 </details>
@@ -246,16 +247,16 @@ docker run --name new-api -d --restart always \
 ## 🤖 模型支持
-> 详情请参考 [接口文档 - 中继接口](https://docs.newapi.pro/api)
+> 详情请参考 [接口文档 - 中继接口](https://docs.newapi.pro/zh/docs/api)
 | 模型类型 | 说明 | 文档 |
 |---------|------|------|
 | 🤖 OpenAI GPTs | gpt-4-gizmo-* 系列 | - |
-| 🎨 Midjourney-Proxy | [Midjourney-Proxy(Plus)](https://github.com/novicezk/midjourney-proxy) | [文档](https://docs.newapi.pro/api/midjourney-proxy-image) |
+| 🎨 Midjourney-Proxy | [Midjourney-Proxy(Plus)](https://github.com/novicezk/midjourney-proxy) | [文档](https://doc.newapi.pro/api/midjourney-proxy-image) |
-| 🎵 Suno-API | [Suno API](https://github.com/Suno-API/Suno-API) | [文档](https://docs.newapi.pro/api/suno-music) |
+| 🎵 Suno-API | [Suno API](https://github.com/Suno-API/Suno-API) | [文档](https://doc.newapi.pro/api/suno-music) |
-| 🔄 Rerank | Cohere、Jina | [文档](https://docs.newapi.pro/api/jinaai-rerank) |
+| 🔄 Rerank | Cohere、Jina | [文档](https://docs.newapi.pro/zh/docs/api/ai-model/rerank/create-rerank) |
-| 💬 Claude | Messages 格式 | [文档](https://docs.newapi.pro/api/anthropic-chat) |
+| 💬 Claude | Messages 格式 | [文档](https://docs.newapi.pro/zh/docs/api/ai-model/chat/create-message) |
-| 🌐 Gemini | Google Gemini 格式 | [文档](https://docs.newapi.pro/api/google-gemini-chat/) |
+| 🌐 Gemini | Google Gemini 格式 | [文档](https://doc.newapi.pro/api/google-gemini-chat) |
 | 🔧 Dify | ChatFlow 模式 | - |
 | 🎯 自定义 | 支持完整调用地址 | - |
@@ -264,16 +265,16 @@ docker run --name new-api -d --restart always \
 <details>
 <summary>查看完整接口列表</summary>
- [聊天接口 (Chat Completions)](https://docs.newapi.pro/api/openai-chat)
+- [聊天接口 (Chat Completions)](https://docs.newapi.pro/zh/docs/api/ai-model/chat/openai/create-chat-completion)
- [响应接口 (Responses)](https://docs.newapi.pro/api/openai-responses)
+- [响应接口 (Responses)](https://docs.newapi.pro/zh/docs/api/ai-model/chat/openai/create-response)
- [图像接口 (Image)](https://docs.newapi.pro/api/openai-image)
+- [图像接口 (Image)](https://docs.newapi.pro/zh/docs/api/ai-model/images/openai/v1-images-generations--post)
- [音频接口 (Audio)](https://docs.newapi.pro/api/openai-audio)
+- [音频接口 (Audio)](https://docs.newapi.pro/zh/docs/api/ai-model/audio/openai/create-transcription)
- [视频接口 (Video)](https://docs.newapi.pro/api/openai-video)
+- [视频接口 (Video)](https://docs.newapi.pro/zh/docs/api/ai-model/videos/create-video-generation)
- [嵌入接口 (Embeddings)](https://docs.newapi.pro/api/openai-embeddings)
+- [嵌入接口 (Embeddings)](https://docs.newapi.pro/zh/docs/api/ai-model/embeddings/create-embedding)
- [重排序接口 (Rerank)](https://docs.newapi.pro/api/jinaai-rerank)
+- [重排序接口 (Rerank)](https://docs.newapi.pro/zh/docs/api/ai-model/rerank/create-rerank)
- [实时对话 (Realtime)](https://docs.newapi.pro/api/openai-realtime)
+- [实时对话 (Realtime)](https://docs.newapi.pro/zh/docs/api/ai-model/realtime/create-realtime-session)
- [Claude 聊天](https://docs.newapi.pro/api/anthropic-chat)
+- [Claude 聊天](https://docs.newapi.pro/zh/docs/api/ai-model/chat/create-message)
- [Google Gemini 聊天](https://docs.newapi.pro/api/google-gemini-chat)
+- [Google Gemini 聊天](https://doc.newapi.pro/api/google-gemini-chat)
 </details>
@@ -305,10 +306,11 @@ docker run --name new-api -d --restart always \
 | `REDIS_CONN_STRING` | Redis 连接字符串                                                  | - |
 | `STREAMING_TIMEOUT` | 流式超时时间（秒）                                                    | `300` |
 | `STREAM_SCANNER_MAX_BUFFER_MB` | 流式扫描器单行最大缓冲（MB），图像生成等超大 `data:` 片段（如 4K 图片 base64）需适当调大 | `64` |
+| `MAX_REQUEST_BODY_MB` | 请求体最大大小（MB，**解压后**计；防止超大请求/zip bomb 导致内存暴涨），超过将返回 `413` | `32` |
 | `AZURE_DEFAULT_API_VERSION` | Azure API 版本                                                 | `2025-04-01-preview` |
 | `ERROR_LOG_ENABLED` | 错误日志开关                                                       | `false` |
-📖 **完整配置：** [环境变量文档](https://docs.newapi.pro/installation/environment-variables)
+📖 **完整配置：** [环境变量文档](https://docs.newapi.pro/zh/docs/installation/config-maintenance/environment-variables)
 </details>
@@ -410,10 +412,10 @@ docker run --name new-api -d --restart always \
 | 资源 | 链接 |
 |------|------|
-| 📘 常见问题 | [FAQ](https://docs.newapi.pro/support/faq) |
+| 📘 常见问题 | [FAQ](https://docs.newapi.pro/zh/docs/support/faq) |
-| 💬 社区交流 | [交流渠道](https://docs.newapi.pro/support/community-interaction) |
+| 💬 社区交流 | [交流渠道](https://docs.newapi.pro/zh/docs/support/community-interaction) |
-| 🐛 反馈问题 | [问题反馈](https://docs.newapi.pro/support/feedback-issues) |
+| 🐛 反馈问题 | [问题反馈](https://docs.newapi.pro/zh/docs/support/feedback-issues) |
-| 📚 完整文档 | [官方文档](https://docs.newapi.pro/support) |
+| 📚 完整文档 | [官方文档](https://docs.newapi.pro/zh/docs) |
 ### 🤝 贡献指南
@@ -442,7 +444,7 @@ docker run --name new-api -d --restart always \
 如果这个项目对你有帮助，欢迎给我们一个 ⭐️ Star！
-**[官方文档](https://docs.newapi.pro/)** • **[问题反馈](https://github.com/Calcium-Ion/new-api/issues)** • **[最新发布](https://github.com/Calcium-Ion/new-api/releases)**
+**[官方文档](https://docs.newapi.pro/zh/docs)** • **[问题反馈](https://github.com/Calcium-Ion/new-api/issues)** • **[最新发布](https://github.com/Calcium-Ion/new-api/releases)**
 <sub>Built with ❤️ by QuantumNous</sub>
--- a/common/audio.go
+++ b/common/audio.go
@@ -71,15 +71,66 @@ func getMP3Duration(r io.Reader) (float64, error) {
 // getWAVDuration 解析 WAV 文件头以获取时长。
 func getWAVDuration(r io.ReadSeeker) (float64, error) {
 	// 1. 强制复位指针
 	r.Seek(0, io.SeekStart)
 	dec := wav.NewDecoder(r)
 	// IsValidFile 会读取 fmt 块
 	if !dec.IsValidFile() {
 		return 0, errors.New("invalid wav file")
 	}
-	d, err := dec.Duration()
-	if err != nil {
-		return 0, errors.Wrap(err, "failed to get wav duration")
+
+	// 尝试寻找 data 块
+	if err := dec.FwdToPCM(); err != nil {
+		return 0, errors.Wrap(err, "failed to find PCM data chunk")
 	}
-	return d.Seconds(), nil
+
+	pcmSize := int64(dec.PCMSize)
+	// 如果读出来的 Size 是 0，尝试用文件大小反推
+	if pcmSize == 0 {
+		// 获取文件总大小
+		currentPos, _ := r.Seek(0, io.SeekCurrent) // 当前通常在 data chunk header 之后
+		endPos, _ := r.Seek(0, io.SeekEnd)
+		fileSize := endPos
+		// 恢复位置（虽然如果不继续读也没关系）
+		r.Seek(currentPos, io.SeekStart)
+		// 数据区大小 ≈ 文件总大小 - 当前指针位置(即Header大小)
+		// 注意：FwdToPCM 成功后，CurrentPos 应该刚好指向 Data 区数据的开始
+		// 或者是 Data Chunk ID + Size 之后。
+		// WAV Header 一般 44 字节。
+		if fileSize > 44 {
+			// 如果 FwdToPCM 成功，Reader 应该位于 data 块的数据起始处
+			// 所以剩余的所有字节理论上都是音频数据
+			pcmSize = fileSize - currentPos
+			// 简单的兜底：如果算出来还是负数或0，强制按文件大小-44计算
+			if pcmSize <= 0 {
+				pcmSize = fileSize - 44
+			}
+		}
+	}
+	numChans := int64(dec.NumChans)
+	bitDepth := int64(dec.BitDepth)
+	sampleRate := float64(dec.SampleRate)
+	if sampleRate == 0 || numChans == 0 || bitDepth == 0 {
+		return 0, errors.New("invalid wav header metadata")
+	}
+	bytesPerFrame := numChans * (bitDepth / 8)
+	if bytesPerFrame == 0 {
+		return 0, errors.New("invalid byte depth calculation")
+	}
+	totalFrames := pcmSize / bytesPerFrame
+	durationSeconds := float64(totalFrames) / sampleRate
+	return durationSeconds, nil
 }
 // getFLACDuration 解析 FLAC 文件的 STREAMINFO 块。
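Review note on the `getWAVDuration` change above: the new code derives the duration from the PCM payload size instead of calling `dec.Duration()`. The arithmetic can be sketched in isolation as below. This is a minimal sketch, not the project's code; `wavDurationSeconds` is a hypothetical helper name, and the inputs stand in for the decoder fields (`PCMSize`, `NumChans`, `BitDepth`, `SampleRate`) used in the diff.

```go
package main

import (
	"errors"
	"fmt"
)

// wavDurationSeconds is a hypothetical helper that mirrors the arithmetic in
// the diff: frames = pcmSize / (channels * bytesPerSample), seconds = frames / rate.
func wavDurationSeconds(pcmSize, numChans, bitDepth int64, sampleRate float64) (float64, error) {
	if sampleRate == 0 || numChans == 0 || bitDepth == 0 {
		return 0, errors.New("invalid wav header metadata")
	}
	// Integer division: bit depths below 8 yield 0 and are rejected below.
	bytesPerFrame := numChans * (bitDepth / 8)
	if bytesPerFrame == 0 {
		return 0, errors.New("invalid byte depth calculation")
	}
	totalFrames := pcmSize / bytesPerFrame
	return float64(totalFrames) / sampleRate, nil
}

func main() {
	// 176400 bytes of 16-bit stereo audio at 44.1 kHz is 44100 frames.
	d, err := wavDurationSeconds(176400, 2, 16, 44100)
	fmt.Println(d, err)
}
```

Keeping the frame count as an integer truncates a trailing partial frame, which matches the behaviour of the code in the hunk.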
--- a/common/constants.go
+++ b/common/constants.go
@@ -121,6 +121,9 @@ var BatchUpdateInterval int
 var RelayTimeout int // unit is second
+var RelayMaxIdleConns int
+var RelayMaxIdleConnsPerHost int
 var GeminiSafetySetting string
 // https://docs.cohere.com/docs/safety-modes Type; NONE/CONTEXTUAL/STRICT
--- a/common/email.go
+++ b/common/email.go
@@ -32,7 +32,7 @@ func SendEmail(subject string, receiver string, content string) error {
 	}
 	encodedSubject := fmt.Sprintf("=?UTF-8?B?%s?=", base64.StdEncoding.EncodeToString([]byte(subject)))
 	mail := []byte(fmt.Sprintf("To: %s\r\n"+
-		"From: %s<%s>\r\n"+
+		"From: %s <%s>\r\n"+
 		"Subject: %s\r\n"+
 		"Date: %s\r\n"+
 		"Message-ID: %s\r\n"+ // 添加 Message-ID 头
--- a/common/gin.go
+++ b/common/gin.go
@@ -2,7 +2,7 @@ package common
 import (
 	"bytes"
-	"errors"
+	"fmt"
 	"io"
 	"mime"
 	"mime/multipart"
@@ -12,24 +12,61 @@ import (
 	"time"
 	"github.com/QuantumNous/new-api/constant"
 	"github.com/pkg/errors"
 	"github.com/gin-gonic/gin"
 )
 const KeyRequestBody = "key_request_body"
-func GetRequestBody(c *gin.Context) ([]byte, error) {
+var ErrRequestBodyTooLarge = errors.New("request body too large")
-	requestBody, _ := c.Get(KeyRequestBody)
+
-	if requestBody != nil {
+func IsRequestBodyTooLargeError(err error) bool {
-		return requestBody.([]byte), nil
+	if err == nil {
 		return false
 	}
-	requestBody, err := io.ReadAll(c.Request.Body)
+	if errors.Is(err, ErrRequestBodyTooLarge) {
 		return true
 	}
 	var mbe *http.MaxBytesError
 	return errors.As(err, &mbe)
 }
 func GetRequestBody(c *gin.Context) ([]byte, error) {
 	cached, exists := c.Get(KeyRequestBody)
 	if exists && cached != nil {
 		if b, ok := cached.([]byte); ok {
 			return b, nil
 		}
 	}
 	maxMB := constant.MaxRequestBodyMB
 	if maxMB <= 0 {
 		// no limit
 		body, err := io.ReadAll(c.Request.Body)
 		_ = c.Request.Body.Close()
 		if err != nil {
 			return nil, err
 		}
 		c.Set(KeyRequestBody, body)
 		return body, nil
 	}
 	maxBytes := int64(maxMB) << 20
 	limited := io.LimitReader(c.Request.Body, maxBytes+1)
 	body, err := io.ReadAll(limited)
 	if err != nil {
 		_ = c.Request.Body.Close()
 		if IsRequestBodyTooLargeError(err) {
 			return nil, errors.Wrap(ErrRequestBodyTooLarge, fmt.Sprintf("request body exceeds %d MB", maxMB))
 		}
 		return nil, err
 	}
 	_ = c.Request.Body.Close()
-	c.Set(KeyRequestBody, requestBody)
+	if int64(len(body)) > maxBytes {
-	return requestBody.([]byte), nil
+		return nil, errors.Wrap(ErrRequestBodyTooLarge, fmt.Sprintf("request body exceeds %d MB", maxMB))
 	}
 	c.Set(KeyRequestBody, body)
 	return body, nil
 }
 func UnmarshalBodyReusable(c *gin.Context, v any) error {
--- a/common/init.go
+++ b/common/init.go
@@ -90,6 +90,8 @@ func InitEnv() {
 	SyncFrequency = GetEnvOrDefault("SYNC_FREQUENCY", 60)
 	BatchUpdateInterval = GetEnvOrDefault("BATCH_UPDATE_INTERVAL", 5)
 	RelayTimeout = GetEnvOrDefault("RELAY_TIMEOUT", 0)
+	RelayMaxIdleConns = GetEnvOrDefault("RELAY_MAX_IDLE_CONNS", 500)
+	RelayMaxIdleConnsPerHost = GetEnvOrDefault("RELAY_MAX_IDLE_CONNS_PER_HOST", 100)
 	// Initialize string variables with GetEnvOrDefaultString
 	GeminiSafetySetting = GetEnvOrDefaultString("GEMINI_SAFETY_SETTING", "BLOCK_NONE")
@@ -115,6 +117,8 @@ func initConstantEnv() {
 	constant.DifyDebug = GetEnvOrDefaultBool("DIFY_DEBUG", true)
 	constant.MaxFileDownloadMB = GetEnvOrDefault("MAX_FILE_DOWNLOAD_MB", 20)
 	constant.StreamScannerMaxBufferMB = GetEnvOrDefault("STREAM_SCANNER_MAX_BUFFER_MB", 64)
+	// MaxRequestBodyMB 请求体最大大小（解压后），用于防止超大请求/zip bomb导致内存暴涨
+	constant.MaxRequestBodyMB = GetEnvOrDefault("MAX_REQUEST_BODY_MB", 32)
 	// ForceStreamOption 覆盖请求参数，强制返回usage信息
 	constant.ForceStreamOption = GetEnvOrDefaultBool("FORCE_STREAM_OPTION", true)
 	constant.CountToken = GetEnvOrDefaultBool("CountToken", true)
@@ -129,6 +133,8 @@ func initConstantEnv() {
 	constant.GenerateDefaultToken = GetEnvOrDefaultBool("GENERATE_DEFAULT_TOKEN", false)
 	// 是否启用错误日志
 	constant.ErrorLogEnabled = GetEnvOrDefaultBool("ERROR_LOG_ENABLED", false)
+	// 任务轮询时查询的最大数量
+	constant.TaskQueryLimit = GetEnvOrDefault("TASK_QUERY_LIMIT", 1000)
 	soraPatchStr := GetEnvOrDefaultString("TASK_PRICE_PATCH", "")
 	if soraPatchStr != "" {
--- a/common/ip.go
+++ b/common/ip.go
@@ -2,6 +2,15 @@ package common
 import "net"
+func IsIP(s string) bool {
+	ip := net.ParseIP(s)
+	return ip != nil
+}
+func ParseIP(s string) net.IP {
+	return net.ParseIP(s)
+}
 func IsPrivateIP(ip net.IP) bool {
 	if ip.IsLoopback() || ip.IsLinkLocalUnicast() || ip.IsLinkLocalMulticast() {
 		return true
@@ -20,3 +29,23 @@ func IsPrivateIP(ip net.IP) bool {
 	}
 	return false
 }
+func IsIpInCIDRList(ip net.IP, cidrList []string) bool {
+	for _, cidr := range cidrList {
+		_, network, err := net.ParseCIDR(cidr)
+		if err != nil {
+			// 尝试作为单个IP处理
+			if whitelistIP := net.ParseIP(cidr); whitelistIP != nil {
+				if ip.Equal(whitelistIP) {
+					return true
+				}
+			}
+			continue
+		}
+		if network.Contains(ip) {
+			return true
+		}
+	}
+	return false
+}
--- a/common/json.go
+++ b/common/json.go
@@ -23,11 +23,11 @@ func Marshal(v any) ([]byte, error) {
 }
 func GetJsonType(data json.RawMessage) string {
-	data = bytes.TrimSpace(data)
+	trimmed := bytes.TrimSpace(data)
-	if len(data) == 0 {
+	if len(trimmed) == 0 {
 		return "unknown"
 	}
-	firstChar := bytes.TrimSpace(data)[0]
+	firstChar := trimmed[0]
 	switch firstChar {
 	case '{':
 		return "object"
--- a/common/model.go
+++ b/common/model.go
@@ -17,6 +17,13 @@ var (
 		"flux-",
 		"flux.1-",
 	}
+	OpenAITextModels = []string{
+		"gpt-",
+		"o1",
+		"o3",
+		"o4",
+		"chatgpt",
+	}
 )
 func IsOpenAIResponseOnlyModel(modelName string) bool {
@@ -40,3 +47,13 @@ func IsImageGenerationModel(modelName string) bool {
 	}
 	return false
 }
+func IsOpenAITextModel(modelName string) bool {
+	modelName = strings.ToLower(modelName)
+	for _, m := range OpenAITextModels {
+		if strings.Contains(modelName, m) {
+			return true
+		}
+	}
+	return false
+}
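Review note on `IsOpenAITextModel` above: the check lowercases the model name and then uses substring matching, not prefix matching. The sketch below reproduces that behaviour under the hypothetical name `isOpenAITextModel` to show what it accepts. The model names in `main` are illustrative inputs, not a claim about which models the project routes.

```go
package main

import (
	"fmt"
	"strings"
)

// openAITextModels copies the marker list from the diff.
var openAITextModels = []string{"gpt-", "o1", "o3", "o4", "chatgpt"}

// isOpenAITextModel is a hypothetical standalone copy of the check: it reports
// whether the lowercased name contains any marker anywhere in the string.
func isOpenAITextModel(modelName string) bool {
	modelName = strings.ToLower(modelName)
	for _, m := range openAITextModels {
		if strings.Contains(modelName, m) {
			return true
		}
	}
	return false
}

func main() {
	for _, name := range []string{"GPT-4o", "o3-mini", "claude-3-5-sonnet", "yolo11"} {
		fmt.Println(name, isOpenAITextModel(name))
	}
}
```

Because short markers such as `o1` match anywhere, an unrelated name like `yolo11` is also accepted; whether that matters depends on how the callers use the result.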
--- a/common/ssrf_protection.go
+++ b/common/ssrf_protection.go
@@ -186,23 +186,7 @@ func isIPListed(ip net.IP, list []string) bool {
 		return false
 	}
-	for _, whitelistCIDR := range list {
-		_, network, err := net.ParseCIDR(whitelistCIDR)
-		if err != nil {
-			// 尝试作为单个IP处理
-			if whitelistIP := net.ParseIP(whitelistCIDR); whitelistIP != nil {
-				if ip.Equal(whitelistIP) {
-					return true
-				}
-			}
-			continue
-		}
-		if network.Contains(ip) {
-			return true
-		}
-	}
-	return false
+	return IsIpInCIDRList(ip, list)
 }
 // IsIPAccessAllowed 检查IP是否允许访问
--- a/common/str.go
+++ b/common/str.go
@@ -3,12 +3,19 @@ package common
 import (
 	"encoding/base64"
 	"encoding/json"
 	"math/rand"
 	"net/url"
 	"regexp"
 	"strconv"
 	"strings"
 	"unsafe"
 	"github.com/samber/lo"
 )
+var (
+	maskURLPattern    = regexp.MustCompile(`(http|https)://[^\s/$.?#].[^\s]*`)
+	maskDomainPattern = regexp.MustCompile(`\b(?:[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?\.)+[a-zA-Z]{2,}\b`)
+	maskIPPattern     = regexp.MustCompile(`\b(?:\d{1,3}\.){3}\d{1,3}\b`)
+)
 func GetStringIfEmpty(str string, defaultValue string) string {
@@ -19,12 +26,10 @@ func GetStringIfEmpty(str string, defaultValue string) string {
 }
 func GetRandomString(length int) string {
-	//rand.Seed(time.Now().UnixNano())
-	key := make([]byte, length)
-	for i := 0; i < length; i++ {
-		key[i] = keyChars[rand.Intn(len(keyChars))]
-	}
-	return string(key)
+	if length <= 0 {
+		return ""
+	}
+	return lo.RandomString(length, lo.AlphanumericCharset)
 }
 func MapToJsonStr(m map[string]interface{}) string {
@@ -170,8 +175,7 @@ func maskHostForPlainDomain(domain string) string {
 // api.openai.com -> ***.***.com
 func MaskSensitiveInfo(str string) string {
 	// Mask URLs
-	urlPattern := regexp.MustCompile(`(http|https)://[^\s/$.?#].[^\s]*`)
-	str = urlPattern.ReplaceAllStringFunc(str, func(urlStr string) string {
+	str = maskURLPattern.ReplaceAllStringFunc(str, func(urlStr string) string {
 		u, err := url.Parse(urlStr)
 		if err != nil {
 			return urlStr
@@ -224,14 +228,12 @@ func MaskSensitiveInfo(str string) string {
 	})
 	// Mask domain names without protocol (like openai.com, www.openai.com)
-	domainPattern := regexp.MustCompile(`\b(?:[a-zA-Z0-9](?:[a-zA-Z0-9-]{0,61}[a-zA-Z0-9])?\.)+[a-zA-Z]{2,}\b`)
-	str = domainPattern.ReplaceAllStringFunc(str, func(domain string) string {
+	str = maskDomainPattern.ReplaceAllStringFunc(str, func(domain string) string {
 		return maskHostForPlainDomain(domain)
 	})
 	// Mask IP addresses
-	ipPattern := regexp.MustCompile(`\b(?:\d{1,3}\.){3}\d{1,3}\b`)
-	str = ipPattern.ReplaceAllString(str, "***.***.***.***")
+	str = maskIPPattern.ReplaceAllString(str, "***.***.***.***")
 	return str
 }
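Review note on the `MaskSensitiveInfo` change above: the three patterns are now compiled once at package initialisation instead of on every call. The sketch below shows the same idea for the IP pattern alone. `maskIPs` is a hypothetical helper introduced for illustration; the pattern and replacement string are copied from the diff.

```go
package main

import (
	"fmt"
	"regexp"
)

// ipPattern is compiled once at package initialisation, so each call to
// maskIPs reuses the compiled program instead of rebuilding it.
var ipPattern = regexp.MustCompile(`\b(?:\d{1,3}\.){3}\d{1,3}\b`)

// maskIPs is a hypothetical helper that replaces every dotted-quad match with
// a fixed mask, as the last step of MaskSensitiveInfo does.
func maskIPs(s string) string {
	return ipPattern.ReplaceAllString(s, "***.***.***.***")
}

func main() {
	fmt.Println(maskIPs("dial 10.0.0.1:443 failed"))
}
```

A compiled `*regexp.Regexp` is safe for concurrent use, so sharing it across request goroutines needs no extra locking.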
--- a/common/utils.go
+++ b/common/utils.go
@@ -217,11 +217,6 @@ func IntMax(a int, b int) int {
 	}
 }
-func IsIP(s string) bool {
-	ip := net.ParseIP(s)
-	return ip != nil
-}
 func GetUUID() string {
 	code := uuid.New().String()
 	code = strings.Replace(code, "-", "", -1)
--- a/constant/context_key.go
+++ b/constant/context_key.go
@@ -3,8 +3,9 @@ package constant
 type ContextKey string
 const (
-	ContextKeyTokenCountMeta ContextKey = "token_count_meta"
+	ContextKeyTokenCountMeta  ContextKey = "token_count_meta"
-	ContextKeyPromptTokens   ContextKey = "prompt_tokens"
+	ContextKeyPromptTokens    ContextKey = "prompt_tokens"
+	ContextKeyEstimatedTokens ContextKey = "estimated_tokens"
 	ContextKeyOriginalModel    ContextKey = "original_model"
 	ContextKeyRequestStartTime ContextKey = "request_start_time"
@@ -17,6 +18,7 @@ const (
 	ContextKeyTokenSpecificChannelId ContextKey = "specific_channel_id"
 	ContextKeyTokenModelLimitEnabled ContextKey = "token_model_limit_enabled"
 	ContextKeyTokenModelLimit        ContextKey = "token_model_limit"
+	ContextKeyTokenCrossGroupRetry   ContextKey = "token_cross_group_retry"
 	/* channel related keys */
 	ContextKeyChannelId                ContextKey = "channel_id"
@@ -36,6 +38,10 @@ const (
 	ContextKeyChannelMultiKeyIndex     ContextKey = "channel_multi_key_index"
 	ContextKeyChannelKey               ContextKey = "channel_key"
+	ContextKeyAutoGroup           ContextKey = "auto_group"
+	ContextKeyAutoGroupIndex      ContextKey = "auto_group_index"
+	ContextKeyAutoGroupRetryIndex ContextKey = "auto_group_retry_index"
 	/* user related keys */
 	ContextKeyUserId      ContextKey = "id"
 	ContextKeyUserSetting ContextKey = "user_setting"
--- a/constant/env.go
+++ b/constant/env.go
@@ -9,12 +9,14 @@ var CountToken bool
 var GetMediaToken bool
 var GetMediaTokenNotStream bool
 var UpdateTask bool
+var MaxRequestBodyMB int
 var AzureDefaultAPIVersion string
 var GeminiVisionMaxImageNum int
 var NotifyLimitCount int
 var NotificationLimitDurationMinute int
 var GenerateDefaultToken bool
 var ErrorLogEnabled bool
+var TaskQueryLimit int
 // temporary variable for sora patch, will be removed in future
 var TaskPricePatches []string
--- a/constant/task.go
+++ b/constant/task.go
@@ -15,6 +15,7 @@ const (
 	TaskActionTextGenerate      = "textGenerate"
 	TaskActionFirstTailGenerate = "firstTailGenerate"
 	TaskActionReferenceGenerate = "referenceGenerate"
+	TaskActionRemix             = "remixGenerate"
 )
 var SunoModel2Action = map[string]string{
--- a/controller/billing.go
+++ b/controller/billing.go
@@ -2,9 +2,9 @@ package controller
 import (
 	"github.com/QuantumNous/new-api/common"
 	"github.com/QuantumNous/new-api/dto"
 	"github.com/QuantumNous/new-api/model"
 	"github.com/QuantumNous/new-api/setting/operation_setting"
 	"github.com/QuantumNous/new-api/types"
 	"github.com/gin-gonic/gin"
 )
@@ -29,7 +29,7 @@ func GetSubscription(c *gin.Context) {
 		expiredTime = 0
 	}
 	if err != nil {
-		openAIError := dto.OpenAIError{
+		openAIError := types.OpenAIError{
 			Message: err.Error(),
 			Type:    "upstream_error",
 		}
@@ -81,7 +81,7 @@ func GetUsage(c *gin.Context) {
 		quota, err = model.GetUserUsedQuota(userId)
 	}
 	if err != nil {
-		openAIError := dto.OpenAIError{
+		openAIError := types.OpenAIError{
 			Message: err.Error(),
 			Type:    "new_api_error",
 		}
--- a/controller/channel-test.go
+++ b/controller/channel-test.go
@@ -40,6 +40,13 @@ type testResult struct {
 	newAPIError *types.NewAPIError
 }
+// testChannel executes a test request against the given channel using the provided testModel and optional endpointType,
+// and returns a testResult containing the test context and any encountered error information.
+// It selects or derives a model when testModel is empty, auto-detects the request endpoint (chat, responses, embeddings, images, rerank) when endpointType is not specified,
+// converts and relays the request to the upstream adapter, and parses the upstream response to collect usage and pricing information.
+// On upstream responses that indicate the chat/completions `messages` parameter is unsupported and endpointType was not specified, it will retry the test using the Responses API.
+// The function records consumption logs and returns a testResult with a populated context on success, or with localErr/newAPIError set on failure;
+// for channel types that are not supported for testing it returns a localErr explaining that the channel test is not supported.
 func testChannel(channel *model.Channel, testModel string, endpointType string) testResult {
 	tik := time.Now()
 	var unsupportedTestChannelTypes = []int{
@@ -75,6 +82,8 @@ func testChannel(channel *model.Channel, testModel string, endpointType string)
 		}
 	}
+	originTestModel := testModel
 	requestPath := "/v1/chat/completions"
 	// 如果指定了端点类型，使用指定的端点类型
@@ -84,6 +93,10 @@ func testChannel(channel *model.Channel, testModel string, endpointType string)
 		}
 	} else {
 		// 如果没有指定端点类型，使用原有的自动检测逻辑
+		if common.IsOpenAIResponseOnlyModel(testModel) {
+			requestPath = "/v1/responses"
+		}
 		// 先判断是否为 Embedding 模型
 		if strings.Contains(strings.ToLower(testModel), "embedding") ||
 			strings.HasPrefix(testModel, "m3e") || // m3e 系列模型
@@ -319,6 +332,13 @@ func testChannel(channel *model.Channel, testModel string, endpointType string)
 		httpResp = resp.(*http.Response)
 		if httpResp.StatusCode != http.StatusOK {
 			err := service.RelayErrorHandler(c.Request.Context(), httpResp, true)
+			// 自动检测模式下，如果上游不支持 chat.completions 的 messages 参数，尝试切换到 Responses API 再测一次。
+			if endpointType == "" && requestPath == "/v1/chat/completions" && err != nil {
+				lowerErr := strings.ToLower(err.Error())
+				if strings.Contains(lowerErr, "unsupported parameter") && strings.Contains(lowerErr, "messages") {
+					return testChannel(channel, originTestModel, string(constant.EndpointTypeOpenAIResponse))
+				}
+			}
 			return testResult{
 				context:     c,
 				localErr:    err,
@@ -351,7 +371,7 @@ func testChannel(channel *model.Channel, testModel string, endpointType string)
 			newAPIError: types.NewOpenAIError(err, types.ErrorCodeReadResponseBodyFailed, http.StatusInternalServerError),
 		}
 	}
-	info.PromptTokens = usage.PromptTokens
+	info.SetEstimatePromptTokens(usage.PromptTokens)
 	quota := 0
 	if !priceData.UsePrice {
@@ -389,6 +409,7 @@ func testChannel(channel *model.Channel, testModel string, endpointType string)
 	}
 }
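+// buildTestRequest builds a test request for the given model and endpointType: an endpoint-specific request when endpointType is set, an embedding request for embedding models, and otherwise a chat/completion request with model-specific token limit heuristics.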
 func buildTestRequest(model string, endpointType string) dto.Request {
 	// 根据端点类型构建不同的测试请求
 	if endpointType != "" {
@@ -417,9 +438,12 @@ func buildTestRequest(model string, endpointType string) dto.Request {
 			}
 		case constant.EndpointTypeOpenAIResponse:
 			// 返回 OpenAIResponsesRequest
+			maxOutputTokens := uint(10)
 			return &dto.OpenAIResponsesRequest{
-				Model: model,
-				Input: json.RawMessage("\"hi\""),
+				Model:           model,
+				Input:           json.RawMessage(`[{"role":"user","content":"hi"}]`),
+				MaxOutputTokens: maxOutputTokens,
+				Stream:          true,
 			}
 		case constant.EndpointTypeAnthropic, constant.EndpointTypeGemini, constant.EndpointTypeOpenAI:
 			// 返回 GeneralOpenAIRequest
@@ -442,6 +466,16 @@ func buildTestRequest(model string, endpointType string) dto.Request {
 	}
 	// 自动检测逻辑（保持原有行为）
+	if common.IsOpenAIResponseOnlyModel(model) {
+		maxOutputTokens := uint(10)
+		return &dto.OpenAIResponsesRequest{
+			Model:           model,
+			Input:           json.RawMessage(`[{"role":"user","content":"hi"}]`),
+			MaxOutputTokens: maxOutputTokens,
+			Stream:          true,
+		}
+	}
 	// 先判断是否为 Embedding 模型
 	if strings.Contains(strings.ToLower(model), "embedding") ||
 		strings.HasPrefix(model, "m3e") ||
@@ -640,4 +674,4 @@ func AutomaticallyTestChannels() {
 			}
 		}
 	})
-}
+}
--- a/controller/channel.go
+++ b/controller/channel.go
@@ -165,6 +165,30 @@ func GetAllChannels(c *gin.Context) {
 	return
 }
+func buildFetchModelsHeaders(channel *model.Channel, key string) (http.Header, error) {
+	var headers http.Header
+	switch channel.Type {
+	case constant.ChannelTypeAnthropic:
+		headers = GetClaudeAuthHeader(key)
+	default:
+		headers = GetAuthHeader(key)
+	}
+	headerOverride := channel.GetHeaderOverride()
+	for k, v := range headerOverride {
+		str, ok := v.(string)
+		if !ok {
+			return nil, fmt.Errorf("invalid header override for key %s", k)
+		}
+		if strings.Contains(str, "{api_key}") {
+			str = strings.ReplaceAll(str, "{api_key}", key)
+		}
+		headers.Set(k, str)
+	}
+	return headers, nil
+}
 func FetchUpstreamModels(c *gin.Context) {
 	id, err := strconv.Atoi(c.Param("id"))
 	if err != nil {
@@ -223,14 +247,13 @@ func FetchUpstreamModels(c *gin.Context) {
 	}
 	key = strings.TrimSpace(key)
-	// 获取响应体 - 根据渠道类型决定是否添加 AuthHeader
-	var body []byte
-	switch channel.Type {
-	case constant.ChannelTypeAnthropic:
-		body, err = GetResponseBody("GET", url, channel, GetClaudeAuthHeader(key))
-	default:
-		body, err = GetResponseBody("GET", url, channel, GetAuthHeader(key))
-	}
+	headers, err := buildFetchModelsHeaders(channel, key)
+	if err != nil {
+		common.ApiError(c, err)
+		return
+	}
+	body, err := GetResponseBody("GET", url, channel, headers)
 	if err != nil {
 		common.ApiError(c, err)
 		return
--- a/controller/discord.go
+++ b/controller/discord.go
@@ -114,7 +114,7 @@ func DiscordOAuth(c *gin.Context) {
 		DiscordBind(c)
 		return
 	}
-		if !system_setting.GetDiscordSettings().Enabled {
+	if !system_setting.GetDiscordSettings().Enabled {
 		c.JSON(http.StatusOK, gin.H{
 			"success": false,
 			"message": "管理员未开启通过 Discord 登录以及注册",
--- a/controller/model.go
+++ b/controller/model.go
@@ -18,6 +18,7 @@ import (
 	"github.com/QuantumNous/new-api/service"
 	"github.com/QuantumNous/new-api/setting/operation_setting"
 	"github.com/QuantumNous/new-api/setting/ratio_setting"
 	"github.com/QuantumNous/new-api/types"
 	"github.com/gin-gonic/gin"
 	"github.com/samber/lo"
 )
@@ -275,7 +276,7 @@ func RetrieveModel(c *gin.Context, modelType int) {
 			c.JSON(200, aiModel)
 		}
 	} else {
-		openAIError := dto.OpenAIError{
+		openAIError := types.OpenAIError{
 			Message: fmt.Sprintf("The model '%s' does not exist", modelId),
 			Type:    "invalid_request_error",
 			Param:   "model",
--- a/controller/playground.go
+++ b/controller/playground.go
@@ -3,12 +3,10 @@ package controller
 import (
 	"errors"
 	"fmt"
 	"time"
 	"github.com/QuantumNous/new-api/common"
 	"github.com/QuantumNous/new-api/constant"
 	"github.com/QuantumNous/new-api/middleware"
 	"github.com/QuantumNous/new-api/model"
 	relaycommon "github.com/QuantumNous/new-api/relay/common"
 	"github.com/QuantumNous/new-api/types"
 	"github.com/gin-gonic/gin"
@@ -31,8 +29,11 @@ func Playground(c *gin.Context) {
 		return
 	}
-	group := common.GetContextKeyString(c, constant.ContextKeyUsingGroup)
-	modelName := c.GetString("original_model")
+	relayInfo, err := relaycommon.GenRelayInfo(c, types.RelayFormatOpenAI, nil, nil)
+	if err != nil {
+		newAPIError = types.NewError(err, types.ErrorCodeInvalidRequest, types.ErrOptionWithSkipRetry())
+		return
+	}
 	userId := c.GetInt("id")
@@ -46,16 +47,10 @@ func Playground(c *gin.Context) {
 	tempToken := &model.Token{
 		UserId: userId,
-		Name:   fmt.Sprintf("playground-%s", group),
+		Name:   fmt.Sprintf("playground-%s", relayInfo.UsingGroup),
-		Group:  group,
+		Group:  relayInfo.UsingGroup,
 	}
 	_ = middleware.SetupContextForToken(c, tempToken)
-	_, newAPIError = getChannel(c, group, modelName, 0)
-	if newAPIError != nil {
-		return
-	}
-	//middleware.SetupContextForSelectedChannel(c, channel, playgroundRequest.Model)
-	common.SetContextKey(c, constant.ContextKeyRequestStartTime, time.Now())
 	Relay(c, types.RelayFormatOpenAI)
 }
--- a/controller/relay.go
+++ b/controller/relay.go
@@ -2,6 +2,7 @@ package controller
 import (
 	"bytes"
 	"errors"
 	"fmt"
 	"io"
 	"log"
@@ -64,8 +65,8 @@ func geminiRelayHandler(c *gin.Context, info *relaycommon.RelayInfo) *types.NewA
 func Relay(c *gin.Context, relayFormat types.RelayFormat) {
 	requestId := c.GetString(common.RequestIdKey)
-	group := common.GetContextKeyString(c, constant.ContextKeyUsingGroup)
+	//group := common.GetContextKeyString(c, constant.ContextKeyUsingGroup)
-	originalModel := common.GetContextKeyString(c, constant.ContextKeyOriginalModel)
+	//originalModel := common.GetContextKeyString(c, constant.ContextKeyOriginalModel)
 	var (
 		newAPIError *types.NewAPIError
@@ -104,7 +105,12 @@ func Relay(c *gin.Context, relayFormat types.RelayFormat) {
 	request, err := helper.GetAndValidateRequest(c, relayFormat)
 	if err != nil {
-		newAPIError = types.NewError(err, types.ErrorCodeInvalidRequest)
+		// Map "request body too large" to 413 so clients can handle it correctly
+		if common.IsRequestBodyTooLargeError(err) || errors.Is(err, common.ErrRequestBodyTooLarge) {
+			newAPIError = types.NewErrorWithStatusCode(err, types.ErrorCodeReadRequestBodyFailed, http.StatusRequestEntityTooLarge, types.ErrOptionWithSkipRetry())
+		} else {
+			newAPIError = types.NewError(err, types.ErrorCodeInvalidRequest)
+		}
 		return
 	}
@@ -114,9 +120,17 @@ func Relay(c *gin.Context, relayFormat types.RelayFormat) {
 		return
 	}
-	meta := request.GetTokenCountMeta()
+	needSensitiveCheck := setting.ShouldCheckPromptSensitive()
+	needCountToken := constant.CountToken
+	// Avoid building huge CombineText (strings.Join) when token counting and sensitive check are both disabled.
+	var meta *types.TokenCountMeta
+	if needSensitiveCheck || needCountToken {
+		meta = request.GetTokenCountMeta()
+	} else {
+		meta = fastTokenCountMetaForPricing(request)
+	}
-	if setting.ShouldCheckPromptSensitive() {
+	if needSensitiveCheck && meta != nil {
 		contains, words := service.CheckSensitiveText(meta.CombineText)
 		if contains {
 			logger.LogWarn(c, fmt.Sprintf("user sensitive words detected: %s", strings.Join(words, ", ")))
@@ -125,13 +139,13 @@ func Relay(c *gin.Context, relayFormat types.RelayFormat) {
 		}
 	}
-	tokens, err := service.CountRequestToken(c, meta, relayInfo)
+	tokens, err := service.EstimateRequestToken(c, meta, relayInfo)
 	if err != nil {
 		newAPIError = types.NewError(err, types.ErrorCodeCountTokenFailed)
 		return
 	}
-	relayInfo.SetPromptTokens(tokens)
+	relayInfo.SetEstimatePromptTokens(tokens)
 	priceData, err := helper.ModelPriceHelper(c, relayInfo, tokens, meta)
 	if err != nil {
@@ -157,16 +171,32 @@ func Relay(c *gin.Context, relayFormat types.RelayFormat) {
 		}
 	}()
-	for i := 0; i <= common.RetryTimes; i++ {
-		channel, err := getChannel(c, group, originalModel, i)
-		if err != nil {
-			logger.LogError(c, err.Error())
-			newAPIError = err
+	retryParam := &service.RetryParam{
+		Ctx:        c,
+		TokenGroup: relayInfo.TokenGroup,
+		ModelName:  relayInfo.OriginModelName,
+		Retry:      common.GetPointer(0),
+	}
+	for ; retryParam.GetRetry() <= common.RetryTimes; retryParam.IncreaseRetry() {
+		channel, channelErr := getChannel(c, relayInfo, retryParam)
+		if channelErr != nil {
+			logger.LogError(c, channelErr.Error())
+			newAPIError = channelErr
 			break
 		}
 		addUsedChannel(c, channel.Id)
-		requestBody, _ := common.GetRequestBody(c)
+		requestBody, bodyErr := common.GetRequestBody(c)
+		if bodyErr != nil {
+			// Ensure consistent 413 for oversized bodies even when error occurs later (e.g., retry path)
+			if common.IsRequestBodyTooLargeError(bodyErr) || errors.Is(bodyErr, common.ErrRequestBodyTooLarge) {
+				newAPIError = types.NewErrorWithStatusCode(bodyErr, types.ErrorCodeReadRequestBodyFailed, http.StatusRequestEntityTooLarge, types.ErrOptionWithSkipRetry())
+			} else {
+				newAPIError = types.NewErrorWithStatusCode(bodyErr, types.ErrorCodeReadRequestBodyFailed, http.StatusBadRequest, types.ErrOptionWithSkipRetry())
+			}
+			break
+		}
 		c.Request.Body = io.NopCloser(bytes.NewBuffer(requestBody))
 		switch relayFormat {
@@ -186,7 +216,7 @@ func Relay(c *gin.Context, relayFormat types.RelayFormat) {
 		processChannelError(c, *types.NewChannelError(channel.Id, channel.Type, channel.Name, channel.ChannelInfo.IsMultiKey, common.GetContextKeyString(c, constant.ContextKeyChannelKey), channel.GetAutoBan()), newAPIError)
-		if !shouldRetry(c, newAPIError, common.RetryTimes-i) {
+		if !shouldRetry(c, newAPIError, common.RetryTimes-retryParam.GetRetry()) {
 			break
 		}
 	}
@@ -211,8 +241,35 @@ func addUsedChannel(c *gin.Context, channelId int) {
 	c.Set("use_channel", useChannel)
 }
-func getChannel(c *gin.Context, group, originalModel string, retryCount int) (*model.Channel, *types.NewAPIError) {
-	if retryCount == 0 {
+func fastTokenCountMetaForPricing(request dto.Request) *types.TokenCountMeta {
+	if request == nil {
+		return &types.TokenCountMeta{}
+	}
+	meta := &types.TokenCountMeta{
+		TokenType: types.TokenTypeTokenizer,
+	}
+	switch r := request.(type) {
+	case *dto.GeneralOpenAIRequest:
+		if r.MaxCompletionTokens > r.MaxTokens {
+			meta.MaxTokens = int(r.MaxCompletionTokens)
+		} else {
+			meta.MaxTokens = int(r.MaxTokens)
+		}
+	case *dto.OpenAIResponsesRequest:
+		meta.MaxTokens = int(r.MaxOutputTokens)
+	case *dto.ClaudeRequest:
+		meta.MaxTokens = int(r.MaxTokens)
+	case *dto.ImageRequest:
+		// Pricing for image requests depends on ImagePriceRatio; safe to compute even when CountToken is disabled.
+		return r.GetTokenCountMeta()
+	default:
+		// Best-effort: leave CombineText empty to avoid large allocations.
+	}
+	return meta
+}
+func getChannel(c *gin.Context, info *relaycommon.RelayInfo, retryParam *service.RetryParam) (*model.Channel, *types.NewAPIError) {
+	if info.ChannelMeta == nil {
 		autoBan := c.GetBool("auto_ban")
 		autoBanInt := 1
 		if !autoBan {
@@ -225,14 +282,18 @@ func getChannel(c *gin.Context, group, originalModel string, retryCount int) (*m
 			AutoBan: &autoBanInt,
 		}, nil
 	}
-	channel, selectGroup, err := service.CacheGetRandomSatisfiedChannel(c, group, originalModel, retryCount)
+	channel, selectGroup, err := service.CacheGetRandomSatisfiedChannel(retryParam)
+	info.PriceData.GroupRatioInfo = helper.HandleGroupRatio(c, info)
 	if err != nil {
-		return nil, types.NewError(fmt.Errorf("获取分组 %s 下模型 %s 的可用渠道失败（retry）: %s", selectGroup, originalModel, err.Error()), types.ErrorCodeGetChannelFailed, types.ErrOptionWithSkipRetry())
+		return nil, types.NewError(fmt.Errorf("获取分组 %s 下模型 %s 的可用渠道失败（retry）: %s", selectGroup, info.OriginModelName, err.Error()), types.ErrorCodeGetChannelFailed, types.ErrOptionWithSkipRetry())
 	}
 	if channel == nil {
-		return nil, types.NewError(fmt.Errorf("分组 %s 下模型 %s 的可用渠道不存在（retry）", selectGroup, originalModel), types.ErrorCodeGetChannelFailed, types.ErrOptionWithSkipRetry())
+		return nil, types.NewError(fmt.Errorf("分组 %s 下模型 %s 的可用渠道不存在（retry）", selectGroup, info.OriginModelName), types.ErrorCodeGetChannelFailed, types.ErrOptionWithSkipRetry())
 	}
-	newAPIError := middleware.SetupContextForSelectedChannel(c, channel, originalModel)
+
+	newAPIError := middleware.SetupContextForSelectedChannel(c, channel, info.OriginModelName)
 	if newAPIError != nil {
 		return nil, newAPIError
 	}
@@ -285,7 +346,7 @@ func processChannelError(c *gin.Context, channelError types.ChannelError, err *t
 	logger.LogError(c, fmt.Sprintf("channel error (channel #%d, status code: %d): %s", channelError.ChannelId, err.StatusCode, err.Error()))
 	// 不要使用context获取渠道信息，异步处理时可能会出现渠道信息不一致的情况
 	// do not use context to get channel info, there may be inconsistent channel info when processing asynchronously
-	if service.ShouldDisableChannel(channelError.ChannelId, err) && channelError.AutoBan {
+	if service.ShouldDisableChannel(channelError.ChannelType, err) && channelError.AutoBan {
 		gopool.Go(func() {
 			service.DisableChannel(channelError, err.Error())
 		})
@@ -366,7 +427,7 @@ func RelayMidjourney(c *gin.Context) {
 }
 func RelayNotImplemented(c *gin.Context) {
-	err := dto.OpenAIError{
+	err := types.OpenAIError{
 		Message: "API not implemented",
 		Type:    "new_api_error",
 		Param:   "",
@@ -378,7 +439,7 @@ func RelayNotImplemented(c *gin.Context) {
 }
 func RelayNotFound(c *gin.Context) {
-	err := dto.OpenAIError{
+	err := types.OpenAIError{
 		Message: fmt.Sprintf("Invalid URL (%s %s)", c.Request.Method, c.Request.URL.Path),
 		Type:    "invalid_request_error",
 		Param:   "",
@@ -392,8 +453,6 @@ func RelayNotFound(c *gin.Context) {
 func RelayTask(c *gin.Context) {
 	retryTimes := common.RetryTimes
 	channelId := c.GetInt("channel_id")
-	group := c.GetString("group")
-	originalModel := c.GetString("original_model")
 	c.Set("use_channel", []string{fmt.Sprintf("%d", channelId)})
 	relayInfo, err := relaycommon.GenRelayInfo(c, types.RelayFormatTask, nil, nil)
 	if err != nil {
@@ -403,8 +462,14 @@ func RelayTask(c *gin.Context) {
 	if taskErr == nil {
 		retryTimes = 0
 	}
-	for i := 0; shouldRetryTaskRelay(c, channelId, taskErr, retryTimes) && i < retryTimes; i++ {
-		channel, newAPIError := getChannel(c, group, originalModel, i)
+	retryParam := &service.RetryParam{
+		Ctx:        c,
+		TokenGroup: relayInfo.TokenGroup,
+		ModelName:  relayInfo.OriginModelName,
+		Retry:      common.GetPointer(0),
+	}
+	for ; shouldRetryTaskRelay(c, channelId, taskErr, retryTimes) && retryParam.GetRetry() < retryTimes; retryParam.IncreaseRetry() {
+		channel, newAPIError := getChannel(c, relayInfo, retryParam)
 		if newAPIError != nil {
 			logger.LogError(c, fmt.Sprintf("CacheGetRandomSatisfiedChannel failed: %s", newAPIError.Error()))
 			taskErr = service.TaskErrorWrapperLocal(newAPIError.Err, "get_channel_failed", http.StatusInternalServerError)
@@ -414,10 +479,18 @@ func RelayTask(c *gin.Context) {
 		useChannel := c.GetStringSlice("use_channel")
 		useChannel = append(useChannel, fmt.Sprintf("%d", channelId))
 		c.Set("use_channel", useChannel)
-		logger.LogInfo(c, fmt.Sprintf("using channel #%d to retry (remain times %d)", channel.Id, i))
+		logger.LogInfo(c, fmt.Sprintf("using channel #%d to retry (remain times %d)", channel.Id, retryParam.GetRetry()))
 		//middleware.SetupContextForSelectedChannel(c, channel, originalModel)
-		requestBody, _ := common.GetRequestBody(c)
+		requestBody, err := common.GetRequestBody(c)
+		if err != nil {
+			if common.IsRequestBodyTooLargeError(err) || errors.Is(err, common.ErrRequestBodyTooLarge) {
+				taskErr = service.TaskErrorWrapperLocal(err, "read_request_body_failed", http.StatusRequestEntityTooLarge)
+			} else {
+				taskErr = service.TaskErrorWrapperLocal(err, "read_request_body_failed", http.StatusBadRequest)
+			}
+			break
+		}
 		c.Request.Body = io.NopCloser(bytes.NewBuffer(requestBody))
 		taskErr = taskRelayHandler(c, relayInfo)
 	}
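Both retry loops above now map a failed body read to either 413 or 400 instead of discarding the error. A minimal, self-contained sketch of that classification follows. The names `errBodyTooLarge`, `errUnexpected`, and `statusForBodyError` are hypothetical stand-ins, not identifiers from this PR, and the `http.MaxBytesError` check (Go 1.19+) is an assumption about what a helper like `common.IsRequestBodyTooLargeError` might test for.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
)

// errBodyTooLarge is a hypothetical stand-in for common.ErrRequestBodyTooLarge.
var errBodyTooLarge = errors.New("request body too large")

// errUnexpected is a hypothetical stand-in for any other read failure.
var errUnexpected = errors.New("unexpected EOF")

// statusForBodyError maps a body-read failure to the HTTP status sent to the client:
// 413 when the body exceeded the size limit, 400 for every other read error.
func statusForBodyError(err error) int {
	// *http.MaxBytesError is what http.MaxBytesReader returns (Go 1.19+).
	var maxBytesErr *http.MaxBytesError
	if errors.Is(err, errBodyTooLarge) || errors.As(err, &maxBytesErr) {
		return http.StatusRequestEntityTooLarge
	}
	return http.StatusBadRequest
}

func main() {
	// Wrapping with %w keeps the sentinel visible to errors.Is.
	wrapped := fmt.Errorf("read body: %w", errBodyTooLarge)
	fmt.Println(statusForBodyError(wrapped))
	fmt.Println(statusForBodyError(errUnexpected))
}
```

Keeping the mapping in one function would let the relay path and the task path share it, rather than repeating the `if`/`else` in each loop.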
--- a/controller/task.go
+++ b/controller/task.go
@@ -29,7 +29,7 @@ func UpdateTaskBulk() {
 		time.Sleep(time.Duration(15) * time.Second)
 		common.SysLog("任务进度轮询开始")
 		ctx := context.TODO()
-		allTasks := model.GetAllUnFinishSyncTasks(500)
+		allTasks := model.GetAllUnFinishSyncTasks(constant.TaskQueryLimit)
 		platformTask := make(map[constant.TaskPlatform][]*model.Task)
 		for _, t := range allTasks {
 			platformTask[t.Platform] = append(platformTask[t.Platform], t)
@@ -88,7 +88,7 @@ func UpdateSunoTaskAll(ctx context.Context, taskChannelM map[int][]string, taskM
 	for channelId, taskIds := range taskChannelM {
 		err := updateSunoTaskAll(ctx, channelId, taskIds, taskM)
 		if err != nil {
-			logger.LogError(ctx, fmt.Sprintf("渠道 #%d 更新异步任务失败: %d", channelId, err.Error()))
+			logger.LogError(ctx, fmt.Sprintf("渠道 #%d 更新异步任务失败: %s", channelId, err.Error()))
 		}
 	}
 	return nil
@@ -116,9 +116,10 @@ func updateSunoTaskAll(ctx context.Context, channelId int, taskIds []string, tas
 	if adaptor == nil {
 		return errors.New("adaptor not found")
 	}
+	proxy := channel.GetSetting().Proxy
 	resp, err := adaptor.FetchTask(*channel.BaseURL, channel.Key, map[string]any{
 		"ids": taskIds,
-	})
+	}, proxy)
 	if err != nil {
 		common.SysLog(fmt.Sprintf("Get Task Do req error: %v", err))
 		return err
@@ -140,7 +141,7 @@ func updateSunoTaskAll(ctx context.Context, channelId int, taskIds []string, tas
 		return err
 	}
 	if !responseItems.IsSuccess() {
-		common.SysLog(fmt.Sprintf("渠道 #%d 未完成的任务有: %d, 成功获取到任务数: %d", channelId, len(taskIds), string(responseBody)))
+		common.SysLog(fmt.Sprintf("渠道 #%d 未完成的任务有: %d, 成功获取到任务数: %s", channelId, len(taskIds), string(responseBody)))
 		return err
 	}
--- a/controller/task_video.go
+++ b/controller/task_video.go
@@ -67,6 +67,7 @@ func updateVideoSingleTask(ctx context.Context, adaptor channel.TaskAdaptor, cha
 	if channel.GetBaseURL() != "" {
 		baseURL = channel.GetBaseURL()
 	}
+	proxy := channel.GetSetting().Proxy
 	task := taskM[taskId]
 	if task == nil {
@@ -76,7 +77,7 @@ func updateVideoSingleTask(ctx context.Context, adaptor channel.TaskAdaptor, cha
 	resp, err := adaptor.FetchTask(baseURL, channel.Key, map[string]any{
 		"task_id": taskId,
 		"action":  task.Action,
-	})
+	}, proxy)
 	if err != nil {
 		return fmt.Errorf("fetchTask failed for task %s: %w", taskId, err)
 	}
--- a/controller/token.go
+++ b/controller/token.go
@@ -142,7 +142,7 @@ func AddToken(c *gin.Context) {
 		common.ApiError(c, err)
 		return
 	}
-	if len(token.Name) > 30 {
+	if len(token.Name) > 50 {
 		c.JSON(http.StatusOK, gin.H{
 			"success": false,
 			"message": "令牌名称过长",
@@ -171,6 +171,7 @@ func AddToken(c *gin.Context) {
 		ModelLimits:        token.ModelLimits,
 		AllowIps:           token.AllowIps,
 		Group:              token.Group,
+		CrossGroupRetry:    token.CrossGroupRetry,
 	}
 	err = cleanToken.Insert()
 	if err != nil {
@@ -208,7 +209,7 @@ func UpdateToken(c *gin.Context) {
 		common.ApiError(c, err)
 		return
 	}
-	if len(token.Name) > 30 {
+	if len(token.Name) > 50 {
 		c.JSON(http.StatusOK, gin.H{
 			"success": false,
 			"message": "令牌名称过长",
@@ -248,6 +249,7 @@ func UpdateToken(c *gin.Context) {
 		cleanToken.ModelLimits = token.ModelLimits
 		cleanToken.AllowIps = token.AllowIps
 		cleanToken.Group = token.Group
+		cleanToken.CrossGroupRetry = token.CrossGroupRetry
 	}
 	err = cleanToken.Update()
 	if err != nil {
--- a/controller/topup_creem.go
+++ b/controller/topup_creem.go
@@ -7,12 +7,12 @@ import (
 	"encoding/hex"
 	"encoding/json"
 	"fmt"
+	"io"
+	"log"
+	"net/http"
 	"github.com/QuantumNous/new-api/common"
 	"github.com/QuantumNous/new-api/model"
 	"github.com/QuantumNous/new-api/setting"
-	"io"
-	"log"
-	"net/http"
 	"time"
 	"github.com/gin-gonic/gin"
--- a/controller/video_proxy.go
+++ b/controller/video_proxy.go
@@ -1,6 +1,7 @@
 package controller
 import (
+	"context"
 	"fmt"
 	"io"
 	"net/http"
@@ -10,6 +11,7 @@ import (
 	"github.com/QuantumNous/new-api/constant"
 	"github.com/QuantumNous/new-api/logger"
 	"github.com/QuantumNous/new-api/model"
+	"github.com/QuantumNous/new-api/service"
 	"github.com/gin-gonic/gin"
 )
@@ -75,11 +77,22 @@ func VideoProxy(c *gin.Context) {
 	}
 	var videoURL string
-	client := &http.Client{
-		Timeout: 60 * time.Second,
+	proxy := channel.GetSetting().Proxy
+	client, err := service.GetHttpClientWithProxy(proxy)
+	if err != nil {
+		logger.LogError(c.Request.Context(), fmt.Sprintf("Failed to create proxy client for task %s: %s", taskID, err.Error()))
+		c.JSON(http.StatusInternalServerError, gin.H{
+			"error": gin.H{
+				"message": "Failed to create proxy client",
+				"type":    "server_error",
+			},
+		})
+		return
 	}
-	req, err := http.NewRequestWithContext(c.Request.Context(), http.MethodGet, "", nil)
+	ctx, cancel := context.WithTimeout(c.Request.Context(), 60*time.Second)
+	defer cancel()
+	req, err := http.NewRequestWithContext(ctx, http.MethodGet, "", nil)
 	if err != nil {
 		logger.LogError(c.Request.Context(), fmt.Sprintf("Failed to create request: %s", err.Error()))
 		c.JSON(http.StatusInternalServerError, gin.H{
--- a/controller/video_proxy_gemini.go
+++ b/controller/video_proxy_gemini.go
@@ -35,10 +35,11 @@ func getGeminiVideoURL(channel *model.Channel, task *model.Task, apiKey string)
 		return "", fmt.Errorf("api key not available for task")
 	}
+	proxy := channel.GetSetting().Proxy
 	resp, err := adaptor.FetchTask(baseURL, apiKey, map[string]any{
 		"task_id": task.TaskID,
 		"action":  task.Action,
-	})
+	}, proxy)
 	if err != nil {
 		return "", fmt.Errorf("fetch task failed: %w", err)
 	}
--- a/docs/api/api_auth.md
+++ b/docs/api/api_auth.md
@@ -1,53 +0,0 @@
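 # API Authentication
 ## Authentication Method
 ### Access Token
 API endpoints that require authentication expect both of the following request headers for Access Token authentication:
 1. **The `Authorization` request header**
    Put the Access Token in the `Authorization` field of the HTTP request header, in the following format:
    ```
    Authorization: <your_access_token>
    ```
    Replace `<your_access_token>` with the actual Access Token value.
 2. **The `New-Api-User` request header**
    Put the user ID in the `New-Api-User` field of the HTTP request header, in the following format:
    ```
    New-Api-User: <your_user_id>
    ```
    Replace `<your_user_id>` with the actual user ID.
 **Notes:**
 *   **Both the `Authorization` and `New-Api-User` headers must be provided for authentication to succeed.**
 *   If only one of the two headers is provided, or neither is, a `401 Unauthorized` error is returned.
 *   If the Access Token in `Authorization` is invalid, a `401 Unauthorized` error is returned with the message “无权进行此操作，access token 无效” ("not authorized to perform this operation, access token is invalid").
 *   If the user ID in `New-Api-User` does not match the Access Token, a `401 Unauthorized` error is returned with the message “无权进行此操作，与登录用户不匹配，请重新登录” ("not authorized to perform this operation, does not match the logged-in user, please log in again").
 *   If the `New-Api-User` header is missing, a `401 Unauthorized` error is returned with the message “无权进行此操作，未提供 New-Api-User” ("not authorized to perform this operation, New-Api-User not provided").
 *   If the `New-Api-User` header is malformed, a `401 Unauthorized` error is returned with the message “无权进行此操作，New-Api-User 格式错误” ("not authorized to perform this operation, New-Api-User has an invalid format").
 *   If the user has been disabled, a `403 Forbidden` error is returned with the message “用户已被封禁” ("the user has been banned").
 *   If the user lacks sufficient permissions, a `403 Forbidden` error is returned with the message “无权进行此操作，权限不足” ("not authorized to perform this operation, insufficient permissions").
 *   If the user information is invalid, a `403 Forbidden` error is returned with the message “无权进行此操作，用户信息无效” ("not authorized to perform this operation, invalid user information").
 ## Curl Example
 Assuming your Access Token is `access_token`, your user ID is `123`, and the endpoint you want to call is `/api/user/self`, you can use the following curl command:
 ```bash
 curl -X GET \
  -H "Authorization: access_token" \
  -H "New-Api-User: 123" \
  https://your-domain.com/api/user/self
 ```
 Replace `access_token`, `123`, and `https://your-domain.com` with your actual values.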
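The same two-header request can be built from Go. The following is a minimal sketch, assuming the header names and the `/api/user/self` path given in the document; the helper name `newAuthedRequest` is hypothetical, and the request is only constructed, not sent.

```go
package main

import (
	"fmt"
	"net/http"
	"strconv"
)

// newAuthedRequest (hypothetical helper) builds a GET request for /api/user/self
// that carries both headers the document requires.
func newAuthedRequest(baseURL, accessToken string, userID int) (*http.Request, error) {
	req, err := http.NewRequest(http.MethodGet, baseURL+"/api/user/self", nil)
	if err != nil {
		return nil, err
	}
	// The document shows the raw token as the header value.
	req.Header.Set("Authorization", accessToken)
	req.Header.Set("New-Api-User", strconv.Itoa(userID))
	return req, nil
}

func main() {
	req, err := newAuthedRequest("https://your-domain.com", "access_token", 123)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
	fmt.Println("Authorization:", req.Header.Get("Authorization"))
	fmt.Println("New-Api-User:", req.Header.Get("New-Api-User"))
}
```

Passing the returned request to `http.DefaultClient.Do` would perform the call; per the notes above, omitting either header should produce `401 Unauthorized`.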
--- a/docs/api/web_api.md
+++ b/docs/api/web_api.md
@@ -1,198 +0,0 @@
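 # New API – Web UI Backend API Reference
 > This document lists all REST endpoints that the **New API** backend exposes to the frontend web UI (excluding *Relay* endpoints).
 >
 > All endpoints share the prefix `https://<your-domain>`; only the **path**, **HTTP method**, **auth requirement**, and a **short description** are listed below.
 >
 > Auth levels:
 > * **Public** – can be called without logging in
 > * **User** – requires a user token (`middleware.UserAuth`)
 > * **Admin** – requires an admin token (`middleware.AdminAuth`)
 > * **Root** – restricted to the highest-privilege Root user (`middleware.RootAuth`)
 ---
 ## 1. Initialization / System Status
 | Method | Path | Auth | Description |
 |------|------|------|------|
 | GET  | /api/setup | Public | Get system initialization status |
 | POST | /api/setup | Public | Complete the first-time setup wizard |
 | GET  | /api/status | Public | Get a runtime status summary |
 | GET  | /api/uptime/status | Public | Uptime-Kuma compatible status probe |
 | GET  | /api/status/test | Admin | Test whether the backend and its dependencies are working |
 ## 2. Public Information
 | Method | Path | Auth | Description |
 |------|------|------|------|
 | GET | /api/models | User | Get the list of models available to the frontend |
 | GET | /api/notice | Public | Get the announcement content |
 | GET | /api/about | Public | About page information |
 | GET | /api/home_page_content | Public | Custom home page content |
 | GET | /api/pricing | Anonymous allowed / User | Pricing and plan information |
 | GET | /api/ratio_config | Public | Model ratio configuration (public fields only) |
 ## 3. Email / Identity Verification
 | Method | Path | Auth | Description |
 |------|------|------|------|
 | GET | /api/verification | Public (rate limited) | Send an email verification message |
 | GET | /api/reset_password | Public (rate limited) | Send a password reset email |
 | POST | /api/user/reset | Public | Submit a password reset request |
 ## 4. OAuth / Third-Party Login
 | Method | Path | Auth | Description |
 |------|------|------|------|
 | GET | /api/oauth/github | Public | GitHub OAuth redirect |
 | GET | /api/oauth/discord | Public | Discord generic OAuth redirect |
 | GET | /api/oauth/oidc | Public | OIDC generic OAuth redirect |
 | GET | /api/oauth/linuxdo | Public | LinuxDo OAuth redirect |
 | GET | /api/oauth/wechat | Public | WeChat QR-code login redirect |
 | GET | /api/oauth/wechat/bind | Public | Bind a WeChat account |
 | GET | /api/oauth/email/bind | Public | Bind an email address |
 | GET | /api/oauth/telegram/login | Public | Telegram login |
 | GET | /api/oauth/telegram/bind | Public | Bind a Telegram account |
 | GET | /api/oauth/state | Public | Get a random state value (CSRF protection) |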
 ## 5. User Module
 ### 5.1 Registration / Login
 | Method | Path | Auth | Description |
 |------|------|------|------|
 | POST | /api/user/register | Public | Register a new account |
 | POST | /api/user/login | Public | User login |
 | GET  | /api/user/logout | User | Log out |
 | GET  | /api/user/epay/notify | Public | Epay payment callback |
 | GET  | /api/user/groups | Public | List all groups (unauthenticated version) |
 ### 5.2 Self-Service Operations (login required)
 | Method | Path | Auth | Description |
 |------|------|------|------|
 | GET | /api/user/self/groups | User | Get the groups the user belongs to |
 | GET | /api/user/self | User | Get own profile |
 | GET | /api/user/models | User | Get model visibility |
 | PUT | /api/user/self | User | Update own profile |
 | DELETE | /api/user/self | User | Delete own account |
 | GET | /api/user/token | User | Generate a user-level Access Token |
 | GET | /api/user/aff | User | Get affiliate code information |
 | POST | /api/user/topup | User | Direct balance top-up |
 | POST | /api/user/pay | User | Submit a payment order |
 | POST | /api/user/amount | User | Balance payment |
 | POST | /api/user/aff_transfer | User | Transfer affiliate quota |
 | PUT | /api/user/setting | User | Update user settings |
 ### 5.3 Admin User Management
 | Method | Path | Auth | Description |
 |------|------|------|------|
 | GET | /api/user/ | Admin | Get the full user list |
 | GET | /api/user/search | Admin | Search users |
 | GET | /api/user/:id | Admin | Get a single user |
 | POST | /api/user/ | Admin | Create a user |
 | POST | /api/user/manage | Admin | Management actions such as freeze and reset |
 | PUT | /api/user/ | Admin | Update a user |
 | DELETE | /api/user/:id | Admin | Delete a user |
 ## 6. Site Options (Root)
 | Method | Path | Auth | Description |
 |------|------|------|------|
 | GET | /api/option/ | Root | Get the global configuration |
 | PUT | /api/option/ | Root | Update the global configuration |
 | POST | /api/option/rest_model_ratio | Root | Reset model ratios |
 | POST | /api/option/migrate_console_setting | Root | Migrate legacy console settings |
 ## 7. Model Ratio Sync (Root)
 | Method | Path | Auth | Description |
 |------|------|------|------|
 | GET | /api/ratio_sync/channels | Root | Get the list of syncable channels |
 | POST | /api/ratio_sync/fetch | Root | Fetch ratios from upstream |
 ## 8. Channel Management (Admin)
 | Method | Path | Description |
 |------|------|------|
 | GET | /api/channel/ | Get the channel list |
 | GET | /api/channel/search | Search channels |
 | GET | /api/channel/models | Query channel model capabilities |
 | GET | /api/channel/models_enabled | Query enabled model capabilities |
 | GET | /api/channel/:id | Get a single channel |
 | GET | /api/channel/test | Batch-test channel connectivity |
 | GET | /api/channel/test/:id | Test a single channel |
 | GET | /api/channel/update_balance | Batch-refresh balances |
 | GET | /api/channel/update_balance/:id | Refresh a single balance |
 | POST | /api/channel/ | Add a channel |
 | PUT | /api/channel/ | Update a channel |
 | DELETE | /api/channel/disabled | Delete disabled channels |
 | POST | /api/channel/tag/disabled | Batch-disable channels by tag |
 | POST | /api/channel/tag/enabled | Batch-enable channels by tag |
 | PUT | /api/channel/tag | Edit a channel tag |
 | DELETE | /api/channel/:id | Delete a channel |
 | POST | /api/channel/batch | Batch-delete channels |
 | POST | /api/channel/fix | Repair the channel abilities table |
 | GET | /api/channel/fetch_models/:id | Fetch models for a single channel |
 | POST | /api/channel/fetch_models | Fetch models for all channels |
 | POST | /api/channel/batch/tag | Batch-set channel tags |
 | GET | /api/channel/tag/models | Get models by tag |
 | POST | /api/channel/copy/:id | Copy a channel |
 ## 9. Token Management
 | Method | Path | Auth | Description |
 |------|------|------|------|
 | GET | /api/token/ | User | Get all tokens |
 | GET | /api/token/search | User | Search tokens |
 | GET | /api/token/:id | User | Get a single token |
 | POST | /api/token/ | User | Create a token |
 | PUT | /api/token/ | User | Update a token |
 | DELETE | /api/token/:id | User | Delete a token |
 | POST | /api/token/batch | User | Batch-delete tokens |
 ## 10. Redemption Code Management (Admin)
 | Method | Path | Description |
 |------|------|------|
 | GET | /api/redemption/ | Get the redemption code list |
 | GET | /api/redemption/search | Search redemption codes |
 | GET | /api/redemption/:id | Get a single redemption code |
 | POST | /api/redemption/ | Create a redemption code |
 | PUT | /api/redemption/ | Update a redemption code |
 | DELETE | /api/redemption/invalid | Delete invalid redemption codes |
 | DELETE | /api/redemption/:id | Delete a redemption code |
 ## 11. Logs
 | Method | Path | Auth | Description |
 |------|------|------|------|
 | GET | /api/log/ | Admin | Get all logs |
 | DELETE | /api/log/ | Admin | Delete historical logs |
 | GET | /api/log/stat | Admin | Log statistics |
 | GET | /api/log/self/stat | User | Own log statistics |
 | GET | /api/log/search | Admin | Search all logs |
 | GET | /api/log/self | User | Get own logs |
 | GET | /api/log/self/search | User | Search own logs |
 | GET | /api/log/token | Public | Query logs by token (CORS supported) |
 ## 12. Usage Statistics
 | Method | Path | Auth | Description |
 |------|------|------|------|
 | GET | /api/data/ | Admin | Site-wide usage by date |
 | GET | /api/data/self | User | Own usage by date |
 ## 13. Groups
 | Method | Path | Auth | Description |
 |------|------|------|------|
 | GET | /api/group/ | Admin | Get the full group list |
 ## 14. Midjourney Tasks
 | Method | Path | Auth | Description |
 |------|------|------|------|
 | GET | /api/mj/self | User | Get own MJ tasks |
 | GET | /api/mj/ | Admin | Get all MJ tasks |
 ## 15. Task Center
 | Method | Path | Auth | Description |
 |------|------|------|------|
 | GET | /api/task/self | User | Get own tasks |
 | GET | /api/task/ | Admin | Get all tasks |
 ## 16. Account Billing Panel (Dashboard)
 | Method | Path | Auth | Description |
 |------|------|------|------|
 | GET | /dashboard/billing/subscription | User Token | Get subscription quota information |
 | GET | /v1/dashboard/billing/subscription | Same as above | OpenAI SDK compatible path |
 | GET | /dashboard/billing/usage | User Token | Get usage information |
 | GET | /v1/dashboard/billing/usage | Same as above | OpenAI SDK compatible path |
 ---
 > **Last updated**: 2025.07.17
--- a/docs/models/Midjourney.md
+++ b/docs/models/Midjourney.md
@@ -1,82 +0,0 @@
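 # Midjourney Proxy API Documentation
 **Overview**: Midjourney Proxy API documentation
 ## Endpoints
 The following endpoints are supported:
 + [x] /mj/submit/imagine
 + [x] /mj/submit/change
 + [x] /mj/submit/blend
 + [x] /mj/submit/describe
 + [x] /mj/image/{id} (to fetch images through this endpoint, **you must fill in the server address in the system settings!**)
 + [x] /mj/task/{id}/fetch (the image URL returned by this endpoint is forwarded through One API)
 + [x] /task/list-by-condition
 + [x] /mj/submit/action (supported only by midjourney-proxy-plus; the same applies to the entries below)
 + [x] /mj/submit/modal
 + [x] /mj/submit/shorten
 + [x] /mj/task/{id}/image-seed
 + [x] /mj/insight-face/swap (InsightFace)
 ## Models
 ### Supported by midjourney-proxy
 - mj_imagine (image generation)
 - mj_variation (variation)
 - mj_reroll (reroll)
 - mj_blend (blend)
 - mj_upscale (upscale)
 - mj_describe (image to text)
 ### Supported only by midjourney-proxy-plus
 - mj_zoom (zoom)
 - mj_shorten (prompt shortening)
 - mj_modal (modal submission; inpainting and custom zoom must be added together with mj_modal)
 - mj_inpaint (inpainting submission; must be added together with mj_modal)
 - mj_custom_zoom (custom zoom; must be added together with mj_modal)
 - mj_high_variation (strong variation)
 - mj_low_variation (subtle variation)
 - mj_pan (pan)
 - swap_face (face swap)
 ## Model Pricing (set under Settings - Operation Settings - Fixed Model Price Settings)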
 ```json
 {
  "mj_imagine": 0.1,
  "mj_variation": 0.1,
  "mj_reroll": 0.1,
  "mj_blend": 0.1,
  "mj_modal": 0.1,
  "mj_zoom": 0.1,
  "mj_shorten": 0.1,
  "mj_high_variation": 0.1,
  "mj_low_variation": 0.1,
  "mj_pan": 0.1,
  "mj_inpaint": 0,
  "mj_custom_zoom": 0,
  "mj_describe": 0.05,
  "mj_upscale": 0.05,
  "swap_face": 0.05
 }
 ```
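 mj_inpaint and mj_custom_zoom are priced at 0 because both must be used together with mj_modal, so their price is determined by mj_modal.
 ## Channel Settings
 ### Connecting to midjourney-proxy(plus)
 1. Deploy Midjourney-Proxy and configure the midjourney account and related settings (setting a secret key is strongly recommended), [project repository](https://github.com/novicezk/midjourney-proxy)
 2. Add a channel in channel management; choose **Midjourney Proxy** as the channel type, or **Midjourney Proxy Plus** for the plus version. For the models, see the model list above.
 3. In **Proxy**, enter the address where midjourney-proxy is deployed, for example: http://localhost:8080
 4. In the key field, enter the midjourney-proxy secret key; if no key was set, any value will do
 ### Connecting to an upstream new api
 1. Add a channel in channel management and choose **Midjourney Proxy Plus** as the channel type. For the models, see the model list above.
 2. In **Proxy**, enter the address of the upstream new api, for example: http://localhost:3000
 3. In the key field, enter the key of the upstream new api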
--- a/docs/models/Rerank.md
+++ b/docs/models/Rerank.md
@@ -1,62 +0,0 @@
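 # Rerank API Documentation
 **Overview**: Rerank API documentation
 ## Integrating with Dify
 Select Jina as the model provider and fill in the model information as required to integrate with Dify.
 ## Request
 POST: /v1/rerank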
 Request:
 ```json
 {
  "model": "jina-reranker-v2-base-multilingual",
  "query": "What is the capital of the United States?",
  "top_n": 3,
  "documents": [
    "Carson City is the capital city of the American state of Nevada.",
    "The Commonwealth of the Northern Mariana Islands is a group of islands in the Pacific Ocean. Its capital is Saipan.",
    "Washington, D.C. (also known as simply Washington or D.C., and officially as the District of Columbia) is the capital of the United States. It is a federal district.",
    "Capitalization or capitalisation in English grammar is the use of a capital letter at the start of a word. English usage varies from capitalization in other languages.",
     "Capital punishment (the death penalty) has existed in the United States since before the United States was a country. As of 2017, capital punishment is legal in 30 of the 50 states."
  ]
 }
 ```
 Response:
 ```json
 {
  "results": [
    {
      "document": {
        "text": "Washington, D.C. (also known as simply Washington or D.C., and officially as the District of Columbia) is the capital of the United States. It is a federal district."
      },
      "index": 2,
      "relevance_score": 0.9999702
    },
    {
      "document": {
        "text": "Carson City is the capital city of the American state of Nevada."
      },
      "index": 0,
      "relevance_score": 0.67800725
    },
    {
      "document": {
        "text": "Capitalization or capitalisation in English grammar is the use of a capital letter at the start of a word. English usage varies from capitalization in other languages."
      },
      "index": 3,
      "relevance_score": 0.02800752
    }
  ],
  "usage": {
    "prompt_tokens": 158,
    "completion_tokens": 0,
    "total_tokens": 158
  }
 }
 ```
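A client could decode the response shape shown above along these lines. This is a minimal sketch: the struct and function names (`rerankResult`, `rerankResponse`, `parseRerank`) are hypothetical, the field tags are taken from the sample JSON, and the embedded payload is shortened from the documented example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// rerankResult mirrors one entry of "results" in the sample response.
type rerankResult struct {
	Document struct {
		Text string `json:"text"`
	} `json:"document"`
	Index          int     `json:"index"` // position of the document in the request
	RelevanceScore float64 `json:"relevance_score"`
}

// rerankResponse mirrors the top-level sample response.
type rerankResponse struct {
	Results []rerankResult `json:"results"`
	Usage   struct {
		PromptTokens     int `json:"prompt_tokens"`
		CompletionTokens int `json:"completion_tokens"`
		TotalTokens      int `json:"total_tokens"`
	} `json:"usage"`
}

// sampleResponse is a shortened copy of the documented response body.
const sampleResponse = `{
  "results": [
    {"document": {"text": "Washington, D.C. is the capital of the United States."}, "index": 2, "relevance_score": 0.9999702},
    {"document": {"text": "Carson City is the capital city of the American state of Nevada."}, "index": 0, "relevance_score": 0.67800725}
  ],
  "usage": {"prompt_tokens": 158, "completion_tokens": 0, "total_tokens": 158}
}`

// parseRerank decodes a rerank response body.
func parseRerank(body []byte) (*rerankResponse, error) {
	var resp rerankResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return nil, err
	}
	return &resp, nil
}

func main() {
	resp, err := parseRerank([]byte(sampleResponse))
	if err != nil {
		panic(err)
	}
	for _, r := range resp.Results {
		fmt.Printf("document #%d score=%.4f\n", r.Index, r.RelevanceScore)
	}
	fmt.Println("total tokens:", resp.Usage.TotalTokens)
}
```

In the sample, `index` refers back to the position in the request's `documents` array, so the caller can map each score to its original document.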
--- a/docs/models/Suno.md
+++ b/docs/models/Suno.md
@@ -1,44 +0,0 @@
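 # Suno API Documentation
 **Overview**: Suno API documentation
 ## Endpoints
 The following endpoints are supported:
 + [x] /suno/submit/music
 + [x] /suno/submit/lyrics
 + [x] /suno/fetch
 + [x] /suno/fetch/:id
 ## Models
 ### Supported by Suno API
 - suno_music (custom mode, inspiration mode, continuation)
 - suno_lyrics (lyrics generation)
 ## Model Pricing (set under Settings - Operation Settings - Fixed Model Price Settings)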
 ```json
 {
  "suno_music": 0.3,
  "suno_lyrics": 0.01
 }
 ```
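 ## Channel Settings
 ### Connecting to Suno API
 1. Deploy Suno API and configure the suno account and related settings (setting a secret key is strongly recommended), [project repository](https://github.com/Suno-API/Suno-API)
 2. Add a channel in channel management and choose **Suno API** as the channel type. For the models, see the model list above.
 3. In **Proxy**, enter the address where Suno API is deployed, for example: http://localhost:8080
 4. In the key field, enter the Suno API secret key; if no key was set, any value will do
 ### Connecting to an upstream new api
 1. Add a channel in channel management and choose **Suno API** as the channel type, or any type as long as its models include those in the model list above
 2. In **Proxy**, enter the address of the upstream new api, for example: http://localhost:3000
 3. In the key field, enter the key of the upstream new api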
--- a/docs/openapi/api.json
+++ b/docs/openapi/api.json
--- a/docs/openapi/relay.json
+++ b/docs/openapi/relay.json
--- a/dto/audio.go
+++ b/dto/audio.go
@@ -2,6 +2,7 @@ package dto
 import (
 	"encoding/json"
+	"strings"
 	"github.com/QuantumNous/new-api/types"
@@ -24,11 +25,14 @@ func (r *AudioRequest) GetTokenCountMeta() *types.TokenCountMeta {
 		CombineText: r.Input,
 		TokenType:   types.TokenTypeTextNumber,
 	}
+	if strings.Contains(r.Model, "gpt") {
+		meta.TokenType = types.TokenTypeTokenizer
+	}
 	return meta
 }
 func (r *AudioRequest) IsStream(c *gin.Context) bool {
-	return false
+	return r.StreamFormat == "sse"
 }
 func (r *AudioRequest) SetModelName(modelName string) {
--- a/dto/error.go
+++ b/dto/error.go
@@ -1,26 +1,31 @@
 package dto
-import "github.com/QuantumNous/new-api/types"
-type OpenAIError struct {
-	Message string `json:"message"`
-	Type    string `json:"type"`
-	Param   string `json:"param"`
-	Code    any    `json:"code"`
-}
+import (
+	"encoding/json"
+	"github.com/QuantumNous/new-api/common"
+	"github.com/QuantumNous/new-api/types"
+)
+
+//type OpenAIError struct {
+//	Message string `json:"message"`
+//	Type    string `json:"type"`
+//	Param   string `json:"param"`
+//	Code    any    `json:"code"`
+//}
 type OpenAIErrorWithStatusCode struct {
-	Error      OpenAIError `json:"error"`
+	Error      types.OpenAIError `json:"error"`
-	StatusCode int         `json:"status_code"`
+	StatusCode int               `json:"status_code"`
 	LocalError bool
 }
 type GeneralErrorResponse struct {
-	Error    types.OpenAIError `json:"error"`
+	Error    json.RawMessage `json:"error"`
-	Message  string            `json:"message"`
+	Message  string          `json:"message"`
-	Msg      string            `json:"msg"`
+	Msg      string          `json:"msg"`
-	Err      string            `json:"err"`
+	Err      string          `json:"err"`
-	ErrorMsg string            `json:"error_msg"`
+	ErrorMsg string          `json:"error_msg"`
 	Header   struct {
 		Message string `json:"message"`
 	} `json:"header"`
@@ -31,9 +36,35 @@ type GeneralErrorResponse struct {
 	} `json:"response"`
 }
+func (e GeneralErrorResponse) TryToOpenAIError() *types.OpenAIError {
+	var openAIError types.OpenAIError
+	if len(e.Error) > 0 {
+		err := common.Unmarshal(e.Error, &openAIError)
+		if err == nil && openAIError.Message != "" {
+			return &openAIError
+		}
+	}
+	return nil
+}
 func (e GeneralErrorResponse) ToMessage() string {
-	if e.Error.Message != "" {
-		return e.Error.Message
+	if len(e.Error) > 0 {
+		switch common.GetJsonType(e.Error) {
+		case "object":
+			var openAIError types.OpenAIError
+			err := common.Unmarshal(e.Error, &openAIError)
+			if err == nil && openAIError.Message != "" {
+				return openAIError.Message
+			}
+		case "string":
+			var msg string
+			err := common.Unmarshal(e.Error, &msg)
+			if err == nil && msg != "" {
+				return msg
+			}
+		default:
+			return string(e.Error)
+		}
 	}
 	if e.Message != "" {
 		return e.Message
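The change to `GeneralErrorResponse` lets the `error` field arrive as either an object or a plain string. A standalone sketch of that idea using only the standard library is shown below. The names `upstreamError` and `errorMessage` are hypothetical, and this version tries each shape in turn rather than inspecting the JSON type first as `common.GetJsonType` does in the diff; unlike the diff, it does not return the raw bytes for other JSON types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// upstreamError (hypothetical) keeps "error" as raw JSON so its shape can be probed later.
type upstreamError struct {
	Error   json.RawMessage `json:"error"`
	Message string          `json:"message"`
}

// errorMessage extracts a human-readable message from an upstream error body.
// It accepts {"error":{"message":"..."}}, {"error":"..."} and {"message":"..."}.
func errorMessage(body []byte) string {
	var e upstreamError
	if err := json.Unmarshal(body, &e); err != nil {
		return ""
	}
	if len(e.Error) > 0 {
		// Shape 1: an object carrying a "message" field.
		var obj struct {
			Message string `json:"message"`
		}
		if json.Unmarshal(e.Error, &obj) == nil && obj.Message != "" {
			return obj.Message
		}
		// Shape 2: a bare JSON string.
		var s string
		if json.Unmarshal(e.Error, &s) == nil && s != "" {
			return s
		}
	}
	// Fallback: top-level "message".
	return e.Message
}

func main() {
	fmt.Println(errorMessage([]byte(`{"error":{"message":"bad key"}}`)))
	fmt.Println(errorMessage([]byte(`{"error":"quota exceeded"}`)))
	fmt.Println(errorMessage([]byte(`{"message":"plain"}`)))
}
```

Deferring the decode with `json.RawMessage` avoids the unmarshal failure that a fixed struct type would hit when an upstream sends `error` as a string.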
--- a/dto/gemini.go
+++ b/dto/gemini.go
@@ -142,7 +142,7 @@ type GeminiThinkingConfig struct {
 	IncludeThoughts bool `json:"includeThoughts,omitempty"`
 	ThinkingBudget  *int `json:"thinkingBudget,omitempty"`
 	// TODO Conflict with thinkingbudget.
-	ThinkingLevel json.RawMessage `json:"thinkingLevel,omitempty"`
+	ThinkingLevel string `json:"thinkingLevel,omitempty"`
 }
 // UnmarshalJSON allows GeminiThinkingConfig to accept both snake_case and camelCase fields.
@@ -150,9 +150,9 @@ func (c *GeminiThinkingConfig) UnmarshalJSON(data []byte) error {
 	type Alias GeminiThinkingConfig
 	var aux struct {
 		Alias
-		IncludeThoughtsSnake *bool           `json:"include_thoughts,omitempty"`
+		IncludeThoughtsSnake *bool  `json:"include_thoughts,omitempty"`
-		ThinkingBudgetSnake  *int            `json:"thinking_budget,omitempty"`
+		ThinkingBudgetSnake  *int   `json:"thinking_budget,omitempty"`
-		ThinkingLevelSnake   json.RawMessage `json:"thinking_level,omitempty"`
+		ThinkingLevelSnake   string `json:"thinking_level,omitempty"`
 	}
 	if err := common.Unmarshal(data, &aux); err != nil {
@@ -169,7 +169,7 @@ func (c *GeminiThinkingConfig) UnmarshalJSON(data []byte) error {
 		c.ThinkingBudget = aux.ThinkingBudgetSnake
 	}
-	if len(aux.ThinkingLevelSnake) > 0 {
+	if aux.ThinkingLevelSnake != "" {
 		c.ThinkingLevel = aux.ThinkingLevelSnake
 	}
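The `UnmarshalJSON` above accepts both camelCase and snake_case keys by decoding into an alias type plus extra snake_case fields. A reduced, runnable sketch of the pattern follows. The type `thinkingConfig` and helper `parseThinkingConfig` are hypothetical, and it uses `encoding/json` directly where the project calls `common.Unmarshal`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// thinkingConfig is a reduced, hypothetical copy of GeminiThinkingConfig.
type thinkingConfig struct {
	IncludeThoughts bool   `json:"includeThoughts,omitempty"`
	ThinkingBudget  *int   `json:"thinkingBudget,omitempty"`
	ThinkingLevel   string `json:"thinkingLevel,omitempty"`
}

// UnmarshalJSON accepts both camelCase and snake_case field names.
func (c *thinkingConfig) UnmarshalJSON(data []byte) error {
	// Alias has the same fields but no methods, which prevents infinite recursion.
	type Alias thinkingConfig
	var aux struct {
		Alias
		IncludeThoughtsSnake *bool  `json:"include_thoughts,omitempty"`
		ThinkingBudgetSnake  *int   `json:"thinking_budget,omitempty"`
		ThinkingLevelSnake   string `json:"thinking_level,omitempty"`
	}
	if err := json.Unmarshal(data, &aux); err != nil {
		return err
	}
	*c = thinkingConfig(aux.Alias)
	// snake_case values, when present, override the camelCase ones.
	if aux.IncludeThoughtsSnake != nil {
		c.IncludeThoughts = *aux.IncludeThoughtsSnake
	}
	if aux.ThinkingBudgetSnake != nil {
		c.ThinkingBudget = aux.ThinkingBudgetSnake
	}
	if aux.ThinkingLevelSnake != "" {
		c.ThinkingLevel = aux.ThinkingLevelSnake
	}
	return nil
}

// parseThinkingConfig decodes a JSON string into a thinkingConfig.
func parseThinkingConfig(raw string) (thinkingConfig, error) {
	var cfg thinkingConfig
	err := json.Unmarshal([]byte(raw), &cfg)
	return cfg, err
}

func main() {
	for _, in := range []string{`{"thinkingLevel":"low"}`, `{"thinking_level":"high","include_thoughts":true}`} {
		cfg, err := parseThinkingConfig(in)
		if err != nil {
			panic(err)
		}
		fmt.Printf("level=%q includeThoughts=%v\n", cfg.ThinkingLevel, cfg.IncludeThoughts)
	}
}
```

With `ThinkingLevel` typed as `string`, the presence check is a comparison against `""` rather than a length check on raw bytes, which matches the change in the hunk above.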
--- a/dto/openai_image.go
+++ b/dto/openai_image.go
@@ -27,8 +27,11 @@ type ImageRequest struct {
 	OutputCompression json.RawMessage `json:"output_compression,omitempty"`
 	PartialImages     json.RawMessage `json:"partial_images,omitempty"`
 	// Stream            bool            `json:"stream,omitempty"`
-	Watermark *bool           `json:"watermark,omitempty"`
-	Image     json.RawMessage `json:"image,omitempty"`
+	Watermark *bool `json:"watermark,omitempty"`
+	// zhipu 4v
+	WatermarkEnabled json.RawMessage `json:"watermark_enabled,omitempty"`
+	UserId           json.RawMessage `json:"user_id,omitempty"`
+	Image            json.RawMessage `json:"image,omitempty"`
 	// 用匿名参数接收额外参数
 	Extra map[string]json.RawMessage `json:"-"`
 }
@@ -169,7 +172,7 @@ type ImageResponse struct {
 	Extra   any         `json:"extra,omitempty"`
 }
 type ImageData struct {
-	Url           string `json:"url,omitempty"`
+	Url           string `json:"url"`
-	B64Json       string `json:"b64_json,omitempty"`
+	B64Json       string `json:"b64_json"`
-	RevisedPrompt string `json:"revised_prompt,omitempty"`
+	RevisedPrompt string `json:"revised_prompt"`
 }
--- a/dto/openai_request.go
+++ b/dto/openai_request.go
@@ -83,6 +83,7 @@ type GeneralOpenAIRequest struct {
 	// Ali Qwen Params
 	VlHighResolutionImages json.RawMessage `json:"vl_high_resolution_images,omitempty"`
 	EnableThinking         any             `json:"enable_thinking,omitempty"`
+	ChatTemplateKwargs     json.RawMessage `json:"chat_template_kwargs,omitempty"`
 	// ollama Params
 	Think json.RawMessage `json:"think,omitempty"`
 	// baidu v2
--- a/go.mod
+++ b/go.mod
@@ -33,7 +33,7 @@ require (
 	github.com/mewkiz/flac v1.0.13
 	github.com/pkg/errors v0.9.1
 	github.com/pquerna/otp v1.5.0
-	github.com/samber/lo v1.39.0
+	github.com/samber/lo v1.52.0
 	github.com/shirou/gopsutil v3.21.11+incompatible
 	github.com/shopspring/decimal v1.4.0
 	github.com/stripe/stripe-go/v81 v81.4.0
@@ -99,6 +99,7 @@ require (
 	github.com/mitchellh/mapstructure v1.5.0 // indirect
 	github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd // indirect
 	github.com/modern-go/reflect2 v1.0.2 // indirect
 	github.com/ncruces/go-strftime v0.1.9 // indirect
 	github.com/pelletier/go-toml/v2 v2.2.1 // indirect
 	github.com/remyoudompheng/bigfft v0.0.0-20230129092748-24d4a6f8daec // indirect
 	github.com/tidwall/match v1.1.1 // indirect
@@ -110,13 +111,13 @@ require (
 	github.com/x448/float16 v0.8.4 // indirect
 	github.com/yusufpapurcu/wmi v1.2.3 // indirect
 	golang.org/x/arch v0.21.0 // indirect
-	golang.org/x/exp v0.0.0-20240404231335-c0f41cb1a7a0 // indirect
+	golang.org/x/exp v0.0.0-20250620022241-b7579e27df2b // indirect
 	golang.org/x/sys v0.38.0 // indirect
 	golang.org/x/text v0.31.0 // indirect
 	google.golang.org/protobuf v1.34.2 // indirect
 	gopkg.in/yaml.v3 v3.0.1 // indirect
-	modernc.org/libc v1.22.5 // indirect
+	modernc.org/libc v1.66.10 // indirect
-	modernc.org/mathutil v1.5.0 // indirect
+	modernc.org/mathutil v1.7.1 // indirect
-	modernc.org/memory v1.5.0 // indirect
+	modernc.org/memory v1.11.0 // indirect
-	modernc.org/sqlite v1.23.1 // indirect
+	modernc.org/sqlite v1.40.1 // indirect
 )
--- a/go.sum
+++ b/go.sum
@@ -120,6 +120,7 @@ github.com/google/go-tpm v0.9.5/go.mod h1:h9jEsEECg7gtLis0upRBQU+GhYVH6jMjrFxI8u
 github.com/google/gofuzz v1.0.0/go.mod h1:dBl0BpW6vV/+mYPU4Po3pmUjxk6FQPldtuIdl/M65Eg=
 github.com/google/pprof v0.0.0-20221118152302-e6195bd50e26 h1:Xim43kblpZXfIBQsbuBVKCudVG457BR2GZFIz3uw3hQ=
 github.com/google/pprof v0.0.0-20221118152302-e6195bd50e26/go.mod h1:dDKJzRmX4S37WGHujM7tX//fmj1uioxKzKxz3lo4HJo=
 github.com/google/pprof v0.0.0-20250317173921-a4b03ec1a45e h1:ijClszYn+mADRFY17kjQEVQ1XRhq2/JR1M3sGqeJoxs=
 github.com/google/uuid v1.1.2/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=
 github.com/google/uuid v1.6.0 h1:NIvaJDMOsjHA8n1jAhLSgzrAzy1Hgr+hNrb57e+94F0=
 github.com/google/uuid v1.6.0/go.mod h1:TIyPZe4MgqvfeYDBFedMoGGpEw/LqOeaOT+nhxU+yHo=
@@ -193,6 +194,8 @@ github.com/modern-go/concurrent v0.0.0-20180306012644-bacd9c7ef1dd/go.mod h1:6dJ
 github.com/modern-go/reflect2 v0.0.0-20180701023420-4b7aa43c6742/go.mod h1:bx2lNnkwVCuqBIxFjflWJWanXIb3RllmbCylyMrvgv0=
 github.com/modern-go/reflect2 v1.0.2 h1:xBagoLtFs94CBntxluKeaWgTMpvLxC4ur3nMaC9Gz0M=
 github.com/modern-go/reflect2 v1.0.2/go.mod h1:yWuevngMOJpCy52FWWMvUC8ws7m/LJsjYzDa0/r8luk=
 github.com/ncruces/go-strftime v0.1.9 h1:bY0MQC28UADQmHmaF5dgpLmImcShSi2kHU9XLdhx/f4=
 github.com/ncruces/go-strftime v0.1.9/go.mod h1:Fwc5htZGVVkseilnfgOVb9mKy6w1naJmn9CehxcKcls=
 github.com/nxadm/tail v1.4.8 h1:nPr65rt6Y5JFSKQO7qToXr7pePgD6Gwiw05lkbyAQTE=
 github.com/nxadm/tail v1.4.8/go.mod h1:+ncqLTQzXmGhMZNUePPaPqPvBxHAIsmXswZKocGu+AU=
 github.com/onsi/ginkgo v1.16.5 h1:8xi0RTUf59SOSfEtZMvwTvXYMzG4gV23XVHOZiXNtnE=
@@ -219,6 +222,8 @@ github.com/rogpeppe/go-internal v1.8.0 h1:FCbCCtXNOY3UtUuHUYaghJg4y7Fd14rXifAYUA
 github.com/rogpeppe/go-internal v1.8.0/go.mod h1:WmiCO8CzOY8rg0OYDC4/i/2WRWAB6poM+XZ2dLUbcbE=
 github.com/samber/lo v1.39.0 h1:4gTz1wUhNYLhFSKl6O+8peW0v2F4BCY034GRpU9WnuA=
 github.com/samber/lo v1.39.0/go.mod h1:+m/ZKRl6ClXCE2Lgf3MsQlWfh4bn1bz6CXEOxnEXnEA=
 github.com/samber/lo v1.52.0 h1:Rvi+3BFHES3A8meP33VPAxiBZX/Aws5RxrschYGjomw=
 github.com/samber/lo v1.52.0/go.mod h1:4+MXEGsJzbKGaUEQFKBq2xtfuznW9oz/WrgyzMzRoM0=
 github.com/shirou/gopsutil v3.21.11+incompatible h1:+1+c1VGhc88SSonWP6foOcLhvnKlUeu/erjjvaPEYiI=
 github.com/shirou/gopsutil v3.21.11+incompatible/go.mod h1:5b4v6he4MtMOwMlS0TUMTu2PcXUg8+E1lC7eC3UO/RA=
 github.com/shopspring/decimal v1.4.0 h1:bxl37RwXBklmTi0C79JfXCEBD1cqqHt0bbgBAGFp81k=
@@ -285,6 +290,8 @@ golang.org/x/crypto v0.45.0 h1:jMBrvKuj23MTlT0bQEOBcAE0mjg8mK9RXFhRH6nyF3Q=
 golang.org/x/crypto v0.45.0/go.mod h1:XTGrrkGJve7CYK7J8PEww4aY7gM3qMCElcJQ8n8JdX4=
 golang.org/x/exp v0.0.0-20240404231335-c0f41cb1a7a0 h1:985EYyeCOxTpcgOTJpflJUwOeEz0CQOdPt73OzpE9F8=
 golang.org/x/exp v0.0.0-20240404231335-c0f41cb1a7a0/go.mod h1:/lliqkxwWAhPjf5oSOIJup2XcqJaw8RGS6k3TGEc7GI=
 golang.org/x/exp v0.0.0-20250620022241-b7579e27df2b h1:M2rDM6z3Fhozi9O7NWsxAkg/yqS/lQJ6PmkyIV3YP+o=
 golang.org/x/exp v0.0.0-20250620022241-b7579e27df2b/go.mod h1:3//PLf8L/X+8b4vuAfHzxeRUl04Adcb341+IGKfnqS8=
 golang.org/x/image v0.23.0 h1:HseQ7c2OpPKTPVzNjG5fwJsOTCiiwS4QdsYi5XU6H68=
 golang.org/x/image v0.23.0/go.mod h1:wJJBTdLfCCf3tiHa1fNxpZmUI4mmoZvwMCPP0ddoNKY=
 golang.org/x/net v0.0.0-20210226172049-e18ecbb05110/go.mod h1:m0MpNAwzfU5UDzcl9v0D8zg8gWTRqZa9RBIspLL5mdg=
@@ -345,9 +352,17 @@ gorm.io/gorm v1.25.2 h1:gs1o6Vsa+oVKG/a9ElL3XgyGfghFfkKA2SInQaCyMho=
 gorm.io/gorm v1.25.2/go.mod h1:L4uxeKpfBml98NYqVqwAdmV1a2nBtAec/cf3fpucW/k=
 modernc.org/libc v1.22.5 h1:91BNch/e5B0uPbJFgqbxXuOnxBQjlS//icfQEGmvyjE=
 modernc.org/libc v1.22.5/go.mod h1:jj+Z7dTNX8fBScMVNRAYZ/jF91K8fdT2hYMThc3YjBY=
 modernc.org/libc v1.66.10 h1:yZkb3YeLx4oynyR+iUsXsybsX4Ubx7MQlSYEw4yj59A=
 modernc.org/libc v1.66.10/go.mod h1:8vGSEwvoUoltr4dlywvHqjtAqHBaw0j1jI7iFBTAr2I=
 modernc.org/mathutil v1.5.0 h1:rV0Ko/6SfM+8G+yKiyI830l3Wuz1zRutdslNoQ0kfiQ=
 modernc.org/mathutil v1.5.0/go.mod h1:mZW8CKdRPY1v87qxC/wUdX5O1qDzXMP5TH3wjfpga6E=
 modernc.org/mathutil v1.7.1 h1:GCZVGXdaN8gTqB1Mf/usp1Y/hSqgI2vAGGP4jZMCxOU=
 modernc.org/mathutil v1.7.1/go.mod h1:4p5IwJITfppl0G4sUEDtCr4DthTaT47/N3aT6MhfgJg=
 modernc.org/memory v1.5.0 h1:N+/8c5rE6EqugZwHii4IFsaJ7MUhoWX07J5tC/iI5Ds=
 modernc.org/memory v1.5.0/go.mod h1:PkUhL0Mugw21sHPeskwZW4D6VscE/GQJOnIpCnW6pSU=
 modernc.org/memory v1.11.0 h1:o4QC8aMQzmcwCK3t3Ux/ZHmwFPzE6hf2Y5LbkRs+hbI=
 modernc.org/memory v1.11.0/go.mod h1:/JP4VbVC+K5sU2wZi9bHoq2MAkCnrt2r98UGeSK7Mjw=
 modernc.org/sqlite v1.23.1 h1:nrSBg4aRQQwq59JpvGEQ15tNxoO5pX/kUjcRNwSAGQM=
 modernc.org/sqlite v1.23.1/go.mod h1:OrDj17Mggn6MhE+iPbBNf7RGKODDE9NFT0f3EwDzJqk=
 modernc.org/sqlite v1.40.1 h1:VfuXcxcUWWKRBuP8+BR9L7VnmusMgBNNnBYGEe9w/iY=
 modernc.org/sqlite v1.40.1/go.mod h1:9fjQZ0mB1LLP0GYrp39oOJXx/I2sxEnZtzCmEQIKvGE=
--- a/middleware/auth.go
+++ b/middleware/auth.go
@@ -2,12 +2,14 @@ package middleware
 import (
 	"fmt"
 	"net"
 	"net/http"
 	"strconv"
 	"strings"
 	"github.com/QuantumNous/new-api/common"
 	"github.com/QuantumNous/new-api/constant"
 	"github.com/QuantumNous/new-api/logger"
 	"github.com/QuantumNous/new-api/model"
 	"github.com/QuantumNous/new-api/service"
 	"github.com/QuantumNous/new-api/setting/ratio_setting"
@@ -240,13 +242,20 @@ func TokenAuth() func(c *gin.Context) {
 			return
 		}
-		allowIpsMap := token.GetIpLimitsMap()
-		if len(allowIpsMap) != 0 {
+		allowIps := token.GetIpLimits()
+		if len(allowIps) > 0 {
 			clientIp := c.ClientIP()
-			if _, ok := allowIpsMap[clientIp]; !ok {
+			logger.LogDebug(c, "Token has IP restrictions, checking client IP %s", clientIp)
+			ip := net.ParseIP(clientIp)
+			if ip == nil {
+				abortWithOpenAiMessage(c, http.StatusForbidden, "无法解析客户端 IP 地址")
+				return
+			}
+			if common.IsIpInCIDRList(ip, allowIps) == false {
 				abortWithOpenAiMessage(c, http.StatusForbidden, "您的 IP 不在令牌允许访问的列表中")
 				return
 			}
+			logger.LogDebug(c, "Client IP %s passed the token IP restrictions check", clientIp)
 		}
 		userCache, err := model.GetUserCache(token.UserId)
@@ -307,7 +316,8 @@ func SetupContextForToken(c *gin.Context, token *model.Token, parts ...string) e
 	} else {
 		c.Set("token_model_limit_enabled", false)
 	}
-	c.Set("token_group", token.Group)
+	common.SetContextKey(c, constant.ContextKeyTokenGroup, token.Group)
 	common.SetContextKey(c, constant.ContextKeyTokenCrossGroupRetry, token.CrossGroupRetry)
 	if len(parts) > 1 {
 		if model.IsAdmin(token.UserId) {
 			c.Set("specific_channel_id", parts[1])
--- a/middleware/distributor.go
+++ b/middleware/distributor.go
@@ -97,7 +97,12 @@ func Distribute() func(c *gin.Context) {
 						common.SetContextKey(c, constant.ContextKeyUsingGroup, usingGroup)
 					}
 				}
-				channel, selectGroup, err = service.CacheGetRandomSatisfiedChannel(c, usingGroup, modelRequest.Model, 0)
+				channel, selectGroup, err = service.CacheGetRandomSatisfiedChannel(&service.RetryParam{
+					Ctx:        c,
+					ModelName:  modelRequest.Model,
+					TokenGroup: usingGroup,
+					Retry:      common.GetPointer(0),
+				})
 				if err != nil {
 					showGroup := usingGroup
 					if usingGroup == "auto" {
@@ -157,7 +162,7 @@ func getModelRequest(c *gin.Context) (*ModelRequest, bool, error) {
 			}
 			midjourneyModel, mjErr, success := service.GetMjRequestModel(relayMode, &midjourneyRequest)
 			if mjErr != nil {
-				return nil, false, fmt.Errorf(mjErr.Description)
+				return nil, false, fmt.Errorf("%s", mjErr.Description)
 			}
 			if midjourneyModel == "" {
 				if !success {
@@ -181,6 +186,10 @@ func getModelRequest(c *gin.Context) (*ModelRequest, bool, error) {
 		}
 		c.Set("platform", string(constant.TaskPlatformSuno))
 		c.Set("relay_mode", relayMode)
+	} else if strings.Contains(c.Request.URL.Path, "/v1/videos/") && strings.HasSuffix(c.Request.URL.Path, "/remix") {
+		relayMode := relayconstant.RelayModeVideoSubmit
+		c.Set("relay_mode", relayMode)
+		shouldSelectChannel = false
 	} else if strings.Contains(c.Request.URL.Path, "/v1/videos") {
 		//curl https://api.openai.com/v1/videos \
 		//  -H "Authorization: Bearer $OPENAI_API_KEY" \
--- a/middleware/gzip.go
+++ b/middleware/gzip.go
@@ -5,32 +5,69 @@ import (
 	"io"
 	"net/http"
 	"github.com/QuantumNous/new-api/constant"
 	"github.com/andybalholm/brotli"
 	"github.com/gin-gonic/gin"
 )
 type readCloser struct {
 	io.Reader
 	closeFn func() error
 }
 func (rc *readCloser) Close() error {
 	if rc.closeFn != nil {
 		return rc.closeFn()
 	}
 	return nil
 }
 func DecompressRequestMiddleware() gin.HandlerFunc {
 	return func(c *gin.Context) {
 		if c.Request.Body == nil || c.Request.Method == http.MethodGet {
 			c.Next()
 			return
 		}
 		maxMB := constant.MaxRequestBodyMB
 		if maxMB <= 0 {
 			maxMB = 32
 		}
 		maxBytes := int64(maxMB) << 20
 		origBody := c.Request.Body
 		wrapMaxBytes := func(body io.ReadCloser) io.ReadCloser {
 			return http.MaxBytesReader(c.Writer, body, maxBytes)
 		}
 		switch c.GetHeader("Content-Encoding") {
 		case "gzip":
-			gzipReader, err := gzip.NewReader(c.Request.Body)
+			gzipReader, err := gzip.NewReader(origBody)
 			if err != nil {
+				_ = origBody.Close()
 				c.AbortWithStatus(http.StatusBadRequest)
 				return
 			}
-			defer gzipReader.Close()
-
-			// Replace the request body with the decompressed data
-			c.Request.Body = io.NopCloser(gzipReader)
+			// Replace the request body with the decompressed data, and enforce a max size (post-decompression).
+			c.Request.Body = wrapMaxBytes(&readCloser{
+				Reader: gzipReader,
+				closeFn: func() error {
+					_ = gzipReader.Close()
+					return origBody.Close()
+				},
+			})
 			c.Request.Header.Del("Content-Encoding")
 		case "br":
-			reader := brotli.NewReader(c.Request.Body)
-			c.Request.Body = io.NopCloser(reader)
+			reader := brotli.NewReader(origBody)
+			c.Request.Body = wrapMaxBytes(&readCloser{
+				Reader: reader,
+				closeFn: func() error {
+					return origBody.Close()
+				},
+			})
 			c.Request.Header.Del("Content-Encoding")
 		default:
 			// Even for uncompressed bodies, enforce a max size to avoid huge request allocations.
 			c.Request.Body = wrapMaxBytes(origBody)
 		}
 		// Continue processing the request
--- a/model/channel.go
+++ b/model/channel.go
@@ -254,6 +254,9 @@ func (channel *Channel) Save() error {
 }
 func (channel *Channel) SaveWithoutKey() error {
 	if channel.Id == 0 {
 		return errors.New("channel ID is 0")
 	}
 	return DB.Omit("key").Save(channel).Error
 }
--- a/model/token.go
+++ b/model/token.go
@@ -6,7 +6,6 @@ import (
 	"strings"
 	"github.com/QuantumNous/new-api/common"
 	"github.com/bytedance/gopkg/util/gopool"
 	"gorm.io/gorm"
 )
@@ -27,6 +26,7 @@ type Token struct {
 	AllowIps           *string        `json:"allow_ips" gorm:"default:''"`
 	UsedQuota          int            `json:"used_quota" gorm:"default:0"` // used quota
 	Group              string         `json:"group" gorm:"default:''"`
+	CrossGroupRetry    bool           `json:"cross_group_retry" gorm:"default:false"` // cross-group retry; only effective for the auto group
 	DeletedAt          gorm.DeletedAt `gorm:"index"`
 }
@@ -34,26 +34,26 @@ func (token *Token) Clean() {
 	token.Key = ""
 }
-func (token *Token) GetIpLimitsMap() map[string]any {
+func (token *Token) GetIpLimits() []string {
 	// delete empty spaces
 	//split with \n
-	ipLimitsMap := make(map[string]any)
+	ipLimits := make([]string, 0)
 	if token.AllowIps == nil {
-		return ipLimitsMap
+		return ipLimits
 	}
 	cleanIps := strings.ReplaceAll(*token.AllowIps, " ", "")
 	if cleanIps == "" {
-		return ipLimitsMap
+		return ipLimits
 	}
 	ips := strings.Split(cleanIps, "\n")
 	for _, ip := range ips {
 		ip = strings.TrimSpace(ip)
 		ip = strings.ReplaceAll(ip, ",", "")
-		if common.IsIP(ip) {
+		if ip != "" {
-			ipLimitsMap[ip] = true
+			ipLimits = append(ipLimits, ip)
 		}
 	}
-	return ipLimitsMap
+	return ipLimits
 }
 func GetAllUserTokens(userId int, startIdx int, num int) ([]*Token, error) {
@@ -112,7 +112,12 @@ func ValidateUserToken(key string) (token *Token, err error) {
 		}
 		return token, nil
 	}
-	return nil, errors.New("无效的令牌")
+	common.SysLog("ValidateUserToken: failed to get token: " + err.Error())
+	if errors.Is(err, gorm.ErrRecordNotFound) {
+		return nil, errors.New("无效的令牌")
+	} else {
+		return nil, errors.New("无效的令牌，数据库查询出错，请联系管理员")
+	}
 }
 func GetTokenByIds(id int, userId int) (*Token, error) {
@@ -185,7 +190,7 @@ func (token *Token) Update() (err error) {
 		}
 	}()
 	err = DB.Model(token).Select("name", "status", "expired_time", "remain_quota", "unlimited_quota",
-		"model_limits_enabled", "model_limits", "allow_ips", "group").Updates(token).Error
+		"model_limits_enabled", "model_limits", "allow_ips", "group", "cross_group_retry").Updates(token).Error
 	return err
 }
--- a/relay/audio_handler.go
+++ b/relay/audio_handler.go
@@ -67,8 +67,11 @@ func AudioHelper(c *gin.Context, info *relaycommon.RelayInfo) (newAPIError *type
 		service.ResetStatusCode(newAPIError, statusCodeMappingStr)
 		return newAPIError
 	}
-
-	postConsumeQuota(c, info, usage.(*dto.Usage), "")
+	if usage.(*dto.Usage).CompletionTokenDetails.AudioTokens > 0 || usage.(*dto.Usage).PromptTokensDetails.AudioTokens > 0 {
+		service.PostAudioConsumeQuota(c, info, usage.(*dto.Usage), "")
+	} else {
+		postConsumeQuota(c, info, usage.(*dto.Usage), "")
+	}
 	return nil
 }
--- a/relay/channel/adapter.go
+++ b/relay/channel/adapter.go
@@ -47,7 +47,7 @@ type TaskAdaptor interface {
 	GetChannelName() string
 	// FetchTask
-	FetchTask(baseUrl, key string, body map[string]any) (*http.Response, error)
+	FetchTask(baseUrl, key string, body map[string]any, proxy string) (*http.Response, error)
 	ParseTaskResult(respBody []byte) (*relaycommon.TaskInfo, error)
 }
--- a/relay/channel/api_request.go
+++ b/relay/channel/api_request.go
@@ -27,8 +27,6 @@ import (
 func SetupApiRequestHeader(info *common.RelayInfo, c *gin.Context, req *http.Header) {
 	if info.RelayMode == constant.RelayModeAudioTranscription || info.RelayMode == constant.RelayModeAudioTranslation {
 		// multipart/form-data
-	} else if info.RelayMode == constant.RelayModeImagesEdits {
-		// multipart/form-data
 	} else if info.RelayMode == constant.RelayModeRealtime {
 		// websocket
 	} else {
--- a/relay/channel/aws/constants.go
+++ b/relay/channel/aws/constants.go
@@ -18,7 +18,7 @@ var awsModelIDMap = map[string]string{
 	"claude-opus-4-1-20250805":   "anthropic.claude-opus-4-1-20250805-v1:0",
 	"claude-sonnet-4-5-20250929": "anthropic.claude-sonnet-4-5-20250929-v1:0",
 	"claude-haiku-4-5-20251001":  "anthropic.claude-haiku-4-5-20251001-v1:0",
-	"claude-opus-4-5-20251101":  "anthropic.claude-opus-4-5-20251101-v1:0",
+	"claude-opus-4-5-20251101":   "anthropic.claude-opus-4-5-20251101-v1:0",
 	// Nova models
 	"nova-micro-v1:0":   "amazon.nova-micro-v1:0",
 	"nova-lite-v1:0":    "amazon.nova-lite-v1:0",
--- a/relay/channel/aws/relay-aws.go
+++ b/relay/channel/aws/relay-aws.go
@@ -18,6 +18,7 @@ import (
 	"github.com/gin-gonic/gin"
 	"github.com/pkg/errors"
 	"github.com/QuantumNous/new-api/setting/model_setting"
 	"github.com/aws/aws-sdk-go-v2/aws"
 	"github.com/aws/aws-sdk-go-v2/credentials"
 	"github.com/aws/aws-sdk-go-v2/service/bedrockruntime"
@@ -129,7 +130,7 @@ func doAwsClientRequest(c *gin.Context, info *relaycommon.RelayInfo, a *Adaptor,
 				Accept:      aws.String("application/json"),
 				ContentType: aws.String("application/json"),
 			}
-			awsReq.Body, err = common.Marshal(awsClaudeReq)
+			awsReq.Body, err = buildAwsRequestBody(c, info, awsClaudeReq)
 			if err != nil {
 				return nil, types.NewError(errors.Wrap(err, "marshal aws request fail"), types.ErrorCodeBadRequestBody)
 			}
@@ -141,7 +142,7 @@ func doAwsClientRequest(c *gin.Context, info *relaycommon.RelayInfo, a *Adaptor,
 				Accept:      aws.String("application/json"),
 				ContentType: aws.String("application/json"),
 			}
-			awsReq.Body, err = common.Marshal(awsClaudeReq)
+			awsReq.Body, err = buildAwsRequestBody(c, info, awsClaudeReq)
 			if err != nil {
 				return nil, types.NewError(errors.Wrap(err, "marshal aws request fail"), types.ErrorCodeBadRequestBody)
 			}
@@ -151,6 +152,24 @@ func doAwsClientRequest(c *gin.Context, info *relaycommon.RelayInfo, a *Adaptor,
 	}
 }
 // buildAwsRequestBody prepares the payload for AWS requests, applying passthrough rules when enabled.
 func buildAwsRequestBody(c *gin.Context, info *relaycommon.RelayInfo, awsClaudeReq any) ([]byte, error) {
 	if model_setting.GetGlobalSettings().PassThroughRequestEnabled || info.ChannelSetting.PassThroughBodyEnabled {
 		body, err := common.GetRequestBody(c)
 		if err != nil {
 			return nil, errors.Wrap(err, "get request body for pass-through fail")
 		}
 		var data map[string]interface{}
 		if err := common.Unmarshal(body, &data); err != nil {
 			return nil, errors.Wrap(err, "pass-through unmarshal request body fail")
 		}
 		delete(data, "model")
 		delete(data, "stream")
 		return common.Marshal(data)
 	}
 	return common.Marshal(awsClaudeReq)
 }
 func getAwsRegionPrefix(awsRegionId string) string {
 	parts := strings.Split(awsRegionId, "-")
 	regionPrefix := ""
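Note on `buildAwsRequestBody` above: in pass-through mode the client's original JSON is forwarded after removing the `model` and `stream` keys. A standalone sketch of that step, using `encoding/json` instead of the project's `common` helpers; `stripPassThroughFields` is a hypothetical name.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// stripPassThroughFields (hypothetical) removes the keys that the diff deletes
// before forwarding a pass-through body, and re-encodes the rest.
func stripPassThroughFields(body []byte) ([]byte, error) {
	var data map[string]any
	if err := json.Unmarshal(body, &data); err != nil {
		return nil, fmt.Errorf("pass-through unmarshal request body: %w", err)
	}
	delete(data, "model")
	delete(data, "stream")
	return json.Marshal(data)
}

func main() {
	out, err := stripPassThroughFields([]byte(`{"model":"m","stream":true,"max_tokens":5}`))
	fmt.Println(string(out), err)
}
```

Decoding into a generic map means key order and number formatting of the forwarded body may differ from what the client sent.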
--- a/relay/channel/baidu/relay-baidu.go
+++ b/relay/channel/baidu/relay-baidu.go
@@ -150,7 +150,7 @@ func baiduHandler(c *gin.Context, info *relaycommon.RelayInfo, resp *http.Respon
 		return types.NewError(err, types.ErrorCodeBadResponseBody), nil
 	}
 	if baiduResponse.ErrorMsg != "" {
-		return types.NewError(fmt.Errorf(baiduResponse.ErrorMsg), types.ErrorCodeBadResponseBody), nil
+		return types.NewError(fmt.Errorf("%s", baiduResponse.ErrorMsg), types.ErrorCodeBadResponseBody), nil
 	}
 	fullTextResponse := responseBaidu2OpenAI(&baiduResponse)
 	jsonResponse, err := json.Marshal(fullTextResponse)
@@ -175,7 +175,7 @@ func baiduEmbeddingHandler(c *gin.Context, info *relaycommon.RelayInfo, resp *ht
 		return types.NewError(err, types.ErrorCodeBadResponseBody), nil
 	}
 	if baiduResponse.ErrorMsg != "" {
-		return types.NewError(fmt.Errorf(baiduResponse.ErrorMsg), types.ErrorCodeBadResponseBody), nil
+		return types.NewError(fmt.Errorf("%s", baiduResponse.ErrorMsg), types.ErrorCodeBadResponseBody), nil
 	}
 	fullTextResponse := embeddingResponseBaidu2OpenAI(&baiduResponse)
 	jsonResponse, err := json.Marshal(fullTextResponse)
--- a/relay/channel/claude/constants.go
+++ b/relay/channel/claude/constants.go
@@ -9,6 +9,7 @@ var ModelList = []string{
 	"claude-3-opus-20240229",
 	"claude-3-haiku-20240307",
 	"claude-3-5-haiku-20241022",
 	"claude-haiku-4-5-20251001",
 	"claude-3-5-sonnet-20240620",
 	"claude-3-5-sonnet-20241022",
 	"claude-3-7-sonnet-20250219",
--- a/relay/channel/claude/relay-claude.go
+++ b/relay/channel/claude/relay-claude.go
@@ -673,7 +673,7 @@ func HandleStreamResponseData(c *gin.Context, info *relaycommon.RelayInfo, claud
 func HandleStreamFinalResponse(c *gin.Context, info *relaycommon.RelayInfo, claudeInfo *ClaudeResponseInfo, requestMode int) {
 	if requestMode == RequestModeCompletion {
-		claudeInfo.Usage = service.ResponseText2Usage(c, claudeInfo.ResponseText.String(), info.UpstreamModelName, info.PromptTokens)
+		claudeInfo.Usage = service.ResponseText2Usage(c, claudeInfo.ResponseText.String(), info.UpstreamModelName, info.GetEstimatePromptTokens())
 	} else {
 		if claudeInfo.Usage.PromptTokens == 0 {
 			// upstream error
@@ -734,10 +734,7 @@ func HandleClaudeResponseData(c *gin.Context, info *relaycommon.RelayInfo, claud
 		return types.WithClaudeError(*claudeError, http.StatusInternalServerError)
 	}
 	if requestMode == RequestModeCompletion {
-		completionTokens := service.CountTextToken(claudeResponse.Completion, info.OriginModelName)
-		claudeInfo.Usage.PromptTokens = info.PromptTokens
-		claudeInfo.Usage.CompletionTokens = completionTokens
-		claudeInfo.Usage.TotalTokens = info.PromptTokens + completionTokens
+		claudeInfo.Usage = service.ResponseText2Usage(c, claudeResponse.Completion, info.UpstreamModelName, info.GetEstimatePromptTokens())
 	} else {
 		claudeInfo.Usage.PromptTokens = claudeResponse.Usage.InputTokens
 		claudeInfo.Usage.CompletionTokens = claudeResponse.Usage.OutputTokens
--- a/relay/channel/cloudflare/relay_cloudflare.go
+++ b/relay/channel/cloudflare/relay_cloudflare.go
@@ -74,7 +74,7 @@ func cfStreamHandler(c *gin.Context, info *relaycommon.RelayInfo, resp *http.Res
 	if err := scanner.Err(); err != nil {
 		logger.LogError(c, "error_scanning_stream_response: "+err.Error())
 	}
-	usage := service.ResponseText2Usage(c, responseText, info.UpstreamModelName, info.PromptTokens)
+	usage := service.ResponseText2Usage(c, responseText, info.UpstreamModelName, info.GetEstimatePromptTokens())
 	if info.ShouldIncludeUsage {
 		response := helper.GenerateFinalUsageResponse(id, info.StartTime.Unix(), info.UpstreamModelName, *usage)
 		err := helper.ObjectData(c, response)
@@ -105,7 +105,7 @@ func cfHandler(c *gin.Context, info *relaycommon.RelayInfo, resp *http.Response)
 	for _, choice := range response.Choices {
 		responseText += choice.Message.StringContent()
 	}
-	usage := service.ResponseText2Usage(c, responseText, info.UpstreamModelName, info.PromptTokens)
+	usage := service.ResponseText2Usage(c, responseText, info.UpstreamModelName, info.GetEstimatePromptTokens())
 	response.Usage = *usage
 	response.Id = helper.GetResponseID(c)
 	jsonResponse, err := json.Marshal(response)
@@ -142,10 +142,6 @@ func cfSTTHandler(c *gin.Context, info *relaycommon.RelayInfo, resp *http.Respon
 	c.Writer.WriteHeader(resp.StatusCode)
 	_, _ = c.Writer.Write(jsonResponse)
-	usage := &dto.Usage{}
-	usage.PromptTokens = info.PromptTokens
-	usage.CompletionTokens = service.CountTextToken(cfResp.Result.Text, info.UpstreamModelName)
-	usage.TotalTokens = usage.PromptTokens + usage.CompletionTokens
+	usage := service.ResponseText2Usage(c, cfResp.Result.Text, info.UpstreamModelName, info.GetEstimatePromptTokens())
 	return nil, usage
 }
--- a/relay/channel/cohere/relay-cohere.go
+++ b/relay/channel/cohere/relay-cohere.go
@@ -165,7 +165,7 @@ func cohereStreamHandler(c *gin.Context, info *relaycommon.RelayInfo, resp *http
 		}
 	})
 	if usage.PromptTokens == 0 {
-		usage = service.ResponseText2Usage(c, responseText, info.UpstreamModelName, info.PromptTokens)
+		usage = service.ResponseText2Usage(c, responseText, info.UpstreamModelName, info.GetEstimatePromptTokens())
 	}
 	return usage, nil
 }
@@ -225,9 +225,9 @@ func cohereRerankHandler(c *gin.Context, resp *http.Response, info *relaycommon.
 	}
 	usage := dto.Usage{}
 	if cohereResp.Meta.BilledUnits.InputTokens == 0 {
-		usage.PromptTokens = info.PromptTokens
+		usage.PromptTokens = info.GetEstimatePromptTokens()
 		usage.CompletionTokens = 0
-		usage.TotalTokens = info.PromptTokens
+		usage.TotalTokens = info.GetEstimatePromptTokens()
 	} else {
 		usage.PromptTokens = cohereResp.Meta.BilledUnits.InputTokens
 		usage.CompletionTokens = cohereResp.Meta.BilledUnits.OutputTokens
--- a/relay/channel/coze/relay-coze.go
+++ b/relay/channel/coze/relay-coze.go
@@ -208,7 +208,7 @@ func handleCozeEvent(c *gin.Context, event string, data string, responseText *st
 			return
 		}
-		common.SysLog(fmt.Sprintf("stream event error: ", errorData.Code, errorData.Message))
+		common.SysLog(fmt.Sprintf("stream event error: %v %v", errorData.Code, errorData.Message))
 	}
 }
--- a/relay/channel/dify/relay-dify.go
+++ b/relay/channel/dify/relay-dify.go
@@ -246,7 +246,7 @@ func difyStreamHandler(c *gin.Context, info *relaycommon.RelayInfo, resp *http.R
 	})
 	helper.Done(c)
 	if usage.TotalTokens == 0 {
-		usage = service.ResponseText2Usage(c, responseText, info.UpstreamModelName, info.PromptTokens)
+		usage = service.ResponseText2Usage(c, responseText, info.UpstreamModelName, info.GetEstimatePromptTokens())
 	}
 	usage.CompletionTokens += nodeToken
 	return usage, nil
--- a/relay/channel/gemini/adaptor.go
+++ b/relay/channel/gemini/adaptor.go
@@ -1,12 +1,10 @@
 package gemini
 import (
-	"encoding/json"
 	"errors"
 	"fmt"
 	"io"
 	"net/http"
-	"slices"
 	"strings"
 	"github.com/QuantumNous/new-api/dto"
@@ -57,139 +55,9 @@ func (a *Adaptor) ConvertAudioRequest(c *gin.Context, info *relaycommon.RelayInf
 	return nil, errors.New("not implemented")
 }
-type ImageConfig struct {
-	AspectRatio string `json:"aspectRatio,omitempty"`
-	ImageSize   string `json:"imageSize,omitempty"`
-}
-type SizeMapping struct {
-	AspectRatio string
-	ImageSize   string
-}
-type QualityMapping struct {
-	Standard string
-	HD       string
-	High     string
-	FourK    string
-	Auto     string
-}
-func getImageSizeMapping() QualityMapping {
-	return QualityMapping{
-		Standard: "1K",
-		HD:       "2K",
-		High:     "2K",
-		FourK:    "4K",
-		Auto:     "1K",
-	}
-}
-func getSizeMappings() map[string]SizeMapping {
-	return map[string]SizeMapping{
-		// Gemini 2.5 Flash Image - default 1K resolutions
-		"1024x1024": {AspectRatio: "1:1", ImageSize: ""},
-		"832x1248":  {AspectRatio: "2:3", ImageSize: ""},
-		"1248x832":  {AspectRatio: "3:2", ImageSize: ""},
-		"864x1184":  {AspectRatio: "3:4", ImageSize: ""},
-		"1184x864":  {AspectRatio: "4:3", ImageSize: ""},
-		"896x1152":  {AspectRatio: "4:5", ImageSize: ""},
-		"1152x896":  {AspectRatio: "5:4", ImageSize: ""},
-		"768x1344":  {AspectRatio: "9:16", ImageSize: ""},
-		"1344x768":  {AspectRatio: "16:9", ImageSize: ""},
-		"1536x672":  {AspectRatio: "21:9", ImageSize: ""},
-		// Gemini 3 Pro Image Preview resolutions
-		"1536x1024": {AspectRatio: "3:2", ImageSize: ""},
-		"1024x1536": {AspectRatio: "2:3", ImageSize: ""},
-		"1024x1792": {AspectRatio: "9:16", ImageSize: ""},
-		"1792x1024": {AspectRatio: "16:9", ImageSize: ""},
-		"2048x2048": {AspectRatio: "1:1", ImageSize: "2K"},
-		"4096x4096": {AspectRatio: "1:1", ImageSize: "4K"},
-	}
-}
-func processSizeParameters(size, quality string) ImageConfig {
-	config := ImageConfig{} // empty by default
-	if size != "" {
-		if strings.Contains(size, ":") {
-			config.AspectRatio = size // set directly, without comparing against a default
-		} else {
-			if mapping, exists := getSizeMappings()[size]; exists {
-				if mapping.AspectRatio != "" {
-					config.AspectRatio = mapping.AspectRatio
-				}
-				if mapping.ImageSize != "" {
-					config.ImageSize = mapping.ImageSize
-				}
-			}
-		}
-	}
-	if quality != "" {
-		qualityMapping := getImageSizeMapping()
-		switch strings.ToLower(strings.TrimSpace(quality)) {
-		case "hd", "high":
-			config.ImageSize = qualityMapping.HD
-		case "4k":
-			config.ImageSize = qualityMapping.FourK
-		case "standard", "medium", "low", "auto", "1k":
-			config.ImageSize = qualityMapping.Standard
-		}
-	}
-	return config
-}
 func (a *Adaptor) ConvertImageRequest(c *gin.Context, info *relaycommon.RelayInfo, request dto.ImageRequest) (any, error) {
-	if model_setting.IsGeminiModelSupportImagine(info.UpstreamModelName) {
-		var content any
-		if base64Data, err := relaycommon.GetImageBase64sFromForm(c); err == nil {
-			content = []any{
-				dto.MediaContent{
-					Type: dto.ContentTypeText,
-					Text: request.Prompt,
-				},
-				dto.MediaContent{
-					Type: dto.ContentTypeFile,
-					File: &dto.MessageFile{
-						FileData: base64Data.String(),
-					},
-				},
-			}
-		} else {
-			content = request.Prompt
-		}
-		chatRequest := dto.GeneralOpenAIRequest{
-			Model: request.Model,
-			Messages: []dto.Message{
-				{Role: "user", Content: content},
-			},
-			N: int(request.N),
-		}
-		config := processSizeParameters(strings.TrimSpace(request.Size), request.Quality)
-		// Compatibility: passing quality [imageSize] to nano-banana fails with: An internal error has occurred. Please retry or report in https://developers.generativeai.google/guide/troubleshooting
-		if slices.Contains([]string{"nano-banana", "gemini-2.5-flash-image"}, info.UpstreamModelName) {
-			config.ImageSize = ""
-		}
-		googleGenerationConfig := map[string]interface{}{
-			"responseModalities": []string{"TEXT", "IMAGE"},
-			"imageConfig":        config,
-		}
-		extraBody := map[string]interface{}{
-			"google": map[string]interface{}{
-				"generationConfig": googleGenerationConfig,
-			},
-		}
-		chatRequest.ExtraBody, _ = json.Marshal(extraBody)
-		return a.ConvertOpenAIRequest(c, info, &chatRequest)
+	if !strings.HasPrefix(info.UpstreamModelName, "imagen") {
+		return nil, errors.New("not supported model for image generation")
 	}
 	// convert size to aspect ratio but allow user to specify aspect ratio
@@ -199,8 +67,17 @@ func (a *Adaptor) ConvertImageRequest(c *gin.Context, info *relaycommon.RelayInf
 		if strings.Contains(size, ":") {
 			aspectRatio = size
 		} else {
-			if mapping, exists := getSizeMappings()[size]; exists && mapping.AspectRatio != "" {
-				aspectRatio = mapping.AspectRatio
+			switch size {
+			case "256x256", "512x512", "1024x1024":
+				aspectRatio = "1:1"
+			case "1536x1024":
+				aspectRatio = "3:2"
+			case "1024x1536":
+				aspectRatio = "2:3"
+			case "1024x1792":
+				aspectRatio = "9:16"
+			case "1792x1024":
+				aspectRatio = "16:9"
 			}
 		}
 	}
@@ -260,6 +137,8 @@ func (a *Adaptor) GetRequestURL(info *relaycommon.RelayInfo) (string, error) {
 			info.UpstreamModelName = strings.TrimSuffix(info.UpstreamModelName, "-thinking")
 		} else if strings.HasSuffix(info.UpstreamModelName, "-nothinking") {
 			info.UpstreamModelName = strings.TrimSuffix(info.UpstreamModelName, "-nothinking")
 		} else if baseModel, level := parseThinkingLevelSuffix(info.UpstreamModelName); level != "" {
 			info.UpstreamModelName = baseModel
 		}
 	}
@@ -381,10 +260,6 @@ func (a *Adaptor) DoResponse(c *gin.Context, resp *http.Response, info *relaycom
 		return GeminiImageHandler(c, info, resp)
 	}
-	if model_setting.IsGeminiModelSupportImagine(info.UpstreamModelName) {
-		return ChatImageHandler(c, info, resp)
-	}
 	// check if the model is an embedding model
 	if strings.HasPrefix(info.UpstreamModelName, "text-embedding") ||
 		strings.HasPrefix(info.UpstreamModelName, "embedding") ||
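Note on the size handling in `ConvertImageRequest` above: the lookup table is replaced by a fixed `switch` over OpenAI-style sizes. A standalone sketch of that mapping follows. `aspectRatioForSize` is a hypothetical helper, and returning an empty string for unknown sizes is an assumption, since the default value used by the adaptor lies outside the shown hunk.

```go
package main

import (
	"fmt"
	"strings"
)

// aspectRatioForSize (hypothetical) maps an OpenAI-style size to an aspect ratio.
// A value that already contains ":" is treated as a user-specified ratio.
// Unknown sizes return "" here; the real default is not visible in the diff.
func aspectRatioForSize(size string) string {
	if strings.Contains(size, ":") {
		return size
	}
	switch size {
	case "256x256", "512x512", "1024x1024":
		return "1:1"
	case "1536x1024":
		return "3:2"
	case "1024x1536":
		return "2:3"
	case "1024x1792":
		return "9:16"
	case "1792x1024":
		return "16:9"
	}
	return ""
}

func main() {
	for _, size := range []string{"512x512", "1792x1024", "4:3", "123x456"} {
		fmt.Printf("%s -> %q\n", size, aspectRatioForSize(size))
	}
}
```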
--- a/relay/channel/gemini/relay-gemini-native.go
+++ b/relay/channel/gemini/relay-gemini-native.go
@@ -5,7 +5,6 @@ import (
 	"net/http"
 	"github.com/QuantumNous/new-api/common"
-	"github.com/QuantumNous/new-api/constant"
 	"github.com/QuantumNous/new-api/dto"
 	"github.com/QuantumNous/new-api/logger"
 	relaycommon "github.com/QuantumNous/new-api/relay/common"
@@ -70,12 +69,7 @@ func NativeGeminiEmbeddingHandler(c *gin.Context, resp *http.Response, info *rel
 		println(string(responseBody))
 	}
-	usage := &dto.Usage{
-		PromptTokens: info.PromptTokens,
-		TotalTokens:  info.PromptTokens,
-	}
+	usage := service.ResponseText2Usage(c, "", info.UpstreamModelName, info.GetEstimatePromptTokens())
-	common.SetContextKey(c, constant.ContextKeyLocalCountTokens, true)
 	if info.IsGeminiBatchEmbedding {
 		var geminiResponse dto.GeminiBatchEmbeddingResponse
@@ -100,10 +94,10 @@ func GeminiTextGenerationStreamHandler(c *gin.Context, info *relaycommon.RelayIn
 	helper.SetEventStreamHeaders(c)
 	return geminiStreamHandler(c, info, resp, func(data string, geminiResponse *dto.GeminiChatResponse) bool {
 		// send the GeminiChatResponse directly
 		err := helper.StringData(c, data)
 		if err != nil {
-			logger.LogError(c, err.Error())
+			logger.LogError(c, "failed to write stream data: "+err.Error())
 			return false
 		}
 		info.SendResponseCount++
 		return true
--- a/relay/channel/gemini/relay-gemini.go
+++ b/relay/channel/gemini/relay-gemini.go
@@ -19,8 +19,8 @@ import (
 	"github.com/QuantumNous/new-api/relay/helper"
 	"github.com/QuantumNous/new-api/service"
 	"github.com/QuantumNous/new-api/setting/model_setting"
 	"github.com/QuantumNous/new-api/setting/reasoning"
 	"github.com/QuantumNous/new-api/types"
 	"github.com/gin-gonic/gin"
 )
@@ -122,6 +122,14 @@ func clampThinkingBudgetByEffort(modelName string, effort string) int {
 	return clampThinkingBudget(modelName, maxBudget)
 }
+func parseThinkingLevelSuffix(modelName string) (string, string) {
+	base, level, ok := reasoning.TrimEffortSuffix(modelName)
+	if !ok {
+		return modelName, ""
+	}
+	return base, level
+}
 func ThinkingAdaptor(geminiRequest *dto.GeminiChatRequest, info *relaycommon.RelayInfo, oaiRequest ...dto.GeneralOpenAIRequest) {
 	if model_setting.GetGeminiSettings().ThinkingAdapterEnabled {
 		modelName := info.UpstreamModelName
@@ -178,12 +186,18 @@ func ThinkingAdaptor(geminiRequest *dto.GeminiChatRequest, info *relaycommon.Rel
 					ThinkingBudget: common.GetPointer(0),
 				}
 			}
+		} else if _, level := parseThinkingLevelSuffix(modelName); level != "" {
+			geminiRequest.GenerationConfig.ThinkingConfig = &dto.GeminiThinkingConfig{
+				IncludeThoughts: true,
+				ThinkingLevel:   level,
+			}
+			info.ReasoningEffort = level
 		}
 	}
 }
 // Setting safety to the lowest possible values since Gemini is already powerless enough
-func CovertOpenAI2Gemini(c *gin.Context, textRequest dto.GeneralOpenAIRequest, info *relaycommon.RelayInfo, base64Data ...*relaycommon.Base64Data) (*dto.GeminiChatRequest, error) {
+func CovertOpenAI2Gemini(c *gin.Context, textRequest dto.GeneralOpenAIRequest, info *relaycommon.RelayInfo) (*dto.GeminiChatRequest, error) {
 	geminiRequest := dto.GeminiChatRequest{
 		Contents: make([]dto.GeminiChatContent, 0, len(textRequest.Messages)),
@@ -208,6 +222,7 @@ func CovertOpenAI2Gemini(c *gin.Context, textRequest dto.GeneralOpenAIRequest, i
 	adaptorWithExtraBody := false
 	// patch extra_body
 	if len(textRequest.ExtraBody) > 0 {
 		if !strings.HasSuffix(info.UpstreamModelName, "-nothinking") {
 			var extraBody map[string]interface{}
@@ -240,13 +255,36 @@ func CovertOpenAI2Gemini(c *gin.Context, textRequest dto.GeneralOpenAIRequest, i
 					}
 				}
-				if generationConfig, ok := googleBody["generationConfig"].(map[string]any); ok {
+				// check error param name like imageConfig, should be image_config
-					generationConfigBytes, err := json.Marshal(generationConfig)
+				if _, hasErrorParam := googleBody["imageConfig"]; hasErrorParam {
-					if err != nil {
+					return nil, errors.New("extra_body.google.imageConfig is not supported, use extra_body.google.image_config instead")
-						return nil, fmt.Errorf("failed to marshal generationConfig: %w", err)
+				}
 				if imageConfig, ok := googleBody["image_config"].(map[string]interface{}); ok {
 					// check error param name like aspectRatio, should be aspect_ratio
 					if _, hasErrorParam := imageConfig["aspectRatio"]; hasErrorParam {
 						return nil, errors.New("extra_body.google.image_config.aspectRatio is not supported, use extra_body.google.image_config.aspect_ratio instead")
 					}
-					if err := json.Unmarshal(generationConfigBytes, &geminiRequest.GenerationConfig); err != nil {
+					// check error param name like imageSize, should be image_size
-						return nil, fmt.Errorf("failed to unmarshal generationConfig: %w", err)
+					if _, hasErrorParam := imageConfig["imageSize"]; hasErrorParam {
 						return nil, errors.New("extra_body.google.image_config.imageSize is not supported, use extra_body.google.image_config.image_size instead")
 					}
 					// convert snake_case to camelCase for Gemini API
 					geminiImageConfig := make(map[string]interface{})
 					if aspectRatio, ok := imageConfig["aspect_ratio"]; ok {
 						geminiImageConfig["aspectRatio"] = aspectRatio
 					}
 					if imageSize, ok := imageConfig["image_size"]; ok {
 						geminiImageConfig["imageSize"] = imageSize
 					}
 					if len(geminiImageConfig) > 0 {
 						imageConfigBytes, err := common.Marshal(geminiImageConfig)
 						if err != nil {
 							return nil, fmt.Errorf("failed to marshal image_config: %w", err)
 						}
 						geminiRequest.GenerationConfig.ImageConfig = imageConfigBytes
 					}
 				}
 			}
@@ -422,9 +460,68 @@ func CovertOpenAI2Gemini(c *gin.Context, textRequest dto.GeneralOpenAIRequest, i
 				if part.Text == "" {
 					continue
 				}
-				parts = append(parts, dto.GeminiPart{
-					Text: part.Text,
-				})
+				// check markdown image ![image](data:image/jpeg;base64,xxxxxxxxxxxx)
+				// use string search instead of a regex to avoid performance problems on large text
+				text := part.Text
+				hasMarkdownImage := false
+				for {
+					// quick check for a markdown image marker
+					startIdx := strings.Index(text, "![")
+					if startIdx == -1 {
+						break
+					}
+					// find "]("
+					bracketIdx := strings.Index(text[startIdx:], "](data:")
+					if bracketIdx == -1 {
+						break
+					}
+					bracketIdx += startIdx
+					// find the closing ")"
+					closeIdx := strings.Index(text[bracketIdx+2:], ")")
+					if closeIdx == -1 {
+						break
+					}
+					closeIdx += bracketIdx + 2
+					hasMarkdownImage = true
+					// add the text before the image
+					if startIdx > 0 {
+						textBefore := text[:startIdx]
+						if textBefore != "" {
+							parts = append(parts, dto.GeminiPart{
+								Text: textBefore,
+							})
+						}
+					}
+					// extract the data URL (from just after "](" up to the closing ")")
+					dataUrl := text[bracketIdx+2 : closeIdx]
+					imageNum += 1
+					if constant.GeminiVisionMaxImageNum != -1 && imageNum > constant.GeminiVisionMaxImageNum {
+						return nil, fmt.Errorf("too many images in the message, max allowed is %d", constant.GeminiVisionMaxImageNum)
+					}
+					format, base64String, err := service.DecodeBase64FileData(dataUrl)
+					if err != nil {
+						return nil, fmt.Errorf("decode markdown base64 image data failed: %s", err.Error())
+					}
+					imgPart := dto.GeminiPart{
+						InlineData: &dto.GeminiInlineData{
+							MimeType: format,
+							Data:     base64String,
+						},
+					}
+					if shouldAttachThoughtSignature {
+						imgPart.ThoughtSignature = json.RawMessage(strconv.Quote(thoughtSignatureBypassValue))
+					}
+					parts = append(parts, imgPart)
+					// continue with the remaining text
+					text = text[closeIdx+1:]
+				}
+				// add the remaining text, or the original text (if no markdown image was found)
+				if !hasMarkdownImage {
+					parts = append(parts, dto.GeminiPart{
+						Text: part.Text,
+					})
+				}
 			} else if part.Type == dto.ContentTypeImageURL {
 				imageNum += 1
@@ -464,11 +561,10 @@ func CovertOpenAI2Gemini(c *gin.Context, textRequest dto.GeneralOpenAIRequest, i
 					})
 				}
 			} else if part.Type == dto.ContentTypeFile {
-				file := part.GetFile()
-				if file.FileId != "" {
+				if part.GetFile().FileId != "" {
 					return nil, fmt.Errorf("only base64 file is supported in gemini")
 				}
-				format, base64String, err := service.DecodeBase64FileData(file.FileData)
+				format, base64String, err := service.DecodeBase64FileData(part.GetFile().FileData)
 				if err != nil {
 					return nil, fmt.Errorf("decode base64 file data failed: %s", err.Error())
 				}
@@ -1033,7 +1129,7 @@ func geminiStreamHandler(c *gin.Context, info *relaycommon.RelayInfo, resp *http
 	if usage.CompletionTokens <= 0 {
 		str := responseText.String()
 		if len(str) > 0 {
-			usage = service.ResponseText2Usage(c, responseText.String(), info.UpstreamModelName, info.PromptTokens)
+			usage = service.ResponseText2Usage(c, responseText.String(), info.UpstreamModelName, info.GetEstimatePromptTokens())
 		} else {
 			usage = &dto.Usage{}
 		}
@@ -1206,11 +1302,7 @@ func GeminiEmbeddingHandler(c *gin.Context, info *relaycommon.RelayInfo, resp *h
 	// Google has not yet clarified how embedding models will be billed
 	// refer to openai billing method to use input tokens billing
 	// https://platform.openai.com/docs/guides/embeddings#what-are-embeddings
-	usage := &dto.Usage{
+	usage := service.ResponseText2Usage(c, "", info.UpstreamModelName, info.GetEstimatePromptTokens())
-		PromptTokens:     info.PromptTokens,
-		CompletionTokens: 0,
-		TotalTokens:      info.PromptTokens,
-	}
 	openAIResponse.Usage = *usage
 	jsonResponse, jsonErr := common.Marshal(openAIResponse)
@@ -1275,70 +1367,3 @@ func GeminiImageHandler(c *gin.Context, info *relaycommon.RelayInfo, resp *http.
 	return usage, nil
 }
-func convertToOaiImageResponse(geminiResponse *dto.GeminiChatResponse) (*dto.ImageResponse, error) {
-	openAIResponse := &dto.ImageResponse{
-		Created: common.GetTimestamp(),
-		Data:    make([]dto.ImageData, 0),
-	}
-	// extract images from candidates' inlineData
-	for _, candidate := range geminiResponse.Candidates {
-		for _, part := range candidate.Content.Parts {
-			if part.InlineData != nil && strings.HasPrefix(part.InlineData.MimeType, "image") {
-				openAIResponse.Data = append(openAIResponse.Data, dto.ImageData{
-					B64Json: part.InlineData.Data,
-				})
-			}
-		}
-	}
-	if len(openAIResponse.Data) == 0 {
-		return nil, errors.New("no images found in response")
-	}
-	return openAIResponse, nil
-}
-func ChatImageHandler(c *gin.Context, info *relaycommon.RelayInfo, resp *http.Response) (*dto.Usage, *types.NewAPIError) {
-	responseBody, readErr := io.ReadAll(resp.Body)
-	if readErr != nil {
-		return nil, types.NewOpenAIError(readErr, types.ErrorCodeBadResponseBody, http.StatusInternalServerError)
-	}
-	service.CloseResponseBodyGracefully(resp)
-	if common.DebugEnabled {
-		println("ChatImageHandler response:", string(responseBody))
-	}
-	var geminiResponse dto.GeminiChatResponse
-	if jsonErr := common.Unmarshal(responseBody, &geminiResponse); jsonErr != nil {
-		return nil, types.NewOpenAIError(jsonErr, types.ErrorCodeBadResponseBody, http.StatusInternalServerError)
-	}
-	if len(geminiResponse.Candidates) == 0 {
-		return nil, types.NewOpenAIError(errors.New("no images generated"), types.ErrorCodeBadResponseBody, http.StatusInternalServerError)
-	}
-	openAIResponse, err := convertToOaiImageResponse(&geminiResponse)
-	if err != nil {
-		return nil, types.NewOpenAIError(err, types.ErrorCodeBadResponseBody, http.StatusInternalServerError)
-	}
-	jsonResponse, jsonErr := json.Marshal(openAIResponse)
-	if jsonErr != nil {
-		return nil, types.NewError(jsonErr, types.ErrorCodeBadResponseBody)
-	}
-	c.Writer.Header().Set("Content-Type", "application/json")
-	c.Writer.WriteHeader(resp.StatusCode)
-	_, _ = c.Writer.Write(jsonResponse)
-	usage := &dto.Usage{
-		PromptTokens:     geminiResponse.UsageMetadata.PromptTokenCount,
-		CompletionTokens: geminiResponse.UsageMetadata.CandidatesTokenCount,
-		TotalTokens:      geminiResponse.UsageMetadata.TotalTokenCount,
-	}
-	return usage, nil
-}
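Review note on the markdown-image hunk in `CovertOpenAI2Gemini` above: the scan can be isolated into a small helper, which makes it easier to test. The following is only a sketch under assumptions: `segment` and `splitMarkdownDataImages` are hypothetical names that do not exist in the repository, and `segment` stands in for `dto.GeminiPart`. It uses the same `strings.Index` steps as the hunk (no regexp). One deliberate difference: it also keeps the text that follows the last image, whereas the loop as shown in the hunk appears to append text only when no image was found.

```go
package main

import "strings"

// segment is a hypothetical stand-in for dto.GeminiPart:
// exactly one of Text or DataURL is set.
type segment struct {
	Text    string
	DataURL string
}

// splitMarkdownDataImages scans text for ![alt](data:...) images using plain
// substring search and returns text and image segments in document order.
func splitMarkdownDataImages(text string) []segment {
	var out []segment
	for {
		start := strings.Index(text, "![")
		if start == -1 {
			break
		}
		rel := strings.Index(text[start:], "](data:")
		if rel == -1 {
			break
		}
		bracket := start + rel // index of the "]" in "](data:"
		relClose := strings.Index(text[bracket+2:], ")")
		if relClose == -1 {
			break
		}
		closeIdx := bracket + 2 + relClose
		if start > 0 {
			out = append(out, segment{Text: text[:start]})
		}
		// The data URL runs from just after "](" up to the closing ")".
		out = append(out, segment{DataURL: text[bracket+2 : closeIdx]})
		text = text[closeIdx+1:]
	}
	// Keep whatever text is left (assumption: trailing text should survive).
	if text != "" {
		out = append(out, segment{Text: text})
	}
	return out
}
```

For example, `splitMarkdownDataImages("a ![x](data:image/png;base64,AAAA) b")` yields a text segment, a data-URL segment, and a trailing text segment, in that order.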
--- a/relay/channel/minimax/tts.go
+++ b/relay/channel/minimax/tts.go
@@ -163,7 +163,7 @@ func handleTTSResponse(c *gin.Context, resp *http.Response, info *relaycommon.Re
 	}
 	usage = &dto.Usage{
-		PromptTokens:     info.PromptTokens,
+		PromptTokens:     info.GetEstimatePromptTokens(),
 		CompletionTokens: 0,
 		TotalTokens:      int(minimaxResp.ExtraInfo.UsageCharacters),
 	}
--- a/relay/channel/openai/adaptor.go
+++ b/relay/channel/openai/adaptor.go
@@ -42,7 +42,7 @@ type Adaptor struct {
 // support OAI models: o1-mini/o3-mini/o4-mini/o1/o3 etc...
 // minimal effort only available in gpt-5
 func parseReasoningEffortFromModelSuffix(model string) (string, string) {
-	effortSuffixes := []string{"-high", "-minimal", "-low", "-medium", "-none"}
+	effortSuffixes := []string{"-high", "-minimal", "-low", "-medium", "-none", "-xhigh"}
 	for _, suffix := range effortSuffixes {
 		if strings.HasSuffix(model, suffix) {
 			effort := strings.TrimPrefix(suffix, "-")
@@ -306,10 +306,11 @@ func (a *Adaptor) ConvertOpenAIRequest(c *gin.Context, info *relaycommon.RelayIn
 			request.Temperature = nil
 		}
 		// gpt-5 series adaptation: zero out parameters that are no longer supported
 		if strings.HasPrefix(info.UpstreamModelName, "gpt-5") {
-			if info.UpstreamModelName != "gpt-5-chat-latest" {
-				request.Temperature = nil
-			}
+			request.Temperature = nil
+			request.TopP = 0 // OpenAI's top_p defaults to 1.0; set it to 0 explicitly so that omitempty drops the field from the request
+			request.LogProbs = false
 		}
 		// convert the model's reasoning-effort suffix
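Review note on the `-xhigh` addition to `effortSuffixes` above: list order does not matter for this pair, because a model name ending in `-xhigh` does not end in `-high` (the character before `high` is `x`, not `-`). A minimal sketch of the suffix split, assuming a hypothetical helper `splitEffortSuffix` whose return order (effort, base model) is an illustration only and is not taken from the repository:

```go
package main

import "strings"

// effortSuffixes mirrors the list in the hunk above.
var effortSuffixes = []string{"-high", "-minimal", "-low", "-medium", "-none", "-xhigh"}

// splitEffortSuffix is a hypothetical helper: it returns the reasoning effort
// and the model name with the suffix removed, or ("", model) if none matches.
func splitEffortSuffix(model string) (effort string, base string) {
	for _, suffix := range effortSuffixes {
		if strings.HasSuffix(model, suffix) {
			return strings.TrimPrefix(suffix, "-"), strings.TrimSuffix(model, suffix)
		}
	}
	return "", model
}
```

With this sketch, `splitEffortSuffix("gpt-5-xhigh")` returns `"xhigh"` and `"gpt-5"`, and a name such as `o3-mini` is returned unchanged because it matches none of the suffixes.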
--- a/relay/channel/openai/audio.go
+++ b/relay/channel/openai/audio.go
@@ -0,0 +1,145 @@
+package openai
+import (
+	"bytes"
+	"fmt"
+	"io"
+	"math"
+	"net/http"
+	"github.com/QuantumNous/new-api/common"
+	"github.com/QuantumNous/new-api/constant"
+	"github.com/QuantumNous/new-api/dto"
+	"github.com/QuantumNous/new-api/logger"
+	relaycommon "github.com/QuantumNous/new-api/relay/common"
+	"github.com/QuantumNous/new-api/relay/helper"
+	"github.com/QuantumNous/new-api/service"
+	"github.com/QuantumNous/new-api/types"
+	"github.com/gin-gonic/gin"
+)
+func OpenaiTTSHandler(c *gin.Context, resp *http.Response, info *relaycommon.RelayInfo) *dto.Usage {
+	// the status code has been judged before, if there is a body reading failure,
+	// it should be regarded as a non-recoverable error, so it should not return err for external retry.
+	// Analogous to nginx's load balancing, it will only retry if it can't be requested or
+	// if the upstream returns a specific status code, once the upstream has already written the header,
+	// the subsequent failure of the response body should be regarded as a non-recoverable error,
+	// and can be terminated directly.
+	defer service.CloseResponseBodyGracefully(resp)
+	usage := &dto.Usage{}
+	usage.PromptTokens = info.GetEstimatePromptTokens()
+	usage.TotalTokens = info.GetEstimatePromptTokens()
+	for k, v := range resp.Header {
+		c.Writer.Header().Set(k, v[0])
+	}
+	c.Writer.WriteHeader(resp.StatusCode)
+	if info.IsStream {
+		helper.StreamScannerHandler(c, resp, info, func(data string) bool {
+			if service.SundaySearch(data, "usage") {
+				var simpleResponse dto.SimpleResponse
+				err := common.Unmarshal([]byte(data), &simpleResponse)
+				if err != nil {
+					logger.LogError(c, err.Error())
+				}
+				if simpleResponse.Usage.TotalTokens != 0 {
+					usage.PromptTokens = simpleResponse.Usage.InputTokens
+					usage.CompletionTokens = simpleResponse.OutputTokens
+					usage.TotalTokens = simpleResponse.TotalTokens
+				}
+			}
+			_ = helper.StringData(c, data)
+			return true
+		})
+	} else {
+		common.SetContextKey(c, constant.ContextKeyLocalCountTokens, true)
+		// read the response body into a buffer
+		bodyBytes, err := io.ReadAll(resp.Body)
+		if err != nil {
+			logger.LogError(c, fmt.Sprintf("failed to read TTS response body: %v", err))
+			c.Writer.WriteHeaderNow()
+			return usage
+		}
+		// write the response to the client
+		c.Writer.WriteHeaderNow()
+		_, err = c.Writer.Write(bodyBytes)
+		if err != nil {
+			logger.LogError(c, fmt.Sprintf("failed to write TTS response: %v", err))
+		}
+		// compute the audio duration and update usage
+		audioFormat := "mp3" // default format
+		if audioReq, ok := info.Request.(*dto.AudioRequest); ok && audioReq.ResponseFormat != "" {
+			audioFormat = audioReq.ResponseFormat
+		}
+		var duration float64
+		var durationErr error
+		if audioFormat == "pcm" {
+			// PCM has no file header, so compute the duration from OpenAI TTS's PCM parameters
+			// sample rate: 24000 Hz, bit depth: 16-bit (2 bytes), channels: 1
+			const sampleRate = 24000
+			const bytesPerSample = 2
+			const channels = 1
+			duration = float64(len(bodyBytes)) / float64(sampleRate*bytesPerSample*channels)
+		} else {
+			ext := "." + audioFormat
+			reader := bytes.NewReader(bodyBytes)
+			duration, durationErr = common.GetAudioDuration(c.Request.Context(), reader, ext)
+		}
+		usage.PromptTokensDetails.TextTokens = usage.PromptTokens
+		if durationErr != nil {
+			logger.LogWarn(c, fmt.Sprintf("failed to get audio duration: %v", durationErr))
+			// if the duration cannot be determined, fall back to a floor value for CompletionTokens based on the body size
+			sizeInKB := float64(len(bodyBytes)) / 1000.0
+			estimatedTokens := int(math.Ceil(sizeInKB)) // rough estimate: about 1 token per KB
+			usage.CompletionTokens = estimatedTokens
+			usage.CompletionTokenDetails.AudioTokens = estimatedTokens
+		} else if duration > 0 {
+			// compute tokens: ceil(duration) / 60.0 * 1000, i.e. 1000 tokens per minute
+			completionTokens := int(math.Round(math.Ceil(duration) / 60.0 * 1000))
+			usage.CompletionTokens = completionTokens
+			usage.CompletionTokenDetails.AudioTokens = completionTokens
+		}
+		usage.TotalTokens = usage.PromptTokens + usage.CompletionTokens
+	}
+	return usage
+}
+func OpenaiSTTHandler(c *gin.Context, resp *http.Response, info *relaycommon.RelayInfo, responseFormat string) (*types.NewAPIError, *dto.Usage) {
+	defer service.CloseResponseBodyGracefully(resp)
+	responseBody, err := io.ReadAll(resp.Body)
+	if err != nil {
+		return types.NewOpenAIError(err, types.ErrorCodeReadResponseBodyFailed, http.StatusInternalServerError), nil
+	}
+	// write the new response body
+	service.IOCopyBytesGracefully(c, resp, responseBody)
+	var responseData struct {
+		Usage *dto.Usage `json:"usage"`
+	}
+	if err := common.Unmarshal(responseBody, &responseData); err == nil && responseData.Usage != nil {
+		if responseData.Usage.TotalTokens > 0 {
+			usage := responseData.Usage
+			if usage.PromptTokens == 0 {
+				usage.PromptTokens = usage.InputTokens
+			}
+			if usage.CompletionTokens == 0 {
+				usage.CompletionTokens = usage.OutputTokens
+			}
+			return nil, usage
+		}
+	}
+	usage := &dto.Usage{}
+	usage.PromptTokens = info.GetEstimatePromptTokens()
+	usage.CompletionTokens = 0
+	usage.TotalTokens = usage.PromptTokens + usage.CompletionTokens
+	return nil, usage
+}
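Review note on the token accounting in the new `OpenaiTTSHandler` above: the arithmetic can be checked in isolation. The sketch below restates the three formulas from the handler (PCM duration from byte count at 24000 Hz, 16-bit, mono; 1000 tokens per minute of audio rounded up to whole seconds; roughly one token per 1000 bytes as the fallback). The function and constant names are hypothetical and are not part of the repository.

```go
package main

import "math"

// PCM parameters as stated in the handler's comments.
const (
	pcmSampleRate     = 24000
	pcmBytesPerSample = 2
	pcmChannels       = 1
)

// pcmDurationSeconds derives the duration of headerless PCM audio from its size.
func pcmDurationSeconds(numBytes int) float64 {
	return float64(numBytes) / float64(pcmSampleRate*pcmBytesPerSample*pcmChannels)
}

// audioTokensFromDuration bills 1000 tokens per minute, with the duration
// rounded up to whole seconds first. Non-positive durations yield 0.
func audioTokensFromDuration(seconds float64) int {
	if seconds <= 0 {
		return 0
	}
	return int(math.Round(math.Ceil(seconds) / 60.0 * 1000))
}

// audioTokensFromSize is the fallback used when the duration is unknown:
// about one token per 1000 bytes, rounded up.
func audioTokensFromSize(numBytes int) int {
	return int(math.Ceil(float64(numBytes) / 1000.0))
}
```

Under these formulas, 48000 bytes of PCM correspond to one second of audio, one second is billed as 17 tokens, and a full minute as 1000 tokens.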
--- a/relay/channel/openai/helper.go
+++ b/relay/channel/openai/helper.go
@@ -172,7 +172,7 @@ func handleLastResponse(lastStreamData string, responseId *string, createAt *int
 	shouldSendLastResp *bool) error {
 	var lastStreamResponse dto.ChatCompletionsStreamResponse
-	if err := json.Unmarshal(common.StringToByteSlice(lastStreamData), &lastStreamResponse); err != nil {
+	if err := common.Unmarshal(common.StringToByteSlice(lastStreamData), &lastStreamResponse); err != nil {
 		return err
 	}
--- a/relay/channel/openai/relay-openai.go
+++ b/relay/channel/openai/relay-openai.go
@@ -1,7 +1,6 @@
 package openai
 import (
-	"encoding/json"
 	"fmt"
 	"io"
 	"net/http"
@@ -151,7 +150,7 @@ func OaiStreamHandler(c *gin.Context, info *relaycommon.RelayInfo, resp *http.Re
 		var streamResp struct {
 			Usage *dto.Usage `json:"usage"`
 		}
-		err := json.Unmarshal([]byte(secondLastStreamData), &streamResp)
+		err := common.Unmarshal([]byte(secondLastStreamData), &streamResp)
 		if err == nil && streamResp.Usage != nil && service.ValidUsage(streamResp.Usage) {
 			usage = streamResp.Usage
 			containStreamUsage = true
@@ -183,7 +182,7 @@ func OaiStreamHandler(c *gin.Context, info *relaycommon.RelayInfo, resp *http.Re
 	}
 	if !containStreamUsage {
-		usage = service.ResponseText2Usage(c, responseTextBuilder.String(), info.UpstreamModelName, info.PromptTokens)
+		usage = service.ResponseText2Usage(c, responseTextBuilder.String(), info.UpstreamModelName, info.GetEstimatePromptTokens())
 		usage.CompletionTokens += toolCount * 7
 	}
@@ -245,9 +244,9 @@ func OpenaiHandler(c *gin.Context, info *relaycommon.RelayInfo, resp *http.Respo
 			}
 		}
 		simpleResponse.Usage = dto.Usage{
-			PromptTokens:     info.PromptTokens,
+			PromptTokens:     info.GetEstimatePromptTokens(),
 			CompletionTokens: completionTokens,
-			TotalTokens:      info.PromptTokens + completionTokens,
+			TotalTokens:      info.GetEstimatePromptTokens() + completionTokens,
 		}
 		usageModified = true
 	}
@@ -327,68 +326,6 @@ func streamTTSResponse(c *gin.Context, resp *http.Response) {
 	}
 }
-func OpenaiTTSHandler(c *gin.Context, resp *http.Response, info *relaycommon.RelayInfo) *dto.Usage {
-	// the status code has been judged before, if there is a body reading failure,
-	// it should be regarded as a non-recoverable error, so it should not return err for external retry.
-	// Analogous to nginx's load balancing, it will only retry if it can't be requested or
-	// if the upstream returns a specific status code, once the upstream has already written the header,
-	// the subsequent failure of the response body should be regarded as a non-recoverable error,
-	// and can be terminated directly.
-	defer service.CloseResponseBodyGracefully(resp)
-	usage := &dto.Usage{}
-	usage.PromptTokens = info.PromptTokens
-	usage.TotalTokens = info.PromptTokens
-	for k, v := range resp.Header {
-		c.Writer.Header().Set(k, v[0])
-	}
-	c.Writer.WriteHeader(resp.StatusCode)
-	isStreaming := resp.ContentLength == -1 || resp.Header.Get("Content-Length") == ""
-	if isStreaming {
-		streamTTSResponse(c, resp)
-	} else {
-		c.Writer.WriteHeaderNow()
-		_, err := io.Copy(c.Writer, resp.Body)
-		if err != nil {
-			logger.LogError(c, err.Error())
-		}
-	}
-	return usage
-}
-func OpenaiSTTHandler(c *gin.Context, resp *http.Response, info *relaycommon.RelayInfo, responseFormat string) (*types.NewAPIError, *dto.Usage) {
-	defer service.CloseResponseBodyGracefully(resp)
-	responseBody, err := io.ReadAll(resp.Body)
-	if err != nil {
-		return types.NewOpenAIError(err, types.ErrorCodeReadResponseBodyFailed, http.StatusInternalServerError), nil
-	}
-	// write the new response body
-	service.IOCopyBytesGracefully(c, resp, responseBody)
-	var responseData struct {
-		Usage *dto.Usage `json:"usage"`
-	}
-	if err := json.Unmarshal(responseBody, &responseData); err == nil && responseData.Usage != nil {
-		if responseData.Usage.TotalTokens > 0 {
-			usage := responseData.Usage
-			if usage.PromptTokens == 0 {
-				usage.PromptTokens = usage.InputTokens
-			}
-			if usage.CompletionTokens == 0 {
-				usage.CompletionTokens = usage.OutputTokens
-			}
-			return nil, usage
-		}
-	}
-	usage := &dto.Usage{}
-	usage.PromptTokens = info.PromptTokens
-	usage.CompletionTokens = 0
-	usage.TotalTokens = usage.PromptTokens + usage.CompletionTokens
-	return nil, usage
-}
 func OpenaiRealtimeHandler(c *gin.Context, info *relaycommon.RelayInfo) (*types.NewAPIError, *dto.RealtimeUsage) {
 	if info == nil || info.ClientWs == nil || info.TargetWs == nil {
 		return types.NewError(fmt.Errorf("invalid websocket connection"), types.ErrorCodeBadResponse), nil
@@ -687,7 +624,7 @@ func extractCachedTokensFromBody(body []byte) (int, bool) {
 		} `json:"usage"`
 	}
-	if err := json.Unmarshal(body, &payload); err != nil {
+	if err := common.Unmarshal(body, &payload); err != nil {
 		return 0, false
 	}
--- a/relay/channel/openai/relay_responses.go
+++ b/relay/channel/openai/relay_responses.go
@@ -141,7 +141,7 @@ func OaiResponsesStreamHandler(c *gin.Context, info *relaycommon.RelayInfo, resp
 	}
 	if usage.PromptTokens == 0 && usage.CompletionTokens != 0 {
-		usage.PromptTokens = info.PromptTokens
+		usage.PromptTokens = info.GetEstimatePromptTokens()
 	}
 	usage.TotalTokens = usage.PromptTokens + usage.CompletionTokens
--- a/relay/channel/palm/adaptor.go
+++ b/relay/channel/palm/adaptor.go
@@ -81,7 +81,7 @@ func (a *Adaptor) DoResponse(c *gin.Context, resp *http.Response, info *relaycom
 	if info.IsStream {
 		var responseText string
 		err, responseText = palmStreamHandler(c, resp)
-		usage = service.ResponseText2Usage(c, responseText, info.UpstreamModelName, info.PromptTokens)
+		usage = service.ResponseText2Usage(c, responseText, info.UpstreamModelName, info.GetEstimatePromptTokens())
 	} else {
 		usage, err = palmHandler(c, info, resp)
 	}
--- a/relay/channel/palm/relay-palm.go
+++ b/relay/channel/palm/relay-palm.go
@@ -121,13 +121,8 @@ func palmHandler(c *gin.Context, info *relaycommon.RelayInfo, resp *http.Respons
 		}, resp.StatusCode)
 	}
 	fullTextResponse := responsePaLM2OpenAI(&palmResponse)
-	completionTokens := service.CountTextToken(palmResponse.Candidates[0].Content, info.UpstreamModelName)
-	usage := dto.Usage{
-		PromptTokens:     info.PromptTokens,
-		CompletionTokens: completionTokens,
-		TotalTokens:      info.PromptTokens + completionTokens,
-	}
-	fullTextResponse.Usage = usage
+	usage := service.ResponseText2Usage(c, palmResponse.Candidates[0].Content, info.UpstreamModelName, info.GetEstimatePromptTokens())
+	fullTextResponse.Usage = *usage
 	jsonResponse, err := common.Marshal(fullTextResponse)
 	if err != nil {
 		return nil, types.NewError(err, types.ErrorCodeBadResponseBody)
@@ -135,5 +130,5 @@ func palmHandler(c *gin.Context, info *relaycommon.RelayInfo, resp *http.Respons
 	c.Writer.Header().Set("Content-Type", "application/json")
 	c.Writer.WriteHeader(resp.StatusCode)
 	service.IOCopyBytesGracefully(c, resp, jsonResponse)
-	return &usage, nil
+	return usage, nil
 }
--- a/relay/channel/task/ali/adaptor.go
+++ b/relay/channel/task/ali/adaptor.go
@@ -393,7 +393,7 @@ func (a *TaskAdaptor) DoResponse(c *gin.Context, resp *http.Response, info *rela
 }
 // FetchTask queries the task status
-func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any) (*http.Response, error) {
+func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any, proxy string) (*http.Response, error) {
 	taskID, ok := body["task_id"].(string)
 	if !ok {
 		return nil, fmt.Errorf("invalid task_id")
@@ -408,7 +408,11 @@ func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any) (*http
 	req.Header.Set("Authorization", "Bearer "+key)
-	return service.GetHttpClient().Do(req)
+	client, err := service.GetHttpClientWithProxy(proxy)
+	if err != nil {
+		return nil, fmt.Errorf("new proxy http client failed: %w", err)
+	}
+	return client.Do(req)
 }
 func (a *TaskAdaptor) GetModelList() []string {
--- a/relay/channel/task/doubao/adaptor.go
+++ b/relay/channel/task/doubao/adaptor.go
@@ -146,7 +146,7 @@ func (a *TaskAdaptor) DoResponse(c *gin.Context, resp *http.Response, info *rela
 }
 // FetchTask fetch task status
-func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any) (*http.Response, error) {
+func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any, proxy string) (*http.Response, error) {
 	taskID, ok := body["task_id"].(string)
 	if !ok {
 		return nil, fmt.Errorf("invalid task_id")
@@ -163,7 +163,11 @@ func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any) (*http
 	req.Header.Set("Content-Type", "application/json")
 	req.Header.Set("Authorization", "Bearer "+key)
-	return service.GetHttpClient().Do(req)
+	client, err := service.GetHttpClientWithProxy(proxy)
+	if err != nil {
+		return nil, fmt.Errorf("new proxy http client failed: %w", err)
+	}
+	return client.Do(req)
 }
 func (a *TaskAdaptor) GetModelList() []string {
--- a/relay/channel/task/gemini/adaptor.go
+++ b/relay/channel/task/gemini/adaptor.go
@@ -24,9 +24,13 @@ import (
 	"github.com/pkg/errors"
 )
-// VideoGenerationConfig represents the video generation configuration
+// ============================
+// Request / Response structures
+// ============================
+// GeminiVideoGenerationConfig represents the video generation configuration
 // Based on: https://ai.google.dev/gemini-api/docs/video
-type VideoGenerationConfig struct {
+type GeminiVideoGenerationConfig struct {
 	AspectRatio      string  `json:"aspectRatio,omitempty"`      // "16:9" or "9:16"
 	DurationSeconds  float64 `json:"durationSeconds,omitempty"`  // 4, 6, or 8 (as number)
 	NegativePrompt   string  `json:"negativePrompt,omitempty"`   // unwanted elements
@@ -34,21 +38,15 @@ type VideoGenerationConfig struct {
 	Resolution       string  `json:"resolution,omitempty"`       // video resolution
 }
-type Image struct {
-	BytesBase64Encoded string `json:"bytesBase64Encoded,omitempty"`
-	MimeType           string `json:"mimeType,omitempty"`
+// GeminiVideoRequest represents a single video generation instance
+type GeminiVideoRequest struct {
+	Prompt string `json:"prompt"`
 }
-type VideoRequest struct {
-	Prompt    string `json:"prompt"`
-	Image     *Image `json:"image,omitempty"`
-	LastFrame *Image `json:"lastFrame,omitempty"`
+// GeminiVideoPayload represents the complete video generation request payload
+type GeminiVideoPayload struct {
+	Instances  []GeminiVideoRequest        `json:"instances"`
+	Parameters GeminiVideoGenerationConfig `json:"parameters,omitempty"`
 }
-// VideoPayload represents the complete video generation request payload
-type VideoPayload struct {
-	Instances  []VideoRequest        `json:"instances"`
-	Parameters VideoGenerationConfig `json:"parameters,omitempty"`
-}
 type submitResponse struct {
@@ -77,8 +75,6 @@ type operationResponse struct {
 					URI string `json:"uri"`
 				} `json:"video"`
 			} `json:"generatedSamples"`
-			RaiMediaFilteredCount   int      `json:"raiMediaFilteredCount"`
-			RaiMediaFilteredReasons []string `json:"raiMediaFilteredReasons"`
 		} `json:"generateVideoResponse"`
 	} `json:"response"`
 	Error struct {
@@ -104,7 +100,8 @@ func (a *TaskAdaptor) Init(info *relaycommon.RelayInfo) {
 // ValidateRequestAndSetAction parses body, validates fields and sets default action.
 func (a *TaskAdaptor) ValidateRequestAndSetAction(c *gin.Context, info *relaycommon.RelayInfo) (taskErr *dto.TaskError) {
-	return relaycommon.ValidateBasicTaskRequest(c, info, constant.TaskActionGenerate)
+	// Use the standard validation method for TaskSubmitReq
+	return relaycommon.ValidateBasicTaskRequest(c, info, constant.TaskActionTextGenerate)
 }
 // BuildRequestURL constructs the upstream URL.
@@ -140,21 +137,13 @@ func (a *TaskAdaptor) BuildRequestBody(c *gin.Context, info *relaycommon.RelayIn
 	}
 	// Create structured video generation request
-	body := VideoPayload{
-		Instances: []VideoRequest{
+	body := GeminiVideoPayload{
+		Instances: []GeminiVideoRequest{
 			{Prompt: req.Prompt},
 		},
-		Parameters: VideoGenerationConfig{},
+		Parameters: GeminiVideoGenerationConfig{},
 	}
-	if len(req.Images) > 0 {
-		body.Instances[0].Image = a.convertImage(req.Images[0])
-	}
-	if len(req.Images) > 1 {
-		body.Instances[0].LastFrame = a.convertImage(req.Images[1])
-	}
 	// Parse metadata for additional configuration
 	metadata := req.Metadata
 	medaBytes, err := json.Marshal(metadata)
 	if err != nil {
@@ -211,7 +200,7 @@ func (a *TaskAdaptor) GetChannelName() string {
 }
 // FetchTask fetch task status
-func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any) (*http.Response, error) {
+func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any, proxy string) (*http.Response, error) {
 	taskID, ok := body["task_id"].(string)
 	if !ok {
 		return nil, fmt.Errorf("invalid task_id")
@@ -234,7 +223,11 @@ func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any) (*http
 	req.Header.Set("Accept", "application/json")
 	req.Header.Set("x-goog-api-key", key)
-	return service.GetHttpClient().Do(req)
+	client, err := service.GetHttpClientWithProxy(proxy)
+	if err != nil {
+		return nil, fmt.Errorf("new proxy http client failed: %w", err)
+	}
+	return client.Do(req)
 }
 func (a *TaskAdaptor) ParseTaskResult(respBody []byte) (*relaycommon.TaskInfo, error) {
@@ -258,19 +251,20 @@ func (a *TaskAdaptor) ParseTaskResult(respBody []byte) (*relaycommon.TaskInfo, e
 		return ti, nil
 	}
-	if len(op.Response.GenerateVideoResponse.GeneratedSamples) == 0 {
+	ti.Status = model.TaskStatusSuccess
 		ti.Status = model.TaskStatusFailure
 		ti.Reason = fmt.Sprintf("no generated video url found: %s", strings.Join(op.Response.GenerateVideoResponse.RaiMediaFilteredReasons, "; "))
 	} else {
 		if uri := op.Response.GenerateVideoResponse.GeneratedSamples[0].Video.URI; uri != "" {
 			ti.RemoteUrl = uri
 		}
 		ti.Status = model.TaskStatusSuccess
 	}
 	ti.Progress = "100%"
 	taskID := encodeLocalTaskID(op.Name)
 	ti.TaskID = taskID
 	ti.Url = fmt.Sprintf("%s/v1/videos/%s/content", system_setting.ServerAddress, taskID)
+	// Extract URL from generateVideoResponse if available
+	if len(op.Response.GenerateVideoResponse.GeneratedSamples) > 0 {
+		if uri := op.Response.GenerateVideoResponse.GeneratedSamples[0].Video.URI; uri != "" {
+			ti.RemoteUrl = uri
+		}
+	}
 	return ti, nil
 }
@@ -299,30 +293,6 @@ func (a *TaskAdaptor) ConvertToOpenAIVideo(task *model.Task) ([]byte, error) {
 	return common.Marshal(video)
 }
-func (a *TaskAdaptor) convertImage(imageStr string) *Image {
-	if strings.TrimSpace(imageStr) == "" {
-		return nil
-	}
-	img := &Image{
-		MimeType:           "image/png",
-		BytesBase64Encoded: imageStr,
-	}
-	if strings.HasPrefix(imageStr, "data:image/") {
-		parts := strings.Split(imageStr, ";base64,")
-		if len(parts) == 2 {
-			img.MimeType = strings.TrimPrefix(parts[0], "data:")
-			img.BytesBase64Encoded = parts[1]
-		}
-	} else if strings.HasPrefix(imageStr, "http") {
-		mimeType, data, err := service.GetImageFromUrl(imageStr)
-		if err == nil {
-			img.MimeType = mimeType
-			img.BytesBase64Encoded = data
-		}
-	}
-	return img
-}
 // ============================
 // helpers
 // ============================
--- a/relay/channel/task/hailuo/adaptor.go
+++ b/relay/channel/task/hailuo/adaptor.go
@@ -110,7 +110,7 @@ func (a *TaskAdaptor) DoResponse(c *gin.Context, resp *http.Response, info *rela
 	return hResp.TaskID, responseBody, nil
 }
-func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any) (*http.Response, error) {
+func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any, proxy string) (*http.Response, error) {
 	taskID, ok := body["task_id"].(string)
 	if !ok {
 		return nil, fmt.Errorf("invalid task_id")
@@ -126,7 +126,11 @@ func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any) (*http
 	req.Header.Set("Accept", "application/json")
 	req.Header.Set("Authorization", "Bearer "+key)
-	return service.GetHttpClient().Do(req)
+	client, err := service.GetHttpClientWithProxy(proxy)
+	if err != nil {
+		return nil, fmt.Errorf("new proxy http client failed: %w", err)
+	}
+	return client.Do(req)
 }
 func (a *TaskAdaptor) GetModelList() []string {
--- a/relay/channel/task/jimeng/adaptor.go
+++ b/relay/channel/task/jimeng/adaptor.go
@@ -196,7 +196,7 @@ func (a *TaskAdaptor) DoResponse(c *gin.Context, resp *http.Response, info *rela
 	}
 	if jResp.Code != 10000 {
-		taskErr = service.TaskErrorWrapper(fmt.Errorf(jResp.Message), fmt.Sprintf("%d", jResp.Code), http.StatusInternalServerError)
+		taskErr = service.TaskErrorWrapper(fmt.Errorf("%s", jResp.Message), fmt.Sprintf("%d", jResp.Code), http.StatusInternalServerError)
 		return
 	}
@@ -210,7 +210,7 @@ func (a *TaskAdaptor) DoResponse(c *gin.Context, resp *http.Response, info *rela
 }
 // FetchTask fetch task status
-func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any) (*http.Response, error) {
+func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any, proxy string) (*http.Response, error) {
 	taskID, ok := body["task_id"].(string)
 	if !ok {
 		return nil, fmt.Errorf("invalid task_id")
@@ -251,7 +251,11 @@ func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any) (*http
 			return nil, errors.Wrap(err, "sign request failed")
 		}
 	}
-	return service.GetHttpClient().Do(req)
+	client, err := service.GetHttpClientWithProxy(proxy)
+	if err != nil {
+		return nil, fmt.Errorf("new proxy http client failed: %w", err)
+	}
+	return client.Do(req)
 }
 func (a *TaskAdaptor) GetModelList() []string {
--- a/relay/channel/task/kling/adaptor.go
+++ b/relay/channel/task/kling/adaptor.go
@@ -186,7 +186,7 @@ func (a *TaskAdaptor) DoResponse(c *gin.Context, resp *http.Response, info *rela
 		return
 	}
 	if kResp.Code != 0 {
-		taskErr = service.TaskErrorWrapperLocal(fmt.Errorf(kResp.Message), "task_failed", http.StatusBadRequest)
+		taskErr = service.TaskErrorWrapperLocal(fmt.Errorf("%s", kResp.Message), "task_failed", http.StatusBadRequest)
 		return
 	}
 	ov := dto.NewOpenAIVideo()
@@ -199,7 +199,7 @@ func (a *TaskAdaptor) DoResponse(c *gin.Context, resp *http.Response, info *rela
 }
 // FetchTask fetch task status
-func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any) (*http.Response, error) {
+func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any, proxy string) (*http.Response, error) {
 	taskID, ok := body["task_id"].(string)
 	if !ok {
 		return nil, fmt.Errorf("invalid task_id")
@@ -228,7 +228,11 @@ func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any) (*http
 	req.Header.Set("Authorization", "Bearer "+token)
 	req.Header.Set("User-Agent", "kling-sdk/1.0")
-	return service.GetHttpClient().Do(req)
+	client, err := service.GetHttpClientWithProxy(proxy)
+	if err != nil {
+		return nil, fmt.Errorf("new proxy http client failed: %w", err)
+	}
+	return client.Do(req)
 }
 func (a *TaskAdaptor) GetModelList() []string {
--- a/relay/channel/task/sora/adaptor.go
+++ b/relay/channel/task/sora/adaptor.go
@@ -5,8 +5,10 @@ import (
 	"fmt"
 	"io"
 	"net/http"
+	"strings"
 	"github.com/QuantumNous/new-api/common"
+	"github.com/QuantumNous/new-api/constant"
 	"github.com/QuantumNous/new-api/dto"
 	"github.com/QuantumNous/new-api/model"
 	"github.com/QuantumNous/new-api/relay/channel"
@@ -67,11 +69,30 @@ func (a *TaskAdaptor) Init(info *relaycommon.RelayInfo) {
 	a.apiKey = info.ApiKey
 }
+func validateRemixRequest(c *gin.Context) *dto.TaskError {
+	var req struct {
+		Prompt string `json:"prompt"`
+	}
+	if err := common.UnmarshalBodyReusable(c, &req); err != nil {
+		return service.TaskErrorWrapperLocal(err, "invalid_request", http.StatusBadRequest)
+	}
+	if strings.TrimSpace(req.Prompt) == "" {
+		return service.TaskErrorWrapperLocal(fmt.Errorf("field prompt is required"), "invalid_request", http.StatusBadRequest)
+	}
+	return nil
+}
 func (a *TaskAdaptor) ValidateRequestAndSetAction(c *gin.Context, info *relaycommon.RelayInfo) (taskErr *dto.TaskError) {
+	if info.Action == constant.TaskActionRemix {
+		return validateRemixRequest(c)
+	}
 	return relaycommon.ValidateMultipartDirect(c, info)
 }
 func (a *TaskAdaptor) BuildRequestURL(info *relaycommon.RelayInfo) (string, error) {
+	if info.Action == constant.TaskActionRemix {
+		return fmt.Sprintf("%s/v1/videos/%s/remix", a.baseURL, info.OriginTaskID), nil
+	}
 	return fmt.Sprintf("%s/v1/videos", a.baseURL), nil
 }
@@ -125,7 +146,7 @@ func (a *TaskAdaptor) DoResponse(c *gin.Context, resp *http.Response, _ *relayco
 }
 // FetchTask fetch task status
-func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any) (*http.Response, error) {
+func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any, proxy string) (*http.Response, error) {
 	taskID, ok := body["task_id"].(string)
 	if !ok {
 		return nil, fmt.Errorf("invalid task_id")
@@ -140,7 +161,11 @@ func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any) (*http
 	req.Header.Set("Authorization", "Bearer "+key)
-	return service.GetHttpClient().Do(req)
+	client, err := service.GetHttpClientWithProxy(proxy)
+	if err != nil {
+		return nil, fmt.Errorf("new proxy http client failed: %w", err)
+	}
+	return client.Do(req)
 }
 func (a *TaskAdaptor) GetModelList() []string {
--- a/relay/channel/task/suno/adaptor.go
+++ b/relay/channel/task/suno/adaptor.go
@@ -105,7 +105,7 @@ func (a *TaskAdaptor) DoResponse(c *gin.Context, resp *http.Response, info *rela
 		return
 	}
 	if !sunoResponse.IsSuccess() {
-		taskErr = service.TaskErrorWrapper(fmt.Errorf(sunoResponse.Message), sunoResponse.Code, http.StatusInternalServerError)
+		taskErr = service.TaskErrorWrapper(fmt.Errorf("%s", sunoResponse.Message), sunoResponse.Code, http.StatusInternalServerError)
 		return
 	}
@@ -132,7 +132,7 @@ func (a *TaskAdaptor) GetChannelName() string {
 	return ChannelName
 }
-func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any) (*http.Response, error) {
+func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any, proxy string) (*http.Response, error) {
 	requestUrl := fmt.Sprintf("%s/suno/fetch", baseUrl)
 	byteBody, err := json.Marshal(body)
 	if err != nil {
@@ -153,11 +153,11 @@ func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any) (*http
 	req = req.WithContext(ctx)
 	req.Header.Set("Content-Type", "application/json")
 	req.Header.Set("Authorization", "Bearer "+key)
-	resp, err := service.GetHttpClient().Do(req)
+	client, err := service.GetHttpClientWithProxy(proxy)
 	if err != nil {
-		return nil, err
+		return nil, fmt.Errorf("new proxy http client failed: %w", err)
 	}
-	return resp, nil
+	return client.Do(req)
 }
 func actionValidate(c *gin.Context, sunoRequest *dto.SunoSubmitReq, action string) (err error) {
--- a/relay/channel/task/vertex/adaptor.go
+++ b/relay/channel/task/vertex/adaptor.go
@@ -12,7 +12,6 @@ import (
 	"github.com/QuantumNous/new-api/common"
 	"github.com/QuantumNous/new-api/model"
 	"github.com/gin-gonic/gin"
 	"github.com/QuantumNous/new-api/constant"
@@ -121,7 +120,11 @@ func (a *TaskAdaptor) BuildRequestHeader(c *gin.Context, req *http.Request, info
 		return fmt.Errorf("failed to decode credentials: %w", err)
 	}
-	token, err := vertexcore.AcquireAccessToken(*adc, "")
+	proxy := ""
+	if info != nil {
+		proxy = info.ChannelSetting.Proxy
+	}
+	token, err := vertexcore.AcquireAccessToken(*adc, proxy)
 	if err != nil {
 		return fmt.Errorf("failed to acquire access token: %w", err)
 	}
@@ -147,13 +150,40 @@ func (a *TaskAdaptor) BuildRequestBody(c *gin.Context, info *relaycommon.RelayIn
 			body.Parameters["storageUri"] = v
 		}
 		if v, ok := req.Metadata["sampleCount"]; ok {
-			body.Parameters["sampleCount"] = v
+			if i, ok := v.(int); ok {
+				body.Parameters["sampleCount"] = i
+			}
+			if f, ok := v.(float64); ok {
+				body.Parameters["sampleCount"] = int(f)
+			}
 		}
 	}
+	if _, ok := body.Parameters["sampleCount"]; !ok {
+		body.Parameters["sampleCount"] = 1
+	}
+	if body.Parameters["sampleCount"].(int) <= 0 {
+		return nil, fmt.Errorf("sampleCount must be greater than 0")
+	}
+	// if req.Duration > 0 {
+	// 	body.Parameters["durationSeconds"] = req.Duration
+	// } else if req.Seconds != "" {
+	// 	seconds, err := strconv.Atoi(req.Seconds)
+	// 	if err != nil {
+	// 		return nil, errors.Wrap(err, "convert seconds to int failed")
+	// 	}
+	// 	body.Parameters["durationSeconds"] = seconds
+	// }
+	info.PriceData.OtherRatios = map[string]float64{
+		"sampleCount": float64(body.Parameters["sampleCount"].(int)),
+	}
+	// if v, ok := body.Parameters["durationSeconds"]; ok {
+	// 	info.PriceData.OtherRatios["durationSeconds"] = float64(v.(int))
+	// }
 	data, err := json.Marshal(body)
 	if err != nil {
 		return nil, err
@@ -190,7 +220,7 @@ func (a *TaskAdaptor) GetModelList() []string { return []string{"veo-3.0-generat
 func (a *TaskAdaptor) GetChannelName() string { return "vertex" }
 // FetchTask fetch task status
-func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any) (*http.Response, error) {
+func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any, proxy string) (*http.Response, error) {
 	taskID, ok := body["task_id"].(string)
 	if !ok {
 		return nil, fmt.Errorf("invalid task_id")
@@ -223,7 +253,7 @@ func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any) (*http
 	if err := json.Unmarshal([]byte(key), adc); err != nil {
 		return nil, fmt.Errorf("failed to decode credentials: %w", err)
 	}
-	token, err := vertexcore.AcquireAccessToken(*adc, "")
+	token, err := vertexcore.AcquireAccessToken(*adc, proxy)
 	if err != nil {
 		return nil, fmt.Errorf("failed to acquire access token: %w", err)
 	}
@@ -235,7 +265,11 @@ func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any) (*http
 	req.Header.Set("Accept", "application/json")
 	req.Header.Set("Authorization", "Bearer "+token)
 	req.Header.Set("x-goog-user-project", adc.ProjectID)
-	return service.GetHttpClient().Do(req)
+	client, err := service.GetHttpClientWithProxy(proxy)
+	if err != nil {
+		return nil, fmt.Errorf("new proxy http client failed: %w", err)
+	}
+	return client.Do(req)
 }
 func (a *TaskAdaptor) ParseTaskResult(respBody []byte) (*relaycommon.TaskInfo, error) {
--- a/relay/channel/task/vidu/adaptor.go
+++ b/relay/channel/task/vidu/adaptor.go
@@ -188,7 +188,7 @@ func (a *TaskAdaptor) DoResponse(c *gin.Context, resp *http.Response, info *rela
 	return vResp.TaskId, responseBody, nil
 }
-func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any) (*http.Response, error) {
+func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any, proxy string) (*http.Response, error) {
 	taskID, ok := body["task_id"].(string)
 	if !ok {
 		return nil, fmt.Errorf("invalid task_id")
@@ -204,7 +204,11 @@ func (a *TaskAdaptor) FetchTask(baseUrl, key string, body map[string]any) (*http
 	req.Header.Set("Accept", "application/json")
 	req.Header.Set("Authorization", "Token "+key)
-	return service.GetHttpClient().Do(req)
+	client, err := service.GetHttpClientWithProxy(proxy)
+	if err != nil {
+		return nil, fmt.Errorf("new proxy http client failed: %w", err)
+	}
+	return client.Do(req)
 }
 func (a *TaskAdaptor) GetModelList() []string {
--- a/relay/channel/tencent/relay-tencent.go
+++ b/relay/channel/tencent/relay-tencent.go
@@ -105,7 +105,7 @@ func tencentStreamHandler(c *gin.Context, info *relaycommon.RelayInfo, resp *htt
 		data = strings.TrimPrefix(data, "data:")
 		var tencentResponse TencentChatResponse
-		err := json.Unmarshal([]byte(data), &tencentResponse)
+		err := common.Unmarshal([]byte(data), &tencentResponse)
 		if err != nil {
 			common.SysLog("error unmarshalling stream response: " + err.Error())
 			continue
@@ -130,7 +130,7 @@ func tencentStreamHandler(c *gin.Context, info *relaycommon.RelayInfo, resp *htt
 	service.CloseResponseBodyGracefully(resp)
-	return service.ResponseText2Usage(c, responseText, info.UpstreamModelName, info.PromptTokens), nil
+	return service.ResponseText2Usage(c, responseText, info.UpstreamModelName, info.GetEstimatePromptTokens()), nil
 }
 func tencentHandler(c *gin.Context, info *relaycommon.RelayInfo, resp *http.Response) (*dto.Usage, *types.NewAPIError) {
--- a/relay/channel/vertex/adaptor.go
+++ b/relay/channel/vertex/adaptor.go
@@ -17,6 +17,7 @@ import (
 	relaycommon "github.com/QuantumNous/new-api/relay/common"
 	"github.com/QuantumNous/new-api/relay/constant"
 	"github.com/QuantumNous/new-api/setting/model_setting"
+	"github.com/QuantumNous/new-api/setting/reasoning"
 	"github.com/QuantumNous/new-api/types"
 	"github.com/gin-gonic/gin"
@@ -50,10 +51,43 @@ type Adaptor struct {
 }
 func (a *Adaptor) ConvertGeminiRequest(c *gin.Context, info *relaycommon.RelayInfo, request *dto.GeminiChatRequest) (any, error) {
+	// Vertex AI does not support functionResponse.id; keep it stripped here for consistency.
+	if model_setting.GetGeminiSettings().RemoveFunctionResponseIdEnabled {
+		removeFunctionResponseID(request)
+	}
 	geminiAdaptor := gemini.Adaptor{}
 	return geminiAdaptor.ConvertGeminiRequest(c, info, request)
 }
+func removeFunctionResponseID(request *dto.GeminiChatRequest) {
+	if request == nil {
+		return
+	}
+	if len(request.Contents) > 0 {
+		for i := range request.Contents {
+			if len(request.Contents[i].Parts) == 0 {
+				continue
+			}
+			for j := range request.Contents[i].Parts {
+				part := &request.Contents[i].Parts[j]
+				if part.FunctionResponse == nil {
+					continue
+				}
+				if len(part.FunctionResponse.ID) > 0 {
+					part.FunctionResponse.ID = nil
+				}
+			}
+		}
+	}
+	if len(request.Requests) > 0 {
+		for i := range request.Requests {
+			removeFunctionResponseID(&request.Requests[i])
+		}
+	}
+}
 func (a *Adaptor) ConvertClaudeRequest(c *gin.Context, info *relaycommon.RelayInfo, request *dto.ClaudeRequest) (any, error) {
 	if v, ok := claudeModelMap[info.UpstreamModelName]; ok {
 		c.Set("request_model", v)
@@ -181,6 +215,8 @@ func (a *Adaptor) GetRequestURL(info *relaycommon.RelayInfo) (string, error) {
 				info.UpstreamModelName = strings.TrimSuffix(info.UpstreamModelName, "-thinking")
 			} else if strings.HasSuffix(info.UpstreamModelName, "-nothinking") {
 				info.UpstreamModelName = strings.TrimSuffix(info.UpstreamModelName, "-nothinking")
+			} else if baseModel, level, ok := reasoning.TrimEffortSuffix(info.UpstreamModelName); ok && level != "" {
+				info.UpstreamModelName = baseModel
 			}
 		}
--- a/relay/channel/volcengine/tts.go
+++ b/relay/channel/volcengine/tts.go
@@ -184,9 +184,9 @@ func handleTTSResponse(c *gin.Context, resp *http.Response, info *relaycommon.Re
 	c.Data(http.StatusOK, contentType, audioData)
 	usage = &dto.Usage{
-		PromptTokens:     info.PromptTokens,
+		PromptTokens:     info.GetEstimatePromptTokens(),
 		CompletionTokens: 0,
-		TotalTokens:      info.PromptTokens,
+		TotalTokens:      info.GetEstimatePromptTokens(),
 	}
 	return usage, nil
@@ -284,9 +284,9 @@ func handleTTSWebSocketResponse(c *gin.Context, requestURL string, volcRequest V
 			if msg.Sequence < 0 {
 				c.Status(http.StatusOK)
 				usage = &dto.Usage{
-					PromptTokens:     info.PromptTokens,
+					PromptTokens:     info.GetEstimatePromptTokens(),
 					CompletionTokens: 0,
-					TotalTokens:      info.PromptTokens,
+					TotalTokens:      info.GetEstimatePromptTokens(),
 				}
 				return usage, nil
 			}
@@ -297,9 +297,9 @@ func handleTTSWebSocketResponse(c *gin.Context, requestURL string, volcRequest V
 	c.Status(http.StatusOK)
 	usage = &dto.Usage{
-		PromptTokens:     info.PromptTokens,
+		PromptTokens:     info.GetEstimatePromptTokens(),
 		CompletionTokens: 0,
-		TotalTokens:      info.PromptTokens,
+		TotalTokens:      info.GetEstimatePromptTokens(),
 	}
 	return usage, nil
 }
--- a/relay/channel/xai/text.go
+++ b/relay/channel/xai/text.go
@@ -70,7 +70,7 @@ func xAIStreamHandler(c *gin.Context, info *relaycommon.RelayInfo, resp *http.Re
 	})
 	if !containStreamUsage {
-		usage = service.ResponseText2Usage(c, responseTextBuilder.String(), info.UpstreamModelName, info.PromptTokens)
+		usage = service.ResponseText2Usage(c, responseTextBuilder.String(), info.UpstreamModelName, info.GetEstimatePromptTokens())
 		usage.CompletionTokens += toolCount * 7
 	}
--- a/relay/channel/zhipu_4v/adaptor.go
+++ b/relay/channel/zhipu_4v/adaptor.go
@@ -36,8 +36,7 @@ func (a *Adaptor) ConvertAudioRequest(c *gin.Context, info *relaycommon.RelayInf
 }
 func (a *Adaptor) ConvertImageRequest(c *gin.Context, info *relaycommon.RelayInfo, request dto.ImageRequest) (any, error) {
-	//TODO implement me
-	return nil, errors.New("not implemented")
+	return request, nil
 }
 func (a *Adaptor) Init(info *relaycommon.RelayInfo) {
@@ -63,6 +62,8 @@ func (a *Adaptor) GetRequestURL(info *relaycommon.RelayInfo) (string, error) {
 				return fmt.Sprintf("%s/embeddings", specialPlan.OpenAIBaseURL), nil
 			}
 			return fmt.Sprintf("%s/api/paas/v4/embeddings", baseURL), nil
+		case relayconstant.RelayModeImagesGenerations:
+			return fmt.Sprintf("%s/api/paas/v4/images/generations", baseURL), nil
 		default:
 			if hasSpecialPlan && specialPlan.OpenAIBaseURL != "" {
 				return fmt.Sprintf("%s/chat/completions", specialPlan.OpenAIBaseURL), nil
@@ -114,6 +115,9 @@ func (a *Adaptor) DoResponse(c *gin.Context, resp *http.Response, info *relaycom
 			return claude.ClaudeHandler(c, resp, info, claude.RequestModeMessage)
 		}
 	default:
+		if info.RelayMode == relayconstant.RelayModeImagesGenerations {
+			return zhipu4vImageHandler(c, resp, info)
+		}
 		adaptor := openai.Adaptor{}
 		return adaptor.DoResponse(c, resp, info)
 	}
--- a/relay/channel/zhipu_4v/dto.go
+++ b/relay/channel/zhipu_4v/dto.go
@@ -4,6 +4,7 @@ import (
 	"time"
 	"github.com/QuantumNous/new-api/dto"
+	"github.com/QuantumNous/new-api/types"
 )
 //	type ZhipuMessage struct {
@@ -37,7 +38,7 @@ type ZhipuV4Response struct {
 	Model               string                         `json:"model"`
 	TextResponseChoices []dto.OpenAITextResponseChoice `json:"choices"`
 	Usage               dto.Usage                      `json:"usage"`
-	Error               dto.OpenAIError                `json:"error"`
+	Error               types.OpenAIError              `json:"error"`
 }
 //
--- a/relay/channel/zhipu_4v/image.go
+++ b/relay/channel/zhipu_4v/image.go
@@ -0,0 +1,127 @@
+package zhipu_4v
+import (
+	"io"
+	"net/http"
+	"github.com/QuantumNous/new-api/common"
+	"github.com/QuantumNous/new-api/dto"
+	"github.com/QuantumNous/new-api/logger"
+	relaycommon "github.com/QuantumNous/new-api/relay/common"
+	"github.com/QuantumNous/new-api/service"
+	"github.com/QuantumNous/new-api/types"
+	"github.com/gin-gonic/gin"
+)
+type zhipuImageRequest struct {
+	Model            string `json:"model"`
+	Prompt           string `json:"prompt"`
+	Quality          string `json:"quality,omitempty"`
+	Size             string `json:"size,omitempty"`
+	WatermarkEnabled *bool  `json:"watermark_enabled,omitempty"`
+	UserID           string `json:"user_id,omitempty"`
+}
+type zhipuImageResponse struct {
+	Created       *int64            `json:"created,omitempty"`
+	Data          []zhipuImageData  `json:"data,omitempty"`
+	ContentFilter any               `json:"content_filter,omitempty"`
+	Usage         *dto.Usage        `json:"usage,omitempty"`
+	Error         *zhipuImageError  `json:"error,omitempty"`
+	RequestID     string            `json:"request_id,omitempty"`
+	ExtendParam   map[string]string `json:"extendParam,omitempty"`
+}
+type zhipuImageError struct {
+	Code    string `json:"code"`
+	Message string `json:"message"`
+}
+type zhipuImageData struct {
+	Url      string `json:"url,omitempty"`
+	ImageUrl string `json:"image_url,omitempty"`
+	B64Json  string `json:"b64_json,omitempty"`
+	B64Image string `json:"b64_image,omitempty"`
+}
+type openAIImagePayload struct {
+	Created int64             `json:"created"`
+	Data    []openAIImageData `json:"data"`
+}
+type openAIImageData struct {
+	B64Json string `json:"b64_json"`
+}
+func zhipu4vImageHandler(c *gin.Context, resp *http.Response, info *relaycommon.RelayInfo) (*dto.Usage, *types.NewAPIError) {
+	responseBody, err := io.ReadAll(resp.Body)
+	if err != nil {
+		return nil, types.NewOpenAIError(err, types.ErrorCodeReadResponseBodyFailed, http.StatusInternalServerError)
+	}
+	service.CloseResponseBodyGracefully(resp)
+	var zhipuResp zhipuImageResponse
+	if err := common.Unmarshal(responseBody, &zhipuResp); err != nil {
+		return nil, types.NewOpenAIError(err, types.ErrorCodeBadResponseBody, http.StatusInternalServerError)
+	}
+	if zhipuResp.Error != nil && zhipuResp.Error.Message != "" {
+		return nil, types.WithOpenAIError(types.OpenAIError{
+			Message: zhipuResp.Error.Message,
+			Type:    "zhipu_image_error",
+			Code:    zhipuResp.Error.Code,
+		}, resp.StatusCode)
+	}
+	payload := openAIImagePayload{}
+	if zhipuResp.Created != nil && *zhipuResp.Created != 0 {
+		payload.Created = *zhipuResp.Created
+	} else {
+		payload.Created = info.StartTime.Unix()
+	}
+	for _, data := range zhipuResp.Data {
+		url := data.Url
+		if url == "" {
+			url = data.ImageUrl
+		}
+		if url == "" {
+			logger.LogWarn(c, "zhipu_image_missing_url")
+			continue
+		}
+		var b64 string
+		switch {
+		case data.B64Json != "":
+			b64 = data.B64Json
+		case data.B64Image != "":
+			b64 = data.B64Image
+		default:
+			_, downloaded, err := service.GetImageFromUrl(url)
+			if err != nil {
+				logger.LogError(c, "zhipu_image_get_b64_failed: "+err.Error())
+				continue
+			}
+			b64 = downloaded
+		}
+		if b64 == "" {
+			logger.LogWarn(c, "zhipu_image_empty_b64")
+			continue
+		}
+		imageData := openAIImageData{
+			B64Json: b64,
+		}
+		payload.Data = append(payload.Data, imageData)
+	}
+	jsonResp, err := common.Marshal(payload)
+	if err != nil {
+		return nil, types.NewError(err, types.ErrorCodeBadResponseBody)
+	}
+	service.IOCopyBytesGracefully(c, resp, jsonResp)
+	return &dto.Usage{}, nil
+}
--- a/relay/common/override.go
+++ b/relay/common/override.go
@@ -11,6 +11,8 @@ import (
 	"github.com/tidwall/sjson"
 )
+var negativeIndexRegexp = regexp.MustCompile(`\.(-\d+)`)
 type ConditionOperation struct {
 	Path           string      `json:"path"`             // JSON path
 	Mode           string      `json:"mode"`             // full, prefix, suffix, contains, gt, gte, lt, lte
@@ -186,8 +188,7 @@ func checkSingleCondition(jsonStr, contextJSON string, condition ConditionOperat
 }
 func processNegativeIndex(jsonStr string, path string) string {
-	re := regexp.MustCompile(`\.(-\d+)`)
-	matches := re.FindAllStringSubmatch(path, -1)
+	matches := negativeIndexRegexp.FindAllStringSubmatch(path, -1)
 	if len(matches) == 0 {
 		return path
--- a/relay/common/relay_info.go
+++ b/relay/common/relay_info.go
@@ -11,6 +11,7 @@ import (
 	"github.com/QuantumNous/new-api/constant"
 	"github.com/QuantumNous/new-api/dto"
 	relayconstant "github.com/QuantumNous/new-api/relay/constant"
+	"github.com/QuantumNous/new-api/setting/model_setting"
 	"github.com/QuantumNous/new-api/types"
 	"github.com/gin-gonic/gin"
@@ -73,11 +74,17 @@ type ChannelMeta struct {
 	SupportStreamOptions bool // whether stream options are supported
 }
+type TokenCountMeta struct {
+	//promptTokens int
+	estimatePromptTokens int
+}
 type RelayInfo struct {
 	TokenId           int
 	TokenKey          string
 	TokenGroup        string
 	UserId            int
-	UsingGroup        string // group in use
+	UsingGroup        string // group in use; changes when auto retries across groups
 	UserGroup         string // group the user belongs to
 	TokenUnlimited    bool
 	StartTime         time.Time
@@ -91,7 +98,6 @@ type RelayInfo struct {
 	RelayMode              int
 	OriginModelName        string
 	RequestURLPath         string
-	PromptTokens           int
 	ShouldIncludeUsage     bool
 	DisablePing            bool // whether to suppress custom Ping messages sent downstream
 	ClientWs               *websocket.Conn
@@ -115,6 +121,7 @@ type RelayInfo struct {
 	Request dto.Request
 	ThinkingContentInfo
+	TokenCountMeta
 	*ClaudeConvertInfo
 	*RerankerInfo
 	*ResponsesUsageInfo
@@ -189,7 +196,7 @@ func (info *RelayInfo) ToString() string {
 	fmt.Fprintf(b, "IsPlayground: %t, ", info.IsPlayground)
 	fmt.Fprintf(b, "RequestURLPath: %q, ", info.RequestURLPath)
 	fmt.Fprintf(b, "OriginModelName: %q, ", info.OriginModelName)
-	fmt.Fprintf(b, "PromptTokens: %d, ", info.PromptTokens)
+	fmt.Fprintf(b, "EstimatePromptTokens: %d, ", info.estimatePromptTokens)
 	fmt.Fprintf(b, "ShouldIncludeUsage: %t, ", info.ShouldIncludeUsage)
 	fmt.Fprintf(b, "DisablePing: %t, ", info.DisablePing)
 	fmt.Fprintf(b, "SendResponseCount: %d, ", info.SendResponseCount)
@@ -368,6 +375,12 @@ func genBaseRelayInfo(c *gin.Context, request dto.Request) *RelayInfo {
 	//channelId := common.GetContextKeyInt(c, constant.ContextKeyChannelId)
 	//paramOverride := common.GetContextKeyStringMap(c, constant.ContextKeyChannelParamOverride)
 	tokenGroup := common.GetContextKeyString(c, constant.ContextKeyTokenGroup)
+	// An empty token group means the user's group is used
+	if tokenGroup == "" {
+		tokenGroup = common.GetContextKeyString(c, constant.ContextKeyUserGroup)
+	}
 	startTime := common.GetContextKeyTime(c, constant.ContextKeyRequestStartTime)
 	if startTime.IsZero() {
 		startTime = time.Now()
@@ -391,11 +404,11 @@ func genBaseRelayInfo(c *gin.Context, request dto.Request) *RelayInfo {
 		UserEmail:  common.GetContextKeyString(c, constant.ContextKeyUserEmail),
 		OriginModelName: common.GetContextKeyString(c, constant.ContextKeyOriginalModel),
-		PromptTokens:    common.GetContextKeyInt(c, constant.ContextKeyPromptTokens),
 		TokenId:        common.GetContextKeyInt(c, constant.ContextKeyTokenId),
 		TokenKey:       common.GetContextKeyString(c, constant.ContextKeyTokenKey),
 		TokenUnlimited: common.GetContextKeyBool(c, constant.ContextKeyTokenUnlimited),
 		TokenGroup:     tokenGroup,
 		isFirstResponse: true,
 		RelayMode:       relayconstant.Path2RelayMode(c.Request.URL.Path),
@@ -408,6 +421,10 @@ func genBaseRelayInfo(c *gin.Context, request dto.Request) *RelayInfo {
 			IsFirstThinkingContent:  true,
 			SendLastThinkingContent: false,
 		},
+		TokenCountMeta: TokenCountMeta{
+			//promptTokens: common.GetContextKeyInt(c, constant.ContextKeyPromptTokens),
+			estimatePromptTokens: common.GetContextKeyInt(c, constant.ContextKeyEstimatedTokens),
+		},
 	}
 	if info.RelayMode == relayconstant.RelayModeUnknown {
@@ -463,8 +480,16 @@ func GenRelayInfo(c *gin.Context, relayFormat types.RelayFormat, request dto.Req
 	}
 }
-func (info *RelayInfo) SetPromptTokens(promptTokens int) {
-	info.PromptTokens = promptTokens
-}
+//func (info *RelayInfo) SetPromptTokens(promptTokens int) {
+//	info.promptTokens = promptTokens
+//}
+func (info *RelayInfo) SetEstimatePromptTokens(promptTokens int) {
+	info.estimatePromptTokens = promptTokens
+}
+func (info *RelayInfo) GetEstimatePromptTokens() int {
+	return info.estimatePromptTokens
+}
 func (info *RelayInfo) SetFirstResponseTime() {
@@ -610,3 +635,47 @@ func RemoveDisabledFields(jsonData []byte, channelOtherSettings dto.ChannelOther
 	}
 	return jsonDataAfter, nil
 }
+// RemoveGeminiDisabledFields removes disabled fields from Gemini request JSON data
+// Currently supports removing functionResponse.id field which Vertex AI does not support
+func RemoveGeminiDisabledFields(jsonData []byte) ([]byte, error) {
+	if !model_setting.GetGeminiSettings().RemoveFunctionResponseIdEnabled {
+		return jsonData, nil
+	}
+	var data map[string]interface{}
+	if err := common.Unmarshal(jsonData, &data); err != nil {
+		common.SysError("RemoveGeminiDisabledFields Unmarshal error: " + err.Error())
+		return jsonData, nil
+	}
+	// Process contents array
+	// Handle both camelCase (functionResponse) and snake_case (function_response)
+	if contents, ok := data["contents"].([]interface{}); ok {
+		for _, content := range contents {
+			if contentMap, ok := content.(map[string]interface{}); ok {
+				if parts, ok := contentMap["parts"].([]interface{}); ok {
+					for _, part := range parts {
+						if partMap, ok := part.(map[string]interface{}); ok {
+							// Check functionResponse (camelCase)
+							if funcResp, ok := partMap["functionResponse"].(map[string]interface{}); ok {
+								delete(funcResp, "id")
+							}
+							// Check function_response (snake_case)
+							if funcResp, ok := partMap["function_response"].(map[string]interface{}); ok {
+								delete(funcResp, "id")
+							}
+						}
+					}
+				}
+			}
+		}
+	}
+	jsonDataAfter, err := common.Marshal(data)
+	if err != nil {
+		common.SysError("RemoveGeminiDisabledFields Marshal error: " + err.Error())
+		return jsonData, nil
+	}
+	return jsonDataAfter, nil
+}
--- a/relay/common/relay_utils.go
+++ b/relay/common/relay_utils.go
@@ -1,10 +1,7 @@
 package common
 import (
 	"encoding/base64"
 	"errors"
 	"fmt"
 	"io"
 	"net/http"
 	"strconv"
 	"strings"
@@ -229,54 +226,3 @@ func ValidateBasicTaskRequest(c *gin.Context, info *RelayInfo, action string) *d
 	storeTaskRequest(c, info, action, req)
 	return nil
 }
-func GetImagesBase64sFromForm(c *gin.Context) ([]*Base64Data, error) {
-	return GetBase64sFromForm(c, "image")
-}
-func GetImageBase64sFromForm(c *gin.Context) (*Base64Data, error) {
-	base64s, err := GetImagesBase64sFromForm(c)
-	if err != nil {
-		return nil, err
-	}
-	return base64s[0], nil
-}
-type Base64Data struct {
-	MimeType string
-	Data     string
-}
-func (m Base64Data) String() string {
-	return fmt.Sprintf("data:%s;base64,%s", m.MimeType, m.Data)
-}
-func GetBase64sFromForm(c *gin.Context, fieldName string) ([]*Base64Data, error) {
-	mf := c.Request.MultipartForm
-	if mf == nil {
-		if _, err := c.MultipartForm(); err != nil {
-			return nil, fmt.Errorf("failed to parse image edit form request: %w", err)
-		}
-		mf = c.Request.MultipartForm
-	}
-	imageFiles, exists := mf.File[fieldName]
-	if !exists || len(imageFiles) == 0 {
-		return nil, errors.New("field " + fieldName + " is not found or empty")
-	}
-	var imageBase64s []*Base64Data
-	for _, file := range imageFiles {
-		image, err := file.Open()
-		if err != nil {
-			return nil, errors.New("failed to open image file")
-		}
-		defer image.Close()
-		imageData, err := io.ReadAll(image)
-		if err != nil {
-			return nil, errors.New("failed to read image file")
-		}
-		mimeType := http.DetectContentType(imageData)
-		base64Data := base64.StdEncoding.EncodeToString(imageData)
-		imageBase64s = append(imageBase64s, &Base64Data{
-			MimeType: mimeType,
-			Data:     base64Data,
-		})
-	}
-	return imageBase64s, nil
-}
--- a/relay/common_handler/rerank.go
+++ b/relay/common_handler/rerank.go
@@ -57,8 +57,8 @@ func RerankHandler(c *gin.Context, info *relaycommon.RelayInfo, resp *http.Respo
 		jinaResp = dto.RerankResponse{
 			Results: jinaRespResults,
 			Usage: dto.Usage{
-				PromptTokens: info.PromptTokens,
+				PromptTokens: info.GetEstimatePromptTokens(),
-				TotalTokens:  info.PromptTokens,
+				TotalTokens:  info.GetEstimatePromptTokens(),
 			},
 		}
 	} else {
--- a/relay/compatible_handler.go
+++ b/relay/compatible_handler.go
@@ -181,7 +181,7 @@ func TextHelper(c *gin.Context, info *relaycommon.RelayInfo) (newAPIError *types
 		return newApiErr
 	}
-	if strings.HasPrefix(info.OriginModelName, "gpt-4o-audio") {
+	if usage.(*dto.Usage).CompletionTokenDetails.AudioTokens > 0 || usage.(*dto.Usage).PromptTokensDetails.AudioTokens > 0 {
 		service.PostAudioConsumeQuota(c, info, usage.(*dto.Usage), "")
 	} else {
 		postConsumeQuota(c, info, usage.(*dto.Usage), "")
@@ -192,9 +192,9 @@ func TextHelper(c *gin.Context, info *relaycommon.RelayInfo) (newAPIError *types
 func postConsumeQuota(ctx *gin.Context, relayInfo *relaycommon.RelayInfo, usage *dto.Usage, extraContent string) {
 	if usage == nil {
 		usage = &dto.Usage{
-			PromptTokens:     relayInfo.PromptTokens,
+			PromptTokens:     relayInfo.GetEstimatePromptTokens(),
 			CompletionTokens: 0,
-			TotalTokens:      relayInfo.PromptTokens,
+			TotalTokens:      relayInfo.GetEstimatePromptTokens(),
 		}
 		extraContent += "（可能是请求出错）"
 	}
--- a/relay/helper/common.go
+++ b/relay/helper/common.go
@@ -14,15 +14,28 @@ import (
 	"github.com/gorilla/websocket"
 )
-func FlushWriter(c *gin.Context) error {
-	if c.Writer == nil {
+func FlushWriter(c *gin.Context) (err error) {
+	defer func() {
+		if r := recover(); r != nil {
+			err = fmt.Errorf("flush panic recovered: %v", r)
+		}
+	}()
+	if c == nil || c.Writer == nil {
 		return nil
 	}
-	if flusher, ok := c.Writer.(http.Flusher); ok {
-		flusher.Flush()
-		return nil
+
+	if c.Request != nil && c.Request.Context().Err() != nil {
+		return fmt.Errorf("request context done: %w", c.Request.Context().Err())
 	}
-	return errors.New("streaming error: flusher not found")
+
+	flusher, ok := c.Writer.(http.Flusher)
+	if !ok {
+		return errors.New("streaming error: flusher not found")
+	}
+	flusher.Flush()
+	return nil
 }
 func SetEventStreamHeaders(c *gin.Context) {
@@ -66,17 +79,31 @@ func ResponseChunkData(c *gin.Context, resp dto.ResponsesStreamResponse, data st
 }
 func StringData(c *gin.Context, str string) error {
-	//str = strings.TrimPrefix(str, "data: ")
-	//str = strings.TrimSuffix(str, "\r")
+	if c == nil || c.Writer == nil {
+		return errors.New("context or writer is nil")
+	}
+	if c.Request != nil && c.Request.Context().Err() != nil {
+		return fmt.Errorf("request context done: %w", c.Request.Context().Err())
+	}
 	c.Render(-1, common.CustomEvent{Data: "data: " + str})
-	_ = FlushWriter(c)
-	return nil
+	return FlushWriter(c)
 }
 func PingData(c *gin.Context) error {
-	c.Writer.Write([]byte(": PING\n\n"))
-	_ = FlushWriter(c)
-	return nil
+	if c == nil || c.Writer == nil {
+		return errors.New("context or writer is nil")
+	}
+	if c.Request != nil && c.Request.Context().Err() != nil {
+		return fmt.Errorf("request context done: %w", c.Request.Context().Err())
+	}
+	if _, err := c.Writer.Write([]byte(": PING\n\n")); err != nil {
+		return fmt.Errorf("write ping data failed: %w", err)
+	}
+	return FlushWriter(c)
 }
 func ObjectData(c *gin.Context, object interface{}) error {
--- a/relay/helper/price.go
+++ b/relay/helper/price.go
@@ -99,7 +99,10 @@ func ModelPriceHelper(c *gin.Context, info *relaycommon.RelayInfo, promptTokens
 	// check if free model pre-consume is disabled
 	if !operation_setting.GetQuotaSetting().EnableFreeModelPreConsume {
 		// if model price or ratio is 0, do not pre-consume quota
-		if usePrice {
+		if groupRatioInfo.GroupRatio == 0 {
+			preConsumedQuota = 0
+			freeModel = true
+		} else if usePrice {
 			if modelPrice == 0 {
 				preConsumedQuota = 0
 				freeModel = true
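Review note: the order of checks in this hunk can be read as a small decision function. The sketch below is illustrative only: the `pricing` struct, `isFreeModel`, and `preConsume` are hypothetical names, and the ratio branch is an assumption inferred from the code comment ("model price or ratio is 0"), since that part of the function is not shown in the diff.

```go
package main

import "fmt"

// pricing holds hypothetical inputs to the pre-consume decision.
type pricing struct {
	GroupRatio float64
	UsePrice   bool
	ModelPrice float64
	ModelRatio float64
}

// isFreeModel checks the group ratio first, then the fixed price,
// mirroring the order introduced by the hunk.
func isFreeModel(p pricing) bool {
	switch {
	case p.GroupRatio == 0:
		return true
	case p.UsePrice:
		return p.ModelPrice == 0
	default:
		// Assumption: a zero model ratio also marks the model as free.
		return p.ModelRatio == 0
	}
}

// preConsume returns the quota to reserve up front and whether the
// model was treated as free. When free-model pre-consume is enabled,
// the estimate is always reserved.
func preConsume(p pricing, estimated int, enableFreeModelPreConsume bool) (int, bool) {
	if !enableFreeModelPreConsume && isFreeModel(p) {
		return 0, true
	}
	return estimated, false
}

func main() {
	quota, free := preConsume(pricing{GroupRatio: 0, UsePrice: true, ModelPrice: 0.02}, 500, false)
	fmt.Println(quota, free)
}
```

Checking the group ratio before the price means a group with ratio 0 skips pre-consumption even for a model that has a non-zero fixed price.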
--- a/relay/helper/stream_scanner.go
+++ b/relay/helper/stream_scanner.go
@@ -72,6 +72,8 @@ func StreamScannerHandler(c *gin.Context, resp *http.Response, info *relaycommon
 	if common.DebugEnabled {
 		// print timeout and ping interval for debugging
 		println("relay timeout seconds:", common.RelayTimeout)
+		println("relay max idle conns:", common.RelayMaxIdleConns)
+		println("relay max idle conns per host:", common.RelayMaxIdleConnsPerHost)
 		println("streaming timeout seconds:", int64(streamingTimeout.Seconds()))
 		println("ping interval seconds:", int64(pingInterval.Seconds()))
 	}
Author	SHA1	Message	Date
coderabbitai[bot]	40a3e19a78	📝 Add docstrings to `fix/channel-test-responses-fallback` Docstrings generation was requested by @FlowerRealm. * https://github.com/QuantumNous/new-api/pull/2501#issuecomment-3686382220 The following files were modified: * `controller/channel-test.go` * `relay/helper/valid_request.go` * `service/error.go`	2025-12-23 11:56:30 +00:00
CaIon	42109c5840	feat(token): enhance error handling in ValidateUserToken for better clarity	2025-12-22 18:01:38 +08:00
Calcium-Ion	afd9c29ace	Merge pull request #2486 from QuantumNous/docs/readme-update-doc-links-new-routing 🔗 docs(readme): update documentation links to new site routing	2025-12-21 21:28:35 +08:00
t0ng7u	470e0304d8	🔗 docs(readme): revert missing docs links to legacy site Keep new-site links (/{lang}/docs/...) where matching pages exist in the current docs repo Revert links that have no equivalent in the new docs to the legacy paths on doc.newapi.pro: Google Gemini Chat Midjourney-Proxy image docs Suno music docs Apply the same rule consistently across all README translations (zh/en/ja/fr)	2025-12-21 21:18:59 +08:00
t0ng7u	d6e97ab184	🔗 docs(readme): update documentation links to new site routing - Replace legacy `docs.newapi.pro` paths with the new `/{lang}/docs/...` structure across all README translations - Point key sections (installation, env vars, API, support, features) to their new locations - Ensure language-specific links use the correct locale prefix (zh/en/ja) and keep FR aligned with English routes	2025-12-21 21:00:33 +08:00
Calcium-Ion	d8aa327f05	Merge pull request #2483 from seefs001/fix/vertex-function-response-id fix: add a model setting option to filter content[].part[].functionResponse.id for Vertex channels, enabled by default	2025-12-21 17:24:07 +08:00
Seefs	28f7a4feef	fix: filter content[].part[].functionResponse.id in the Vertex adapter	2025-12-21 17:22:04 +08:00
Seefs	5a64ae2a29	fix: add a model setting option to filter content[].part[].functionResponse.id for Vertex channels, enabled by default	2025-12-21 17:09:49 +08:00
CaIon	cc3ba39e72	feat(gin): improve request body handling and error reporting	2025-12-20 13:34:10 +08:00
CaIon	4ee595c448	feat(init): increase MaxRequestBodyMB to enhance request handling	2025-12-20 13:27:55 +08:00
CaIon	d9634ad2d3	feat(channel): add error handling for SaveWithoutKey when channel ID is 0	2025-12-20 13:26:40 +08:00
Seefs	a343ce84ee	Merge pull request #2476 from TinsFox/chore/code-inspector-plugin	2025-12-20 11:04:40 +08:00
TinsFox	e6ec551fbf	chore: add code-inspector-plugin integration	2025-12-19 23:04:53 +08:00
Seefs	a98aad2501	Merge pull request #2474 from TinsFox/main	2025-12-19 21:39:56 +08:00
TinsFox	97132de2ca	style: add card spacing	2025-12-19 21:00:31 +08:00
Seefs	b35ae9f693	Merge pull request #2452 from QuantumNous/fix/oom-request-body-limit	2025-12-16 18:21:59 +08:00
t0ng7u	8cb56fc319	🧹 fix: harden request-body size handling and error unwrapping Tighten oversized request handling across relay paths and make error matching reliable. - Align `MAX_REQUEST_BODY_MB` fallback to `32` in request body reader and decompression middleware - Stop ignoring `GetRequestBody` errors in relay retry paths; return consistent 413 on oversized bodies (400 for other read errors) - Add `Unwrap()` to `types.NewAPIError` so `errors.Is/As` can match wrapped underlying errors - `go test ./...` passes	2025-12-16 18:10:00 +08:00
t0ng7u	8e3f9b1faa	🛡️ fix: prevent OOM on large/decompressed requests; skip heavy prompt meta when token count is disabled Clamp request body size (including post-decompression) to avoid memory exhaustion caused by huge payloads/zip bombs, especially with large-context Claude requests. Add a configurable `MAX_REQUEST_BODY_MB` (default `32`) and document it. - Enforce max request body size after gzip/br decompression via `http.MaxBytesReader` - Add a secondary size guard in `common.GetRequestBody` and cache-safe handling - Return 413 Request Entity Too Large on oversized bodies in relay entry - Avoid building large `TokenCountMeta.CombineText` when both token counting and sensitive check are disabled (use lightweight meta for pricing) - Update READMEs (CN/EN/FR/JA) with `MAX_REQUEST_BODY_MB` - Fix a handful of vet/formatting issues encountered during the change - `go test ./...` passes	2025-12-16 17:00:19 +08:00
Calcium-Ion	11593bd3da	Merge pull request #2445 from QuantumNous/feat/token-ip-whitelist-cidr feat(auth): enhance IP restriction handling with CIDR support	2025-12-15 20:14:09 +08:00
CaIon	e16e7d6fb9	feat(auth): refactor IP restriction handling to use clearer variable naming	2025-12-15 20:13:09 +08:00
CaIon	39593052b6	feat(auth): enhance IP restriction handling with CIDR support	2025-12-15 17:24:09 +08:00
CaIon	4ea8cbd207	Revert "feat(audio): replace SysLog with logger for improved logging in GetAudioDuration" This reverts commit `e293be0138`.	2025-12-14 00:04:40 +08:00
CaIon	e293be0138	feat(audio): replace SysLog with logger for improved logging in GetAudioDuration	2025-12-13 23:59:58 +08:00
CaIon	9c2483ef48	fix(audio): improve WAV duration calculation with enhanced PCM size handling	2025-12-13 23:57:32 +08:00
CaIon	689c43143b	feat(model_ratio): add default ratios for gpt-4o-mini-tts	2025-12-13 19:14:27 +08:00
CaIon	a2da6a9e90	refactor(channel_select): improve retry logic with reset functionality	2025-12-13 18:09:10 +08:00
Calcium-Ion	7a307e2e99	Merge pull request #2434 from QuantumNous/feat/gpt-4o-mini-tts feat: support gpt tts series model quota calculate	2025-12-13 17:55:16 +08:00
CaIon	7cae4a640b	fix(audio): correct TotalTokens calculation for accurate usage reporting	2025-12-13 17:49:57 +08:00
CaIon	e36e2e1b69	feat(audio): enhance audio request handling with token type detection and streaming support	2025-12-13 17:24:23 +08:00
CaIon	b602843ce1	feat(token): add CrossGroupRetry field to token insertion	2025-12-13 16:45:42 +08:00
CaIon	21fca238bf	refactor(error): replace dto.OpenAIError with types.OpenAIError for consistency	2025-12-13 16:43:57 +08:00
CaIon	c51936e068	refactor(channel_select): enhance retry logic and context key usage for channel selection	2025-12-13 16:43:38 +08:00
CaIon	b58fa3debc	fix(helper): improve error handling in FlushWriter and related functions	2025-12-13 13:29:21 +08:00
CaIon	1c167c1068	refactor(auth): replace direct token group setting with context key retrieval	2025-12-13 01:38:12 +08:00
Calcium-Ion	f9b6e4c243	Merge pull request #2430 from QuantumNous/fix/cross-group-retry fix(channel_select): adjust priority retry logic for cross-group	2025-12-13 01:05:40 +08:00
CaIon	b523f6a0ba	fix(channel_select): adjust priority retry logic for cross-group channel selection	2025-12-13 01:04:10 +08:00
Calcium-Ion	30cb224793	Merge pull request #2429 from QuantumNous/feat/xhigh feat(adaptor): add '-xhigh' suffix to reasoning effort options	2025-12-12 22:06:19 +08:00
CaIon	ce6fb95f96	refactor(relay): update channel retrieval to use RelayInfo structure	2025-12-12 22:04:38 +08:00
Calcium-Ion	2ac6a5b02f	Merge pull request #2424 from ion1ze/main fix: correct sender format issues fix #1347	2025-12-12 20:55:22 +08:00
CaIon	50854c17bb	feat(adaptor): add '-xhigh' suffix to reasoning effort options for model parsing	2025-12-12 20:53:48 +08:00
Calcium-Ion	147659fb6e	Merge pull request #2426 from QuantumNous/feat/auto-cross-group-retry feat(token): add cross-group retry option for token processing	2025-12-12 20:45:54 +08:00
Calcium-Ion	e9fb2ccdd1	Merge pull request #2428 from seefs001/fix/health-check fix: health check	2025-12-12 20:45:34 +08:00
Seefs	48a17efade	fix: health check	2025-12-12 20:37:32 +08:00
CaIon	7e1d1350c7	feat: implement cross-group retry functionality and update translations	2025-12-12 18:28:33 +08:00
CaIon	01b4039e96	feat(token): add cross-group retry option for token processing	2025-12-12 17:59:21 +08:00
zdwy5	e1bee48152	fix: support calling AWS via global parameter passthrough or channel parameter passthrough (#2423) * fix: support calling AWS via global parameter passthrough or channel parameter passthrough * fix(aws): replace json.Unmarshal with common.Unmarshal for request body processing --------- Co-authored-by: r0 <liangchunlei@01.ai> Co-authored-by: CaIon <i@caion.me>	2025-12-12 17:09:27 +08:00
zhiheng.wang	c992919d15	fix: correct sender format issues - Adjust sender field format, add space to separate nickname and email address - Ensure email header format complies with standard RFC specifications - Fix potential email client sending exceptions (Tencent Cloud)	2025-12-12 16:19:14 +08:00
Seefs	4e69c98b42	Merge pull request #2412 from seefs001/pr-2372 feat: add openai video remix endpoint	2025-12-11 23:35:23 +08:00
Seefs	ca29fc5702	Merge pull request #2194 from NoahCodeGG/fix/process_channel_error	2025-12-11 18:12:06 +08:00
Calcium-Ion	fca015c6c4	Merge pull request #2397 from seefs001/fix/tool-call-claude fix: try to fix tool call issues	2025-12-09 16:57:24 +08:00
Seefs	23292a5ae9	Merge pull request #2360 from feitianbubu/pr2/fix-price-currency	2025-12-09 14:10:26 +08:00
Calcium-Ion	e346f0bf16	Merge pull request #2398 from seefs001/fix/video-proxy fix: Use channel proxy settings for task query scenarios	2025-12-09 14:05:30 +08:00
Calcium-Ion	cae05c068c	Merge pull request #2396 from seefs001/fix/login fix: Try to fix login error "already logged in" issue	2025-12-09 14:04:48 +08:00
Calcium-Ion	78c10209c0	Merge pull request #2395 from seefs001/fix/siderbar fix: sidebar color overlap	2025-12-09 14:04:26 +08:00
Calcium-Ion	4ffd54c50d	Merge pull request #2394 from seefs001/fix/fetch-model-header-overide fix: fetch upstream models	2025-12-09 14:03:34 +08:00
Calcium-Ion	08466358b2	Merge pull request #2359 from seefs001/fix/qwen-chat-args fix: qwen chat_template_kwargs	2025-12-09 14:01:26 +08:00
Calcium-Ion	5212fbd73d	Merge pull request #2358 from seefs001/fix/regrex-repeat-compile fix: regex repeat compile	2025-12-09 14:01:07 +08:00
Calcium-Ion	b0e120dcab	Merge pull request #2357 from seefs001/feature/go1.25-greengc chore(go): enable greenteagc	2025-12-09 14:00:52 +08:00
Calcium-Ion	9561c7b50f	Merge pull request #2356 from seefs001/feature/zhipiu_4v_image feat: zhipu 4v image generations	2025-12-09 14:00:20 +08:00
Seefs	1cb2b6f882	fix:try to fix tool call issues	2025-12-09 13:55:52 +08:00
Seefs	5889571108	fix: Use channel proxy settings for task query scenarios	2025-12-09 11:15:27 +08:00
Seefs	2e33948842	fix: Add styles only on mobile	2025-12-09 10:46:16 +08:00
Seefs	d1aaa07ad7	fix: Try to fix login error "already logged in" issue	2025-12-08 22:32:45 +08:00
Seefs	ea70c20f8e	fix: sidebar color overlap	2025-12-08 21:25:21 +08:00
Seefs	c7539d11a0	fix: fetch upstream models	2025-12-08 21:14:50 +08:00
Seefs	3ebc713327	Merge pull request #2387 from binorxin/fix-bug fix(go.mod): update the modernc.org/sqlite dependency version	2025-12-08 21:02:18 +08:00
Seefs	72d2a94b0d	Merge pull request #2229 from HynoR/chore/v1 fix: Set default to unsupported value for gpt-5 model series requests	2025-12-08 20:59:30 +08:00
Seefs	12a5c7ce5e	Merge pull request #2368 from oudi/main Increase token name length limit from 30 to 50	2025-12-08 20:48:40 +08:00
Seefs	5eae6a3874	Merge pull request #2375 from FlowerRealm/feat/add-claude-haiku-4-5 feat: add claude-haiku-4-5-20251001 model support	2025-12-08 20:46:02 +08:00
Seefs	7b108a6900	Merge pull request #2388 from FirstMelody/main fix(adaptor): fix reasoning suffix not processing in vertex adapter	2025-12-08 20:45:37 +08:00
borx	3d282ac548	fix(go.mod): update the modernc.org/sqlite dependency version	2025-12-08 01:16:30 +08:00
firstmelody	121746a79e	fix(adaptor): fix reasoning suffix not processing in vertex adapter	2025-12-08 01:12:29 +08:00
FlowerRealm	c3c119a9b4	feat: add claude-haiku-4-5-20251001 model support - Add model to Claude ModelList - Add model ratio (0.5, $1/1M input tokens) - Add completion ratio support (5x, $5/1M output tokens) - Add cache read ratio (0.1, $0.10/1M tokens) - Add cache write ratio (1.25, $1.25/1M tokens) Model specs: - Context window: 200K tokens - Max output: 64K tokens - Release date: October 1, 2025	2025-12-05 18:54:20 +08:00
oudi	6d6e5b3337	Merge pull request #1 from oudi/token-length-patch Increase token name length limit from 30 to 50	2025-12-04 11:21:46 +08:00
oudi	d64205e35a	Increase token name length limit from 30 to 50	2025-12-04 11:18:51 +08:00
CaIon	0b9f6a58bc	feat: make the task query count configurable via the TASK_QUERY_LIMIT environment variable	2025-12-03 19:27:15 +08:00
feitianbubu	293a5de0f8	feat: update price display use current currency symbol	2025-12-03 10:51:03 +08:00
Seefs	c07347f24f	fix: qwen chat_template_kwargs	2025-12-03 00:47:40 +08:00
Seefs	896e4ac671	fix: regex repeat compile	2025-12-03 00:41:47 +08:00
CaIon	7d1bad1b37	fix(token_counter): correct model name reference in image token estimation	2025-12-03 00:25:05 +08:00
Seefs	8e7be25429	chore(go): enable greenteagc	2025-12-02 23:15:20 +08:00
Seefs	2e37347851	feat: zhipu v4 image generations	2025-12-02 22:56:58 +08:00
CaIon	45556c961f	fix(price): adjust pre-consume quota logic for free models based on group ratio	2025-12-02 22:09:48 +08:00
Calcium-Ion	ffc45a756e	Merge pull request #2344 from seefs001/feature/gemini-thinking-level feat: gemini 3 thinking level gemini-3-pro-preview-high	2025-12-02 21:55:43 +08:00
Calcium-Ion	48635360cd	Merge pull request #2355 from QuantumNous/feat/optimize-token-counter feat: refactor token estimation logic	2025-12-02 21:51:09 +08:00
Calcium-Ion	e7e5cc2c05	Merge pull request #2351 from prnake/fix-max-conns fix: try resolve the high concurrency issue to a single host	2025-12-02 21:44:24 +08:00
CaIon	0c051e968f	feat(token_estimator): add concurrency support for multipliers retrieval	2025-12-02 21:38:58 +08:00
CaIon	f5b409d74f	feat: refactor token estimation logic - Introduced new OpenAI text models in `common/model.go`. - Added `IsOpenAITextModel` function to check for OpenAI text models. - Refactored token estimation methods across various channels to use estimated prompt tokens instead of direct prompt token counts. - Updated related functions and structures to accommodate the new token estimation approach, enhancing overall token management.	2025-12-02 21:34:39 +08:00
Calcium-Ion	509d1f633a	Merge pull request #2353 from QuantumNous/openapi chore: update the relay openapi file	2025-12-02 18:18:35 +08:00
t0ng7u	0c6d890f6e	chore: update the relay openapi file	2025-12-02 18:17:01 +08:00
Papersnake	2f7eebcd10	fix: add ForceAttemptHTTP2	2025-12-02 10:08:58 +08:00
Papersnake	3954feb993	fix: set MaxIdleConnsPerHost to 100	2025-12-02 09:55:03 +08:00
Calcium-Ion	d3ca454c3b	Merge pull request #2348 from QuantumNous/openapi chore: update openapi files	2025-12-02 00:32:17 +08:00
t0ng7u	46aca8fad3	chore: update openapi files	2025-12-01 21:39:09 +08:00
Calcium-Ion	86aeb72549	Merge pull request #2346 from QuantumNous/nano-banana-multi-turn feat(gemini): implement markdown image handling in text processing	2025-12-01 18:42:51 +08:00
CaIon	4dbdbdec1d	feat(gemini): implement markdown image handling in text processing	2025-12-01 17:54:41 +08:00
Seefs	b6a02d8303	feat: gemini 3 thinking level gemini-3-pro-preview-high	2025-12-01 16:40:46 +08:00
CaIon	36a739e777	Remove outdated API documentation for authentication, web API, and models (Midjourney, Rerank, Suno). Add OpenAPI specifications for backend management and relay interfaces.	2025-11-30 21:44:05 +08:00
CaIon	98f92f990a	feat(gemini): add validation and conversion for imageConfig parameters in extra_body	2025-11-30 19:31:08 +08:00
CaIon	3f7ea1fd83	fix(vertex): ensure sampleCount is a positive integer and update OtherRatios	2025-11-30 19:05:33 +08:00
Calcium-Ion	f6e7a2344b	Merge pull request #2340 from QuantumNous/revert-2305-pr/add-gemini-3-pro-image-preview-oai Revert "Support gemini 3 pro image preview in the OAI image generation endpoint"	2025-11-30 18:50:16 +08:00
Seefs	3257723a55	Revert "Support gemini 3 pro image preview in the OAI image generation endpoint"	2025-11-30 18:49:18 +08:00
Calcium-Ion	b19b2d62df	Merge pull request #2339 from QuantumNous/revert-2330-pr/fix-nano-banana-err Revert "fix: nano-banana not compatible imageSize"	2025-11-30 18:48:09 +08:00
Calcium-Ion	f9c8624f2c	Merge pull request #2338 from QuantumNous/revert-2321-pr/gemini-image-edit Revert "Support image editing for the Gemini Image series"	2025-11-30 18:48:01 +08:00
Calcium-Ion	6c8253156b	Merge pull request #2337 from QuantumNous/revert-2315-pr/gemini-veo3.1-i2v Revert "Add image-to-video support for Gemini Veo3.1 [AI Studio]"	2025-11-30 18:47:50 +08:00
Calcium-Ion	a66b314f5b	Merge pull request #2336 from QuantumNous/revert-2309-pr/fix-gemini-ImageConfig Revert "fix: gemini image correct generationConfig"	2025-11-30 18:47:39 +08:00
Seefs	e29ff0060d	Revert "fix: nano-banana not compatible imageSize"	2025-11-30 18:46:10 +08:00
Seefs	d4a2c2ab54	Revert "Support image editing for the Gemini Image series"	2025-11-30 18:45:54 +08:00
Seefs	ded463ee57	Revert "Add image-to-video support for Gemini Veo3.1 [AI Studio]"	2025-11-30 18:45:37 +08:00
Seefs	e337936227	Revert "fix: gemini image correct generationConfig"	2025-11-30 18:45:23 +08:00
HynoR	c6125eccb1	fix: Set default to unsupported value for gpt-5 model series requests	2025-11-15 13:28:38 +08:00
NoahCode	138810f19c	fix(channel): update channel identification logic in error processing	2025-11-08 20:33:14 +08:00