When using models in parallel (one of the main value propositions of your product!), I inevitably run into conflicts between model providers' API schemas, which eventually breaks some of the models and prevents their further use in that chat. It appears that whichever model's response is made primary in the previous reply then has its provider-specific JSON formatting appended to the chat history and fed into the next response/API call. This leads to cascading schema validation failures that eventually break either the Google/Anthropic models or the DeepInfra models.

All Grok and OpenAI models seem mostly agnostic; they can be run in parallel with models from Google, Anthropic, and DeepInfra without issue. The main conflict is between DeepInfra's API schema for open-source (OS) models and the more proprietary API schemas of Google and Anthropic. Perplexity's API schema seems especially strict: its models cannot be run in parallel with any models not provided by Perplexity.

For now, I have worked around this by keeping my chats segregated between OS and closed-source (CS) models. But this is just a band-aid. If one of your major value propositions is parallel models, this needs to be fixed ASAP.
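To illustrate the kind of fix I mean: a minimal sketch of normalizing each provider's reply into a provider-neutral message before it is appended to the shared chat history, so the next fan-out call never sees another provider's raw JSON shape. The response shapes below are simplified assumptions about each provider family, not their exact schemas, and `normalize_reply` is a hypothetical helper name.

```python
# Hypothetical sketch: reduce provider-specific responses to a neutral
# {role, content} message before storing them in the shared chat history.
# The raw shapes below are simplified assumptions, not exact provider schemas.

def normalize_reply(provider: str, raw: dict) -> dict:
    """Return a provider-neutral assistant message from a raw API response."""
    if provider == "anthropic":
        # Anthropic-style: content is a list of typed blocks
        text = "".join(block.get("text", "") for block in raw.get("content", []))
    elif provider == "google":
        # Gemini-style: candidates -> content -> parts
        parts = raw["candidates"][0]["content"]["parts"]
        text = "".join(part.get("text", "") for part in parts)
    else:
        # OpenAI-compatible (OpenAI, Grok, DeepInfra): choices -> message
        text = raw["choices"][0]["message"]["content"]
    return {"role": "assistant", "content": text}


# Whichever model's reply is made "primary", only the neutral form is
# appended to the history that every provider sees on the next call.
history = [{"role": "user", "content": "Hello"}]
primary = normalize_reply(
    "anthropic",
    {"content": [{"type": "text", "text": "Hi there!"}]},
)
history.append(primary)
```

With something like this in the pipeline, the history stays valid for every provider regardless of which model was primary last turn.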