When using models in parallel (one of the main value propositions of your product!), I inevitably run into conflicts between model providers' API schemas, which eventually breaks some of the models and prevents their further use in that chat. It appears that whichever model's response is made primary in the previous reply then has its provider-specific JSON formatting appended to the chat history and fed into the next response/API call. This leads to cascading schema validation failures that eventually break either the Google/Anthropic models or the DeepInfra models.

All Grok and OpenAI models seem mostly agnostic; they can be run in parallel with models from Google, Anthropic, and DeepInfra without issue. The main conflict is between DeepInfra's API schema for open-source (OS) models and the more proprietary API schemas of Google and Anthropic. Perplexity's API schema seems especially strict: its models cannot be run in parallel with any models not provided by Perplexity.

For now, I have worked around this by keeping my chats segregated between OS and closed-source (CS) models. But this is just a band-aid. If one of your major value propositions is parallel models, this needs to be fixed ASAP.
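To illustrate the kind of fix I mean: a minimal sketch of normalizing each provider's reply into a provider-neutral message before it is appended to the shared chat history, so the next fan-out call never sees another provider's raw JSON shape. The response shapes below are simplified assumptions about each provider family, not their exact schemas, and `normalize_reply` is a hypothetical helper name.

```python
# Hypothetical sketch: reduce provider-specific responses to a neutral
# {role, content} message before storing them in the shared chat history.
# The raw shapes below are simplified assumptions, not exact provider schemas.

def normalize_reply(provider: str, raw: dict) -> dict:
    """Return a provider-neutral assistant message from a raw API response."""
    if provider == "anthropic":
        # Anthropic-style: content is a list of typed blocks
        text = "".join(block.get("text", "") for block in raw.get("content", []))
    elif provider == "google":
        # Gemini-style: candidates -> content -> parts
        parts = raw["candidates"][0]["content"]["parts"]
        text = "".join(part.get("text", "") for part in parts)
    else:
        # OpenAI-compatible (OpenAI, Grok, DeepInfra): choices -> message
        text = raw["choices"][0]["message"]["content"]
    return {"role": "assistant", "content": text}


# Whichever model's reply is made "primary", only the neutral form is
# appended to the history that every provider sees on the next call.
history = [{"role": "user", "content": "Hello"}]
primary = normalize_reply(
    "anthropic",
    {"content": [{"type": "text", "text": "Hi there!"}]},
)
history.append(primary)
```

With something like this in the pipeline, the history stays valid for every provider regardless of which model was primary last turn.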