Support OpenAI Flex Processing (and Maybe Batch Too).
Scott
OpenAI released o3 and o4-mini with the option to reduce cost by specifying flex processing. https://platform.openai.com/docs/guides/flex-processing
Please add an option to use flex processing.
Jourin Town
Batch can be hard to support, since a job can take up to 24 hours to complete and you have to harvest the results afterwards instead of using the normal Chat Completions API.
The flex tier, on the other hand, only needs one extra body field:
{"service_tier": "flex"}
It still uses the same API; you just wait a little longer for 50% of the price. It's a great deal, and I really hope this can be supported to save money.
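A minimal sketch of what that looks like, assuming the plain Chat Completions request body (the model name and prompt here are placeholders):

```python
import json

# Hypothetical Chat Completions request body; the extra "service_tier"
# field is the only difference from a normal request.
payload = {
    "model": "o3",  # flex applies to supported models such as o3 / o4-mini
    "messages": [{"role": "user", "content": "Hello"}],
    "service_tier": "flex",  # ~50% cheaper, but responses may take longer
}
print(json.dumps(payload, indent=2))
```

Since flex requests can wait in a queue, a client would likely also want a longer request timeout and retry handling for resource-unavailable errors, per the flex processing guide linked above.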
Robb
Commenting for support. The 50% discount is meaningful.
I would appreciate an option to set a default 'service_tier' parameter or flex mode, with a toggle in the chat interface to switch away from the default (like extended thinking for Claude 3.7).