This is an alternate PR to solving the same problem as
<https://github.com/openai/codex/pull/8227>.
In this PR, when Ollama is used via `--oss` (or via `model_provider =
"ollama"`), we default it to use the Responses format. At runtime, we do
an Ollama version check, and if the version is older than when Responses
support was added to Ollama, we print out a warning.
Because there's no way of configuring the wire api for a built-in
provider, we temporarily add a new `oss_provider`/`model_provider`
called `"ollama-chat"` that will force the chat format.
Once the `"chat"` format is fully removed (see
<https://github.com/openai/codex/discussions/7782>), `ollama-chat` can
be removed as well
---------
Co-authored-by: Eric Traut <etraut@openai.com>
Co-authored-by: Michael Bolin <mbolin@openai.com>
## Overview
Adds LM Studio OSS support. Closes#1883
### Changes
This PR enhances the behavior of `--oss` flag to support LM Studio as a
provider. Additionally, it introduces a new flag`--local-provider` which
can take in `lmstudio` or `ollama` as values if the user wants to
explicitly choose which one to use.
If no provider is specified `codex --oss` will auto-select the provider
based on whichever is running.
#### Additional enhancements
The default can be set using `oss-provider` in config like:
```
oss_provider = "lmstudio"
```
For non-interactive users, they will need to either provide the provider
as an arg or have it in their `config.toml`
### Notes
For best performance, [set the default context
length](https://lmstudio.ai/docs/app/advanced/per-model) for gpt-oss to
the maximum your machine can support
---------
Co-authored-by: Matt Clayton <matt@lmstudio.ai>
Co-authored-by: Eric Traut <etraut@openai.com>