Creates a model response for the given chat conversation.
Supports the OpenAI, Anthropic, Google, and Bedrock providers; the
provider is resolved automatically from the model name.
Set stream: true to receive partial responses as Server-Sent Events (SSE).
The API key is passed as a Bearer token in the Authorization header.
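As a sketch of how a caller might assemble an authenticated request, assuming an OpenAI-compatible JSON body; the helper name `build_request` and the exact field set are illustrative, not part of this API's contract:

```python
import json

def build_request(api_key: str, model: str, messages: list[dict]) -> tuple[dict, str]:
    """Return (headers, JSON body) for a chat completion call."""
    headers = {
        "Authorization": f"Bearer {api_key}",  # API key as a Bearer token
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "model": model,
        "messages": messages,
        "temperature": 0.7,  # example sampling temperature in the 0-2 range
    })
    return headers, body

headers, body = build_request(
    "sk-example",  # placeholder key, not a real credential
    "gpt-4o",
    [{"role": "user", "content": "Hello!"}],
)
```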
Model identifier (e.g. gpt-4o, claude-3-sonnet).
List of messages comprising the conversation.
Sampling temperature (0–2); higher values make output more random,
lower values more deterministic.
Maximum number of tokens to generate.
Nucleus sampling parameter: only tokens within the top p probability
mass are considered.
Frequency penalty (−2.0 to 2.0).
Presence penalty (−2.0 to 2.0).
Sequences at which the model stops generating further tokens.
Whether to stream partial responses via SSE.
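When streaming is enabled, clients typically consume `data:` lines terminated by a `[DONE]` sentinel. A minimal parsing sketch, assuming that common SSE chunk shape (the exact delta structure may differ by provider):

```python
import json

def parse_sse(raw: str) -> list[dict]:
    """Collect JSON chunks from an SSE response body."""
    chunks = []
    for line in raw.splitlines():
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunks.append(json.loads(payload))
    return chunks

# Example stream body with two content deltas.
stream = (
    'data: {"choices": [{"delta": {"content": "Hel"}}]}\n\n'
    'data: {"choices": [{"delta": {"content": "lo"}}]}\n\n'
    'data: [DONE]\n\n'
)
text = "".join(
    c["choices"][0]["delta"].get("content", "") for c in parse_sse(stream)
)
```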
List of tools the model may call.
Controls which tool is called. Can be "none", "auto",
or an object like {"type": "function", "function": {"name": "my_fn"}}.
Seed for best-effort deterministic sampling.
End-user identifier for abuse monitoring.
Number of completions to generate.
Whether to return log probabilities of the output tokens.
Number of most likely tokens to return at each position (0–20);
requires log probabilities to be enabled.
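The sampling-control fields above can be combined in one request body; a hedged sketch where every value is an example, not a recommended setting:

```python
# Illustrative fragment of a request body; values are examples only.
sampling = {
    "seed": 42,           # best-effort reproducibility across identical requests
    "n": 2,               # generate two completion choices
    "logprobs": True,     # include token log probabilities
    "top_logprobs": 5,    # top 5 alternatives per position (0-20 allowed)
    "user": "user-1234",  # end-user identifier for abuse monitoring
}
```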