Unsafe Mode: Remote Models with Best-Effort Determinism¶
⚠️ WARNING: Remote models provide only best-effort determinism. Results may vary between calls, environments, and over time. For true determinism, use local GGUF models (default SteadyText behavior).
Overview¶
SteadyText's unsafe mode allows you to use remote AI models (OpenAI, Cerebras, etc.) that support seed parameters for reproducibility. While these models attempt to provide consistent outputs when given the same seed, they cannot guarantee the same level of determinism as local models.
Why "Unsafe"?¶
Remote models are considered "unsafe" because:
- No Guaranteed Determinism: Results may vary despite using the same seed
- External Dependencies: Relies on third-party APIs that may change
- Version Changes: Model updates can alter outputs
- Infrastructure Variability: Different servers may produce different results
- API Costs: Unlike local models, remote models incur per-token charges
Prerequisites¶
To use unsafe mode with OpenAI models, you need to install the OpenAI client:
Enabling Unsafe Mode¶
Unsafe mode requires explicit opt-in via environment variable:
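For example, in a shell session:

```shell
export STEADYTEXT_UNSAFE_MODE=true
```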
Supported Providers¶
OpenAI¶
Supported models (all models available through the OpenAI API):
- `gpt-4o` and `gpt-4o-mini` (recommended for seed support)
- `gpt-5-mini` and `gpt-5-pro` (reasoning models, temperature automatically set to 1.0)
- `o1-preview` and `o1-mini` (reasoning models, temperature automatically set to 1.0)
- `gpt-4-turbo` and variants
- `gpt-3.5-turbo` and variants
- Any future models accessible via the OpenAI API
Note on Reasoning Models (v2025.8.17+): GPT-5 series and o1 series are reasoning models that require `temperature=1.0`. SteadyText automatically adjusts the temperature for these models.
Setup:
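A minimal sketch, assuming SteadyText reads the standard `OPENAI_API_KEY` variable used by the OpenAI client (the `sk-...` value is a placeholder for your own key):

```shell
export OPENAI_API_KEY="sk-..."   # placeholder: your OpenAI API key
export STEADYTEXT_UNSAFE_MODE=true
```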
Note: The provider dynamically supports all models available through your OpenAI account.
Cerebras¶
Supported models (all models available through the Cerebras Cloud API):
- `llama3.1-8b` and `llama3.1-70b`
- `llama3-8b` and `llama3-70b`
- Any future models accessible via the Cerebras API
Setup:
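A minimal sketch, assuming the conventional `CEREBRAS_API_KEY` variable (the `csk-...` value is a placeholder for your own key):

```shell
export CEREBRAS_API_KEY="csk-..."   # placeholder: your Cerebras API key
export STEADYTEXT_UNSAFE_MODE=true
```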
Note: The provider dynamically supports all models available through your Cerebras account.
Usage¶
Python API¶
```python
import os

import steadytext

# Enable unsafe mode
os.environ["STEADYTEXT_UNSAFE_MODE"] = "true"

# Use OpenAI
text = steadytext.generate(
    "Explain quantum computing",
    model="openai:gpt-4o-mini",
    seed=42  # Best-effort determinism
)

# Use GPT-5 reasoning models (v2025.8.17+)
text = steadytext.generate(
    "Complex reasoning task",
    model="openai:gpt-5-mini",
    unsafe_mode=True
    # Note: temperature automatically set to 1.0 for reasoning models
)

# Pass custom provider options (v2025.8.17+)
text = steadytext.generate(
    "Creative writing",
    model="openai:gpt-4o-mini",
    unsafe_mode=True,
    options={"top_p": 0.95, "presence_penalty": 0.5}
)

# Use Cerebras
text = steadytext.generate(
    "Write a Python function",
    model="cerebras:llama3.1-8b",
    seed=42
)

# Streaming also supported
for token in steadytext.generate_iter(
    "Tell me a story",
    model="openai:gpt-4o-mini"
):
    print(token, end='')

# Structured generation (v2.6.1+: full support)
from pydantic import BaseModel

class Person(BaseModel):
    name: str
    age: int

# JSON generation with schemas
result = steadytext.generate(
    "Create a person named Alice, age 30",
    model="openai:gpt-4o-mini",
    schema=Person,
    unsafe_mode=True
)

# Regex-constrained generation
phone = steadytext.generate(
    "My phone number is",
    model="openai:gpt-4o-mini",
    regex=r"\d{3}-\d{3}-\d{4}",
    unsafe_mode=True
)

# Choice-constrained generation
sentiment = steadytext.generate(
    "This product is amazing!",
    model="openai:gpt-4o-mini",
    choices=["positive", "negative", "neutral"],
    unsafe_mode=True
)
```
CLI¶
```bash
# Enable unsafe mode
export STEADYTEXT_UNSAFE_MODE=true

# Generate with OpenAI
echo "Explain AI" | st --unsafe-mode --model openai:gpt-4o-mini

# Use GPT-5 reasoning models (v2025.8.17+)
echo "Solve this problem" | st --unsafe-mode --model openai:gpt-5-mini
echo "Complex reasoning" | st --unsafe-mode --model openai:gpt-5-pro

# Generate with Cerebras
echo "Write code" | st --unsafe-mode --model cerebras:llama3.1-8b

# With custom seed for reproducibility
echo "Tell me a story" | st --unsafe-mode --model openai:gpt-4o-mini --seed 123

# Pass custom provider options (v2025.8.17+)
echo "Creative writing" | st --unsafe-mode --model openai:gpt-4o-mini \
  --options '{"top_p": 0.95, "presence_penalty": 0.5}'

# Structured generation with remote models
echo "Create a person" | st --unsafe-mode --model openai:gpt-4o-mini \
  --schema '{"type": "object", "properties": {"name": {"type": "string"}, "age": {"type": "integer"}}}' \
  --wait
```
Limitations¶
When using unsafe mode:
- Full Structured Output (v2.6.1+): Remote models now support JSON schemas, regex patterns, and choice constraints
- No Logprobs: Log probabilities are not available from remote APIs
- No Embeddings: Only generation is supported, not embeddings (except for dedicated embedding providers like VoyageAI and Jina)
- Best-Effort Only: Determinism is not guaranteed despite seed parameters
- Reasoning Models (v2025.8.17+): GPT-5 and o1 series require `temperature=1.0` and don't support temperature tuning
Best Practices¶
- Use for Prototyping: Test ideas with remote models, then switch to local models for production
- Document Variability: Note that outputs may change over time
- Set Temperature to 0: Use `temperature=0` for maximum consistency (except for reasoning models, which require `temperature=1.0`)
- Version Lock: Document which model versions you're using
- Fallback Planning: Have a plan for when remote APIs are unavailable
- Custom Options: Use the `options` parameter to fine-tune provider-specific behaviors like `top_p`, `presence_penalty`, etc.
Warning Messages¶
When using unsafe mode, you'll see warnings like:
```text
======================================================================
UNSAFE MODE WARNING: Using OpenAI (gpt-4o-mini) remote model
======================================================================
You are using a REMOTE model that provides only BEST-EFFORT determinism.
Results may vary between:
- Different API calls
- Different environments
- Different times
- Provider infrastructure changes
For TRUE determinism, use local GGUF models (default SteadyText behavior).
======================================================================
```
Custom Provider Options (v2025.8.17+)¶
You can pass provider-specific parameters using the `options` parameter to fine-tune model behavior:
Python API¶
```python
result = steadytext.generate(
    "Write a creative story",
    model="openai:gpt-4o-mini",
    unsafe_mode=True,
    options={
        "top_p": 0.95,             # Nucleus sampling
        "presence_penalty": 0.6,   # Reduce repetition
        "frequency_penalty": 0.3,  # Encourage diversity
        "max_tokens": 1000         # Custom token limit
    }
)
```
CLI¶
```bash
echo "Write a creative story" | st --unsafe-mode \
  --model openai:gpt-4o-mini \
  --options '{"top_p": 0.95, "presence_penalty": 0.6}'
```
Common options for OpenAI models:
- `top_p`: Nucleus sampling threshold (0-1)
- `presence_penalty`: Reduce repetition (-2.0 to 2.0)
- `frequency_penalty`: Encourage diversity (-2.0 to 2.0)
- `max_tokens`: Override default max token limit
- `stop`: Custom stop sequences (array of strings)
Note: Available options vary by provider. Consult your provider's API documentation for supported parameters.
Comparison: Local vs Remote¶
| Feature | Local Models (Default) | Remote Models (Unsafe) |
|---|---|---|
| Determinism | ✅ Guaranteed | ⚠️ Best-effort only |
| Cost | ✅ Free after download | ❌ Per-token charges |
| Speed | ✅ Fast (local) | ❌ Network latency |
| Privacy | ✅ Fully private | ❌ Data sent to API |
| Offline | ✅ Works offline | ❌ Requires internet |
| Models | Limited selection | Many options |
| Custom Options | Limited | ✅ Full provider API access |
Troubleshooting¶
"Unsafe mode requires STEADYTEXT_UNSAFE_MODE=true"¶
Set the environment variable:
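For example:

```shell
export STEADYTEXT_UNSAFE_MODE=true
```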
"Provider not available"¶
Check your API key:
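A quick diagnostic sketch that reports whether each provider key is present without printing the key itself (variable names assume the conventional `OPENAI_API_KEY` and `CEREBRAS_API_KEY`):

```shell
# Report whether each provider key is set, without leaking its value
for var in OPENAI_API_KEY CEREBRAS_API_KEY; do
  if [ -n "$(printenv "$var")" ]; then
    echo "$var: set"
  else
    echo "$var: MISSING"
  fi
done
```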
"Model does not support seed parameter"¶
Use only models listed in the supported models section above.
Migration Path¶
- Prototype with remote models for flexibility
- Evaluate outputs and identify core use cases
- Switch to local models for production deployment
- Maintain deterministic outputs over time
Remember: SteadyText's core value is deterministic text generation. Use unsafe mode only when you explicitly need remote model capabilities and understand the trade-offs.