Frequently Asked Questions (FAQ)¶
Find answers to common questions about SteadyText, troubleshooting tips, and best practices.
Table of Contents¶
- General Questions
- Installation & Setup
- Usage Questions
- Performance Questions
- Model Questions
- Caching Questions
- Daemon Questions
- PostgreSQL Extension
- Troubleshooting
- Advanced Topics
General Questions¶
What is SteadyText?¶
SteadyText is a deterministic AI text generation and embedding library for Python. It ensures that the same input always produces the same output, making it ideal for:
- Reproducible research
- Testing AI-powered applications
- Consistent embeddings for search
- Deterministic content generation
How is SteadyText different from other AI libraries?¶
Feature | SteadyText | Other Libraries |
---|---|---|
Deterministic | ✅ Always | ❌ Usually random |
Never fails | ✅ Returns None | ❌ Throws exceptions |
Zero config | ✅ Works instantly | ❌ Complex setup |
Built-in cache | ✅ Automatic | ❌ Manual setup |
PostgreSQL | ✅ Native extension | ❌ External integration |
What models does SteadyText use?¶
- Text Generation: Gemma-3n models (2B and 4B parameters)
- Embeddings: Qwen3-Embedding-0.6B (1024 dimensions)
- Format: GGUF quantized models for efficiency
Is SteadyText suitable for production?¶
Yes! SteadyText is designed for production use:
- Daemon mode: 160x faster first responses, because models stay loaded in memory
- Thread-safe: Handles concurrent requests
- Resource efficient: Quantized models use less memory
- Battle-tested: Used in production environments
- PostgreSQL integration: Database-native AI (the extension itself is still experimental; see PostgreSQL Extension)
Installation & Setup¶
How do I install SteadyText?¶
# Using pip
pip install steadytext
# Using UV (recommended)
uv add steadytext
# With PostgreSQL extension
# (quoted so shells such as zsh do not expand the brackets)
pip install "steadytext[postgres]"
What are the system requirements?¶
- Python: 3.8 or higher
- Memory: 4GB RAM minimum (8GB recommended)
- Disk: 2GB for models
- OS: Linux, macOS, Windows
Do I need a GPU?¶
No, SteadyText is optimized for CPU inference. GPU support is planned for future releases.
How do I verify the installation?¶
# Check CLI
st --version
# Test generation
echo "Hello world" | st
# Python test
python -c "import steadytext; print(steadytext.generate('Hello'))"
Usage Questions¶
How do I ensure deterministic results?¶
Use the same seed value:
# Always produces the same output
result1 = steadytext.generate("Hello", seed=42)
result2 = steadytext.generate("Hello", seed=42)
assert result1 == result2
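The same check can be wrapped in a small helper for a regression test. This is a minimal sketch, not part of SteadyText: `is_deterministic` is a hypothetical name, and the only assumption is that the function passed in accepts a prompt and a `seed` keyword, as `steadytext.generate` does in the snippet above.

```python
from typing import Callable, Optional


def is_deterministic(
    generate: Callable[..., Optional[str]],
    prompt: str,
    seed: int = 42,
    runs: int = 3,
) -> bool:
    """Return True if `runs` calls with the same prompt and seed all agree."""
    # Call the generator repeatedly with identical arguments.
    outputs = [generate(prompt, seed=seed) for _ in range(runs)]
    # Deterministic means every output equals the first one.
    return all(output == outputs[0] for output in outputs)
```

In a test suite this could be called as `assert is_deterministic(steadytext.generate, "Hello")`.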
Can I use custom prompts?¶
Yes, any text prompt works:
# Simple prompts
text = steadytext.generate("Write a poem")
# Complex prompts with instructions
prompt = """
You are a helpful assistant. Please:
1. Summarize the following text
2. Extract key points
3. Suggest improvements
Text: [your text here]
"""
result = steadytext.generate(prompt)
How do I generate longer texts?¶
Adjust the max_new_tokens parameter:
# Shorter output (the default is 512 tokens)
short = steadytext.generate("Story", max_new_tokens=100)
# Longer output
long = steadytext.generate("Story", max_new_tokens=2000)
Can I stream the output?¶
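Yes. steadytext.generate_iter yields the output chunk by chunk as it is generated; see the streaming pattern under Common Patterns.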
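A minimal sketch of consuming a stream while also keeping the full text. `stream_to` is a hypothetical helper, not a SteadyText function, and it assumes each chunk is a string, as the `print(chunk, end='')` pattern in the Quick Reference suggests.

```python
import sys
from typing import Iterable, List, TextIO


def stream_to(chunks: Iterable[str], out: TextIO = sys.stdout) -> str:
    """Write each chunk to `out` as it arrives and return the joined text."""
    parts: List[str] = []
    for chunk in chunks:
        out.write(chunk)
        out.flush()  # show partial output immediately instead of buffering
        parts.append(chunk)
    return "".join(parts)
```

With SteadyText installed, usage would look like `full_text = stream_to(steadytext.generate_iter("Tell a story"))`.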
How do embeddings work?¶
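import numpy as np
import steadytext
# Create embedding
embedding = steadytext.embed("Machine learning")
# Returns: numpy array of shape (1024,)
# Compare similarity
emb1 = steadytext.embed("cat")
emb2 = steadytext.embed("dog")
# Cosine similarity: dot product divided by the product of the vector norms
similarity = np.dot(emb1, emb2) / (np.linalg.norm(emb1) * np.linalg.norm(emb2))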
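For search, the usual next step is ranking several documents against a query. The sketch below uses only the standard library and works on any sequence of floats, including the arrays returned by `steadytext.embed`. `cosine_similarity` and `rank_by_similarity` are hypothetical helper names, not SteadyText functions.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(
    query: Sequence[float], documents: Dict[str, Sequence[float]]
) -> List[Tuple[str, float]]:
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, cosine_similarity(query, vec)) for name, vec in documents.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

Dividing by the norms makes the score independent of vector length, so the ranking stays valid whether or not the embeddings are already normalized.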
Performance Questions¶
Why is the first generation slow?¶
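The first call loads the model into memory (2-3 seconds). Subsequent calls are fast (<100ms). To avoid this delay, preload the models with st models preload, or start the daemon (st daemon start) so they stay loaded.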
How can I improve performance?¶
- Use daemon mode (160x faster): st daemon start
- Keep caching enabled (it is on by default)
- Batch operations: results = [steadytext.generate(p) for p in prompts]
What are typical response times?¶
Operation | First Call | Cached | With Daemon |
---|---|---|---|
Generate | 2-3s | <10ms | <20ms |
Embed | 1-2s | <5ms | <15ms |
Batch (100) | 3-5s | <100ms | <200ms |
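These figures depend on hardware, so it can be worth measuring them locally. The following is a rough sketch, not one of the project's benchmark scripts. `timed` and `compare_first_and_repeat` are hypothetical helpers that time any callable.

```python
import time
from typing import Any, Callable, Tuple


def timed(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call fn and return (result, elapsed milliseconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms


def compare_first_and_repeat(
    fn: Callable[..., Any], *args: Any, **kwargs: Any
) -> Tuple[float, float]:
    """Time the same call twice: the first run, then an identical repeat."""
    _, first_ms = timed(fn, *args, **kwargs)
    _, repeat_ms = timed(fn, *args, **kwargs)
    return first_ms, repeat_ms
```

For example, `compare_first_and_repeat(steadytext.generate, "Hello", seed=42)` would show the gap between the initial model load and a cached repeat.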
Model Questions¶
Can I use different model sizes?¶
Yes, SteadyText supports multiple model sizes:
# CLI
st generate "Hello" --size small # Fast, 2B parameters
st generate "Hello" --size large # Better quality, 4B parameters
# Python
text = steadytext.generate("Hello", model_size="large")
Can I use custom models?¶
Currently, SteadyText uses pre-selected models for consistency. Custom model support is planned for future releases.
How much disk space do models use?¶
- Small generation model: ~1.3GB
- Large generation model: ~2.1GB
- Embedding model: ~0.6GB
- Total (all models): ~4GB
Where are models stored?¶
Models are cached in platform-specific directories:
# Linux/Mac
~/.cache/steadytext/models/
# Windows
%LOCALAPPDATA%\steadytext\steadytext\models\
# Check location
from steadytext.utils import get_model_cache_dir
print(get_model_cache_dir())
Caching Questions¶
How does caching work?¶
SteadyText uses a frecency cache (frequency + recency):
# First call: generates and caches
result1 = steadytext.generate("Hello", seed=42) # Slow
# Second call: returns from cache
result2 = steadytext.generate("Hello", seed=42) # Instant
# Different seed: new generation
result3 = steadytext.generate("Hello", seed=123) # Slow
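To illustrate what "frecency" means, here is a toy cache that evicts the entry with the fewest hits, breaking ties by least recent use. It is only a sketch of the idea under that assumed scoring rule. The `FrecencyCache` class is hypothetical and is not SteadyText's actual cache implementation.

```python
from typing import Dict, Hashable, Optional


class FrecencyCache:
    """Toy cache: evicts the least-used entry, oldest first on ties."""

    def __init__(self, capacity: int) -> None:
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._values: Dict[Hashable, str] = {}
        self._hits: Dict[Hashable, int] = {}
        self._last_used: Dict[Hashable, int] = {}
        self._tick = 0  # logical clock, advanced on every access

    def _touch(self, key: Hashable) -> None:
        self._tick += 1
        self._hits[key] = self._hits.get(key, 0) + 1
        self._last_used[key] = self._tick

    def get(self, key: Hashable) -> Optional[str]:
        if key not in self._values:
            return None
        self._touch(key)
        return self._values[key]

    def put(self, key: Hashable, value: str) -> None:
        if key not in self._values and len(self._values) >= self.capacity:
            # Lowest (hits, last_used) pair loses: frequency first, then recency.
            victim = min(
                self._values, key=lambda k: (self._hits[k], self._last_used[k])
            )
            for table in (self._values, self._hits, self._last_used):
                del table[victim]
        self._values[key] = value
        self._touch(key)
```

Keying entries by `(prompt, seed)` would reproduce the behavior shown above, where the same prompt with a different seed is a cache miss.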
Can I disable caching?¶
# Disable via environment variable
export STEADYTEXT_DISABLE_CACHE=1
# Or in Python
import os
os.environ['STEADYTEXT_DISABLE_CACHE'] = '1'
How do I clear the cache?¶
# CLI
st cache --clear
# Python
from steadytext import get_cache_manager
cache_manager = get_cache_manager()
cache_manager.clear_all_caches()
How much cache space is used?¶
# Check cache statistics
from steadytext import get_cache_manager
stats = get_cache_manager().get_cache_stats()
print(f"Generation cache: {stats['generation']['size']} entries")
print(f"Embedding cache: {stats['embedding']['size']} entries")
Daemon Questions¶
What is daemon mode?¶
The daemon is a background service that keeps models loaded in memory, providing 160x faster first responses.
How do I start the daemon?¶
# Start in background
st daemon start
# Start in foreground (see logs)
st daemon start --foreground
# Check status
st daemon status
Is the daemon used automatically?¶
Yes. When the daemon is running, all SteadyText operations use it automatically, so calls such as steadytext.generate("Hello") need no code changes.
How do I stop the daemon?¶
# Stop the background daemon
st daemon stop
Can I run multiple daemons?¶
Currently, only one daemon instance is supported per machine. Multi-daemon support is planned for future releases.
PostgreSQL Extension¶
How do I install pg_steadytext?¶
# Using Docker (recommended)
cd pg_steadytext
docker build -t pg_steadytext .
docker run -d -p 5432:5432 pg_steadytext
# Manual installation
cd pg_steadytext
make && sudo make install
How do I use it in SQL?¶
-- Enable extension
CREATE EXTENSION pg_steadytext;
-- Generate text
SELECT steadytext_generate('Write a SQL tutorial');
-- Create embeddings
SELECT steadytext_embed('PostgreSQL database');
Is it production-ready?¶
The PostgreSQL extension is currently experimental. Use with caution in production environments.
Troubleshooting¶
"Model not found" error¶
# Download models manually
st models download --all
# Or set environment variable
export STEADYTEXT_ALLOW_MODEL_DOWNLOADS=true
"None" returned instead of text¶
This is the expected behavior in v2.1.0+ when models can't be loaded:
# Check if generation succeeded
result = steadytext.generate("Hello")
if result is None:
    print("Model not available")
else:
    print(f"Generated: {result}")
Daemon won't start¶
# Check if port is in use
lsof -i :5557
# Try different port
st daemon start --port 5558
# Check logs
st daemon start --foreground
High memory usage¶
# Use smaller model
st generate "Hello" --size small
# Limit cache size
export STEADYTEXT_GENERATION_CACHE_MAX_SIZE_MB=50
export STEADYTEXT_EMBEDDING_CACHE_MAX_SIZE_MB=100
Slow generation¶
# Start daemon for faster responses
st daemon start
# Check cache is working
st cache --status
# Use smaller model
st generate "Hello" --size small
Advanced Topics¶
How do I use SteadyText in production?¶
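- Use daemon mode: st daemon start
- Configure caching: set STEADYTEXT_GENERATION_CACHE_MAX_SIZE_MB and STEADYTEXT_EMBEDDING_CACHE_MAX_SIZE_MB to bound cache size
- Monitor performance: check st daemon status and st cache --status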
Can I use SteadyText with async code?¶
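import asyncio
from concurrent.futures import ThreadPoolExecutor
import steadytext
# Shared thread pool so blocking generate() calls do not stall the event loop
executor = ThreadPoolExecutor(max_workers=4)
async def async_generate(prompt):
    # get_running_loop() is the supported call inside a coroutine
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(executor, steadytext.generate, prompt)
# Use in async function
async def main():
    result = await async_generate("Hello")
    print(result)
asyncio.run(main())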
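The same executor approach extends to several prompts at once. This is a minimal sketch rather than an official API: `generate_many` is a hypothetical helper that takes the blocking function as a parameter and returns results in the same order as the prompts.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Optional, Sequence


async def generate_many(
    generate: Callable[[str], Optional[str]],
    prompts: Sequence[str],
    max_workers: int = 4,
) -> List[Optional[str]]:
    """Run a blocking generate function over many prompts concurrently."""
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Each prompt runs in a worker thread; gather preserves input order.
        tasks = [loop.run_in_executor(pool, generate, prompt) for prompt in prompts]
        return list(await asyncio.gather(*tasks))
```

Inside a coroutine this would be called as `results = await generate_many(steadytext.generate, ["One", "Two", "Three"])`.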
How do I handle errors gracefully?¶
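import logging
import steadytext
logger = logging.getLogger(__name__)
def safe_generate(prompt, fallback="Unable to generate"):
    try:
        result = steadytext.generate(prompt)
        # generate() returns None when the model is unavailable
        if result is None:
            return fallback
        return result
    except Exception as e:
        logger.error(f"Generation failed: {e}")
        return fallback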
Can I use SteadyText with langchain?¶
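import steadytext
# Newer LangChain releases also expose LLM from langchain_core.language_models.llms
from langchain.llms.base import LLM
class SteadyTextLLM(LLM):
    def _call(self, prompt: str, stop=None, run_manager=None, **kwargs) -> str:
        result = steadytext.generate(prompt)
        # LangChain expects a string, so map None to an empty string
        return result if result else ""
    @property
    def _llm_type(self) -> str:
        return "steadytext"
# Use with langchain
llm = SteadyTextLLM()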
How do I benchmark performance?¶
# Run built-in benchmarks
cd benchmarks
python run_all_benchmarks.py
# Quick benchmark
python run_all_benchmarks.py --quick
Can I contribute to SteadyText?¶
Yes! We welcome contributions:
- Fork the repository
- Create a feature branch
- Make your changes
- Run tests: uv run pytest
- Submit a pull request
See CONTRIBUTING.md for details.
Still Have Questions?¶
- GitHub Issues: Report bugs or request features
- Discussions: Join the community
- Documentation: Read the full docs
Quick Reference¶
Common Commands¶
# Generation
echo "prompt" | st
st generate "prompt" --seed 42
# Embeddings
st embed "text"
st embed "text" --format numpy
# Daemon
st daemon start
st daemon status
st daemon stop
# Cache
st cache --status
st cache --clear
# Models
st models list
st models download --all
st models preload
Common Patterns¶
# Basic usage
import steadytext
# Generate text
text = steadytext.generate("Hello world")
# Create embedding
embedding = steadytext.embed("Hello world")
# With custom seed
text = steadytext.generate("Hello", seed=123)
# Streaming
for chunk in steadytext.generate_iter("Tell a story"):
    print(chunk, end='')
# Batch processing
prompts = ["One", "Two", "Three"]
results = [steadytext.generate(p) for p in prompts]