Lesson 8.4: Real-World Patterns and Best Practices
| Navigation: ← Previous: Stateful Applications | Module Overview |
Introduction
This final lesson examines real-world operator patterns by analyzing popular operators, identifying best practices, and learning from common anti-patterns. You’ll understand how production operators are built and what makes them successful.
Theory: Real-World Patterns
Learning from successful operators helps you build better operators.
Why Study Real Operators?
Proven Patterns:
- See what works in production
- Learn from experience
- Avoid common mistakes
- Adopt best practices
Architecture Insights:
- Understand design decisions
- See complex patterns in action
- Learn scaling strategies
- Understand trade-offs
Best Practices:
- Industry standards
- Community consensus
- Battle-tested approaches
- Production-ready patterns
Common Patterns
API Design:
- Clear, intuitive APIs
- Sensible defaults
- Good validation
- Comprehensive status
Error Handling:
- Graceful degradation
- Clear error messages
- Retry strategies
- Failure recovery
Observability:
- Comprehensive logging
- Rich metrics
- Useful events
- Good documentation
Anti-Patterns to Avoid
Tight Coupling:
- Hard dependencies
- Difficult to test
- Hard to maintain
- Avoid this
Ignoring Errors:
- Silent failures
- No error handling
- Poor user experience
- Avoid this
Blocking Operations:
- Synchronous waits
- Block reconciliation
- Poor performance
- Avoid this
Understanding real-world patterns helps you build production-ready operators.
Popular Operator Patterns
Prometheus Operator Pattern
graph TB
PROMETHEUS[Prometheus Operator]
PROMETHEUS --> SERVICEMONITOR[ServiceMonitor]
PROMETHEUS --> PROMETHEUS_CR[Prometheus]
PROMETHEUS --> ALERTMANAGER[Alertmanager]
SERVICEMONITOR --> CONFIG[Configuration]
PROMETHEUS_CR --> DEPLOYMENT[Deployment]
ALERTMANAGER --> RULES[Alert Rules]
style PROMETHEUS fill:#90EE90
Key Patterns:
- Declarative configuration
- Service discovery
- Multi-resource management
- Configuration validation
Elasticsearch Operator Pattern
graph TB
ES[Elasticsearch Operator]
ES --> CLUSTER[Elasticsearch Cluster]
ES --> NODES[Nodes]
ES --> SHARDS[Shards]
CLUSTER --> HEALTH[Health Management]
NODES --> SCALING[Scaling]
SHARDS --> REBALANCE[Rebalancing]
style ES fill:#FFB6C1
Key Patterns:
- Cluster management
- Node lifecycle
- Data sharding
- Health monitoring
Best Practices
Practice 1: Clear API Design
graph TB
API[API Design]
API --> SPEC[Clear Spec]
API --> STATUS[Detailed Status]
API --> VALIDATION[Validation]
API --> DEFAULTS[Defaults]
style API fill:#90EE90
Guidelines:
- Use clear, descriptive field names
- Provide sensible defaults
- Validate at API level
- Document all fields
Practice 2: Comprehensive Status
type DatabaseStatus struct {
// Conditions for state tracking
Conditions []metav1.Condition `json:"conditions,omitempty"`
// Phase for simple state
Phase string `json:"phase,omitempty"`
// Detailed status
ReadyReplicas int32 `json:"readyReplicas,omitempty"`
TotalReplicas int32 `json:"totalReplicas,omitempty"`
Endpoint string `json:"endpoint,omitempty"`
// Observed generation
ObservedGeneration int64 `json:"observedGeneration,omitempty"`
}
Practice 3: Idempotent Operations
flowchart TD
OPERATION[Operation] --> IDEMPOTENT{Idempotent?}
IDEMPOTENT -->|Yes| SAFE[Safe to Retry]
IDEMPOTENT -->|No| UNSAFE[Unsafe to Retry]
SAFE --> SUCCESS[Success]
UNSAFE --> ERROR[Error]
style IDEMPOTENT fill:#90EE90
style SAFE fill:#90EE90
All operations must be idempotent:
- Creating resources: check if exists first
- Updating resources: compare before update
- Deleting resources: handle not found gracefully
Common Anti-Patterns
Anti-Pattern 1: Tight Coupling
graph TB
BAD[Tight Coupling]
BAD --> HARD[Hard to Test]
BAD --> RIGID[Rigid Design]
BAD --> FRAGILE[Fragile]
GOOD[Loose Coupling] --> FLEXIBLE[Flexible]
GOOD --> TESTABLE[Testable]
GOOD --> MAINTAINABLE[Maintainable]
style BAD fill:#FFB6C1
style GOOD fill:#90EE90
Avoid:
- Hard-coded dependencies
- Direct API calls to external services
- Tight coupling between components
Anti-Pattern 2: Ignoring Errors
// BAD: Ignoring errors
r.Create(ctx, resource) // Error ignored!
// GOOD: Handle errors
if err := r.Create(ctx, resource); err != nil {
if !errors.IsAlreadyExists(err) {
return ctrl.Result{}, err
}
}
Anti-Pattern 3: Blocking Operations
// BAD: Blocking operation
time.Sleep(5 * time.Minute)
// GOOD: Requeue with delay
return ctrl.Result{RequeueAfter: 5 * time.Minute}, nil
Documentation Best Practices
Documentation Structure
graph TB
DOCS[Documentation]
DOCS --> README[README]
DOCS --> API[API Docs]
DOCS --> EXAMPLES[Examples]
DOCS --> TROUBLESHOOTING[Troubleshooting]
README --> QUICKSTART[Quick Start]
README --> ARCHITECTURE[Architecture]
style DOCS fill:#90EE90
Essential Documentation
- README.md
- Quick start guide
- Architecture overview
- Installation instructions
- API Documentation
- Field descriptions
- Example resources
- Validation rules
- Examples
- Common use cases
- Advanced scenarios
- Best practices
- Troubleshooting
- Common issues
- Debugging guides
- FAQ
User Experience
UX Principles
graph TB
UX[User Experience]
UX --> CLEAR[Clear Messages]
UX --> HELPFUL[Helpful Errors]
UX --> PROGRESS[Progress Indicators]
UX --> EXAMPLES[Examples]
style UX fill:#90EE90
Error Messages
// BAD: Generic error
return fmt.Errorf("error")
// GOOD: Specific, actionable error
return fmt.Errorf("spec.storage.size: must be >= 10Gi for replicas > 5, got %s", db.Spec.Storage.Size)
Key Takeaways
- Study popular operators to learn patterns
- Follow best practices for maintainability
- Avoid anti-patterns that cause issues
- Document thoroughly for users
- Design for UX with clear messages
- Make operations idempotent for reliability
- Provide comprehensive status for observability
- Test thoroughly before release
Understanding for Building Operators
When building production operators:
- Study successful operators
- Follow established patterns
- Avoid common anti-patterns
- Document comprehensively
- Focus on user experience
- Make everything idempotent
- Provide detailed status
- Test all scenarios
Related Lab
- Lab 8.4: Final Project - Build a complete operator
References
Official Documentation
Further Reading
- Kubernetes Operators by Jason Dobies and Joshua Wood - Complete reference
- Programming Kubernetes by Michael Hausenblas and Stefan Schimanski - Advanced patterns
- Prometheus Operator - Example operator
- Elasticsearch Operator - Example operator
Related Topics
- Operator Framework
- OperatorHub - Community operators
- CNCF Operator Working Group
Next Steps
Congratulations! You’ve completed the entire course! You now have the knowledge and skills to build production-ready Kubernetes operators.