Lesson 7.4: Performance and Scalability

Navigation: ← Previous: High Availability

Introduction

As operators manage more resources, performance becomes critical. This lesson covers rate limiting, batch reconciliation, caching strategies, and techniques for managing large-scale deployments efficiently.

Theory: Performance and Scalability

Performance optimization ensures operators scale efficiently as they manage more resources.

Why Performance Matters

Scalability:

Operators must handle growth
Performance degrades with scale
Optimization enables scaling
Cost efficiency

User Experience:

Fast reconciliation
Responsive status updates
Low latency
Better resource utilization

Resource Efficiency:

Lower API server load
Reduced network traffic
Lower CPU/memory usage
Cost savings

Performance Bottlenecks

API Server Load:

Too many API calls
Inefficient queries
No caching
Rate limiting issues

Reconciliation Overhead:

Inefficient reconciliation logic
Unnecessary work
No batching
Sequential processing

Memory Usage:

Large caches
Memory leaks
Inefficient data structures
No cleanup

Optimization Strategies

Rate Limiting:

Control API call rate
Prevent API server overload
Respect API server limits
Smooth traffic patterns

Caching:

Cache frequently accessed data
Reduce API calls
Faster lookups
Use informers

Batch Processing:

Process multiple resources together
Reduce overhead
Improve efficiency
Better resource utilization

Parallel Processing:

Process independent work in parallel
Utilize multiple cores
Faster completion
Careful with shared state

Understanding performance helps you build scalable, efficient operators.

Performance Optimization Strategies

graph TB
    PERFORMANCE[Performance]
    
    PERFORMANCE --> RATE[Rate Limiting]
    PERFORMANCE --> BATCH[Batch Processing]
    PERFORMANCE --> CACHE[Caching]
    PERFORMANCE --> PARALLEL[Parallel Processing]
    
    RATE --> THROTTLE[Throttle Requests]
    BATCH --> EFFICIENT[Efficient Updates]
    CACHE --> FAST[Fast Lookups]
    PARALLEL --> CONCURRENT[Concurrent Operations]
    
    style PERFORMANCE fill:#90EE90

Rate Limiting

Why Rate Limit?

flowchart TD
    OPERATOR[Operator] --> API[Kubernetes API]
    API --> OVERLOAD{Too Many<br/>Requests?}
    OVERLOAD -->|Yes| THROTTLE[API Throttles]
    OVERLOAD -->|No| SUCCESS[Success]
    
    THROTTLE --> ERRORS[Errors]
    ERRORS --> RETRY[Retries]
    RETRY --> MORE[More Load]
    
    style THROTTLE fill:#FFB6C1
    style SUCCESS fill:#90EE90

Using Controller-Runtime’s Built-in Rate Limiting

Controller-runtime (used by kubebuilder) includes built-in rate limiting. Configure it when setting up your controller:

// In internal/controller/database_controller.go

import (
    "time"
    "k8s.io/client-go/util/workqueue"
    ctrl "sigs.k8s.io/controller-runtime"
    "sigs.k8s.io/controller-runtime/pkg/controller"
    "sigs.k8s.io/controller-runtime/pkg/reconcile"
)

func (r *DatabaseReconciler) SetupWithManager(mgr ctrl.Manager) error {
    return ctrl.NewControllerManagedBy(mgr).
        For(&databasev1.Database{}).
        Owns(&appsv1.StatefulSet{}).
        Owns(&corev1.Service{}).
        WithOptions(controller.Options{
            // MaxConcurrentReconciles limits parallel reconciliations
            MaxConcurrentReconciles: 2,
            // RateLimiter controls requeue rate (typed for controller-runtime v0.19+)
            RateLimiter: workqueue.NewTypedItemExponentialFailureRateLimiter[reconcile.Request](
                5*time.Millisecond,  // Base delay
                1000*time.Second,    // Max delay
            ),
        }).
        Complete(r)
}

Custom Rate Limiting for API Calls

For rate limiting external API calls within reconciliation:

import (
    "golang.org/x/time/rate"
)

type DatabaseReconciler struct {
    client.Client
    Scheme      *runtime.Scheme
    apiLimiter  *rate.Limiter  // Rate limiter for external APIs
}

func NewDatabaseReconciler(mgr ctrl.Manager) *DatabaseReconciler {
    return &DatabaseReconciler{
        Client:     mgr.GetClient(),
        Scheme:     mgr.GetScheme(),
        apiLimiter: rate.NewLimiter(rate.Limit(10), 1), // 10 requests/second
    }
}

func (r *DatabaseReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
    // Wait for rate limiter before external API calls
    if err := r.apiLimiter.Wait(ctx); err != nil {
        return ctrl.Result{}, err
    }
    // ... reconciliation with external API calls ...
}

Batch Reconciliation

Batch Processing Flow

sequenceDiagram
    participant Queue
    participant Controller
    participant API as API Server
    
    Queue->>Controller: Batch of Requests
    Controller->>Controller: Group by Type
    Controller->>API: Batch Update
    API-->>Controller: Results
    Controller->>Queue: Process Next Batch
    
    Note over Controller: Process multiple<br/>resources together

Batch Reconciliation Example

func (r *DatabaseReconciler) ReconcileBatch(ctx context.Context, requests []ctrl.Request) (ctrl.Result, error) {
    // Group requests by operation
    creates := []*databasev1.Database{}
    updates := []*databasev1.Database{}
    
    for _, req := range requests {
        db := &databasev1.Database{}
        if err := r.Get(ctx, req.NamespacedName, db); err != nil {
            if errors.IsNotFound(err) {
                continue
            }
            return ctrl.Result{}, err
        }
        
        if db.Status.Phase == "" {
            creates = append(creates, db)
        } else {
            updates = append(updates, db)
        }
    }
    
    // Batch create
    for _, db := range creates {
        r.reconcileDatabase(ctx, db)
    }
    
    // Batch update
    for _, db := range updates {
        r.reconcileDatabase(ctx, db)
    }
    
    return ctrl.Result{}, nil
}

Caching Strategies

Client Caching

graph TB
    CLIENT[Controller Client]
    
    CLIENT --> CACHE[Cache Layer]
    CLIENT --> API[Kubernetes API]
    
    CACHE --> HIT[Cache Hit]
    CACHE --> MISS[Cache Miss]
    
    HIT --> FAST[Fast Response]
    MISS --> API
    
    style CACHE fill:#90EE90
    style FAST fill:#FFB6C1

Kubebuilder’s Built-in Caching

Controller-runtime (used by kubebuilder) provides automatic caching through the Manager’s client. When you use r.Get() or r.List(), it reads from the cache, not directly from the API server:

// In your controller - reads from cache by default
func (r *DatabaseReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
    db := &databasev1.Database{}
    // This reads from cache, NOT from API server
    if err := r.Get(ctx, req.NamespacedName, db); err != nil {
        return ctrl.Result{}, client.IgnoreNotFound(err)
    }
    
    // List also uses cache
    dbList := &databasev1.DatabaseList{}
    if err := r.List(ctx, dbList, client.InNamespace(req.Namespace)); err != nil {
        return ctrl.Result{}, err
    }
    
    return ctrl.Result{}, nil
}

Custom Indexers for Fast Lookups

Add custom indexes in your controller setup to enable fast filtering:

// In cmd/main.go or during controller setup
func SetupIndexes(mgr ctrl.Manager) error {
    // Index databases by their environment
    return mgr.GetFieldIndexer().IndexField(
        context.Background(),
        &databasev1.Database{},
        "spec.environment",
        func(obj client.Object) []string {
            db := obj.(*databasev1.Database)
            return []string{db.Spec.Environment}
        },
    )
}

// Then use in controller with MatchingFields
dbList := &databasev1.DatabaseList{}
err := r.List(ctx, dbList, client.MatchingFields{
    "spec.environment": "production",
})

Parallel Processing

Concurrent Reconciliation

func (r *DatabaseReconciler) ReconcileParallel(ctx context.Context, requests []ctrl.Request) error {
    var wg sync.WaitGroup
    errChan := make(chan error, len(requests))
    
    for _, req := range requests {
        wg.Add(1)
        go func(request ctrl.Request) {
            defer wg.Done()
            _, err := r.Reconcile(ctx, request)
            if err != nil {
                errChan <- err
            }
        }(req)
    }
    
    wg.Wait()
    close(errChan)
    
    // Collect errors
    var errors []error
    for err := range errChan {
        errors = append(errors, err)
    }
    
    if len(errors) > 0 {
        return fmt.Errorf("reconciliation errors: %v", errors)
    }
    
    return nil
}

Managing Large Clusters

Scaling Considerations

graph TB
    SCALE[Scaling]
    
    SCALE --> SMALL[Small: <100 Resources]
    SCALE --> MEDIUM[Medium: 100-1000]
    SCALE --> LARGE[Large: >1000]
    
    SMALL --> SIMPLE[Simple Reconciliation]
    MEDIUM --> OPTIMIZE[Optimize Queries]
    LARGE --> BATCH[Batch Processing]
    LARGE --> CACHE[Heavy Caching]
    
    style SMALL fill:#90EE90
    style LARGE fill:#FFB6C1

Optimization Techniques

Use Field Selectors

// Instead of listing all, use field selector
databases := &databasev1.DatabaseList{}
r.List(ctx, databases, client.MatchingFields{
    "spec.environment": "production",
})

Limit List Results

databases := &databasev1.DatabaseList{}
r.List(ctx, databases, &client.ListOptions{
    Limit: 100,
})

Use Indexes

// Create index for frequent queries
mgr.GetFieldIndexer().IndexField(ctx, &databasev1.Database{},
    "spec.environment", indexEnvironment)

Performance Monitoring

Key Metrics

graph TB
    METRICS[Metrics]
    
    METRICS --> DURATION[Reconcile Duration]
    METRICS --> RATE[Reconcile Rate]
    METRICS --> QUEUE[Queue Depth]
    METRICS --> ERRORS[Error Rate]
    
    style METRICS fill:#90EE90

Monitoring Performance

var (
    reconcileDuration = prometheus.NewHistogramVec(
        prometheus.HistogramOpts{
            Name: "database_reconcile_duration_seconds",
            Help: "Duration of reconciliations",
        },
        []string{"result"},
    )
    
    reconcileRate = prometheus.NewGauge(
        prometheus.GaugeOpts{
            Name: "database_reconcile_rate",
            Help: "Reconciliations per second",
        },
    )
)

Key Takeaways

Controller-runtime caching is automatic in kubebuilder
MaxConcurrentReconciles controls parallel reconciliations
RateLimiter in controller options manages requeue rates
Field indexes enable fast filtered lookups
client.MatchingFields optimizes queries
Metrics are available at :8080/metrics by default
Scale strategies depend on cluster size

Understanding for Building Operators

When optimizing kubebuilder operators:

Use controller-runtime’s built-in caching (automatic)
Configure MaxConcurrentReconciles in SetupWithManager
Set up custom field indexes for frequent lookups
Use client.MatchingFields{} for filtered queries
Monitor metrics at the default metrics endpoint
Use rate.Limiter for external API calls
Increase replicas with leader election for scale
Profile with go tool pprof if needed

Lab 7.4: Optimizing Performance - Hands-on exercises for this lesson

References

Official Documentation

Next Steps

Congratulations! You’ve completed Module 7. You now understand:

Packaging and distribution
RBAC and security
High availability
Performance optimization

In Module 8, you’ll learn about advanced topics and real-world patterns.