This guide explains how to configure persistent storage for the Weaviate vector database so your AI data persists across restarts.
Most common configuration:
apiVersion: ai.splunk.com/v1
kind: AIPlatform
metadata:
  name: my-ai-platform
spec:
  # ... other config ...
  storage:
    vectorDB:
      size: "100Gi"            # How much space you need
      storageClassName: "gp3"  # Your cloud storage class
That’s it! The operator will automatically create a persistent volume for your vector database.
Without persistent storage:
With persistent storage:
The storage.vectorDB field configures persistent storage for Weaviate. This ensures that vector data persists across pod restarts and upgrades.
apiVersion: ai.splunk.com/v1
kind: AIPlatform
metadata:
  name: my-ai-platform
spec:
  storage:
    vectorDB:
      # Option 1: Use existing PVC
      pvcName: "my-existing-pvc"
      # Option 2: Create dynamic PVC (via VolumeClaimTemplate)
      size: "100Gi"
      storageClassName: "gp3"
The operator will create a PersistentVolumeClaim automatically using StatefulSet VolumeClaimTemplates:
spec:
  storage:
    vectorDB:
      size: "100Gi"            # Volume size (default: 50Gi)
      storageClassName: "gp3"  # Optional StorageClass
How it works:
The PVC is named weaviate-data-<platform-name>-weaviate-0.
Example:
apiVersion: ai.splunk.com/v1
kind: AIPlatform
metadata:
  name: prod-ai
  namespace: ai-platform
spec:
  defaultAcceleratorType: "nvidia-tesla-t4"
  objectStorage:
    path: "s3://my-bucket/models"
    region: "us-west-2"
  storage:
    vectorDB:
      size: "200Gi"
      storageClassName: "gp3-encrypted"
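The PVC naming pattern above can be turned into a small shell helper, which may be handy when scripting checks against a platform. This is a minimal sketch: the function name weaviate_pvc_name is hypothetical, and it assumes the operator keeps the weaviate-data-<platform-name>-weaviate-<ordinal> pattern shown in this guide.

```shell
# Hypothetical helper: build the PVC name expected for a given AIPlatform.
# Assumes the pattern weaviate-data-<platform-name>-weaviate-<ordinal>.
weaviate_pvc_name() {
  platform="$1"
  ordinal="${2:-0}"   # StatefulSet pod ordinal, defaults to the first replica
  printf 'weaviate-data-%s-weaviate-%s\n' "$platform" "$ordinal"
}
```

For the example above, weaviate_pvc_name prod-ai prints weaviate-data-prod-ai-weaviate-0, which can then be passed to kubectl get pvc -n ai-platform.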
If you have a pre-provisioned PVC, you can reference it:
spec:
  storage:
    vectorDB:
      pvcName: "my-weaviate-pvc"
When to use this:
Important: When using an existing PVC:
If your StorageClass supports volume expansion (allowVolumeExpansion: true), you can increase the volume size by updating the AIPlatform spec:
# Initial configuration
spec:
  storage:
    vectorDB:
      size: "50Gi"
      storageClassName: "gp3"
To expand the volume:
# Update the size in your AIPlatform manifest
kubectl edit aiplatform my-ai-platform -n ai-platform
# Change size from "50Gi" to "100Gi"
spec:
  storage:
    vectorDB:
      size: "100Gi"  # ← Increase this value
      storageClassName: "gp3"
What happens:
Check StorageClass expansion support:
kubectl get storageclass gp3 -o jsonpath='{.allowVolumeExpansion}'
# Should return: true
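Kubernetes lets a PVC storage request grow but not shrink, so it can help to compare the old and new sizes before editing the spec. The following is a minimal sketch of such a pre-check. The function names to_gib and can_expand are hypothetical, and the sketch only handles whole-number Gi and Ti quantities.

```shell
# Hypothetical helper: convert "100Gi" or "2Ti" to whole GiB.
# Only integer Gi and Ti quantities are handled in this sketch.
to_gib() {
  case "$1" in
    *Ti) echo $(( ${1%Ti} * 1024 )) ;;
    *Gi) echo "${1%Gi}" ;;
    *)   echo "unsupported quantity: $1" >&2; return 1 ;;
  esac
}

# Succeeds only when the requested size is strictly larger than the current one.
can_expand() {
  current=$(to_gib "$1") || return 2
  requested=$(to_gib "$2") || return 2
  [ "$requested" -gt "$current" ]
}
```

A possible use is can_expand 50Gi 100Gi && kubectl edit aiplatform my-ai-platform -n ai-platform, which only opens the editor when the change is an increase.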
If automatic expansion is not working, follow these steps:
# 1. Check current PVC status
kubectl get pvc -n ai-platform | grep weaviate
# 2. Manually edit the PVC to request more storage
kubectl edit pvc weaviate-data-my-ai-platform-weaviate-0 -n ai-platform
# 3. Update spec.resources.requests.storage
spec:
  resources:
    requests:
      storage: 100Gi  # ← Increase this
# 4. Check PVC conditions for expansion status
kubectl describe pvc weaviate-data-my-ai-platform-weaviate-0 -n ai-platform | grep -A5 Conditions
# 5. Restart Weaviate pod if needed
kubectl delete pod my-ai-platform-weaviate-0 -n ai-platform
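Steps 2 and 3 above can also be done without an interactive editor by using kubectl patch. The sketch below is one way to do that, under the assumption that patching the PVC directly is acceptable in your environment. The function names expansion_patch and expand_pvc are hypothetical.

```shell
# Hypothetical helper: build the merge patch that raises the storage request.
expansion_patch() {
  printf '{"spec":{"resources":{"requests":{"storage":"%s"}}}}\n' "$1"
}

# Hypothetical wrapper: apply the patch to a PVC.
# Arguments: <pvc-name> <namespace> <new-size>
expand_pvc() {
  kubectl patch pvc "$1" -n "$2" --type merge -p "$(expansion_patch "$3")"
}
```

For example, expand_pvc weaviate-data-my-ai-platform-weaviate-0 ai-platform 100Gi sends the same change as the manual edit in step 3.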
✅ Supported:
❌ Not Supported:
Volume expansion requirements:
The StorageClass must set allowVolumeExpansion: true.
spec:
  storage:
    vectorDB:
      size: "100Gi"
      storageClassName: "gp3"  # Or "gp2", "io1", "io2"
EBS CSI Driver features:
spec:
  storage:
    vectorDB:
      size: "100Gi"
      storageClassName: "standard"  # Or "ssd"

spec:
  storage:
    vectorDB:
      size: "100Gi"
      storageClassName: "managed-premium"
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: weaviate-storage
provisioner: ebs.csi.aws.com  # EBS CSI driver; gp3 iops/throughput are CSI driver parameters
parameters:
  type: gp3
  encrypted: "true"
  iops: "3000"
  throughput: "125"
allowVolumeExpansion: true  # ← Enable expansion
volumeBindingMode: WaitForFirstConsumer
If storage.vectorDB is not specified, the following defaults are used:
spec:
  storage:
    vectorDB:
      size: "50Gi"          # Default size
      storageClassName: ""  # Use cluster default StorageClass
      pvcName: ""           # No existing PVC, create new one
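To make the defaults concrete, the sketch below renders a storage.vectorDB stanza and falls back to 50Gi when no size is given. The function name render_vectordb_storage is hypothetical, and the sketch assumes that omitting storageClassName behaves the same as the empty default shown above.

```shell
# Hypothetical helper: print a storage.vectorDB stanza.
# Arguments: [size] [storageClassName]; size falls back to the documented 50Gi default.
render_vectordb_storage() {
  size="${1:-50Gi}"
  class="${2:-}"
  printf 'spec:\n  storage:\n    vectorDB:\n      size: "%s"\n' "$size"
  # Only emit storageClassName when one was requested (assumption: omitted means cluster default).
  if [ -n "$class" ]; then
    printf '      storageClassName: "%s"\n' "$class"
  fi
}
```

Calling render_vectordb_storage 200Gi gp3-encrypted prints a stanza that can be merged into an AIPlatform manifest.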
# List PVCs in namespace
kubectl get pvc -n ai-platform
# Should see:
# NAME                                      STATUS   VOLUME    CAPACITY   STORAGECLASS
# weaviate-data-my-ai-platform-weaviate-0   Bound    pvc-xxx   100Gi      gp3
# Describe Weaviate pod
kubectl describe pod my-ai-platform-weaviate-0 -n ai-platform | grep -A5 Volumes
# Should see:
# Volumes:
#   weaviate-data:
#     Type:       PersistentVolumeClaim
#     ClaimName:  weaviate-data-my-ai-platform-weaviate-0
# Exec into Weaviate pod
kubectl exec -it my-ai-platform-weaviate-0 -n ai-platform -- df -h /var/lib/weaviate
# Output:
# Filesystem      Size  Used Avail Use% Mounted on
# /dev/xvdxx      100G    5G   95G   5% /var/lib/weaviate
# 1. Record the current schema (create test data first if Weaviate is empty)
kubectl exec -it my-ai-platform-weaviate-0 -n ai-platform -- curl localhost:8080/v1/schema
# 2. Delete the pod
kubectl delete pod my-ai-platform-weaviate-0 -n ai-platform
# 3. Wait for pod to restart
kubectl wait --for=condition=ready pod -l app=my-ai-platform-weaviate -n ai-platform
# 4. Verify data is still there
kubectl exec -it my-ai-platform-weaviate-0 -n ai-platform -- curl localhost:8080/v1/schema
# ← Should return the same schema as before
Symptom: No PVC appears after creating AIPlatform
Causes:
Debug:
# Check StatefulSet
kubectl get statefulset -n ai-platform
# Check operator logs
kubectl logs -n splunk-ai-operator-system deployment/splunk-ai-operator-controller-manager | grep -i weaviate
# Check events
kubectl get events -n ai-platform --sort-by='.lastTimestamp' | grep -i weaviate
Symptom: PVC shows Pending status
Causes:
Debug:
# Check PVC details
kubectl describe pvc weaviate-data-<platform-name>-weaviate-0 -n ai-platform
# Check available StorageClasses
kubectl get storageclass
# Inspect the StorageClass parameters (provisioner-specific settings)
kubectl get storageclass <class-name> -o yaml | grep -A5 parameters
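The debug steps above can be condensed into a quick phase check. The sketch below reads the PVC phase with kubectl and maps it to a hint. The function names pvc_phase and pvc_hint are hypothetical, and the hint text is illustrative rather than operator output.

```shell
# Hypothetical wrapper: read the phase (Pending, Bound or Lost) of a PVC.
# Arguments: <pvc-name> <namespace>
pvc_phase() {
  kubectl get pvc "$1" -n "$2" -o jsonpath='{.status.phase}'
}

# Hypothetical helper: map a PVC phase to a next debugging step.
pvc_hint() {
  case "$1" in
    Bound)   echo "PVC is bound; storage is provisioned" ;;
    Pending) echo "PVC is pending; check the StorageClass name, provisioner events, and whether the pod is scheduled" ;;
    Lost)    echo "PVC lost its backing volume; inspect the PersistentVolume" ;;
    *)       echo "unknown phase: $1" ;;
  esac
}
```

A possible use is pvc_hint "$(pvc_phase weaviate-data-<platform-name>-weaviate-0 ai-platform)". Note that with volumeBindingMode: WaitForFirstConsumer a PVC stays Pending until a pod that uses it is scheduled.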
Symptom: PVC shows FileSystemResizePending or expansion doesn’t complete
Causes:
Debug:
# Check PVC conditions
kubectl describe pvc weaviate-data-<platform-name>-weaviate-0 -n ai-platform | grep -A10 Conditions
# Check for expansion events
kubectl get events -n ai-platform --field-selector involvedObject.name=weaviate-data-<platform-name>-weaviate-0
# If stuck, restart the pod
kubectl delete pod <platform-name>-weaviate-0 -n ai-platform
Symptom: Weaviate data disappears after pod restart
Causes:
Verify:
# Check if PVC is mounted
kubectl describe pod <platform-name>-weaviate-0 -n ai-platform | grep -A10 "Mounts:"
# Should see:
# Mounts:
#   /var/lib/weaviate from weaviate-data (rw)
# Check if using correct volume
kubectl get pod <platform-name>-weaviate-0 -n ai-platform -o yaml | grep -A5 volumes:
allowVolumeExpansion: true

spec:
  storage:
    vectorDB:
      size: "20Gi"
      storageClassName: "standard"

spec:
  storage:
    vectorDB:
      size: "100Gi"
      storageClassName: "gp3"

spec:
  storage:
    vectorDB:
      size: "500Gi"
      storageClassName: "io2"  # High IOPS for AWS

spec:
  storage:
    vectorDB:
      pvcName: "weaviate-production-pvc"
If you have an existing AIPlatform without persistent storage:
# 1. Back up existing data (if needed)
kubectl exec -it <platform-name>-weaviate-0 -n ai-platform -- weaviate-backup export
# 2. Add the storage configuration to the AIPlatform spec
kubectl edit aiplatform <platform-name> -n ai-platform
spec:
  storage:
    vectorDB:
      size: "100Gi"
      storageClassName: "gp3"
# 3. The operator recreates the StatefulSet with a PVC
# 4. Restore the data (if needed)
kubectl exec -it <platform-name>-weaviate-0 -n ai-platform -- weaviate-backup import
To change StorageClass (requires data migration):
Note: This process causes downtime. Plan accordingly.