Version: v1.0

How to Enable FUSE Auto-recovery

Installation

You can download the latest Fluid installation package from Fluid Releases.

In the Fluid chart's values.yaml, set csi.featureGates to FuseRecovery=true to enable FUSE auto-recovery, then follow the Installation Documentation to complete the installation. Afterwards, check that the Fluid components are running normally (JuiceFSRuntime is used as the example here):

$ kubectl -n fluid-system get po
NAME                                        READY   STATUS    RESTARTS   AGE
csi-nodeplugin-fluid-2gtsz                  2/2     Running   0          20m
csi-nodeplugin-fluid-2h79g                  2/2     Running   0          20m
csi-nodeplugin-fluid-sc459                  2/2     Running   0          20m
dataset-controller-57fb4569cd-k2jb7         1/1     Running   0          20m
fluid-webhook-844dcb995f-nfmjl              1/1     Running   0          20m
juicefsruntime-controller-7d9c964b4-jnbtf   1/1     Running   0          20m

Typically, you will see one Pod named dataset-controller, one named juicefsruntime-controller, one named fluid-webhook, and several named csi-nodeplugin, all in the Running state. The number of csi-nodeplugin Pods depends on the number of nodes in your Kubernetes cluster.
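
The feature gate from the installation step can also be kept in a small override file instead of editing the chart's values.yaml directly. The following is a minimal sketch: the file name fuse-recovery-values.yaml, and the release and chart archive names in the commented Helm command, are assumptions for illustration, so adjust them to your environment.

```shell
# Write a Helm values override that enables FUSE auto-recovery.
# The file name is an arbitrary choice for this example.
cat <<EOF > fuse-recovery-values.yaml
csi:
  featureGates: "FuseRecovery=true"
EOF

# Assumed release name and chart archive; replace with your own:
# helm install fluid fluid.tgz -f fuse-recovery-values.yaml
```

Keeping the override in its own file makes it easy to see which settings differ from the chart defaults.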

Demo

Create dataset and runtime

Create a Runtime resource and a Dataset with the same name; the Runtime kind depends on the runtime type you use. JuiceFSRuntime is used as the example here; for details, refer to the Documentation. Once both are created, they should look as follows:

$ kubectl get juicefsruntime
NAME      WORKER PHASE   FUSE PHASE   AGE
jfsdemo   Ready          Ready        2m58s
$ kubectl get dataset
NAME UFS TOTAL SIZE CACHED CACHE CAPACITY CACHED PERCENTAGE PHASE AGE
jfsdemo [Calculating] N/A N/A Bound 2m55s

Create Pod

$ cat<<EOF >sample.yaml
apiVersion: v1
kind: Pod
metadata:
  name: demo-app
  labels:
    fuse.serverful.fluid.io/inject: "true"
spec:
  containers:
    - name: demo
      image: nginx
      volumeMounts:
        - mountPath: /data
          name: demo
  volumes:
    - name: demo
      persistentVolumeClaim:
        claimName: jfsdemo
EOF
$ kubectl create -f sample.yaml
pod/demo-app created

The FUSE mount point auto-recovery feature requires the mountPropagation of the Pod's volume mount to be set to HostToContainer or Bidirectional, so that mount point changes on the host are propagated into the container. Bidirectional additionally requires the container to be privileged. The Fluid webhook can automatically set mountPropagation to HostToContainer; to enable this, set the label fuse.serverful.fluid.io/inject=true in the Pod's metadata (see the sample above).

Check that the Pod is created and inspect its mountPropagation:

$ kubectl get po |grep demo
demo-app             1/1     Running   0          96s
jfsdemo-fuse-g9pvp   1/1     Running   0          95s
jfsdemo-worker-0     1/1     Running   0          4m25s
$ kubectl get po demo-app -oyaml |grep volumeMounts -A 3
    volumeMounts:
    - mountPath: /data
      mountPropagation: HostToContainer
      name: demo
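
As an alternative to grepping the YAML, the same field can be read with a JSONPath query. This is a sketch only: the helper name show_mount_propagation is made up for this example, while -o jsonpath is a standard kubectl output option.

```shell
# Print each volume mount of a pod together with its mountPropagation.
# An empty second column means mountPropagation is not set for that mount.
# The function name is hypothetical, chosen for this example.
show_mount_propagation() {
  kubectl get pod "$1" \
    -o jsonpath='{range .spec.containers[*].volumeMounts[*]}{.name}{"\t"}{.mountPropagation}{"\n"}{end}'
}
```

Calling `show_mount_propagation demo-app` for the sample Pod should list the demo mount with HostToContainer if the webhook injection took effect.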

Test FUSE mount point auto-recovery

Delete FUSE pod

Delete the FUSE pod and wait for it to restart:

$ kubectl delete po jfsdemo-fuse-g9pvp
pod "jfsdemo-fuse-g9pvp" deleted
$ kubectl get po
NAME                 READY   STATUS    RESTARTS   AGE
demo-app             1/1     Running   0          5m7s
jfsdemo-fuse-bdsdt   1/1     Running   0          6s
jfsdemo-worker-0     1/1     Running   0          7m56s

After the new FUSE pod is created, check the mount points in the demo pod:

$ kubectl exec -it demo-app -- bash
[root@demo-app /]# df -h
Filesystem      Size  Used Avail Use% Mounted on
overlay         100G  9.4G   91G  10% /
tmpfs            64M     0   64M   0% /dev
tmpfs           2.0G     0  2.0G   0% /sys/fs/cgroup
JuiceFS:minio   1.0P   64K  1.0P   1% /data
/dev/sdb1       100G  9.4G   91G  10% /etc/hosts
shm              64M     0   64M   0% /dev/shm
tmpfs           3.8G   12K  3.8G   1% /run/secrets/kubernetes.io/serviceaccount
tmpfs           2.0G     0  2.0G   0% /proc/acpi
tmpfs           2.0G     0  2.0G   0% /proc/scsi
tmpfs           2.0G     0  2.0G   0% /sys/firmware

The container shows no "Transport endpoint is not connected" error, which indicates that the mount point has been restored.
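
The same check can be scripted rather than read off the df output by eye. The sketch below assumes the error text appears in English exactly as quoted above; the function names is_stale_mount_error and check_mount are hypothetical helpers, not part of Fluid.

```shell
# Succeeds if the message is the error a broken FUSE mount typically returns.
is_stale_mount_error() {
  case "$1" in
    *"Transport endpoint is not connected"*) return 0 ;;
    *) return 1 ;;
  esac
}

# List a directory and classify the result as ok, stale, or another error.
check_mount() {
  if err=$(ls "$1" 2>&1 >/dev/null); then
    echo "ok"
  elif is_stale_mount_error "$err"; then
    echo "stale"
    return 1
  else
    echo "error: $err"
    return 1
  fi
}
```

Inside the demo container this would be called as `check_mount /data`; from outside, a comparable probe is `kubectl exec demo-app -- ls /data`.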

Check dataset events

$ kubectl describe dataset jfsdemo
Name: jfsdemo
Namespace: default
...
Events:
  Type    Reason              Age                  From         Message
  ----    ------              ----                 ----         -------
  Normal  FuseRecoverSucceed  2m34s (x5 over 11m)  FuseRecover  Fuse recover /var/lib/kubelet/pods/6c1e0318-858b-4ead-976b-37ccce26edfe/volumes/kubernetes.io~csi/default-jfsdemo/mount succeed

The Dataset events contain a FuseRecoverSucceed event from FuseRecover, indicating that Fluid has performed a recovery operation on the mount.
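
To list only the recovery events rather than the full describe output, the events can be filtered with a field selector. This is a sketch under the assumption that the events are recorded against the Dataset object in the current namespace, as the output above suggests; the function name fuse_recover_events is hypothetical.

```shell
# List FuseRecoverSucceed events recorded for a Dataset in the current namespace.
# The function name is made up for this example.
fuse_recover_events() {
  kubectl get events \
    --field-selector "involvedObject.kind=Dataset,involvedObject.name=$1,reason=FuseRecoverSucceed"
}
```

For the demo above, the call would be `fuse_recover_events jfsdemo`.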

Notice

When the FUSE pod crashes, the recovery time of the mount point depends on how quickly the FUSE pod itself recovers and on the period at which the CSI plugin polls kubelet (env RECOVER_FUSE_PERIOD). Until recovery completes, the mount point still returns the Transport endpoint is not connected error, which is expected. In addition, the mount point is recovered by repeating the bind mount, so file descriptors that the application opened before the FUSE pod crashed cannot be recovered even after the mount point is restored. The application itself needs to retry (reopening its files) when an error occurs, to be robust against such failures.
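
A retry wrapper on the application side might look like the following. It is a minimal sketch, not something Fluid provides: the function name retry, the attempt count, and the delay are all assumptions to tune against your own recovery period.

```shell
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS runs have failed.
retry() {
  retry_attempts=$1
  retry_delay=$2
  shift 2
  retry_n=1
  while [ "$retry_n" -le "$retry_attempts" ]; do
    if "$@"; then
      return 0
    fi
    echo "attempt $retry_n/$retry_attempts failed: $*" >&2
    # Sleep only between attempts, not after the last one.
    if [ "$retry_n" -lt "$retry_attempts" ]; then
      sleep "$retry_delay"
    fi
    retry_n=$((retry_n + 1))
  done
  return 1
}
```

For example, `retry 10 5 cat /data/input.txt` (a hypothetical file path) starts a new process on every attempt, so the file is reopened each time and the read does not depend on a file descriptor left over from before the crash.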