Important Notice: Our web hosting provider recently started charging us for additional visits, which was unexpected. In response, we're seeking donations. Depending on the situation, we may explore different monetization options for our Community and Expert Contributors. It's crucial to provide more returns for their expertise and offer more Expert Validated Answers or AI Validated Answers. Learn more about our hosting issue here.

Why does GFS lock up/freeze when a node gets fenced?

April 26, 2017fenced freeze gfs lock Node

0

Posted

Why does GFS lock up/freeze when a node gets fenced?

1 Answer

0

Posted

When a node fails, cman detects the missing heartbeat and begins the process of fencing the node. The cman and lock manager (e.g. lock_dlm) prevent any new locks from being acquired until the failed node is successfully fenced. That has to be done to ensure the integrity of the file system, in case the failed node wants to write to the file system after the failure is detected by the other nodes (and therefore out of communication with the rest of the cluster). The fence is considered successful after the fence script completes with a good return code. After the fence completes, the lock manager coordinates the reclaiming of the locks held by the node that had failed. Then the lock manager allows new locks and the GFS file system continues on its way. If the fence is not successful or does not complete for some reason, new locks will continue to be prevented and therefore the GFS file system will freeze for the nodes that have it mounted and try to get locks. Processes that have alread