TY - JOUR
T1 - Checkpointing with mutable checkpoints
AU - Cao, Guohong
AU - Singhal, Mukesh
N1 - Copyright 2008 Elsevier B.V., All rights reserved.
PY - 2003/1/2
Y1 - 2003/1/2
N2 - There are two approaches to reducing the overhead associated with coordinated checkpointing: the first is to minimize the number of synchronization messages and the number of checkpoints; the other is to make the checkpointing process non-blocking. In our previous work (IEEE Parallel Distributed Systems 9 (12) (1998) 1213), we proved that no non-blocking algorithm exists that forces only a minimum number of processes to take their checkpoints. In this paper, we present a min-process algorithm, which relaxes the non-blocking condition while trying to minimize the blocking time, and a non-blocking algorithm, which relaxes the min-process condition while minimizing the number of checkpoints saved on stable storage. The proposed non-blocking algorithm is based on the concept of a "mutable checkpoint", which is neither a tentative checkpoint nor a permanent checkpoint. Based on mutable checkpoints, our non-blocking algorithm avoids the avalanche effect and forces only a minimum number of processes to take their checkpoints on stable storage.
AB - There are two approaches to reducing the overhead associated with coordinated checkpointing: the first is to minimize the number of synchronization messages and the number of checkpoints; the other is to make the checkpointing process non-blocking. In our previous work (IEEE Parallel Distributed Systems 9 (12) (1998) 1213), we proved that no non-blocking algorithm exists that forces only a minimum number of processes to take their checkpoints. In this paper, we present a min-process algorithm, which relaxes the non-blocking condition while trying to minimize the blocking time, and a non-blocking algorithm, which relaxes the min-process condition while minimizing the number of checkpoints saved on stable storage. The proposed non-blocking algorithm is based on the concept of a "mutable checkpoint", which is neither a tentative checkpoint nor a permanent checkpoint. Based on mutable checkpoints, our non-blocking algorithm avoids the avalanche effect and forces only a minimum number of processes to take their checkpoints on stable storage.
UR - http://www.scopus.com/inward/record.url?scp=0037413288&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=0037413288&partnerID=8YFLogxK
U2 - 10.1016/S0304-3975(02)00566-2
DO - 10.1016/S0304-3975(02)00566-2
M3 - Article
AN - SCOPUS:0037413288
VL - 290
SP - 1127
EP - 1148
JO - Theoretical Computer Science
JF - Theoretical Computer Science
SN - 0304-3975
IS - 2
ER -