ckpt:  A process checkpoint library
www.cs.wisc.edu/~zandy/ckpt
Copyright (c) 2002 Victor C. Zandy  zandy@cs.wisc.edu

COPYING contains the distribution terms for ckpt (LGPL).

Contents:
        1  RELEASE NOTES
        2  INSTALLATION
        3  QUICK START
        4  OVERVIEW
        5  DOES IT CHECKPOINT EVERYTHING?
        6  FILES INCLUDED WITH CKPT
        7  ENVIRONMENT VARIABLES
        8  LINKING PROGRAMS WITH THE CHECKPOINT LIBRARY
        9  TRIGGERING A CHECKPOINT
       10  RESTARTING A CHECKPOINT
       11  DIAGNOSTICS
       12  CKPT API
       13  CONTACT
       14  REFERENCES


1. RELEASE NOTES

We support ckpt on x86 Linux 2.4.  It may very well still work on
Linux 2.2.

Note that the ckpt API has changed since 1.3.


2. INSTALLATION

Edit the user options in Makefile to set the compiler, compiler
flags, and installation directories.  Then run 'make install'.


3. QUICK START

Here is a brief, simple session.  We checkpoint the program foo,
which prints the positive integers.  The ckpt libraries
(libckpt.so and librestart.so) are installed in /home/me/lib and
the restart command (restart) is installed in /home/me/bin.  We
use csh shell syntax and prefix annotations with #.

% setenv LD_PRELOAD /home/me/lib/libckpt.so
% setenv CKPT_FILENAME foo.ckpt                 # checkpoint file
% foo
1
2
3
4
^Z                                              # send SIGTSTP
% unsetenv LD_PRELOAD                           # these environment variables
% unsetenv CKPT_FILENAME                        # are not needed to restart
% /home/me/bin/restart foo.ckpt
5
6


4. OVERVIEW

ckpt is a set of libraries and programs for user-level process
checkpointing.

A process linked with the ckpt library will checkpoint itself
when it receives a selected signal, SIGTSTP by default.
Depending on options set in the environment, it either writes
the checkpoint to a file or sends it to a checkpoint server.

The process can be restarted from the checkpoint at a later
time, possibly on a different machine of the same
architecture and OS.

Programs do not need to be relinked with the ckpt library ahead
of time, although that is one of the linking options.  The
library can be linked at:

1) compile time via the linker;
2) the start of process execution via LD_PRELOAD [Preload];
3) any time during process execution via hijacking [Hijack].

ckpt exports a small API to the program with which it is linked.
The API allows the program to receive a callback when a
checkpoint is triggered or restarted.  Programs do not need to
use the API to get basic checkpointing service.

The techniques used by ckpt are similar to those used by Condor
[Condor], although the code is entirely new and, unlike Condor,
does not require you to relink your program in advance.


5. DOES IT CHECKPOINT EVERYTHING?

No.  ckpt only checkpoints the process address space and signal state.

The following resources are not checkpointed:

- open files
- network connections
- interprocess communication
- process identifiers, including process id, process group
  id, user id, or group id
- thread state

You can combine ckpt with rocks to checkpoint open network
connections [Rocks].

We are developing mechanisms for checkpointing other resources.


6. FILES INCLUDED WITH CKPT

A ckpt installation comprises several files:

libckpt.so:         The checkpoint library.  It is linked with the
                    process to be checkpointed in one of the ways
                    described below.

librestart.so:      The restart library, a temporarily loaded library
                    that assists the restarting of a checkpoint.
                    It should not be linked with the process by
                    the user.

restart:            A command that restarts a process from a
                    checkpoint.

cssrv:              A network server that manages checkpoints.
                    It is not necessary; checkpoints can be
                    written to ordinary files instead.


7. ENVIRONMENT VARIABLES

The ckpt library is controlled by the environment variables of
the process in which it is loaded.  Since environments reside in
the user address space, the environment of a process that
restarts a checkpoint will be replaced with the environment
preserved in the checkpoint.

CKPT_ID
             The checkpoint identifier, a string of ASCII
             characters.  The identifier names the checkpoint; it
             is not interpreted.  If this variable is not set,
             the ckpt library sets the identifier to a random
             32-bit integer expressed in ASCII hex.  The
             identifier including the terminating null character
             cannot be longer than 1024 characters.

CKPT_SERVER
             Enables the use of a checkpoint server.  The value
             is the ASCII server hostname or dotted IP address
             optionally followed by a colon and ASCII port
             number.  The format of this variable may change to
             accommodate future server protocols.

CKPT_FILENAME
             The name of the file in which the checkpoint is
             written.  Forward slashes are treated as path
             separators.  Occurrences of "%i" are replaced with
             the checkpoint identifier.  Forward slashes in the
             identifier are interpreted as path separators.
             The default is /tmp/ckpt.
             This variable has no effect if CKPT_SERVER is set.

CKPT_RESTARTLIB
             The pathname of the checkpoint restart library.  If
             this variable is not set, the restart library must
             be present somewhere in the LD_LIBRARY_PATH.
             This variable is only significant in the process
             that restarts a checkpoint.

CKPT_CONTINUE
             After emitting a checkpoint, the ckpt library forces
             the process to exit unless this variable is set (to
             any value).

CKPT_SIGNAL
             The signal that triggers a checkpoint.  The default
             is SIGTSTP.  The value may be a Unix signal name
             (e.g., SIGUSR1, SIGURG, etc.) or a signal number
             expressed as an ASCII decimal integer.


8. LINKING PROGRAMS WITH THE CHECKPOINT LIBRARY

The simplest way to link libckpt.so with a process is to relink the
program binary, adding the checkpoint library to its library list.
For example, if the program foo is linked with this line:

     cc -o foo foo.o -lm

Then edit it as follows to include libckpt.so:

     cc -o foo foo.o -lm -L/home/me/lib -lckpt

Note that the -L option directs the linker to include the
specified directory in its search for the library.

Note also that the -L option does not affect the set of
directories searched when the program is started.  You must also
modify the LD_LIBRARY_PATH environment variable to include the
directory containing libckpt.so.

Sometimes it is inconvenient or impossible to relink a program.
You can force a program to load the checkpoint library when it is
executed with the environment variable LD_PRELOAD.  Set this
variable to the pathname of the library before executing the
program.  For example (csh):

    % setenv LD_PRELOAD /home/me/lib/libckpt.so
    % foo

Finally, you can inject the checkpoint library into an already
running process with a process hijacker.  A hijacker is available
at www.cs.wisc.edu/~zandy/p.


9. TRIGGERING A CHECKPOINT

You trigger a checkpoint by sending the CKPT_SIGNAL to the
process.  A process may checkpoint itself.

After the checkpoint, the process exits unless CKPT_CONTINUE is
set.  It calls _exit to avoid executing functions registered with
atexit(3) or on_exit(3).  When CKPT_CONTINUE is set, the process
continues from the point it was interrupted by the checkpoint
signal and it can be checkpointed again.

When CKPT_SERVER is not set, the checkpoint is written to an
ordinary file.  The name of the file is determined by
CKPT_FILENAME.

When CKPT_SERVER is set, the checkpoint is transferred over TCP
to the checkpoint server.  The checkpoint is identified to the
checkpoint server with the checkpoint identifier.  Currently only
the checkpoint server cssrv, included in the ckpt distribution,
is supported.


10. RESTARTING A CHECKPOINT

There are two ways to restart a checkpoint.

1. The restart command restarts a checkpoint from a file.  It
   replaces itself with the continuation of the checkpoint.
   Restart does not require libckpt.so and ignores it if it is
   loaded, except when CKPT_SERVER is set.

2. When libckpt.so is loaded and CKPT_SERVER is set, it first
   checks whether the checkpoint server has a checkpoint for
   CKPT_ID, and if so it downloads and restarts it.

A restarted process can be checkpointed again.

The library librestart.so is required to restart a checkpoint.
This library must be present in one of the directories listed in
LD_LIBRARY_PATH or be identified with CKPT_RESTARTLIB.


11. DIAGNOSTICS

ckpt prints warnings and errors to standard error.  It does not
hesitate to abort if it senses danger.

We are happy to assist with problems.


12. CKPT API

Programs linked with the checkpoint library can call these
functions:

void ckpt_on_preckpt(void (*f)(void *), void *arg);

    Register F to be called when a checkpoint is triggered.
    Registered functions are called before the checkpoint begins
    in the order they were registered.  F is passed the ARG
    argument.

void ckpt_on_postckpt(void (*f)(void *), void *arg);

    Register F to be called when a checkpoint is taken and
    CKPT_CONTINUE is set.  Registered functions are called
    after the checkpoint completes in the order they were
    registered.  F is passed the ARG argument.

void ckpt_on_restart(void (*f)(void *), void *arg);

    Register F to be called when a process is restarted from a
    checkpoint.  Registered functions are called after the
    checkpoint has been completely restored, just before control
    returns to the program, in the reverse of the order they were
    registered.  F is passed the ARG argument.


13. CONTACT

Victor Zandy wrote and maintains ckpt.  Please report bugs to
zandy@cs.wisc.edu.  Feedback and experience reports are welcome.

The ckpt webpage is http://www.cs.wisc.edu/~zandy/ckpt.


14. REFERENCES

[Condor]    http://www.cs.wisc.edu/condor

[Hijack]    http://www.paradyn.org/papers/index.html#hijack
            http://www.cs.wisc.edu/~zandy/p

[Preload]   See ld.so(8).

[Rocks]     http://www.cs.wisc.edu/~zandy/rocks
