IIM(1)                User Contributed Perl Documentation               IIM(1)

NAME
       iim - an instant mirroring client for CPAN

SYNOPSIS
       iim [-v] [-q] [-d] [-t] [-f] [-m] [-daemon tag] [-e e] [-c conf]
           [config-options]

DESCRIPTION
       Program iim mirrors CPAN based on a set of RECENT ("RECENT-*.json")
       files provided in CPAN.

       On start-up, iim compares the state of the local copy of CPAN with the
       master archive. If the RECENT files in the local copy indicate that it
       is incomplete or too far out of date, iim does a full sync first.

       Then, iim periodically reads the relevant RECENT files from the master
       archive. These files contain information about recent updates. Program
       iim uses this information to fetch new files from the master, and
       delete obsolete files in the local copy.

       Program iim is controlled by a small configuration file ; see section
       "CONFIG FILE", subsection "required entries".

       In daemon mode, iim is properly backgrounded and all output is written
       to a log file. Some effort is made to ensure that only one daemon is
       active at any given time.

       The scoreboard facility provides more information about the running
       program ; it is updated after every run of the main loop.

       The config can be hot or not ; if hot, iim will reload the config file
       when you change it.

       By default logging is terse ; iim only shows errors and relevant
       (non-periodic) updates. With option "-v" it reports on all events and
       gives some state information when new events were found. With option
       "-d" it reports on internal actions as well. For more information, see
       also config entry loglevel.

       As an option, iim can schedule periodic full rsyncs ; they are not
       necessary, even when there are many and/or prolonged network failures.

       By default, iim will periodically rotate the logfile.

       For more information on RECENT files and instant mirroring, see

       * www.cpan.org
       * search.cpan.org
         Look for "File::Rsync::Mirror::Recent"

OPTIONS
       -q  be quiet ; see also config entry loglevel

       -v  be verbose ; see also config entry loglevel

       -d  show debug info ; see also config entry loglevel

       -t  only test the config

       -f  on startup, do a full sync ; commandline option "-f" overrides
           config entry allow_full_syncs ; so, "iim -f" will do a full sync
           even if allow_full_syncs is 0.

       -daemon tag
       -daemon path/to/dir/tag
           run iim in daemon mode :

           A daemon-like iim process is started, unless another iim daemon
           (with the same tag) is already running. The process is properly
           backgrounded.

           The tag must be alpha-numeric and directory "path/to/dir/" must
           exist.

           The daemon uses the current directory as its working directory. It
           creates a directory "tag" (or "path/to/dir/tag") containing :

           * a log-file  : "iim.log"
           * a pid file  : "iim.pid"
           * a lock-file : "iim.lck"

           All commandline arguments (except "-daemon") are passed to the
           daemon.

           All (error) output is written to the log "tag/iim.log".

           The log is re-opened approximately every 5 minutes, to make log
           rotation easier.

           The daemon is best killed with

             kill -9 `cat tag/iim.pid`

           Daemon mode uses "Proc::Daemon" ; by default, the daemon exec's $0
           ($PROGRAM_NAME) ; configure prog_iim if that doesn't work for you.

       -e epoch
           init with epoch epoch ; epoch may be given as an interval-spec (see
           config entry sleep_main_loop). If epoch is "negative", the epoch is
           set that far before the current time ("time - interval").

             -e 1307687587.89889
             -e -30m         # set the epoch to 30 minutes ago
             -e -2h          # set the epoch to two hours ago

           If "-e" is set, iim does no full sync on start-up ; it just
           processes the update events that happened since epoch.

           This option is for testing only.

       -c config-file
           use configuration file config-file

       -m  compare the local archive with the master ; iim exec's an
           "rsync -n".

       config options
           All config entries can be set on the commandline :

             --entry value

           for example

             --sleep_main_loop 5m

           Note that the config file must still be complete (entries for all
           required keys) and correct (directory "local" must exist).

CONFIG FILE
   location
       The default locations of the config file are :

       * ./iim.conf
       * $HOME/.iim.conf
       * /etc/iim.conf
       * /dev/null [use default config]

   syntax
       A config file looks like this :

         +--------------------------------------------------
         │# lines that start with '#' are comments
         │# blank lines are ignored too
         │# tabs are replaced by a space
         │
         │# the config entries are 'key' and 'value' pairs
         │# a 'key' begins in column 1
         │# the 'value' is the rest of the line
         │somekey part1 part2 part3 ...
         │otherkey part1 part2 part3 ...
         │
         │# keyword EMPTY represents the empty string ;
         │# in the next line somekey's part2 is set to ''
         │somekey part1 EMPTY part3 ...
         │
         │# indented lines are glued
         │# the next three lines mean 'somekey part1 part2 part3'
         │somekey part1
         │  part2
         │  part3
         │
         │# lines starting with a '+' are concatenated
         │# the next three lines mean 'somekey part1part2part3'
         │somekey part1
         │+ part2
         │+ part3
         │
         │# lines starting with a '.' are glued too
         │# don't use a '.' on a line by itself
         │# 'somekey' gets the value "part1\n part2\n part3"
         │somekey part1
         │. part2
         │. part3
         +--------------------------------------------------

   config file : required entries
       local path
           Specify the (full, absolute) path to the local copy of CPAN.

             local /path/to/your/cpan-archive

   config file : optional entries
       temp path
           This config entry is now obsolete ; please remove it from config
           file "iim.conf".
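
           The continuation rules listed under "syntax" above can be
           illustrated with a small parser. This is only a sketch in Python,
           not the code iim itself uses (iim is a Perl program) ; the names
           parse_config and parts are hypothetical, and details that the
           text does not spell out (duplicate keys simply overwrite,
           indented comment lines are not treated specially) are assumptions.

```python
def parse_config(text):
    """Parse iim-style config text into a dict of key -> value string.

    Sketch only: a later entry with the same key overwrites an earlier one.
    """
    entries = {}
    key = None
    for raw in text.splitlines():
        # tabs are replaced by a space; trailing whitespace is dropped
        line = raw.replace("\t", " ").rstrip()
        # blank lines and lines starting with '#' are ignored
        if not line or line.startswith("#"):
            continue
        first = line[0]
        if first in " +.":
            if key is None:
                raise ValueError(f"continuation line without a key: {raw!r}")
            if first == " ":
                # indented lines are glued with a single space
                entries[key] += " " + line.strip()
            elif first == "+":
                # '+' lines are concatenated without a separator
                entries[key] += line[1:].strip()
            else:
                # '.' lines are glued with a newline; the rest is kept as is
                entries[key] += "\n" + line[1:]
        else:
            # a key begins in column 1; the value is the rest of the line
            key, _, value = line.partition(" ")
            entries[key] = value.strip()
    return entries


def parts(value):
    """Split a value into parts; keyword EMPTY stands for the empty string."""
    return ["" if part == "EMPTY" else part for part in value.split()]
```

           For instance, parts("part1 EMPTY part3") yields a list whose
           second element is the empty string, matching the EMPTY example
           in the syntax box.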

       remote some.host.org::module
           Optionally specify the rsync-module of the remote server. The
           default is :

             remote cpan-rsync.perl.org::CPAN

           If you are testing for CPAN tier1, set

             remote cpan-rsync-master.perl.org::CPAN

           Also set config entries "user" and "passwd".

       user login
           Optionally specify the login name to be used in rsync connections.
           The default is EMPTY ; that is, the empty string :

             user EMPTY

       passwd pw
           Optionally specify the password to be used in rsync connections.
           The default is EMPTY ; that is, the empty string :

             passwd EMPTY

           The password is passed to "rsync" in environment-variable
           "RSYNC_PASSWORD".

       sleep_main_loop interval-spec
           Optionally specify the interval between runs of the main-loop. The
           default is 1 minute :

             sleep_main_loop 1m

           and five minutes in daemon mode.

           An interval-spec can be given in seconds (as in 22 or 22s), minutes
           [m], hours [h], days [d] and/or weeks [w]. The interval-specs can
           be combined in any order :

             dw        # a day and a week
             7d+24h    # same thing
             w-0.5h    # a week minus half an hour
             hm6       # 3666 seconds

       sleep_init_epoch interval-spec
           Optionally specify the interval between retries during start-up.
           The default is fifteen minutes :

             sleep_init_epoch 15m

           A start-up is retried if the start-up requires a full sync and that
           sync somehow fails.

       max_run_time interval-spec
           By default iim runs for a limited time, so memory leaks will never
           become a problem.

           Optionally specify the maximum time iim may run. The default is
           four weeks minus 15 minutes :

             max_run_time 4w-15m

           Setting "max_run_time" to 0 means no limit.

           Make sure there is a cronjob in place to start an iim daemon after
           iim exits or the mirror host is rebooted.

             MIN * * * * ( cd /your/path/to/iim ; perl iim -q -daemon production )

           where MIN (minute) is some (randomly chosen) number between 0 and
           59.
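
           The interval-spec notation used by entries such as
           sleep_main_loop and max_run_time can be illustrated with a short
           Python sketch. It is not iim's own parser (iim is written in
           Perl) ; the grammar is inferred from the four examples above, and
           the names parse_interval, UNIT_SECONDS and TERM are hypothetical.
           It assumes that a bare number means seconds, a bare unit letter
           means one of that unit, and terms may be joined directly or with
           "+" or "-".

```python
import re

# Seconds per unit letter; the letters are the ones listed in the text.
UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400, "w": 604800}

# One term: optional sign, optional number, optional unit letter.
TERM = re.compile(r"([+-]?)(\d+(?:\.\d+)?)?([smhdw]?)")


def parse_interval(spec):
    """Return the number of seconds denoted by an interval-spec (sketch)."""
    spec = spec.strip()
    if not spec:
        raise ValueError("empty interval-spec")
    total = 0.0
    pos = 0
    while pos < len(spec):
        sign, number, unit = TERM.match(spec, pos).groups()
        # a term needs at least a number or a unit letter
        if number is None and not unit:
            raise ValueError(f"bad interval-spec {spec!r} at position {pos}")
        value = float(number) if number is not None else 1.0
        seconds = value * UNIT_SECONDS[unit or "s"]
        total += -seconds if sign == "-" else seconds
        pos += len(sign) + len(number or "") + len(unit)
    return total
```

           Under these assumptions, parse_interval("hm6") gives 3666
           seconds, and both "dw" and "7d+24h" give the same value, as in
           the examples above.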

       scoreboard_file path/to/file
           In each run of the main loop, iim writes the scoreboard_file ; it
           shows the current status of iim, various timers, counters etc. The
           default is :

             scoreboard_file /path/to/CPAN/local/iim/iim-scb.html

           Actually, you can specify more than one file :

             scoreboard_file /path-to-some-dir/iim-scb.html /path-to-some-dir/iim-scb.json

           Depending on the suffix of file (".html", ".php", ".json"), iim
           writes an html page, a php fragment or a json file ; plain text is
           the default.

           The html pages are generated using a template scoreboard_template
           (see next item), which contains CSS to properly format the
           scoreboard.

           The json files (also) contain the values of config entries and
           defaults.

       scoreboard_template path/to/file
           Optionally specify the path to the template for an html scoreboard.
           The default is :

             scoreboard_template /path/to/CPAN/local/iim/iim-scb-tmpl.html.sample

           This file is re-written when iim starts ; to customise the
           scoreboard, copy the default and configure the new location.

           If you copy to another directory, fix the iim-logo IMG tag in the
           template, or copy "iim-logo.png" to the other directory.

       hot_config 0|1
           Optionally specify if the config is hot or not. The default is not
           hot :

             hot_config 0

           If/when the config is hot, iim checks the config file for changes :
           if the (timestamp of the) config file changes, it is reloaded
           unless an error is detected.

           Use this option with care ; watch the log!

       loglevel quiet|terse|verbose|debug
           Optionally specify the level of logging ; the default is :

             loglevel terse

           If the loglevel is terse, iim logs all events except updates of
           files that change very often like "indices/timestamp.txt",
           "RECENT-1h.json" etc.

           If the loglevel is verbose, iim reports on all events.

           If the loglevel is debug, iim reports on internal actions as well.

           Loglevel quiet does not affect event logging ; it is only used to
           let iim quietly attempt to (re)start a daemon.

           Precedence : "-d", "-v", "-q", commandline option "--loglevel",
           config entry "loglevel".

           Option "-q" isn't passed to the daemon, so config entry "loglevel"
           (or "--loglevel") can be effective.

       rotate count [interval]
           Optionally specify logfile rotation ; the default is

             rotate 8 4w

           If count is non-zero, count logfiles are rotated on start-up, and
           again after interval, etc.

           Logfile rotation only applies in daemon mode.

       full_sync_interval interval-spec
           Optionally specify the interval between full rsyncs. The default is
           0, which means don't schedule full syncs.

             full_sync_interval 0

           If a full sync fails, a new full sync is scheduled to take place
           sleep_init_epoch later.

           If everything works as advertised, full syncs are not necessary.

       allow_full_syncs 0|1
           Optionally specify if full syncs are allowed or not. The default is
           1, which means that full syncs are allowed.

             allow_full_syncs 1

           On startup, a full sync is required if the local archive is
           inconsistent (RECENT files are missing) or older than one day.

           After startup, iim will do (scheduled) full syncs if, and only if,
           full_sync_interval is set.

           Iim will exit if it can't proceed without a full sync, and
           allow_full_syncs is 0.

           This option is for testing ; it is used to ensure that no full
           syncs will be done in a test environment created by "setup-test".

       prog_rsync path/to/file
           Optionally specify where your "rsync" lives ; the default is :

             prog_rsync /usr/bin/rsync

       prog_iim path/to/file
           Optionally specify where your program "iim" lives ; the default
           is :

             prog_iim $PROGRAM_NAME

           By default, in daemon mode, $PROGRAM_NAME ($0) is used to (re-)exec
           iim.

       timeout interval-spec
           Optionally specify the default for rsync's "--timeout" ; the
           default is :

             timeout 300s

           The value is also used to set rsync's "--contimeout".

       iim_umask oct-integer
           Optionally specify the umask iim should use ; in octal, as is
           usual. The default is :

             iim_umask 022

           Umask 022 allows rsync to create world readable files and
           directories. Often "cron" runs with a more restrictive umask (077).
           This leads to permission problems in the archive.

       include path/to/file
           Include another iim config file in situ. It is a fatal error to
           include the same file twice.

INSTALL
   requirements
       Iim requires Perl modules "JSON" and "Time::HiRes". You may want to
       install these modules as root.

       ·   Get "cpanm" :

             # curl --compressed -LO http://xrl.us/cpanm
             # chmod +x ./cpanm

       ·   Install Perl modules "JSON" (or "JSON::PP") and "Time::HiRes" :

             # ./cpanm JSON
             # ./cpanm Time::HiRes

           If installing "JSON" fails, install "JSON::PP" (Pure Perl) instead.

       Iim requires that your CPAN archive is either empty or complete : the
       last rsync (if any) completed successfully. The archive doesn't have to
       be up-to-date. If you are not sure, run rsyncs until one succeeds.

         rsync -av --delete cpan-rsync.perl.org::CPAN/ /path/to/CPAN/

       Later, such full rsyncs aren't necessary because iim makes sure the
       archive is always (in some sense) complete.

   installation
       Installation is simple :

       * fetch the source

         (preferred) check out the svn repository :

           svn co https://svn.science.uu.nl/repos/sci.penni101.iim/trunk/ iim

         or get the package (same stuff) from :

           -- http://www.staff.science.uu.nl/~penni101/iim/iim.tar.gz
           -- rsync.cs.uu.nl::iim

         or get the bleeding edge from :

           -- http://ftp.cs.uu.nl/pub/PERL/iim-test/
           -- http://ftp.cs.uu.nl/pub/PERL/iim-test.tar.gz
           -- rsync.cs.uu.nl::iim-test

       * make a configuration file

         Create a file "iim.conf" ; a sample is in "iim.conf.sample" :

           local /path/to/CPAN

         Point local to your CPAN archive. Specify a full (not relative)
         pathname like "/path/to/CPAN/".

         If you are using "cpan-rsync-master.perl.org", add

           remote cpan-rsync-master.perl.org::CPAN
           user   your-cpan-username
           passwd your-cpan-password

       * check the config

           perl iim -t

       * run

         You may want to do some testing, or simply run iim with :

           perl iim -v

         Iim immediately starts tracking the changes in the CPAN master,
         picking up where the last sync left off. A full sync is done first
         only if your CPAN archive is more than 2 days old.

       * scoreboard

         The scoreboard is in

           /path/to/CPAN/local/iim/iim-scb.html

       * daemon mode

         Iim is intended to run in the background, as a daemon process. Try
         daemon mode with :

           perl iim -daemon production

         Watch the logfile with :

           tail -f production/iim.log

       * production

         Configure more options that fit your situation. See the next section
         for more tips on using iim in production. Make sure you have a
         cronjob in place to start a fresh iim daemon.

   production
       Here are some things to keep in mind when you use iim in production :

       ·   iim is meant to be used in "-daemon" mode.

       ·   To prevent memory leaks from ever becoming a problem, iim runs for
           a limited time by default.

           To ensure that iim is always running, install a cronjob like :

             MIN * * * * ( cd /your/path/to/iim ; perl iim -q -daemon production )

           where MIN (minute) is some (randomly chosen) number between 0 and
           59.

           The cronjob will try to start a fresh iim daemon ; it will quietly
           exit if another daemon is already running.

           Adding "-f" will force a full sync on startup, even if your mirror
           is reasonably up to date.

           Use "crontab -l" to list your cronjobs.

       ·   If you make your CPAN mirror available by rsync, please add

             exclude = /local/

           to the [CPAN] module description in your "rsyncd.conf" file.

       ·   After installation, program iim can be moved anywhere. You can run
           iim without a config file ; use a cronjob like

             MIN * * * * /path/to/iim -q -daemon /path/to/tag --local /path/to/CPAN

TESTING
       Testing iim doesn't touch your CPAN archive, and doesn't require (or
       make) a copy of CPAN.

       You set up a little test environment with :

         perl -w setup-test [testenv]

       Basically, setup-test does :

         mkdir testenv
         mkdir testenv/CPAN
         # makes "testenv/iim.conf" containing :
           include iim.conf
           local testenv/CPAN
           sleep_main_loop 15s
           allow_full_syncs 0
         # seed the test-archive
         cp -p /path_to_CPAN/RECENT-*.json testenv/CPAN

       You can check the test-config with :

         perl iim -t -c testenv/iim.conf

       ... and run the test with :

         perl iim -c testenv/iim.conf -v

       ... or try daemon mode with :

         perl iim -c testenv/iim.conf -v -daemon testenv

       If your local CPAN archive is a little oldish, setup-test seeds the
       test-archive with

         RECENT-1h.json RECENT-6h.json RECENT-1d.json RECENT-1W.json
         RECENT-1M.json RECENT-1Q.json RECENT-1Y.json RECENT-Z.json

       from the public CPAN archive "cpan-rsync.perl.org::CPAN".

       The test never does a full rsync ; it just picks up the CPAN updates
       and applies them to "testenv/CPAN/".

       If you kill (or suspend) iim and restart (or resume) it later (say
       after an hour), you can see that iim picks up where it was when you
       stopped it.

       If/when you test iim with a full CPAN archive, you can use "iim -m" to
       do a full compare of the local archive and the master ; "iim -m" just
       exec's the proper "rsync -n".

UPGRADE
       ·   Before upgrading, always check the RELEASE-NOTES in svn or the
           bleeding edge ; see top of page under UPGRADE.

       ·   It is safe to do an svn update :

             svn up

           or download the package and copy everything to your iim directory.

TODO
       * randomize full_sync_interval, sleep_init_epoch
       * switch to git

THANKS
       A big thanks to Andreas J. König for patiently explaining the details
       of RECENT files to the author.

AUTHOR
       (c) 2011-2013 Henk P. Penning
       Faculty of Science, Utrecht University
       http://www.staff.science.uu.nl/~penni101/ -- penning@uu.nl
       iim version 0.4.10 - Fri Dec  6 10:41:48 2013 - dev revision 97

perl v5.8.8                       2013-12-06                            IIM(1)