====== Backup Backup Backup ======

===== What =====
  * What do you have to do in order to back up your data?
  
To answer these questions, we have prepared systems (hardware and software) that save/back up/move our data automatically. This way human intervention (and therefore the possibility of error) is reduced to a minimum.\\
\\
**Remember that when we say __home directory__, we mean your homedir stored on the file server**; we do not back up workstations.\\ For some suggestions on how to back up your laptop see [[backup:laptops|here]].\\
If a workstation stops working it is not a problem: we can repair or replace it without losing any important data.
  
===== Why =====
  
The most important thing that humans and computers share is just one: **the recorded data**. No matter what kind of computer, interface, type of connection or other mechanical/logical device users can or must use: if they cannot access their data (stored inside the computer), problems arise. For a user it is not important where or how the data are saved; the only important thing is to have access to these data, even after his/her own mistakes.
  
===== How =====
To offer the maximum support to users, our network has different methods to back up the data. Each method has pros and cons, and not all methods are directly accessible to users. In some cases the user must ask the System Administrators to recover his/her data from the backup devices. This is necessary to maintain a good compromise between access to data and security: the backups are complete collections of the data that **all** users produce, so access to these repositories has to be considered carefully.\\
\\
We have different levels of backup that offer access to deleted/changed files with different solutions.
  
  * Backup in the **[[backup:snapshot|.snapshot]]** directory
  * Backup on **[[backup:night|Hard Disk]]** in a central backup server
  
  
==== .snapshot directory ====
  
We maintain read-only copies of all data directly on the file server, for up to 3 months. Users can ask to access their data back in time, providing the time range they want to recover from.\\
The possibilities are:
  * hourly: the last 24 hours
  * daily: the last 3 months (90 days)
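If the file server exposes these snapshots as a hidden ''.snapshot'' directory inside your homedir (setups vary, so treat the snapshot names and the file path below purely as an example), recovering a file is just a copy out of the read-only tree:
<code>
# list the available snapshots (names such as hourly.0 or daily.3 depend on the server setup)
ls ~/.snapshot/
# copy yesterday's version of a file back into the live homedir
cp ~/.snapshot/daily.1/Projects/report.tex ~/Projects/report.tex
</code>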
  
==== Hard Disk in a central backup server ====
  
On a secondary server, accessible only to the System Administrators, we store a copy of all the data/files present on the primary server. This lets us recover data quickly, even in case of failure of the primary server.\\
On this server we keep the **last 180 days** of all the homedirs, so in case of failure of the central file server we can restart with at most one day of loss.
To retrieve a file from this server, ask the System Administrators, __and please don't forget the details about the file__ (its full path and roughly when it was last modified).
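For the curious, a daily homedir mirror like the one described above is typically implemented as an rsync-style job. Here is a minimal sketch only, assuming made-up host names and paths (''fileserver'', ''/backup/homedirs''), not the actual configuration:
<code>
#!/bin/sh
# Hypothetical nightly mirror: pull /home from the primary file server and keep
# one dated, hard-linked copy per day, pruning anything older than 180 days.
DATE=$(date +%Y-%m-%d)
rsync -a --delete --link-dest=/backup/homedirs/latest \
      fileserver:/home/ /backup/homedirs/$DATE/
ln -sfn /backup/homedirs/$DATE /backup/homedirs/latest
find /backup/homedirs/ -mindepth 1 -maxdepth 1 -type d -mtime +180 -exec rm -rf {} +
</code>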
  
===== Laptops =====
Each laptop managed by the lab must be configured to back up all work files using the backup solution provided by EPFL (aTempo Lina, at the moment).
The laptop backup is under the direct control of the user, so please follow the directives strictly in order not to lose your data/files.
\\
Please refer to [[backup:laptops|this page]] for more information about the backup of laptops.
\\
Laptops are the nightmare of any system administrator. Not only is each one a special case, taking as much work as 50 identical workstations, but, more importantly, its backup is solely the responsibility of the user.
The files on a laptop are in much more danger than those on a workstation or on the file server: the laptop hard disk is smaller and more fragile, the laptop can be stolen or lost, and laptops join many different uncontrolled networks.
Nevertheless, we know that very few users back up their data on a regular basis. As laptops become more and more convenient to use, important work data also becomes more and more endangered. It is useless to have a sophisticated centralized backup system like the one described above if users keep their important data only on their laptop. So **please back up your work data as often as possible**. After all, your work data belongs to EPFL.

==== Synchronizing work stuff with Unison ====
[[http://www.cis.upenn.edu/~bcpierce/unison/|Unison]] is a file-synchronization tool for Unix and Windows. Here we show how it can be used to keep the work files on your laptop in sync with your home directory on the file server.
Here is a short checklist of the things to do:
  - install Unison
  - clean up your directories
  - choose and/or prioritize what to back up
  - decide what to exclude from the backup
  - set up the Unison configuration files and startup scripts

=== Install Unison ===
You can either download it directly from the [[http://www.cis.upenn.edu/~bcpierce/unison/download.html|official web site]], or use a pre-built package from your favourite distribution. Unison is included in most Linux distributions and in both [[http://fink.sf.net|fink]] and [[http://macports.org|MacPorts]] for Apple OS X. On a Mac, for nice feedback from the backup scripts, we also suggest installing the [[http://growl.info/|growl notifier]].
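For example, the usual package-manager commands are (package names can differ slightly between distributions, so check yours):
<code>
# Debian/Ubuntu
sudo apt-get install unison
# Fedora
sudo dnf install unison
# macOS with MacPorts
sudo port install unison
</code>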

=== Cleanup your files ===
Let's classify our files as //static// or //dynamic//, and as //personal// or //professional//. Static files are created once and modified very rarely, if ever: e.g. pictures, mp3 files, or PDFs of downloaded papers. Dynamic files are those you edit often: e.g. the LaTeX sources of your latest revolutionary paper, your program sources, and your contact list. Of course the dynamic files are the ones that need particular attention and more frequent backups. Luckily, dynamic files are typically much smaller than static ones; in fact, most of the disk space is usually taken by personal static data (pictures, movies and music collections).
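A quick, generic way to see where the bulk of your (most likely static) data lives is to measure your top-level directories; this is just a convenience command, not part of the backup setup:
<code>
# size of every top-level directory in your homedir, smallest to largest (sizes in KB)
du -sk ~/* | sort -n
</code>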

<note>
Please **do not use our server for storing your personal static data**; buy yourself an external FireWire or USB hard disk instead. Backing up our home directories takes a long time and the backup space is limited: it is impossible to back up tens of gigabytes of non-work-related data for every user. The larger the data to back up, the less frequent and less safe the backups become. Please respect the other users by keeping your disk usage on the servers as limited as possible.
</note>

=== Synchronization policies ===
We suggest using two different policies (two separate configuration files) for synchronizing your data:
  - a frequent (hourly) and automatic synchronization for all your dynamic data (your personal files are accepted here too, if they are not too large);
  - an on-demand or less frequent one for professional static files, as sketched below.
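The second, on-demand policy is simply another profile in ''$HOME/.unison''. A minimal sketch follows; the profile name ''static'' and the paths ''Papers'' and ''Datasets'' are only placeholders, the roots are the same as in the configuration example further down:
<code>
# $HOME/.unison/static.prf -- run by hand with:  unison static
root = /Users/cangiani
root = ssh://cangiani@algosrv5.epfl.ch/

# large, rarely-changing professional data
path = Papers
path = Datasets
</code>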

=== What to exclude? ===
Clearly, any file that is derived from another one (e.g. compiled files such as ''.o'' object files, or LaTeX output such as ''.dvi'' files) should be excluded from any backup. Strictly speaking, synchronization is not a backup, but excluding non-source files will speed up the synchronization process.

It is sane to **use only one synchronization system** for a given file. Version control systems are in fact synchronization systems, therefore one should also exclude all the files that are already under version control (cvs, svn, mercurial...).
I append ''_svn'' to the name of any directory containing only files under version control and add the ''*_svn'' pattern to my exclude list.

=== Configuration example ===
Unison is easy to use once the configuration file is set:

  $ unison -batch myPolicy

where ''myPolicy'' is the name of the policy to use, described in a file with the same name under the ''$HOME/.unison'' directory.

Here is how my main configuration file looks:

<code>
# Roots of the synchronization
    root = /Users/cangiani
    root = ssh://cangiani@algosrv5.epfl.ch/

# Here is what I want to keep in sync
    path = Archives
    path = Documents
    ignore = Path Documents/Local
    path = Learn
    path = Projects

# Some patterns specifying names and paths to ignore
    ignore = Name temp.*
    ignore = Name *~
    ignore = Name .*~
    ignore = Name *.{o,x}
    ignore = Name *.{tmp,aux,log,dvi}
    ignore = Name *_svn
    ignore = Name *.sparseimage
    ignore = Name .DS_Store

# always use rsync for sending files
    rsync = true

# first treat smaller files
    sortbysize = true

# On Mac the default FS is case insensitive
#    ignorecase = true

# Log actions to the terminal
    log = true

# Specific settings
    key = 1
    label = Learn directory
    batch = false
</code>
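Before trusting a new profile in batch mode, it is worth running it interactively once: without the ''-batch'' flag, Unison lists each difference and asks for confirmation before propagating anything.
<code>
unison myPolicy
</code>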

Please refer to the [[http://www.cis.upenn.edu/~bcpierce/unison/download/releases/beta/unison-manual.html|official documentation]] for all the details.

=== Do it! ===
Now that you have prepared and tested your perfect configuration file, it is time to make sure that Unison is executed periodically. On Unix (Linux and Mac) you can simply call Unison from a script like the following:
<code>
#!/bin/sh

# always use the same host name, even if the actual address of the laptop changes
export UNISONLOCALHOSTNAME=giovanniMBP

# only sync when we are at EPFL on a wired connection
ifconfig | grep 128.178.70 > /dev/null 2>&1
if [ $? -eq 0 ] ; then
  unison -batch main
fi
</code>
and launch the script periodically as a cron job. Edit your cron table (''crontab -e'') and add something like
<code>
38 8-20/2 * * * /Users/cangiani/Desktop/unison.command > /dev/null 2>&1
</code>
which runs the script every 2 hours between 8:38 am and 8:38 pm.
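To test the setup, you can run the script once by hand and then check that the cron entry is registered (the path is the one used in the example above):
<code>
chmod +x /Users/cangiani/Desktop/unison.command
/Users/cangiani/Desktop/unison.command
crontab -l
</code>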

Note that the line
<code>
export UNISONLOCALHOSTNAME=giovanniMBP
</code>
in the above script is quite important, because the archive Unison uses to keep track of which files changed since the last sync depends on the host name (address) of your machine. Since the laptop gets a different IP address, and therefore a different host name, each time it connects to the network, you would otherwise often get false conflicts: a file appears changed on both machines only because it was synced on a previous run under a previous host name.
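One practical detail the script glosses over: since the cron job runs unattended, the ssh connection to the file server must not stop and ask for a password. The usual way to arrange this (an assumption here, not something this page prescribes) is key-based authentication:
<code>
# generate a key on the laptop (once)
ssh-keygen -t ed25519
# install the public key on the account used as the remote root
ssh-copy-id cangiani@algosrv5.epfl.ch
</code>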
  