Backup-restore procedures
Author: Thomas Bellembois ([University of Rennes 1|http://www.univ-rennes1.fr])
Introduction
The ESUP WebDAV server is based on the Jakarta Slide WebDAV server. In this document the terms "Slide" and "ESUP WebDAV server" refer to the same entity: the ESUP WebDAV server.
Slide does not provide a user-friendly backup/restore interface. However, because Slide content and metadata are well structured, they can be backed up and restored easily.
The ESUP Portail consortium does not provide a friendly tool to handle backup/restore procedures, but we have tested two procedures that work properly.
Feel free to contact me for further details.
Notational conventions:
- resource = a file or directory
- content/metadata = when a file is put on the WebDAV server, the server attaches information to it. This is completely transparent for users but not for the administrator. That is why we call the file itself "content" and its attached information "metadata".
- backend = place (storage media) where the WebDAV server stores the content and metadata.
What you have to know
To understand how to back up and, in particular, restore Slide content and metadata, you must be aware of how Slide handles this data.
In the build.properties file of the ESUP WebDAV server you have defined 5 parameters:
slide.rootPath = /home/tbellemb/esup-serveur-WebDav-3.5/SlideData
slide.contentRootStore = ${slide.rootPath}/content/store
slide.contentWorkStore = ${slide.rootPath}/content/work
slide.metadataRootStore = ${slide.rootPath}/metadata/store
slide.metadataWorkStore = ${slide.rootPath}/metadata/work
Note that once the server is deployed, you can find these parameters in the webapps/Slide/Domain.xml file in the deployment directory:
...
<nodestore classname="org.apache.slide.store.txfile.TxXMLFileDescriptorsStore">
    <parameter name="rootpath">/home/tbellemb/esup-serveur-WebDav-3.5/SlideData/metadata/store</parameter>
    <parameter name="workpath">/home/tbellemb/esup-serveur-WebDav-3.5/SlideData/metadata/work</parameter>
    <parameter name="defer-saving">true</parameter>
    <parameter name="timeout">120</parameter>
</nodestore>
...
<contentstore classname="org.apache.slide.store.txfile.TxFileContentStore">
    <parameter name="rootpath">/home/tbellemb/esup-serveur-WebDav-3.5/SlideData/content/store</parameter>
    <parameter name="workpath">/home/tbellemb/esup-serveur-WebDav-3.5/SlideData/content/work</parameter>
    <parameter name="defer-saving">true</parameter>
    <parameter name="timeout">120</parameter>
</contentstore>
...
The work directories are used as temporary directories so we will leave them aside...
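If you want to read the store locations from a script rather than by eye, something like the following Python sketch may help. It is not part of the ESUP distribution. It assumes only that Domain.xml is well-formed XML containing nodestore and contentstore elements shaped like the excerpt above. The function name read_store_paths and the wrapping <slide> element of the sample are hypothetical.

```python
import xml.etree.ElementTree as ET

# Trimmed stand-in for webapps/Slide/Domain.xml. The wrapping <slide> element is
# an assumption made only so that this sample parses on its own.
SAMPLE_DOMAIN_XML = """
<slide>
  <nodestore classname="org.apache.slide.store.txfile.TxXMLFileDescriptorsStore">
    <parameter name="rootpath">/home/tbellemb/esup-serveur-WebDav-3.5/SlideData/metadata/store</parameter>
    <parameter name="workpath">/home/tbellemb/esup-serveur-WebDav-3.5/SlideData/metadata/work</parameter>
  </nodestore>
  <contentstore classname="org.apache.slide.store.txfile.TxFileContentStore">
    <parameter name="rootpath">/home/tbellemb/esup-serveur-WebDav-3.5/SlideData/content/store</parameter>
    <parameter name="workpath">/home/tbellemb/esup-serveur-WebDav-3.5/SlideData/content/work</parameter>
  </contentstore>
</slide>
"""


def read_store_paths(xml_text):
    """Return the rootpath of the metadata store and of the content store."""
    root = ET.fromstring(xml_text)
    paths = {}
    for label, tag in (("metadata", "nodestore"), ("content", "contentstore")):
        for store in root.iter(tag):
            for param in store.findall("parameter"):
                # Only rootpath matters here; the work directories are temporary.
                if param.get("name") == "rootpath":
                    paths[label] = (param.text or "").strip()
    return paths


store_paths = read_store_paths(SAMPLE_DOMAIN_XML)
print(store_paths)
```

On a real installation you would pass the text of the deployed Domain.xml instead of the sample string.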
We then have two branches:
- mySlideData/metadata/store
- mySlideData/content/store
that will respectively store Slide metadata and content.
Slide uses the same hierarchy to store content and metadata as the one created by users.
Let's have a look at this hierarchy:
!dataStructure.png!
In the metadata branch, each directory contains the descriptor files (*.def.xml) of its files and sub-directories. These XML files hold information such as the creation date, the display name and the permissions on the resource. File and directory descriptors differ slightly and each carries its own specific information. A directory descriptor lists its files and sub-directories, its parent and its permissions.
Descriptor of the /files directory:
<children>
    <child name="quotas" uuri="/files/quotas"/>
    <child name="homedirs" uuri="/files/homedirs"/>
    <child name="partages" uuri="/files/partages"/>
    <child name="test" uuri="/files/test"/>
</children>
...
<parents>
    <parent name="files" uuri="/"/>
</parents>
...
<permissions>
    <permission subjectUri="/roles/uPortal/ToutLeMonde/Personnels/CRI/SI" actionUri="all" inheritable="true" negative="false" />
</permissions>
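To illustrate how such a descriptor can be inspected, here is a minimal Python sketch that lists the children and permissions it declares. The excerpt above does not show the root element of a real descriptor, so the <descriptor> wrapper used here is a placeholder. The function name summarize_descriptor is also hypothetical. The code looks elements up by tag name only, so it does not depend on the actual root.

```python
import xml.etree.ElementTree as ET

# Fragment of a directory descriptor. The <descriptor> wrapper is a placeholder:
# the real root element is not shown in this document.
SAMPLE_DESCRIPTOR = """<descriptor>
  <children>
    <child name="quotas" uuri="/files/quotas"/>
    <child name="homedirs" uuri="/files/homedirs"/>
    <child name="partages" uuri="/files/partages"/>
    <child name="test" uuri="/files/test"/>
  </children>
  <permissions>
    <permission subjectUri="/roles/uPortal/ToutLeMonde/Personnels/CRI/SI"
                actionUri="all" inheritable="true" negative="false"/>
  </permissions>
</descriptor>"""


def summarize_descriptor(xml_text):
    """Return (child URIs, permissions) declared in a directory descriptor."""
    root = ET.fromstring(xml_text)
    children = [child.get("uuri") for child in root.iter("child")]
    permissions = [
        (perm.get("subjectUri"), perm.get("actionUri"), perm.get("negative") == "true")
        for perm in root.iter("permission")
    ]
    return children, permissions


child_uris, permissions = summarize_descriptor(SAMPLE_DESCRIPTOR)
print(child_uris)
print(permissions)
```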
The content branch contains the files themselves (that is, their binary content). Each filename ends with the file revision number, which is always _1.0 because the ESUP WebDAV server does not use versioning.
The content and metadata branches are tightly linked. Adding a file to the content branch has no effect (the file is not visible on the server) unless you also modify the descriptor of the directory you want to put the file in.
1. Descriptor (.def.xml) files MUST NOT be modified while the server is running. This may crash the server.
2. If a descriptor is modified incorrectly (XML that is not well formed, for example), this may crash the server.
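Since a malformed descriptor can crash the server, it may be worth checking well-formedness before restarting. The Python sketch below is one possible way to do that. The function name find_malformed_descriptors is hypothetical. It only checks that each *.def.xml file parses as XML, not that its content is valid for Slide.

```python
import tempfile
import xml.etree.ElementTree as ET
from pathlib import Path


def find_malformed_descriptors(metadata_store):
    """Return (path, error message) for every *.def.xml file that fails to parse."""
    problems = []
    for descriptor in sorted(Path(metadata_store).rglob("*.def.xml")):
        try:
            ET.parse(str(descriptor))
        except ET.ParseError as error:
            problems.append((descriptor, str(error)))
    return problems


# Demonstration on a throwaway directory holding one good and one broken file.
with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp)
    (store / "good.def.xml").write_text("<a><b/></a>")
    (store / "bad.def.xml").write_text("<a><b></a>")  # <b> is never closed
    broken = find_malformed_descriptors(store)
    for path, message in broken:
        print(path.name, "->", message)
```

On a real installation, the argument would be the metadata store directory, and the check would be run with the server stopped, after editing and before restarting.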
Backup/restore process
There are two main ways to restore Slide data, depending on the permissions set on the folder tree you want to restore. In either case, you have to ensure that the Slide backend (mySlideData) is on a filesystem that is backed up.
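If the backend is not already covered by a filesystem-level backup, a dated archive of the two store branches is one simple alternative. The Python sketch below is an assumption on our part, not a procedure the consortium prescribes. The function name snapshot_backend and the archive naming scheme are hypothetical. It leaves the work directories aside, and it does not address consistency if the server is writing while the archive is made.

```python
import tarfile
import tempfile
import time
from pathlib import Path


def snapshot_backend(slide_root, archive_dir):
    """Archive content/store and metadata/store into a dated .tar.gz; return its path."""
    slide_root = Path(slide_root)
    stamp = time.strftime("%Y%m%d-%H%M%S")
    archive = Path(archive_dir) / ("slide-backend-" + stamp + ".tar.gz")
    with tarfile.open(str(archive), "w:gz") as tar:
        for branch in ("content/store", "metadata/store"):
            # arcname keeps paths in the archive relative to the Slide root.
            tar.add(str(slide_root / branch), arcname=branch)
    return archive


# Demonstration on a throwaway tree that mimics the backend layout.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "SlideData"
    for branch in ("content/store", "metadata/store", "content/work"):
        (root / branch).mkdir(parents=True)
    (root / "content" / "store" / "foo.doc_1.0").write_text("demo")
    archive = snapshot_backend(root, tmp)
    with tarfile.open(str(archive)) as tar:
        archived_names = tar.getnames()
    print(archived_names)
```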
Folder tree with no or few permissions (like a homedir)
Consider the following tree (resources marked with an * have access restrictions (permissions)):
homedirs* -- tbellemb* -- c -- b -- a -- a1 -- foo3.doc -- foo1.doc -- foo2.doc
The user tbellemb deletes his "a" directory and would like it to be restored.
The last saved backend is (content branch):
content -- store -- homedirs -- tbellemb -- c -- b -- a -- a1 -- foo3.doc_1.0 -- foo1.doc_1.0 -- foo2.doc_1.0
To restore the "a" directory:
- copy the "a" directory from the saved backend to a temporary directory
- remove all of the _1.0 extensions. This can be done with the simple Perl script below.
- with a DAV client, put the "a" directory from the temporary directory into the user's homedir
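Perl script to remove the _1.0 extensions:
#!/usr/bin/perl
# Strip the "_1.0" revision suffix from every file below the directory
# given as the first argument.
use strict;
use warnings;
use File::Find;

my $dir = shift @ARGV or die "Usage: $0 <directory>\n";
print "Program start: " . __FILE__ . "\n\n";

# File::Find changes into each directory, so $_ is the bare file name.
find(sub {
    return unless -f $_ && /_1\.0$/;
    my $old_name = $_;
    (my $new_name = $old_name) =~ s/_1\.0$//;   # escaped dot, anchored at the end
    print "Processing file: $File::Find::name\n";
    rename($old_name, $new_name) or warn "Cannot rename $old_name: $!\n";
    print "Renamed to: $new_name\n";
}, $dir);

print "\nProgram finished normally: " . __FILE__ . "\n";
exit;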
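For readers who prefer a single script covering the first two steps (copy, then strip the suffix), here is a rough Python equivalent. It is a sketch only. The function name stage_for_upload and the sample layout in the demonstration are hypothetical. The upload with a DAV client remains a manual step.

```python
import shutil
import tempfile
from pathlib import Path

REVISION_SUFFIX = "_1.0"


def stage_for_upload(saved_dir, staging_root):
    """Copy saved_dir under staging_root and strip the revision suffix from its files."""
    saved_dir = Path(saved_dir)
    staged = Path(staging_root) / saved_dir.name
    shutil.copytree(str(saved_dir), str(staged))
    # sorted() materializes the listing before any file is renamed.
    for path in sorted(staged.rglob("*" + REVISION_SUFFIX)):
        if path.is_file():
            path.rename(path.with_name(path.name[: -len(REVISION_SUFFIX)]))
    return staged


# Demonstration with a made-up saved copy of the "a" directory.
with tempfile.TemporaryDirectory() as tmp:
    saved = Path(tmp) / "backup" / "a"
    (saved / "a1").mkdir(parents=True)
    (saved / "foo1.doc_1.0").write_text("one")
    (saved / "a1" / "foo3.doc_1.0").write_text("three")
    staging = Path(tmp) / "staging"
    staging.mkdir()
    staged = stage_for_upload(saved, staging)
    staged_files = sorted(
        p.relative_to(staged).as_posix() for p in staged.rglob("*") if p.is_file()
    )
    print(staged_files)
```

Working on a copy keeps the saved backend untouched, so the restore can be repeated if something goes wrong.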
Folder tree with many permissions (like a shared space)
Consider the following tree (resources marked with an * have access restrictions (permissions)):
partages* -- SI* -- c* -- b* -- a* -- a1** -- a11* -- .. -- a12* -- .. -- foo1.doc -- foo2.doc
A user deletes the "a" directory and would like it to be restored.
Using the first restore method would be laborious here: after the restoration, the administrator would have to reapply the correct access rights to each sub-directory of "a".
The last saved backend is (content and metadata branches):
content -- store -- partages -- SI -- c -- b -- a -- a1 -- a11 -- .. -- a12 -- .. -- foo1.doc_1.0 -- foo2.doc_1.0
metadata -- store -- partages -- [.def.xml] -- [partages.def.xml] -- SI -- [SI.def.xml] -- b -- c -- [a.def.xml] -- [b.def.xml] -- [c.def.xml] -- a -- foo1.doc.def.xml -- foo2.doc.def.xml
To restore the "a" directory:
- Stop the WebDAV server
- Copy the two "a" directories both from the content and metadata branches of the saved backend to the current backend.
- Modify the SI.def.xml file to add the restored "a" directory as shown below.
- Restart the server
SI.def.xml:
<children>
    <child name="b" uuri="/partages/SI/b" />
    <child name="c" uuri="/partages/SI/c" />
    <!-- RESTORED DIRECTORY-->
    <child name="a" uuri="/partages/SI/a" />
</children>
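The descriptor edit can also be scripted. The Python sketch below appends the child entry only if it is not already listed. It is an illustration under assumptions: the function name add_child_entry is hypothetical, and the <descriptor> root in the demonstration is a placeholder because the real root element is not shown here. Python's ElementTree drops XML comments and may change whitespace when it rewrites the file, and we have not verified how Slide reacts to that, so keep a copy of the original descriptor and run this only while the server is stopped.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def add_child_entry(descriptor_path, name, uuri):
    """Append <child name=... uuri=...> to <children>; return False if already listed."""
    tree = ET.parse(descriptor_path)
    children = next(tree.getroot().iter("children"), None)
    if children is None:
        raise ValueError("no <children> element in " + str(descriptor_path))
    if any(child.get("uuri") == uuri for child in children.findall("child")):
        return False
    ET.SubElement(children, "child", {"name": name, "uuri": uuri})
    tree.write(descriptor_path, encoding="utf-8", xml_declaration=True)
    return True


# Demonstration on a throwaway file. <descriptor> is a placeholder root element.
with tempfile.TemporaryDirectory() as tmp:
    descriptor = os.path.join(tmp, "SI.def.xml")
    with open(descriptor, "w", encoding="utf-8") as handle:
        handle.write(
            "<descriptor><children>"
            '<child name="b" uuri="/partages/SI/b"/>'
            '<child name="c" uuri="/partages/SI/c"/>'
            "</children></descriptor>"
        )
    first = add_child_entry(descriptor, "a", "/partages/SI/a")
    second = add_child_entry(descriptor, "a", "/partages/SI/a")  # already listed
    listed = [c.get("uuri") for c in ET.parse(descriptor).getroot().iter("child")]
    print(first, second, listed)
```

The duplicate check makes the function safe to run twice, which is convenient if a restore has to be retried.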