PPRuNe Forums - View Single Post - Shadow Copy Storage
Old 2nd August 2009 | 22:21
bnt
A bit more detail for anyone who's interested; I used to work with large disk storage systems that used similar principles to Microsoft's Shadow Copy service, and became a bit too familiar with the limitations of the technology. (No, they didn't invent it - the generic term is Snapshot - though some of the stuff "on top" is different. IIRC it was a VMS innovation.) The Wikipedia article on Shadow Copy is horrid, and doesn't properly cover the underlying Snapshot technology.

Basically, a Snapshot is an event, triggered manually or automatically (by a program, a timer etc.), which appears to duplicate a disk at a point in time. You get two disks (block devices) where you originally had one: the original, which you continue to use as normal, and the "point-in-time" copy or Snapshot. The latter appears frozen at the exact time the Snapshot was triggered, so it is safer to back up, since you know nothing will change during that process.

When the Snapshot is triggered, only the first few blocks of the disk (the boot record, FAT etc.) are copied, so it's a quick process. It does not duplicate any of the actual data at that time. The key to Snapshots is what happens afterwards - the "copy out" of old data I described in my earlier post. The Snapshot system (e.g. Shadow Copy Service) keeps track of which disk blocks are original and which have been modified. It watches for disk writes, and if original data (from before the Snapshot) is about to be overwritten, that data is first "copied out" and safeguarded. After that, read requests to the original disk get the new data as normal, but requests for those blocks on the Snapshot are redirected to the blocks that were "copied out". The "overhead" imposed by the Snapshot therefore depends on how much data is changed on the original disk after the Snapshot is taken.
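
To make the "copy out" idea concrete, here is a minimal sketch in Python. It is a toy model only, not how Shadow Copy is actually implemented: the Disk and Snapshot classes and all their method names are made up for illustration, blocks are just list entries, and only one Snapshot at a time is supported.

```python
class Disk:
    """Toy block device: a list of blocks (hypothetical model, not a real API)."""

    def __init__(self, blocks):
        self.blocks = list(blocks)
        self.snapshot = None  # this sketch allows at most one active Snapshot

    def take_snapshot(self):
        # No block data is duplicated here; we only start tracking changes.
        self.snapshot = Snapshot(self)
        return self.snapshot

    def write(self, index, data):
        snap = self.snapshot
        if snap is not None and index not in snap.copied_out:
            # First overwrite of this block since the Snapshot:
            # "copy out" the original contents before they are lost.
            snap.copied_out[index] = self.blocks[index]
        self.blocks[index] = data

    def read(self, index):
        # Reads of the live disk always see the newest data.
        return self.blocks[index]


class Snapshot:
    """Point-in-time view of a Disk, built from copied-out blocks."""

    def __init__(self, disk):
        self.disk = disk
        self.copied_out = {}  # block index -> original contents

    def read(self, index):
        # Changed blocks are redirected to the copied-out original;
        # unchanged blocks are read straight from the live disk.
        if index in self.copied_out:
            return self.copied_out[index]
        return self.disk.blocks[index]

    def overhead(self):
        # Extra storage used: one saved block per block changed since the Snapshot.
        return len(self.copied_out)


disk = Disk(["a", "b", "c", "d"])
snap = disk.take_snapshot()
disk.write(1, "B")   # original "b" is copied out first
disk.write(1, "BB")  # already copied out, so nothing more is saved
print(disk.read(1), snap.read(1), snap.overhead())
```

Note that the second write to the same block costs nothing extra, which is why the overhead tracks how many blocks change rather than how many writes happen.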

One point of confusion in that Wikipedia article is that it claims that Shadow Copy is a solution to the problem of backing up open files. It can be ... but only if the files are closed (or at least in a sensible state) at the instant the Snapshot is taken. On a Windows Server running SQL Server, for example, you get something like this happening:
- you tell a backup program "back up that database";
- the backup program tells Shadow Copy "make me a Snapshot of this disk" (holding the database file);
- Shadow Copy tells SQL Server "quiesce your database" (write all data from memory and close the file);
- SQL Server does that (delaying database requests), and tells Shadow Copy when it's done;
- Shadow Copy creates the Snapshot (a few seconds) and tells SQL Server when it's done;
- SQL Server re-opens the file and carries on servicing requests;
- Shadow Copy tells the backup program the disk ID of the Snapshot, so it can start the backup.
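
The sequence above can be sketched as a small Python simulation. To be clear, none of this is the real Shadow Copy or SQL Server interface: Database, SnapshotService, run_backup and the snapshot ID format are all hypothetical names, and the "Snapshot" is just a log entry. It only shows the order of the hand-offs.

```python
class Database:
    """Hypothetical stand-in for an application such as SQL Server."""

    def __init__(self, log):
        self.log = log
        self.accepting_writes = True

    def quiesce(self):
        # Flush in-memory data and stop changing the file.
        self.accepting_writes = False
        self.log.append("database: flushed to disk and paused")

    def resume(self):
        self.accepting_writes = True
        self.log.append("database: resumed")


class SnapshotService:
    """Hypothetical stand-in for a Snapshot service such as Shadow Copy."""

    def __init__(self, log):
        self.log = log
        self.writers = []  # applications that asked to be told about Snapshots
        self.next_id = 1

    def register(self, writer):
        self.writers.append(writer)

    def create_snapshot(self, disk_name):
        # Ask every registered application to reach a sensible state first.
        for writer in self.writers:
            writer.quiesce()
        try:
            snapshot_id = f"{disk_name}-snap{self.next_id}"  # made-up ID format
            self.next_id += 1
            self.log.append(f"service: created {snapshot_id}")
        finally:
            # Always let the applications carry on, even if the Snapshot failed.
            for writer in self.writers:
                writer.resume()
        return snapshot_id


def run_backup(service, disk_name, log):
    log.append(f"backup: requesting snapshot of {disk_name}")
    snapshot_id = service.create_snapshot(disk_name)
    # The backup reads from the frozen Snapshot, not from the live disk.
    log.append(f"backup: copying from {snapshot_id}")
    return snapshot_id


log = []
service = SnapshotService(log)
service.register(Database(log))
run_backup(service, "D", log)
print("\n".join(log))
```

The database is only paused while the Snapshot is created, not for the whole backup, which is the point of doing it this way.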

My point is that Shadow Copy (or any Snapshot system) is not magic, and doesn't solve the open file backup problem by itself, but it can work with other programs or services to co-ordinate its actions. The Snapshot itself is dumb: it will be a snapshot of whatever was on disk at the time, files open or closed. I don't know whether e.g. Word or Excel can work with Shadow Copy in that way, but it would be a good idea, since they can be a bit messy with temp files etc. when in use.

Last edited by bnt; 2nd August 2009 at 22:41.