Storage, the base of a backup system
Data repository models
Any backup strategy
starts with a concept of a data repository. The backup data needs to be
stored somehow and probably should be organized to a degree. It can be
as simple as a sheet of paper with a list of all backup tapes and the
dates they were written or a more sophisticated setup with a
computerized index, catalog, or relational database. Different
repository models have different advantages. This is closely related to
choosing a backup rotation scheme.
- Unstructured
- An unstructured
repository may simply be a stack of floppy disks or CD-R media with
minimal information about what was backed up and when. This is the
easiest to implement, but probably the least likely to achieve a
high level of recoverability.
- Full + Incrementals
- A Full + Incremental
repository aims to make storing several copies of the source data
more feasible. At first, a full backup (of all files) is
taken. After that an incremental backup (of only the files
that have changed since the previous full or incremental backup) can
be taken. Restoring whole systems to a certain point in time would
require locating the full backup taken previous to that time and all
the incremental backups taken between that full backup and the
particular point in time to which the system is supposed to be
restored. This model offers a high degree of assurance that data
can be restored, and it can be used with removable media such as tapes
and optical disks. The downside is dealing with a long series of
incrementals and the high storage requirements.
- Full + Differential
- A full + differential
backup differs from a full + incremental in that after the full
backup is taken, each partial backup captures all files created or
changed since the full backup, even though some may have been
included in a previous partial backup. Its advantage is that a
restore involves recovering only the last full backup and then
overlaying it with the last differential backup.
- Mirror + Reverse
Incrementals
- A Mirror + Reverse
Incrementals repository is similar to a Full + Incrementals
repository. The difference is instead of an aging full backup
followed by a series of incrementals, this model offers a mirror
that reflects the system state as of the last backup and a history
of reverse incrementals. One benefit is that it requires only an
initial full backup. Each incremental backup is immediately applied
to the mirror, and the files it replaces are moved to a reverse
incremental. This model is not suited to removable media, since
every backup must be done in comparison to the mirror.
- Continuous data
protection
- This model goes a
step further: instead of scheduling periodic backups, the system
immediately logs every change on the host system. This is generally
done by saving byte-level or block-level differences rather than
file-level differences. It differs from simple disk mirroring
in that the log can be rolled back, so an older image of the data
can be restored.
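The difference in restore effort between the Full + Incrementals and Full + Differential models can be sketched in a few lines of Python. This is only an illustration: the `Backup` record and the `restore_chain` function are hypothetical names, days are plain integers, and the sketch assumes a schedule uses either incrementals or differentials between full backups, not a mixture.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Backup:
    day: int    # day number on which the backup was taken
    kind: str   # "full", "incremental", or "differential"


def restore_chain(backups, target_day):
    """Return the backups needed to restore the system as of target_day.

    Assumes the partial backups after a full backup are all of one kind.
    """
    usable = sorted((b for b in backups if b.day <= target_day),
                    key=lambda b: b.day)
    full_positions = [i for i, b in enumerate(usable) if b.kind == "full"]
    if not full_positions:
        raise ValueError("no full backup exists on or before the target day")
    start = full_positions[-1]
    base, partials = usable[start], usable[start + 1:]
    if partials and partials[-1].kind == "differential":
        # A differential holds every change since the full backup,
        # so only the most recent one is needed.
        return [base, partials[-1]]
    # Each incremental holds only the changes since the previous backup,
    # so every one of them must be applied, in order.
    return [base] + partials
```

With a full backup on day 0 and one partial backup on each of days 1 to 3, a restore to day 3 needs four backups under the incremental model but only two under the differential model.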
Storage media
Regardless of the
repository model that is used, the data has to be stored on some data
storage medium somewhere.
- Magnetic tape
- Magnetic tape has long
been the most commonly used medium for bulk data storage, backup,
archiving, and interchange. Tape has typically had an order of
magnitude better capacity/price ratio when compared to hard disk,
but recently the ratios for tape and hard disk have become a lot
closer. There are myriad formats, many of which are proprietary or
specific to certain markets like mainframes or a particular brand of
personal computer. Tape is a sequential access medium, so even
though access times may be poor, the rate of continuously writing or
reading data can actually be very fast. Some new tape drives are
even faster than modern hard disks.
- Hard disk
- The capacity/price
ratio of hard disk has been rapidly improving for many years. This
is making it more competitive with magnetic tape as a bulk storage
medium. The main advantages of hard disk storage are low access
times, availability, capacity and ease of use. External disks can be
connected via local interfaces like SCSI, USB or FireWire, or via
longer distance technologies like Ethernet, iSCSI, or Fibre Channel.
Some disk-based backup systems, such as Virtual Tape Libraries,
support data de-duplication which can dramatically reduce the amount
of disk storage capacity consumed by daily and weekly backup data.
- Optical disc
- A recordable CD can be
used as a backup device. One advantage of CDs is that they can be
restored on any machine with a CD-ROM drive. Recordable CDs are
also relatively cheap. Another common format is recordable DVD.
Many optical disc formats are WORM type, which makes them useful for
archival purposes since the data can't be changed. Rewritable
formats such as CD-RW or DVD-RAM can also be used. The newer HD DVD
and Blu-ray discs dramatically increase the amount of data possible
on a single optical disc, though, as yet, the hardware may
be cost-prohibitive for many people.
- Floppy disk
- During the 1980s and
early 1990s, many personal/home computer users associated backup
mostly with copying floppy disks. The low data capacity of a floppy
disk makes it an unpopular and obsolete choice today.
- Solid state storage
- Also known as flash
memory, thumb drives, USB flash drives, CompactFlash, SmartMedia,
Memory Stick, Secure Digital cards, etc., these devices are
relatively costly for their low capacity, but offer excellent
portability and ease-of-use.
- Remote backup
service
- As broadband internet
access becomes more widespread, remote backup services are gaining
in popularity. Backing up via the internet to a remote location can
protect against some worst-case scenarios, such as a fire, flood or
earthquake that destroys local backups along with everything else. A
drawback to a remote backup service is that an internet connection
is usually substantially slower than local data storage
devices, so this can be a problem for people with large amounts of
data. It also carries the risk of putting control of
personal or sensitive data in the hands of a third party.
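To make the speed drawback of remote backup concrete, a rough transfer-time estimate can be computed as below. The function name `transfer_hours` is hypothetical, and the calculation ignores protocol overhead, compression and de-duplication, so it gives an optimistic lower bound.

```python
def transfer_hours(data_gigabytes, link_megabits_per_second):
    """Estimate the hours needed to move data over a link, ignoring overhead."""
    bits = data_gigabytes * 8 * 1000**3            # decimal gigabytes to bits
    seconds = bits / (link_megabits_per_second * 1000**2)
    return seconds / 3600
```

For example, under these assumptions 500 GB sent over a 10 Mbit/s uplink takes roughly 111 hours, while the same data over a 1 Gbit/s local link takes a little over one hour.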
Managing the data repository
Regardless of the data
repository model or data storage media used for backups, a balance needs
to be struck between accessibility, security and cost.
- On-line
- On-line backup storage
is typically the most accessible type of data storage and can
begin a restore within milliseconds. A good example would be an
internal hard disk or a disk array (possibly connected to a SAN). This
type of storage is very convenient and speedy, but is relatively
expensive. On-line storage is also vulnerable to being deleted or
overwritten, either by accident or by a data-deleting
virus payload.
- Near-line
- Near-line storage is
typically less accessible and less expensive than on-line storage,
but still useful for backup data storage. A good example would be a
tape library with restore times ranging from seconds to a few
minutes. A mechanical device is usually involved in moving media
units from storage into a drive where the data can be read or
written.
- Off-line
- Off-line storage is
similar to near-line, except it requires human interaction to make
storage media available. This can be as simple as storing backup
tapes in a file cabinet. Media access time is more than an hour.
- Off-site vault
- To protect against a
disaster or other site-specific problem, many people choose to send
backup media to an off-site vault. The vault can be as simple as the
System Administrator’s home office or as sophisticated as a
disaster-hardened, temperature-controlled, high-security bunker that
has facilities for backup media storage.
- Backup site,
Disaster Recovery Center or DR Center
- In the event of a
disaster, the data on backup media alone will not be sufficient to
recover. Computer systems onto which the data can be restored and
properly configured networks are necessary too. Some organizations
have their own data recovery centers that are equipped for this
scenario. Other organizations contract this out to a third-party
recovery center. Note that because a DR site is itself a huge
investment, backup is very rarely considered the preferred method of
moving data to a DR site. A more typical approach is remote disk
mirroring, which keeps the DR data as up-to-date as possible.
Selection, extraction and manipulation of data
Deciding what to back up
at any given time is a harder process than it seems. By backing up too
much redundant data, the data repository will fill up too quickly. If we
don't back up enough data, critical information can get lost. The key
concept is to only back up files that have changed.
- Copying files
- Copy the files to be
backed up to another location using the OS-specific copy utility.
- Filesystem dump
- Copy the filesystem
that holds the files in question to another location. This usually
involves unmounting the filesystem and running a program like dump.
This is also known as a raw partition backup. This type of
backup has the possibility of running faster than a backup that
simply copies files. A feature of some dump software is the ability
to restore specific files from the dump image.
- Identification of
changes
- Some filesystems have
an archive bit for each file that says it was recently changed. Some
backup software looks at the date of the file and compares it with
the last backup, to determine whether the file was changed.
- Block Level
Incremental
- A more sophisticated
method of backing up changes to files is to only back up the blocks
within the file that changed. This requires a higher level of
integration between the filesystem and the backup software.
- Versioning file
system
- A versioning
filesystem keeps track of all changes to a file and makes those
changes accessible to the user. Generally this gives access to any
previous version, all the way back to the file's creation time. An
example of this is the Wayback versioning filesystem for Linux.
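The date-comparison approach to identifying changes can be sketched as follows. This is a minimal illustration, not how any particular backup product works: `files_changed_since` is a hypothetical helper, and it relies only on each file's modification time, so it will miss files whose timestamps have been reset.

```python
import os


def files_changed_since(root, last_backup_time):
    """Yield paths under root modified after last_backup_time.

    last_backup_time is a POSIX timestamp, as returned by time.time().
    """
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Compare the file's modification time with the last backup.
            if os.path.getmtime(path) > last_backup_time:
                yield path
```

An incremental backup would pass the time of the previous backup of any kind, while a differential backup would pass the time of the last full backup.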
Selection and extraction of live data
If a computer system is
in use while it is being backed up, the possibility of files being open
for reading or writing is real. If a file is open, the contents on disk
may not correctly represent what the owner of the file intends. This is
especially true for database files of all kinds.
When attempting to
understand the logistics of backing up open files, one must consider
that the backup process could take several minutes to back up a large
file such as a database. In order to back up a file that is in use, it
is vital that the entire backup represent a single-moment snapshot of
the file, rather than a simple copy of a read-through. This represents a
challenge when backing up a file that is constantly changing. Either the
database file must be locked to prevent changes, or a method must be
implemented to ensure that the original snapshot is preserved long
enough to be copied, all while changes are being preserved. If a
file is backed up while it is being changed, so that the first part
of the backup represents the data before a change and later parts
represent the data after it, the result is a corrupted, unusable
file, because most large files contain
internal references between their various parts that must remain
consistent throughout the file.
- Snapshot backup
- A snapshot is an
instantaneous function of some storage systems that presents a copy
of the filesystem as if it were frozen at a specific point in time,
often by a copy-on-write mechanism. An effective way to back up live
data is to temporarily quiesce it (e.g. close all files), take a
snapshot, and then resume live operations. At this point the
snapshot can be backed up through normal methods. While a
snapshot is very handy for viewing a filesystem as it was at a
different point in time, it is hardly an effective backup mechanism
by itself.
- Open file backup
- Many backup software
packages feature the ability to back up open files. Some simply
check for openness and try again later. File locking is useful for
regulating access to open files.
- Cold database
backup
- During a cold backup,
the database is closed or locked and not available to users. The
datafiles do not change during the copy so the database is in sync
upon restore.
- Hot database backup
- Some database
management systems offer a means to generate a backup image of the
database while it is online and usable ("hot"). This
usually includes an inconsistent image of the data files plus a log
of changes made while the procedure is running. Upon a restore, the
changes in the log files are reapplied to bring the database into
sync.
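The copy-on-write mechanism mentioned under snapshot backup can be illustrated with a toy in-memory model. The `CowVolume` class below is hypothetical and supports only a single snapshot. Real storage systems implement this at the block-device or filesystem layer.

```python
class CowVolume:
    """Toy block device with a single copy-on-write snapshot."""

    def __init__(self, blocks):
        self.blocks = list(blocks)   # live data, one entry per block
        self.saved = None            # old contents of blocks overwritten since the snapshot

    def take_snapshot(self):
        # Nothing is copied yet, which is why taking a snapshot is near-instantaneous.
        self.saved = {}

    def write(self, index, value):
        if self.saved is not None and index not in self.saved:
            # First write to this block since the snapshot: preserve the old value.
            self.saved[index] = self.blocks[index]
        self.blocks[index] = value

    def read_snapshot(self, index):
        if self.saved is None:
            raise RuntimeError("no snapshot has been taken")
        # Unchanged blocks are still read from the live volume.
        return self.saved.get(index, self.blocks[index])
```

Because unchanged blocks are read from the live volume, the snapshot is lost if that volume fails. That is why the snapshot still has to be copied elsewhere to count as a backup.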
Selection and extraction of metadata
Not all information
stored on the computer is stored in files. Accurately recovering a
complete system from scratch requires keeping track of this non-file
data too.
- System description
- System specifications
are needed to procure an exact replacement after a disaster.
- File metadata
- Each file's
permissions, owner, group, ACLs, and any other metadata need to be
backed up for a restore to properly recreate the original
environment.
- Partition layout
- The layout of the
original disk, as well as partition tables and filesystem settings,
is needed to properly recreate the original system.
- Boot sector
- The boot sector can
sometimes be recreated more easily than saving it. Still, it usually
isn't a normal file and the system won't boot without it.
Manipulation of data
It is frequently useful
to manipulate the data being backed up to optimize the backup process.
These manipulations can improve backup speed, restore speed, data
security, and media usage.
- Compression
- Various schemes can be
employed to shrink the size of the source data to be stored so that
it uses less storage space. Compression is frequently a built-in
feature of tape drive hardware.
- De-duplication
- When multiple similar
systems are backed up to the same destination storage device, there
exists the potential for much redundancy within the backed up data.
For example, if 20 Windows workstations were backed up to the same
data repository, they might share a common set of system files. The
data repository only needs to store one copy of those files to be
able to restore any one of those workstations. This technique can be
applied at the file level or even on raw blocks of data, potentially
resulting in a massive reduction in required storage space.
Deduplication can occur on a server before any data moves to backup
media, sometimes referred to as source/client side deduplication.
This approach also reduces bandwidth required to send backup data to
its target media. The process can also occur at the target storage
device, sometimes referred to as inline or back-end deduplication.
- Duplication
- Sometimes backup jobs
are duplicated to a second set of storage media. This can be done to
rearrange the backup images to optimize restore speed, or to have a
second copy at a different location or on a different storage
medium.
- Encryption
- High capacity
removable storage media such as backup tapes present a data security
risk if they are lost or stolen. Encrypting the data on these
media can mitigate this problem, but presents new problems. First,
encryption is a CPU intensive process that can slow down backup
speeds. Second, once data has been encrypted, it cannot be
effectively compressed (although, since redundant data makes
cryptanalytic attacks easier, many encryption routines compress the
data as an integral part of the encryption process). Third, the
security of the encrypted backups is only as effective as the
security of the key management policy.
- Staging
- Sometimes backup jobs
are copied to a staging disk before being copied to tape. This
process is sometimes referred to as D2D2T, an acronym for Disk to
Disk to Tape. This can be useful if there is a problem matching the
speed of the final destination device with the source device as is
frequently faced in network-based backup systems. It can also serve
as a centralized location for applying other data manipulation
techniques.
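File-level de-duplication, as in the workstation example above, can be sketched with a content-addressed store. The `DedupStore` class and its methods are hypothetical names for illustration, and files are held in memory as byte strings rather than written to real backup media.

```python
import hashlib


class DedupStore:
    """Content-addressed store that keeps one copy of each distinct file body."""

    def __init__(self):
        self.chunks = {}     # SHA-256 hex digest -> file contents
        self.catalogs = {}   # machine name -> {path: digest}

    def backup(self, machine, files):
        catalog = {}
        for path, data in files.items():
            digest = hashlib.sha256(data).hexdigest()
            self.chunks.setdefault(digest, data)   # store only unseen content
            catalog[path] = digest
        self.catalogs[machine] = catalog

    def restore(self, machine):
        return {path: self.chunks[digest]
                for path, digest in self.catalogs[machine].items()}

    def stored_bytes(self):
        return sum(len(data) for data in self.chunks.values())
```

Backing up 20 machines that share the same system file stores that file's contents once, yet each machine can still be restored in full from its own catalog. The same idea applies at block level by hashing fixed-size blocks instead of whole files.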
Managing the backup process
It is important to
understand that backup is a process. As long as new data is being
created and changes are being made, backups will need to be updated.
Individuals and organizations with anything from one computer to
thousands (or even millions) of computer systems all have requirements
for protecting data. While the scale is different, the objectives and
limitations are essentially the same. Likewise, those who perform
backups need to know to what extent they were successful, regardless of
scale.
Objectives
- Recovery Point
Objective (RPO)
- The point in time that
the restarted infrastructure will reflect. Essentially, this is the
roll-back that will be experienced as a result of the recovery. The
most desirable RPO would be the point just prior to the data loss
event. Making a more recent recovery point achievable requires
increasing the frequency of synchronization between the source data
and the backup repository.
- Recovery Time
Objective (RTO)
- The amount of time
elapsed between disaster and restoration of business functions.
- Data security
- In addition to
preserving access to data for its owners, data must be protected
from unauthorized access. Backups must be performed in a manner that
does not compromise the original owner's undertaking. This can be
achieved with data encryption and proper media handling policies.
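The relationship between backup frequency and the achievable recovery point can be shown with a small calculation. The function names and the schedule below are hypothetical. The sketch simply finds the latest backup before a failure and measures the gap.

```python
from datetime import datetime


def recovery_point(backup_times, failure_time):
    """Return the most recent backup taken at or before failure_time."""
    candidates = [t for t in backup_times if t <= failure_time]
    if not candidates:
        raise ValueError("no backup precedes the failure")
    return max(candidates)


def data_loss_window(backup_times, failure_time):
    """Return how much recent work is rolled back by restoring."""
    return failure_time - recovery_point(backup_times, failure_time)


# Hypothetical schedule: nightly backups at 23:00, failure the next afternoon.
backups = [datetime(2024, 1, day, 23, 0) for day in (1, 2, 3)]
failure = datetime(2024, 1, 4, 15, 30)
print(data_loss_window(backups, failure))   # prints 16:30:00
```

With nightly backups the roll-back can approach 24 hours in the worst case. Halving the interval between backups halves that worst case.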
Limitations
- System impacts
- An effective backup
scheme will take into consideration the limitations of the
situation. All backup schemes have some impact on the system being
backed up. If this impact is significant, the backup needs to be
time-limited to a convenient backup window or alternate means of
protecting data need to be employed. These alternate means tend to
be more expensive.
- Costs of hardware,
software, labor
- All types of storage
media have a finite capacity with a real cost. Matching the correct
amount of storage capacity (over time) with the backup needs is an
important part of the design of a backup scheme. Any backup scheme
has some labor requirement, but complicated schemes have
considerably higher labor requirements. The cost of commercial
backup software can also be considerable.
- Network Bandwidth
- Distributed backup
systems can be impacted by limited network bandwidth.
Implementation
Meeting the defined
objectives in the face of the above limitations can be a difficult task.
The tools and concepts below can make that task more achievable.
- Scheduling
- Using a job scheduler
can greatly improve the reliability and consistency of backups by
removing part of the human element. Many backup software packages
include this functionality.
- Authentication
- Over the course of
regular operations, the user accounts and/or system agents that
perform the backups need to be authenticated at some level. The
power to copy all data off of or onto a system requires unrestricted
access. Using an authentication mechanism is a good way to prevent
the backup scheme from being used for unauthorized activity.
- Chain of trust
- Removable storage
media are physical items and must only be handled by trusted
individuals. Establishing a chain of trusted individuals (and
vendors) is critical to defining the security of the data.
Measuring the process
To ensure that the backup
scheme is working as expected, the process needs to include monitoring
key factors and maintaining historical data.
- Backup validation
- (also known as
"Backup Success Validation") The process by which owners
of data can get information regarding how their data was backed up.
This same process is also used to prove compliance to regulatory
bodies outside of the organization, for example, an insurance
company might be required under HIPAA to show "proof" that
their patient data are meeting records retention requirements.
Disaster, data complexity, data value and increasing dependence upon
ever-growing volumes of data all contribute to the anxiety around
and dependence upon successful backups to ensure business
continuity. For that reason, many organizations rely on third-party
or "independent" solutions to test, validate, and optimize
their backup operations (backup reporting).
- Reporting
- In larger
configurations, reports are useful for monitoring media usage,
device status, errors, vault coordination and other information
about the backup process.
- Logging
- In addition to the
history of computer generated reports, activity and change logs are
useful for monitoring backup system events.
- Validation
- Many backup programs
make use of checksums or hashes to validate that the data was
accurately copied. These offer several advantages. First, they allow
data integrity to be verified without reference to the original
file: if the file as stored on the backup medium has the same
checksum as the saved value, then it is very probably correct.
Second, some backup programs can use checksums to avoid making
redundant copies of files, to improve backup speed. This is
particularly useful for the de-duplication process.
- Monitored Backup
- Backup processes are
monitored by a third party monitoring center. This center alerts
users to any errors that occur during automated backups. Monitored
backup requires software capable of pinging the monitoring center's
servers in the case of errors.
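Checksum-based validation, as described above, can be sketched with Python's standard `hashlib` module. The helper names `file_checksum` and `verify_backup` are hypothetical, and SHA-256 is an assumption here, since backup programs use a variety of checksum and hash algorithms.

```python
import hashlib


def file_checksum(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_backup(path, saved_checksum):
    """Check a stored backup file against the checksum saved at backup time."""
    return file_checksum(path) == saved_checksum
```

The checksum is recorded when the backup is written. Later, `verify_backup` can confirm the stored copy without access to the original file.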
Lore
Advice
- The more important the
data that is stored on the computer the greater the need is for
backing up this data.
- A backup is only as
useful as its associated restore strategy.
- Storing the copy near
the original is unwise, since many disasters such as fire, flood and
electrical surges are likely to cause damage to the backup at the
same time.
- Automated backup and
scheduling should be considered, as manual backups can be affected
by human error.
- Backups will fail for
a wide variety of reasons. A verification or monitoring strategy is
an important part of a successful backup plan.
Glossary of backup terms
- Backup policy
- An organisation's
procedures and rules for ensuring that adequate amounts and types of
backups are made, including suitably frequent testing of the process
for restoring the original production system from the backup copies.
- Backup rotation
scheme
- A method for
effectively backing up data where multiple media are systematically
moved from storage to usage in the backup process and back to
storage. There are several different schemes. Each takes a different
approach to balance the need for a long retention period with
frequently backing up changes. Some schemes are more complicated
than others.
- Backup site
- A place where business
can continue after a data loss event. Such a site may have ready
access to the backups or possibly even a continuously updated
mirror.
- Backup software
- Computer software
applications that are used for performing the backing up of data,
i.e., the systematic generation of backup copies.
- Backup window
- The period of time
that a system is available to perform a backup procedure. Backup
procedures can have detrimental effects on system and network
performance, sometimes requiring the primary use of the system to be
suspended. These effects can be mitigated by arranging a backup
window with the users or owners of the system(s).
- Copy backup
- Term for full backup
used by Windows Server 2003.
- Cumulative
incremental backup
- Term for a
differential backup used by NetBackup.
- Daily backup
- Term for incremental
backup used by Windows Server 2003.
- Data salvage
- The process of
recovering data from storage devices when the normal operational
methods are impossible. This process is typically performed by
specialists in controlled environments with special tools. For
example, a crashed hard disk may still have data on it even though
it doesn't work properly. A data salvage specialist might be able to
recover much of the original data by opening it up in a clean room
and tinkering with the internal parts.
- Differential backup
- A cumulative backup of
all changes made since the last full backup. The advantage to this
is the quicker recovery time, requiring only a full backup and the
latest differential backup to restore the system. The disadvantage
is that for each day elapsed since the last full backup, more data
needs to be backed up, especially if a majority of the data has been
changed.
- Differential
incremental backup
- Term for an
incremental backup used by NetBackup.
- Disaster recovery
- The process of
recovering after a business disaster and restoring or recreating
data. One of the main purposes of creating backups is to facilitate
a successful disaster recovery. For maximum effectiveness, this
process should be planned in advance and audited.
- Disk image
- A method of backing up
a whole disk or filesystem in a single image. Since the underlying
data structures are what is actually backed up, this method does not
allow for file level control over what is selected for backup or
restore.
- FlashBackup
- Term for raw
partition backup used by NetBackup Advanced Client. In NBAC,
support is limited to the VxFS (Veritas), ufs (Solaris), Online JFS
(HP-UX), and NTFS (Windows) filesystem types. Similar to the UNIX
utility dump.
- Full backup
- A backup of all
(selected) files on the system. In contrast to a drive image, this
does not include the file allocation tables, partition structure
and boot sectors.
- Hot backup
- A backup of a database
that is still running, and so changes may be made to the data while
it is being backed up. Some database engines keep a record of all
entries changed, including the complete new value. This can be used
to resolve changes made during the backup.
- Incremental backup
- A backup that only
contains the files that have changed since the most recent backup
(either full or incremental). The advantage of this is quicker
backup times, as only changed files need to be saved. The
disadvantage is longer recovery times, as the latest full backup,
and all incremental backups up to the date of data loss need to be
restored.
- Media spanning
- Sometimes a backup job
is larger than a single destination storage medium. In this case,
the job must be broken up into fragments that can be distributed
across multiple storage media.
- Multiplexing
- The practice of
combining multiple backup data streams into a single stream that can
be written to a single storage device. For example, backing up four
PCs to a single tape drive at once.
- Multistreaming
- The practice of
creating multiple backup data streams from a single system to
multiple storage devices. For example, backing up a single database
to four tape drives at once.
- Normal backup
- Term for full backup
used by Windows Server 2003.
- Near store
- Provisionally backing
up data to a local staging backup device, possibly for later
archival backup to a remote store device.
- Open file backup
- Term for the ability
to back up a file while it is in use by another application.
- Remote store
- Backing up data to an
offsite permanent backup facility, either directly from the live
data source or else from an intermediate near store device.
- Restore time
- The amount of time
required to bring a desired data set back from the backup media.
- Retention time
- The amount of time in
which a given set of data will remain available for restore. Some
backup products rely on daily copies of data and measure retention
in terms of days. Others retain a number of copies of data changes
regardless of the amount of time.
- Site-to-site backup
- Backup, over the
internet, to an offsite location under the user's control. Similar
to remote backup except that the owner of the data maintains control
of the storage location.
- Synthetic backup
- Term used by NetBackup
for a restorable backup image that is synthesized on the backup
server from a previous full backup and all the incremental backups
since then. It is equivalent to what a full backup would be if it
were taken at the time of the last incremental backup.
- Tape library
- A storage device which
contains tape drives, slots to hold tape cartridges, a barcode
reader to identify tape cartridges and an automated method for
physically moving tapes within the device. These devices can store
immense amounts of data.
- True image restore
- Term used by NetBackup
for the collection of file deletion and file movement records so
that an accurate restore can be performed. For instance, consider a
system that has a directory with 5 documents in it on Friday. On
Saturday, the system gets a full backup that includes those 5
documents. On Monday, the owner of those documents deletes 2 of them
and updates 1 of the 3 remaining. That updated document gets backed
up as part of the Monday night incremental backup. On Tuesday
afternoon the system crashes. If we perform a normal restore of the
full backup from Saturday and the incremental backup from Monday to
the fresh system, we will have restored the 2 documents that were
intentionally deleted. True image restore keeps track of the
deletions with each incremental backup and prevents the deleted
files from being inappropriately restored.
- Virtual Tape
Library (VTL)
- A storage device that
appears to be a tape library to backup software, but actually stores
data by some other means. A VTL can be configured as a temporary
storage location before data is actually sent to real tapes or it
can be the final storage location itself.
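The true image restore entry above can be worked through in code. This is a sketch of the idea only, not NetBackup's implementation: each backup is modeled as a hypothetical dictionary holding the files it saved (`"files"`) and the names of all files present when it ran (`"listing"`).

```python
def naive_restore(full, incrementals):
    """Overlay each incremental's saved files on the full backup."""
    restored = dict(full["files"])
    for inc in incrementals:
        restored.update(inc["files"])
    return restored


def true_image_restore(full, incrementals):
    """Like naive_restore, but keep only files present at the last backup."""
    restored = naive_restore(full, incrementals)
    listing = (incrementals[-1] if incrementals else full)["listing"]
    return {name: data for name, data in restored.items() if name in listing}


# Saturday: full backup of five documents.
saturday = {"files": {f"doc{i}": "v1" for i in range(1, 6)},
            "listing": {f"doc{i}" for i in range(1, 6)}}
# Monday: two documents deleted, one updated; only the update is saved.
monday = {"files": {"doc3": "v2"},
          "listing": {"doc3", "doc4", "doc5"}}
```

Here `naive_restore(saturday, [monday])` brings back all five documents, including the two that were deliberately deleted, while `true_image_restore(saturday, [monday])` returns only the three that existed on Monday night.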