Bacula – copying a job
Don’t be intimidated by Bacula. Although it is, according to SourceForge statistics, by far the most downloaded Open Source backup software, it can handle backing up a single machine just as easily as your entire enterprise. I’ve been using Bacula for nearly 7 years now. I was in the middle of preparing to deploy another backup package (Amanda) when someone, I forget who, mentioned Bacula to me. I liked it immediately. I liked it so much that I abandoned my plans and deployed Bacula instead. Soon after that, I wrote the PostgreSQL module for Bacula.

Bacula stores backup information (what was backed up, from where, and when) in a Catalog. This Catalog is stored in a database. At present, the Catalog can be SQLite (not recommended), MySQL (also not recommended, but if you have to use it, it’s better than SQLite), or PostgreSQL. It is relatively simple to create catalog support for a new database.
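For reference, the Catalog connection lives in a Catalog resource in bacula-dir.conf. The following is only a minimal sketch; the database name, user, and password shown are placeholders, not the values from my installation:

Catalog {
  Name = MyCatalog
  dbname = "bacula"        # placeholder database name
  dbuser = "bacula"        # placeholder database user
  dbpassword = ""          # set this to whatever your database requires
}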
Back to the point: I’ve been backing up to an 11TB disk cluster since March 2010. I’ve long wanted to copy those jobs to tape. I bought a 10-slot tape library back in 2006, but I’ve never really used it with Bacula, although I was backing up to DLT prior to moving to a disk-based solution. I’ve done some recent work to get my Bacula configuration to do Copy Jobs.

In Bacula, you can either Migrate (move) or Copy an existing backup job. The main restriction is that you cannot Copy/Migrate Jobs between Storage Daemons; it must all be done within the same Storage Daemon. In my case, I am copying a job from disk to tape. Therefore, the Storage Daemon must have access to both the HDD Pool and the Tape Pool.
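For what it is worth, the disk and tape storage are simply two Storage resources in bacula-dir.conf pointing at the same Storage Daemon. The sketch below only illustrates that idea; the Address, Password, Device, and Media Type values are placeholders, and only the resource names match the ones used later in this article:

Storage {
  Name = MegaFile                       # the disk storage used by my file-based pools
  Address = sd.example.org              # placeholder; the important part is that both
  SDPort = 9103                         # resources point at the same Storage Daemon
  Password = "sd-password"              # placeholder
  Device = MegaFileStorage              # must match a Device resource in bacula-sd.conf
  Media Type = File
}

Storage {
  Name = DigitalTapeLibrary             # the tape library, on the same Storage Daemon
  Address = sd.example.org              # same Storage Daemon as above
  SDPort = 9103
  Password = "sd-password"              # placeholder
  Device = DLT-Changer                  # placeholder autochanger device name
  Media Type = DLT
  Autochanger = yes
}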
Now, on to the Copy Job details.
Migration & Copy
The following very brief summary is taken from the Bacula Concepts documentation.
Bacula allows you to Copy or Migrate an existing Job. With a Copy Job, you wind up with two copies of the backup.
With a Migrate Job, you end up with only one backup job, just in a different location.
I will be doing a Copy Job. You should be aware of one issue with Copy Jobs: just because you have another
backup, this does not mean you will be able to restore from it. Keep in mind that Bacula will see the
original Job first, and it will restore from that Job. It is not until the original Job is no longer
available (e.g. its Retention has expired and the Job has later been Purged) that Bacula will even consider
restoring from the copy. Normally this is not a problem, but do not overlook this point when making plans.
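As an aside, you can check which copies exist for a given Job from bconsole with the list copies command; for example (the JobId here is simply the test job I mention later in this article):

*list copies jobid=40164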
NOTE: Given that Copy Job and Migrate Job have so much in common, I will refer only to Copy Job, since that is what I
will be implementing.
What to copy?
To Copy a Job in Bacula, you create a Job with Type = Copy. It is
important to realize that this Job will run and create a new Job. Thus, your
job history will show two new entries. It is also possible for a Copy Job
to select multiple jobs for copying. Use this with caution, as it does not
scale well, even with concurrency. Copies are done on a Job by Job basis.
I think the most flexible and also most misunderstood feature of Copy Jobs is the selection of what to copy.
This is the Selection Type directive, which identifies the type of
criteria you wish to use to select Jobs for copying.
While writing this, I thought I’d need to use Selection Type = SQLQuery. Now I think I’ll use
PoolUncopiedJobs, but I will need to make some configuration adjustments for that first.
For testing, I was using this Selection Pattern in conjunction with Selection Type = SQLQuery:
SELECT jobid, name, poolid, starttime, jobbytes FROM job WHERE jobid = 40164 ORDER BY jobid
The query must return a list of Job IDs, which must be in the first field of the result set. All other
fields are ignored. I returned the other fields only for my own debugging purposes.
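Putting that together, a Copy Job using SQLQuery would look roughly like the sketch below. The Job name is made up for illustration, the JobDefs is the one shown later in this article, and the Selection Pattern is simply the test query above trimmed to the one required field:

Job {
  Name = "CopyToTapeBySQL"              # hypothetical name, for illustration only
  Type = Copy
  JobDefs = "DefaultJobTape"
  Selection Type = SQLQuery
  Selection Pattern = "SELECT jobid FROM job WHERE jobid = 40164 ORDER BY jobid"
}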
Before I wrote this, I thought I’d have to do a query similar to this (do not use, untested!):
SELECT jobid, name, poolid, starttime, jobbytes FROM job WHERE endtime > '2010-11-28 00:00:00' ORDER BY jobid
I’m quite sure this won’t work as it will pick up both the jobs to be copied, and the copied jobs.
However, if I use PoolUncopiedJobs, I’m concerned that ALL jobs in my Pool will be copied to
tape. I don’t want that. I only want to duplicate all future jobs over to tape. Thus, I think my solution
is to create a new Pool, specifically for copying to tape.
A better SQL-based solution would be the sql_jobids_of_pool_uncopied_jobs query found in src/dird/migrate.c:
SELECT DISTINCT Job.JobId, Job.StartTime
  FROM Job, Pool
 WHERE Pool.Name = '%s'
   AND Pool.PoolId = Job.PoolId
   AND Job.Type = 'B'
   AND Job.JobStatus IN ('T','W')
   AND Job.jobBytes > 0
   AND Job.JobId NOT IN
       (SELECT PriorJobId
          FROM Job
         WHERE Type IN ('B','C')
           AND Job.JobStatus IN ('T','W')
           AND PriorJobId != 0)
 ORDER BY Job.StartTime
However, should the underlying SQL structures ever change (as happens), your query will break.
Another feasible solution would be to create a new Pool and put all my disk-based backups there.
The existing Pool looks like this:
Pool {
  Name = MegaFilePool
  Pool Type = Backup
  Recycle = yes
  AutoPrune = yes
  Volume Retention = 3 years
  Next Pool = Default
  Maximum Volume Bytes = 5G
  LabelFormat = "FileAuto-"
}
I could just duplicate the above Pool, give it a different name, and alter my JobDefs to refer to this new pool.
Instead I created three new pools and decided to use the Selection Type = PoolUncopiedJobs
option. This later proved to have a few issues when run in conjunction with
Allow Duplicate Jobs = no.
The new pools
The new pools are:
Pool {
  Name = FullFile
  Pool Type = Backup
  Recycle = yes
  AutoPrune = yes
  Volume Retention = 3 years
  Storage = MegaFile
  Next Pool = Fulls
  Maximum Volume Bytes = 5G
  LabelFormat = "FullAuto-"
}

Pool {
  Name = DiffFile
  Pool Type = Backup
  Recycle = yes
  AutoPrune = yes
  Volume Retention = 6 weeks
  Storage = MegaFile
  Next Pool = Differentials
  Maximum Volume Bytes = 5G
  LabelFormat = "DiffAuto-"
}

Pool {
  Name = IncrFile
  Pool Type = Backup
  Recycle = yes
  AutoPrune = yes
  Volume Retention = 3 weeks
  Storage = MegaFile
  Next Pool = Incrementals
  Maximum Volume Bytes = 5G
  LabelFormat = "IncrAuto-"
}
The above are the destinations for disk-based backups. My main JobDefs for disk-based backups contains
the following:
Pool = FullFile                      # required parameter for all Jobs
Full Backup Pool = FullFile
Differential Backup Pool = DiffFile
Incremental Backup Pool = IncrFile
And the JobDefs for the Copy To Tape jobs contains this:
Pool = FullFile                      # required parameter for all Jobs
#
# since this JobDef is meant to be used with a Copy Job
# these Pools are the source for the Copy... not the destination.
# The Destination is determined by the Next Pool directive in
# the respective Pools.
#
Full Backup Pool = FullFile
Differential Backup Pool = DiffFile
Incremental Backup Pool = IncrFile
You might wonder why they are identical. The comments explain why: in a
Type = Copy job, these pools identify the source Pools. Each pool definition contains
a Next Pool directive, which is the destination for the Copy.

The full JobDefs and Job for copying from disk to tape appear here:
JobDefs {
  Name = "DefaultJobTape"
  Type = Backup
  Level = Incremental
  Client = polo-fd
  FileSet = "Full Set"
  Schedule = "WeeklyCycleForCopyingToTape"
  Storage = DigitalTapeLibrary
  Messages = Standard
  Pool = FullFile                    # required parameter for all Jobs
  #
  # since this JobDef is meant to be used with a Copy Job
  # these Pools are the source for the Copy... not the destination.
  # The Destination is determined by the Next Pool directive in
  # the respective Pools.
  #
  Full Backup Pool = FullFile
  Differential Backup Pool = DiffFile
  Incremental Backup Pool = IncrFile
  Priority = 400

  # don't spool data when backing up to tape from local disk
  Spool Data = no
  Spool Attributes = yes

  RunAfterJob = "/home/dan/bin/dlt-stats-kraken"

  # no sense spooling local data
  Spool Data = no
  Spool Attributes = yes

  Maximum Concurrent Jobs = 6
}

Job {
  Name = "CopyToTape"
  Type = Copy
  JobDefs = "DefaultJobTape"
  Selection Type = PoolUncopiedJobs
}
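You do not have to wait for the schedule while testing; a Copy Job can be kicked off by hand from bconsole, e.g. (the yes keyword skips the confirmation prompt):

*run job=CopyToTape yes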
Start with an empty pool?
My initial concern with PoolUncopiedJobs was that the existing source pool was not empty.
It contained thousands of jobs. I did not want to copy them all to tape; I only wanted to copy
new jobs to tape. I had two choices:
- modify the database to make it look like the jobs had been copied over (i.e. SQL query)
- move the volumes already in the pool to a new pool
Ultimately, I started with a new empty pool (as demonstrated above by the pool definitions).
Be aware that if you do not start with an empty pool, your initial Copy job will have a lot of
work to do. Consider whether or not you want to copy everything over, etc.
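If you take the second option instead and move the existing volumes into another pool, bconsole’s update command can reassign a volume; something along these lines (the volume and destination pool names below are examples only):

*update volume=FileAuto-0001 pool=ArchiveFile

The interactive form of update (option 11, Pool, in the menu shown later in this article) does the same thing one volume at a time.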
My actions were:
- create a new pool
- leave existing backups where they were
- alter jobs to back up to the new pool
- create a copy job to take jobs from that new pool
Updating the pool parameters
When you update the pool definition in bacula-dir.conf, you should be aware that this has no
effect on existing volumes. For example, if you wish to alter the retention period for the volumes
in a pool, you need to issue the update command, like this:
*update
Update choice:
     1: Volume parameters
     2: Pool from resource
     3: Slots from autochanger
     4: Long term statistics
Choose catalog item to update (1-4): 1
Parameters to modify:
     1: Volume Status
     2: Volume Retention Period
     3: Volume Use Duration
     4: Maximum Volume Jobs
     5: Maximum Volume Files
     6: Maximum Volume Bytes
     7: Recycle Flag
     8: Slot
     9: InChanger Flag
    10: Volume Files
    11: Pool
    12: Volume from Pool
    13: All Volumes from Pool
    14: All Volumes from all Pools
    15: Enabled
    16: RecyclePool
    17: Action On Purge
    18: Done
Select parameter to modify (1-18): 14
All Volume defaults updated from "Default" Pool record.
All Volume defaults updated from "FullsFile" Pool record.
All Volume defaults updated from "FullFile" Pool record.
Error updating Volume records: ERR=sql_update.c:443 Update failed: affected_rows=0 for UPDATE Media SET ActionOnPurge=0, Recycle=1,VolRetention=3628800,VolUseDuration=0, MaxVolJobs=0,MaxVolFiles=0,MaxVolBytes=5368709120,RecyclePoolId=0 WHERE PoolId=23
All Volume defaults updated from "IncrFile" Pool record.
All Volume defaults updated from "Fulls" Pool record.
All Volume defaults updated from "Differentials" Pool record.
All Volume defaults updated from "Incrementals" Pool record.
All Volume defaults updated from "FullBackupsFile" Pool record.
All Volume defaults updated from "FilePool" Pool record.
All Volume defaults updated from "MegaFilePool" Pool record.
*
I did the above after deciding upon new retention periods.
Retention periods
One thing to keep in mind is Job Retention and File Retention. These are defined in the Client resource.
I have decided to keep these values very high.
File Retention = 3 years
Job Retention = 3 years
I am then able to have different retention periods on my pools and not worry about losing the Job or File records
from the Catalog. So long as the File and Job Retention are greater than the Volume Retention, this plan will work.
Do not try to save disk space by putting File|Job Retention less than Volume Retention. Once File Retention has
expired and the File records have been pruned from the catalog, you will not be able to restore an individual file
or set of files. You will only be able to restore a whole job. Similarly, for Job retention; once that data has been
pruned from the catalog, you will not even be able to restore those jobs. You will have to resort to bextract or
bscan. That is not something you want to do. Save time, stress, and energy. Use that disk space. Keep all of your
retention values the same.
When you *do* need to restore, you don’t want to be working with bextract/bscan. You just want the data restored.
Is the saving in disk space worth that? I think not.
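For the record, this is roughly what that fallback looks like: bextract reads a Volume directly through the Storage Daemon’s device definition and writes the files out to a directory. The device and volume names below are examples, not from my setup:

bextract -V FullAuto-0001 MegaFileStorage /tmp/restore

Add -b with a bootstrap (.bsr) file, if you still have one, to limit what gets extracted. Compare that with a normal restore from the catalog, and keeping the records around starts to look cheap.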
Duplicate Job control
I use duplicate job control to avoid queuing the same job twice. The directives I use are:
Allow Higher Duplicates = no
Allow Duplicate Jobs = no
Cancel Queued Duplicates = yes
These directives occur on the original backup jobs, not on the copy job that I run. However, when the copying is done,
those directives from the original jobs are used. This means I cannot copy more than one job at a time. If there is
more than one job waiting to be copied from the pool, you’re going to see something like this:
16-Dec 14:30 bacula-dir JobId 42282: The following 4 JobIds were chosen to be copied: 41741,41786,42183,42234
16-Dec 14:30 bacula-dir JobId 42283: The following 1 JobId was chosen to be copied: 41741
16-Dec 14:30 bacula-dir JobId 42283: Copying using JobId=41741 Job=nyi_maildir_tarball.2010-12-13_05.55.01_48
16-Dec 14:30 bacula-dir JobId 42283: Bootstrap records written to /home/bacula/working/bacula-dir.restore.660.bsr
16-Dec 14:30 bacula-dir JobId 42282: Job queued. JobId=42283
16-Dec 14:30 bacula-dir JobId 42282: Copying JobId 42283 started.
16-Dec 14:30 bacula-dir JobId 42285: The following 1 JobId was chosen to be copied: 41786
16-Dec 14:30 bacula-dir JobId 42285: Copying using JobId=41786 Job=BackupCatalog.2010-12-13_08.15.00_33
16-Dec 14:30 bacula-dir JobId 42285: Bootstrap records written to /home/bacula/working/bacula-dir.restore.661.bsr
16-Dec 14:30 bacula-dir JobId 42282: Job queued. JobId=42285
16-Dec 14:30 bacula-dir JobId 42282: Copying JobId 42285 started.
16-Dec 14:30 bacula-dir JobId 42287: The following 1 JobId was chosen to be copied: 42183
16-Dec 14:30 bacula-dir JobId 42287: Copying using JobId=42183 Job=nyi_maildir_tarball.2010-12-16_05.55.01_27
16-Dec 14:30 bacula-dir JobId 42287: Bootstrap records written to /home/bacula/working/bacula-dir.restore.662.bsr
16-Dec 14:30 bacula-dir JobId 42288: Cancelling duplicate JobId=42284.
16-Dec 14:30 bacula-dir JobId 42288: JobId 42284, Job nyi_maildir_tarball.2010-12-16_14.30.46_10 marked to be canceled.
16-Dec 14:30 bacula-dir JobId 42282: Job queued. JobId=42287
16-Dec 14:30 bacula-dir JobId 42282: Copying JobId 42287 started.
16-Dec 14:30 bacula-dir JobId 42282: Copying using JobId=42234 Job=BackupCatalog.2010-12-16_08.15.00_18
16-Dec 14:30 bacula-dir JobId 42282: Bootstrap records written to /home/bacula/working/bacula-dir.restore.663.bsr
16-Dec 14:30 bacula-dir JobId 42289: Cancelling duplicate JobId=42286.
16-Dec 14:30 bacula-dir JobId 42289: JobId 42286, Job BackupCatalog.2010-12-16_14.30.46_12 marked to be canceled.
Note the "Cancelling duplicate" lines near the end of that output; those jobs will not get copied. You can handle this
by running the Copy Job repeatedly, but I find that to be less than ideal. It is good enough for now, but perhaps later
I will look into a better solution for Copy Jobs. The ideas I have now include:
- not using duplicate job directives and instead using directives such as Max Start Delay to control duplicate jobs (see the sketch after this list)
- modifying Bacula to not observe duplicate job directives on Copy Jobs
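Here is a rough sketch of the first idea. Max Start Delay cancels a job that has not started within the given interval after its scheduled time, which would limit a backlog of identical queued jobs without the duplicate-job directives. The resource name and the 6 hour value below are only examples; I have not tested this:

JobDefs {
  Name = "DefaultJobWithStartDelay"     # hypothetical name, for illustration only
  Type = Backup
  Level = Incremental
  Client = polo-fd
  FileSet = "Full Set"
  Schedule = "WeeklyCycle"
  Storage = MegaFile
  Messages = Standard
  Pool = FullFile
  # instead of Allow Duplicate Jobs = no / Cancel Queued Duplicates = yes,
  # cancel the job if it has not started within 6 hours of being scheduled
  Max Start Delay = 6 hours
}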
The configuration items
I hope the above snippets are helpful; however, there is no substitute for practical examples. This last section is a straight copy from my
existing configuration files. Hopefully it will help you get started with Copy Jobs. Some of this also appears above, but it is included below
as a complete example.
FileSet {
  Name = "basic backup"
  Include {
    Options {
      signature=MD5
    }
    File = /boot
    File = /usr/src/sys/i386/conf
    File = /etc
    File = /usr/local/etc
    File = /usr/local/info
    File = /usr/local/libexec/nagios/
    File = /usr/local/var
    File = /root
    File = /var/db/ports
    File = /var/log
    File = /var/cron
  }
}

#
# Disk based pools
#
Pool {
  Name = FullFile
  Pool Type = Backup
  Recycle = yes
  AutoPrune = yes
  Volume Retention = 3 years
  Storage = MegaFile
  Next Pool = Fulls
  Maximum Volume Bytes = 5G
  LabelFormat = "FullAuto-"
}

Pool {
  Name = DiffFile
  Pool Type = Backup
  Recycle = yes
  AutoPrune = yes
  Volume Retention = 6 weeks
  Storage = MegaFile
  Next Pool = Differentials
  Maximum Volume Bytes = 5G
  LabelFormat = "DiffAuto-"
}

Pool {
  Name = IncrFile
  Pool Type = Backup
  Recycle = yes
  AutoPrune = yes
  Volume Retention = 3 weeks
  Storage = MegaFile
  Next Pool = Incrementals
  Maximum Volume Bytes = 5G
  LabelFormat = "IncrAuto-"
}

#
# Tape pools
#
Pool {
  Name = Fulls
  Pool Type = Backup
  Recycle = yes
  AutoPrune = yes
  Volume Retention = 3 years
  Storage = DigitalTapeLibrary
}

Pool {
  Name = Differentials
  Pool Type = Backup
  Recycle = yes
  AutoPrune = yes
  Volume Retention = 2 months
  Storage = DigitalTapeLibrary
}

Pool {
  Name = Incrementals
  Pool Type = Backup
  Recycle = yes
  AutoPrune = yes
  Volume Retention = 4 weeks
  Storage = DigitalTapeLibrary
}

Schedule {
  Name = "WeeklyCycle"
  Run = Level=Full 1st sun at 5:55
  Run = Level=Differential 2nd-5th sun at 5:55
  Run = Level=Incremental mon-sat at 5:55
}

Schedule {
  Name = "WeeklyCycleForCopyingToTape"
  Run = Level=Full 1st sun at 6:15 8:15 10:15
  Run = Level=Differential 2nd-5th sun at 6:15 8:15 10:15
  Run = Level=Incremental mon-sat at 6:15 8:15 10:15
}

JobDefs {
  Name = "DefaultJob"
  Type = Backup
  Level = Incremental
  Client = polo-fd
  FileSet = "Full Set"
  Schedule = "WeeklyCycle"
  Storage = MegaFile
  Messages = Standard
  Pool = FullFile                    # required parameter for all Jobs
  Full Backup Pool = FullFile
  Differential Backup Pool = DiffFile
  Incremental Backup Pool = IncrFile
  Priority = 10

  # don't spool data when backing up to disk
  Spool Data = no
  Spool Attributes = yes

  Allow Higher Duplicates = no
  Allow Duplicate Jobs = no
  Cancel Queued Duplicates = yes
}

Client {
  Name = polo-fd
  Address = polo.unixathome.org
  Catalog = MyCatalog
  Password = "Gu9VX10CN7qZFbx26mZMd0T6FlEk8Z3g7Sle+SCBBmfj"
  File Retention = 3 years
  Job Retention = 3 years
}

Job {
  Name = "polo basic"
  JobDefs = "DefaultJob"
  Client = polo-fd
  FileSet = "basic backup"
  Write Bootstrap = "/home/bacula/working/polo-fd-basic.bsr"
}

JobDefs {
  Name = "DefaultJobTape"
  Type = Backup
  Level = Incremental
  Client = polo-fd
  FileSet = "Full Set"
  Schedule = "WeeklyCycleForCopyingToTape"
  Storage = DigitalTapeLibrary
  Messages = Standard
  Pool = FullFile                    # required parameter for all Jobs
  #
  # since this JobDef is meant to be used with a Copy Job
  # these Pools are the source for the Copy... not the destination.
  # The Destination is determined by the Next Pool directive in
  # the respective Pools.
  #
  Full Backup Pool = FullFile
  Differential Backup Pool = DiffFile
  Incremental Backup Pool = IncrFile
  Priority = 400

  # don't spool data when backing up to tape from local disk
  Spool Data = no
  Spool Attributes = yes

  RunAfterJob = "/home/dan/bin/dlt-stats-kraken"

  # no sense spooling local data
  Spool Data = no
  Spool Attributes = yes

  Maximum Concurrent Jobs = 6
}

Job {
  Name = "CopyToTape"
  Type = Copy
  JobDefs = "DefaultJobTape"
  Selection Type = PoolUncopiedJobs
}
Enjoy.