Bacula – copying a job

Don’t be intimidated by Bacula. It can handle backing up your single machine, or your entire enterprise. According to
SourceForge statistics, Bacula is by far the most downloaded Open Source backup software. I’ve been using Bacula for
nearly 7 years now. I was in the middle of preparing to deploy another backup package (Amanda) when someone, I forget
who, mentioned Bacula to me. I liked it immediately. I liked it so much that I abandoned my plans and deployed Bacula
instead. Soon after that, I wrote the PostgreSQL module for Bacula.

Bacula stores backup information (what was backed up, from where, and when) in a Catalog. This Catalog is stored in a
database. At present, the Catalog can be SQLite (not recommended), MySQL (also not recommended, but if you have to
use it, it’s better than SQLite), or PostgreSQL. It is relatively simple to create catalog support for a new database.

Back to the point; I’ve been backing up to an 11TB disk cluster since March 2010. I’ve long wanted to
copy those jobs to tape. I bought a 10-slot tape library back in 2006, but I’ve never really used it with Bacula,
although I was backing up to DLT prior to moving to a disk-based solution. I’ve done some recent work to get my
Bacula configuration to do Copy Jobs. In Bacula, you can either Migrate (move) or Copy an existing backup job.
The main restriction is that you cannot Copy/Migrate Jobs between Storage Daemons; it must all be done within the
same Storage Daemon. In my case, I am copying a job from disk to tape, so the Storage Daemon must have access to
both the HDD Pool and the Tape Pool.
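
To give a rough idea of what that implies, the one Storage Daemon needs Device resources for both the disk storage
and the tape drive. Here is a minimal sketch of what that might look like in bacula-sd.conf; the device names, paths,
and Media Types are illustrative guesses, not my actual configuration:

Device {
  Name           = MegaFileDevice       # disk-based storage (illustrative name)
  Media Type     = File
  Archive Device = /storage/bacula      # hypothetical path
  LabelMedia     = yes
  Random Access  = yes
  AutomaticMount = yes
  RemovableMedia = no
}

Device {
  Name           = DLT-Drive            # tape drive (illustrative name)
  Media Type     = DLT
  Archive Device = /dev/nsa0            # hypothetical device node
  AlwaysOpen     = yes
  AutomaticMount = yes
  RemovableMedia = yes
}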

Now, on to the Copy Job details.

Migration & Copy

The following very brief summary is taken from the Bacula Concepts documentation.
Bacula allows you to Copy or Migrate an existing Job. With a Copy Job, you wind up with two copies of the backup.
With a Migrate Job, you end up with only one backup job, just in a different location.
I will be doing a Copy Job. You should be aware of one issue with Copy Jobs: just because you have another copy of
the backup does not mean you will be able to restore from it at will. Keep in mind that Bacula will see the
original Job first, and it will restore from that Job. Not until the original Job is no longer
available (e.g. its Retention has expired, and the Job has later been Purged) will Bacula even consider
restoring from the copy. Normally this is not a problem, but do not overlook this point when making plans.

NOTE: Given that Copy Job and Migrate Job have so much in common, I will refer only to Copy Job, since that is what I
will be implementing.

What to copy?

To Copy a Job in Bacula, you create a Job with Type = Copy. It is
important to realize that this Job will run and create a new Job; thus, your
job history will show two new entries. It is also possible for a Copy Job
to select multiple jobs for copying. Use this with caution, as it does not
scale well, even with concurrency: copies are done on a Job-by-Job basis.
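
For orientation, here is the minimal shape of the Copy Job I ended up with; it appears again, together with
its JobDefs, in the complete configuration at the end of this post:

Job {
  Name     = "CopyToTape"
  Type     = Copy
  JobDefs  = "DefaultJobTape"

  Selection Type = PoolUncopiedJobs
}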

I think the most flexible, and also the most misunderstood, feature of Copy Jobs is the selection of what to copy.
This is the Selection Type directive, which identifies the type of
criteria you wish to use to select Jobs for copying.

While writing this, I thought I’d need to use Selection Type = SQLQuery. Now I think I’ll use
PoolUncopiedJobs, but that will require some configuration adjustments first.

For testing, I was using this Selection Pattern in conjunction with Selection Type = SQLQuery:

  SELECT jobid, name, poolid, starttime, jobbytes
    FROM job
   WHERE jobid = 40164
ORDER BY jobid

The query must return a list of Job IDs, which must be in the first field of the result set. All other
fields are ignored. I returned the other fields only for my own debugging purposes.
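
For reference, this is roughly how that Selection Pattern plugs into the Copy Job when using
Selection Type = SQLQuery. This is a sketch of my test setup, not the configuration I kept; the Job name
is made up:

Job {
  Name     = "CopyToTape-SQLTest"     # hypothetical test job name
  Type     = Copy
  JobDefs  = "DefaultJobTape"

  Selection Type    = SQLQuery
  Selection Pattern = "SELECT jobid FROM job WHERE jobid = 40164 ORDER BY jobid"
}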

Before I wrote this, I thought I’d have to do a query similar to this (do not use, untested!):

  SELECT jobid, name, poolid, starttime, jobbytes
    FROM job
   WHERE endtime > '2010-11-28 00:00:00' 
ORDER BY jobid

I’m quite sure this won’t work, as it will pick up both the jobs to be copied and the jobs that are themselves
copies. However, if I use PoolUncopiedJobs, I’m concerned that ALL existing jobs in my Pool will be copied to
tape. I don’t want that; I only want to duplicate future jobs over to tape. Thus, I think my solution
is to create a new Pool, specifically for copying to tape.

A better SQL-based solution would be the sql_jobids_of_pool_uncopied_jobs query found in src/dird/migrate.c:

  SELECT DISTINCT Job.JobId,Job.StartTime FROM Job,Pool
   WHERE Pool.Name = '%s' AND Pool.PoolId = Job.PoolId
     AND Job.Type = 'B' AND Job.JobStatus IN ('T','W')
     AND Job.jobBytes > 0
     AND Job.JobId NOT IN
         (SELECT PriorJobId 
            FROM Job
           WHERE Type IN ('B','C')
             AND Job.JobStatus IN ('T','W')
             AND PriorJobId != 0)
ORDER BY Job.StartTime

However, should the underlying SQL structures ever change (as happens), your query will break.
Another feasible solution would involve creating a new Pool and putting all my disk-based backups there.

The existing Pool looks like this:

Pool {
  Name                 = MegaFilePool
  Pool Type            = Backup
  Recycle              = yes
  AutoPrune            = yes
  Volume Retention     = 3 years
  Next Pool            = Default
  Maximum Volume Bytes = 5G
  LabelFormat          = "FileAuto-"
}

I could just duplicate the above Pool, give it a different name, and alter my JobDefs to refer to this new pool.
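
That simpler approach might have looked something like this (a sketch only, which I did not use; note that
Next Pool would have to point at a tape pool, and the pool name and label format here are made up):

Pool {
  Name                 = CopyFilePool      # hypothetical name
  Pool Type            = Backup
  Recycle              = yes
  AutoPrune            = yes
  Volume Retention     = 3 years
  Next Pool            = Fulls             # destination for the Copy (tape)
  Maximum Volume Bytes = 5G
  LabelFormat          = "CopyAuto-"       # hypothetical label format
}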

Instead, I created three new pools and decided to use the Selection Type = PoolUncopiedJobs
option. This later proved to have a few issues when run in conjunction with
Allow Duplicate Jobs = no.

The new pools

The new pools are:

Pool {
  Name             = FullFile
  Pool Type        = Backup
  Recycle          = yes
  AutoPrune        = yes
  Volume Retention = 3 years
  Storage          = MegaFile
  Next Pool        = Fulls

  Maximum Volume Bytes = 5G

  LabelFormat = "FullAuto-"
}

Pool {
  Name             = DiffFile
  Pool Type        = Backup
  Recycle          = yes
  AutoPrune        = yes
  Volume Retention = 6 weeks
  Storage          = MegaFile
  Next Pool        = Differentials

  Maximum Volume Bytes = 5G

  LabelFormat = "DiffAuto-"
}

Pool {
  Name             = IncrFile
  Pool Type        = Backup
  Recycle          = yes
  AutoPrune        = yes
  Volume Retention = 3 weeks
  Storage          = MegaFile
  Next Pool        = Incrementals

  Maximum Volume Bytes = 5G

  LabelFormat = "IncrAuto-"
}

The above are the destinations for disk-based backups. My main JobDefs for disk-based backups contains
the following:

Pool = FullFile  # required parameter for all Jobs

Full         Backup Pool = FullFile
Differential Backup Pool = DiffFile
Incremental  Backup Pool = IncrFile

And the JobDefs for the Copy To Tape jobs contains this:

Pool = FullFile # required parameter for all Jobs

#
# since this JobDef is meant to be used with a Copy Job
# these Pools are the source for the Copy... not the destination.
# The Destination is determined by the Next Pool directive in
# the respective Pools.
#
Full         Backup Pool = FullFile
Differential Backup Pool = DiffFile
Incremental  Backup Pool = IncrFile

You might wonder why they are identical. The comments explain why: in a
Type = Copy job, these Pool directives identify the source Pools. Each Pool definition contains
a Next Pool directive, which is the destination for the Copy.
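
To make the flow concrete, here is how the source and destination pools line up in my configuration:

  FullFile (disk)  --Next Pool-->  Fulls          (tape)
  DiffFile (disk)  --Next Pool-->  Differentials  (tape)
  IncrFile (disk)  --Next Pool-->  Incrementals   (tape)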

The full JobDefs and Job for copying from disk to tape appear here:

JobDefs {
  Name        = "DefaultJobTape"
  Type        = Backup
  Level       = Incremental
  Client      = polo-fd 
  FileSet     = "Full Set"
  Schedule    = "WeeklyCycleForCopyingToTape"
  Storage     = DigitalTapeLibrary
  Messages    = Standard

  Pool        = FullFile # required parameter for all Jobs

  #
  # since this JobDef is meant to be used with a Copy Job
  # these Pools are the source for the Copy... not the destination.
  # The Destination is determined by the Next Pool directive in
  # the respective Pools.
  #
  Full         Backup Pool = FullFile
  Differential Backup Pool = DiffFile
  Incremental  Backup Pool = IncrFile

  Priority    = 400

  # don't spool data when backing up to tape from local disk
  Spool Data       = no
  Spool Attributes = yes

  RunAfterJob  = "/home/dan/bin/dlt-stats-kraken"

  Maximum Concurrent Jobs = 6
}

Job {
  Name     = "CopyToTape"
  Type     = Copy
  JobDefs  = "DefaultJobTape"

  Selection Type = PoolUncopiedJobs
}

Start with an empty pool?

My initial concern with PoolUncopiedJobs was that the existing source pool was not empty.
It contained thousands of jobs, and I did not want to copy them all to tape; I only wanted to copy
new jobs to tape. I had two choices:

  • modify the database to make it look like the jobs had been copied over (i.e. SQL query)
  • move the volumes already in the pool to a new pool (see the bconsole sketch below)

Ultimately, I started with a new empty pool (as demonstrated above by the pool definitions).
Be aware that if you do not start with an empty pool, your initial Copy job will have a lot of
work to do. Consider whether or not you want to copy everything over, etc.
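
Had I gone with the second choice instead, volumes can be moved between pools from bconsole. The interactive
update command shown in the next section has a Pool choice for exactly this; the one-line form is roughly as
follows (the volume and pool names are placeholders, so verify the syntax against your own bconsole before
relying on it):

*update volume=FileAuto-0001 pool=SomeOtherPool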

My actions were:

  • create a new pool
  • leave existing backups where they were
  • alter jobs to back up to the new pool
  • create a copy job to take jobs from that new pool

Updating the pool parameters

When you update the pool definition in bacula-dir.conf, you should be aware that this has no
effect on existing volumes. For example, if you wish to alter the retention period for the volumes
in a pool, you need to issue the update command, like this:

*update
Update choice:
     1: Volume parameters
     2: Pool from resource
     3: Slots from autochanger
     4: Long term statistics
Choose catalog item to update (1-4): 1
Parameters to modify:
     1: Volume Status
     2: Volume Retention Period
     3: Volume Use Duration
     4: Maximum Volume Jobs
     5: Maximum Volume Files
     6: Maximum Volume Bytes
     7: Recycle Flag
     8: Slot
     9: InChanger Flag
    10: Volume Files
    11: Pool
    12: Volume from Pool
    13: All Volumes from Pool
    14: All Volumes from all Pools
    15: Enabled
    16: RecyclePool
    17: Action On Purge
    18: Done
Select parameter to modify (1-18): 14
All Volume defaults updated from "Default" Pool record.
All Volume defaults updated from "FullsFile" Pool record.
All Volume defaults updated from "FullFile" Pool record.
  Error updating Volume records: ERR=sql_update.c:443 Update failed: affected_rows=0 for 
  UPDATE Media SET ActionOnPurge=0, Recycle=1,VolRetention=3628800,VolUseDuration=0,
  MaxVolJobs=0,MaxVolFiles=0,MaxVolBytes=5368709120,RecyclePoolId=0 WHERE PoolId=23
All Volume defaults updated from "IncrFile" Pool record.
All Volume defaults updated from "Fulls" Pool record.
All Volume defaults updated from "Differentials" Pool record.
All Volume defaults updated from "Incrementals" Pool record.
All Volume defaults updated from "FullBackupsFile" Pool record.
All Volume defaults updated from "FilePool" Pool record.
All Volume defaults updated from "MegaFilePool" Pool record.
*

I did the above after deciding upon new retention periods.

Retention periods

One thing I’ve decided to keep in mind is Job Retention and File Retention. These are defined in the Client resource.
I have decided to keep these values very high.

File   Retention = 3 years
Job    Retention = 3 years

I am then able to have different retention periods on my pools and not worry about losing the Job or File records
from the Catalog. So long as the Volume Retention is greater than the File|Job Retention, this plan will work.

Do not try to save disk space by setting File|Job Retention lower than Volume Retention. Once File Retention has
expired and the File records have been pruned from the catalog, you will not be able to restore an individual file
or set of files; you will only be able to restore a whole job. Similarly for Job Retention: once that data has been
pruned from the catalog, you will not even be able to restore those jobs without resorting to bextract or
bscan. That is not something you want to do. Save time, stress, and energy. Use that disk space. Keep all of your
retention values the same.

When you *do* need to restore, you don’t want to be working with bextract/bscan. You just want the data restored.
Is the saving in disk space worth that? I think not.

Duplicate Job control

I have duplicate job control in place; it avoids queuing the same job twice. The directives I use are:

Allow Higher Duplicates = no
Allow Duplicate Jobs = no
Cancel Queued Duplicates = yes

These directives occur on the original backup jobs, not on the Copy Job that I run. However, when the copying is
done, those directives from the original jobs are applied, which means I cannot copy more than one job at a time.
If there is more than one job waiting to be copied from the pool, you’re going to see something like this:

16-Dec 14:30 bacula-dir JobId 42282: The following 4 JobIds were chosen to be copied: 41741,41786,42183,42234
16-Dec 14:30 bacula-dir JobId 42283: The following 1 JobId was chosen to be copied: 41741
16-Dec 14:30 bacula-dir JobId 42283: Copying using JobId=41741 Job=nyi_maildir_tarball.2010-12-13_05.55.01_48
16-Dec 14:30 bacula-dir JobId 42283: Bootstrap records written to /home/bacula/working/bacula-dir.restore.660.bsr
16-Dec 14:30 bacula-dir JobId 42282: Job queued. JobId=42283
16-Dec 14:30 bacula-dir JobId 42282: Copying JobId 42283 started.
16-Dec 14:30 bacula-dir JobId 42285: The following 1 JobId was chosen to be copied: 41786
16-Dec 14:30 bacula-dir JobId 42285: Copying using JobId=41786 Job=BackupCatalog.2010-12-13_08.15.00_33
16-Dec 14:30 bacula-dir JobId 42285: Bootstrap records written to /home/bacula/working/bacula-dir.restore.661.bsr
16-Dec 14:30 bacula-dir JobId 42282: Job queued. JobId=42285
16-Dec 14:30 bacula-dir JobId 42282: Copying JobId 42285 started.
16-Dec 14:30 bacula-dir JobId 42287: The following 1 JobId was chosen to be copied: 42183
16-Dec 14:30 bacula-dir JobId 42287: Copying using JobId=42183 Job=nyi_maildir_tarball.2010-12-16_05.55.01_27
16-Dec 14:30 bacula-dir JobId 42287: Bootstrap records written to /home/bacula/working/bacula-dir.restore.662.bsr
16-Dec 14:30 bacula-dir JobId 42288: Cancelling duplicate JobId=42284.
16-Dec 14:30 bacula-dir JobId 42288: JobId 42284, Job nyi_maildir_tarball.2010-12-16_14.30.46_10 marked to be canceled.
16-Dec 14:30 bacula-dir JobId 42282: Job queued. JobId=42287
16-Dec 14:30 bacula-dir JobId 42282: Copying JobId 42287 started.
16-Dec 14:30 bacula-dir JobId 42282: Copying using JobId=42234 Job=BackupCatalog.2010-12-16_08.15.00_18
16-Dec 14:30 bacula-dir JobId 42282: Bootstrap records written to /home/bacula/working/bacula-dir.restore.663.bsr
16-Dec 14:30 bacula-dir JobId 42289: Cancelling duplicate JobId=42286.
16-Dec 14:30 bacula-dir JobId 42289: JobId 42286, Job BackupCatalog.2010-12-16_14.30.46_12 marked to be canceled.

Note the "Cancelling duplicate JobId" lines near the end; those jobs will not get copied. This is something you can
handle by running the Copy Job repeatedly, but I find that to be less than ideal. It is good enough for now, but
perhaps later I will look into a better solution for Copy Jobs. The ideas I have now include:

  • not using the duplicate job directives and instead using directives such as Max Start Delay to control duplicate jobs (sketched below)
  • modifying Bacula to not observe duplicate job directives on Copy Jobs
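
As a rough sketch of the first idea, the duplicate-control directives could be dropped from the backup JobDefs and
something like Max Start Delay used instead, so that a job which has sat in the queue too long is cancelled on its
own. The value below is only a guess, and I have not tested this:

JobDefs {
  Name  = "DefaultJob"
  # ... all other directives as shown in the full configuration below ...

  # instead of:
  #   Allow Higher Duplicates  = no
  #   Allow Duplicate Jobs     = no
  #   Cancel Queued Duplicates = yes
  # cancel any job that has not started within this window
  Max Start Delay = 4 hours
}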

The configuration items

I hope the above snippets are helpful; however, there is no substitute for practical examples. This last section is a
straight copy from my existing configuration files. Hopefully it will help you get started with Copy Jobs. Some of
this also appears above, but it is included below as a complete example.

FileSet {
  Name    = "basic backup"
  Include {
    Options {
      signature = MD5
    }
    File = /boot
    File = /usr/src/sys/i386/conf
    File = /etc
    File = /usr/local/etc
    File = /usr/local/info
    File = /usr/local/libexec/nagios/
    File = /usr/local/var
    File = /root
    File = /var/db/ports
    File = /var/log
    File = /var/cron
  }
}

#
# Disk based pools
#

Pool {
  Name             = FullFile
  Pool Type        = Backup
  Recycle          = yes
  AutoPrune        = yes
  Volume Retention = 3 years
  Storage          = MegaFile
  Next Pool        = Fulls

  Maximum Volume Bytes = 5G

  LabelFormat = "FullAuto-"
}

Pool {
  Name             = DiffFile
  Pool Type        = Backup
  Recycle          = yes
  AutoPrune        = yes
  Volume Retention = 6 weeks
  Storage          = MegaFile
  Next Pool        = Differentials

  Maximum Volume Bytes = 5G

  LabelFormat = "DiffAuto-"
}

Pool {
  Name             = IncrFile
  Pool Type        = Backup
  Recycle          = yes
  AutoPrune        = yes
  Volume Retention = 3 weeks
  Storage          = MegaFile
  Next Pool        = Incrementals

  Maximum Volume Bytes = 5G

  LabelFormat = "IncrAuto-"
}

#
# Tape pools
#

Pool {
  Name             = Fulls
  Pool Type        = Backup
  Recycle          = yes
  AutoPrune        = yes
  Volume Retention = 3 years
  Storage          = DigitalTapeLibrary
}

Pool {
  Name             = Differentials
  Pool Type        = Backup
  Recycle          = yes
  AutoPrune        = yes
  Volume Retention = 2 months
  Storage          = DigitalTapeLibrary

}

Pool {
  Name             = Incrementals
  Pool Type        = Backup
  Recycle          = yes
  AutoPrune        = yes
  Volume Retention = 4 weeks
  Storage          = DigitalTapeLibrary

}

Schedule {
  Name = "WeeklyCycle"
  Run = Level=Full         1st     sun at 5:55 
  Run = Level=Differential 2nd-5th sun at 5:55
  Run = Level=Incremental  mon-sat     at 5:55
}

Schedule {
  Name = "WeeklyCycleForCopyingToTape"
  Run = Level=Full         1st     sun at 6:15 8:15 10:15
  Run = Level=Differential 2nd-5th sun at 6:15 8:15 10:15
  Run = Level=Incremental  mon-sat     at 6:15 8:15 10:15
}

JobDefs {
  Name        = "DefaultJob"
  Type        = Backup
  Level       = Incremental
  Client      = polo-fd 
  FileSet     = "Full Set"
  Schedule    = "WeeklyCycle"
  Storage     = MegaFile
  Messages    = Standard

  Pool        = FullFile  # required parameter for all Jobs

  Full         Backup Pool = FullFile
  Differential Backup Pool = DiffFile
  Incremental  Backup Pool = IncrFile

  Priority    = 10

  # don't spool data when backing up to disk
  Spool Data  = no
  Spool Attributes = yes

  Allow Higher Duplicates = no
  Allow Duplicate Jobs = no
  Cancel Queued Duplicates = yes
}


Client {
  Name           = polo-fd
  Address        = polo.unixathome.org
  Catalog        = MyCatalog
  Password       = "Gu9VX10CN7qZFbx26mZMd0T6FlEk8Z3g7Sle+SCBBmfj"

  File   Retention = 3 years
  Job    Retention = 3 years 
}


Job {
  Name            = "polo basic"
  JobDefs         = "DefaultJob"
  Client          = polo-fd 
  FileSet         = "basic backup"
  Write Bootstrap = "/home/bacula/working/polo-fd-basic.bsr"
}


JobDefs {
  Name        = "DefaultJobTape"
  Type        = Backup
  Level       = Incremental
  Client      = polo-fd 
  FileSet     = "Full Set"
  Schedule    = "WeeklyCycleForCopyingToTape"
  Storage     = DigitalTapeLibrary
  Messages    = Standard

  Pool        = FullFile # required parameter for all Jobs

  #
  # since this JobDef is meant to be used with a Copy Job
  # these Pools are the source for the Copy... not the destination.
  # The Destination is determined by the Next Pool directive in
  # the respective Pools.
  #
  Full         Backup Pool = FullFile
  Differential Backup Pool = DiffFile
  Incremental  Backup Pool = IncrFile

  Priority    = 400

  # don't spool data when backing up to tape from local disk
  Spool Data       = no
  Spool Attributes = yes

  RunAfterJob  = "/home/dan/bin/dlt-stats-kraken"

  Maximum Concurrent Jobs = 6
}

Job {
  Name     = "CopyToTape"
  Type     = Copy
  JobDefs  = "DefaultJobTape"

  Selection Type = PoolUncopiedJobs
}

Enjoy.
