Core Concepts

Before diving into Copybara configurations, it’s important to understand the key concepts that drive how Copybara works.

A workflow is the fundamental unit of Copybara. It defines:

  • Where to read code from (origin)
  • Where to write code to (destination)
  • What transformations to apply
  • How to handle authorship and commit history

core.workflow(
    name = "my-workflow",        # Unique identifier
    origin = ...,                # Source of code
    destination = ...,           # Target for code
    transformations = [...],     # Changes to apply
    authoring = ...,             # Author handling
    mode = "SQUASH",             # How to create commits
)

You can define multiple workflows in a single config file:

core.workflow(name = "export", ...) # Internal → External
core.workflow(name = "import", ...) # External → Internal
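
As a fuller sketch of the two-workflow layout above, the config below fills in the elided arguments. The repository URLs, the bot identity, and the convention of keeping private code under `internal/` directories are all assumptions made for illustration.

```starlark
# copy.bara.sky: two workflows in one config file.
# Both URLs are hypothetical placeholders.
internal_url = "https://git.example.com/org/internal-repo"
external_url = "https://github.com/org/repo"

# Files that may leave the internal repository (assumed layout).
public_files = glob(["**"], exclude = ["**/internal/**"])

core.workflow(
    name = "export",  # Internal → External
    origin = git.origin(url = internal_url, ref = "main"),
    destination = git.destination(url = external_url, push = "main"),
    origin_files = public_files,
    authoring = authoring.overwrite("Bot <bot@example.com>"),
)

core.workflow(
    name = "import",  # External → Internal
    origin = git.origin(url = external_url, ref = "main"),
    destination = git.destination(url = internal_url, push = "main"),
    # Restrict what the import may touch so internal-only files are left alone.
    destination_files = public_files,
    authoring = authoring.pass_thru("Bot <bot@example.com>"),
)
```

The workflow to run is selected by name on the command line, for example `java -jar copybara.jar migrate copy.bara.sky export`.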

An origin is where Copybara reads code from. The most common origins are:

Origin                  Use Case
git.origin              Generic Git repository
git.github_origin       GitHub with PR/issue integration
git.github_pr_origin    Read from GitHub PRs
git.gerrit_origin       Gerrit with CL integration
folder.origin           Local filesystem

origin = git.github_origin(
    url = "https://github.com/org/repo",
    ref = "main",
)

A destination is where Copybara writes code to:

Destination                 Behavior
git.destination             Push directly to a branch
git.github_destination      Push to GitHub
git.github_pr_destination   Create GitHub PRs
git.gerrit_destination      Create Gerrit CLs
folder.destination          Write to local filesystem

# Direct push
destination = git.destination(
    url = "https://github.com/org/repo",
    push = "main",
)

# Create PRs instead
destination = git.github_pr_destination(
    url = "https://github.com/org/repo",
    destination_ref = "main",
)
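
To show how an origin and a destination fit together, here is a minimal sketch of a workflow that proposes each sync as a pull request instead of pushing directly. The workflow name, both URLs, and the title and body text are hypothetical.

```starlark
# Sketch: export workflow that opens a pull request for each sync.
# URLs and PR text are placeholders, not values from a real setup.
core.workflow(
    name = "export-as-pr",
    origin = git.origin(
        url = "https://git.example.com/org/internal-repo",
        ref = "main",
    ),
    destination = git.github_pr_destination(
        url = "https://github.com/org/repo",
        destination_ref = "main",  # Base branch the pull request targets
        title = "Sync from internal repo",
        body = "Automated sync created by Copybara.",
    ),
    authoring = authoring.overwrite("Bot <bot@example.com>"),
)
```

Swapping `git.github_pr_destination` for `git.destination` turns the same workflow into a direct push, which is why the destination is usually the first thing to change when a team wants review before changes land.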

Transformations modify code during the sync process. They’re applied in order:

transformations = [
    core.move("src/", ""),               # First: move files
    core.replace("foo", "bar"),          # Then: replace text
    core.verify_match("SECRET", ...),    # Finally: verify
]

Common transformation types:

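Transformation      Purpose
core.move           Move/rename files or directories
core.copy           Copy files (keep original)
core.remove         Delete files matching a glob (mainly for reversing core.copy; to filter files, exclude them in origin_files)
core.replace        Text replacement (literal or regex groups)
core.verify_match   Fail if a regex does not match the selected files (or, with verify_no_match = True, fail if it does)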
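
The sketch below spells out the arguments for three of these transformations. The domain names, the `INTERNAL_ONLY` marker, and the `docs/` path are invented for illustration; the keyword arguments (`regex_groups`, `paths`, `verify_no_match`) are the ones I understand Copybara to accept, so check them against the reference for your version.

```starlark
transformations = [
    # Move everything under src/ to the repository root.
    core.move("src", ""),

    # Replace with a named group: text outside ${...} is matched literally,
    # and each group is defined by a regex in regex_groups.
    core.replace(
        before = "internal.example.com/${page}",
        after = "example.com/${page}",
        regex_groups = {"page": "[a-z_/]+"},
        paths = glob(["docs/**"]),  # Only touch files under docs/
    ),

    # Fail the migration if any file still contains the marker text.
    core.verify_match(
        regex = "INTERNAL_ONLY",
        paths = glob(["**"]),
        verify_no_match = True,
    ),
]
```

Because transformations run in order, the `paths` globs in later steps refer to the tree after `core.move` has run, so `docs/**` here means the post-move location.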

Globs define which files to include or exclude:

# Include specific directories
origin_files = glob(["src/**", "docs/**"])

# Exclude internal files
origin_files = glob(
    include = ["**"],
    exclude = ["**/internal/**", "**/*.secret"],
)

Pattern syntax:

  • * matches any characters except /
  • ** matches any characters including /
  • ? matches a single character
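
The pattern rules above can be seen side by side in a few globs. The file paths named in the comments are hypothetical, chosen only to show what each pattern would and would not select.

```starlark
# Each comment lists example paths the pattern does or does not select.
docs_top_level = glob(["docs/*.md"])   # docs/README.md, but not docs/guide/intro.md
docs_all = glob(["docs/**"])           # docs/README.md and docs/guide/intro.md
two_char_ext = glob(["src/*.??"])      # src/main.cc and src/main.py, but not src/main.cpp

# exclude removes paths from whatever include selected.
public_docs = glob(["docs/**"], exclude = ["docs/drafts/**"])
```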

Authoring controls how commit authors are handled:

# Keep original authors
authoring = authoring.pass_thru("Default <default@example.com>")

# Replace all authors
authoring = authoring.overwrite("Bot <bot@example.com>")

# Allow only specific authors
authoring = authoring.allowed(
    default = "Bot <bot@example.com>",
    allowlist = ["alice@example.com", "bob@example.com"],
)

The mode determines how commits are created in the destination:

Mode                      Behavior
SQUASH                    All origin changes → one destination commit
ITERATIVE                 Preserve individual commits
CHANGE_REQUEST            Import a pending change (PR/CL) into the source of truth for review
CHANGE_REQUEST_FROM_SOT   Export a pending change from the source of truth for review elsewhere

mode = "SQUASH",  # The default; recommended for most workflows
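
For comparison with the SQUASH default, a workflow that keeps one destination commit per origin commit might look like the sketch below. The workflow name and URLs are placeholders.

```starlark
# Sketch: preserve individual commits and their authors.
core.workflow(
    name = "export-iterative",
    origin = git.origin(
        url = "https://git.example.com/org/internal-repo",
        ref = "main",
    ),
    destination = git.destination(
        url = "https://github.com/org/repo",
        push = "main",
    ),
    # pass_thru keeps each origin author; the default is the fallback
    # when an author cannot be determined.
    authoring = authoring.pass_thru("Bot <bot@example.com>"),
    mode = "ITERATIVE",  # One destination commit per origin commit
)
```

ITERATIVE pairs naturally with `authoring.pass_thru`, since per-commit authorship only survives when commits are not squashed together.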

Copybara tracks state to know what’s already been synced:

  1. After syncing, Copybara adds a marker to each destination commit message that records the origin revision:

     GitOrigin-RevId: abc123def456

  2. On the next run, Copybara reads this marker from the destination history to find where to resume.

  3. Only origin commits made after that revision are processed.

# Start from a specific commit
java -jar copybara.jar migrate copy.bara.sky --last-rev abc123
# Process all history (initial sync)
java -jar copybara.jar migrate copy.bara.sky --init-history

The Source of Truth is the authoritative repository:

  • For export workflows: Internal repo is SOT
  • For import workflows: External repo is SOT

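In SQUASH and ITERATIVE workflows, Copybara reads from the SOT and writes to the mirror. CHANGE_REQUEST is the exception: it carries a pending change into the SOT.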

Labels are metadata extracted from commit messages:

# Expose internal labels in the destination commit message
metadata.expose_label("TESTED_BY")
metadata.expose_label("BUG")

# Add a header line to the commit message
metadata.add_header("Synced-From: internal")

Labels can be used to:

  • Track relationships between commits
  • Add metadata to synced commits
  • Filter or route based on commit properties
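
As a sketch of the first two uses, the transformations below rename a label on its way out and interpolate its value into a header. The `BUG` label comes from the example above; the new name `Bug`, the separator, and the header wording are assumptions for illustration.

```starlark
transformations = [
    # Expose the BUG label in the destination message as "Bug: <value>".
    metadata.expose_label("BUG", new_name = "Bug", separator = ": "),

    # Build a header from a label value using ${LABEL} interpolation.
    metadata.add_header(
        "Synced-From: internal (bug ${BUG})",
        # Skip the header instead of failing when a change has no BUG label.
        ignore_label_not_found = True,
    ),
]
```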