
Real-World Examples

Learn from production Copybara configurations used by major open-source projects. These examples demonstrate battle-tested patterns for common use cases.

MongoDB uses Copybara to sync their internal repository (10gen/mongo) to the public GitHub repository (mongodb/mongo). This is a classic internal → public open-sourcing pattern.

Source of truth: Internal repo (10gen/mongo)

Flow: Internal → Public (one-way)

Development happens in the private 10gen/mongo repository. Copybara then syncs approved changes to the public mongodb/mongo repository, filtering out internal-only content and scrubbing commit messages.

Source: mongodb/mongo/copy.bara.sky

sourceUrl = "https://github.com/10gen/mongo.git"
prodUrl = "https://github.com/mongodb/mongo.git"
testUrl = "https://github.com/10gen/mongo-copybara.git"

def make_workflow(workflow_name, destination_url, branch):
    core.workflow(
        name = workflow_name,
        origin = git.origin(url = sourceUrl, ref = branch),
        destination = git.destination(
            url = destination_url,
            fetch = branch,
            push = branch,
        ),
        origin_files = glob(["**"], exclude = [
            "src/mongo/db/modules/**",
            "buildscripts/modules/**",
            ".github/workflows/**",
            "src/third_party/private/**",
            ".augment/**",
            ".cursor/**",
            "AGENTS.md",
            ".github/CODEOWNERS",
        ]),
        authoring = authoring.pass_thru("MongoDB <mongodb@mongodb.com>"),
        mode = "ITERATIVE",
        transformations = [
            # Commit message scrubbing (see below)
            # URL replacement (see below)
        ],
    )

make_workflow("prod", prodUrl, "master")
# testBranch is not defined in this excerpt
make_workflow("test", testUrl, testBranch)

MongoDB defines a make_workflow() function to create both production and test workflows with the same configuration. This ensures consistency and makes testing easier.

def make_workflow(workflow_name, destination_url, branch):
    core.workflow(
        name = workflow_name,
        # ... configuration
    )

# Production workflow
make_workflow("prod", prodUrl, "master")

# Test workflow (validates config before prod)
make_workflow("test", testUrl, testBranch)

MongoDB excludes several categories of internal content:

origin_files = glob(["**"], exclude = [
    # Proprietary modules
    "src/mongo/db/modules/**",
    "buildscripts/modules/**",
    # Internal CI/CD
    ".github/workflows/**",
    ".github/CODEOWNERS",
    # Private dependencies
    "src/third_party/private/**",
    # AI assistant configs (internal tooling)
    ".augment/**",
    ".cursor/**",
    "AGENTS.md",
])
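
To see which files such an exclusion list lets through, the filtering can be approximated in plain Python. This is only a sketch: it uses `fnmatch` rather than Copybara's own glob implementation (the two differ for a single `*`, which is not used here), and the sample paths are hypothetical.

```python
from fnmatch import fnmatchcase

# Exclusion patterns copied from the workflow above.
EXCLUDE = [
    "src/mongo/db/modules/**",
    "buildscripts/modules/**",
    ".github/workflows/**",
    ".github/CODEOWNERS",
    "src/third_party/private/**",
    ".augment/**",
    ".cursor/**",
    "AGENTS.md",
]


def is_synced(path):
    """Return True if `path` is not matched by any exclusion pattern."""
    # fnmatch's "*" also matches "/", so "dir/**" covers everything under dir/.
    return not any(fnmatchcase(path, pattern) for pattern in EXCLUDE)


# Hypothetical repository paths, for illustration only.
paths = [
    "src/mongo/db/commands.cpp",
    "src/mongo/db/modules/enterprise/audit.cpp",
    ".github/workflows/ci.yml",
    ".cursor/rules/style.md",
    "AGENTS.md",
    "README.md",
]
synced = [p for p in paths if is_synced(p)]
print(synced)
```

Only the two paths outside every excluded directory remain in `synced`; everything under the proprietary modules, CI, and AI-tooling directories is dropped.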

MongoDB uses a multi-step scrubbing process to clean commit messages while preserving important trailers:

transformations = [
    # Step 1: Keep trailers, blank everything else after first line
    metadata.scrubber(
        "\n((?:Co-authored-by|Signed-off-by|Reviewed-by): .*)?.*",
        replacement = "\n$1",
    ),
    # Step 2: Collapse multiple newlines
    metadata.scrubber("\n+", replacement = "\n"),
    # Step 3: Remove trailing newline
    metadata.scrubber("\n$", replacement = ""),
    # Step 4: Add blank line between subject and trailers
    metadata.scrubber("(^.*?\n)((?:\n|.)*)", replacement = "$1\n$2"),
]

Before:

Fix authentication bug
This fixes an internal issue reported in JIRA-1234.
See internal doc at go/auth-fix for details.
Co-authored-by: Alice <alice@mongodb.com>
Reviewed-by: Bob <bob@mongodb.com>

After:

Fix authentication bug

Co-authored-by: Alice <alice@mongodb.com>
Reviewed-by: Bob <bob@mongodb.com>
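
The four scrubbing steps can be traced outside Copybara with a short Python simulation. It is a sketch under two assumptions: Python's `re` module stands in for Copybara's regex engine (the two are assumed to agree on these particular patterns), and the input message contains the usual blank lines and trailing newline of a Git commit message.

```python
import re

# The four metadata.scrubber steps as (pattern, replacement) pairs,
# with the "$1" group syntax rewritten as Python's "\1".
STEPS = [
    (r"\n((?:Co-authored-by|Signed-off-by|Reviewed-by): .*)?.*", r"\n\1"),
    (r"\n+", r"\n"),
    (r"\n$", ""),
    (r"(^.*?\n)((?:\n|.)*)", r"\1\n\2"),
]


def scrub(message):
    """Apply the scrubbing steps in order and return the cleaned message."""
    for pattern, replacement in STEPS:
        message = re.sub(pattern, replacement, message)
    return message


before = (
    "Fix authentication bug\n"
    "\n"
    "This fixes an internal issue reported in JIRA-1234.\n"
    "See internal doc at go/auth-fix for details.\n"
    "\n"
    "Co-authored-by: Alice <alice@mongodb.com>\n"
    "Reviewed-by: Bob <bob@mongodb.com>\n"
)
after = scrub(before)
print(after)
```

Step 1 empties every body line that is not one of the listed trailers, steps 2 and 3 remove the resulting blank lines, and step 4 puts a single blank line back between the subject and the trailers.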

Internal links are rewritten to point to the public repository:

core.replace(
    before = "https://github.com/10gen/mongo",
    after = "https://github.com/mongodb/mongo",
    paths = glob(["**/*.md"]),
)

  1. Test before production - Use a separate test workflow
  2. Preserve attribution - Keep Co-authored-by and similar trailers
  3. Strip internal details - Remove JIRA references, internal docs links
  4. ITERATIVE mode - Preserve individual commits for better history
  5. Exclude AI configs - .cursor/, .augment/ are internal tooling

CUE: Gerrit ↔ GitHub Bidirectional Sync (Archived)


The CUE project uses Copybara to bridge between Gerrit (primary code review) and GitHub (community contributions). This demonstrates the bidirectional sync pattern.

Source of truth: Gerrit (cue-review.googlesource.com)

Flow: Gerrit → GitHub (mirror) + GitHub PRs → Gerrit (import)

Core development and code review happens on Gerrit. Changes are mirrored to GitHub for visibility. External contributors submit PRs on GitHub, which are imported to Gerrit for review.

Source: cuelang/cue/_scripts/copy.bara.sky

github_url = "https://github.com/cuelang/cue.git"
gerrit_url = "https://cue-review.googlesource.com/cue"

# Mirror Gerrit → GitHub (all refs)
git.mirror(
    name = "github",
    origin = gerrit_url,
    destination = github_url,
    refspecs = ["refs/*"],
    prune = False,
)

# Import GitHub PRs → Gerrit
core.workflow(
    name = "github-pr",
    origin = git.github_pr_origin(
        url = github_url,
        baseline_from_branch = True,
        use_merge = True,
    ),
    destination = git.gerrit_destination(
        url = gerrit_url,
        fetch = "master",
        push_to_refs_for = "master%hashtag=github-pr",
    ),
    authoring = authoring.pass_thru("CUE team <no-reply@google.com>"),
    mode = "CHANGE_REQUEST",
    transformations = [
        metadata.replace_message(
            "${GITHUB_PR_TITLE}\n\n" +
            "${GITHUB_PR_BODY}\n\n" +
            "Closes #${GITHUB_PR_NUMBER}\n" +
            "https://github.com/cuelang/cue/pull/${GITHUB_PR_NUMBER}",
        ),
    ],
)

For Gerrit → GitHub sync, CUE uses git.mirror instead of core.workflow. This is simpler and faster for pure mirroring:

git.mirror(
    name = "github",
    origin = gerrit_url,
    destination = github_url,
    refspecs = ["refs/*"],  # Mirror all refs
    prune = False,          # Don't delete refs not in origin
)

External contributions come via GitHub PRs but need to go through Gerrit review:

origin = git.github_pr_origin(
    url = github_url,
    baseline_from_branch = True,  # Find the baseline from the PR's base branch
    use_merge = True,             # Import the PR's merge ref instead of its head
)
destination = git.gerrit_destination(
    url = gerrit_url,
    fetch = "master",
    push_to_refs_for = "master%hashtag=github-pr",  # Tag with hashtag
)

The %hashtag=github-pr suffix adds a hashtag to Gerrit changes for tracking.

PR metadata is formatted into a proper commit message:

metadata.replace_message(
    "${GITHUB_PR_TITLE}\n\n" +
    "${GITHUB_PR_BODY}\n\n" +
    "Closes #${GITHUB_PR_NUMBER}\n" +
    "https://github.com/cuelang/cue/pull/${GITHUB_PR_NUMBER}",
)

Labels exposed by git.github_pr_origin that can be used in the message template include:

  • ${GITHUB_PR_TITLE} - PR title
  • ${GITHUB_PR_BODY} - PR description
  • ${GITHUB_PR_NUMBER} - PR number
  • ${GITHUB_PR_URL} - Full PR URL
  • ${GITHUB_PR_USER} - PR author username
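
The effect of the template can be previewed with Python's `string.Template`, which happens to use the same `${NAME}` placeholder syntax. This is an illustration rather than Copybara's implementation, and the label values below are hypothetical.

```python
from string import Template

# The message template from metadata.replace_message above.
MESSAGE = Template(
    "${GITHUB_PR_TITLE}\n\n"
    "${GITHUB_PR_BODY}\n\n"
    "Closes #${GITHUB_PR_NUMBER}\n"
    "https://github.com/cuelang/cue/pull/${GITHUB_PR_NUMBER}"
)

# Hypothetical label values for an imported pull request.
labels = {
    "GITHUB_PR_TITLE": "cmd/cue: fix typo in help text",
    "GITHUB_PR_BODY": "Corrects a misspelled flag name.",
    "GITHUB_PR_NUMBER": "123",
}

commit_message = MESSAGE.substitute(labels)
print(commit_message)
```

The resulting message has the PR title as its subject line, the PR description as its body, and a closing reference plus a link back to the pull request at the end.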

For PR imports, CHANGE_REQUEST mode creates a single review request regardless of how many commits are in the PR:

mode = "CHANGE_REQUEST",

  1. Use git.mirror for simple replication - No workflow needed
  2. Hashtags for tracking - Gerrit’s %hashtag= feature tracks imported PRs
  3. Preserve PR context - Link back to original GitHub PR
  4. CHANGE_REQUEST for imports - One review per PR, not per commit

WireQuery: Enterprise ↔ Community Bidirectional


WireQuery demonstrates a true bidirectional sync between an enterprise edition and community edition, with path transformations and PR imports.

Source of truth: Enterprise repo (wirequery-ee)

Flow: Enterprise → Community (push) + Community PRs → Enterprise (import)

The enterprise edition is the source of truth. Community contributions via PR are imported back to enterprise with proper author attribution.

Source: wirequery/wirequery/copy.bara.sky

destinationUrl = "ssh://git@github.com/wirequery/wirequery.git"
sotUrl = "ssh://git@github.com/wirequery/wirequery-ee.git"

files = glob(["**"], exclude = ["**/ee/**", ".github/workflows/push-to-public-repo.yml"])

code_transformations = [
    core.replace(
        before = "/ee/",
        after = "/ce/",
        paths = glob(["**"], exclude = ["**/ce/**", "**/ee/**", "copy.bara.sky"]),
    ),
]

# Push: Enterprise → Community
core.workflow(
    name = "push",
    origin = git.origin(url = sotUrl, ref = "main"),
    destination = git.github_destination(url = destinationUrl, push = "main"),
    origin_files = files,
    authoring = authoring.pass_thru("Wouter Nederhof <wouter.nederhof@gmail.com>"),
    mode = "ITERATIVE",
    transformations = [
        metadata.restore_author("ORIGINAL_AUTHOR", search_all_changes = True),
        metadata.expose_label("COPYBARA_INTEGRATE_REVIEW"),
    ] + code_transformations,
)

# Pull: Community PRs → Enterprise
core.workflow(
    name = "pr",
    origin = git.github_pr_origin(url = destinationUrl, branch = "main"),
    destination = git.github_pr_destination(url = sotUrl, destination_ref = "main"),
    destination_files = files,
    origin_files = files,
    authoring = authoring.pass_thru("Wouter Nederhof <wouter.nederhof@gmail.com>"),
    mode = "CHANGE_REQUEST",
    set_rev_id = False,
    transformations = [
        metadata.save_author("ORIGINAL_AUTHOR"),
        metadata.expose_label("GITHUB_PR_NUMBER", new_name = "Closes", separator = "...#"),
    ] + core.reverse(code_transformations),
)

WireQuery converts /ee/ paths to /ce/ when pushing to community. The clever part: they use core.reverse() to automatically reverse this when importing PRs:

code_transformations = [
    core.replace(before = "/ee/", after = "/ce/", ...),
]

# Push workflow uses transformations as-is
transformations = [...] + code_transformations

# PR workflow reverses them (/ce/ → /ee/)
transformations = [...] + core.reverse(code_transformations)
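
The idea behind reversing a list of replacements can be sketched in Python by modeling each replacement as a (before, after) pair. This is a simplified model, not Copybara's implementation: it ignores the `paths` filter, and the file content is hypothetical.

```python
# Each replacement is modeled as a (before, after) pair of literal strings.
code_transformations = [("/ee/", "/ce/")]


def apply(transformations, text):
    """Apply each literal replacement in order."""
    for before, after in transformations:
        text = text.replace(before, after)
    return text


def reverse(transformations):
    """Reverse the order of the list and swap before/after in each pair."""
    return [(after, before) for before, after in reversed(transformations)]


# Hypothetical file content from the enterprise repository.
enterprise = 'import { Button } from "../shared/ee/ui";'
community = apply(code_transformations, enterprise)
restored = apply(reverse(code_transformations), community)
print(community)
print(restored)
```

Applying the list and then its reverse returns the original text here. The round trip only holds when the input does not already contain the `after` string, which is one reason to keep paths that legitimately contain `/ce/` out of the replacement's scope.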

When pushing to community, authors are restored from a label. When importing PRs, authors are saved to that label:

# Push: restore original author from label
metadata.restore_author("ORIGINAL_AUTHOR", search_all_changes = True)

# PR import: save author to label for future reference
metadata.save_author("ORIGINAL_AUTHOR")
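
The round trip of an author through a label can be illustrated with a small Python model. The `ORIGINAL_AUTHOR=Name <email>` line format and the removal of the label on restore are assumptions made for this sketch, not confirmed details of Copybara's behaviour; the author and message are hypothetical.

```python
import re

LABEL = "ORIGINAL_AUTHOR"


def save_author(message, author):
    """Append the author to the message as a LABEL=value line."""
    return f"{message.rstrip()}\n\n{LABEL}={author}\n"


def restore_author(message, default_author):
    """Return (author, message), consuming the label line if it is present."""
    match = re.search(rf"^{LABEL}=(.+)$", message, flags=re.MULTILINE)
    if match is None:
        # No label: fall back to the workflow's default author.
        return default_author, message
    cleaned = message[:match.start()] + message[match.end():]
    return match.group(1), cleaned.rstrip() + "\n"


# Hypothetical community contribution.
imported = save_author("Add retry to query client\n", "Jane Doe <jane@example.com>")
author, message = restore_author(imported, "Default Author <default@example.com>")
print(author)
print(message)
```

The label travels inside the commit message in the enterprise repository, so the push workflow can credit the original contributor when the change reaches the community repository.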

Unlike CUE, which turns each PR into a Gerrit change, WireQuery turns each community PR into a PR on the enterprise repository:

origin = git.github_pr_origin(url = destinationUrl)
destination = git.github_pr_destination(url = sotUrl, destination_ref = "main")

The files glob excludes enterprise-only content and the CI workflow that triggers Copybara itself:

files = glob(["**"], exclude = [
    "**/ee/**",                                   # Enterprise-only code
    ".github/workflows/push-to-public-repo.yml",  # Avoid recursion
])

  1. core.reverse() for bidirectional transforms - Write once, reverse automatically
  2. Author labels for attribution - save_author/restore_author preserve contributors
  3. PR-to-PR for review - Both sides get review workflows
  4. Exclude your Copybara workflow - Prevent infinite loops

Many Google open-source projects use an internal Copybara service that syncs from Google’s internal Piper monorepo to GitHub. You’ll recognize these by commits from copybara-service[bot].

Projects using copybara-service:

These projects don’t have public copy.bara.sky files because the configuration lives in Google’s internal infrastructure. However, you can observe the patterns:

  • Commit format: Clean, single-line subjects
  • Labels: GitOrigin-RevId or PiperOrigin-RevId pointing to internal commits
  • Authoring: Original authors preserved when possible
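
One way to observe these labels is to scan commit messages for the RevId trailers. The following Python sketch assumes the trailers appear as `Label: value` lines; the commit message and revision number are made up for illustration.

```python
import re

# Matches a "GitOrigin-RevId: <value>" or "PiperOrigin-RevId: <value>" line.
REV_ID = re.compile(r"^(GitOrigin-RevId|PiperOrigin-RevId): (\S+)$", re.MULTILINE)


def origin_rev_ids(commit_message):
    """Return {label: value} for each origin RevId trailer in the message."""
    return dict(REV_ID.findall(commit_message))


# Hypothetical commit message; the revision number is made up.
message = (
    "Fix off-by-one error in tensor slicing\n"
    "\n"
    "PiperOrigin-RevId: 123456789\n"
)
rev_ids = origin_rev_ids(message)
print(rev_ids)
```

Feeding it the output of `git log --format=%B` for a repository gives a quick indication of whether its commits were produced by a Copybara sync.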

| Aspect | MongoDB | CUE (archived) | WireQuery |
| --- | --- | --- | --- |
| Source of truth | Internal (10gen/mongo) | Gerrit | Enterprise (wirequery-ee) |
| Direction | Internal → Public | Gerrit ↔ GitHub | Enterprise ↔ Community |
| Mode | ITERATIVE | CHANGE_REQUEST + mirror | ITERATIVE + CHANGE_REQUEST |
| Primary use | Open-sourcing | Community contributions | Dual-license distribution |
| Message handling | Scrub body, keep trailers | Replace with PR metadata | Author labels |
| File handling | Heavy exclusions | No exclusions | Path transformations |
| Reversibility | One-way | Two workflows | core.reverse() transforms |

core.workflow(
    # name, origin, destination, and authoring omitted for brevity
    origin_files = glob(["**"], exclude = [
        "internal/**",
        ".github/workflows/**",
        "**/INTERNAL_*.md",
    ]),
    mode = "ITERATIVE",
    transformations = [
        metadata.scrubber("(?m)^INTERNAL:.*$"),
        metadata.scrubber("JIRA-[0-9]+", replacement = "[internal]"),
        core.replace(
            before = "internal.company.com",
            after = "github.com/company",
            paths = glob(["**/*.md"]),
        ),
    ],
)

For GitHub ↔ Code Review System (CUE pattern)

# Mirror: Internal → GitHub
git.mirror(
    name = "mirror-to-github",
    origin = "internal-url",
    destination = "github-url",
    refspecs = ["refs/heads/*", "refs/tags/*"],
)

# Import: GitHub PRs → Internal
core.workflow(
    name = "import-prs",
    origin = git.github_pr_origin(url = "github-url"),
    destination = git.destination(url = "internal-url"),
    mode = "CHANGE_REQUEST",
    transformations = [
        metadata.replace_message(
            "${GITHUB_PR_TITLE}\n\nFrom: ${GITHUB_PR_URL}",
        ),
    ],
)

Monorepo → Multi-repo Publishing:

  • Airbnb Viaduct: dynamic workflow generation for demo apps (copy.bara.sky). Uses a list comprehension to generate workflows for multiple apps.

  • Lightspark JS-SDK: bidirectional monorepo sync with README transformation (copy.bara.sky). Syncs the js/ directory and transforms README-PUBLIC.md → README.md.

  • GirlsOfSteelRobotics: bidirectional sync between robotics teams (copy.bara.sky). Uses a reusable bidirectional_workflow() function pattern.

Enterprise/Internal → Public:

  • MongoDB: Internal → Public with commit scrubbing (copy.bara.sky). Uses a 4-step metadata scrubber that preserves trailers.

  • WireQuery: Enterprise ↔ Community with core.reverse() (copy.bara.sky). Converts /ee/ paths to /ce/ and reverses the conversion for imports.

  • Confluent Kafka-Bazel: Bazel overlay for Apache Kafka (copy.bara.sky). Complex BUILD file sync with proprietary exclusions.

Vendoring Dependencies:

Archived (still valuable):

  • CUE: Gerrit ↔ GitHub bidirectional sync (copy.bara.sky). The project has moved to cue-lang/cue.

  • Bazel rules_webtesting: folder-based release workflow (copy.bara.sky). Uses the folder.origin() + folder.destination() pattern.

These sync from Google’s internal Piper monorepo. No public configs, but you can observe patterns via commit history:

| Project | Type | Notes |
| --- | --- | --- |
| tensorflow/tensorflow | ML framework | Google3 to GitHub |
| angular/angular | Web framework | References angular2/copy.bara.sky |
| bazelbuild/bazel | Build system | devtools/blaze/bazel/admin/copybara/ |
| grpc/grpc | RPC framework | Multi-language sync |
| protocolbuffers/protobuf | Serialization | PiperOrigin-RevId labels |
| chromium/chromium | Browser | Cronet copy.bara.sky |
| google/gvisor | Container sandbox | PiperOrigin-RevId |
| google/perfetto | Tracing | Chromium integration |
| google-deepmind/natural-plan | Research | Piper → Gerrit |
| youtube/cobalt | HTML5 container | Component sync |

| Organization | Use Case | Link |
| --- | --- | --- |
| Pixie (CNCF) | K8s observability sync | pixie-io/pixie |
| Screenshotbot | OSS publishing | screenshotbot-oss |
| Recidiviz | Public mirror sync | pulse-data |
| Eclipse Score | Fork management | eclipse-score/score |
| Twitter/X | Internal fork | twitter-forks/copybara |
| Hazelcast | GitHub Actions | hazelcast/copybara-action |
| Groupe Renault | GitLab sync | Internal → public |

| Tool | Description | Link |
| --- | --- | --- |
| Olivr/copybara-action | Zero-config GitHub Action | GitHub |
| Olivr/nx-copybara | Nx monorepo example | GitHub |
| rules_copybara | Bazel rules for Copybara | GitHub |
| anipos/copybara-docker-image | Docker image | GitHub |