
Introduction to Copybara

Copybara is Google’s open-source tool for transforming and moving code between repositories. It was developed internally at Google and open-sourced to help teams manage code synchronization across different repositories.

Many organizations face challenges when code needs to exist in multiple places:

  • Open sourcing: Internal code needs to be published externally, but with sensitive information removed
  • Vendor imports: External libraries need to be imported with custom patches applied
  • Monorepo extraction: Parts of a monorepo need to be released as standalone repositories
  • Contribution sync: External contributions need to flow back to internal systems

Manually managing these workflows is error-prone and time-consuming. Copybara automates the entire process.

  1. Read code from the origin repository
  2. Transform files (move, rename, replace content, filter)
  3. Write to the destination repository (commit or PR)
  4. Track state so only new changes are synced next time

Copybara uses Starlark (a Python-like language) for configuration. You declare what you want, and Copybara figures out how to do it.

    core.workflow(
        name = "my-workflow",
        origin = git.origin(url = "..."),
        destination = git.destination(url = "..."),
        transformations = [...],
    )

  • Move/copy files: Restructure directory layouts
  • Text replacement: Change URLs, names, or any text patterns
  • File filtering: Include only what should be public
  • Verification: Ensure no secrets or internal content is leaked
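
The four capabilities above can be sketched in a single workflow. This is an illustrative configuration, not a recommended setup: every URL, path, and pattern (the `project/` directory, the `INTERNAL-ONLY` marker, the bot identity) is a hypothetical placeholder.

```starlark
# copy.bara.sky (sketch): all URLs, paths and patterns are hypothetical.
core.workflow(
    name = "export",
    origin = git.origin(
        url = "https://internal.example.com/monorepo.git",
        ref = "main",
    ),
    destination = git.destination(
        url = "https://github.com/example/project",
        fetch = "main",
        push = "main",
    ),
    # File filtering: read only project/ from the origin and leave out
    # anything under project/internal/.
    origin_files = glob(["project/**"], exclude = ["project/internal/**"]),
    authoring = authoring.pass_thru("Example Bot <bot@example.com>"),
    transformations = [
        # Move files: make project/ the root of the destination repository.
        core.move("project", ""),
        # Text replacement: rewrite an internal URL to its public equivalent.
        core.replace(
            before = "https://internal.example.com/docs",
            after = "https://example.com/docs",
            paths = glob(["**.md"]),
        ),
        # Verification: fail the run if the internal marker is still present.
        core.verify_match(
            regex = "INTERNAL-ONLY",
            verify_no_match = True,
        ),
    ],
)
```

Transformations run in the order listed, so the paths given to `core.replace` refer to the layout after `core.move` has been applied.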

  Mode            Use Case
  SQUASH          Combine all changes into a single commit
  ITERATIVE       Preserve individual commit history
  CHANGE_REQUEST  Create PRs/CLs for review
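
The mode is chosen with the `mode` parameter of `core.workflow`; when it is omitted, Copybara defaults to `SQUASH`. A minimal sketch, assuming placeholder URLs and a hypothetical bot identity:

```starlark
# Sketch only: both URLs are placeholders, not real repositories.
core.workflow(
    name = "export-iterative",
    origin = git.origin(
        url = "https://internal.example.com/project.git",
        ref = "main",
    ),
    destination = git.destination(
        url = "https://github.com/example/project",
        fetch = "main",
        push = "main",
    ),
    authoring = authoring.pass_thru("Example Bot <bot@example.com>"),
    # One destination commit per new origin commit. Switch to "SQUASH" to
    # collapse the new origin commits into a single destination commit.
    mode = "ITERATIVE",
)
```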

Copybara supports syncing in both directions:

  • Export: Internal → External (open sourcing)
  • Import: External → Internal (accepting contributions)
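
One common way to set up both directions is to write the transformations once for the export and derive the import side with `core.reverse`. The sketch below assumes a hypothetical internal Gerrit host and public GitHub repository; the names `public_url`, `internal_url`, and `export_transformations` are illustrative variables, not part of Copybara.

```starlark
# Sketch only: URLs, branch names and paths are placeholders.
public_url = "https://github.com/example/project"
internal_url = "https://gerrit.internal.example.com/monorepo"

# Written from the export point of view. core.reverse() derives the
# import direction, which only works for reversible transformations.
export_transformations = [
    core.move("project", ""),
]

# Export: Internal -> External
core.workflow(
    name = "export",
    origin = git.origin(url = internal_url, ref = "main"),
    destination = git.destination(url = public_url, fetch = "main", push = "main"),
    origin_files = glob(["project/**"]),
    authoring = authoring.pass_thru("Example Bot <bot@example.com>"),
    transformations = export_transformations,
)

# Import: External -> Internal, taking a GitHub pull request and
# proposing it as a Gerrit change for review.
core.workflow(
    name = "import",
    origin = git.github_pr_origin(url = public_url),
    destination = git.gerrit_destination(
        url = internal_url,
        fetch = "main",
        push_to_refs_for = "main",
    ),
    # Only files under project/ in the internal repository may be touched.
    destination_files = glob(["project/**"]),
    authoring = authoring.pass_thru("Example Bot <bot@example.com>"),
    mode = "CHANGE_REQUEST",
    transformations = core.reverse(export_transformations),
)
```

Sharing one transformation list keeps the two directions consistent: a file moved on export is moved back to the same place on import.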

Copybara is written in Java and uses:

  • starlark-java: The same Starlark implementation used by Bazel
  • JGit: For Git operations
  • GitHub/Gerrit APIs: For creating PRs (Pull Requests) and CLs (Change Lists)

Use Copybara when:

  • You need to sync code between repositories automatically
  • Transformations are required (not a simple mirror)
  • You want PR-based workflows instead of direct pushes
  • State tracking is important (don’t re-sync already-synced changes)
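
For the PR-based case, a sketch using `git.github_pr_destination` in place of a direct push could look like the following. The repository URLs and bot identity are hypothetical.

```starlark
# Sketch only: URLs and the author identity are placeholders.
core.workflow(
    name = "export-as-pr",
    origin = git.origin(
        url = "https://internal.example.com/project.git",
        ref = "main",
    ),
    # Opens or updates a pull request against main instead of pushing
    # directly to the branch.
    destination = git.github_pr_destination(
        url = "https://github.com/example/project",
        destination_ref = "main",
    ),
    authoring = authoring.pass_thru("Example Bot <bot@example.com>"),
    # State tracking needs no extra configuration here: for Git origins,
    # Copybara records the synced origin revision as a GitOrigin-RevId
    # label in the destination commit message and resumes from it.
)
```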

Consider alternatives when:

  • You only need simple mirroring with no transformations (use git push --mirror)
  • You are doing a one-time migration (use git filter-repo, git filter-branch, or similar)
  • You need real-time sync (Copybara is batch-oriented)