Sync Documentation via Pull Requests

This guide shows you how to automatically sync documentation from an internal repository to a public docs site, creating Pull Requests for review instead of pushing directly.

This is ideal when:

  • You want changes reviewed before they go live
  • You don’t have (or don’t want) direct push access
  • You need an audit trail of all sync operations
  • Multiple teams need to approve documentation changes

You’ll build a workflow that:

  1. Watches for changes in your internal docs/public/ folder
  2. Filters out internal-only content
  3. Transforms internal references to public ones
  4. Creates a Pull Request on your public docs repository
  5. Runs automatically via GitHub Actions

You have this structure in your internal repository:

  • internal-repo/
    • src/ (application code)
    • docs/
      • public/ (customer-facing docs)
        • getting-started.md
        • api-reference.md
        • images/
          • diagram.png
      • internal/ (internal-only docs, won’t be synced)
        • architecture.md
        • runbooks.md
    • copy.bara.sky (Copybara config)

And a separate public documentation repository:

  • public-docs/
    • getting-started.md (synced from internal)
    • api-reference.md (synced from internal)
    • images/
      • diagram.png
    • README.md (not managed by Copybara)
    • CONTRIBUTING.md (not managed by Copybara)

Before you start, you need:

  • Two GitHub repositories: source (internal) and destination (public docs)
  • GitHub Personal Access Token with repo access to the destination
  • Java 11+ and Git installed
  • Copybara JAR downloaded

Step 1: Create a GitHub Personal Access Token

You need a token that can create PRs on your destination repository.

  1. Go to GitHub Settings → Developer settings → Personal access tokens → Fine-grained tokens
  2. Click Generate new token
  3. Configure:
    • Token name: copybara-docs-sync
    • Expiration: Choose based on your needs
    • Repository access: Select “Only select repositories” → choose your public docs repo
    • Permissions:
      • Contents: Read and write
      • Pull requests: Read and write
  4. Click Generate token and copy it

In your internal repository, organize your docs with clear separation:

Create docs/public/getting-started.md:

docs/public/getting-started.md
---
title: Getting Started
---
# Getting Started with Our Product
Welcome to the official documentation!
## Installation
Download from https://internal.company.com/downloads
<!-- INTERNAL: Requires VPN access for internal users -->
## Quick Start
Run the following command:
```bash
our-cli init
```
<!-- BEGIN INTERNAL -->
### Debug Mode (Internal Only)
For internal testing, use:
```bash
our-cli init --debug --internal-api
```
<!-- END INTERNAL -->
## Next Steps
- Read the [API Reference](./api-reference.md)
- Contact support@internal.company.com for help

Notice the markers:

| Marker | Purpose |
| --- | --- |
| <!-- INTERNAL: ... --> | Single-line internal notes (will be removed) |
| <!-- BEGIN INTERNAL --> and <!-- END INTERNAL --> | Multi-line internal sections (will be removed) |

These let you keep internal notes in your source while automatically stripping them during sync.
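
To preview what the markers remove before Copybara is involved, a rough local approximation with sed can help. This is only a sketch: `strip_internal` is a hypothetical helper name, it is not how Copybara implements the transformations, and it assumes the BEGIN/END markers each sit on their own line.

```bash
# Approximate the marker stripping locally (sketch; Copybara does the real work).
# strip_internal is a hypothetical helper, not part of Copybara.
# Usage: strip_internal docs/public/getting-started.md
strip_internal() {
  # First expression: delete every line from a BEGIN INTERNAL marker
  # through the next END INTERNAL marker.
  # Second expression: blank out single-line <!-- INTERNAL: ... --> comments.
  sed -E \
    -e '/<!-- BEGIN INTERNAL -->/,/<!-- END INTERNAL -->/d' \
    -e 's/<!-- INTERNAL: [^>]* -->//g' \
    "$1"
}
```

A single-line note leaves an empty line behind, which is why the real configuration also includes a blank-line cleanup step.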

Create copy.bara.sky in your internal repository root:

copy.bara.sky
# Documentation Sync Configuration
# Syncs docs/public/ to the public documentation repository via PR
INTERNAL_REPO = "https://github.com/YOUR_ORG/internal-repo"
PUBLIC_DOCS_REPO = "https://github.com/YOUR_ORG/public-docs"
core.workflow(
name = "sync-docs",
# Read from the internal repository
origin = git.github_origin(
url = INTERNAL_REPO,
ref = "main",
),
# Create a PR on the public docs repository
destination = git.github_pr_destination(
url = PUBLIC_DOCS_REPO,
destination_ref = "main",
# Branch name for the PR (${CONTEXT_REFERENCE} expands to a stable origin
# reference, so repeated syncs from the same ref reuse the same branch)
pr_branch = "copybara/docs-sync-${CONTEXT_REFERENCE}",
# PR title and description
title = "docs: sync from internal repository",
body = """\
## Automated Documentation Sync
This PR was automatically created by Copybara to sync documentation changes.
### What's included
- All changes from `docs/public/` in the internal repository
- Internal-only content has been automatically removed
- Internal URLs have been replaced with public URLs
### Source
Commit: `${COPYBARA_CONTEXT_REFERENCE}`
---
*Please review the changes and merge when ready.*
""",
# Update PR description if we push new changes
update_description = True,
),
# Only sync the public documentation folder
origin_files = glob(
include = ["docs/public/**"],
exclude = [
"**/*.draft.md",
"**/*.draft.mdx",
"**/INTERNAL_*.md",
],
),
# Don't overwrite these files in the destination
# (they're manually maintained in the public repo)
destination_files = glob(
include = ["**"],
exclude = [
"README.md",
"CONTRIBUTING.md",
"LICENSE",
".github/**",
"CNAME",
],
),
# Preserve original authors, with fallback for automated commits
authoring = authoring.pass_thru(
default = "Documentation Bot <docs-bot@your-company.com>",
),
# Transformations applied in order
transformations = [
# 1. Flatten the directory structure
# docs/public/getting-started.md → getting-started.md
core.move("docs/public/", ""),
# 2. Replace internal URLs with public ones
core.replace(
before = "https://internal.company.com",
after = "https://docs.your-company.com",
paths = glob(["**/*.md", "**/*.mdx"]),
),
# 3. Replace internal email domains
core.replace(
before = "@internal.company.com",
after = "@your-company.com",
paths = glob(["**/*.md", "**/*.mdx"]),
),
# 4. Remove single-line internal comments
# <!-- INTERNAL: any text here -->
core.replace(
before = "<!-- INTERNAL: ${content} -->",
after = "",
regex_groups = {"content": "[^>]*"},
paths = glob(["**/*.md", "**/*.mdx"]),
),
# 5. Remove multi-line internal sections
# <!-- BEGIN INTERNAL -->
# ... anything here ...
# <!-- END INTERNAL -->
core.replace(
before = "<!-- BEGIN INTERNAL -->${content}<!-- END INTERNAL -->",
after = "",
regex_groups = {"content": "[\\s\\S]*?"},
multiline = True,
paths = glob(["**/*.md", "**/*.mdx"]),
),
# 6. Clean up extra blank lines left by removals
core.replace(
before = "\n\n\n",
after = "\n\n",
multiline = True, # The pattern spans lines; line-by-line matching would never find it
paths = glob(["**/*.md", "**/*.mdx"]),
),
# 7. SAFETY: Verify no internal content leaked through
core.verify_match(
regex = "INTERNAL|CONFIDENTIAL|internal\\.company\\.com|@internal\\.",
verify_no_match = True,
paths = glob(["**/*.md", "**/*.mdx"]),
),
# 8. Add sync metadata to commit message
metadata.squash_notes(
prefix = "Documentation sync:\n\n",
show_description = True,
show_author = True,
oldest_first = True,
),
],
# Combine all changes into one commit
mode = "SQUASH",
)

git.github_pr_destination is what makes Copybara create a PR instead of pushing directly:

destination = git.github_pr_destination(
url = PUBLIC_DOCS_REPO,
destination_ref = "main", # Target branch for the PR
pr_branch = "copybara/docs-sync-${CONTEXT_REFERENCE}",
title = "docs: sync from internal",
body = "...",
update_description = True, # Update PR description on re-runs
)

destination_files protects files that exist only in the public repo:

destination_files = glob(
include = ["**"],
exclude = ["README.md", "CONTRIBUTING.md", ".github/**"],
)

Files in exclude won’t be deleted even if they don’t exist in the source.

core.verify_match is the safety net: it fails the sync if internal content would leak:

core.verify_match(
regex = "INTERNAL|CONFIDENTIAL|internal\\.company\\.com",
verify_no_match = True,
)
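
The same check can be approximated outside Copybara with grep, for example against a checkout of the generated PR branch. This is a sketch under assumptions: `check_no_leak` is a hypothetical helper, the regex is copied from the configuration above, and it should be run on transformed output, because the internal source files legitimately contain the INTERNAL markers.

```bash
# Hypothetical helper: scan Markdown files under a directory for the patterns
# that the verify_match step rejects. Returns 0 when nothing is found.
# Usage: check_no_leak path/to/public-docs-checkout
check_no_leak() {
  local dir="$1"
  # -r: recurse, -n: show line numbers, -E: extended regex,
  # --include: only look at Markdown/MDX files.
  if grep -rnE --include='*.md' --include='*.mdx' \
      'INTERNAL|CONFIDENTIAL|internal\.company\.com|@internal\.' "$dir"; then
    echo "Possible internal content found in $dir" >&2
    return 1
  fi
  return 0
}
```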

Before automating, test the sync locally:

# Clone your internal repo
git clone https://github.com/YOUR_ORG/internal-repo
cd internal-repo
# Set up Git credentials (use your PAT)
git config --global credential.helper store
echo "https://YOUR_USERNAME:YOUR_PAT@github.com" >> ~/.git-credentials
# Run Copybara (first run needs --force)
java -jar copybara.jar migrate copy.bara.sky sync-docs --force

Check GitHub - you should see a new PR on your public docs repository!

Automate the sync to run whenever docs change.

Create .github/workflows/sync-docs.yml in your internal repository:

.github/workflows/sync-docs.yml
name: Sync Documentation
on:
push:
branches: [main]
paths:
- "docs/public/**"
- "copy.bara.sky"
workflow_dispatch: # Allow manual trigger
jobs:
sync:
name: Sync to Public Docs
runs-on: ubuntu-latest
steps:
- name: Checkout repository
uses: actions/checkout@v4
with:
fetch-depth: 0 # Full history needed for Copybara
- name: Install D2 (for diagrams)
run: curl -fsSL https://d2lang.com/install.sh | sh -s --
- name: Set up Java
uses: actions/setup-java@v4
with:
distribution: temurin
java-version: "21"
- name: Download Copybara
run: |
curl -fsSL -o copybara.jar \
https://github.com/google/copybara/releases/latest/download/copybara_deploy.jar
- name: Configure Git credentials
run: |
git config --global user.name "github-actions[bot]"
git config --global user.email "github-actions[bot]@users.noreply.github.com"
git config --global credential.helper store
echo "https://x-access-token:${{ secrets.DOCS_SYNC_TOKEN }}@github.com" >> ~/.git-credentials
- name: Run Copybara
run: |
java -jar copybara.jar migrate copy.bara.sky sync-docs --ignore-noop

Then add the token from Step 1 as a repository secret so the workflow can read it:

  1. Go to your internal repository → Settings → Secrets and variables → Actions
  2. Click New repository secret
  3. Name: DOCS_SYNC_TOKEN
  4. Value: Paste your GitHub PAT from Step 1

Here’s what happens when documentation changes:

| Scenario | What Happens |
| --- | --- |
| First sync | New PR is created |
| More changes (PR still open) | PR branch is updated (force push) |
| PR was merged | New PR is created for new changes |
| PR was closed without merging | New PR is created |

Use these patterns in your documentation:

This is public content.
<!-- INTERNAL: Remember to update the staging server first -->
More public content.

After sync:

This is public content.
More public content.

For a multi-line internal section:

## Public Feature
Public description here.
<!-- BEGIN INTERNAL -->
### Internal Implementation Notes
This entire section is stripped during sync.
- Internal detail 1
- Internal detail 2
<!-- END INTERNAL -->
## Another Public Section

After sync:

## Public Feature
Public description here.
## Another Public Section

Any file matching these patterns is completely excluded:

  • *.draft.md - Draft documents
  • INTERNAL_*.md - Files prefixed with INTERNAL_
  • Anything in exclude patterns
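
To see which files those patterns would keep out of a sync, the globs can be mirrored with find. This is a minimal sketch: `list_excluded_docs` is a hypothetical helper, and the name patterns are hand-copied from the exclude list in the configuration, so they need updating whenever that list changes.

```bash
# Hypothetical helper: list files under docs/public that match the
# exclude patterns (*.draft.md, *.draft.mdx, INTERNAL_*.md).
# Usage: list_excluded_docs            (defaults to docs/public)
#        list_excluded_docs some/dir
list_excluded_docs() {
  local root="${1:-docs/public}"
  find "$root" -type f \
    \( -name '*.draft.md' -o -name '*.draft.mdx' -o -name 'INTERNAL_*.md' \) \
    | sort
}
```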

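Images in docs/public/images/ are synced along with the Markdown, because the docs/public/** glob matches every file type. To limit the sync to specific file types, list them explicitly: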

origin_files = glob(
include = [
"docs/public/**/*.md",
"docs/public/**/*.mdx",
"docs/public/**/*.png",
"docs/public/**/*.jpg",
"docs/public/**/*.gif",
"docs/public/**/*.svg",
],
exclude = [...],
)

If your images use absolute paths, add a transformation:

# Update image paths after flattening
core.replace(
before = "](/docs/public/images/",
after = "](/images/",
paths = glob(["**/*.md"]),
)
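
To find out whether any page actually uses such absolute image paths, a quick grep over the source docs is enough. This is a sketch: `find_absolute_image_links` is a hypothetical helper, and the path prefix is the one assumed by the transformation above.

```bash
# Hypothetical helper: list Markdown lines that link to images through the
# absolute /docs/public/images/ prefix.
# Usage: find_absolute_image_links     (defaults to docs/public)
find_absolute_image_links() {
  local root="${1:-docs/public}"
  # -F treats the pattern as a fixed string, so "](" needs no escaping.
  # "|| true" keeps the exit status 0 when there are no matches.
  grep -rnF --include='*.md' '](/docs/public/images/' "$root" || true
}
```

If the command prints nothing, the extra core.replace step is not needed.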

Cause: Token doesn’t have PR permissions.

Fix: Ensure your token has pull_requests: write permission for the destination repository.

Cause: Internal content would leak to public.

Fix: Check which file triggered it:

java -jar copybara.jar migrate copy.bara.sky sync-docs --force 2>&1 | grep -A5 "verify_match"

Then either:

  1. Remove the internal content from the source
  2. Wrap it in <!-- BEGIN INTERNAL --> and <!-- END INTERNAL --> markers
  3. Add the file to exclude patterns

Cause: No new changes since last sync.

Fix: This is normal. Use --ignore-noop in CI so the job doesn’t fail:

java -jar copybara.jar migrate copy.bara.sky sync-docs --ignore-noop

Cause: PR branch exists from a previous sync.

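Fix: This is normal. Copybara will update the existing branch. If you want a fresh start, delete the branch from the public docs repository (run this in a clone of that repository, substituting the actual branch name):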

git push origin --delete copybara/docs-sync-abc1234
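
To look up the actual branch name before deleting it, the output of `git ls-remote --heads` can be filtered down to the Copybara sync branches. A small sketch: `sync_branch_names` is a hypothetical helper that reads that output from stdin, and the `copybara/` prefix is taken from the pr_branch setting in this guide.

```bash
# Hypothetical helper: read `git ls-remote --heads` output from stdin and
# print only the branch names that start with copybara/.
# Usage:
#   git ls-remote --heads https://github.com/YOUR_ORG/public-docs | sync_branch_names
sync_branch_names() {
  # Each input line is "<sha><TAB>refs/heads/<branch>".
  awk '$2 ~ /^refs\/heads\/copybara\// { sub(/^refs\/heads\//, "", $2); print $2 }'
}
```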

Cause: destination_files might not be excluding manually-maintained files.

Fix: Add those files to the exclude list:

destination_files = glob(
include = ["**"],
exclude = [
"README.md",
"CONTRIBUTING.md",
"YOUR_MANUAL_FILE.md", # Add this
],
)

Here’s the full configuration with all pieces:

copy.bara.sky
INTERNAL = "https://github.com/acme/internal-monorepo"
PUBLIC_DOCS = "https://github.com/acme/developer-docs"
core.workflow(
name = "sync-docs",
origin = git.github_origin(
url = INTERNAL,
ref = "main",
),
destination = git.github_pr_destination(
url = PUBLIC_DOCS,
destination_ref = "main",
pr_branch = "copybara/docs-${CONTEXT_REFERENCE}",
title = "docs: automated sync from internal",
body = """\
## Documentation Sync
Automated sync from internal repository.
**Commit:** `${COPYBARA_CONTEXT_REFERENCE}`
Please review and merge when ready.
""",
update_description = True,
),
origin_files = glob(
include = ["docs/public/**"],
exclude = ["**/*.draft.md", "**/INTERNAL_*"],
),
destination_files = glob(
include = ["**"],
exclude = ["README.md", "CONTRIBUTING.md", ".github/**", "CNAME"],
),
authoring = authoring.pass_thru(
default = "Docs Bot <docs@acme.com>"
),
transformations = [
core.move("docs/public/", ""),
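# Match the full URL so the email replacement below still finds "@internal.acme.com"
core.replace("https://internal.acme.com", "https://docs.acme.com"),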
core.replace("@internal.acme.com", "@acme.com"),
core.replace(
before = "<!-- INTERNAL: ${note} -->",
after = "",
regex_groups = {"note": "[^>]*"},
),
core.replace(
before = "<!-- BEGIN INTERNAL -->${content}<!-- END INTERNAL -->",
after = "",
regex_groups = {"content": "[\\s\\S]*?"},
multiline = True,
),
core.verify_match(
regex = "INTERNAL|internal\\.acme\\.com",
verify_no_match = True,
),
],
mode = "SQUASH",
)