
Remote Files Module

The remotefiles module allows Copybara to use remote archives (ZIP, TAR, etc.) as origins. This is useful for syncing from package registries, release archives, or any HTTP-accessible file.

Define a remote archive as an origin:

origin = remotefiles.origin(
    archive_source = "https://example.com/releases/v${VERSION}.tar.gz",
    unpack_method = "TAR_GZ",
    version_list = my_version_list,
    origin_version_selector = core.latest_version(),
)

| Parameter | Type | Default | Description |
| --- | --- | --- | --- |
| archive_source | string | "" | URL template to download archive. Use ${VERSION} as placeholder |
| unpack_method | string | "AS_IS" | How to unpack: ZIP, TAR, TAR_GZ, TAR_XZ, TAR_BZ2, AS_IS |
| author | string | "Copybara <noreply@copybara.io>" | Author for the generated change |
| message | string | "Placeholder message" | Commit message for the change |
| version_list | VersionList | None | Source of available versions |
| origin_version_selector | VersionSelector | None | How to select version from list |
| version_resolver | VersionResolver | None | Resolve refs to specific versions |
| auth | AuthInterceptor | None | Authentication for HTTP requests |

| Method | Description |
| --- | --- |
| AS_IS | Don't unpack, use file as-is |
| ZIP | Extract ZIP archive |
| TAR | Extract TAR archive |
| TAR_GZ | Extract gzipped TAR (.tar.gz, .tgz) |
| TAR_XZ | Extract XZ-compressed TAR (.tar.xz) |
| TAR_BZ2 | Extract bzip2-compressed TAR (.tar.bz2) |
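
To make the unpack methods concrete, here is a minimal Python sketch of what each value amounts to, using only the standard library. It is an illustration, not Copybara's implementation; the `unpack` function and the `TAR_MODES` mapping are hypothetical names introduced for this example.

```python
import shutil
import tarfile
import zipfile
from pathlib import Path

# Hypothetical mapping from unpack_method values to Python tarfile read modes.
TAR_MODES = {"TAR": "r:", "TAR_GZ": "r:gz", "TAR_XZ": "r:xz", "TAR_BZ2": "r:bz2"}


def unpack(archive: Path, dest: Path, unpack_method: str = "AS_IS") -> None:
    """Place the contents of `archive` under `dest` according to `unpack_method`."""
    dest.mkdir(parents=True, exist_ok=True)
    if unpack_method == "AS_IS":
        # No extraction: the downloaded file itself is the content.
        shutil.copy(archive, dest / archive.name)
    elif unpack_method == "ZIP":
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(dest)
    elif unpack_method in TAR_MODES:
        with tarfile.open(archive, TAR_MODES[unpack_method]) as tf:
            tf.extractall(dest)
    else:
        raise ValueError(f"unknown unpack_method: {unpack_method}")
```

With "AS_IS", the destination ends up holding the archive file itself, which suits single-file downloads such as a binary or a JSON document.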

Download and extract a release tarball:

core.workflow(
    name = "sync_release",
    origin = remotefiles.origin(
        archive_source = "https://github.com/org/repo/archive/refs/tags/v1.0.0.tar.gz",
        unpack_method = "TAR_GZ",
        author = "Release Bot <bot@example.com>",
        message = "Import release v1.0.0",
    ),
    destination = git.destination(
        url = "https://github.com/org/internal-repo",
        push = "main",
    ),
    authoring = authoring.overwrite("Bot <bot@example.com>"),
)

Use version selectors to automatically pick the latest version:

core.workflow(
    name = "sync_latest",
    origin = remotefiles.origin(
        archive_source = "https://registry.example.com/package/${VERSION}.zip",
        unpack_method = "ZIP",
        version_list = http.version_list(
            url = "https://registry.example.com/package/versions.json",
            # Parse JSON response to get version list
        ),
        origin_version_selector = core.latest_version(
            format = "^[0-9]+\\.[0-9]+\\.[0-9]+$",  # Semver only
        ),
    ),
    destination = git.destination(...),
    authoring = authoring.overwrite("Sync Bot <bot@example.com>"),
)
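
The effect of restricting selection to a semver-only format can be sketched in Python. This is an illustration of the idea (filter by pattern, then compare components numerically), not Copybara's selector; `latest_semver` is a hypothetical helper and the version strings are made up.

```python
import re

# Same pattern as in the config above: MAJOR.MINOR.PATCH, digits only.
SEMVER = re.compile(r"^[0-9]+\.[0-9]+\.[0-9]+$")


def latest_semver(versions):
    """Return the highest MAJOR.MINOR.PATCH version, ignoring non-matching entries."""
    matching = [v for v in versions if SEMVER.match(v)]
    if not matching:
        raise ValueError("no version matches the format")
    # Compare numerically so that 1.10.0 sorts above 1.9.0.
    return max(matching, key=lambda v: tuple(int(part) for part in v.split(".")))


print(latest_semver(["1.9.0", "1.10.0", "2.0.0-rc1", "latest"]))  # prints 1.10.0
```

Pre-releases such as "2.0.0-rc1" and floating tags such as "latest" are dropped by the pattern, which is why the result is 1.10.0 rather than the release candidate.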

Sync from the Go module proxy:

core.workflow(
    name = "import_go_module",
    origin = remotefiles.origin(
        archive_source = "https://proxy.golang.org/github.com/org/module/@v/${VERSION}.zip",
        unpack_method = "ZIP",
        version_list = go.go_proxy_version_list(
            module = "github.com/org/module",
        ),
        origin_version_selector = core.latest_version(),
    ),
    destination = git.destination(
        url = "https://internal.example.com/go-modules/module",
        push = "main",
    ),
    authoring = authoring.overwrite("Go Sync <go-sync@example.com>"),
)
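
For orientation, the Go module proxy protocol serves the known versions of a module as plain text, one per line, at the module's `@v/list` endpoint. The Python sketch below shows how such a response lines up with the archive URL used above. The helper names and the sample response are assumptions made for this example, no network request is made, and the sketch does not handle the proxy's escaping of uppercase letters in module paths.

```python
from typing import List

# Made-up response body in the shape the proxy returns: one version per line.
SAMPLE_LIST_RESPONSE = "v1.0.0\nv1.1.0\nv1.2.0\n"


def version_list_url(module: str) -> str:
    """URL of the proxy endpoint that lists known versions of `module`."""
    return f"https://proxy.golang.org/{module}/@v/list"


def parse_version_list(body: str) -> List[str]:
    """Split the plain-text response into version strings, skipping blank lines."""
    return [line.strip() for line in body.splitlines() if line.strip()]


def archive_url(module: str, version: str) -> str:
    """URL of the module zip for one version (what ${VERSION} expands into)."""
    return f"https://proxy.golang.org/{module}/@v/{version}.zip"


versions = parse_version_list(SAMPLE_LIST_RESPONSE)
print(version_list_url("github.com/org/module"))
print(archive_url("github.com/org/module", versions[-1]))
```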

Sync an npm package:

core.workflow(
    name = "import_npm_package",
    origin = remotefiles.origin(
        archive_source = "https://registry.npmjs.org/package-name/-/package-name-${VERSION}.tgz",
        unpack_method = "TAR_GZ",
        version_list = npm.npm_version_list(
            package_name = "package-name",
        ),
        origin_version_selector = core.latest_version(),
    ),
    destination = git.destination(...),
    authoring = authoring.overwrite("NPM Sync <npm@example.com>"),
)
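
The npm registry describes a package in a JSON metadata document whose `versions` object is keyed by version string. The Python sketch below shows how a version list and a tarball URL could be derived from such a document. The sample metadata is trimmed and made up, the helper names are hypothetical, and scoped packages (`@scope/name`) are not handled.

```python
import json
from typing import List

# Trimmed, made-up registry metadata for a hypothetical package.
SAMPLE_METADATA = """
{
  "name": "package-name",
  "dist-tags": {"latest": "1.1.0"},
  "versions": {
    "1.0.0": {"dist": {"tarball": "https://registry.npmjs.org/package-name/-/package-name-1.0.0.tgz"}},
    "1.1.0": {"dist": {"tarball": "https://registry.npmjs.org/package-name/-/package-name-1.1.0.tgz"}}
  }
}
"""


def list_versions(metadata_json: str) -> List[str]:
    """Return the version strings published in the metadata document."""
    return list(json.loads(metadata_json)["versions"])


def tarball_url(package: str, version: str) -> str:
    """Expand the archive_source template used above for an unscoped package."""
    return f"https://registry.npmjs.org/{package}/-/{package}-{version}.tgz"


print(list_versions(SAMPLE_METADATA))
print(tarball_url("package-name", "1.1.0"))
```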

Download from an authenticated endpoint:

core.workflow(
    name = "sync_private_archive",
    origin = remotefiles.origin(
        archive_source = "https://private.example.com/releases/${VERSION}.tar.gz",
        unpack_method = "TAR_GZ",
        auth = http.bearer_auth(
            credentials.static_secret("api_token", "API_TOKEN"),
        ),
    ),
    destination = git.destination(...),
    authoring = authoring.overwrite("Bot <bot@example.com>"),
)
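
Bearer authentication means the download request carries an `Authorization: Bearer <token>` header. The Python sketch below builds such a request without sending it. It only illustrates the header shape, not Copybara's auth interceptor; `build_request` is a hypothetical helper and "API_TOKEN" is the same placeholder used in the config above.

```python
import urllib.request


def build_request(url: str, token: str) -> urllib.request.Request:
    """Create a GET request for `url` that carries a bearer token."""
    request = urllib.request.Request(url)
    request.add_header("Authorization", f"Bearer {token}")
    return request


request = build_request(
    "https://private.example.com/releases/1.0.0.tar.gz",
    "API_TOKEN",  # placeholder; a real token should come from a secret store
)
print(request.get_header("Authorization"))  # prints Bearer API_TOKEN
```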

When both version_list and origin_version_selector are set, Copybara automatically selects which version to sync.

| Selector | Description |
| --- | --- |
| core.latest_version() | Select the latest/newest version |
| core.latest_version(format = "regex") | Latest version matching regex |
| Custom selector | Implement custom selection logic |

  1. Copybara fetches the version list from version_list
  2. The origin_version_selector picks a version
  3. The ${VERSION} placeholder in archive_source is replaced
  4. The archive is downloaded and unpacked
# Example flow:
# 1. version_list returns: ["1.0.0", "1.1.0", "2.0.0"]
# 2. origin_version_selector picks: "2.0.0"
# 3. archive_source becomes: "https://example.com/releases/v2.0.0.tar.gz"
# 4. Archive is downloaded and extracted
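
The same flow can be traced in a few lines of Python. This is a sketch of the steps listed above, not Copybara's code: `resolve_archive_url` is a hypothetical helper, the selector is passed in as a plain function, and `string.Template` stands in for the `${VERSION}` substitution.

```python
from string import Template
from typing import Callable, Sequence


def resolve_archive_url(
    archive_source: str,
    versions: Sequence[str],
    select: Callable[[Sequence[str]], str],
) -> str:
    """Pick a version with `select` and substitute it for ${VERSION}."""
    version = select(versions)
    return Template(archive_source).substitute(VERSION=version)


url = resolve_archive_url(
    "https://example.com/releases/v${VERSION}.tar.gz",
    ["1.0.0", "1.1.0", "2.0.0"],
    # Stand-in selector: highest version by numeric comparison of its parts.
    select=lambda vs: max(vs, key=lambda v: tuple(map(int, v.split(".")))),
)
print(url)  # prints https://example.com/releases/v2.0.0.tar.gz
```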

Import a library that doesn’t use Git:

origin = remotefiles.origin(
    archive_source = "https://example.com/lib-${VERSION}.tar.gz",
    unpack_method = "TAR_GZ",
    version_list = ...,
    origin_version_selector = core.latest_version(),
)

Vendor external dependencies into your repository:

core.workflow(
    name = "vendor_dep",
    origin = remotefiles.origin(...),
    destination = git.destination(
        url = "https://github.com/org/repo",
        push = "main",
    ),
    destination_files = glob(["vendor/dep/**"]),
    transformations = [
        core.move("", "vendor/dep/"),
    ],
    ...
)
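
The move transformation in this workflow relocates every file from the archive root into `vendor/dep/`, which is also the only part of the destination that `destination_files` covers. A small Python sketch of the resulting path mapping follows. `move_to_prefix` is a hypothetical helper and the file names are made up.

```python
from pathlib import PurePosixPath
from typing import Iterable, List


def move_to_prefix(paths: Iterable[str], prefix: str) -> List[str]:
    """Return each archive-relative path as it would appear under `prefix`."""
    return [str(PurePosixPath(prefix) / path) for path in paths]


print(move_to_prefix(["README.md", "src/lib.c"], "vendor/dep/"))
# prints ['vendor/dep/README.md', 'vendor/dep/src/lib.c']
```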

Mirror releases from one location to another:

core.workflow(
    name = "mirror_releases",
    origin = remotefiles.origin(
        archive_source = "https://external.example.com/releases/${VERSION}.zip",
        unpack_method = "ZIP",
        ...
    ),
    destination = git.destination(
        url = "https://internal.example.com/mirror",
        push = "releases",
    ),
    ...
)

  • No history: Remote archives carry no Git history, so each sync creates a single commit
  • No incremental sync: Each run downloads the entire archive
  • State tracking: The resolved ${VERSION} value is used to track state, so origins without a version always re-sync
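
The state-tracking behaviour in the last item can be pictured as a simple comparison. The Python sketch below is an assumption about the logic, written to match the description above rather than taken from Copybara's code; `needs_sync` is a hypothetical helper.

```python
from typing import Optional


def needs_sync(resolved_version: Optional[str], last_imported: Optional[str]) -> bool:
    """Decide whether a run should import the archive again."""
    if resolved_version is None:
        # Versionless origin: nothing to compare against, so always re-sync.
        return True
    return resolved_version != last_imported


print(needs_sync("2.0.0", "1.1.0"))  # prints True
print(needs_sync("2.0.0", "2.0.0"))  # prints False
print(needs_sync(None, None))        # prints True
```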