
Glob Reference

Globs define file patterns for filtering.

glob(include, exclude)

Pattern   Matches
*         Any characters except /
**        Any characters including /
?         Single character
[abc]     Character class
[!abc]    Negated character class

glob(["src/**", "docs/**"])

glob(
    include = ["**"],
    exclude = ["**/internal/**"],
)

glob(["**/*.py", "**/*.pyi"])

glob(["src/**"])
glob(["src/*"]) # One level only

glob(
    include = ["**"],
    exclude = [
        "**/test/**",
        "**/*_test.go",
        "**/testdata/**",
    ],
)

origin_files = glob(["src/**"])
destination_files = glob(
    include = ["**"],
    exclude = ["README.md"],
)

core.replace(
    before = "old",
    after = "new",
    paths = glob(["**/*.md"]),
)

core.remove(glob(["**/*.bak"]))

glob(["*.md"]) # Only root-level .md files
glob(["**/*.md"]) # All .md files recursively

glob(["src/*/*.py"]) # Python files exactly 1 level under src/
glob(["src/**/*.py"]) # Python files at any depth under src/

glob(["**/*.{js,ts,jsx,tsx}"]) # NOT supported - use list instead
glob(["**/*.js", "**/*.ts", "**/*.jsx", "**/*.tsx"]) # Correct

glob(["**/.*"]) # All hidden files (starting with .)
glob(["**/.github/**"]) # .github directory contents

# WRONG: Only matches src/file.py, not src/sub/file.py
glob(["src/*.py"])
# CORRECT: Matches all .py files under src/
glob(["src/**/*.py"])
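
Because the brace form shown above is marked as not supported, one option is to expand it into an explicit list before pasting it into a config. The following Python sketch does that expansion outside Copybara. The helper name expand_braces is hypothetical, and nested brace groups are not handled.

```python
import re


def expand_braces(pattern: str) -> list[str]:
    """Expand each {a,b,...} group into separate patterns.

    Hypothetical helper for preparing a pattern list by hand;
    it is not part of Copybara. Nested groups are not handled.
    """
    match = re.search(r"\{([^{}]*)\}", pattern)
    if match is None:
        return [pattern]
    head, tail = pattern[:match.start()], pattern[match.end():]
    expanded = []
    for option in match.group(1).split(","):
        # Recurse so that later groups in the tail are expanded as well.
        expanded.extend(expand_braces(head + option + tail))
    return expanded


print(expand_braces("**/*.{js,ts,jsx,tsx}"))
# ['**/*.js', '**/*.ts', '**/*.jsx', '**/*.tsx']
```

The printed list can be copied into glob([...]) as the include argument.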

Excludes are processed after includes. An excluded path cannot be re-included.

glob(
    include = ["**"],
    exclude = ["internal/**"],
)
# internal/important.txt is EXCLUDED, even though the include pattern matches it
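
The include-then-exclude rule can be modelled in a few lines of Python. This is a minimal sketch, not Copybara's implementation: glob_to_regex and selected are hypothetical names, only **, * and ? are translated, and treating a "**/" segment as "zero or more directories" is an assumption taken from the comments in this reference.

```python
import re


def glob_to_regex(pattern: str) -> re.Pattern:
    """Translate a glob into a regex (handles **, * and ? only)."""
    parts = []
    i = 0
    while i < len(pattern):
        if pattern.startswith("**/", i):
            parts.append("(?:.*/)?")  # zero or more directories (assumption)
            i += 3
        elif pattern.startswith("**", i):
            parts.append(".*")        # any characters, including /
            i += 2
        elif pattern[i] == "*":
            parts.append("[^/]*")     # any characters except /
            i += 1
        elif pattern[i] == "?":
            parts.append("[^/]")      # one character (assumed not to match /)
            i += 1
        else:
            parts.append(re.escape(pattern[i]))
            i += 1
    return re.compile("".join(parts))


def selected(path, include, exclude=()):
    """A path is selected if an include matches and no exclude matches."""
    included = any(glob_to_regex(p).fullmatch(path) for p in include)
    excluded = any(glob_to_regex(p).fullmatch(path) for p in exclude)
    return included and not excluded


# Listing the file explicitly in include does not bring it back.
print(selected("internal/important.txt",
               ["**", "internal/important.txt"], ["internal/**"]))  # False
print(selected("src/main.py", ["**"], ["internal/**"]))             # True
```

Because the exclude check is applied last, no include pattern can override it in this model.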

Glob patterns should NOT have trailing slashes:

# WRONG
glob(["src/"])
# CORRECT
glob(["src/**"])
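
If patterns are generated by a script, a small guard can rewrite directory-style entries before they reach the config. The helper below is a hypothetical sketch and not a Copybara function.

```python
def fix_trailing_slash(pattern: str) -> str:
    """Rewrite a directory-style pattern such as 'src/' to 'src/**'."""
    if pattern.endswith("/"):
        return pattern + "**"
    return pattern


print([fix_trailing_slash(p) for p in ["src/", "docs/**", "README.md"]])
# ['src/**', 'docs/**', 'README.md']
```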

Copybara calculates “roots” from your glob patterns to determine which directories to query from the repository. More specific patterns mean fewer files traversed.

# SLOWER: Scans entire repo, then filters
glob(
    include = ["**"],
    exclude = ["vendor/**", "node_modules/**", "dist/**"],
)

# FASTER: Only queries src/ and docs/ directories
glob(["src/**", "docs/**"])

Copybara extracts the static prefix before any wildcard to determine query roots. For a pattern with no wildcard, the root is the directory that contains the file:

Pattern        Computed Root
src/**/*.py    src
foo/bar.txt    foo
**/*.java      "" (root)
{foo,bar}/**   "" (root)

Patterns with wildcards at the start (like **/*.java) force a full repo scan.
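
The table above can be reproduced with a short Python sketch. It is a model of the described behaviour under stated assumptions, not Copybara's code: the names compute_root and compute_roots are hypothetical, and the set of characters treated as wildcards is inferred from the table rows.

```python
def compute_root(pattern: str) -> str:
    """Return the directory prefix that precedes the first wildcard segment."""
    wildcard_chars = set("*?[]{}")  # assumed set, inferred from the table
    static = []
    for segment in pattern.split("/"):
        if any(ch in wildcard_chars for ch in segment):
            break
        static.append(segment)
    else:
        # No wildcard anywhere: the pattern names a file, so drop the file name.
        static = static[:-1]
    return "/".join(static)


def compute_roots(patterns: list[str]) -> list[str]:
    """Collect the distinct roots of several include patterns."""
    return sorted({compute_root(p) for p in patterns})


print(compute_roots(["src/**", "docs/**"]))     # ['docs', 'src']
print(compute_roots(["**/*.java", "src/**"]))   # ['', 'src']
```

An empty string in the result stands for the repository root, which corresponds to the full-scan case.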

Use + and - operators to build complex patterns efficiently:

base = glob(["src/**"])
tests = glob(["src/**/*_test.py", "src/**/test_*.py"])
# All source files except tests
origin_files = base - tests
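
One way to reason about + and - is as set union and set difference over paths. The Python sketch below models that reading with plain predicates. PathSet is a hypothetical class, and the two predicates are hand-written stand-ins for the glob patterns in the example above.

```python
import posixpath
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class PathSet:
    """A set of paths described by a membership predicate."""
    contains: Callable[[str], bool]

    def __add__(self, other: "PathSet") -> "PathSet":
        # Union: a path is in the result if either side contains it.
        return PathSet(lambda p: self.contains(p) or other.contains(p))

    def __sub__(self, other: "PathSet") -> "PathSet":
        # Difference: a path is in the result only if the right side lacks it.
        return PathSet(lambda p: self.contains(p) and not other.contains(p))


def is_test_file(path: str) -> bool:
    name = posixpath.basename(path)
    return path.startswith("src/") and (
        name.endswith("_test.py")
        or (name.startswith("test_") and name.endswith(".py"))
    )


base = PathSet(lambda p: p.startswith("src/"))   # stands in for src/**
tests = PathSet(is_test_file)                    # stands in for the test globs
origin_files = base - tests

print(origin_files.contains("src/app.py"))           # True
print(origin_files.contains("src/pkg/app_test.py"))  # False
```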

Each level of glob nesting adds processing overhead:

# Simple and fast
glob(include = ["src/**"], exclude = ["src/internal/**"])
# More complex - avoid if not needed
(glob(["src/**"]) - glob(["src/internal/**"])) + glob(["docs/**"])