Search inside files recursively with grep (and faster alternatives)
Thu Aug 21st, 2025

Grep: Search in Files: Use Grep to Find Text in Files

Problem

  • You need to find which files contain a given text.
  • You want recursive search, smart excludes, and readable results on Linux or macOS.

Solutions

  • Use grep recursively with includes/excludes (works on GNU and most BSD/macOS builds).
grep -RIn --exclude-dir={.git,node_modules,dist,build} -- "needle" .
  • Limit by file pattern with --include:
grep -RIn --include='*.'{ts,js,py,md} -- "needle" .
  • Basic find example that passes grep to -exec (scope to . instead of / so you don’t accidentally crawl your entire filesystem):
find . -type f -exec grep -H 'needle' {} \;
  • Another find grep example that also uses -prune to exclude certain files or directories:
find . -type d \( -name .git -o -name node_modules -o -name dist -o -name build \) \
  -prune -o -type f -print0 \
| xargs -0 grep -nI -- "needle"

Breakdown

  • find . → start searching in the current directory.
  • -type d → match directories.
  • \( -name .git -o -name node_modules -o ... \) → true if the directory name matches any of these.
  • -prune → when a match is found, do not descend into that directory tree.
  • -o → logical “OR”: if the left side is false (i.e., the dir is not one of those names), then evaluate the right side.
  • -type f -print0 → for everything else (files), print them in NUL-terminated format.

So, the pruning effect comes from -prune, not from -o -name alone. The -o is just the way to say “if not pruned, then continue with the next action.”
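
To see the pruning in action, here is a small self-contained sketch. The directory layout (src/ and node_modules/pkg/) and the file contents are made up for illustration; the script builds a throwaway tree with mktemp and counts matching files with and without -prune.

```shell
# Build a throwaway tree (hypothetical layout, for illustration only).
demo=$(mktemp -d)
mkdir -p "$demo/src" "$demo/node_modules/pkg"
echo "needle in source" > "$demo/src/app.js"
echo "needle in dependency" > "$demo/node_modules/pkg/index.js"

# Without -prune: find descends everywhere, so both files are searched.
all=$(find "$demo" -type f -exec grep -l -- "needle" {} + | wc -l | tr -d ' ')

# With -prune: node_modules is never entered, so only src/app.js is searched.
pruned=$(find "$demo" -type d -name node_modules -prune -o -type f \
  -exec grep -l -- "needle" {} + | wc -l | tr -d ' ')

echo "without prune: $all file(s), with prune: $pruned file(s)"
rm -rf "$demo"
```

The tr call only strips the padding that some wc implementations add around the count.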

  • Alternative with clearer grouping and multiple lines inside \(...\):
find . -type d \( \
    -name .git -o \
    -name node_modules -o \
    -name dist -o \
    -name build \
  \) -prune -o -type f -print0 \
| xargs -0 grep -nI -- "needle"
  • Use ripgrep (faster, respects .gitignore by default). Recommended.
rg -n "needle"
  • Search only certain extensions with ripgrep.
rg -n --glob "**/*.{ts,js,py,md}" "needle"
  • Use git grep inside repos (fast; searches only tracked files, so ignored and untracked files are skipped).
git grep -n -- "needle"
  • macOS without GNU-style excludes? Use find+xargs (above) or install ripgrep via Homebrew.
brew install ripgrep
rg -n "needle"

Things to Consider

  • Quote the pattern to avoid shell globbing or regex surprises.
  • Use -F (fixed string) when you search literals, not regex.
  • Use -I or equivalent to skip binary files.
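  • ripgrep skips hidden files, binary files, and anything matched by .gitignore by default; git grep searches only tracked files; plain grep searches everything unless you exclude it.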
  • On some macOS/BSD greps, -P (PCRE) is unavailable; prefer -E or ripgrep.
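
The difference between a regex and a fixed-string search is easy to miss, so here is a minimal sketch. The sample lines are made up for illustration; the same pattern is counted once as a regex and once with -F.

```shell
# Sample data (made up): one line matches literally, one only as a regex.
sample=$(mktemp)
printf '%s\n' 'version 1.2' 'version 1x2' > "$sample"

# As a regex, "." matches any character, so both lines match.
regex_hits=$(grep -c -- '1.2' "$sample")

# With -F the pattern is a fixed string, so only the literal "1.2" matches.
fixed_hits=$(grep -cF -- '1.2' "$sample")

echo "regex: $regex_hits, fixed: $fixed_hits"
rm -f "$sample"
```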

Here’s a reusable function to search for files containing a particular bit of text passed as a CLI argument:

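sif() {
  local q="$1"; shift
  if command -v rg >/dev/null 2>&1; then
    # ripgrep: include hidden files, but skip VCS and build directories.
    rg -n --hidden --glob '!.git' --glob '!node_modules' --glob '!dist' --glob '!build' -- "$q" "${@:-.}"
  elif command -v git >/dev/null 2>&1 && git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
    # git grep: tracked files only; pass pathspecs only when paths were given.
    if [ "$#" -gt 0 ]; then
      git grep -n -e "$q" -- "$@"
    else
      git grep -n -e "$q"
    fi
  else
    # Portable fallback: search every path given (default: current directory).
    # -H keeps file names in the output even when only one file is found.
    find "${@:-.}" -type d \( -name .git -o -name node_modules -o -name dist -o -name build \) -prune -o -type f \
      -exec grep -nIH -- "$q" {} +
  fi
}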

Here’s another one that takes a directory path as the first argument and an optional file extension as the second, then lists the matching files sorted by size:

find_files() {
  local dir="$1"
  local ext="$2"

  local exclude_dirs=(
    "node_modules"
    "dist"
    "build"
    ".git"
    ".next"
    ".cache"
    "vendor"
    "venv"
    "env"
    "target"
    "out"
    "coverage"
  )

  local find_expr=(-type f)
  if [[ -n "$ext" ]]; then
    find_expr+=(-iname "*.$ext")
  fi

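  # Build the prune clauses as array elements so the shell never
  # word-splits or glob-expands the */name patterns.
  local prune_expr=()
  local d
  for d in "${exclude_dirs[@]}"; do
    prune_expr+=(-path "*/$d" -prune -o)
  done

  # -exec ... {} + copes with spaces in file names and runs du only when files match.
  find "$dir" "${prune_expr[@]}" "${find_expr[@]}" -exec du -h {} + 2>/dev/null \
    | sort -h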
}

NOTE: If you want these functions available globally:

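  • Save one to a standalone script file (e.g., sif.sh) that starts with a shebang (#!/usr/bin/env bash) and ends with a line that calls the function (sif "$@"); a file that only defines the function does nothing when executed. Then run: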

    chmod +x sif.sh
    ./sif.sh "needle"
  • Or, for persistent usage, keep the files definition-only (no trailing call) and source them from your shell profile:

    # In ~/.bashrc or ~/.zshrc
    source ~/scripts/sif.sh
    source ~/scripts/find_files.sh
  • Alternatively, paste the function definitions directly into your shell profile. Reload with:

    source ~/.zshrc
    # or
    source ~/.bashrc

Then you can run sif and find_files like any normal shell command.
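
If you want one file that can be both executed and sourced, a guard around the trailing call is one option. The sketch below is bash-specific (it relies on the BASH_SOURCE variable), uses a simplified grep-only stand-in for sif, and writes everything to a throwaway directory; the file names are hypothetical.

```shell
# Write a hypothetical sif.sh to a throwaway directory.
dir=$(mktemp -d)
cat > "$dir/sif.sh" <<'EOF'
#!/usr/bin/env bash
# Simplified stand-in for the sif function: plain recursive grep only.
sif() {
  local q="$1"; shift
  grep -rIn -- "$q" "${@:-.}"
}

# Call the function only when the file is executed, not when it is sourced.
if [[ "${BASH_SOURCE[0]}" == "$0" ]]; then
  sif "$@"
fi
EOF
chmod +x "$dir/sif.sh"
echo "TODO: write tests" > "$dir/notes.txt"

# Executed: the guard is true, so the search runs.
ran=$("$dir/sif.sh" "TODO" "$dir")

# Sourced: only the definition is loaded; call the function yourself.
source "$dir/sif.sh"
sourced=$(sif "TODO" "$dir")

printf '%s\n' "$ran" "$sourced"
rm -rf "$dir"
```

Both calls should report the same match in notes.txt. Without the guard, sourcing the file from a shell profile would run a search every time a shell starts.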

Example usage for sif

Search inside files for a string:

# Search for "TODO" in the current directory
sif "TODO"

# Search for "import express" inside src/
sif "import express" src/

# Search across multiple directories
sif "needle" src tests scripts

Example usage for find_files

List files with a given extension, excluding noisy directories, sorted by size:

# List all .js files in ./src sorted by size
find_files ./src js

# List all Python files in current dir
find_files . py

# List all files regardless of extension under project root
find_files . ""

Gotchas

  • Forgetting quotes causes the shell to expand patterns.
  • Searching binary or huge vendor folders slows results.
  • With GNU grep, -R follows every symlink it meets, which can cause loops; prefer -r, which follows only symlinks named on the command line.
  • Mixing regex and literal text unintentionally; use -F for literals.
  • Permissions can hide matches; run with adequate rights or prune protected paths.
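
Quoting matters most when braces and globs are combined, as in --include. The sketch below uses printf to show what arguments grep would actually receive; it assumes bash or zsh, since brace expansion is not part of POSIX sh.

```shell
# Unquoted braces: the shell expands them into one argument per extension.
# The "*." part stays quoted so the shell does not glob it against the current directory.
expanded=$(printf '%s\n' --include='*.'{ts,js,py})

# Quoted braces: a single literal argument reaches grep. grep's --include
# glob has no brace syntax, so this would not match foo.ts or foo.js.
literal=$(printf '%s\n' --include='*.{ts,js,py}')

printf '%s\n' "$expanded"
printf '%s\n' "$literal"
```

The first form prints three separate --include arguments; the second prints one argument with the braces intact.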

Further Investigation

  • Compare performance with hyperfine benchmarks.
  • Learn ignore rules: .gitignore, .ignore, .rgignore.
  • Explore searching archives: zgrep, ripgrep’s --search-zip.
  • Handle encodings and locales when matches look “missing”.
  • ripgrep user guide

TL;DR

  • Prefer ripgrep; it is fast and respects ignore files. Fall back to grep with excludes.
# Search text in all files (Linux/macOS)
rg -n "needle"

# Find text in files with grep (portable fallback)
grep -RIn --exclude-dir={.git,node_modules,dist,build} -- "needle" .