By ChatGPT & Benji Asperheim | 2025-07-21

Running Custom Pre‑Build Scripts in Astro with Docker

Astro is a modern static‑site builder and web‑app framework that ships zero JavaScript by default and only hydrates components on demand. It lets you author content in Markdown, Astro, or any framework (React, Vue, Svelte) and produces ultra‑fast, SEO‑friendly sites. In this guide, you'll learn how to run custom pre‑build logic—written in Python, Go, or any language—inside a Docker container so that Astro can consume the output at build time.

We'll cover:

1. Why you'd want custom pre‑build scripts

2. How to structure your project directory

3. A Dockerfile example with Corepack explained

4. Three ways to inject generated HTML or data into your Astro app

5. Tips for using Python and Go generators

6. A summary comparison table

This is written for beginners, so if you're new to POSIX commands, Docker, or Astro, you'll still be able to follow along.

Check out our other article explaining how to create an Astro/Docker setup!

Why Run Pre‑Build Scripts?

Sometimes you need to:

  • Fetch external data (APIs, CMS, RSS) and convert it to JSON or Markdown
  • Generate content from CSV, database, or templates
  • Compile assets (SVGs, charts, custom HTML snippets) before bundling
  • Automate tasks like sitemap generation, localization files, or RSS feeds

By running your scripts before astro build, you can feed static files or modules directly into Astro's pipeline—no runtime server code required.
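As a rough illustration of the first two bullets, a pre-build script can be as small as the Python sketch below. It turns CSV text into a JSON file that Astro can later import or serve. The function name, the sample rows, and the output path public/data/pages.json are made up for this example; a real script would read a file or call an API instead.

```python
import csv
import io
import json
from pathlib import Path


def csv_to_json(csv_text: str, out_path: Path) -> list[dict]:
    """Parse CSV text and write the rows to out_path as a JSON array."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # Create the target folder if it does not exist yet
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(json.dumps(rows, indent=2), encoding="utf-8")
    return rows


if __name__ == "__main__":
    # Hypothetical inline data, used only to keep the sketch self-contained
    sample = "slug,title\nhello,Hello World\nabout,About Us\n"
    csv_to_json(sample, Path("public/data/pages.json"))
```

Relative paths resolve against the directory the script is started from, so run it from the project root.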

Project Layout Example

/ (project root)
├─ generate/               # Custom scripts, e.g. Python or Go
│  ├─ main.py              # Outputs HTML, JSON, .md, etc.
│  └─ main.go              # (optional) Go version
├─ public/                 # Files copied verbatim into final dist/
│  └─ compiled-snippet.html
├─ src/
│  ├─ components/
│  │  └─ InjectedHtml.astro
│  └─ pages/
│     └─ index.astro
├─ package.json
├─ yarn.lock
├─ .yarnrc.yml
├─ Dockerfile
└─ astro.config.mjs

Dockerfile with Pre‑Build Step

FROM node:24-alpine3.22
ENV NODE_ENV=production
WORKDIR /app

# Enable Corepack to manage Yarn versions consistently
# Corepack comes bundled with recent Node.js and lets you pin Yarn 4
RUN corepack enable && \
    corepack prepare yarn@4.0.0 --activate

# Install Python and Go runtimes for custom scripts
RUN apk add --no-cache python3 py3-pip go

# Cache Node dependencies
COPY package.json yarn.lock .yarnrc.yml ./
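# Copy Yarn's local files (releases, plugins, patches).
# Remove this line if your repo has no .yarn/ directory, or the COPY will fail.
COPY .yarn/ .yarn/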
RUN yarn install --immutable

# Copy the rest of your source code
COPY . .

# === PRE‑BUILD STEP ===
# Run your Python or Go script to generate files
RUN python3 generate/main.py
# Or, if you prefer Go:
# RUN go run generate/main.go

# === ASTRO BUILD ===
RUN yarn build

# Optionally link sitemap if needed
RUN ln -sf /app/dist/sitemap-index.xml /app/dist/sitemap.xml

EXPOSE 8080
CMD ["npx", "serve", "dist", "--listen", "8080"]

Why Corepack?

Corepack ensures that everyone on your team (and in CI/CD) uses the same Yarn version. Without it, you might get mismatched lockfiles or failing installs.

Injecting Generated HTML or Data

Once you've generated content, you have a few options to include it in your Astro pages:

Option 1: Build‑Time Injection via fs.readFileSync

Have your script write raw HTML into public/compiled-snippet.html, then:

---
// src/components/InjectedHtml.astro
import fs from 'node:fs';

// Read the generated snippet once, at build time
const html = fs.readFileSync(new URL('../../public/compiled-snippet.html', import.meta.url), 'utf-8');
---

<Fragment set:html={html} />

Use <InjectedHtml /> anywhere in your pages. This runs at build time and requires no client JavaScript.

⚠️ Make sure the file path matches exactly, and only inject HTML from a source you trust, because it is rendered without escaping.
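For reference, here is one way generate/main.py could produce that file. This is a minimal sketch under assumptions, not the script from any real project: the function names and the list items are hypothetical. It escapes each piece of text with html.escape so that untrusted input cannot inject markup into the snippet.

```python
import html
from pathlib import Path


def render_snippet(items: list[str]) -> str:
    """Build an HTML list, escaping every item before it is embedded."""
    lis = "\n".join(f"  <li>{html.escape(item)}</li>" for item in items)
    return f'<ul class="generated">\n{lis}\n</ul>\n'


def write_snippet(items: list[str], out_path: Path) -> None:
    """Write the rendered snippet, creating the parent folder if needed."""
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(render_snippet(items), encoding="utf-8")


if __name__ == "__main__":
    # Hypothetical content; the second item shows that markup gets escaped
    write_snippet(
        ["First item", "Second <b>item</b>"],
        Path("public/compiled-snippet.html"),
    )
```

Because the Dockerfile runs the script from /app, the relative path lands in /app/public/, which is where the component above reads it.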

Option 2: Client‑Side Fetch (Runtime Injection)

If you need dynamic loading or the HTML is user‑specific:

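---
// src/pages/index.astro
---

<div id="snippet">Loading...</div>

<script>
  // Fetch the generated file from public/ and inject it in the browser.
  // "snippet" is an illustrative element id.
  const target = document.getElementById('snippet');
  fetch('/compiled-snippet.html')
    .then((res) => res.text())
    .then((html) => { if (target) target.innerHTML = html; });
</script>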

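This approach skips build-time rendering and injects the content in the browser with fetch. It's less "Astro‑native", adds client JavaScript, and is not ideal for SEO because the content is missing from the initial HTML.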

Option 3: Importing a Generated Module

Have your script output a JS or MJS file:

// generate/generated-content.mjs
export const html = `<section class="foo">Prebuilt content</section>`;

Then in Astro:

---
// src/components/InjectedModule.astro
import { html } from '../../generate/generated-content.mjs';
---

<Fragment set:html={html} />

This feels component‑like, allows tree‑shaking, and works seamlessly with Astro's build.
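The module itself can be written by the pre-build script. The sketch below is one possible approach, with hypothetical function names: it uses json.dumps to turn the HTML into a double-quoted string literal, which is also valid JavaScript, so quotes, backslashes, and newlines in the fragment cannot break the generated file.

```python
import json
from pathlib import Path


def render_module(html_fragment: str) -> str:
    """Return ES module source that exports the fragment as a string."""
    # json.dumps escapes quotes, backslashes, and newlines for us
    return f"export const html = {json.dumps(html_fragment)};\n"


def write_module(html_fragment: str, out_path: Path) -> None:
    """Write the module file, creating the parent folder if needed."""
    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(render_module(html_fragment), encoding="utf-8")


if __name__ == "__main__":
    write_module(
        '<section class="foo">Prebuilt content</section>',
        Path("generate/generated-content.mjs"),
    )
```

The output uses a double-quoted string rather than the template literal shown earlier; both export the same value.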

Using Python & Go Generators

  • Python: great for Markdown parsing, RSS feed generation, CSV or Excel processing. Use libraries like markdown, pandas, or feedgen.
  • Go: ideal if you want a single compiled binary with no runtime dependencies. Well suited to parsing large datasets or other heavy processing before the build.

For example, you can generate:

  • A posts/*.md collection from a headless CMS API
  • A JSON file of site metrics pulled from Google Analytics
  • A multilingual dictionary JSON for i18n

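Write Markdown or data files to src/content/ and static assets to public/, and Astro will pick them up at build time. Depending on your Astro version, files under src/content/ may also need a matching content collection definition before pages can query them.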
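To make the first idea concrete, here is a hedged Python sketch that writes one Markdown file per post, each with a YAML frontmatter header. The post data is inlined instead of fetched from a CMS, and the frontmatter field names (title, pubDate) and the dictionary keys are assumptions; they would need to match whatever your site expects.

```python
import json
from pathlib import Path


def render_post(title: str, date: str, body: str) -> str:
    """Return Markdown text with a YAML frontmatter header."""
    # A JSON string is also a valid double-quoted YAML scalar, so titles
    # containing colons or quotes stay parseable.
    safe_title = json.dumps(title, ensure_ascii=False)
    return f"---\ntitle: {safe_title}\npubDate: {date}\n---\n\n{body}\n"


def write_posts(posts: list[dict], out_dir: Path) -> list[Path]:
    """Write one <slug>.md file per post and return the paths written."""
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for post in posts:
        path = out_dir / f"{post['slug']}.md"
        text = render_post(post["title"], post["date"], post["body"])
        path.write_text(text, encoding="utf-8")
        written.append(path)
    return written


if __name__ == "__main__":
    # Hypothetical data standing in for a CMS API response
    demo = [{"slug": "hello", "title": "Hello: World",
             "date": "2025-07-21", "body": "First post."}]
    write_posts(demo, Path("src/content/posts"))
```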

Summary of Approaches

Goal                           | Method                        | Pros & Cons
-------------------------------|-------------------------------|------------------------------------------------
Embed static HTML at build     | fs.readFileSync + set:html    | ✅ Simple, build-time only. ⚠️ No dynamic logic
Embed HTML at runtime          | fetch() and manual injection  | ✅ Dynamic. ⚠️ SEO-unfriendly, extra JS
Component‑style inclusion      | Export HTML string, import it | ✅ Clean, tree-shakeable. ⚠️ Requires module
Serve as standalone page/asset | Place in /public/             | ✅ Auto-copied. ⚠️ Not importable

Conclusion

By integrating custom pre‑build scripts into your Dockerized Astro workflow, you unlock powerful automation: from API data fetching and RSS generation to dynamic HTML or JSON compilation. Corepack ensures a consistent package manager in your container, while Python or Go scripts give you flexibility to pre‑render any content before Astro's static build. Follow these patterns to keep your site fast, maintainable, and rich with programmatically generated pages.

Happy building!
