Shifting Responsibilities Since LLMs: AI and the Future of Jobs
ChatGPT & Benji Asperheim — Thu Aug 28th, 2025

Shifting Responsibilities Since LLMs: How Will AI Affect Jobs?

It’s interesting how responsibilities for developers have shifted since the advent of AI code completion and LLMs. I spend less time doing low-level “grunt work”, and more time doing high-level project management. I spend less time staring at code, working out bugs, and going through functions and algorithms, and more time planning what tech stack to use, as well as reading LLM responses and source documentation.

It’s not that I have fewer responsibilities, or easier tasks, but that my responsibilities have shifted to that of a senior developer, or project manager (even for solo projects), since I’ve offloaded a lot of the tedious coding work to LLMs.

How AI Will Affect Jobs (for Tech/IT)

If you’ve been feeling the same way—you’re not imagining it. The job has shifted from “typing code” to orchestrating systems and verifying outcomes. The developers who thrive treat LLMs like a squad of fallible junior devs: high output, uneven judgment. The leverage is real, but only if you own specification, architecture, and verification. If you delegate those, quality drifts and you become a babysitter for flaky code.

What Actually Changed Since LLMs

  1. From implementation to orchestration. The work moved up-stack: requirements framing → architecture → interface design → review/merge policies. Generation is cheap; deciding what to build and ensuring it’s correct is the scarce skill.

  2. From deterministic pipelines to probabilistic ones. Compilers are deterministic; LLMs aren’t. That pushes more energy into test design, property checks, and guardrails. “It compiles” is now table stakes, not evidence of correctness.

  3. Documentation is now a first-class dependency. Your prompt/RAG context is only as good as the internal docs, ADRs, and examples you give it. Maintaining that corpus is part of the job, not an afterthought.

  4. Glue beats grind. The value is in stitching services, contracts, and data flows—less in hand-writing boilerplate. This feels like senior/project-lead work even on solo projects.
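Point 2 above can be made concrete with a small property check. Here's a minimal sketch, assuming a hypothetical LLM-generated `slugify` helper: rather than a few hand-picked examples, we assert invariants that must hold for any input (a real setup would use a library like fast-check).

```typescript
// Hypothetical LLM-generated helper we want to gate with property checks.
function slugify(input: string): string {
  return input
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Property check: assert invariants for ANY input, not just known examples.
function checkSlugifyProperties(samples: string[]): void {
  for (const s of samples) {
    const slug = slugify(s);
    if (!/^[a-z0-9-]*$/.test(slug)) throw new Error(`illegal chars: ${slug}`);
    if (slug.startsWith("-") || slug.endsWith("-")) throw new Error(`dangling dash: ${slug}`);
    if (slugify(slug) !== slug) throw new Error(`not idempotent: ${slug}`);
  }
}

// Cheap random-string generator standing in for a real property-test framework.
const randomStrings = (n: number): string[] =>
  Array.from({ length: n }, () =>
    Array.from({ length: 1 + Math.floor(Math.random() * 20) }, () =>
      String.fromCharCode(32 + Math.floor(Math.random() * 95))
    ).join("")
  );

checkSlugifyProperties(randomStrings(500));
console.log(slugify("Hello, World!")); // "hello-world"
```

The point is the shift in mindset: "it compiles" proves nothing about a probabilistic pipeline, but invariants checked over hundreds of random inputs catch the plausible-but-wrong output LLMs tend to produce.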

What is an LLM?

LLM stands for Large Language Model.

In practical terms, an LLM is a neural network—typically a Transformer—trained to predict the next token in text. That simple objective, scaled over huge datasets, yields models that can summarize, translate, write code, and answer questions. Modern large language models (LLMs) are post-trained (instruction tuning, RLHF) so they follow directions better, and when you ground them with retrieval (RAG), they can cite sources and reduce guesswork. Think of them as fast, fallible pattern engines—you supply the constraints, data, and checks.
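To illustrate the "predict the next token" objective, here is a toy sketch (emphatically NOT a real LLM): a bigram frequency table built from a tiny corpus, decoded greedily. A Transformer does the same job with learned context-sensitive probabilities over a huge vocabulary, rather than raw counts.

```typescript
// Toy next-token predictor: count which token follows which in a corpus.
function trainBigrams(corpus: string[]): Map<string, Map<string, number>> {
  const table = new Map<string, Map<string, number>>();
  for (let i = 0; i < corpus.length - 1; i++) {
    const cur = corpus[i];
    const follower = corpus[i + 1];
    const row = table.get(cur) ?? new Map<string, number>();
    row.set(follower, (row.get(follower) ?? 0) + 1);
    table.set(cur, row);
  }
  return table;
}

// Greedy decoding: pick the highest-count continuation for a token.
function predictNext(
  table: Map<string, Map<string, number>>,
  token: string
): string | undefined {
  const row = table.get(token);
  if (!row) return undefined;
  return [...row.entries()].sort((a, b) => b[1] - a[1])[0][0];
}

const table = trainBigrams("the model predicts the next token in the text".split(" "));
console.log(predictNext(table, "next")); // "token"
```

Everything that makes real LLMs useful (scale, attention over long context, post-training) is missing here, but the core loop is the same: given what came before, emit the most plausible next token.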

New Skill Stack That Matters

What to Delegate vs. What to Keep

Delegate confidently

  • Boilerplate and scaffolding: controllers, services, component and test skeletons
  • First drafts of docs and examples, and mechanical refactors against a pinned pattern

Keep for yourself

  • Specification, architecture, and interface design
  • Verification: test design, review, and merge policy
  • Final say on dependencies and security-sensitive code

Practical, High-Level Tech Stack Example (Node/Express + Angular/Vue/React)

  1. Write the contract first. OpenAPI schema + TypeScript interfaces + acceptance criteria. Include invariants and failure modes.

  2. Generate the scaffolding, not the system. Use the model to draft controllers/services, Angular standalone components, and test skeletons. Keep side effects thin and injectable.

  3. Pin patterns. Provide a minimal “golden repo” of examples: one service, one component, one test. Reuse that in prompts or RAG so style stays consistent.

  4. Verification gates.

    • Unit + property tests for invariants
    • Contract tests against mock servers
    • Mutation testing to ensure tests bite
    • Lint/type/format must pass; block merges otherwise
  5. Review like a senior. Read diffs for invariant violations, unnecessary complexity, and hidden coupling, not just syntax.

  6. Refactor with intent. After generation, normalize to your architectural patterns (ports/adapters, feature modules, etc.). Don’t leave the “first draft” shape in place.

  7. Record decisions. ADR per notable choice (DB shape, cache policy, auth flow). LLMs reuse this context surprisingly well.
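Steps 1 and 4 above can be sketched together in TypeScript. The resource and its invariants here (`CreateUserRequest`, the error names) are hypothetical, purely for illustration: the contract is written before any code is generated, and a verification gate checks generated handlers against it.

```typescript
// Step 1 sketch: the contract, written before any code is generated.
// `CreateUserRequest` and its invariants are illustrative, not a real schema.
interface CreateUserRequest {
  email: string;       // invariant: must contain "@"
  displayName: string; // invariant: 1..64 characters
}

// Failure modes are part of the contract, not an afterthought.
type ValidationError = "invalid_email" | "invalid_name";

function validateCreateUser(req: CreateUserRequest): ValidationError | null {
  if (!req.email.includes("@")) return "invalid_email";
  if (req.displayName.length < 1 || req.displayName.length > 64) return "invalid_name";
  return null;
}

// Step 4 sketch: a verification gate. In CI this would block the merge;
// generated handlers must pass it before human review even starts.
const gates: Array<[CreateUserRequest, ValidationError | null]> = [
  [{ email: "ada@example.com", displayName: "Ada" }, null],
  [{ email: "not-an-email", displayName: "Ada" }, "invalid_email"],
  [{ email: "ada@example.com", displayName: "" }, "invalid_name"],
];
for (const [input, expected] of gates) {
  if (validateCreateUser(input) !== expected) throw new Error("contract violated");
}
```

The same interfaces can be reused in prompts and in the OpenAPI schema, so the model, the tests, and the reviewer are all reading from one source of truth.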

Guardrails and Things to Consider

Risks to Watch (and how to counter)

Career Implications of AI (how to stay sharp)


Key Performance Indicators (KPIs) for LLM Workflows

  1. Escaped Defects per 1,000 Lines of Code (LOC) and Mean Time to Repair (MTTR):

    • Escaped Defects: The number of bugs that were not caught before the software was released, measured for every 1,000 lines of code.
    • Mean Time to Repair: The average time it takes to fix a bug after it has been reported.
  2. Mutation Score and Property-Covered Surface Area:

    • Mutation Score: A measure of how well your tests can catch changes or “mutations” in the code. A higher score means better testing.
    • Property-Covered Surface Area: The extent to which your tests cover important properties or features of the code.
  3. Test Runtime vs. Flake Rate:

    • Test Runtime: The amount of time it takes to run all your tests.
    • Flake Rate: The frequency of tests that fail intermittently without any changes to the code, indicating instability.
  4. Code Churn per Feature vs. Cycle Time:

    • Code Churn: The amount of code that is added, modified, or deleted for each feature.
    • Cycle Time: The total time taken to develop a feature from start to finish.
  5. Bundle Size and APIs Changed per Release:

    • Bundle Size: The size of the software package that is released.
    • APIs Changed: The number of Application Programming Interfaces (APIs) that have been modified in each release.
  6. Dependency Count and Supply-Chain Alerts Resolved:

    • Dependency Count: The number of external libraries or tools that your software relies on.
    • Supply-Chain Alerts Resolved: The number of security or compatibility issues related to these dependencies that have been addressed.
  7. Performance Budgets (P95 Latency, Memory):

    • Performance Budgets: Targets for how well the software should perform.
    • P95 Latency: The latency threshold that 95% of requests complete at or under.
    • Memory: The amount of memory used by the software during operation.

These KPIs help teams monitor the quality, efficiency, and performance of their software development processes, especially when integrating LLMs.
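Two of these KPIs are easy to compute directly from release data. A minimal sketch, with hypothetical field names (this is not a standard schema): escaped defects per 1,000 LOC from KPI 1, and P95 latency from KPI 7.

```typescript
// Hypothetical release record; field names are illustrative only.
interface ReleaseStats {
  escapedDefects: number; // bugs reported after release
  linesOfCode: number;
  latenciesMs: number[];  // per-request latencies sampled in production
}

// KPI 1: escaped defects per 1,000 lines of code.
function defectsPerKloc(r: ReleaseStats): number {
  return (r.escapedDefects / r.linesOfCode) * 1000;
}

// KPI 7: P95 latency, the value 95% of requests stay at or under.
function p95(latenciesMs: number[]): number {
  const sorted = [...latenciesMs].sort((a, b) => a - b);
  const idx = Math.ceil(0.95 * sorted.length) - 1;
  return sorted[idx];
}

const stats: ReleaseStats = {
  escapedDefects: 12,
  linesOfCode: 48_000,
  latenciesMs: [110, 95, 300, 120, 105, 98, 101, 250, 99, 97],
};
console.log(defectsPerKloc(stats));      // 0.25
console.log(p95(stats.latenciesMs));     // 300
```

Tracking these per release makes the LLM trade-off visible: if generation speeds up cycle time but defects per KLOC or P95 latency creep upward, the verification gates need tightening.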

Prompt Patterns That Work

Treat the LLM as a junior team you manage: you set architecture, spell out contracts, provide examples, and build a safety net that catches when it’s confidently wrong. That’s senior work. If you invest in specs, tests, and documentation, you get the leverage without the entropy. If you don’t, the short-term velocity turns into long-term maintenance drag.
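One concrete pattern, consistent with the "pin patterns" advice earlier: assemble every prompt from the same fixed sections so the contract and a golden example always travel with the task. The section names and rules below are assumptions for illustration, not a prescribed template.

```typescript
// Minimal prompt-assembly pattern: contract + pinned example + task + rules,
// so style and invariants stay consistent across generations.
function buildPrompt(contract: string, goldenExample: string, task: string): string {
  return [
    "## Contract (do not violate)",
    contract,
    "## Golden example (match this style)",
    goldenExample,
    "## Task",
    task,
    "## Rules",
    "- No new dependencies without asking\n- All public functions need tests",
  ].join("\n\n");
}

const prompt = buildPrompt(
  "POST /users returns 201 with {id, email} or 400 with {error}",
  "// see users.controller.ts in the golden repo",
  "Draft an Express controller for POST /users with validation."
);
console.log(prompt.split("## ").length - 1); // 4 sections
```

Because the function is deterministic, the prompt itself can be version-controlled and diffed like any other artifact, which is exactly the "record decisions" discipline applied to prompting.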


Vibe Coding Meaning

“Vibe coding” is great for exploration and scaffolding, terrible as a default delivery mode. Use it like a spike from a junior pair: time-boxed, disposably creative, then rewritten against your architecture and tests. If you let vibe code merge unchallenged, you buy long-term entropy.

Take a look at our other vibe coding article for more details on what vibe coding is and its current 2025 definition.

What is Vibe Coding - Vibe Coding Meme

Here’s the full picture.

What “Vibe Coding” Actually Is

Working by feel with an LLM as pattern-completion: sketch a prompt, accept plausible code, iterate until it runs. It’s fast because it borrows judgment from prior patterns—but those patterns aren’t your domain, your constraints, or your stack conventions.

Where Vibe Coding Helps (use deliberately)

Where Vibe Coding Hurts (do not use)

Vibe Coding Gotchas (things to be aware of)

Here are some of the big gotchas (and fixes):

  1. Hidden coupling & pattern drift
  2. Dependency creep & supply-chain risk
  3. Type safety erosion
  4. Illusory tests
  5. Context debt (prompt-only decisions)
  6. Security foot-guns
  7. Performance cliffs
  8. Non-determinism & flake
  9. Outdated patterns copied confidently

Quick “Anti-Vibe” Checklist (before merging)

Organizational Implications (even for solo work)

Vibe coding is a tool, not a methodology. Use it to explore and scaffold; never to define the system. Anchor everything to contracts, ADRs, and verification. That way you keep the speed and dodge the entropy.


How AI Will Affect Jobs (in general)

AI won’t “replace jobs” wholesale so much as re-price tasks. Exposure is largest where work is cognitive and repeatable; complementarity (humans + AI) is strongest where judgment, trust, and accountability matter. In rich economies, a majority of roles have meaningful AI exposure, with uneven gains and risks across workers and regions. Early field evidence shows real productivity lifts, especially for less-experienced workers, but economy-wide gains follow a J-curve: benefits arrive after firms invest in process redesign, data, training, and guardrails. (IMF, NBER, Science, American Economic Association)

What Changes Concretely (non-tech jobs included)?

Mechanisms

Magnitude (credible estimates & evidence)

Sector map (outside core tech)


Who Gains from AI and Who Loses?

Practical Moves for Workers, Managers, and Policymakers

For workers (any field)

For managers

For policymakers


Conclusion

Across the economy, AI reallocates time from routine cognition to coordination, care, and judgment. The near-term wins come from augmentation, with solid evidence of productivity gains in real workplaces—especially for less-experienced workers. The long-term payoff depends on whether firms and governments invest in the complements (data, training, processes) that turn exposure into higher wages and better services, instead of wider inequality. (NBER, Science, IMF, OECD)