Leverage LLMs in Development

Description

In his devjobs.at TechTalk, Fabian Riedlsperger of Kickscale examines the impact current LLMs have on day-to-day software development work, what to watch out for, and the benefits LLMs can bring.

Video Summary

In "Leverage LLMs in Development," Fabian Riedlsperger (Kickscale) outlines how to apply LLMs across the software lifecycle, covering model basics (probabilistic, autoregressive), key limitations like hallucinations and limited context, and practical guardrails such as using models as a sparring partner, providing rich context, precise instructions, iterative interaction, and rigorous fact-checking. He shares hands-on use cases—including IDE code completion/generation (also locally), refactoring to coding standards, debugging/testing and TDD support, brainstorming, meeting-recording analysis, and custom GPTs that encode internal processes for onboarding and troubleshooting. He cautions against mental laziness, over-reliance, and privacy/copyright pitfalls, concluding that LLMs boost efficiency and reduce boilerplate but need human oversight; viewers can immediately apply his prompting guidelines and custom-GPT patterns in their teams.

Pragmatic LLMs for Engineers: Prompting, Refactoring, Debugging, and Custom GPTs from “Leverage LLMs in Development” by Fabian Riedlsperger (Kickscale)

What the session delivered — our engineering recap

In “Leverage LLMs in Development,” Fabian Riedlsperger (Kickscale) laid out a clear, hands-on guide to using large language models in the software development lifecycle. From our DevJobs.at vantage point, this wasn’t a hype talk. It was a practical field report: where LLMs help right now, where they routinely fail, and which simple habits make the difference between time saved and time wasted.

The throughline: treat LLMs as sparring partners. The model accelerates, suggests perspectives, and drafts scaffolding; humans own the problem-solving, constraints, and quality bar. With that mindset—plus precise context and iterative prompts—teams can refactor cleaner, debug faster, brainstorm better, and turn runbooks into living assistance through custom GPTs.

Technical background: how LLMs actually behave

Fabian’s background section set the right expectations:

  • Probabilistic, not rule-based: LLMs select the most probable next token. There is no built-in, guaranteed truth mechanism—only likelihoods over tokens.
  • Autoregressive generation: every newly generated token becomes part of the input for the next one.
  • Trained on vast natural language data: often from the public web, hence strong fluency in natural and code-like text.
  • Length increases risk: “the longer the response, the bigger the solution space,” and the less likely the exact target output will be produced. Tight, specific outputs tend to be more reliable.
  • No implicit project context: the model doesn’t know your team, domain, or codebase unless you provide that information.

This framing explains why prompting and iteration matter: you’re steering a probability engine, not invoking a rules oracle.
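
To make the probabilistic, autoregressive behavior concrete, here is a minimal sketch in TypeScript; `nextTokenDistribution` is a hypothetical stand-in for a real model, not an actual API:

```typescript
// Minimal sketch of autoregressive decoding. `nextTokenDistribution` is a
// hypothetical stand-in for a real model that scores the next token.
type Token = string;

function sample(dist: Map<Token, number>): Token {
  // Probabilistic, not rule-based: draw a token from the distribution
  let r = Math.random();
  let last: Token = "";
  for (const [token, p] of dist) {
    last = token;
    r -= p;
    if (r <= 0) return token;
  }
  return last; // guard against floating-point rounding
}

function generate(
  prompt: Token[],
  nextTokenDistribution: (context: Token[]) => Map<Token, number>,
  maxTokens: number
): Token[] {
  const output = [...prompt];
  for (let i = 0; i < maxTokens; i++) {
    const token = sample(nextTokenDistribution(output));
    output.push(token); // autoregressive: each new token feeds the next step
    if (token === "<eos>") break;
  }
  return output;
}
```

Every extra token is another probabilistic draw, which is exactly why longer outputs widen the solution space and raise the risk of drifting from the intended result.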

Limitations and failure modes to respect

The talk was explicit about pitfalls—crucial for teams aiming at repeatable outcomes:

  • Hallucinations: outputs may look plausible but be wrong due to missing context or knowledge. In code, that can mean subtle, costly bugs.
  • No guaranteed factual correctness: neither statements nor code are inherently “true” or “verified.”
  • Chaotic sensitivity: small input changes can produce very different outputs.
  • Programmatic integration is hard: embedding LLM outputs into reliable software systems remains challenging.
  • Trial-and-error in use case discovery: expect iterative prompt and format tuning when you explore new tasks.

The implication is clear: never skip human review and don’t expect “one-shot” magic.

Working principles: how to think with the model

Fabian shared lightweight rules of thumb that—when consistently applied—pay off across tasks:

  • Use LLMs as sparring partners: ideas still come from you; the model helps validate them and widen perspectives.
  • Before prompting, ask: “What information would I need to solve this?” Provide that context upfront.
  • Request suggestions first: don’t jump straight to “solve it.” Start with options and relevant aspects you might have missed.
  • Converse, don’t one-shot: guide the output, refine, and iterate toward the target.
  • Split large tasks: decompose into small, well-bounded subtasks for more stable outputs.
  • Keep problem-solving human: humans make the architectural calls and trade-offs.
  • Always fact-check: especially code. A quick paste can plant hard-to-detect issues.
  • Treat outputs as starting points: drafts that you adapt, not final artifacts.

These habits turn LLMs from novelty into real engineering leverage.

Prompt engineering that actually works

Without posturing, Fabian focused on the essentials:

  • Provide enough context: everything you yourself would need to solve the task should be in the prompt.
  • Set a persona: “act as an experienced TypeScript/React developer…” can significantly lift quality.
  • Word choice matters: be concrete and concise. Specify the steps and the expected output format.
  • Advanced techniques exist: in-context learning, chain-of-thought prompting, retrieval-augmented generation; if you’re curious, plenty of material is available. In the scenarios Fabian shared, strong fundamentals were usually sufficient.
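
As a minimal illustration of these fundamentals, a context-plus-persona prompt might look like the sketch below; the project details are placeholders, not from the talk:

```typescript
// Illustrative prompt: persona, explicit context, concrete steps, and a
// specified output format. All project details are assumed placeholders.
const prompt = `
Act as an experienced TypeScript/React developer.

Context:
- React 18 app with functional components and hooks.
- Standards: named exports, props typed via interfaces.

Task:
1. Suggest a component structure for a confirmation dialog.
2. Then implement it.

Output format: one .tsx file; explanations only as code comments.
`;
```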

Use cases that deliver at Kickscale

Beyond principles, the session stood out for concrete, repeatable workflows—plus time savings and learning effects.

1) Coding: completion, generation, local vs. cloud

  • IDE completion: GitHub Copilot has been so effective for Fabian over two years that he “couldn’t imagine coding without it.” Autocomplete dramatically speeds up routine work.
  • Code generation: sketch new React components or functions and adapt them to your codebase.
  • Alternatives: services like Tabnine exist; choose what fits your environment.
  • Local models on Mac: privacy-sensitive teams can run open-source models locally. On a MacBook Pro (M1 or newer, with sufficient RAM), models like StarCoder or Code Llama can run without sending data to the cloud.

A simple but telling example: Fabian asked for a dialog with a title, text, two buttons (“Close” and “Analyze”), and dummy functions. The generated component was immediately usable and easy to tweak—saving roughly ten minutes compared to writing it from scratch (1–2 minutes of prompting versus 10–15 minutes of coding). That pattern—generate the scaffold, then refine—is where LLMs shine for UI and boilerplate tasks.
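
For illustration, the generated component could have looked roughly like the sketch below; the talk showed the demo rather than the exact code, so names and structure here are assumptions (plain React, no UI library):

```tsx
// A hedged reconstruction of the demo's output: a dialog with a title, text,
// and two buttons wired to dummy functions. Plain React is an assumption.
import React from "react";

interface AnalysisDialogProps {
  title: string;
  text: string;
}

export function AnalysisDialog({ title, text }: AnalysisDialogProps) {
  // Dummy functions, as requested in the prompt
  const handleClose = () => console.log("close clicked");
  const handleAnalyze = () => console.log("analyze clicked");

  return (
    <div role="dialog" aria-label={title}>
      <h2>{title}</h2>
      <p>{text}</p>
      <button onClick={handleClose}>Close</button>
      <button onClick={handleAnalyze}>Analyze</button>
    </div>
  );
}
```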

2) Refactoring: suggestions, standards, readability

Fabian’s refactoring workflow is pragmatic and repeatable:

  1. Ask for improvement suggestions first: what to change and why, mapped to principles (e.g., SOLID, single responsibility).
  2. Provide your standards: naming conventions, coding rules, readability/maintainability requirements.
  3. Get action items: concrete steps to achieve the stated goals.
  4. Optionally have the model implement them: then review and refine.

This helps tame oversized React components and unstructured modules. Beyond time savings (Fabian cited up to ~30 minutes), engineers see alternative designs and arguments they can fold back into team conventions.
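
A suggestions-first prompt following this workflow might look like the sketch below; the standards listed are illustrative examples, not Kickscale's actual conventions:

```typescript
// Illustrative suggestions-first refactoring prompt. The standards are
// placeholder examples, not Kickscale's actual conventions.
const refactorPrompt = `
Act as a senior React reviewer. Do not rewrite the code yet.

Step 1: List improvement suggestions for the component below, each tied to a
principle (e.g., SOLID, single responsibility, readability).

Our standards (examples):
- Components stay under 150 lines; extract hooks for data logic.
- Descriptive names; no abbreviations in public APIs.

Step 2: Turn the suggestions into concrete action items.
Only implement them when I explicitly ask.
`;
```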

3) Debugging and testing: error sources, edge cases, TDD

  • Analyze code for possible error sources not yet covered.
  • Interpret obscure error messages: especially useful in unfamiliar environments.
  • Ask for alternative solutions: simplify complex logic by exploring different approaches.
  • Generate unit tests: system/integration tests are more complex (require more context), but unit tests are a solid fit.
  • Create test data: dummy objects and edge cases for broader coverage.
  • TDD kickstart: begin with a natural-language description, generate tests first, then implement. TDD becomes faster to bootstrap.
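
To make the TDD kickstart concrete, a generated starting point might look like the sketch below; the function, its behavior, and the test framework (Jest) are hypothetical examples, not from the talk:

```typescript
// Hypothetical TDD starting point: tests generated from a natural-language
// description ("parse durations like '1h30m' into minutes"), written before
// the implementation exists. Function name and framework are assumptions.
import { parseDuration } from "./parseDuration";

describe("parseDuration", () => {
  it("parses hours and minutes", () => {
    expect(parseDuration("1h30m")).toBe(90);
  });

  it("handles minutes only", () => {
    expect(parseDuration("45m")).toBe(45);
  });

  it("rejects malformed input", () => {
    expect(() => parseDuration("abc")).toThrow();
  });
});
```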

A concrete case: Kickscale maintains a script to encrypt/decrypt secrets. After switching from Linux to macOS, a shell command broke. By stating “help adapt a Linux shell command to a Mac command” and including the error, Fabian landed a fix in one to two minutes. Without the model, the search alone could have taken around 15 minutes. Not a hard problem—but a frequent one, and these add up.

4) Brainstorming: features, UX, DevOps, standards, tech stack

This is Fabian’s favorite use case. LLMs are strong partners for structured thinking and decision-making:

  • New features and user experience (UI is trickier, given limited visual understanding)
  • DevOps topics and coding standards (“what should a professional coding standard cover?”)
  • Task splitting: break a large epic into smaller, well-scoped tasks to turn into tickets
  • Holistic decisions: enumerate options with pros and cons
  • Tech stack exploration: survey candidates and structure criteria
  • Meta-prompting: the model can improve your prompts—an iterative loop that compounds

A relevant example: when expanding coding standards, Fabian asked the model to list aspects to consider. About 20% wasn't relevant, but the list surfaced crucial items he hadn't thought of. That's a quality gain more than a time saving, and still highly valuable.
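
A brainstorming prompt in this spirit might look like the following sketch; the wording is illustrative, not quoted from the talk:

```typescript
// Illustrative brainstorming prompt: enumerate the option space first, then
// structure pros and cons. Wording is an assumption.
const brainstormPrompt = `
We are expanding the coding standard for a TypeScript/React codebase.

Step 1: List the aspects a professional coding standard should cover.
Step 2: For each aspect, give one pro and one con of enforcing it strictly.
Step 3: Flag anything teams commonly forget.
`;
```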

5) Recording analysis: summaries, next steps, accessible knowledge

Close to Kickscale’s product domain, Fabian outlined a clean workflow:

  1. Record conversations.
  2. Transcribe them.
  3. Use the transcription to extract summaries, organize information, and identify next steps/action items.

For a sprint planning sync—many topics, lots of implicit detail—this prevents knowledge loss. If decisions and next steps don’t make it into a ticketing system immediately, the team can revisit the summary and action items later. More people access the same knowledge, faster.
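
Step 3 of this workflow can be sketched with the official `openai` Node SDK as below; the model choice and prompt wording are illustrative assumptions, and the talk did not prescribe a specific API:

```typescript
// Sketch of step 3: extracting a summary and action items from a transcript.
// Uses the official `openai` Node SDK; model and prompt wording are
// illustrative assumptions.
import OpenAI from "openai";

const openai = new OpenAI(); // reads OPENAI_API_KEY from the environment

async function summarizeMeeting(transcript: string): Promise<string | null> {
  const completion = await openai.chat.completions.create({
    model: "gpt-4o",
    messages: [
      {
        role: "system",
        content:
          "Summarize this meeting transcript, then list decisions and " +
          "next steps as action items.",
      },
      { role: "user", content: transcript },
    ],
  });
  return completion.choices[0].message.content;
}
```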

6) Custom GPTs: embed standards and processes

Custom GPTs are specialized model instances that come preloaded with your knowledge. The win: you don’t restate the same standards and context in each prompt.

Examples from the talk:

  • Refactoring GPT: coding standards are built in. Paste code and get suggestions or refactors in your style.
  • Feature requirements engineering: assist in shaping requirements.
  • Dev tooling GPT: document manual processes as a conversational helper. New hires can ask, “This error occurred—how do I fix it?” and receive step-by-step instructions.
  • System architecture Q&A: load your tech stack and ask targeted questions.
  • Beyond engineering: can be applied to domains like sales and marketing with the right knowledge base.

A concrete Kickscale example: Fabian created a dev tooling GPT documenting common fixes—e.g., “too many participants recognized in a transcript.” New teammates can ask the GPT and get a precise procedure: find the meeting ID, review participation counts, use the internal script to correct it. That turns long documentation pages into practical, guided workflows.
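
Custom GPTs are configured through the ChatGPT UI rather than code, but the underlying idea can be sketched as persistent instructions that encode the runbook once; the wording below is illustrative:

```typescript
// Sketch of the dev tooling GPT idea: a runbook encoded once as persistent
// instructions instead of restated in every prompt. Wording is illustrative.
const devToolingInstructions = `
You are Kickscale's dev tooling assistant. When a developer reports a known
issue, reply with the exact step-by-step procedure.

Known issue: too many participants recognized in a transcript.
Procedure:
1. Find the meeting ID of the affected recording.
2. Review the participation counts for that meeting.
3. Run the internal correction script against the meeting ID.
`;
```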

Risks and considerations with heavy LLM use

Fabian’s cautionary list mirrors what we see across teams:

  • Mental laziness: jumping to the model before thinking yourself. Better: form a hypothesis, then challenge it with the model.
  • Over-reliance: for tiny tasks, prompting can take longer than simply doing them yourself.
  • Not a magic wand: an LLM cannot invent missing knowledge or solve impossible problems.
  • Language bias: coverage is strong for JavaScript/Python; niche languages (e.g., Fortran) can be harder.
  • Limited context: always supply what’s needed to solve the task.
  • Bad, vague prompts waste time: be specific or you’ll loop.
  • Never copy-paste blindly: review everything, especially code.
  • Security, privacy, copyright: protect sensitive data and consider legal aspects of generated material.

Our takeaway from “Leverage LLMs in Development”: grounded leverage, real gains

Fabian’s conclusions feel well-supported by the examples:

  • LLMs increase developer efficiency when used as sparring partners.
  • Any software-building company should leverage them to some extent or risk falling behind.
  • Small teams benefit disproportionately: leaner teams can build more with fewer people.
  • Less boilerplate, more value: routine coding shrinks; problem-solving and creativity expand.
  • Transformative pace: the last two years were a leap; the next years will bring another.
  • Programming will be more accessible: natural-language programming may grow as models evolve.
  • Agents on the horizon: “describe the feature and it gets done” is getting closer.
  • Humans remain essential: roles will change, but guidance, judgment, and accountability won’t disappear.

Practical takeaways for engineering teams

From the talk, five practices stand out that require no new tools—just better habits:

  1. Precision in, precision out: pack the relevant facts into the prompt and specify the desired output.
  2. Iterate in small steps: review partial results and steer.
  3. Standards up front: include coding standards, naming, and principles (SOLID, separation of concerns) in prompts—or encode them in custom GPTs.
  4. Keep humans in the loop: review refactors, tests, and debug suggestions before merging.
  5. Treat knowledge as a product: make meeting outcomes and runbooks queryable via transcripts and custom GPTs—especially for onboarding and support.

Demos and examples we found instructive

  • The “dialog with two buttons” prompt captured the essence of LLM-enabled scaffolding: clear specs yield ready-to-adapt UI code and save minutes that add up.
  • The refactor-first-ask-for-suggestions pattern prevents black-box changes and teaches the team better structure.
  • Error translation from Linux to macOS highlighted how LLMs trim research time for common, small snags.
  • The dev tooling GPT spotlighted a form of documentation that gets used: guided, conversational, and actionable for junior teammates.

Closing thoughts

“Leverage LLMs in Development” by Fabian Riedlsperger (Kickscale) delivered a grounded playbook for making LLMs work in real engineering environments. Keep the model’s probabilistic nature in mind, supply context, iterate, and maintain human review. Do that, and you’ll code faster, refactor cleaner, debug smarter, and preserve team knowledge in ways that actually help people do their job. From our DevJobs.at editorial perspective, this is the kind of practical, durable guidance teams can implement today—and keep benefiting from as the tooling evolves.
