
Application Load Testing with k6

Description

In his devjobs.at TechTalk, Daniel Knittl-Frank of eurofunk Kappacher talks about how the high performance requirements of the software developed at the company are met.


Video Summary

In Application Load Testing with k6, Daniel Knittl-Frank (eurofunk Kappacher GmbH) shows how his team load-tests the mission-critical Eurofunk operation center suite for emergency and command centers using the open-source k6 tool. He covers the test structure (setup/teardown/default), checks, custom metrics (counters, gauges, rates, trends), thresholds, groups/tags, open vs. closed workload models with scenarios and ramps, and protocol support for HTTP, gRPC, and WebSockets (with a cookie workaround). He also demonstrates GitLab CI integration and live analysis via InfluxDB and Grafana, enabling teams to catch performance regressions quickly and ensure low latency and consistent response times when seconds matter.

Application Load Testing with k6: Metrics, Thresholds, Workload Models, and CI – A deep-dive recap of “Application Load Testing with k6” by Daniel Knittl-Frank (eurofunk Kappacher GmbH)

Mission-critical context: Why every second matters

From our DevJobs.at editorial seat, “Application Load Testing with k6” was a succinct, practical walkthrough of performance testing in an environment where latency isn’t a nice-to-have—it’s existential. Daniel Knittl-Frank, a GNU/Linux enthusiast, part-time lecturer, and expert developer at eurofunk Kappacher GmbH, works on software for emergency and command-and-control centers used by police, fire brigades, ambulances, industry, and larger airports.

Eurofunk delivers the complete command center solution: hardware, network, video walls, desktops, and, crucially, the software operators rely on during emergencies. At the center of the stack is the Eurofunk Operation Center Suite (EOGS), a large-scale web application that handles high data volumes with a streamlined UI for incident management. Dispatchers can initiate and track units, view locations and statuses on a live map, and communicate directly in the browser via WebRTC—both for taking calls and for radio communication to units in the field. EOGS also integrates external event sources: alarm systems, fire and smoke detectors, CCTV/video surveillance, and external web services that enrich incident information.

With that scope, EOGS is mission-critical: high availability, low latency, consistent response times, and up-to-date information are non-negotiable. Daniel distilled the stakes in one line:

“In an emergency situation every single second counts.”

That urgency frames why eurofunk relies on k6 for continuous performance testing.

k6 at a glance: Open source, JavaScript scripting, built for load

k6 is an AGPL-licensed open-source load testing tool. It's built with Go, and tests are scripted in JavaScript. According to Daniel, support for newer JavaScript features is limited at the moment (ES 5.1), but there is module support, which makes structuring tests convenient. k6 was acquired by Grafana Labs in the summer of 2021. The key appeal for engineering teams: it's CLI-first, scriptable, and integrates cleanly into automation.

Test anatomy: Setup, default loop, teardown, summary

A typical k6 test follows a clear structure:

  • Setup: Prepare data or perform login.
  • Default function: The core scenario. Virtual users (VUs) repeatedly execute this function in a loop and perform interactions—HTTP or gRPC requests, WebSocket actions, and more.
  • Teardown: Clean up and logout.
  • Summary: A final, customizable test summary aggregating checks, metrics, and thresholds.

k6 runs on the command line. When you start a test, you get an interactive view while it’s running—logs, failed iterations, remaining duration, and more. This immediate feedback helps iterate quickly on test scripts and hypotheses.
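
A minimal skeleton along these lines, assuming a hypothetical login endpoint and placeholder URLs (not from the talk), could look like this:

    import http from 'k6/http';
    import { sleep } from 'k6';

    // Setup: runs once before the virtual users start; prepare data, log in.
    export function setup() {
      const res = http.post('https://example.test/api/login', { user: 'load', password: 'secret' });
      return { token: res.json('token') }; // handed to default() and teardown()
    }

    // Default function: every VU executes this in a loop for the duration of the test.
    export default function (data) {
      http.get('https://example.test/api/incidents', {
        headers: { Authorization: 'Bearer ' + data.token },
      });
      sleep(1); // think time between iterations
    }

    // Teardown: runs once after the load phase; clean up, log out.
    export function teardown(data) {
      http.post('https://example.test/api/logout', null, {
        headers: { Authorization: 'Bearer ' + data.token },
      });
    }

    // Optional: customize the end-of-test summary.
    export function handleSummary(data) {
      return { 'summary.json': JSON.stringify(data, null, 2) };
    }

Started with "k6 run script.js", this produces the interactive terminal view described above.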

Checks: Validating correctness under load

Load testing is not just about throughput. With k6, you can apply arbitrary checks to objects in your test—most commonly HTTP responses. Examples Daniel highlighted:

  • Is the HTTP status 200?
  • Is the Content-Type JSON?
  • Does the response contain an ID matching a specific pattern?

Checks are automatically summarized at the end with clear markers: green for successful checks and red for failures. k6 also reports the percentage and absolute count of failed versus successful checks. This keeps functional integrity in focus alongside performance.
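
In script form, those three checks might read as follows (the endpoint and the ID pattern are illustrative assumptions):

    import http from 'k6/http';
    import { check } from 'k6';

    export default function () {
      const res = http.get('https://example.test/api/incidents/42');

      check(res, {
        'status is 200': (r) => r.status === 200,
        'content type is JSON': (r) => (r.headers['Content-Type'] || '').includes('application/json'),
        'id matches pattern': (r) => /^[0-9a-f-]{36}$/.test(r.json('id')),
      });
    }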

Custom metrics: Counter, Gauge, Rate, Trend

Many performance questions are domain-specific. k6 supports four custom metric types to capture what’s unique in your system:

  • Counter: Total counts, for example “how many frames/commands received.”
  • Gauge: The last known value, suitable for tracking a current state.
  • Rate: Success-to-failure ratios for a particular event that k6 doesn’t capture by default.
  • Trend: Time-series metrics with min/avg/median/percentiles/max—ideal for response time distributions.

In the summary, k6 renders these appropriately:

  • Counter: total plus the per-second rate observed during the test.
  • Trend: mean, min, percentiles, median, max.
  • Gauge: last observed value.

This extends analysis beyond generic HTTP timings to business- and protocol-level signals—vital in integrated systems like EOGS.
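
A compact sketch of all four types, using hypothetical metric names rather than the ones from the talk:

    import { Counter, Gauge, Rate, Trend } from 'k6/metrics';

    const framesReceived  = new Counter('frames_received');       // total plus per-second rate
    const activeIncidents = new Gauge('active_incidents');        // last observed value
    const commandSuccess  = new Rate('command_success');          // ratio of true/false samples
    const commandDuration = new Trend('command_duration', true);  // min/avg/median/percentiles/max, as time values

    export default function () {
      // In a real test these values come from responses and received frames;
      // constants keep the sketch short.
      framesReceived.add(1);
      activeIncidents.add(17);
      commandSuccess.add(true);
      commandDuration.add(42); // milliseconds
    }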

Thresholds: Enforcing performance contracts

Thresholds operate at the global test level and are the mechanism for encoding expectations. Daniel gave examples like:

  • Fail if the average response time exceeds 100 ms.
  • Fail if a rate drops below 5% for something you track.

Threshold outcomes are shown in the summary—with red markers for exceeded limits and green checks when within bounds. Thresholds are the backbone of automated regression detection: they turn performance requirements into pass/fail criteria.
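
Declared in a script's options, the two examples could look roughly like this (the custom rate and the health endpoint are assumptions):

    import http from 'k6/http';
    import { Rate } from 'k6/metrics';

    const commandSuccess = new Rate('command_success');

    export const options = {
      thresholds: {
        // Fail the test if the average response time exceeds 100 ms.
        http_req_duration: ['avg<100'],
        // Fail if the tracked rate drops below 5%.
        command_success: ['rate>0.05'],
      },
    };

    export default function () {
      const res = http.get('https://example.test/api/health');
      commandSuccess.add(res.status === 200);
    }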

Groups and tags: Slice metrics and target subsets

k6 lets you categorize requests, checks, and metrics using groups and tags. This is useful when you need to split metrics and define thresholds for subsets—say, the response times of a specific endpoint. Tags can be applied per check or globally. In practice, this enables focused SLIs/SLOs for critical paths inside broader end-to-end flows.
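
A sketch of how a group and per-request tags combine with a threshold scoped to one endpoint (the endpoint names are assumptions):

    import http from 'k6/http';
    import { group } from 'k6';

    export const options = {
      thresholds: {
        // Applies only to requests tagged name:create-incident.
        'http_req_duration{name:create-incident}': ['p(95)<200'],
      },
    };

    export default function () {
      group('incident workflow', function () {
        http.post('https://example.test/api/incidents', '{}', {
          headers: { 'Content-Type': 'application/json' },
          tags: { name: 'create-incident' },
        });
        http.get('https://example.test/api/incidents', {
          tags: { name: 'list-incidents' },
        });
      });
    }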

Reading the summary: Built-ins plus domain metrics

Daniel walked through a test summary containing metrics and thresholds. Much of it is captured automatically by k6:

  • Number of checks executed.
  • Data received from and sent to the application.
  • Average duration per group.
  • Response time distribution (count, average, median, etc.).

In addition, custom metrics appeared—such as “SOC frames” and “STUMP commands received.” The summary shows raw counts and the derived per-second rates. This blend of generic and domain-specific signals makes the output actionable for both engineers and operations.

Workload models: Open vs. closed, scenarios, and ramps

Daniel differentiated two workload models that k6 supports:

  • Open workload: You control the arrival rate of new users. Example: “100 requests per second,” and k6 spins up as many VUs as needed to maintain that rate.
  • Closed workload: You control the number of concurrent users. Example: “16 users execute the default function in a loop.”

You define these in k6 using scenarios. Crucially, load isn’t just a flat line—ramping is essential to observe system behavior during change:

  • Start at zero users, ramp to 50 over one minute, hold for two minutes, then ramp to 100 over one minute.
  • Mirror the same idea for arrival rates: start at zero RPS, reach 50 RPS at the one-minute mark, hold, then ramp further—or ramp down.

In visualizations Daniel referenced:

  • For VU ramping, the number of users matches the defined ramp profile.
  • For arrival-rate ramping, k6 injects VUs to hit the target RPS.
  • In the request duration view, a “green background area” corresponds to the number of requests and reflects the defined rate.

This distinction—concurrency vs. arrival rate—shapes how bottlenecks emerge and how capacity should be interpreted.
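
Expressed as k6 scenarios, the two models with the ramp profiles described above could be sketched like this (the executor names are k6's; durations and targets mirror the examples, everything else is illustrative):

    import http from 'k6/http';

    export const options = {
      scenarios: {
        // Closed model: control the number of concurrent users.
        closed_model: {
          executor: 'ramping-vus',
          startVUs: 0,
          stages: [
            { duration: '1m', target: 50 },  // ramp from 0 to 50 users over one minute
            { duration: '2m', target: 50 },  // hold for two minutes
            { duration: '1m', target: 100 }, // ramp to 100 users
          ],
        },
        // Open model: control the arrival rate; k6 injects VUs to keep it up.
        open_model: {
          executor: 'ramping-arrival-rate',
          startRate: 0,
          timeUnit: '1s',        // targets are iterations per second
          preAllocatedVUs: 50,
          maxVUs: 200,
          startTime: '4m',       // start after the closed scenario has finished
          stages: [
            { duration: '1m', target: 50 },  // reach 50 iterations per second
            { duration: '2m', target: 50 },  // hold
            { duration: '1m', target: 100 }, // ramp further
          ],
        },
      },
    };

    export default function () {
      http.get('https://example.test/api/health'); // the interaction under load
    }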

Beyond HTTP: WebSockets and gRPC

EOGS uses WebSockets, and k6 supports testing beyond HTTP. You can connect to secure and plain WebSocket endpoints, send and receive messages, and measure durations—time to send, time to receive. Daniel also noted that you can do gRPC requests with k6. One caveat from their implementation: cookie support for WebSockets was “kind of broken” at the time, with ongoing work and a workaround available. For teams relying on cookies in WebSocket flows, this heads-up is important when designing tests.
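
A WebSocket sketch in that spirit, with the session cookie passed manually as a header as a workaround (URL, cookie name, and message format are assumptions):

    import ws from 'k6/ws';
    import { check } from 'k6';

    export default function () {
      const params = {
        // Workaround: set the cookie explicitly instead of relying on the cookie jar.
        headers: { Cookie: 'SESSION=abc123' },
      };

      const res = ws.connect('wss://example.test/socket', params, function (socket) {
        socket.on('open', function () {
          socket.send(JSON.stringify({ type: 'subscribe', topic: 'incidents' }));
        });
        socket.on('message', function (msg) {
          // inspect or count incoming frames here, e.g. via a custom Counter
        });
        socket.setTimeout(function () {
          socket.close(); // end the session after five seconds
        }, 5000);
      });

      check(res, { 'handshake status is 101': (r) => r && r.status === 101 });
    }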

Continuous testing: GitLab CI, Docker, and hourly runs

Mission-critical software deserves frequent validation. Eurofunk runs k6 in a GitLab CI pipeline using a Docker image to execute predefined tests, for example on an hourly schedule. Thresholds gate the pipeline: regressions cause the job to fail, and you get email notifications upon failure. The loop is tight—detect issues early, act before they affect operations.
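
A minimal GitLab CI job in that spirit, sketched with the public grafana/k6 image and a schedule-only rule (file paths, the InfluxDB address, and the schedule itself are assumptions; Daniel did not show the exact pipeline):

    load-test:
      image:
        name: grafana/k6:latest
        entrypoint: [""]   # the image's default entrypoint is `k6`
      rules:
        - if: '$CI_PIPELINE_SOURCE == "schedule"'   # triggered by an hourly pipeline schedule
      script:
        - k6 run --out influxdb=http://influxdb:8086/k6 tests/load-test.js
      # Thresholds in the script make `k6 run` exit non-zero on regressions,
      # failing the job and triggering GitLab's email notifications.

The --out flag used here is the same mechanism behind the live analysis described in the next section.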

Observability: InfluxDB output and live Grafana dashboards

k6 can emit metrics to external outputs. Daniel highlighted InfluxDB as a time-series database to which the test writes metrics while it’s running. With Grafana on top, you create panels to visualize results live:

  • How many users are active?
  • What are the response times?
  • Are there unexpected errors?
  • Is there a slowdown over the duration of the test?
  • Is there a slowdown correlated with user count?

This live lens is especially helpful for ramp scenarios and for diagnosing issues under production-like workload patterns.

A practical blueprint for engineering teams

Based on Daniel’s talk, here’s a practical sequence for applying k6 in your environment using the specific building blocks he covered:

  1. Clarify goals and critical paths
  • Define latency targets and what “consistency” means for your system (average, median, percentiles, max).
  • Identify critical user journeys and protocols (HTTP, WebSockets, gRPC if applicable).
  • Decide on correctness checks: status codes, content types, and field patterns that must hold.
  2. Structure tests intentionally
  • Use setup for authentication and data prep; teardown for cleanup and logout.
  • Model the default function as a realistic loop VUs will execute.
  3. Implement checks for correctness
  • Validate HTTP 200, JSON content type, and expected IDs/fields.
  • Monitor success/failure ratios in the final summary.
  4. Add custom metrics where it matters
  • Counters/Rates for domain events not captured by default.
  • Trends for time-based distributions—especially response times per endpoint.
  • Gauges for last-known states when you need that signal.
  5. Encode expectations with thresholds
  • Global thresholds that fail the test when violated.
  • Combine with groups/tags to apply thresholds to subsets (e.g., a specific endpoint).
  6. Choose the right workload model
  • Closed model to reason about concurrency and resource contention.
  • Open model to drive specific arrival rates and emulate external traffic patterns.
  • Use ramp-up/down to validate stability during transitions.
  7. Cover the right protocols
  • Test WebSockets: connect, send/receive, and measure durations.
  • Be mindful of cookie handling with WebSockets and use the available workaround.
  • Leverage gRPC testing if your system exposes it.
  8. Make it continuous in CI
  • Run dockerized k6 jobs via GitLab CI on a frequent schedule (hourly/daily).
  • Let thresholds fail the pipeline on regressions; rely on email notifications for fast feedback.
  9. Instrument for live visibility
  • Route k6 output to InfluxDB.
  • Build Grafana panels for user counts, response times, errors, and trends.
  • Watch the correlation between load and latency in real time.
  10. Iterate based on evidence
  • Turn anomalies from the summary and Grafana into hypotheses.
  • Refine scripts, checks, metrics, and thresholds incrementally.

Key lessons from the session

  • Performance is part of functionality in command-and-control settings. The targets are strict: high availability, low latency, and consistent response times.
  • k6 provides a disciplined structure with setup/default/teardown and an actionable summary. Checks and custom metrics move load testing beyond raw throughput to include business correctness.
  • Thresholds turn performance expectations into automated gates. They are foundational for catching regressions early in CI.
  • Understanding open vs. closed workload models is critical; they reveal different bottlenecks and shape how capacity is interpreted.
  • WebSockets are first-class citizens for testing. One practical caveat: cookie support needed a workaround—plan for that if your flows depend on cookies.
  • With InfluxDB and Grafana, you gain a live window into running tests—ideal for ramp phases and diagnosing slowdowns.

Conclusion: Structure, metrics, and automation—k6 in practice

In “Application Load Testing with k6,” Daniel Knittl-Frank lays out how to operationalize performance testing for a mission-critical system: a clean test structure, rigorous checks, domain-aware metrics, group/tag scoping, thresholds as contracts, and realistic load profiles via open/closed models and ramps. Add GitLab CI with Docker for frequent execution and InfluxDB/Grafana for live analysis, and you have a tight feedback loop.

The outcome Daniel described: the application keeps running smoothly, with low latency and consistent response times.

For engineering teams, this talk offers a concrete blueprint: reproducible scenarios, domain-specific metrics, explicit thresholds, continuous runs, and meaningful visualization. It’s exactly the toolkit you need when, as Daniel reminded us, every single second counts.
