Tractive GmbH
Logging in a Polyglot IoT Environment
Description
In his devjobs.at TechTalk "Logging in a Polyglot IoT Environment", Dominik Hurnaus of Tractive shows how Tractive solved the problems that arise with GPS devices, server software, mobile apps, and more.
Video Summary
In "Logging in a Polyglot IoT Environment," Dominik Hurnaus explains how Tractive centralizes logs from a heterogeneous IoT stack—mobile apps (Swift/Kotlin/React Native/JS), backends (Kotlin/Ruby/JRuby/Java), firmware (C/C++), and web services—using the Elastic Stack: Docker JSON logs via Filebeat into Logstash, on to Elasticsearch, and queried in Kibana. He outlines actionable practices: rich metadata and IDs, environment-appropriate log levels, no sensitive data, exhaustive third-party call logging, daily log reviews and post-deploy checks, alerting on log-derived thresholds, and lifecycle-based retention. Viewers can apply this approach to turn many gigabytes of logs into reliable operational signals for 24/7 systems.
Logging in a Polyglot IoT Environment: How Tractive GmbH Unifies Container Logs with JSON and the Elastic Stack
Why this session matters
At DevJobs.at we tuned into “Logging in a Polyglot IoT Environment” by Dominik Hurnaus (CTO) from Tractive GmbH. The talk offers a grounded blueprint for turning a sprawling IoT landscape—devices in the field, mobile apps, and a multi‑service backend—into a single, searchable source of truth for logs. It’s a candid look at how logging becomes an operational cornerstone for stability, incident response, and everyday engineering.
Tractive GmbH builds GPS trackers for pets. The environment is classic IoT: devices outside in the wild stream data, mobile apps provide real‑time visibility, and a diverse backend orchestrates ingestion, notifications, and geolocation logic. Alongside secure communication, updatability, zero‑downtime deployments, and database scaling, logging takes center stage. Without well‑structured logs, teams lose the leverage they need to detect, reproduce, and resolve issues fast.
The problem space: Many sources, many formats, 24/7 expectations
Polyglot isn’t a buzzword here—it’s the daily reality:
- Mobile apps: Swift and Kotlin, with embedded React Native and JavaScript parts.
- Backend services: Kotlin, Ruby, JRuby, and Java.
- Firmware: C and C++.
- Web apps and a webshop built with various web frameworks.
All of these produce logs. Add in the GPS devices in the field and the mobile apps pushing events, and the challenge is obvious. Dominik also highlights operational constraints:
- Secure communication between apps, servers, and devices.
- Updatability for firmware (over Bluetooth, Wi‑Fi, GSM/LTE), apps, and servers.
- High availability and zero‑downtime deployments.
- Scaling, including the core database (MongoDB is mentioned).
The punchline: “many devices in the field, many requests, big server farms... create lots of logs out of various software systems.” Centralizing and making sense of this is the job to be done.
Architecture backdrop: A Docker Swarm cluster with many moving parts
Tractive GmbH runs its ecosystem on a large Docker Swarm cluster spread across several servers. Within that cluster, multiple applications operate side by side:
- REST APIs.
- Endpoints to ingest data from GPS trackers (device or IoT endpoints).
- Supplementary services such as a notification service for push notifications and geolocation‑related applications.
- Infrastructure components like load balancers, caches, and MongoDB as the core database.
Every one of these systems generates logs. And there are external sources as well: GPS devices and mobile apps. A workable logging solution must capture container logs consistently while also making these external events navigable, correlated, and searchable.
The chosen approach: The Elastic Stack (formerly ELK)
Dominik’s team adopted the Elastic Stack—Beats, Logstash, Elasticsearch, and Kibana—as the backbone of their logging pipeline.
Beats (Filebeat) as the collector
- Beats is a family of lightweight shippers, each built for a different type of data.
- Tractive GmbH uses Filebeat to read logs from files created by the Docker environment.
- Filebeat collects those files and forwards entries downstream.
Logstash as the aggregator and enricher
- Logstash aggregates inbound logs, filters where appropriate, and “adds a few fields” depending on the service’s needs.
- This enrichment is crucial: it bakes context (IDs, environment, service) into each event before storage.
Elasticsearch as the log database
- Elasticsearch stores logs and powers fast search.
- Free‑form text search and structured field queries both become first‑class citizens.
Kibana as the query and visualization layer
- Kibana enables Google‑like queries over logs—type a term, get results.
- It also supports saved searches and dashboards for daily operations.
The pivotal decision: One canonical format (JSON)
Dominik calls this “the key to solving the puzzle”: a single, central logging format—JSON. In a polyglot stack, format drift is the enemy of reliable parsing. The implementation leans on existing Docker and Beats capabilities:
- Docker provides a JSON logger that writes container logs to JSON files.
- Filebeat reads those JSON files produced by Docker.
- When any service starts, “Filebeat automatically picks up the logs” and forwards them.
The payoff: there’s no more SSHing into containers or tailing scattered files. All logs land in one cluster, ready for enrichment, indexing, and search.
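To make the format concrete, here is a minimal sketch of decoding one line as written by Docker's `json-file` logging driver, which wraps each line of container output in an object with `log`, `stream`, and `time` keys. The sample event, its field names (such as `tracker_id`), and the helper `decode_docker_line` are assumptions for illustration, not Tractive's actual schema. In the pipeline described in the talk, this decoding is done by the collection and aggregation components rather than hand-written code.

```python
import json


def decode_docker_line(raw_line: str) -> dict:
    """Decode one line from a Docker json-file log into a flat event."""
    outer = json.loads(raw_line)
    event = {"stream": outer.get("stream"), "docker_time": outer.get("time")}
    payload = outer.get("log", "").rstrip("\n")
    try:
        inner = json.loads(payload)
    except json.JSONDecodeError:
        inner = None
    if isinstance(inner, dict):
        # The service logged JSON: lift its fields to the top level.
        event.update(inner)
    else:
        # The service logged plain text: keep it as the message.
        event["message"] = payload
    return event


# Hypothetical sample: a service that logs JSON to stdout.
sample = json.dumps({
    "log": json.dumps({"level": "INFO", "message": "tracker connected",
                       "tracker_id": "T-123"}) + "\n",
    "stream": "stdout",
    "time": "2024-01-01T08:00:00.000000000Z",
})
decoded = decode_docker_line(sample)

# Hypothetical sample: a service that logs plain text to stderr.
plain = decode_docker_line(json.dumps({
    "log": "plain text line\n",
    "stream": "stderr",
    "time": "2024-01-01T08:00:01.000000000Z",
}))
```

Services that already emit JSON keep their structure end to end, while plain-text output still arrives as a searchable `message`.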
Lessons learned: What really matters in production
Dominik distilled several practical principles that resonated throughout the session.
1) Add rich metadata: Context turns logs into answers
A message like “user logged out” is nearly useless without “which user?”. The remedy:
- Always include IDs for affected entities (user IDs, tracker IDs, payment IDs, subscription IDs).
- For domain events (e.g., creating a pet), add the fields that will matter in incident analysis.
- For deletions, at least log the ID of the deleted object.
With these basics, logs stop being vague statements and start serving as precise, correlatable events.
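As a rough illustration of attaching IDs at the point of emission, the sketch below uses Python's standard `logging` module with a small JSON formatter. The field list in `CONTEXT_FIELDS`, the logger name, and the ID values are assumptions for the example; the talk does not prescribe a specific library or schema.

```python
import io
import json
import logging
from datetime import datetime, timezone

# Assumed set of domain IDs worth carrying on every event.
CONTEXT_FIELDS = ("user_id", "tracker_id", "payment_id", "subscription_id")


class JsonFormatter(logging.Formatter):
    """Render a log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        event = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc).isoformat(),
            "level": record.levelname,
            "message": record.getMessage(),
        }
        # Copy over any domain IDs passed via `extra=`.
        for field in CONTEXT_FIELDS:
            value = getattr(record, field, None)
            if value is not None:
                event[field] = value
        return json.dumps(event)


# Write to an in-memory buffer here; a container would log to stdout.
buffer = io.StringIO()
handler = logging.StreamHandler(buffer)
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("example.sessions")
logger.setLevel(logging.INFO)
logger.addHandler(handler)
logger.propagate = False

logger.info("user logged out", extra={"user_id": "u-42"})
logged_event = json.loads(buffer.getvalue())
```

The message stays short and human-readable, while the `user_id` field answers "which user?" and can be filtered on directly.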
2) Use log levels deliberately—and differently per environment
- It’s valid to log more details in staging or test than in production.
- Production logging should avoid noise while preserving the detail needed for diagnosis.
Teams need to actively decide what counts as DEBUG/INFO/WARN/ERROR and revisit those choices as systems evolve.
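One simple way to express per-environment levels is a lookup table consulted at startup. The sketch below is illustrative only: the environment names and the chosen levels are assumptions, and each team would decide its own mapping.

```python
import logging

# Assumed mapping: verbose outside production, quieter in production.
LEVEL_BY_ENVIRONMENT = {
    "development": logging.DEBUG,
    "test": logging.DEBUG,
    "staging": logging.DEBUG,
    "production": logging.INFO,
}


def level_for(environment: str) -> int:
    """Return the log level for an environment, defaulting to INFO."""
    return LEVEL_BY_ENVIRONMENT.get(environment.lower(), logging.INFO)


def configure_logger(name: str, environment: str) -> logging.Logger:
    """Create or fetch a logger and set its level for the environment."""
    logger = logging.getLogger(name)
    logger.setLevel(level_for(environment))
    return logger


staging_logger = configure_logger("example.staging", "staging")
production_logger = configure_logger("example.production", "production")
```

In a real service the environment name would typically come from configuration, for example an environment variable, so the same build can run with different verbosity in each stage.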
3) Never log sensitive information
Dominik explicitly calls out personally identifiable information, secrets, keys, API keys, and anything else you don’t want others to read. Because Kibana makes searching trivial, compliance starts at the point of emission—by not putting sensitive data into logs in the first place.
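A common safety net, sketched below, is to mask known sensitive keys before an event is serialized. The key list and the `[REDACTED]` marker are assumptions for illustration; a key-based filter like this only catches fields it knows about, so it complements, rather than replaces, the discipline of not logging such data at all.

```python
# Assumed list of keys that should never reach the log store.
SENSITIVE_KEYS = {"password", "secret", "token", "api_key", "email"}
REDACTED = "[REDACTED]"


def redact(value):
    """Return a copy of value with sensitive keys masked, recursively."""
    if isinstance(value, dict):
        return {
            key: REDACTED if str(key).lower() in SENSITIVE_KEYS else redact(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item) for item in value]
    return value


# Hypothetical event that accidentally carries PII and a payment token.
event = {
    "message": "user created",
    "user_id": "u-42",
    "email": "someone@example.com",
    "payment": {"token": "tok_123", "amount_cents": 499},
}
clean = redact(event)
```

The original event is left untouched, and IDs such as `user_id` survive so the log entry remains useful for correlation.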
4) Log third‑party calls comprehensively
External integrations often fail in unpredictable ways. To debug effectively, Dominik advises logging:
- What data you sent.
- What data you received.
- The headers—since they can contain valuable clues when systems fail.
This completeness shaves off hours or days when chasing down integration issues.
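The sketch below shows one possible shape for such an event: request, response, and headers in a single record, with credential-bearing headers masked so the advice about sensitive data still holds. The service name, URL, header list, and function names are hypothetical, and no real HTTP call is made.

```python
import json

# Assumed list of headers that typically carry credentials.
SENSITIVE_HEADERS = {"authorization", "cookie", "set-cookie", "x-api-key"}


def safe_headers(headers: dict) -> dict:
    """Copy headers, masking values that usually contain secrets."""
    return {
        name: "[REDACTED]" if name.lower() in SENSITIVE_HEADERS else value
        for name, value in headers.items()
    }


def third_party_call_event(service, method, url, request_headers,
                           request_body, status, response_headers,
                           response_body) -> dict:
    """Build one JSON-serializable event describing an outbound call."""
    return {
        "level": "INFO" if status < 400 else "ERROR",
        "message": "third-party call",
        "service": service,
        "request": {"method": method, "url": url,
                    "headers": safe_headers(request_headers),
                    "body": request_body},
        "response": {"status": status,
                     "headers": safe_headers(response_headers),
                     "body": response_body},
    }


# Hypothetical failed call to a payment provider.
event = third_party_call_event(
    service="payment-provider",
    method="POST",
    url="https://payments.example.com/v1/charges",
    request_headers={"Authorization": "Bearer abc123",
                     "Content-Type": "application/json"},
    request_body={"subscription_id": "sub-7", "amount_cents": 499},
    status=502,
    response_headers={"Retry-After": "30"},
    response_body={"error": "upstream timeout"},
)
line = json.dumps(event)
```

Headers such as `Retry-After` are exactly the kind of clue that is lost when only the response body is logged.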
5) Manage volume consciously
Dominik cites “around 20 gigabytes of logs per day” across systems. That’s a storage and cost reality, but also a signal‑to‑noise and query‑performance constraint. Elastic’s lifecycle features are essential to keep growth and retention in check (more below).
What to do with all those logs: From archive to operational muscle
Centralizing logs is necessary; making them part of daily routines is what extracts value. Dominik outlines a pragmatic operating model.
Daily routine: Morning log review and ticket creation
- Developers are “trained to understand how to use the logs.”
- Each morning, a rotating person reviews errors: any new or unknown error patterns?
- Findings lead to tickets in the team’s ticketing system.
This practice plants prevention into everyday work—small issues surface before they turn into incidents.
After deployments: Error log reviews
- After a rollout, the team intentionally checks if error profiles changed.
- This helps the team catch regressions early and roll back or patch with confidence.
Log‑driven observability: Monitoring and alerting
Alongside application performance monitoring, the team sets up log‑based alerts:
- Logs tell business stories: purchases, user creation, and more.
- Thresholds per time window provide simple but effective guardrails. One example given: “no payments within one hour” triggers an alert derived from logs.
The big win: there is no second telemetry system to build. If the system logs meaningful events, you can alert on them directly.
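The "no payments within one hour" rule can be sketched as a check over recent events. In practice this would more likely be a scheduled query against the log store; the in-memory list, the `event` field, and the `payment_completed` value below are assumptions made to keep the example self-contained.

```python
from datetime import datetime, timedelta, timezone


def no_payments_alert(events, now, window=timedelta(hours=1)) -> bool:
    """Return True when no payment event falls inside the trailing window."""
    cutoff = now - window
    for event in events:
        if event.get("event") != "payment_completed":
            continue
        occurred_at = datetime.fromisoformat(event["timestamp"])
        if cutoff <= occurred_at <= now:
            return False
    return True


now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
events = [
    {"timestamp": "2024-01-01T10:30:00+00:00",
     "event": "payment_completed", "payment_id": "p-1"},
    {"timestamp": "2024-01-01T11:45:00+00:00",
     "event": "user_created", "user_id": "u-9"},
]

# The last payment was 90 minutes ago, so the alert should fire.
quiet_hour = no_payments_alert(events, now)

# A payment 10 minutes ago keeps the alert silent.
recent_payment = {"timestamp": "2024-01-01T11:50:00+00:00",
                  "event": "payment_completed", "payment_id": "p-2"}
busy_hour = no_payments_alert(events + [recent_payment], now)
```

An absence-of-events rule like this catches failures that produce no error at all, such as a payment provider silently rejecting traffic.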
Lifecycle management: Retain, then delete automatically
Elastic’s Index Lifecycle Management provides controls for retention:
- Configure how long to keep data (e.g., 30 days).
- After that period, logs are automatically deleted.
- You constrain growth and keep budgets predictable.
Picking the right retention is a balance: enough to investigate issues and trends, but not so much that storage and searches suffer.
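For orientation, the sketch below builds a request body in the shape accepted by Elasticsearch's ILM policy API (`PUT _ilm/policy/<name>`). The daily rollover in the hot phase and the helper name are assumptions; only the idea of deleting after a fixed period, such as 30 days, comes from the talk.

```python
import json


def build_retention_policy(retention_days: int) -> dict:
    """Build an ILM policy body that deletes indices after retention_days."""
    return {
        "policy": {
            "phases": {
                # Assumed: roll over to a fresh index once a day.
                "hot": {"actions": {"rollover": {"max_age": "1d"}}},
                # With rollover in use, min_age counts from the rollover.
                "delete": {
                    "min_age": f"{retention_days}d",
                    "actions": {"delete": {}},
                },
            }
        }
    }


policy = build_retention_policy(30)
policy_json = json.dumps(policy, indent=2)
```

Keeping the retention period as a single parameter makes it easy to use a shorter window in staging than in production.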
What a log looks like: Core shape with domain‑specific fields
Dominik describes a sample application log:
- A timestamp.
- A log level.
- A message.
In addition, domain‑specific IDs appear where they make sense—payment IDs, tracker IDs, subscription IDs, and so forth. Other logs carry different fields aligned with their purpose. This flexible, JSON‑based structure makes searching powerful: you can filter by IDs, correlate events, and reconstruct failure paths.
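To show why shared ID fields matter for search, here is a small sketch that filters events by ID and orders them in time, which is conceptually what a field query in Kibana does. The events, field names, and the helper `events_for` are hypothetical.

```python
def events_for(events, **ids):
    """Return events matching all given ID fields, oldest first."""
    matches = [
        event for event in events
        if all(event.get(field) == value for field, value in ids.items())
    ]
    # ISO 8601 timestamps in one format and time zone sort as plain strings.
    return sorted(matches, key=lambda event: event["timestamp"])


# Hypothetical events from two different services.
events = [
    {"timestamp": "2024-01-01T08:00:05+00:00", "level": "ERROR",
     "message": "position rejected", "tracker_id": "T-123"},
    {"timestamp": "2024-01-01T08:00:01+00:00", "level": "INFO",
     "message": "tracker connected", "tracker_id": "T-123"},
    {"timestamp": "2024-01-01T08:00:03+00:00", "level": "INFO",
     "message": "tracker connected", "tracker_id": "T-999"},
]
timeline = events_for(events, tracker_id="T-123")
messages = [event["message"] for event in timeline]
```

Because both services use the same `tracker_id` field, the failure path for one device can be reconstructed across them with a single filter.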
A practical path for teams seeking similar outcomes
Staying close to the talk’s content, here’s a concrete path—using the components Dominik covered.
1) Standardize on JSON
- Make JSON the shared contract for logs across services.
- Agree on common field names (e.g., timestamp, level, message, service, request_id, user_id, tracker_id).
- Document domain‑level fields per service so developers know what to log.
2) Leverage the container layer: Docker’s JSON logger
- Configure Docker to write logs as JSON to files.
- This provides a single, well‑formed source for Filebeat to harvest.
3) Collect with Filebeat
- Deploy Filebeat on your nodes to read the Docker‑generated files.
- New services are picked up automatically when they start emitting logs.
4) Enrich and route with Logstash
- Parse the JSON structure.
- Add environment, service name, host, container ID, and domain IDs as needed.
- Optionally filter to cut noise without losing diagnostic value.
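Logstash pipelines are written in Logstash's own configuration language, so the Python below only illustrates the logic of this step: add context fields to each event and drop low-value entries. The field names, level ordering, and sample values are assumptions, not Tractive's configuration.

```python
# Assumed ordering of levels for the noise filter.
LEVEL_ORDER = {"DEBUG": 10, "INFO": 20, "WARN": 30, "ERROR": 40}


def enrich(event, *, environment, service, host, container_id):
    """Return a copy of the event with context fields filled in."""
    enriched = dict(event)
    # setdefault keeps values the service already provided.
    enriched.setdefault("environment", environment)
    enriched.setdefault("service", service)
    enriched.setdefault("host", host)
    enriched.setdefault("container_id", container_id)
    return enriched


def keep(event, minimum_level="INFO"):
    """Decide whether an event is worth storing."""
    level = str(event.get("level", "INFO")).upper()
    return LEVEL_ORDER.get(level, 20) >= LEVEL_ORDER[minimum_level]


def process(events, **context):
    """Filter out noise, then enrich what remains."""
    return [enrich(event, **context) for event in events if keep(event)]


# Hypothetical input from one container.
incoming = [
    {"level": "DEBUG", "message": "cache miss"},
    {"level": "ERROR", "message": "payment failed", "payment_id": "p-1"},
]
stored = process(incoming, environment="production", service="rest-api",
                 host="node-3", container_id="c0ffee")
```

Doing enrichment centrally means individual services do not each need to know their host or container ID.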
5) Store and search in Elasticsearch
- Plan an index strategy that fits retention and query patterns.
- Map frequently queried fields appropriately.
6) Operationalize with Kibana
- Use ad‑hoc search for daily debugging.
- Build saved searches and dashboards for recurring questions (e.g., errors by service or version).
7) Define team processes around logs
- Morning error reviews with a rotating on‑duty person.
- Post‑deployment error reviews.
- Log‑based alerts on meaningful thresholds (e.g., “no purchases in 60 minutes”).
8) Keep hygiene front and center
- Calibrate log levels by environment.
- Don’t log sensitive data.
- For third‑party payloads, log enough detail for troubleshooting while avoiding sensitive content.
9) Apply lifecycle management
- Set explicit retention (e.g., 30 days as mentioned in the talk’s example).
- Let ILM automatically delete aged data to prevent unchecked growth.
What stood out to us: Why this approach works
- A single format reduces friction: JSON tames parsing across languages and runtimes.
- Enrichment is essential, not optional: Without IDs, troubleshooting is guesswork.
- Process is as important as tooling: Morning checks and post‑deploy reviews turn logs into a daily habit, not a passive archive.
- Log‑driven alerts track real business behavior: Events like purchases or user creation are excellent, robust signals to watch.
- Retention protects both operability and budget: Lifecycle management keeps the dataset right‑sized and compliant.
Common pitfalls—and how the session addresses them
- Unstructured text: Free‑form logs are hard to correlate. JSON fixes that.
- Missing context: “An error occurred” is not actionable. IDs, headers, and request/response snapshots are.
- Sensitive data leakage: Establish rules and reviews early; better yet, never log sensitive fields.
- Alert fatigue: Align thresholds with meaningful business signals (like “no payments in the last hour”), not just raw technical counters.
- “Set and forget”: Log levels, fields, and processes must evolve alongside the system.
Conclusion: A resilient, everyday logging practice for a polyglot IoT stack
“Logging in a Polyglot IoT Environment” by Dominik Hurnaus lays out a practical path:
- Polyglot stack? Normalize on JSON.
- Containerized platform? Use Docker’s JSON logger with Filebeat for collection.
- Centralize and search? Logstash → Elasticsearch → Kibana.
- Make logs useful? Enrich with IDs, set log levels deliberately, and avoid sensitive data.
- Embed in operations? Daily reviews, post‑deploy checks, and log‑based alerting.
- Control growth? Lifecycle management with explicit retention (e.g., 30 days).
The outcome is more than a tooling diagram—it’s an operating model where logs become an early warning system, a diagnostic bedrock, and a quality lever. In an environment spanning field devices, mobile clients, and a scaling backend, the emphasis on structure, context, and routine is exactly what keeps complexity in check.
Dominik closes by noting that Tractive GmbH is hiring across multiple roles—senior backend engineers (Kotlin or Ruby), cloud engineers, software testers, hardware engineers, and firmware engineers. If you’re drawn to IoT products, polyglot stacks, and disciplined engineering practice, it’s a compelling setup where logging isn’t an afterthought, but a catalyst for reliable delivery.