Tractive GmbH
The purpose of your data
Description
In his devjobs.at TechTalk, Clemens Kaar of Tractive explains the difference between functional data and analytical data, and the important role this distinction can play in a dev team.
Video Summary
In “The purpose of your data,” Clemens Kaar (Tractive GmbH) argues that every dataset serves two distinct purposes—functional (product-facing) and analytical (insight-focused)—and that mixing them harms focus and quality. Using GPS/activity, marketing, and app data, he demonstrates how to separate concerns via ETL/data warehousing with Airflow, Apache Spark, Amazon S3/Redshift, and BI tools, and cautions against pushing analytical needs (e.g., weight history for a BMI calculator) into product databases. He closes with practical steps—build company-wide awareness, mirror the split in org structure, and embed it in delivery processes (joint kickoffs, separate tickets)—so teams can scale analytics without compromising product performance.
Separate functional and analytical data: Engineering takeaways from “The purpose of your data” by Clemens Kaar (Tractive GmbH)
Setting the stage: Tractive’s context and the session’s focus
In “The purpose of your data,” Clemens Kaar, Head of Big Pet Data at Tractive GmbH, makes a deceptively simple point with far-reaching implications: every piece of data serves a purpose—usually two. There is a functional purpose (what the product or application must deliver to customers right now) and an analytical purpose (what teams will want to learn from the same data later). Kaar’s core message is unambiguous: if you mix those purposes, you lose focus, sacrifice performance, and undermine your ability to analyze data reliably.
For context, Tractive is the market leader in GPS tracking for cats and dogs. Their device is also an activity tracker that reports activity and sleep behavior to the owner’s smartphone. The company is headquartered near Linz (Pasching), has around 140 employees from more than 30 nations (English is the company language), recently opened a US office, and counts roughly 500,000 cats and dogs wearing Tractive devices in more than 150 countries.
Against this backdrop, Kaar’s thesis becomes tangible. Tractive’s data landscape is broad—GPS, activity, user, connectivity, app, web, marketing, logistics, sales, finance, webshop data, and information from external tools (he mentioned MailChimp explicitly). In that variety, teams tend to begin with functional goals. Over time, however, analytical needs inevitably emerge. The shift from “ship the feature” to “understand the data” is exactly where many organizations stumble.
The core idea: Every dataset has two purposes
Kaar consistently differentiates between functional and analytical data purposes, grounding the distinction in concrete examples:
- GPS data
  - Functional purpose: tell customers where their cat or dog is right now. The device reports the path and current location to the phone.
  - Analytical purpose: the product team might want to know when hunting season starts in certain regions, to coordinate with partners or improve product quality.
- Activity data
  - Functional purpose: show owners how active their dog is and what the sleeping behavior looks like.
  - Analytical purpose: internally, compare, for example, the sleeping behavior of beagles with that of Siberian huskies.
- Marketing data
  - Functional purpose: run campaigns efficiently (e.g., Google campaigns).
  - Analytical purpose: compare campaigns, understand behavior, and derive strategies toward efficiency.
These examples show how the same data stream demands different modeling, query patterns, performance characteristics, and usability in the two worlds.
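To make the contrast concrete, here is a minimal Python sketch of the two access patterns over the same GPS stream. It is an illustration only, not Tractive's implementation: the `Position` record, the in-memory `latest` dictionary, and the `analytical_log` list are hypothetical stand-ins for a product database and a warehouse table.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class Position:
    tracker_id: str
    ts: int  # seconds since epoch (hypothetical unit)
    lat: float
    lon: float


# Functional side (hypothetical): one entry per tracker, only the latest fix.
latest: Dict[str, Position] = {}

# Analytical side (hypothetical): append-only copy of every fix, standing in
# for a warehouse table that a separate pipeline would fill.
analytical_log: List[Position] = []


def update_latest(p: Position) -> None:
    """Functional write: keep only the newest fix per tracker."""
    current = latest.get(p.tracker_id)
    if current is None or p.ts >= current.ts:
        latest[p.tracker_id] = p


def current_location(tracker_id: str) -> Optional[Position]:
    """Functional read: a point lookup for a single tracker."""
    return latest.get(tracker_id)


def fixes_per_hour(log: Iterable[Position]) -> Dict[int, int]:
    """Analytical read: a bulk scan that counts fixes per hour of day (UTC)."""
    counts: Dict[int, int] = defaultdict(int)
    for p in log:
        counts[(p.ts // 3600) % 24] += 1
    return dict(counts)


fixes = [
    Position("t1", 0, 48.2, 14.2),
    Position("t1", 3700, 48.3, 14.3),
    Position("t2", 3600, 47.0, 13.0),
]
for fix in fixes:
    update_latest(fix)
analytical_log.extend(fixes)
```

The functional read touches exactly one key, while the analytical read scans every record. Tuning a single store for both patterns is where the conflict described in the talk shows up.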
Data sprawl: Internal and external sources
Kaar underlines that data resides across many systems:
- Internal databases—for webshop data, user data, GPS data, and more.
- External systems—he cited MailChimp as a tool used for newsletter campaigns.
In practice, initial implementation rightly focuses on the functional purpose, because that is why the systems are built in the first place. But as products evolve, stakeholders request analyses, and too often those requests target the functional systems first. That is the point where the trouble starts.
The pitfall: Cramming analytics into functional systems
The initial situation is familiar: “We need to deliver for the customer.” Once that’s done, product, research, or marketing asks for extra fields, history, or cross-source joins. The seemingly pragmatic response is to add columns, tables, or logging in the transactional database and call it a day.
Kaar’s verdict: that’s not the way to go.
- Focus erosion: Functional systems are built to power the product. Analytical extensions blur scope and accountability.
- Performance conflict: Functional access is about low-latency reads of a single object (e.g., a tracker’s current location). Analytical access is about bulk processing across large sets—exactly the opposite.
- Divergent usability: Customer-facing representations differ drastically from what analysts need to work with.
- Schema drift and friction: Analytics drives format harmonization and cross-source joins that do not belong in production databases.
In short, mingling the two purposes means degrading the product and the analytics at the same time.
The architectural answer: A separate analytics stack
The structural remedy is to split the worlds and keep them separate. Kaar outlines several architectural routes organizations take to implement this separation:
- Use a SQL tool for direct analysis over extracted data.
- Adopt a tool like Tableau that can also bring its own data warehouse setup.
- Build your own data warehouse to extract, transform, and load data from the various sources—then visualize it.
- Combine your own data warehouse with a warehouse offered by a tool vendor.
- Route via a data lake: extract data, land it in the lake, process it, load it into a data warehouse, then visualize it.
In every case, the essence is the same: functional systems serve product functionality; analytical systems serve insight—and they are decoupled technically, organizationally, and procedurally.
Why the separation matters—Kaar’s rationale
Kaar lists concrete reasons for the split, which serve as clear design drivers:
- Join information across sources: analytics often spans datasets that have no direct functional relationship.
- Unify data formats: harmonization belongs in the analytics path.
- Optimize performance for each world:
  - Functional: very fast access to individual entities (e.g., current GPS position).
  - Analytical: bulk analysis over large volumes.
- Respect different usability needs: customer-facing and analyst-facing views are distinct and deserve their own models and access paths.
A data warehouse is not the goal—it’s the means to realize these differing requirements cleanly and predictably.
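The first two reasons, joining across sources and unifying formats, can be sketched in a few lines of Python. Everything here is assumed for illustration: the field names, the date formats, and the sample rows are hypothetical and do not reflect the real export format of any webshop or newsletter tool.

```python
from datetime import date, datetime

# Hypothetical extracts from two unrelated sources with different conventions.
webshop_orders = [
    {"email": "Anna@Example.com", "ordered_on": "2023-03-01", "amount_eur": 49.9},
    {"email": "ben@example.com", "ordered_on": "2023-03-05", "amount_eur": 99.0},
]
newsletter_clicks = [
    {"Email Address": "anna@example.com ", "Clicked": "28.02.2023"},
]


def normalize_email(raw):
    """Unify the join key: trim whitespace and lowercase."""
    return raw.strip().lower()


def harmonize_orders(rows):
    """Map webshop rows (ISO dates) onto a shared schema."""
    return [
        {
            "email": normalize_email(r["email"]),
            "day": date.fromisoformat(r["ordered_on"]),
            "amount_eur": r["amount_eur"],
        }
        for r in rows
    ]


def harmonize_clicks(rows):
    """Map newsletter rows (dd.mm.yyyy dates) onto the same schema."""
    return [
        {
            "email": normalize_email(r["Email Address"]),
            "day": datetime.strptime(r["Clicked"], "%d.%m.%Y").date(),
        }
        for r in rows
    ]


def orders_after_click(orders, clicks):
    """Join both sources: orders placed on or after a customer's first click."""
    first_click = {}
    for c in clicks:
        previous = first_click.get(c["email"])
        if previous is None or c["day"] < previous:
            first_click[c["email"]] = c["day"]
    return [
        o
        for o in orders
        if o["email"] in first_click and o["day"] >= first_click[o["email"]]
    ]


matched = orders_after_click(
    harmonize_orders(webshop_orders), harmonize_clicks(newsletter_clicks)
)
```

Neither source system needs this join for its own job, which is why the normalization and the join logic sit more naturally in the analytics path than in either product database.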
How Tractive runs its pipeline
Kaar describes Tractive’s analytics flow with specific components:
- Orchestration: Airflow triggers the extraction jobs.
- Processing: Apache Spark executes the extractions.
- Staging storage: Data is loaded into Amazon S3.
- Data warehouse: Data is pushed into Amazon Redshift.
- Visualization: “Any kind of data visualization tools”—the specific tool is not the point; the dedicated access path is.
This gives Tractive a dedicated analytics landscape, freeing functional systems from analytical load while providing analysts with integrated, performant data.
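The shape of such a flow can be sketched with the Python standard library alone. This is deliberately not Airflow, Spark, S3, or Redshift code: a plain function plays the orchestrator, a temporary directory plays the staging storage, and an in-memory SQLite database plays the warehouse. All function, table, and field names are hypothetical.

```python
import json
import sqlite3
import tempfile
from pathlib import Path


def extract(source_rows):
    """Stand-in for an extraction job: select and rename source fields."""
    return [
        {"tracker_id": r["id"], "minutes_active": r["active_min"]}
        for r in source_rows
    ]


def stage(rows, staging_dir):
    """Stand-in for staging storage: write the batch as a JSON Lines file."""
    path = Path(staging_dir) / "activity.jsonl"
    with path.open("w", encoding="utf-8") as f:
        for row in rows:
            f.write(json.dumps(row) + "\n")
    return path


def load(staged_file, warehouse):
    """Stand-in for the warehouse load: bulk-insert the staged batch."""
    warehouse.execute(
        "CREATE TABLE IF NOT EXISTS activity "
        "(tracker_id TEXT, minutes_active INTEGER)"
    )
    with staged_file.open(encoding="utf-8") as f:
        rows = [json.loads(line) for line in f]
    warehouse.executemany(
        "INSERT INTO activity VALUES (:tracker_id, :minutes_active)", rows
    )
    warehouse.commit()


def run_pipeline(source_rows, warehouse):
    """Stand-in for the orchestrator: run the stages in order."""
    with tempfile.TemporaryDirectory() as tmp:
        staged = stage(extract(source_rows), tmp)
        load(staged, warehouse)


warehouse = sqlite3.connect(":memory:")
run_pipeline(
    [{"id": "t1", "active_min": 30}, {"id": "t2", "active_min": 90}],
    warehouse,
)
# An analytical query runs against the warehouse, never the source system.
avg_minutes = warehouse.execute(
    "SELECT AVG(minutes_active) FROM activity"
).fetchone()[0]
```

The point of the sketch is the boundary: the source system is read once per run, and every later analytical query hits the warehouse copy instead.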
Why that still isn’t enough: The BMI/weight history example
Kaar’s hypothetical feature example is particularly instructive: a BMI calculator for dogs. Functionally, all you need is a weight input and an immediate result in the app—akin to the human BMI. That satisfies the customer-facing goal.
Later, a research team wants to analyze “how weight develops over time.” That question requires a historical record of weight changes. A common, but problematic, reaction is to modify the functional database to store every weight change as a new record, keeping the full history—even though the history is not intended for the customer, but purely for analysis.
This is precisely where the boundary lies: implementing a purely analytical requirement in the functional database mixes the two worlds again. Kaar argues it would be better to implement the weight history in the analytics pipeline structure. In reality, however, stakeholders often approach developers first (“we need to capture this data”), not the data team—recreating the root cause of the problem.
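One possible way to keep the history on the analytics side is to snapshot the current value on every pipeline run. The talk does not describe how Tractive would build this, so the sketch below is an assumption throughout: `functional_pets`, `weight_history`, and `snapshot_weights` are hypothetical names, and plain Python structures stand in for the product database and the warehouse.

```python
from datetime import date

# Functional side (hypothetical): only the current weight per pet.
functional_pets = {"dog-1": {"weight_kg": 20.0}}

# Analytical side (hypothetical): append-only history owned by the pipeline.
# Each entry is (pet_id, snapshot_date, weight_kg), in chronological order.
weight_history = []


def snapshot_weights(pets, history, run_date):
    """Append a history row for every pet whose weight changed since the
    last recorded snapshot. Returns the number of rows added."""
    last_seen = {}
    for pet_id, _, weight in history:
        last_seen[pet_id] = weight  # later rows overwrite earlier ones
    added = 0
    for pet_id, row in pets.items():
        if last_seen.get(pet_id) != row["weight_kg"]:
            history.append((pet_id, run_date, row["weight_kg"]))
            added += 1
    return added


first = snapshot_weights(functional_pets, weight_history, date(2023, 1, 1))
unchanged = snapshot_weights(functional_pets, weight_history, date(2023, 1, 2))

# The app overwrites the single functional value; no history is kept there.
functional_pets["dog-1"]["weight_kg"] = 21.5
changed = snapshot_weights(functional_pets, weight_history, date(2023, 1, 3))
```

The trade-off is granularity: changes that happen between two runs are collapsed into one. If the research question needs every change, that requirement has to be made explicit and handled as its own analytical task.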
The lesson is clear: technology alone won’t keep you out of trouble. You need the right mindset, organizational structure, and process.
Three recommendations: Awareness, organization, process
Kaar ends with three practical recommendations for sustaining the separation:
1) Build awareness
- Everyone—from product and engineering to data and marketing—must recognize the two sides of a dataset: functional and analytical.
- With this shared mental model, ownership and responsibilities become easier to define: who delivers the feature, who enables the analysis?
2) Separate organizationally
- Tractive has data engineers, data people, and developers on the functional side—and data experts and developers on the analytical side.
- This organizational distinction clarifies priorities and keeps the two worlds from bleeding into each other.
3) Embed separation in the process
- New features at Tractive kick off with both sides present: the people implementing the functional feature and those responsible for the analytical implementation.
- Tickets are kept separate: a ticket for the functional part and a dedicated ticket for the analytical part.
- This prevents analytical requirements from being silently absorbed into transactional schemas.
Engineering implications you can apply
Within the scope of Kaar’s talk, engineers can translate the ideas into concrete actions:
- Explicitly define the purpose of each datum. For every new feature, ask: what is the functional purpose? What is the analytical purpose?
- Implement in the right place. Anything that serves analytics only—history, aggregations, harmonization—belongs in the analytics stack, not in the functional schema.
- Establish a robust pipeline. A disciplined flow—extraction, processing, staging, warehouse, visualization—creates a reliable path for analytics.
- Optimize separately for divergent performance needs. Functional: low latency, point queries; analytical: throughput, batch processing, unified schemas.
- Keep ownership clean. Who is responsible for the feature? Who is responsible for making it analyzable? Reflect this in kickoffs and tickets.
Details that matter in day-to-day work
Kaar highlighted several nuances that often get overlooked:
- “Functional first” is normal—but not final. Systems are understandably built for immediate customer value. Analytical requirements will emerge; direct them into a dedicated analytics architecture.
- Distinct usability requirements. What customers should see and how analysts need to work are different problems, legitimately inviting different schemas, transformations, and access models.
- Bulk over single-object access. Analytics rarely cares about “one tracker”; it cares about patterns over many trackers. That is a different access pattern than an app showing a current location.
- Joining and harmonizing as first-class analytics needs. Kaar’s examples show that insight often spans app, web, marketing, and GPS data. That integration belongs in the analytics world.
The recurring anti-pattern—and how to break it
The cycle Kaar describes is familiar:
1) A team needs new insight.
2) They ask the nearest accessible team—often functional developers.
3) Developers add a few fields or tables.
4) The functional database becomes a quasi-analytics system.
5) Both product performance and analysis quality suffer.
The antidote is to make this cycle visible and replace it with an established alternative: the analytics path. With the organizational and process separation Kaar describes, teams can route requests correctly from the start.
Memorable examples to anchor the mindset
- GPS: customer location now vs. hunting-season patterns later. Same stream, opposing requirements.
- Activity and sleep: customers see behavior; internally, breed-level comparisons are made. The customer view is not the analytics view.
- BMI/weight history: functionally, a single value is enough; analytically, you need a history. Where you build that history determines the health of your data ecosystem.
These examples are intentionally concrete. They keep the discussion grounded and prevent the abstract from hiding the practical trade-offs.
Tools serve the principle—architecture is the point
Airflow, Apache Spark, Amazon S3, and Redshift are Kaar’s concrete components. But the real lesson is the principle: orchestrate, process, stage, warehouse, visualize—inside a separate analytics path. Unless this separation is protected organizationally and procedurally, even modern tooling won’t prevent old mistakes.
Conclusion: A mindset that reduces complexity
Kaar’s closing recommendation is straightforward: ensure everyone shares the mindset that there is a functional purpose and an analytical purpose. With that in place, teams can assign ownership cleanly, address trade-offs explicitly, and design architectures that let both worlds thrive.
For us, “The purpose of your data” was a timely reminder that technical choices without organizational and process support fall short. A clean separation of data purposes—and of the systems that serve them—is not a luxury; it is a prerequisite for both product operation and meaningful analysis. Customers get a reliable, real-time view of where their cat or dog is, and internal teams can derive robust insights—without one goal cannibalizing the other.
Kaar closed by inviting further conversation—via email or on LinkedIn. The session provides a clear common language for that dialogue: functional is what the app delivers; analytical is what the company learns from it. Each deserves its own space.