
Johannes Pesenhofer, Data Scientist bei smaXtec

Description

In this interview, Johannes Pesenhofer of smaXtec talks about how he arrived at his current work as a Data Scientist and what he considers important for beginners.


Video Summary

In "Johannes Pesenhofer, Data Scientist bei smaXtec," speaker Johannes Pesenhofer recounts a non-linear path from early MySQL tinkering and an electrical engineering HTL (C/C++) to a mixed CS/EE degree, leading to an internship at smaXtec after his second semester. He outlines work spanning static analysis in SQL/Python and real-time stream processing of bolus data via Kafka, with the primary challenge being productionizing, optimizing performance, and scaling from single-animal analysis to hundreds of thousands. His advice: there’s no single path—passion for problem solving and persistence matter; a degree can provide a useful toolbox but isn’t required; always start from the problem, not the “best” programming language.

From MySQL and Microcontrollers to Real-Time Streams: Lessons from “Johannes Pesenhofer, Data Scientist bei smaXtec” for pragmatic, production-grade Data Science

How we experienced the session

“Johannes Pesenhofer, Data Scientist bei smaXtec” from smaXtec animal care GmbH offers a clear, grounded look at what Data Science work really entails and how one ends up doing it. Instead of buzzwords, Pesenhofer shares practical waypoints: first steps with a MySQL database, a high school path through electrical engineering and microcontrollers, and then a shift into a role that deals with real-time processing over Kafka. The core message is disarmingly honest: the analysis itself is often quick; the real work starts when you turn insights into reliable, scalable systems.

Detours as a driver: Early contact with data and systems

“I got into Data Science a bit via detours,” Pesenhofer notes. Those detours start early:

  • Middle school: first exposure to databases, an ECDL computer license, and—crucially—a privately hosted server set up with friends, which led to his first MySQL experience.
  • HTL (higher technical school): electrical engineering, microcontroller programming, lots of C and C++, and a distinctly hardware-near perspective.

The common thread is curiosity and hands-on problem solving. The contrast—from a self-hosted MySQL playground to embedded C/C++—sets up an approach that later defines his Data Science practice: let real problems guide the path.

From hardware to the math question: starting university and finding a Data Scientist role

After HTL, Pesenhofer begins university with the desire to “do something with mathematics,” yet without a precise job picture. He looks around, discovers smaXtec, sees the role “Data Scientist,” and decides on a mixed program of computer science and electrical engineering. The aim: expand programming skills and broaden the base. After his second semester, he starts an internship at smaXtec.

This stage underscores two simple points:

  • Concrete roles create orientation: “Data Scientist” becomes a tangible option rather than an abstract label.
  • Education can be modular: electrical engineering for system proximity, computer science for software competence, mathematics as a way of thinking.

Data work in two speeds: static analysis and streaming

Pesenhofer describes a work spectrum we encounter often in data teams, though it’s rarely summarized this crisply. It spans from “static analysis” to “real-time streaming”:

  • Static analysis: data sits in a database; you write queries and analyze to “figure things out.” The emphasis is on methodical work over hype.
  • Streaming: “When the bolus measures data, those messages are sent via Kafka to our system and we have to analyze them in real time.” Here, the logic shifts: the stream doesn’t pause, and latency and reliability become first-order requirements.

Moving between batch-style thinking and streaming is part of the daily rhythm. It’s not two separate universes; it’s the same discipline operating at two different speeds and risk profiles.
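The two speeds can be made concrete with a toy sketch: the same computation (a mean temperature) done once over data at rest, and incrementally over data in motion. This is an illustrative sketch only, not smaXtec's code; a plain Python generator stands in for a Kafka topic, and the temperature values are invented.

```python
# --- Static analysis: all data is already at rest, compute in one pass ---
stored = [38.6, 39.1, 38.4, 38.9]
batch_mean = sum(stored) / len(stored)

# --- Streaming: data arrives one message at a time and never pauses ---
def topic():
    """Stand-in for a Kafka consumer loop (hypothetical data source)."""
    for temp in [38.6, 39.1, 38.4, 38.9]:
        yield temp

count, total = 0, 0.0
latest_mean = None
for temp in topic():
    count += 1
    total += temp
    # The answer must be current after *every* message -- there is no
    # "wait until all data is in" as there is in the batch case.
    latest_mean = total / count
```

The arithmetic is identical; what changes is the contract: the batch version answers once, the streaming version must answer continuously, which is where the latency and reliability requirements Pesenhofer mentions come from.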

One shot in the stream: robustness and performance are non-negotiable

Pesenhofer’s phrasing about streaming is memorable: “When the data has been there once and something goes wrong, then it’s gone.” He adds: “You more or less have only one chance; you also have several chances, but that involves warm-up.”

Three takeaways stand out:

  1. Error handling in stream processing is a first-class concern. There is no easy “try again later” fallback.
  2. Performance is a core requirement. Real time means “on time, reliably, and stably,” not just “fast enough.”
  3. Operations over proofs of concept. Stream processing makes production quality the primary design target.

Pesenhofer puts it plainly: this is “its own challenge, also in terms of performance,” and it’s where “most of the work” lives.
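Pesenhofer's "one chance" framing maps onto a common stream-processing pattern: a bad message must neither crash the pipeline nor vanish silently. A minimal sketch of that idea, under stated assumptions: the message format is invented, and the dead-letter list stands in for whatever error sink (e.g. a dead-letter topic) a real system would use.

```python
def process(msg):
    """Hypothetical per-message logic: extract a (id, value) pair."""
    return msg["animal_id"], msg["temp"]

processed, dead_letters = [], []

incoming = [
    {"animal_id": 1, "temp": 38.7},
    {"animal_id": 2},               # malformed: missing "temp"
    {"animal_id": 3, "temp": 39.0},
]

for msg in incoming:
    try:
        processed.append(process(msg))
    except Exception as exc:
        # The stream does not pause: record the failure and move on,
        # so a single bad message cannot take down the whole pipeline.
        dead_letters.append((msg, repr(exc)))
```

The design choice is the point: handling failure is part of the main loop, not an afterthought, because in a stream there is no easy "try again later."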

From one animal to hundreds of thousands: where the real work begins

One of the session’s clearest lines: “The actual analysis is often done quickly, but then bringing it into production … that is a challenge of its own.” The leap he describes is straightforward and profound:

  • Starting point: an analysis “on, for example, a single animal”—contained, demonstrable, and easy to reason about.
  • Target state: a “stream-processing system” that “runs across several hundred thousand animals”—durable, scalable, and dependable.

What lies between these two states are design decisions, failure tolerance, backpressure and liveness concerns, and the unglamorous work of making systems just work. That shift—from prototype to infrastructure—is the main effort in Pesenhofer’s description.
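One generic way to reason about the jump from one animal to hundreds of thousands is keyed state: the per-animal logic stays exactly as it was in the prototype, but state is partitioned by animal ID so the same code applies across the herd. This is a pattern sketch, not a description of smaXtec's pipeline; the running-mean analysis and the IDs are invented for illustration.

```python
from collections import defaultdict

class RunningMean:
    """The 'single animal' analysis: a per-animal running mean."""
    def __init__(self):
        self.n = 0
        self.total = 0.0

    def update(self, value):
        self.n += 1
        self.total += value

    @property
    def mean(self):
        return self.total / self.n

# Keyed state: one small state object per animal_id. In a distributed
# setup the same key would typically decide the Kafka partition, so
# each animal's messages consistently reach the same worker.
state = defaultdict(RunningMean)

for animal_id, temp in [(1, 38.6), (2, 38.4), (1, 39.0), (2, 38.8)]:
    state[animal_id].update(temp)
```

Scaling then becomes a question of partitioning and operations (how many workers, what happens when one fails) rather than of rewriting the analysis itself.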

Tools that carry weight: Python and SQL, side by side

Pesenhofer is precise about his tool choices: “We mostly work with Python for stream processing, and for other analyses we work with both Python and SQL.”

Two practical directions follow from this:

  • Python as a bridge between analysis and operations—especially in streaming contexts.
  • SQL as the language of database-near analysis—where data lives first and queries carry the logic.

The message isn’t “use exactly these tools.” It’s: choose tools that map cleanly to the problem class and use them consistently. For Pesenhofer’s context, that means Python/SQL across batch and streaming.
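The division of labor he describes, SQL where the data lives and Python where the analysis continues, looks roughly like this in miniature. SQLite stands in for the production database here, and the table, columns, and threshold are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (animal_id INTEGER, temp REAL)")
conn.executemany(
    "INSERT INTO readings VALUES (?, ?)",
    [(1, 38.6), (1, 40.3), (2, 38.4), (3, 38.9)],
)

# Let the query carry the logic: filter and aggregate in SQL, so only
# the animals of interest ever cross the boundary into Python.
flagged = [
    row[0]
    for row in conn.execute(
        """
        SELECT animal_id
        FROM readings
        GROUP BY animal_id
        HAVING MAX(temp) > 40.0
        """
    )
]

# Python takes over for whatever the database can't express well,
# e.g. feeding the flagged IDs into a follow-up analysis or report.
```

Pushing the aggregation into the query keeps the data movement small, which is exactly the "database-near analysis" role Pesenhofer assigns to SQL.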

No single path: passion, persistence, and a clear-eyed mindset

“I don’t think there is a single path,” Pesenhofer says. What matters most are two attitudes:

  • A passion for solving problems
  • “Sitzfleisch”—the persistence to stick with problems and accept that the first solution rarely fits

He names it plainly: don’t expect the first solution to work, “because it usually doesn’t.” That realism threads through his talk: Data Science isn’t a linear recipe; it’s an iterative practice where perseverance matters.

Education as a toolbox—helpful, not mandatory

Pesenhofer strikes a balanced view of education:

  • Backgrounds vary: “You can come out of an HTL, and you can also do a degree.”
  • What a degree provides: “With a degree you build a sort of toolbox you can draw on.” You “won’t understand everything right away,” but you’ll know “what techniques exist” and “can read up later.”
  • A prerequisite? No: “I don’t think a degree is a prerequisite. I think anyone can do it if they bring enough motivation and apply themselves.”

This perspective blends pragmatism (no degree required) with a clear benefit (orientation, a map of techniques, and easier future deep-dives).

Start with the problem, not the solution

Pesenhofer’s stance on languages and tools is explicit: looking for “the best programming language” doesn’t make sense. “It doesn’t exist,” as he frames it. Each problem has “very different requirements,” so: “Start with the problem, not with the solution.”

As decision guidance goes, this is as concrete as it gets. Whether you’re writing SQL for static analysis, building real-time pipelines with Python and Kafka, or bridging a one-animal analysis to a system running across hundreds of thousands, the requirements should drive the stack.

From early curiosity to professional responsibility

Without adding anything beyond what he said, Pesenhofer’s path lines up as follows:

  1. Early curiosity and hands-on practice (server with friends, MySQL, ECDL)
  2. System closeness at HTL (electrical engineering, microcontrollers, C/C++)
  3. University with a desire for mathematics; discovering smaXtec and the “Data Scientist” role
  4. Mixed studies (computer science + electrical engineering) and an internship at smaXtec after the second semester
  5. Current focus: static analyses and—above all—real-time streaming, with the main effort in turning analyses into production-grade, scalable systems

The constant elements are problem orientation and persistence, independent of a specific educational route.

Practical takeaways developers can act on

Staying within what Pesenhofer states, here’s an organized summary of his core points:

  • Begin with problem definition:
      ◦ What data do you have (database, stream)?
      ◦ What requirements are non-negotiable (latency, reliability, scale)?
      ◦ What analysis translates to operational value?
  • Think in two speeds:
      ◦ Static analysis (SQL, Python) for insight
      ◦ Streaming (Kafka, Python) for real-time decisions
  • Plan for production from the outset:
      ◦ The analysis can be fast—the hard work is the move to production and scale
      ◦ Trace the path from “one animal” to “hundreds of thousands of animals”—what assumptions break?
  • Choose tools to fit problems:
      ◦ Don’t hunt for “the best language”—decide based on requirements
      ◦ Python and SQL form a strong base, especially together
  • Build persistence:
      ◦ Expect that the first solution rarely sticks
      ◦ Iteration is the norm, not the exception
  • Use education as a map:
      ◦ A degree can supply a toolbox; motivation is the real engine

These aren’t generic platitudes—they’re the distilled points Pesenhofer emphasized, kept as close as possible to his words.

Quote highlights worth remembering

“I got into Data Science a bit via detours.”

“When the bolus measures data, those messages are sent via Kafka to our system and we have to analyze them in real time.”

“When the data has been there once and something goes wrong, then it’s gone.”

“The actual analysis is often done quickly, but then bringing it into production … that is a challenge of its own.”

“With a degree you build a sort of toolbox … You won’t understand everything right away, but you’ll at least know what techniques exist.”

“It doesn’t make much sense to look for the best programming language … You should start with the problem, not the solution.”

What streaming demands of our thinking

Even without extra detail, his framing of streaming tells a lot. Systems must reliably receive and process messages within sharp time windows; there’s “more or less only one chance.” That isn’t alarmist—it’s a compass: resilience and performance are first-class concerns in streaming contexts.

Static analysis remains essential as the place where hypotheses form and models are checked against data. Only then do you turn those findings into a pipeline that “runs across several hundred thousand animals.” The mental model shifts from producing insight to delivering it—reliably, repeatably, and at scale.

Why this path is encouraging—without a standard recipe

Pesenhofer’s story confirms what many suspect: you don’t need to have “Data Scientist” in mind from day one to arrive there. What matters is:

  • early contact with data and systems (MySQL, a self-hosted server, ECDL),
  • a technical foundation (HTL, microcontrollers, C/C++),
  • the willingness to connect interests (mathematics) with a concrete role (Data Scientist),
  • and sticking with it in an environment where production quality matters (streaming, real time, scale).

This relieves the pressure to have a perfect roadmap. It’s enough to take the next sensible step—and to let the problem shape the solution.

Conclusion: Data Science as craft—and as mindset

“Johannes Pesenhofer, Data Scientist bei smaXtec” from smaXtec animal care GmbH doesn’t portray Data Science as magic. It’s a craft with a mindset: problem-first, robust, iterative. The tools (Python, SQL, Kafka) are means to an end. The real achievement lies in turning quick analyses into reliable, performant systems—from a prototype on a single animal to pipelines running across hundreds of thousands.

If you want to follow a similar path, Pesenhofer’s points offer a clear line of travel:

  • Start with the problem, not the stack.
  • Expect iteration—and keep at it.
  • Treat education as a toolbox—motivation remains the key driver.

That’s how Data Science becomes impactful: close to real needs, resilient in operation, and always guided by the problem at hand.
