Österreichische Lotterien
Recommendation Engine
Description
In his TechTalk, Patrick from Österreichische Lotterien shares insights into the fundamental considerations behind the development of the company's in-house recommendation engine.
Video Summary
In "Recommendation Engine," Patrick from Österreichische Lotterien explains how his team built a recommendation system using collaborative filtering (ALS) with implicit feedback derived from game transactions. The pipeline moves data from Oracle to AWS S3 via batch loads and CDC streaming (AWS DMS, Debezium), with Databricks/PySpark and dbt applying a Bronze/Silver/Gold medallion model. Evaluation uses a leave-one-out split with hit rate, baselined against top-games and random picks; ALS comes out ahead. The model is retrained daily and serves per-user ranked lists ordered by confidence. He also details the GDPR/legal and infrastructure groundwork (permissions, streaming, infrastructure as code), the progression from offline validation to online A/B testing, and a near-real-time roadmap: patterns viewers can apply to ship production recommenders in the cloud.
Inside the Recommendation Engine at Österreichische Lotterien: Cloud Data Flow, ALS Modeling, and Daily Training from Patrick’s Talk
Context: “Recommendation Engine” with Patrick (Österreichische Lotterien)
In the session titled “Recommendation Engine,” Patrick from Österreichische Lotterien walked us through a full-stack journey from raw data to per-user recommendations. From our DevJobs.at vantage point, the talk distilled a pragmatic path: align legal and infrastructure early, build a clean cloud data foundation, validate offline with solid baselines, and only then wire recommendations into production systems and A/B tests.
The organizational context matters. The IT organization comprises roughly 200 people and serves as the internal provider for the business lines. Development and Infrastructure are the two largest departments. Development is large "because we develop everything in-house," including game systems and the data warehouse. This in-house philosophy accelerated the move from model prototype to production delivery: the same organization that designed the ALS pipeline could also extend the game systems to display per-user recommendations.
What a Recommendation Engine Does
The goal is familiar: propose relevant items to each user, in spirit similar to Amazon or Netflix. Patrick grounded the concept in the two feedback types:
- Explicit feedback: user ratings such as stars or likes, which unambiguously reveal preferences per item.
- Implicit feedback: derived from behavior—e.g., how many times a user played a game or which pages they clicked.
The crux with implicit data is that non-interaction does not imply dislike. A user might simply not know an item. The team designed their modeling and evaluation with that caveat in mind.
Two Families: Content-Based and Collaborative Filtering
Patrick separated recommendation strategies into two primary approaches:
- Content-based: recommend items similar to those the user has consumed.
- Collaborative filtering: find users similar to me and propose items those similar users consumed.
The first production step focused on collaborative filtering. Instead of relying on item attributes, the team learned from user–item interactions.
In-House Engineering as a Force Multiplier
The development organization (about 100 people) spans three areas:
- Lotto system: fully self-developed, including terminals (C++), backend game processing (Java), and frontends (e.g., for Euromillions, Lotto, and online platforms) in TypeScript/React.
- Online systems: TIP3 and Win2day, both developed in-house on a shared platform implemented in Java.
- Data Competence Center: where Patrick and the Data Science team sit. Responsible for the DWH, corporate reporting, and spearheading the move to the cloud. Core languages/tools include SQL, dbt (Data Build Tool), and Python.
This setup explains how the project moved swiftly from offline validation to online delivery: once the ALS approach proved itself, the application teams could plug in the recommendation output and allow per-user targeting.
Upfront Work: Legal, Infrastructure, and Permissions
Before any cloud deployment, the team coordinated closely with the legal department. They aligned on how data would be managed, stored, and which data could be used—ensuring compliance with GDPR.
In parallel, the infrastructure team enabled streaming to the cloud, a robust permission model, and the use of infrastructure as code. Those foundational steps unlocked the actual data and ML work.
Data Flow to the Cloud: Oracle, S3, and CDC Streaming
Patrick laid out the end-to-end pipeline clearly:
- Source: Oracle database
- Batch loading to AWS S3 for initial transfer
- Ongoing change data capture (CDC) streaming via AWS Database Migration Service (DMS) and Debezium
The platform is AWS, with Databricks used for analytics and processing. Data engineers prepare the data with dbt using a layered architecture.
Medallion Architecture for Data Modeling
The dbt transformations follow the Medallion architecture:
- Bronze: raw data as-is
- Silver: refined/aggregated data
- Gold: business-ready aggregations for business lines and reporting
This structure provides a stable base for training and serving the recommendation model on Databricks using PySpark.
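As an illustration only (the team builds these layers as dbt SQL models, not pandas, and the column names here are hypothetical), the Bronze/Silver/Gold progression can be sketched with a toy dataset:

```python
import pandas as pd

# Bronze: raw game-transaction events, landed as-is from CDC.
bronze = pd.DataFrame({
    "user_id": [1, 1, 2, 2, 2],
    "game": ["lotto", "lotto", "euromillions", "lotto", "euromillions"],
})

# Silver: refined per-user play counts -- the implicit signal ALS consumes.
silver = (bronze.groupby(["user_id", "game"])
          .size().reset_index(name="play_count"))

# Gold: business-ready aggregate, e.g. total plays per game for reporting.
gold = silver.groupby("game", as_index=False)["play_count"].sum()
```

Each layer reads only from the one below it, which keeps lineage simple and makes the training input (Silver) reproducible from raw events.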
The First Model: ALS (Alternating Least Squares)
The team started with a classic matrix factorization approach, ALS. Patrick illustrated it with an intuitive matrix: users on the vertical axis, games on the horizontal axis, and known entries indicating ratings; blank cells represent unknowns. While the example used explicit ratings “only for illustration,” the production scenario uses implicit signals.
ALS decomposes the large user–item matrix into a user matrix and an item (game) matrix with latent features. Using known entries, the model learns these latent factors. Multiplying the factorized matrices reconstructs the full space and fills in previously unknown entries—providing a basis for recommendations.
Crucially, the real signal is implicit. The team uses the number of game transactions—how often a user played—as the core input to ALS.
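A minimal numpy sketch of the factorization idea follows. The production model runs Spark's ALS with implicit-feedback confidence weighting; for brevity this toy version treats every cell of a small play-count matrix as observed and alternates two ridge-regression solves:

```python
import numpy as np

def als(R, k=2, reg=0.1, iters=30, seed=0):
    """Alternating least squares on a fully observed interaction matrix R.
    Fix one factor matrix, solve a ridge regression for the other, repeat."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    U = rng.normal(scale=0.1, size=(n_users, k))  # user latent factors
    V = rng.normal(scale=0.1, size=(n_items, k))  # game latent factors
    I = reg * np.eye(k)
    for _ in range(iters):
        U = R @ V @ np.linalg.inv(V.T @ V + I)    # solve for users, V fixed
        V = R.T @ U @ np.linalg.inv(U.T @ U + I)  # solve for games, U fixed
    return U, V

# Toy play-count matrix: 4 users x 3 games (hypothetical numbers).
R = np.array([[5., 0., 1.],
              [4., 0., 0.],
              [1., 1., 5.],
              [0., 1., 4.]])
U, V = als(R, k=2)
R_hat = U @ V.T  # reconstruction fills every user-game cell with a score
```

The reconstructed `R_hat` assigns a score to previously empty cells, which is exactly the basis for recommendations described above.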
Train/Test Split: Leave-One-Out Instead of 80/20
Rather than a conventional 80/20 split, the team adopted a leave-one-out split by user:
- For each user, one played game is withheld.
- That single game forms the test set for that user.
- The model is trained without the withheld interaction.
They then evaluate using hit rate: during offline (dry-run) testing, they check whether the withheld game appears among the model’s recommendations for that specific user. It’s a clear, realistic gauge of whether the engine can surface the “next right game.”
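The split and the metric are both simple to state in code. A sketch under the assumption that interactions are a mapping of user to played games (names are illustrative, not the team's API):

```python
import random

def leave_one_out(interactions, seed=42):
    """Per user, withhold one played game as the test item; the remaining
    games form that user's training interactions."""
    rng = random.Random(seed)
    train, test = {}, {}
    for user, games in interactions.items():
        held_out = rng.choice(games)
        test[user] = held_out
        train[user] = [g for g in games if g != held_out]
    return train, test

def hit_rate(recommendations, test, k=10):
    """Fraction of users whose withheld game appears in their top-k list."""
    hits = sum(1 for user, held in test.items()
               if held in recommendations.get(user, [])[:k])
    return hits / len(test)

interactions = {"anna": ["lotto", "euromillions", "bingo"],
                "ben": ["lotto", "bingo"]}
train, test = leave_one_out(interactions)
```

Because each user contributes exactly one test item, the hit rate directly answers the question the team cares about: can the engine surface the game this user actually played next?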
Baselines: Top Games and Random
Before shipping to users, the team compared ALS against two baselines:
- Random: recommend arbitrary games to each user.
- Top: recommend the globally best games to everyone.
As expected, random performed poorly. The simple “top games for all” baseline did better, but ALS “fortunately” came out on top. Those comparisons anchored the ALS gains in tangible terms.
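Both baselines are a few lines each, which is exactly why they are worth running before any model ships. A sketch (function names are illustrative):

```python
import random
from collections import Counter

def top_baseline(train, k=3):
    """Recommend the globally most-played games to every user."""
    counts = Counter(g for games in train.values() for g in games)
    top = [g for g, _ in counts.most_common(k)]
    return {user: top for user in train}

def random_baseline(train, catalog, k=3, seed=0):
    """Recommend k arbitrary games from the catalog to each user."""
    rng = random.Random(seed)
    return {user: rng.sample(catalog, k) for user in train}
```

Scoring both with the same hit-rate procedure as the ALS model quantifies the lift: if ALS cannot beat "top games for all," the extra complexity is not paying for itself.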
From Offline to Online: Individualized Delivery
Bridging offline success into production required application work. The game systems had to be extended so that each user could receive their own recommendation list. Thanks to the in-house development setup, the team could deliver the necessary changes through their existing development lines.
The engine returns, per user, a list of recommended games ordered by priority. With implicit feedback, the priority is driven by a confidence measure—how sure the engine is that the suggested game fits the user’s observed behavior.
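Given the factor matrices from an ALS-style model, producing that per-user list is a matter of scoring, filtering out already-played games, and sorting by score. A sketch (not the team's serving code; the score here stands in for the confidence measure):

```python
import numpy as np

def recommend(U, V, games, played, user_idx, k=3):
    """Rank unplayed games for one user by predicted score. With implicit
    ALS the score acts as a confidence that the game fits the user."""
    scores = U[user_idx] @ V.T  # one predicted score per game
    ranked = sorted(((float(s), g) for s, g in zip(scores, games)
                     if g not in played), reverse=True)
    return [g for _, g in ranked[:k]]
```

Filtering out games the user already plays keeps the list focused on discovery, which is the point of the per-user delivery described above.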
Daily Training and Day-to-Day Stability
We asked the obvious question: “Will users see a different list every day?” Patrick’s answer: “The truth lies somewhere in between.”
- Some users will see more daily changes, depending on their play patterns.
- On aggregate, the lists drift slowly: a large share of games remains the same from one day to the next.
Even so, the team retrains daily. The reason is straightforward: new users arrive daily, and new games are launched regularly. Items and users not present at training time cannot be recommended. Daily training keeps the engine aligned with the latest user base and catalog.
Online Experimentation: A/B Testing and Activity Metrics
With online delivery in place, the team planned classic A/B testing. Users are split into groups, and activity is measured to assess the effectiveness of the recommendations. Patrick kept the focus on the essentials: split users, measure activity, iterate.
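The talk keeps the experiment design at that level of detail, so the following is only one common, stateless way to do the split (a sketch, not the team's implementation): hash-based bucketing, which assigns every user a stable group without storing assignments anywhere.

```python
import hashlib

def assign_group(user_id, groups=("control", "recommendations")):
    """Deterministically bucket a user for an A/B test by hashing the id:
    the same user always lands in the same group, with no state to keep."""
    digest = hashlib.sha256(str(user_id).encode("utf-8")).hexdigest()
    return groups[int(digest, 16) % len(groups)]
```

Activity metrics can then be compared between the groups returned here to estimate the lift the recommendations produce.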
Iteration Roadmap: From Homogeneous Scope to Near-Real-Time
Patrick summarized the project’s evolution in four steps:
- Start with ALS on a homogeneous group of similar games—avoid the full catalog at first to validate end-to-end behavior in a controlled slice.
- Add more models: additional filtering approaches and content models; explore whether a hybrid model can outperform single approaches globally or per user.
- Expand to all game areas, including those with different mechanics—requiring conceptual adjustments.
- Move toward a near-real-time approach: react swiftly within a session, ideally suggesting better options based on the most recent session behavior.
This incremental plan balances risk, value, and complexity: prove it small, broaden carefully, and then speed it up.
Technical Pillars: Tools, Teams, and Responsibilities
Reiterating the building blocks mentioned in the talk:
- Platform: AWS
- Source: Oracle database
- Storage: Amazon S3 (batch)
- Streaming: CDC via AWS DMS and Debezium
- Processing/analytics: Databricks (PySpark for modeling)
- Data transformations: dbt with the Medallion architecture (Bronze/Silver/Gold)
- Organization: development areas (Lotto systems with C++/Java/TypeScript/React; online systems TIP3/Win2day on a Java platform), Data Competence Center (SQL, dbt, Python)
- Compliance/infrastructure: GDPR alignment with Legal, permissioning and IaC with the infrastructure team
The lesson is that a recommendation engine is not a single model artifact. It’s an organizational and technical weave—legal alignment, data plumbing, modeling, and application integration working together.
Practical Lessons for Engineering Teams
Several pragmatic takeaways stood out:
- Implicit feedback requires careful interpretation. “Didn’t see it” is not “didn’t like it.” Using transaction counts as the core signal and evaluating with user-specific withheld interactions grounds the engine in reality.
- Leave-one-out is a clear and stringent evaluation setup. It tests if the engine can retrieve the actual next item a user played.
- Baselines are non-negotiable. Comparing against random and global top items quantifies the lift.
- Offline validation must lead to system integration. “So that the user can see the recommendation,” the game systems were extended for per-user targeting.
- Daily training is necessary in dynamic catalogs. New users and newly launched games won’t be recommended if they weren’t present at training time.
- In-house ownership shortens the path from idea to impact. When the same organization controls data, models, and application surfaces, iteration loops get tighter.
Why ALS Was a Strong First Choice
Patrick’s rationale was understated but solid: ALS is a proven approach for collaborative scenarios with implicit signals. It learns latent factors from the user–item matrix and can leverage interaction counts without requiring explicit ratings. By starting within a homogeneous game group, the team also contained variability and sped up validation.
From there, the plan to layer in content models and consider hybridization addresses known blind spots (like cold start) and aims for broader coverage across diverse game mechanics.
Stability vs. Responsiveness
The “does it change daily?” question surfaces a classic trade-off:
- Users benefit from stable lists; radical daily churn can feel erratic.
- But responsiveness to new items and fresh interactions is essential.
The current approach—daily training with observed slow drift on average—strikes a workable balance, while the near-real-time goal introduces session-level adaptivity where it matters most.
Outlook: Hybrid Models and Session-Aware Recommendations
The four-step path Patrick outlined suggests a mature trajectory: combine collaborative and content-based signals, broaden to all game areas, and bring latency down so that session context influences suggestions on the fly. This is not a replacement for daily training; it complements it with finer-grained recency.
Conclusion: A Reproducible Path to Production-Grade Personalization
Patrick showed how Österreichische Lotterien built a recommendation engine end to end: Oracle to S3, CDC streaming via DMS and Debezium, dbt Medallion layers, Databricks with PySpark for ALS, offline evaluation using leave-one-out and hit rate, baseline comparisons, then per-user online delivery and A/B tests.
From a DevJobs.at perspective, “Recommendation Engine” is a blueprint for applied ML in regulated environments: align with legal on GDPR, coordinate with infrastructure for streaming and permissions, keep data transformations clean and layered, validate with strong baselines, and integrate into product surfaces early. The forward path—more models, broader scopes, and near-real-time—is clear and incremental.
Session: Recommendation Engine — Speaker: Patrick — Company: Österreichische Lotterien
Patrick closed by pointing to open positions at “Livejobs.” It’s a fitting coda: the work is ongoing, and the organization is investing in the very mix of skills—cloud data engineering, ML modeling, and application integration—that made this engine real.