niceshops
Vorhersage der Verpackungsgröße via Machine Learning
Description
In their devjobs.at TechTalk "Die Vorhersage der Verpackungsgröße via Machine Learning", Lilly Adlgasser and Michael Heininger of niceshops talk through the key points of the project.
Video Summary
In “Vorhersage der Verpackungsgröße via Machine Learning,” Lilly Adlgasser & Michael Heininger explain how niceshops built an ML system to predict parcel size from historical shipping data—covering data engineering, feature design (weight, item count, pickup/cold-shipping flags, approximated volume), and model trials (k-NN, Naive Bayes, Decision Trees, Random Forest). The approximated volume proved pivotal; Random Forest performed best and top-3 size suggestions exceed 95% accuracy, cutting the choice from 40+ sizes down to three for packers. Deployed on Google AI Platform with a custom prediction routine and integrated into a PHP warehouse system, predictions are precomputed to avoid latency and enable transport capacity planning per carrier.
Predicting Packaging Size with Machine Learning at niceshops: From Raw Data to Production
The context: Packaging decisions at e-commerce scale
In “Vorhersage der Verpackungsgröße via Machine Learning” by Lilly Adlgasser & Michael Heininger (niceshops), the challenge is concrete and large in scale: more than 20,000 parcels leave the warehouse every day, destined for over 1 million customers. The decision about which of more than 40 available packaging sizes to use happens at the packing station—under time pressure.
On the operator’s UI, the cart items appear on the right and the list of packaging sizes on the left. The project’s goal: have an ML model propose a small, highly probable subset—ideally three sizes—to speed up packing decisions. A further benefit of early predictions is better downstream planning: if you can foresee which sizes today’s shipments will likely use, you can pre-estimate transport volume by carrier.
“With a prediction of the packaging size you can support the warehouse worker. You can provide a subset from over 40 different sizes.”
We watched the session from the DevJobs.at editorial seat. Below is a technical narrative of what we learned: the problem framing, the data engineering, the choice of features and algorithms, deployment on Google AI Platform, and the pragmatic PHP integration that makes it useful on the shop floor.
Why machine learning? Data exists; product dimensions often don’t
niceshops had a key enabler in place: historical shipment data, including the packaging size actually used, has been recorded for some time. What they often didn’t have were reliable product dimensions—so a purely rules-based approach on item length/width/height wasn’t viable. That made a data-driven approach attractive.
“Machine Learning is very good at recognizing relationships in large amounts of data… And in the company the dimensions of products are not often recorded.”
Michael Heininger, a backend developer and expert on niceshops’ in-house inventory and warehouse system, used his master’s thesis to build this predictor—aiming for a simple model and a strong, low-latency integration.
A three-phase project: data, modeling, deployment and integration
The work was split into three phases:
- Data engineering: collect data from multiple shop databases via REST; analyze and prepare it in Google Colab.
- Modeling and evaluation: define features; train and compare classifiers using scikit-learn.
- Deployment and integration: deploy to Google AI Platform (Custom Prediction Routine); integrate with the existing PHP warehouse software.
This staged approach reduced risk: only after data and features proved solid did the team push into tuning and production.
Data engineering: Distributed sources, REST pipes, Colab workflows
niceshops operates multiple shops, each with its own database. The team consolidated data by calling REST endpoints per shop and then loaded the historical shipments into Google Colab for exploration, cleaning, and feature preparation.
Colab fit the bill for two reasons: fast, browser-based iteration and straightforward handover to Google Cloud for deployment. For the master’s thesis and the internal project, it enabled tight prototyping loops.
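To make the consolidation step concrete, here is a minimal Python sketch of pulling historical shipments from one REST endpoint per shop and merging them into a single pandas DataFrame for analysis in Colab. The endpoint URLs, parameters, and field names are hypothetical; the talk does not go into the exact API shape.

```python
import pandas as pd
import requests

# Hypothetical per-shop REST endpoints; the real endpoints, auth, and payloads
# are not described in the talk.
SHOP_ENDPOINTS = {
    "shop_a": "https://shop-a.example/api/shipments",
    "shop_b": "https://shop-b.example/api/shipments",
}

def fetch_shipments(shop: str, url: str) -> pd.DataFrame:
    """Download historical shipments for one shop and tag them with the shop key."""
    response = requests.get(url, params={"status": "shipped"}, timeout=30)
    response.raise_for_status()
    df = pd.DataFrame(response.json())
    df["shop"] = shop
    return df

# Consolidate all shops into one table for exploration and feature preparation.
shipments = pd.concat(
    [fetch_shipments(shop, url) for shop, url in SHOP_ENDPOINTS.items()],
    ignore_index=True,
)
print(shipments.shape)
```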
Data engineering lessons
- Pulling data across multiple REST endpoints is cumbersome and error-prone.
- The team wants to move to a direct synchronization via Google BigQuery to centralize data, train models consistently, and write back results for monitoring and automated (re-)training.
“It would be cool if we synchronized directly via Google BigQuery… and continuously checked prediction accuracy.”
Feature engineering: Simple signals plus one breakthrough feature
The initial feature set focused on stable, universally available shipment-level signals:
- Shipment weight
- Number of products in the shipment
- Flag indicating self-pickup
- Flag indicating chilled shipping
- An approximated volume for the shipment
The first four are obvious. The breakthrough came from the approximated volume. Because exact product measurements were often missing, the team derived an approximation using values known from the shipping process (e.g., package volume and weight metrics) to inject a spatial signal into the model.
“Before we had the approximated volume, we were at 50–60% prediction accuracy. That was too low for the warehouse.”
With the added volume feature, accuracy improved significantly—enough to support real-world usage.
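As an illustration, here is a minimal sketch of assembling such a feature table. Column names, values, and the way the approximated volume is obtained are hypothetical stand-ins; the talk only states that the volume is approximated from values known in the shipping process.

```python
import pandas as pd

# Toy shipment rows with hypothetical column names; the real schema is not shown.
shipments = pd.DataFrame({
    "total_weight_g":    [1200, 350, 5400],
    "item_count":        [3, 1, 7],
    "self_pickup":       [False, False, True],
    "cold_shipping":     [True, False, False],
    # Illustrative stand-in only: how niceshops approximates the volume from
    # shipping-process values is not spelled out in the talk.
    "approx_volume_cm3": [8200.0, 950.0, 31000.0],
    "packaging_size":    ["M2", "S1", "L3"],  # hypothetical size labels
})

feature_cols = ["total_weight_g", "item_count", "self_pickup",
                "cold_shipping", "approx_volume_cm3"]
X = shipments[feature_cols].astype(float)   # boolean flags become 0/1
y = shipments["packaging_size"]             # target: one of the 40+ sizes
```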
Why approximated volume matters
- Packaging size correlates strongly with volumetric requirements.
- Weight alone is a weak proxy (think light-but-bulky versus heavy-but-compact items).
- An approximate volume brings the crucial spatial dimension into the predictor—even without exact product dimensions.
Modeling with scikit-learn: Start simple, compare rigorously
The team chose scikit-learn for practical reasons: established algorithms, clean APIs, and straightforward comparison. They trained multiple classifiers:
- K-Nearest Neighbors (KNN)
- Naive Bayes
- Decision Trees
- Random Forest
Random Forest delivered the best performance on their data.
“We achieved the highest prediction accuracy with the Random Forest algorithm.”
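A minimal sketch of such a comparison with scikit-learn, assuming a feature matrix X and packaging-size labels y built from the full shipment history (the toy rows above are far too small to train on); the hyperparameters are defaults and purely illustrative, not the tuned configuration from the thesis.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

candidates = {
    "KNN": KNeighborsClassifier(),
    "Naive Bayes": GaussianNB(),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Random Forest": RandomForestClassifier(n_estimators=200, random_state=42),
}

for name, model in candidates.items():
    model.fit(X_train, y_train)
    # score() is plain Top-1 accuracy on the held-out split.
    print(f"{name}: top-1 accuracy = {model.score(X_test, y_test):.3f}")
```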
Metrics: Top-1 and Top-3 aligned with the workflow
Two metrics anchored the evaluation:
- Top-1 accuracy: whether the predicted size exactly matches the actually used size.
- Top-3 accuracy: whether the true size is among the three most probable sizes predicted.
Since the goal is to present a small set of strong candidates to the packer, Top-3 accuracy is the operational metric. Averaged across shops, it exceeded 95%, making it highly usable on the warehouse floor.
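Top-3 accuracy can be computed directly from the class probabilities; a minimal sketch, reusing X_test, y_test, and the candidates dict from the previous snippet:

```python
import numpy as np

def top_k_accuracy(model, X_test, y_test, k=3):
    """Fraction of shipments whose true packaging size is among the k most
    probable sizes predicted by the model."""
    proba = model.predict_proba(X_test)
    # Indices of the k highest-probability classes per row.
    top_k_idx = np.argsort(proba, axis=1)[:, -k:]
    top_k_labels = model.classes_[top_k_idx]
    hits = [true in labels for true, labels in zip(np.asarray(y_test), top_k_labels)]
    return float(np.mean(hits))

rf = candidates["Random Forest"]
print(f"Top-3 accuracy: {top_k_accuracy(rf, X_test, y_test, k=3):.3f}")
```

Recent scikit-learn versions also ship sklearn.metrics.top_k_accuracy_score, which computes the same quantity.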
Deployment: Custom Prediction Routine on Google AI Platform
The model was deployed to Google AI Platform using the Custom Prediction Routine, which allows running custom Python code around the model and defining two classes:
- Preprocessor: accepts the prediction request and prepares the data (e.g., encoding categorical variables as numerical features).
- Predictor: orchestrates the flow—receives the request, calls the Preprocessor, invokes the model, and shapes the response.
“The advantage is that you can execute custom Python code in the cloud… You can define a predictor and a preprocessor.”
Technically, Preprocessor and Predictor were bundled into a source distribution package along with the trained model and uploaded from within Google Colab. Naming and versioning were handled in a few notebook cells.
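A condensed sketch of what those two classes can look like. The predict/from_path shape follows the documented interface of AI Platform's custom prediction routines; the feature handling and the Top-3 response format are our illustration, not the exact code from the talk.

```python
import os
import pickle

import numpy as np


class Preprocessor:
    """Turns a raw prediction request into the numeric feature matrix the model
    was trained on (e.g. boolean flags to 0/1). Feature names are illustrative."""

    FEATURES = ["total_weight_g", "item_count", "self_pickup",
                "cold_shipping", "approx_volume_cm3"]

    def transform(self, instances):
        return np.array([[float(inst[f]) for f in self.FEATURES]
                         for inst in instances])


class PackagingSizePredictor:
    """Custom prediction routine: loads the pickled model once, preprocesses
    each request, and returns the three most probable packaging sizes."""

    def __init__(self, model, preprocessor):
        self._model = model
        self._preprocessor = preprocessor

    def predict(self, instances, **kwargs):
        X = self._preprocessor.transform(instances)
        proba = self._model.predict_proba(X)
        top3_idx = np.argsort(proba, axis=1)[:, -3:][:, ::-1]  # best first
        return [list(self._model.classes_[idx]) for idx in top3_idx]

    @classmethod
    def from_path(cls, model_dir):
        with open(os.path.join(model_dir, "model.pkl"), "rb") as f:
            model = pickle.load(f)
        return cls(model, Preprocessor())
```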
Why custom prediction matters
- It guarantees training-serving parity for preprocessing.
- It lets teams shape the response to application needs (e.g., return the Top-N candidates consistently).
- It centralizes encodings, data checks, and fallbacks.
PHP integration: Simple binding, smart latency strategy
The warehouse system at niceshops runs primarily in PHP. Integration used the “Google Machine Learning Engine for PHP.” Two implementation details stood out:
- Authentication is done via a JSON service account key, producing an authorized service.
- Inference calls use that authorized service to invoke the AI Platform’s Predict API with a model name and a request array of instances. The response is parsed and fed back into the application.
“With the service account key you create an authorized service… With the request array you can very easily connect to the model.”
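The talk shows this binding in PHP; as a rough Python analogue of the same flow (build an authorized service from the JSON key, then call the Predict endpoint with a model name and an instances array), one could use the Google API client. Project, model, and file names below are placeholders.

```python
from google.oauth2 import service_account
from googleapiclient import discovery

# Authorize with the JSON service account key (placeholder path).
credentials = service_account.Credentials.from_service_account_file(
    "service-account-key.json",
    scopes=["https://www.googleapis.com/auth/cloud-platform"],
)
ml = discovery.build("ml", "v1", credentials=credentials)

# Invoke the Predict endpoint with a model name and an instances array.
response = ml.projects().predict(
    name="projects/my-project/models/packaging_size",
    body={"instances": [{
        "total_weight_g": 1200,
        "item_count": 3,
        "self_pickup": False,
        "cold_shipping": True,
        "approx_volume_cm3": 8200.0,
    }]},
).execute()
print(response["predictions"])  # e.g. three suggested size labels per instance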
Decoupling latency: Predict at delivery creation, cache for packing
Rather than triggering a prediction when an operator starts packing, niceshops predicts at delivery creation time and stores the results in the database. That way, no one waits for a model call at the packing station; results are retrieved instantly when packing starts.
“We decided on the latter… We store the prediction results in the database in advance.”
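A condensed sketch of that pattern; the real implementation lives in the PHP warehouse software, and the function and table names here are hypothetical.

```python
def on_delivery_created(delivery_id: int, features: dict, db, ml_client) -> None:
    """Called when a delivery is created: request the top-3 packaging sizes once
    and persist them, so the packing station never waits on the model."""
    sizes = ml_client.predict(instances=[features])[0]  # e.g. ["M2", "M3", "L1"]
    db.execute(
        "INSERT INTO packaging_prediction (delivery_id, sizes) VALUES (?, ?)",
        (delivery_id, ",".join(sizes)),
    )

def suggested_sizes(delivery_id: int, db) -> list[str]:
    """Called when packing starts: an instant database read, no model call."""
    row = db.execute(
        "SELECT sizes FROM packaging_prediction WHERE delivery_id = ?",
        (delivery_id,),
    ).fetchone()
    return row[0].split(",") if row else []
```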
UI: Three suggestions, optional preselection
In the live test for one shop, the UI shows three recommended packaging sizes. Via configuration, the most probable size can be preselected. This lines up perfectly with the Top-3 metric and reduces cognitive friction for operators.
Operations: Roll out gradually, verify quality continuously
At the time of the talk, the system was live in one shop. The plan is to integrate it into additional shops while continuously monitoring prediction accuracy. Data quality remains the focus: the more representative the features per shop, the more stable the recommendations.
An especially compelling next step is transport planning: early knowledge of packaging sizes enables better estimation of carrier-specific transport volumes—even vehicle counts per carrier.
“A fascinating project is determining transport volume in advance based on the model we created.”
Takeaways we noted
From our DevJobs.at editorial vantage point, these are the key lessons:
- Solve data access first: Robust ML starts with repeatable data flows. The team intends to replace the REST-heavy setup with BigQuery synchronization.
- Feature engineering wins: The approximated volume changed the game, pushing accuracy into production territory.
- Choose metrics tied to the workflow: Top-3 accuracy reflects the real decision context at the packing station better than Top-1 alone.
- Deploy with preprocessing: Custom Prediction Routine ensures serving-time preprocessing mirrors training-time logic.
- Decouple latency from human work: Predict early, cache results, and make the UI instant. That’s how ML becomes an invisible productivity boost.
Technical snapshot—precise and pragmatic
- Training stack: Python, scikit-learn, Google Colab.
- Models evaluated: KNN, Naive Bayes, Decision Trees, Random Forest (top performer).
- Features: shipment weight, item count, flags (self-pickup, chilled shipping), approximated volume.
- Metrics: Top-1 accuracy (varies by shop), Top-3 accuracy (>95% averaged across shops).
- Deployment: Google AI Platform, Custom Prediction Routine with Preprocessor and Predictor as Python classes; packaged as a source distribution alongside the model.
- Integration: Google Machine Learning Engine for PHP, JSON service account key, Predict endpoint with model name and request array of instances.
- Runtime mode: predictions generated at delivery creation, stored in DB, shown as three sizes in the UI with optional preselection.
What’s next: More shops, better data, automated retraining
The roadmap is straightforward:
- Roll out to additional shops, gated by observed accuracy.
- Improve data quality per shop and search for more representative features.
- Centralize data via BigQuery to simplify training and monitoring.
- Write back outcomes and trigger automated training jobs when prediction accuracy degrades.
Conclusion: ML that lands on the shop floor
“Vorhersage der Verpackungsgröße via Machine Learning” is a solid case of taking data science from a notebook into real operations. The steps are clear and repeatable: consolidate data, engineer features that matter (approximated volume), compare simple, robust models, deploy with serving-time preprocessing, integrate cleanly into the existing PHP stack, and design the UI to present three strong choices.
In the speakers’ words, machine learning recognizes patterns in the data. The real win is embedding those predictions where they help immediately—on the packing line. That’s exactly what niceshops achieved.
—
Session title: Vorhersage der Verpackungsgröße via Machine Learning
Speaker: Lilly Adlgasser & Michael Heininger
Company: niceshops