
Industrial High Speed Cloud OCR

Description

In his devjobs.at TechTalk, Jan Wagner from coilDNA presents an approach to optical character recognition that makes it possible to tag metal sheets with additional information.


Video Summary

In Industrial High Speed Cloud OCR, Speaker Jan Wagner from coilDNA explains how distributed dot‑matrix codes printed across aluminum coils enable traceability and anti‑counterfeiting. He outlines a cloud‑based OCR workflow—image capture via fixed cameras or a browser Web App, preprocessing (grayscale, orientation, edge detection), character extraction with Azure Cognitive Services, and format-aware correction of common misreads—followed by a data lookup returning position, quality, and production details. Live examples contrast fixed cameras using a defined 5×7 dot‑matrix font with mobile scanning on reflective surfaces, and he notes a move toward an ML model to further reduce errors.

Industrial High Speed Cloud OCR in Metal Manufacturing: From Dot‑Matrix Codes on Aluminum Coils to Reliable Data Retrieval

Context: Why Industrial High Speed Cloud OCR?

In the session “Industrial High Speed Cloud OCR,” Jan Wagner (Software Engineer, coilDNA) walked through a concrete factory‑floor challenge: robustly and quickly reading dot‑matrix codes printed on aluminum coils—both on fixed production lines and via a mobile browser. The solution is a disciplined pipeline: image capture, targeted preprocessing, cloud‑based character recognition, and strict code validation before data lookup.

The use case is straightforward yet demanding. coilDNA prints a distributed code along the entire length of an aluminum coil. Each partial sequence encodes coil information—such as producer and material—and from any single sequence the overall fragment, and thus the coil, can be reconstructed. That means every coil segment is individually identifiable, and the data link includes the exact position on the coil as well as quality attributes. Counterfeit protection is a central motivation.

“Our idea was to print a code on an aluminum coil … each sequence of this code can be reassembled into a whole fragment—and each sequence contains information about this coil.”

For engineers, the crux is the medium: dot‑matrix codes on reflective metal surfaces. Glare, varying illumination, perspective shifts, and the peculiarities of dot‑matrix glyphs make OCR brittle. The talk laid out how coilDNA tackles these hurdles with a cloud‑enabled pipeline.

The DNA Analogy: Distributed Codes Along the Coil

Wagner used a DNA analogy to describe the concept: the coil carries sequentially printed code fragments along its length. Each sub‑sequence contains sufficient information to identify the overall fragment and thus the coil. In practice, this enables:

  • Traceability: data is anchored to precise positions on the coil.
  • Quality linkage: quality data is retrieved per segment.
  • Anti‑counterfeiting: the origin of the material can be verified.

“I get the data exactly for the position on the overall coil, I get the quality data for this part … the main reason is, of course, counterfeit protection.”

From Image to Data: The End‑to‑End Pipeline

The path from a captured frame to structured data has several stages:

  1. Image acquisition
     • Fixed cameras on the production machine with stable parameters.
     • Mobile capture via a browser‑based web app using the MediaDevices interface.
  2. Preprocessing
     • Convert to grayscale.
     • Orientation correction (rotate if the code is upside down).
     • Edge detection to emphasize glyph structure.
  3. Cloud OCR
     • Cloud‑based character recognition on Azure using Microsoft Vision from Azure Cognitive Services.
  4. Post‑processing and format validation
     • Normalize the recognized string to the expected format (12 characters plus 2 letters).
     • Repair common dot‑matrix confusions (e.g., “B” misread as “A8”).
  5. Data retrieval
     • Query coilDNA’s APIs/web services to fetch production, quality, and positional data from the database.

This is not an academic luxury. Raw camera frames often defeat OCR due to glare, surface artifacts, and the dot‑matrix representation. Without preprocessing, even a strong cloud OCR engine can return nothing.

“If I send the original image to the cloud … I get nothing out … with image processing, a code was recognized.”
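Before looking at each stage in detail, here is a minimal TypeScript sketch of how the stages might compose. The stage names, signatures, and the `CoilRecord` shape are assumptions for illustration—the talk describes the pipeline, not this code:

```typescript
// Shape of the lookup result, mirroring the fields named in the talk.
interface CoilRecord {
  capturedAt: string;    // when the code was captured/issued
  printedAt: string;     // when it was printed on the coil
  positionMeter: number; // position of the code on the coil
  coilLengthM: number;   // total coil length
  operator: string;      // operator's name
}

// Stage signatures are assumptions; sketches for each follow in later sections.
declare function preprocess(frame: ImageData): ImageData;          // grayscale, orientation, edges
declare function encodeToPng(img: ImageData): Promise<Blob>;       // e.g., via canvas.toBlob
declare function recognizeWithAzure(image: Blob): Promise<string>; // cloud OCR
declare function validateAndRepair(raw: string): string | null;    // format check + repair rules
declare function lookupCoilData(code: string): Promise<CoilRecord | null>;

async function readCoilCode(frame: ImageData): Promise<CoilRecord | null> {
  const cleaned = preprocess(frame);
  const rawText = await recognizeWithAzure(await encodeToPng(cleaned));
  const code = validateAndRepair(rawText);
  return code ? lookupCoilData(code) : null; // reject rather than guess
}
```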

Acquisition Modes: Fixed Cameras vs. Mobile Browser

Wagner presented two acquisition modes with different engineering trade‑offs.

1) Fixed cameras on the line

  • Advantage: controlled, stable conditions (lighting, distance, angle, aperture, focus, trigger timing).
  • Font specifics: A 5×7 dot‑matrix font can be defined in a text file (or programmatically in an array). Each letter/digit is represented as a point grid the camera “knows”.
  • Outcome: With fixed parameters and a known raster, recognition is simpler and more reliable.

“I only have this text file … I define every letter and digit … and the camera then knows it.”
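The talk doesn’t show the file itself, but a 5×7 glyph definition might look like the sketch below, with each letter stored as a point grid (the array representation is an assumption):

```typescript
// Illustrative 5×7 dot-matrix glyph for "A": 7 rows of 5 columns,
// "#" marking a printed dot. The real text-file format is not shown
// in the talk; only the 5×7 grid idea is.
const GLYPH_A: string[] = [
  ".###.",
  "#...#",
  "#...#",
  "#####",
  "#...#",
  "#...#",
  "#...#",
];

// The full font is then a map from character to its grid.
const FONT_5X7: Record<string, string[]> = { A: GLYPH_A /* , B: …, 0–9: … */ };
```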

2) Mobile capture via web app (browser)

  • Access: The web app uses the MediaDevices interface to access the camera—after explicit user permission.
  • Usage: Workers (e.g., at a plant or at an automotive customer) perform quick scans.
  • Challenge: Highly variable parameters (hand movement, perspective, distance, lighting), resulting in uneven image quality.
  • Upside: Zero‑install scanning and immediate data lookup directly in the browser.

The mobile mode demands stronger preprocessing to counter variability. Otherwise, glare and low contrast will sink OCR performance.
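A minimal capture sketch using the standard MediaDevices API—the talk confirms the interface, while the constraints and the frame-grab helper below are illustrative:

```typescript
// Ask for the rear camera; the browser prompts the user for permission first.
async function startScanner(video: HTMLVideoElement): Promise<void> {
  const stream = await navigator.mediaDevices.getUserMedia({
    video: { facingMode: "environment" }, // rear camera on phones (assumed constraint)
    audio: false,
  });
  video.srcObject = stream;
  await video.play();
}

// Grab a single frame as ImageData for the preprocessing stage.
function captureFrame(video: HTMLVideoElement): ImageData {
  const canvas = document.createElement("canvas");
  canvas.width = video.videoWidth;
  canvas.height = video.videoHeight;
  const ctx = canvas.getContext("2d")!;
  ctx.drawImage(video, 0, 0);
  return ctx.getImageData(0, 0, canvas.width, canvas.height);
}
```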

Preprocessing: Grayscale, Orientation, Edges

Dot‑matrix fonts amplify artifacts: dots merge under glare, noise, or soft focus; edges fray; line structure tilts. The talk emphasized three core steps:

  • Grayscale: simplify channels and stabilize contrast for segmentation.
  • Orientation: rotate/normalize when the code is captured upside down.
  • Edge detection: highlight point boundaries to boost segmentation.

“I have to process the image accordingly … convert to grayscale, adjust orientation, and perform edge detection.”

These minimal steps prepare a cleaner input for Microsoft Vision on Azure. Without them, reflective metal often defeats OCR outright.
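As a rough sketch of the grayscale and edge steps on canvas `ImageData`—the talk names the steps but not the operators, so the Sobel filter here is an assumption:

```typescript
// Convert RGBA ImageData to grayscale in place using luminance weights.
function toGrayscale(img: ImageData): ImageData {
  const d = img.data;
  for (let i = 0; i < d.length; i += 4) {
    const y = 0.299 * d[i] + 0.587 * d[i + 1] + 0.114 * d[i + 2];
    d[i] = d[i + 1] = d[i + 2] = y;
  }
  return img;
}

// Sobel gradient magnitude as a simple edge detector (operator choice is an
// assumption; the talk only says "edge detection").
function sobelEdges(img: ImageData): ImageData {
  const { width: w, height: h, data: src } = img;
  const out = new ImageData(w, h);
  for (let y = 1; y < h - 1; y++) {
    for (let x = 1; x < w - 1; x++) {
      const p = (xx: number, yy: number) => src[(yy * w + xx) * 4]; // gray value
      const gx =
        -p(x - 1, y - 1) + p(x + 1, y - 1)
        - 2 * p(x - 1, y) + 2 * p(x + 1, y)
        - p(x - 1, y + 1) + p(x + 1, y + 1);
      const gy =
        -p(x - 1, y - 1) - 2 * p(x, y - 1) - p(x + 1, y - 1)
        + p(x - 1, y + 1) + 2 * p(x, y + 1) + p(x + 1, y + 1);
      const m = Math.min(255, Math.hypot(gx, gy));
      const i = (y * w + x) * 4;
      out.data[i] = out.data[i + 1] = out.data[i + 2] = m;
      out.data[i + 3] = 255;
    }
  }
  return out;
}
```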

Cloud OCR with Azure Cognitive Services (Microsoft Vision)

coilDNA’s recognition runs in the cloud on Azure, specifically Microsoft Vision from Azure Cognitive Services. This separates acquisition (local) from recognition (centralized). It fits environments where:

  • reliable networking is available on the line,
  • scaling across peaks or multiple lines/sites matters,
  • mobile browser clients should remain thin.

Even so, recognition quality hinges on preprocessing. The talk contrasts an unprocessed, reflective image yielding nothing with a processed version that produces a recognizable code.
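A sketch of the cloud call, assuming the asynchronous Read API (v3.2) of Azure Computer Vision—the talk names the service but not the exact endpoint; resource name and key are placeholders:

```typescript
// Placeholders: substitute your own Azure resource endpoint and key.
const AZURE_ENDPOINT = "https://<your-resource>.cognitiveservices.azure.com";
const AZURE_KEY = "<subscription-key>";

async function recognizeWithAzure(image: Blob): Promise<string> {
  // Submit the image; the service answers 202 with an Operation-Location URL.
  const submit = await fetch(`${AZURE_ENDPOINT}/vision/v3.2/read/analyze`, {
    method: "POST",
    headers: {
      "Ocp-Apim-Subscription-Key": AZURE_KEY,
      "Content-Type": "application/octet-stream",
    },
    body: image,
  });
  const pollUrl = submit.headers.get("Operation-Location")!;

  // Poll until the asynchronous read operation finishes.
  for (;;) {
    await new Promise((r) => setTimeout(r, 500));
    const res = await fetch(pollUrl, {
      headers: { "Ocp-Apim-Subscription-Key": AZURE_KEY },
    });
    const body = await res.json();
    if (body.status === "succeeded") {
      // Concatenate recognized lines into one candidate string.
      return body.analyzeResult.readResults
        .flatMap((page: any) => page.lines.map((l: any) => l.text))
        .join(" ");
    }
    if (body.status === "failed") throw new Error("Azure Read failed");
  }
}
```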

Post‑Processing: Format Knowledge Beats OCR Ambiguity

A key practical lever is format validation. The coilDNA code follows a fixed structure:

“Our code has 12 characters plus 2 letters …”

This expectation acts as a corrective lens for typical misreads. Wagner gave a classic dot‑matrix confusion:

“B is sometimes recognized as A8 … if it should be A B, then it can’t be A8; it must be A B.”

In other words, domain knowledge about permissible letter combinations and lengths is a powerful filter. Combining cloud OCR, format constraints, and repair rules transforms guesswork into dependable recognition.
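A sketch of that corrective lens, reusing the `validateAndRepair` name from the overview. The pattern’s exact interpretation of “12 characters plus 2 letters” and every confusion pair beyond the B/A8 example are assumptions:

```typescript
// Assumed interpretation of the "12 characters plus 2 letters" format.
const CODE_PATTERN = /^[A-Z0-9]{12}[A-Z]{2}$/;

// Known dot-matrix confusions, tried one at a time until the format matches.
const CONFUSIONS: Array<[wrong: string, right: string]> = [
  ["A8", "AB"], // "B" read as "8" after an "A", per the talk's example
  ["0", "O"],   // illustrative additions, not from the talk
  ["1", "I"],
];

function validateAndRepair(raw: string): string | null {
  const s = raw.replace(/\s+/g, "").toUpperCase();
  if (CODE_PATTERN.test(s)) return s;
  for (const [wrong, right] of CONFUSIONS) {
    const fixed = s.split(wrong).join(right);
    if (CODE_PATTERN.test(fixed)) return fixed;
  }
  return null; // still invalid: reject rather than guess
}
```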

Why these errors occur

  • Dot‑matrix points only approximate curves and strokes.
  • Shine/lighting creates “pseudo‑pixels” that mimic digit cores.
  • Distance and perspective alter point size relative to the 5×7 grid.

The takeaway: in industrial OCR, semantic constraints (length, allowed characters, valid prefixes/suffixes) are essential.

From Code to Context: API‑Based Data Lookup

Once the code is extracted and normalized, coilDNA performs an API lookup against its web services/databases. The result includes, among other fields:

  • When the code was captured/issued.
  • When it was printed on the coil.
  • The code’s position on the coil (e.g., which meter).
  • Total coil length.
  • The operator’s name.

“Lookup code and you get the data … when the code was captured … when it was printed on the coil … which meter … how long the coil is … and the operator’s name.”

This closes the loop: an edge image becomes a structured, production‑relevant record.
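The lookup itself can be a thin call against coilDNA’s web services; the URL and response shape below are illustrative, not the documented API:

```typescript
// Hypothetical lookup; reuses the CoilRecord interface from the overview sketch.
async function lookupCoilData(code: string): Promise<CoilRecord | null> {
  const res = await fetch(
    `https://api.example.com/coils/lookup?code=${encodeURIComponent(code)}`
  );
  if (!res.ok) return null; // unknown or invalid code
  // Fields mirror the talk: capture/print timestamps, position, length, operator.
  return (await res.json()) as CoilRecord;
}
```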

Demo Highlights: Why It Sometimes Fails

Wagner highlighted scenarios where the cloud OCR returns nothing—especially with strong glare or visually noisy surfaces. After preprocessing (grayscale, orientation, edges), recognition improves but may still be imperfect. That’s where correction rules (format knowledge) and—looking forward—trained ML models come in.

“I have to process the image so that I can extract something … still not perfect … these are the sources of error that must be eliminated.”

The contrast between fixed cameras and mobile scanning is stark: fixed setups benefit from defined parameters and known font rasters (5×7), whereas mobile scans must cope with variability in distance and lighting and therefore require stronger post‑processing.

The “Desired Solution”: A Learned ML Model for Error Profiles

Wagner outlined a direction of travel:

“Desired solution … build a machine learning model and train it … certain digits have this error value … then I output that, and I don’t do it myself programmatically.”

Rather than hand‑coding every confusion, a trained model would learn which glyphs tend to be misread under which conditions and apply that knowledge in correction. Dot‑matrix is a good candidate because point spacing, softness, and shine interact to produce recurring error patterns.
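One way such a learned layer could slot in—everything here (the statistics, the numbers, the function names) is invented for illustration; the talk describes the idea, not an implementation:

```typescript
// Per-glyph confusion probabilities, estimated from labeled misreads.
// read glyph -> (true glyph -> probability); all values are made up.
type ConfusionStats = Record<string, Record<string, number>>;

const learned: ConfusionStats = {
  "8": { B: 0.62, "8": 0.35, S: 0.03 }, // an "8" is often a misread "B"
};

// Replace each glyph with its most probable true value, but only accept the
// result when it satisfies the known code format.
function mlCorrect(raw: string, isValid: (s: string) => boolean): string {
  const corrected = [...raw]
    .map((ch) => {
      const stats = learned[ch];
      if (!stats) return ch;
      return Object.entries(stats).sort((a, b) => b[1] - a[1])[0][0];
    })
    .join("");
  return isValid(corrected) ? corrected : raw;
}
```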

The talk remains focused on today’s pipeline while pointing to a learned correction layer as the next step.

Engineering Playbook: Lessons We Took Away

The session distilled a practical recipe for industrial OCR:

  • Stabilize acquisition first
     • Prefer fixed cameras when possible: constant parameters drive reliability.
     • For mobile scans, design the flow to encourage steady, front‑facing captures.
  • Establish “minimum viable” preprocessing
     • Always apply grayscale, orientation correction, and edge detection before cloud OCR.
     • Actively mitigate glare (capture angle, lighting) where feasible.
  • Use cloud OCR deliberately
     • Microsoft Vision (Azure Cognitive Services) performs the recognition—but results are only as good as the input.
     • Balance latency/bandwidth against recognition gains; avoid bloated image payloads.
  • Don’t skimp on post‑processing
     • Enforce the code format (length, allowed character sets, prefixes/suffixes).
     • Apply rules for typical confusions (B↔A8, etc.).
  • Make data retrieval a first‑class step
     • After OCR, immediately perform a structured lookup for production/quality data.
     • Present results so position and history are obvious at a glance.
  • Prepare for a learned correction layer (see the interface sketch after this list)
     • Collect and label misreads over time.
     • Keep the pipeline modular so an ML model can be inserted later without re‑plumbing.
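A sketch of what “modular” could mean here, reusing `validateAndRepair` from the earlier sketch; the interface and class names are assumptions, not coilDNA’s design:

```typescript
// Shared stage interface so a learned corrector can replace the rule-based
// one without re-plumbing the rest of the pipeline.
interface CorrectionStage {
  correct(raw: string): string | null;
}

class RuleBasedCorrection implements CorrectionStage {
  correct(raw: string): string | null {
    return validateAndRepair(raw); // today's heuristics
  }
}

class LearnedCorrection implements CorrectionStage {
  constructor(private fallback: CorrectionStage) {}
  correct(raw: string): string | null {
    // A trained model would be consulted here; fall back to rules until then.
    return this.fallback.correct(raw);
  }
}
```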

Dot‑Matrix on Metal: Specific Pitfalls

The talk clarified why dot‑matrix OCR on metal is notoriously tricky:

  • Reflectivity: shiny surfaces create false edges/highlights.
  • Low stroke continuity: dot‑based glyphs tolerate little blur.
  • Perspective: small tilt skews point spacing.
  • Glyph confusion: similar 5×7 patterns cause B vs. 8‑like misreads, among others.

Conclusion: without robust preprocessing and format validation, recognition is unreliable. Fixed cameras with known parameters and font definitions mitigate the risk; mobile scanning demands stronger processing and careful UX.

Organizational Context: coilDNA in Linz

Wagner located coilDNA clearly:

“We are in the Inter‑Trading building in Linz, a 100% subsidiary of AMAG—hence aluminum coil.”

That aluminum focus explains the deep dive into coil‑specific identification and why the solution sits at the intersection of production engineering, computer vision, and cloud architecture.

Conclusion: Industrial OCR That Fits the Factory Floor

Jan Wagner’s “Industrial High Speed Cloud OCR” outlines a pragmatic blueprint for OCR in harsh industrial settings. The core messages:

  • Without preprocessing, even capable cloud OCR struggles against glare and dot‑matrix artifacts.
  • Domain‑specific format knowledge is a decisive accuracy lever—especially for recurring confusions like “B” vs. “A8”.
  • Fixed cameras with stable parameters and known font definitions are inherently robust; mobile browser scans offer flexibility but must be backed by strong preprocessing and validation.
  • The next boost will come from a trained ML correction layer that learns misread patterns from real‑world scans.

For engineers tackling similar problems, the approach is clear: pipeline first, heuristics now, cloud OCR where it helps, and a clean API‑based data retrieval at the end. That’s how a dot‑matrix pattern on reflective metal turns into trustworthy, production‑grade data.