A few days ago I was at the checkout in a grocery store when the cashier asked me: “Can I scan your app?” When I got back home, I opened the App Store and the itch started!

Image by Caio on pexels.com

In this article, I would like to explain why retailers are interested in promoting their mobile apps to their customer base, what advantages the apps bring them, what kind of data they collect, and how that data can be used from a data science point of view.


Beyond my little adventure at the local grocery store, as a data scientist I couldn’t help noticing that more and more retailers of every kind (even ice cream retailers) are inviting their customers to download their apps and scan some kind of barcode at the moment of the transaction.

Sometimes the apps promise discounts, sometimes they let you order products to your door, sometimes they double as a note-taking app for your shopping list, and sometimes they offer a mix of all of these.

The Apps:

These apps are free to download from the app store of each platform, and they are all structured in more or less the same way.

Although the text is in Polish, it is quite easy to pick out the main features of this app.

The app’s home screen shows a recap of how many points I have collected, a carousel with different calls to action, and an image suggesting the main way to use the app: scan your barcode at the checkout. Scrolling further down, I noticed an interesting section with offers that are available for a limited time.

Google Trends search for the term “grocery store app”

According to an interesting report by Nielsen, the main reasons retailers are interested in an app for their customers are:

  • Everything is social: many apps push social media integration since it helps to collect different kinds of data. Social integration also makes it easier for the user to log in (think about the “Login with …” button).
  • It helps decision making: since it is much easier (once you have a user base) to learn users' locations and tastes, grocery companies get help in deciding where to set up future brick-and-mortar stores.
  • Customer care improves: based on the information customers share, along with their purchases, transactions, and demographics, the relationship with customers can be improved.
  • It creates a new field where suppliers can compete: a section like “recommended products” gives retailers the chance to put a spot there out to bid. This opens the door to competition among suppliers and/or an increase in the retailer's bargaining power.
  • The magic of data: collecting user transactions, basket contents, and preferences opens the door to the world of predictive analysis.
  • Retaining existing customers leads to higher profits: customer acquisition costs are generally higher than retention costs. Moreover, reports suggest that existing customers, compared to new customers, are 50% more likely to try new products and spend up to 31% more.

What kind of data do they collect?

To answer this question, I will first focus on the data we enter into those apps, from the moment we first download the app to the moment we scan our barcode at the checkout.

First, when we register, we are usually prompted for our email address, date of birth (regulations may impose a minimum age for registration), name, and surname. If we log in with an external provider (think of “Login With Facebook”), we delegate to that provider the task of supplying this information on our behalf.

Next, when we do our regular shopping, we are prompted by the cashier to scan our barcode. This barcode contains an ID associated with our profile, which is used to create an association between the transaction and us.

When a transaction takes place, at least the following data are recorded:

  • The transaction timestamp.
  • The store where the transaction happened.
  • The content of our basket (it might be a list of product IDs).
  • The transaction total.
  • Whether we used any coupon or discount, and which one.
  • Whether we used any promo code.

Depending on how the app was designed, it is also possible to acquire a much broader set of data, such as whether the basket content was previously prepared inside the app, which payment technology was used (Google Pay, cash, credit card…), which mobile OS was used, and so on.
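To make the list above more concrete, here is a minimal sketch of how a single transaction record could be modeled. All field names are my own assumptions for illustration; a real retailer's schema will certainly differ.

```python
from dataclasses import dataclass, field
from datetime import datetime
from typing import List, Optional


@dataclass
class Transaction:
    """One checkout event linked to an app user (all field names are hypothetical)."""
    user_id: str             # ID encoded in the barcode scanned at the checkout
    timestamp: datetime      # when the transaction happened
    store_id: str            # store where the transaction happened
    product_ids: List[str]   # basket content as a list of product IDs
    total: float             # transaction total
    coupon_codes: List[str] = field(default_factory=list)  # coupons/discounts used
    promo_code: Optional[str] = None                        # promo code, if any
    # Optional extras that depend on how the app was designed
    payment_method: Optional[str] = None
    mobile_os: Optional[str] = None
    basket_prepared_in_app: Optional[bool] = None


# A made-up example record
example = Transaction(
    user_id="user-42",
    timestamp=datetime(2020, 5, 1, 18, 30),
    store_id="store-7",
    product_ids=["milk", "chocolate"],
    total=12.5,
    coupon_codes=["SPRING10"],
)
```

The important part is `user_id`: without it, the record is just anonymous POS data.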

What do retailers do (or could do) with those data?

Even without an app, some of the data listed in the previous section can still be recorded: devices such as the in-store POS already record transaction data. However, having an association between a transaction and a user opens up different possibilities for retailers, such as tailored marketing, user clustering, personal recommendations, and offers.

An example of a database that keeps track of transactions. Every box represents a table.

A similar way to link a transaction to a user is the convenience store loyalty card, which has been around for as long as I can remember. However, this strategy is still considered an analog way of keeping track of the customer base, since it cannot record events such as the preparation of a shopping list.

Since there are many ways retailers can exploit such a “live” collection of transaction and user data, I would like to split them into two groups: descriptive analysis and predictive analysis.

Descriptive analysis is used to get acquainted with our data, to generate a set of business reports, to cluster our customers based on different categories, and to find and prepare correlations and aggregations for further modeling, with predictive analysis as the final goal.

Predictive analysis aims to deliver predictions about the future using data from the past. For example, it is possible to forecast future sales for different categories of products. In this case, the data must be processed correctly first (we might want to aggregate and sum all the sales per category and time period).
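As a rough illustration of that aggregation step, the sketch below sums made-up sales per product category and ISO week, then produces a deliberately naive forecast (the mean of the last few weeks). The data, the function names, and the choice of a moving average are all my own assumptions; a real forecasting model would be far more sophisticated.

```python
from collections import defaultdict
from datetime import date

# Hypothetical line items: (purchase date, product category, amount spent)
sales = [
    (date(2020, 5, 4), "dairy", 3.0),
    (date(2020, 5, 5), "dairy", 2.0),
    (date(2020, 5, 6), "snacks", 4.5),
    (date(2020, 5, 11), "dairy", 6.0),
    (date(2020, 5, 13), "snacks", 1.5),
]


def weekly_sales_by_category(rows):
    """Sum the amounts per (category, ISO year, ISO week)."""
    totals = defaultdict(float)
    for day, category, amount in rows:
        iso = day.isocalendar()
        year, week = iso[0], iso[1]
        totals[(category, year, week)] += amount
    return dict(totals)


def naive_forecast(history, window=2):
    """Baseline forecast: the mean of the last `window` observations."""
    recent = history[-window:]
    return sum(recent) / len(recent)


weekly = weekly_sales_by_category(sales)

# Build the weekly history for one category, ordered by (year, week)
dairy_history = [
    amount
    for (category, year, week), amount in sorted(weekly.items())
    if category == "dairy"
]
next_week_dairy = naive_forecast(dairy_history)
```

Changing the grouping key (per day, per store, per user) gives a different view of the same transactions, which is why the aggregation has to be chosen with the prediction target in mind.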

An example of how different aggregations can give different insights into the data.

Another common application of transactional data is price optimization: prices are adjusted based on the estimated price elasticity of demand, that is, how strongly the quantity sold reacts to a change in price.
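Price elasticity of demand is the percentage change in quantity sold divided by the percentage change in price. The sketch below computes it from two made-up observations. The function name and the numbers are assumptions for illustration, and in practice retailers would estimate elasticity from many observations rather than from a single price change.

```python
def price_elasticity(old_price, new_price, old_quantity, new_quantity):
    """Simple (point) price elasticity of demand between two observations."""
    pct_quantity = (new_quantity - old_quantity) / old_quantity
    pct_price = (new_price - old_price) / old_price
    if pct_price == 0:
        raise ValueError("price did not change, elasticity is undefined")
    return pct_quantity / pct_price


# Made-up example: the price rises from 2.00 to 2.50 (+25%)
# and weekly units sold drop from 100 to 80 (-20%).
elasticity = price_elasticity(2.0, 2.5, 100, 80)

# |elasticity| > 1 means demand is elastic (sensitive to price);
# |elasticity| < 1 means it is inelastic.
is_elastic = abs(elasticity) > 1
```

In this example demand drops less than proportionally to the price increase, so the product would count as inelastic and a price increase would raise revenue.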

Have you ever seen those e-ink price tags in a store? The information displayed on those price tags is controlled by a remote system and can be updated at any time.

It seems that we’ve already seen something very similar somewhere else…

Another example of the techniques described so far (and one more closely related to retail) is the Alibaba Hema store. Hema is a Chinese grocery chain that is basically the definition of the future of grocery shops: customers use Hema’s app to scan and pay for groceries, and the app shows additional information for every product, such as recipes (think about having n items in your basket and the app telling you which recipes you can cook and what is missing).

A glimpse of Hema’s mobile app.

The stores also serve as distribution centers, where employees gather online orders and order bags are moved throughout the store by a system of conveyor belts hanging from the ceiling. Normally a customer within a range of 3 km can get their groceries in less than 30 minutes.

Customers use Hema’s mobile app to scan barcodes throughout the store and find out things such as product information and recipe ideas. Alibaba knows everything a customer has purchased, so it can later offer users the option to quickly reorder the same goods for home delivery.

A small live demo:

No interesting presentation is complete without a live demo. This is not a presentation, but I still wanted to give a taste of what a very simple basket analysis model looks like, based on a very common and widely used algorithm: Apriori.

A sample of market basket transactions

For this demo, I will use Orange, a very cool open-source data mining tool that provides a series of tools for data mining, exploration, and even some machine learning! After you have downloaded it, make sure to install the Associate add-on (head over to Options, then Add-ons). Next, you can get the dataset for this demo here.

Once the Frequent Itemset algorithm has been fitted, it is possible to have a look at the output tables:

This table shows a summary for each item (with nested itemsets). For example, mineral water appears in 23.83% of the transactions, and the itemset {mineral water, spaghetti} in 5.97%.
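For readers who prefer code to a GUI, here is a minimal, unoptimized sketch of the Apriori idea behind such a table: count how often each itemset appears, keep those above a minimum support, and only grow itemsets whose smaller subsets are all frequent. The toy baskets and the function name are made up, and this is not how Orange implements it internally.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support} for all itemsets with support >= min_support."""
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)
    # Start with every single item as a candidate itemset of size 1
    candidates = {frozenset([item]) for basket in baskets for item in basket}
    result = {}
    k = 1
    while candidates:
        # Keep only the candidates that are frequent enough
        level = {}
        for candidate in candidates:
            count = sum(1 for basket in baskets if candidate <= basket)
            if count / n >= min_support:
                level[candidate] = count / n
        result.update(level)
        # Join frequent k-itemsets into (k+1)-itemset candidates, pruning any
        # candidate that has an infrequent k-subset (the Apriori property)
        k += 1
        candidates = set()
        for a, b in combinations(list(level), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(subset) in level for subset in combinations(union, k - 1)
            ):
                candidates.add(union)
    return result


# Toy baskets, not the dataset used in the demo
toy_baskets = [
    ["mineral water", "spaghetti", "milk"],
    ["mineral water", "spaghetti"],
    ["mineral water", "chocolate"],
    ["milk", "chocolate"],
    ["spaghetti"],
]
itemsets = frequent_itemsets(toy_baskets, min_support=0.4)
```

With a minimum support of 0.4, the only pair that survives in this toy data is {mineral water, spaghetti}, which appears in 2 of the 5 baskets.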

The second (definitely more interesting) table is the one generated by the Association Rules algorithm. It introduces two very important concepts: support and confidence.

Support measures how frequently an itemset appears in the transactions: from our data, we can see that the rule milk -> chocolate has more than twice the support of the rule milk -> soup.

Confidence, on the other hand, tells us how likely the consequent item is to be purchased when the antecedent is already in the basket.
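Both metrics are easy to compute by hand. The sketch below uses made-up baskets (not the demo dataset) and hypothetical function names: support is the share of baskets that contain every item of the rule, and confidence is that support divided by the support of the antecedent alone.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    wanted = set(itemset)
    hits = sum(1 for basket in transactions if wanted <= set(basket))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Estimate of P(consequent in basket | antecedent in basket)."""
    antecedent_support = support(transactions, antecedent)
    if antecedent_support == 0:
        return 0.0  # the antecedent never occurs, so the rule has no evidence
    both = support(transactions, set(antecedent) | set(consequent))
    return both / antecedent_support


# Made-up baskets for illustration
toy_baskets = [
    ["milk", "chocolate"],
    ["milk", "chocolate"],
    ["milk", "soup"],
    ["milk"],
    ["bread"],
]

support_milk_chocolate = support(toy_baskets, ["milk", "chocolate"])
support_milk_soup = support(toy_baskets, ["milk", "soup"])
confidence_milk_chocolate = confidence(toy_baskets, ["milk"], ["chocolate"])
confidence_milk_soup = confidence(toy_baskets, ["milk"], ["soup"])
```

In this toy data, milk -> chocolate is supported by 2 of 5 baskets and milk -> soup by 1 of 5, while half of the baskets containing milk also contain chocolate.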

Using a combination of the two metrics can help retailers understand where to place, what to promote, and how to promote different items (or itemsets!).


While researching this article I came across another very interesting trend: shop-agnostic grocery apps.

These apps work slightly differently from apps that come from a single retailer: some have a built-in recommendation system to help you with meal planning and grocery lists, and they then partner with different retailers. Jaw, a French startup that recently got $7M in funding, has partnerships with different supermarkets, among them giants such as Carrefour, Auchan, and Leclerc.

Another interesting move is the acquisition of Szopi by Supermercato24, which, besides the copy-paste of their Italian website onto the Polish one (suggesting a willingness to rebrand Szopi), shows that this trend is growing.

One question strikes me: at the moment this market seems pretty fragmented (especially in Europe), so what will happen in the coming months?

Will we install tons of different apps on our smartphones (think of UberEats, Glovo, Deliveroo), with different players fighting for their market share, or will Zipf’s law also come into play here?

Why Grocery Stores are Asking You to Download Their Apps was originally published in Towards Data Science on Medium, where people are continuing the conversation by highlighting and responding to this story.