DEEP DIVE: The Product Data Model

One of the major differences between a traditional E-com (where products are purchased or manufactured by the E-com owner) and a marketplace is that in a marketplace model, it is the seller who is responsible for providing the marketplace with product data.

This means that the seller’s product data model must be aligned with or mapped to the product data model of the marketplace platform. It is unusual for even a single seller’s product data model to align perfectly with the marketplace’s, and practically impossible for every seller’s model to do so. Things are therefore simpler while a marketplace has only one or a few sellers, but as soon as it scales up, there must be a general, universal product data model that all sellers adapt to. Only after a seller has mapped their products to the common product data model can the products become sellable online.
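As a minimal sketch of what such a mapping could look like, assume each seller maintains a simple lookup table from their own categories to the marketplace's common categories. All category names and function names below are hypothetical and only serve as illustration.

```python
from typing import Optional

# Hypothetical mapping table: the seller's own categories (keys)
# mapped to the marketplace's common categories (values).
SELLER_TO_MARKETPLACE = {
    "Denim": "Jeans",
    "Roll necks": "Knitwear",
    "Dress shirts": "Shirts",
}


def map_category(seller_category: str) -> Optional[str]:
    """Return the marketplace category, or None if it is not mapped yet."""
    return SELLER_TO_MARKETPLACE.get(seller_category)


def is_sellable(seller_category: str) -> bool:
    """A product only becomes sellable once its category has been mapped."""
    return map_category(seller_category) is not None
```

A product in the seller's "Denim" category would be sellable here, while a product in an unmapped category such as "Swimwear" would not be until the seller extends the table.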

There is a delicate balance between a short time to market and rich product data, because an overly complicated data model often lengthens the time it takes for a product to reach the market.

Below we will outline some ground rules when it comes to creating a sustainable and useful product data model.

A product data model should be MECE

A good product data model should be mutually exclusive and collectively exhaustive, also known as MECE. This means that if you as a seller are going to map a knitted turtleneck to exactly one of several selectable categories, you don’t want to have to choose from the following categories:

  • Tops
  • Knitted
  • Long sleeved
  • Turtlenecks

Since your product fits in all the above categories, it will be very difficult to map. In this case, certain categories in the product data model overlap, and are therefore not mutually exclusive.

Another example is where a seller needs to map a pair of jeans to the common product data model and the categories available to choose from are:

  • Chinos
  • Sweatpants
  • Suit pants

In this case, although the categories are mutually exclusive (i.e. they do not overlap), they are not collectively exhaustive: there is no category that jeans can be mapped to. In a MECE model, every product has exactly one place to be mapped to.
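The two failure modes above can be checked mechanically. The following is a minimal sketch, assuming each category can be described as a rule that accepts or rejects a product; the product fields (`type`, `material`, `sleeve`) and the function name `check_mece` are hypothetical.

```python
def check_mece(products, categories):
    """Check a set of category rules against sample products.

    categories maps a category name to a function that returns True
    if the category accepts the product.
    Returns (overlaps, gaps):
      overlaps: product name -> list of categories when more than one matches
      gaps:     names of products that match no category at all
    """
    overlaps = {}
    gaps = []
    for product in products:
        matches = [name for name, accepts in categories.items() if accepts(product)]
        if len(matches) > 1:
            overlaps[product["name"]] = matches  # not mutually exclusive
        elif not matches:
            gaps.append(product["name"])  # not collectively exhaustive
    return overlaps, gaps


# Hypothetical product records for the two examples in the text.
turtleneck = {"name": "knitted turtleneck", "type": "top",
              "material": "knitted", "sleeve": "long"}
jeans = {"name": "jeans", "type": "jeans", "material": "denim", "sleeve": None}

overlapping_categories = {
    "Tops": lambda p: p["type"] == "top",
    "Knitted": lambda p: p["material"] == "knitted",
    "Long sleeved": lambda p: p["sleeve"] == "long",
}
trouser_categories = {
    "Chinos": lambda p: p["type"] == "chinos",
    "Sweatpants": lambda p: p["type"] == "sweatpants",
    "Suit pants": lambda p: p["type"] == "suit pants",
}
```

Running `check_mece([turtleneck], overlapping_categories)` reports the turtleneck as an overlap, and `check_mece([jeans], trouser_categories)` reports the jeans as a gap. A check like this only covers the sample products it is given, so it can reveal problems but cannot prove that a model is MECE.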

A product data model must be durable over time

When developing your product data model as a marketplace operator, it is important to have a time horizon that is not too short. Changing the product data model means changes not only in the marketplace’s own systems, but also in all sellers’ systems. If hundreds of sellers have mapped their products to a product data model and it is then changed (some categories are removed or moved and some are added), every one of those sellers will need to re-map their products to match the new product data model.

We generally recommend letting the work on the product data model take the time it needs, so that you can be as sure as possible that it will last a long time. One important thing to remember is that it is always easier to add new categories than to remove existing ones (because removal means that sellers will need to re-map all products that are currently in the deleted category).
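The asymmetry between adding and removing categories can be illustrated with a small sketch. It assumes that a seller's mappings are stored as a lookup from product ID to category name; the function name and the sample data are hypothetical.

```python
def products_to_remap(mappings, old_categories, new_categories):
    """Return the product IDs whose category no longer exists.

    mappings maps a product ID to the category it is currently mapped to.
    """
    removed = set(old_categories) - set(new_categories)
    return sorted(pid for pid, category in mappings.items() if category in removed)


# Hypothetical seller mappings and category sets.
mappings = {"sku-1": "Chinos", "sku-2": "Sweatpants", "sku-3": "Chinos"}
old = {"Chinos", "Sweatpants", "Suit pants"}

# Adding a category leaves every existing mapping valid.
after_adding = products_to_remap(mappings, old, old | {"Jeans"})

# Removing a category forces a re-map of every product that was in it.
after_removing = products_to_remap(mappings, old, old - {"Chinos"})
```

Here `after_adding` is empty, while `after_removing` contains both products that were mapped to the removed category. Multiplied across hundreds of sellers, that second list is the cost of a removal.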

Categories vs attributes

One question that is often very tricky is: “Should I work with very specific categories and fewer attributes, or should I have shallow categories and enrich the products with attributes?” Let’s take an example.

Imagine that we want to be able to differentiate short-sleeved and long-sleeved shirts in our product data model, and that we also want to know whether the shirts have a button-down collar or a cutaway collar. One way to do that would be to have categories that look like the following:

Shirts > Long-sleeved shirts > Button-down collar
Shirts > Long-sleeved shirts > Cutaway collar
Shirts > Short-sleeved shirts > Button-down collar
Shirts > Short-sleeved shirts > Cutaway collar

This would allow the seller to map each shirt to exactly one category, and it would work fine. But what happens if we want to add a third collar type? Then we would have to create two new categories, one for short-sleeved and one for long-sleeved shirts. Furthermore, imagine that there is another interesting feature of the shirt: fit. The category tree would then grow quickly, because the number of leaf categories is the product of the number of values of each property, so it grows exponentially with the number of properties we add.
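The growth of the category tree can be made concrete with a short sketch that generates every leaf category from a set of properties. The third collar type ("Spread collar") and the fit values are hypothetical and chosen only for illustration.

```python
from itertools import product


def leaf_categories(root, properties):
    """Build one leaf category path per combination of property values."""
    return [" > ".join((root, *combo)) for combo in product(*properties.values())]


# The two properties from the example above.
two_properties = {
    "sleeve": ["Long-sleeved shirts", "Short-sleeved shirts"],
    "collar": ["Button-down collar", "Cutaway collar"],
}

# Hypothetical third collar type.
three_collars = {
    "sleeve": two_properties["sleeve"],
    "collar": two_properties["collar"] + ["Spread collar"],
}

# Hypothetical fit values added as a third property.
with_fit = dict(three_collars, fit=["Slim fit", "Regular fit", "Loose fit"])

counts = [len(leaf_categories("Shirts", p))
          for p in (two_properties, three_collars, with_fit)]
```

With these inputs, the tree goes from 4 leaf categories to 6 when a collar type is added, and to 18 when fit is added as a third property with three values.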

The alternative is to use attributes. In this case, the attributes could be called:

  • Collar type
  • Sleeve length
  • Fit

This would mean that we can simplify our product data model to just be:

Shirts

Given that a product is mapped to the Shirts category, the seller also needs to enter a value for each of the three attributes above. This gives us a much simpler product data model that is also far more flexible for future changes. For each attribute, we then work with value lists that give us control over which values a seller can enter.
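A minimal sketch of how value lists could be enforced is shown below. The structure `VALUE_LISTS`, the function `validate`, and the allowed values themselves are assumptions made for illustration, not a prescribed implementation.

```python
# Hypothetical value lists: for each category, the required attributes
# and the values a seller is allowed to enter for each of them.
VALUE_LISTS = {
    "Shirts": {
        "Collar type": {"Button-down", "Cutaway"},
        "Sleeve length": {"Short", "Long"},
        "Fit": {"Slim", "Regular"},
    },
}


def validate(category, attributes):
    """Return a list of problems with a seller's product mapping.

    An empty list means the product is mapped to a known category and
    every required attribute has a value from its value list.
    """
    allowed = VALUE_LISTS.get(category)
    if allowed is None:
        return [f"unknown category: {category}"]
    errors = []
    for name, values in allowed.items():
        if name not in attributes:
            errors.append(f"missing attribute: {name}")
        elif attributes[name] not in values:
            errors.append(f"invalid value for {name}: {attributes[name]}")
    for name in attributes:
        if name not in allowed:
            errors.append(f"unexpected attribute: {name}")
    return errors
```

Adding a new collar type in this design means adding one value to one list, rather than creating new categories that sellers may later have to re-map.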


Establishing a good and sustainable product data model early and then sticking to it is one of the most important tasks when a new marketplace takes shape. A little extra work at the beginning will pay for itself many times over.

Written by Max Guidotti and Klas Wilhelmsson, PARAGAIA Consulting Group
