Data Warehouse Design For an E-Commerce Company

akhil anand
4 min readMar 19, 2023
Source

Introduction

Data Warehouse works as a centralized source for storing structured and unstructured data in a transformed form which further helps an organization to build better decisions out of those data. As it stores the cleaned and transformed data the storage cost(for current and historical data) is not that high. Google big query and Amazon Dynamo DB are some common examples of data warehouses.

Points to be considered before constructing data warehose.

Before constructing a data Warehouse you should keep these four points in your mind.
1. Understand the business process.
2. Define the grain (An individual row should explain the whole purpose of constructing a warehouse system).
3. Define the Dimensions (the static table which will help in deriving the insights out of data).
4. Define the Facts (the table which stores transactional data).

Understanding the business process

  1. Buyers will place the order & the order would be placed from a particular seller.
  2. Buyers can place multiple orders from multiple sellers.
  3. The base price of the product.
  4. Coupon amount if applied
  5. Transaction type online or Cash on Delivery.
  6. Logistics partner delivering the order.
  7. Number of quantities ordered
  8. Final status like delivered, RTO, DTO, or canceled.

When all the above pointers will be explained in a row we will be successfully able to define the grain.

Defining the Dimensions

The Most important portion of warehouse modeling design, the Construction of the dimensional table will directly or indirectly help in making the better decision out of those data. Let’s understand each and every dimensional table in detail.

1. Buyer Info

Buyer info will store all the relevant information of buyers. i.e: Buyer Name, Buyer Phone, etc. for better optimization of data it’s better to construct another sub dimensional table that will store the buyer’s location info. Please See the below table for visual understanding.

Figure 1

2. Seller Info

In this table, we will store all the info about the seller like seller id, seller name, Seller Phone, products available, and location level info.

Figure 2

3. Status

The status table stores relevant information about the order Journey. Suppose a buyer has ordered a product, the transaction can be either online payment or offline payment(COD), and you will construct a column that gives info about the payment type. once the order is placed it can either be delivered to the buyer or RTO (Return to Owner).
If the order will be delivered then it might be possible for the order to be returned from the buyer's end. so these different order statuses can be stored in the status column. This table will also store the order date info.

Figure 3

4. Logistics Partner

Suppose a buyer has ordered a product and the seller has prepared the same order. Now it’s time for the logistics partner to deliver the order to the end user. So we need a separate dimensional table that will store info related to logistics partners. Each order will be associated with an individual Agent so we need to store the individual agent information in the sub dimensional table.

Figure 4

Defining The Facts

Facts store all the transactional data. The main aim is to construct the fact table to store the numerical data which is further combined with the dimensional table to give better data-driven info. Let’s see the pictorial representation of the facts Table.

Figure 5

Now you are able to define all the components needed to design a data warehouse model. Let’s Combine all these facts and dimensional table to construct the final data warehouse System.

Figure 6

--

--