Amazon Fashion Analytics Dashboard

Business Intelligence for Sales Directors

Principal Component Analysis (PCA)

Introduction to PCA

Principal Component Analysis (PCA) is a dimensionality reduction technique that transforms a set of correlated variables into a set of uncorrelated variables called principal components. The components are ordered so that each one captures as much of the remaining variance as possible. Keeping only the first few therefore identifies the main dimensions of variation and simplifies complex datasets while preserving most of the information.
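To make the definition concrete, here is a minimal sketch of PCA using NumPy. It is not the pipeline behind this dashboard: the data is synthetic, the function name `pca` is hypothetical, and standardizing each feature before the decomposition is an assumption (a common choice when features use different units such as grams and centimeters).

```python
import numpy as np

def pca(X):
    """Return (explained_variance_ratio, components, scores) for data matrix X.

    Rows of X are observations, columns are features.
    """
    # Assumption: standardize each feature so that units do not dominate.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Covariance of the standardized data (the correlation matrix of X).
    cov = np.cov(Z, rowvar=False)
    # eigh is meant for symmetric matrices; eigenvalues come back ascending.
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    order = np.argsort(eigenvalues)[::-1]      # largest variance first
    eigenvalues = eigenvalues[order]
    components = eigenvectors[:, order].T      # one row per principal component
    ratio = eigenvalues / eigenvalues.sum()    # share of variance per component
    scores = Z @ components.T                  # observations in component space
    return ratio, components, scores

# Synthetic, purely illustrative data: four features driven by one shared factor.
rng = np.random.default_rng(0)
shared = rng.normal(size=(500, 1))
X = shared @ np.array([[1.0, 0.8, 0.5, 0.6]]) + 0.5 * rng.normal(size=(500, 4))

ratio, components, scores = pca(X)
print(np.round(100 * ratio, 2))             # explained variance (%) per component
print(np.round(100 * np.cumsum(ratio), 2))  # cumulative explained variance (%)
```

The rows of `components` play the role of the loadings reported below, and the columns of `scores` are uncorrelated with each other by construction.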

PCA on Product Dimensions

We applied PCA to product dimensions (weight, length, height, width) to identify the main factors that explain variation in product sizes.

Explained Variance by Principal Components

Figure 1: Explained variance by principal components for product dimensions

Principal Component | Explained Variance (%) | Cumulative Explained Variance (%)
PC1 | 58.29 | 58.29
PC2 | 21.74 | 80.03
PC3 | 11.42 | 91.45
PC4 | 8.55 | 100.00

Component Loadings

Figure 2: Component loadings for product dimensions

Feature | PC1 Loading | PC2 Loading
Weight (g) | 0.58 | -0.12
Length (cm) | 0.52 | -0.38
Height (cm) | 0.42 | 0.87
Width (cm) | 0.47 | -0.29

PC1 vs PC2 Scatter Plot

Figure 3: Scatter plot of products in PC1 vs PC2 space

Business Insight: The first two principal components explain about 80% of the variance in product dimensions. PC1 (58.29% of variance) has positive loadings for all dimensions, suggesting it represents the overall size of products. PC2 (21.74% of variance) has a strong positive loading for height (0.87) but negative loadings for weight, length, and width, suggesting it contrasts tall-but-narrow products with wide-but-short ones. This dimensionality reduction allows us to categorize products based on their size characteristics, which can inform packaging strategies, warehouse organization, and shipping cost optimization.
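The interpretation above can be checked by projecting individual products onto the two components. The sketch below uses the PC1 and PC2 loadings from the component loadings table. It assumes the inputs are standardized values (z-scores), which this report does not state explicitly, and both example products and the helper name `pc_scores` are hypothetical.

```python
# Loadings copied from the component loadings table for product dimensions.
# Feature order: weight (g), length (cm), height (cm), width (cm).
PC1_LOADINGS = [0.58, 0.52, 0.42, 0.47]
PC2_LOADINGS = [-0.12, -0.38, 0.87, -0.29]

def pc_scores(z):
    """Project one product's standardized dimensions onto PC1 and PC2."""
    pc1 = sum(w * v for w, v in zip(PC1_LOADINGS, z))
    pc2 = sum(w * v for w, v in zip(PC2_LOADINGS, z))
    return pc1, pc2

# Hypothetical z-scores: (value - mean) / standard deviation for each feature.
large_box = [1.0, 1.0, 1.0, 1.0]       # one std above average on every feature
tall_narrow = [0.0, -1.0, 2.0, -1.0]   # tall, but short in length and width

print(pc_scores(large_box))
print(pc_scores(tall_narrow))
```

A product that is above average on every feature scores high on PC1 and near zero on PC2, while a tall, narrow product scores near zero on PC1 and high on PC2, which matches the "overall size" and "height contrast" readings.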

PCA on Order Items

We applied PCA to order item features (price, freight value, product dimensions) to identify patterns in customer purchases.

Explained Variance by Principal Components for Orders

Figure 4: Explained variance by principal components for order items

Principal Component | Explained Variance (%) | Cumulative Explained Variance (%)
PC1 | 48.16 | 48.16
PC2 | 17.41 | 65.57
PC3 | 14.23 | 79.80
PC4 | 10.87 | 90.67
PC5 | 5.78 | 96.45
PC6 | 3.55 | 100.00

Component Loadings for Orders

Figure 5: Component loadings for order items

Feature | PC1 Loading | PC2 Loading
Price | 0.32 | 0.78
Freight Value | 0.38 | 0.42
Weight (g) | 0.52 | -0.21
Length (cm) | 0.47 | -0.24
Height (cm) | 0.36 | -0.28
Width (cm) | 0.35 | -0.18

PC1 vs PC2 Scatter Plot for Orders

Figure 6: Scatter plot of order items in PC1 vs PC2 space

Business Insight: The first two principal components explain about 66% (65.57%) of the variance in order items. PC1 (48.16% of variance) has positive loadings for all features, suggesting it represents the overall size and cost of orders. PC2 (17.41% of variance) has a strong positive loading for price (0.78) and a moderate one for freight value (0.42), but negative loadings for all physical dimensions, suggesting it contrasts expensive-but-small items with cheap-but-large ones. This analysis suggests that, beyond the shared size-and-cost trend captured by PC1, price varies partly independently of physical size, which can inform pricing strategies, product bundling, and targeted marketing campaigns.
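The cumulative percentages quoted in both analyses follow directly from the per-component values. As a small sketch, the snippet below recomputes them from the two explained variance tables and counts how many components are needed to reach a target share of variance. The helper name `components_needed` and the 80% and 90% targets are illustrative assumptions, not thresholds used by this dashboard.

```python
from itertools import accumulate

# Explained variance (%) per component, copied from the two tables above.
PRODUCT_PCT = [58.29, 21.74, 11.42, 8.55]
ORDER_PCT = [48.16, 17.41, 14.23, 10.87, 5.78, 3.55]

def components_needed(explained_pct, target_pct):
    """Smallest number of leading components whose cumulative share
    of variance reaches target_pct (both given in percent)."""
    for k, cumulative in enumerate(accumulate(explained_pct), start=1):
        # Small tolerance guards against floating-point rounding in the sum.
        if cumulative >= target_pct - 1e-9:
            return k
    raise ValueError("target_pct exceeds the total explained variance")

for name, pct in [("product dimensions", PRODUCT_PCT), ("order items", ORDER_PCT)]:
    cumulative = [round(c, 2) for c in accumulate(pct)]
    print(name, cumulative,
          components_needed(pct, 80), components_needed(pct, 90))
```

With these figures, two components are enough to pass 80% for product dimensions, while order items need four components to pass either 80% or 90%, because three components stop at 79.80%.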

Applications of PCA Results

The PCA results provide several practical applications for e-commerce business operations:

1. Product Categorization

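Using the principal components, we can categorize products based on their size characteristics rather than only by product category. As noted in the product dimensions analysis, such size-based groupings can inform packaging strategies and warehouse organization.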
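One possible way to turn component scores into size-based categories is simple thresholding, sketched below. The function name `size_segment`, the segment labels, and the cut-off of one standard unit are all hypothetical and are not taken from the dashboard. In practice the cut-offs would be tuned against the PC1 vs PC2 scatter plot.

```python
def size_segment(pc1, pc2, size_cut=1.0, shape_cut=1.0):
    """Label a product from its PC1 (overall size) and PC2 (height contrast) scores.

    The cut-offs are illustrative assumptions, not values from the analysis.
    """
    if pc1 >= size_cut:
        size = "large"
    elif pc1 <= -size_cut:
        size = "small"
    else:
        size = "medium"

    if pc2 >= shape_cut:
        shape = "tall"      # height dominates the other dimensions
    elif pc2 <= -shape_cut:
        shape = "flat"      # long or wide relative to height
    else:
        shape = "regular"

    return f"{size}/{shape}"

# Hypothetical scores for three products.
print(size_segment(2.0, 0.1))
print(size_segment(-0.2, 2.4))
print(size_segment(-1.5, -1.2))
```

Segments built this way cut across product categories, so two items from different categories can share the same packaging or storage class.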

2. Pricing Strategy

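The separation of price and physical dimensions in PC2 for order items suggests that price is not determined by physical size alone, which can inform pricing strategies, product bundling, and targeted marketing campaigns.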

3. Shipping Optimization

Understanding the main dimensions of variation in product sizes can help with packaging decisions, warehouse organization, and shipping cost optimization.

Summary of PCA Analysis

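Our Principal Component Analysis has revealed the main dimensions of variation in our e-commerce data. For product dimensions, overall size (PC1) and the contrast between height and the other dimensions (PC2) together explain about 80% of the variance. For order items, overall size and cost (PC1) and the contrast between price and physical size (PC2) together explain about 66% of the variance.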

These insights provide a foundation for more efficient inventory management, pricing strategies, and shipping optimization, which can in turn lead to cost savings and improved customer satisfaction.