Building the Analytical Dashboard with Power BI and R

An overview of the process of building an analytical dashboard using Power BI and R, focusing on data preparation, time series forecasting, and causal analysis.
Categories: BI, Statistics, ML, Visualization, R

Author: Aleksei Prishchepo

Published: October 26, 2025

In this article I’m going to share the process I followed in developing this analytical dashboard. My goal was to take the Consumer Financial Complaints Dataset and distill it into a set of visualizations, integrating statistical modeling and time series analysis.

For those who enjoy technical deep dives, the repository containing the Exploratory Data Analysis, Time Series Forecasting, and Causal Analysis documents is linked at the end of this post. Here is an overview of how the dashboard came together.

Data Preparation and Exploratory Data Analysis (EDA)

The entire project started with ensuring the data was robust and ready for analysis. I performed Exploratory Data Analysis (EDA) to check the data structure, examine distributions, and look for correlations.

  • Loading and Cleaning: I loaded two main sheets from the Excel file: the complaints data, which contained 62,516 rows and 19 columns, and the companies data, which contained 1,081 rows and 9 columns.

  • Joining Tables: To create a single comprehensive analytical dataset, I joined the complaints data and the companies data, resulting in a dataset of 62,516 rows and 27 variables; a sketch of both steps appears below.
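Here is a minimal sketch of these two steps in R, using readxl and dplyr. The file path, sheet names, and join key are assumptions; the row and column counts match those reported above.

```r
library(readxl)
library(dplyr)

# Load the two sheets (file and sheet names are assumptions)
complaints <- read_excel("complaints.xlsx", sheet = "Complaints")  # 62,516 x 19
companies  <- read_excel("complaints.xlsx", sheet = "Companies")   # 1,081 x 9

# Join company attributes onto each complaint; the key is assumed to be
# the company name, giving 19 + 9 - 1 = 27 variables
complaints_full <- complaints |>
  left_join(companies, by = "company")  # 62,516 rows, 27 variables
```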

Visualizing Distributions and Relationships

During EDA, I needed to check the univariate distributions of the numeric variables (Figure 1) and examine correlations between them to identify relationships and potential multicollinearity (Figure 2).
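Plots like Figures 1 and 2 can be produced in a few lines; here is a hedged sketch, assuming the joined complaints_full data frame from the previous step.

```r
library(dplyr)
library(tidyr)
library(ggplot2)

# Faceted histograms, one panel per numeric variable (cf. Figure 1)
complaints_full |>
  select(where(is.numeric)) |>
  pivot_longer(everything()) |>
  ggplot(aes(value)) +
  geom_histogram(bins = 30) +
  facet_wrap(~ name, scales = "free")

# Pairwise correlations between the numeric variables (cf. Figure 2)
complaints_full |>
  select(where(is.numeric)) |>
  cor(use = "pairwise.complete.obs")
```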

Figure 1: Distributions of Numeric Variables

Figure 2: Correlation Matrix of Numeric Variables
Note

See the Exploratory Data Analysis document for a detailed description of the EDA process.

Factor Analysis (Causal Modeling)

I performed causal analysis to estimate the effect of different company characteristics, such as enforcement history and size, on critical performance metrics like the complaint rate.

Full Linear Model

I built a linear regression model that included multiple covariates (enforcement status, company size tier, reputation score, and timely response rate), with the log of complaints_per_1pct_share as the target; a sketch of this fit appears below.

  • Key Finding: After running the full model, I determined that company size was the only variable that was statistically significant.
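A minimal sketch of this fit, assuming a company-level data frame company_data; apart from complaints_per_1pct_share, the covariate names are illustrative.

```r
# Full model: all candidate covariates, log-transformed target
# (covariate names are illustrative assumptions)
full_model <- lm(
  log(complaints_per_1pct_share) ~ enforcement_status + size_tier +
    reputation_score + timely_response_rate,
  data = company_data
)
summary(full_model)  # in this analysis, only the size terms were significant
```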

Reduced Model

I then created a simplified, reduced model using only company size as a predictor. This model exhibited strong quality metrics: an adjusted R² of 0.548 and a highly significant p-value of 1.143e-188. This suggests that company size alone explains a substantial portion of the variance in the complaint rate, with larger companies generally experiencing lower complaint rates than smaller ones.
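A sketch of the reduced fit, reusing the hypothetical company_data frame; broom::glance() returns the fit statistics tabulated below.

```r
library(broom)

# Reduced model: company size as the sole predictor
reduced_model <- lm(log(complaints_per_1pct_share) ~ size_tier,
                    data = company_data)

# One-row summary of the fit statistics reported in the table below
glance(reduced_model)[, c("r.squared", "adj.r.squared", "statistic", "p.value")]
```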

The model metrics confirmed the quality of the reduced regression model:

Reduced Model Quality Metrics
r_squared   adj_r_squared   f_statistic   p_value
0.5487796   0.5483614       1312.292      1.142991e-188
Note

See the full Causal Analysis document for details.

Dashboard Visuals: Model Comparison

The dashboard’s “Factor Analysis” tab transparently presents the modeling results, comparing the full and reduced models and highlighting the significant variable.

Models Comparison Table
Variable      Full         Reduced
F statistic   3.286E+002   1.312E+003
p value       9.721E-185   1.143E-188
R²            5.498E-001   5.488E-001
R² adj.       5.482E-001   5.484E-001

Advanced Techniques (Anomaly Detection and Clustering)

I also explored advanced techniques, though I ultimately chose to exclude their visuals from the final report to maintain focus and brevity.

Anomaly Detection

I used STL decomposition and z-score methods to identify anomalies in the monthly complaint volumes. This approach detected only two anomalies (one spike and one drop), so I decided against including the visual in the final Power BI report: such a small number of flagged points didn't add substantial new information at this stage.
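For reference, a minimal sketch of the approach; the date column name and the |z| > 3 cutoff on the STL remainder are assumptions.

```r
library(dplyr)
library(lubridate)

# Aggregate complaints to monthly counts (date column name is an assumption)
monthly <- complaints_full |>
  count(month = floor_date(as.Date(date_received), "month")) |>
  arrange(month)

volume_ts <- ts(monthly$n, frequency = 12,
                start = c(year(min(monthly$month)), month(min(monthly$month))))

# STL decomposition; flag months whose remainder z-score exceeds 3
fit <- stl(volume_ts, s.window = "periodic")
remainder <- fit$time.series[, "remainder"]
z <- (remainder - mean(remainder)) / sd(remainder)
monthly[abs(as.numeric(z)) > 3, ]  # the flagged spike and drop
```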

Figure 4: Time Series with Anomalies Highlighted
Note

See Anomaly Detection for full details.

Clustering Analysis

I performed K-means clustering on company features such as enforcement history, size, and complaint rate. Based on the silhouette scores, the optimal number of clusters was 3. These clusters effectively separated companies:

  • Cluster 1 (Large/Medium, low complaints, near-zero enforcement),

  • Cluster 2 (Small, high complaints, zero enforcement), and

  • Cluster 3 (Medium/Small, high complaints, some enforcement history).

However, since visualizing these clusters simply reinforced the patterns already established by the causal analysis (the importance of size), I chose to skip the clustering visualizations in the final dashboard.
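For completeness, here is a sketch of a silhouette-guided K-means workflow with the cluster package; the feature column names and the candidate range of k are assumptions.

```r
library(cluster)

# Standardize the company-level features (column names are assumptions)
features <- scale(company_data[, c("enforcement_actions", "company_size",
                                   "complaint_rate")])
d <- dist(features)

# Average silhouette width for k = 2..6; keep the best k
set.seed(42)
avg_sil <- sapply(2:6, function(k) {
  km <- kmeans(features, centers = k, nstart = 25)
  mean(silhouette(km$cluster, d)[, "sil_width"])
})
best_k <- (2:6)[which.max(avg_sil)]  # 3 in this analysis

clusters <- kmeans(features, centers = best_k, nstart = 25)$cluster
```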

Figure 5: The plot of clusters reveals the same patterns observed in the causal analysis
Note

See Clustering Analysis for full details.

Key Dashboard Visuals

The final dashboard provides users with an interactive, clear overview of the financial complaints landscape, structured across tabs for Overview, Geography, Products, Companies, and Factor Analysis.

Overview

The dashboard immediately shows key metrics.

Figure 6: Overview Metrics and Complaints by Product

Geography and Product Breakdowns

The Geography tab highlights totals by Census Region; for example, the South region accounted for 4K complaints, the West for 3K, and the Midwest for 1K.

Figure 7: Complaints by Region

In terms of products, the corresponding report section provides a breakdown of complaints by product, including year-over-year (YoY) changes.

Figure 8: Complaints by Product Details

Companies

This report section provides descriptive statistics about companies.

Figure 9: Company Statistics

Factor Analysis

This tab presents the results of the causal modeling, comparing the full and reduced models.

Figure 10: Factor Analysis

Conclusion

I hope this overview of the dashboard construction process, from initial data cleaning to final model integration, gives the reader a clear picture of the analytical work involved.

References