Correlation Matrix
ggplot2: geom_tile() + geom_text() · Package: ggplot2 + reshape2 · Variables: 3+ numerical variables
WHAT IS A CORRELATION MATRIX?
A correlation matrix visualizes the pairwise correlation coefficients between all numeric variables in a dataset as a color-coded grid. Each cell shows the strength and direction of the linear relationship between two variables, from -1 (perfect negative) to +1 (perfect positive); by default, R's cor() computes the Pearson coefficient. It is a common tool in exploratory data analysis, feature selection for machine learning, and detecting multicollinearity in regression, because blocks of strong color make strongly related variables easy to spot. To draw it with ggplot2, reshape the matrix returned by cor() into long format with reshape2::melt(), then plot one tile per variable pair with geom_tile() and overlay the values with geom_text().
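To make the reshaping step concrete, here is a minimal sketch of what the long format looks like. It assumes base R's as.table() as a stand-in for reshape2::melt() so that it runs without extra packages; the names cor_wide and cor_long are illustrative only.

```r
# Pairwise Pearson correlations for the first seven mtcars columns (7 x 7 matrix)
cor_wide <- round(cor(mtcars[, 1:7]), 2)

# Long format: one row per cell, with the two variable names and the coefficient.
# as.table() + as.data.frame() is used here as a base-R stand-in for reshape2::melt().
cor_long <- as.data.frame(as.table(cor_wide), responseName = "value")

head(cor_long, 3)   # columns: Var1, Var2, value
nrow(cor_long)      # 49 rows, one per cell of the 7 x 7 matrix
```

Each row of cor_long corresponds to one tile in the plot, which is why geom_tile() can map Var1 and Var2 to the axes and value to the fill color.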
BEST FOR
- Exploring variable relationships
- Feature selection
- Multicollinearity detection
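For feature selection or multicollinearity checks, it can help to list the strongly correlated pairs alongside the plot. The sketch below is one possible approach, not a prescribed method: the 0.8 cutoff and the names threshold and strong_pairs are assumptions chosen for illustration.

```r
cor_wide <- cor(mtcars[, 1:7])
threshold <- 0.8   # assumed cutoff; choose one that suits your analysis

# Look only above the diagonal so each pair appears once and self-pairs are skipped
idx <- which(upper.tri(cor_wide) & abs(cor_wide) > threshold, arr.ind = TRUE)

strong_pairs <- data.frame(
  var1 = rownames(cor_wide)[idx[, "row"]],
  var2 = colnames(cor_wide)[idx[, "col"]],
  r    = round(cor_wide[idx], 2)
)

# Strongest relationships first
strong_pairs[order(-abs(strong_pairs$r)), ]
```

Pairs that show up here are candidates for closer inspection before fitting a regression, since highly correlated predictors can make coefficient estimates unstable.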
AVOID WHEN
- Non-numeric data
- Very few variables
- When causation matters more than correlation
R + GGPLOT2 CODE EXAMPLE
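library(ggplot2)

# Compute pairwise correlations, round to 2 decimals, and reshape to long format
cor_mat <- reshape2::melt(round(cor(mtcars[, 1:7]), 2))

# One tile per variable pair, colored by correlation and labeled with its value
ggplot(cor_mat, aes(x = Var1, y = Var2, fill = value)) +
  geom_tile() +
  geom_text(aes(label = value), size = 3) +
  # Fix the scale to the full -1..1 range so white always means zero correlation
  scale_fill_gradient2(low = "blue", mid = "white", high = "#ff6a00",
                       limits = c(-1, 1)) +
  labs(title = "Correlation Matrix", x = NULL, y = NULL)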
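Because a correlation matrix is symmetric, every off-diagonal value appears twice in the full grid. A possible variation, sketched here under the assumption that you want only the lower triangle, is to blank the mirrored half before reshaping. The data preparation uses base R only, and the name cor_lower is illustrative.

```r
cor_wide <- round(cor(mtcars[, 1:7]), 2)

# Blank the upper triangle; the diagonal and lower triangle are kept
cor_wide[upper.tri(cor_wide)] <- NA

# Reshape to long format and drop the blanked cells
cor_lower <- na.omit(as.data.frame(as.table(cor_wide), responseName = "value"))

nrow(cor_lower)   # 28 rows: 21 unique pairs plus 7 diagonal cells
```

Since cor_lower has the same Var1, Var2, and value columns as the melted data frame, it can be passed to the same ggplot() call in place of cor_mat.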