## Base plot function: “pairs”

You may be aware of the `pairs` plotting function in R. Here is an example with the `Boston` data set. We will use just a few variables, since there isn’t enough room here to show the plots for all of them.

    library(MASS)  # the Boston data set ships with the MASS package
    miniBoston <- Boston[, c("crim", "lstat", "rm", "medv")]

Let us inspect this new data frame:

    > head(miniBoston)
         crim lstat    rm medv
    1 0.00632  4.98 6.575 24.0
    2 0.02731  9.14 6.421 21.6
    3 0.02729  4.03 7.185 34.7
    4 0.03237  2.94 6.998 33.4
    5 0.06905  5.33 7.147 36.2
    6 0.02985  5.21 6.430 28.7

Let’s plot this with the `pairs` function now:

    pairs(miniBoston)

This produces the following plot:

This is great for seeing associations between all the variables. However, with many variables and many observations, the plot takes a long time to generate and wastes a lot of space, because each pair of variables is plotted twice (once on each side of the diagonal). That is a poor trade-off when all you are interested in is how one specific variable is related to all the others.
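To put rough numbers on the wasted space: a scatterplot matrix of p variables draws p × (p − 1) scatter panels, half of which mirror the other half, whereas plotting one variable against the rest needs only p − 1. The sketch below is plain arithmetic rather than part of the analysis itself; the helper name `panel_counts` is made up for illustration, and 14 is assumed to be the column count of `Boston` as shipped in MASS.

```r
# Hypothetical helper: count the scatter panels needed for p variables.
panel_counts <- function(p) {
  c(pairs_panels = p * (p - 1),   # off-diagonal cells of the p x p grid
    unique_pairs = choose(p, 2),  # distinct variable pairs (each drawn twice)
    one_vs_rest  = p - 1)         # panels for a single variable of interest
}

panel_counts(4)   # the four columns of miniBoston
panel_counts(14)  # all columns of Boston (assumed to be 14)
```

For the four columns used here that is 12 panels against 3; for all 14 columns it would be 182 against 13.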

## Sensible Alternative with reshape2 and ggplot2

If you do not have reshape2 and ggplot2 installed already, install them first:

    install.packages("reshape2")
    install.packages("ggplot2")

Let us say we are interested in how the variable `medv` is related to the other variables. We proceed with:

    library(reshape2)
    meltBoston <- melt(miniBoston, "medv")

If you inspect `meltBoston`, you see:

    > str(meltBoston)
    'data.frame':   1518 obs. of  3 variables:
     $ medv    : num  24 21.6 34.7 33.4 36.2 28.7 22.9 27.1 16.5 18.9 ...
     $ variable: Factor w/ 3 levels "crim","lstat",..: 1 1 1 1 1 1 1 1 1 1 ...
     $ value   : num  0.00632 0.02731 0.02729 0.03237 0.06905 ...
    > head(meltBoston)
      medv variable   value
    1 24.0     crim 0.00632
    2 21.6     crim 0.02731
    3 34.7     crim 0.02729
    4 33.4     crim 0.03237
    5 36.2     crim 0.06905
    6 28.7     crim 0.02985
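To make the reshaping less of a black box, here is a minimal base-R sketch that builds the same long layout by hand: the `medv` column is repeated once per remaining variable, and the other columns are stacked into a single `value` column labelled by `variable`. It is only an illustration of the idea, not how reshape2 is implemented; `toy` is a made-up three-row stand-in for `miniBoston`, filled with the first rows shown earlier.

```r
# Toy stand-in for miniBoston (first three rows from the head() output above).
toy <- data.frame(
  crim  = c(0.00632, 0.02731, 0.02729),
  lstat = c(4.98, 9.14, 4.03),
  rm    = c(6.575, 6.421, 7.185),
  medv  = c(24.0, 21.6, 34.7)
)

# Every column except the one we keep as an identifier.
predictors <- setdiff(names(toy), "medv")

# Stack the predictor columns on top of each other, repeating medv alongside.
long <- data.frame(
  medv     = rep(toy$medv, times = length(predictors)),
  variable = factor(rep(predictors, each = nrow(toy)), levels = predictors),
  value    = unlist(toy[predictors], use.names = FALSE)
)

str(long)
```

With three rows and three stacked columns, `long` has nine rows, which mirrors how the 506 rows of `Boston` become the 1518 rows reported by `str(meltBoston)`.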

Now we can plot like this:

    library(ggplot2)
    ggplot(data = meltBoston, aes(x = value, y = medv)) +
      geom_point(size = 0.3, pch = 1) +
      facet_wrap(~ variable, ncol = 3, scales = "free_x")

You get a plot like this, which uses the space more effectively: