Summarising data in R

Summarising data

One of the most frequent tasks I do is summarising data using either proc sql or proc means with code like this:

proc means data=inputdata nway missing noprint;
    class var1 var2;
    var var3 var4;
    output out=outputdata (drop = _type_ _freq_) sum=;
run;

Given that I use it in SAS a lot, I’m going to assume that I’ll use it in R a lot too, so it seems like the next sensible thing to learn.
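
As a rough idea of where this is heading, here is a minimal sketch of one way the proc means step above might look in base R, using `aggregate()`. The toy `inputdata` data frame is made up purely for illustration; only the names `var1` to `var4`, `inputdata` and `outputdata` come from the SAS code. This is not necessarily the approach the full post settles on.

```r
# Toy stand-in for `inputdata` (values invented for illustration only).
inputdata <- data.frame(
  var1 = c("a", "a", "b", "b"),
  var2 = c("x", "x", "y", "z"),
  var3 = c(1, 2, 3, 4),
  var4 = c(10, 20, 30, 40)
)

# Sum var3 and var4 within each var1/var2 combination,
# roughly what `class var1 var2; var var3 var4; output ... sum=;` does.
outputdata <- aggregate(cbind(var3, var4) ~ var1 + var2,
                        data = inputdata, FUN = sum)

print(outputdata)
```

One difference worth knowing: the formula interface of `aggregate()` drops rows containing `NA` by default, whereas the `missing` option in the SAS step keeps missing class values as their own group, so the two are not exact equivalents.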

First attempt at simple analysis in R - Part 1

It’s time to start some analysis, albeit very basic analysis. I want to look at the relationship between the ONS Rural score and the average broadband speed. This will be done using the postcode file created in my previous post. I’m assuming that the more rural a place is, the slower its broadband will be. Is this actually the case?

The aim of this exercise is to learn some R skills, not to do rigorous analysis. This means that some rather broad and potentially foolish assumptions will be made about the data to make things easier to code given my novice R skills.

Creating a postcode file in R from different public data sources

Following on from my previous post on creating a postcode file with the 2015 general election results, I wanted to create a larger file with more variables, some from the ONS lookups and others from different public datasets. The one I have added so far is the Office of Communications Broadband Coverage dataset from 2013.

The final dataset will contain for each postcode in the UK:

  • The 2015 general election result
  • The Westminster Election Constituency
  • The Easting and Northing coordinates
  • Census lookup areas
  • Rural Indicator
  • Broadband coverage data
  • Which (if any) national park the postcode is in
