Terry Pratchett and the Clacks Overhead

"You know they'll never really die while the Trunk is alive[...]
It lives while the code is shifted,
and they live with it, always Going Home."

- Moist von Lipwig, Going Postal, Chapter 13

It is a year and a day since Sir Terry Pratchett died, author of the Discworld series amongst other things, died. But it is nice to see that a year on that he still lives on, not just through his books but through the Clacks Overhead. This was a little movement by web developers to keep his memory alive.

Read More

The Dog and the Duck

Time is of the essence with this post so please excuse the strange mix of notations later on. One of my friends sent me this puzzle from fivethirtyeight.com and here is my solution to how the dog needs to catch the duck.
Lots of peoples first instinct is times faster but this is really just a lower bound, i.e. if the dog travels less than π times faster than the duck then the duck can just read radially to the antipoint of the dogs starting position.
There are a few key assumptions here that we can make from the beginning:

Read More

Looping over a string in SAS

In SAS it is easy to loop a macro between two numbers

%DO I=1 %TO 10;

But if you have a list of non-sequential  numbers or text you want to run your macro over, e.g. a list of towns, it can be a bit trickier. This SUGI paper gives a macro which lets you do just that. A few years ago I looked at that macro and didn't really understand how it worked and not wanting to use code I didn't understand I wrote my own version.

Read More

Further explorations into choropleth maps in Excel

I’ve been trying to improve my Excel choropleth map spreadsheet from my first post.

The first thing I tried was to try to update it using some maps created on ClearlyandSimply.com. So I’ve created two new versions, one for Europe and one of the World. Both using the maps from ClearlyandSimply.com, with a few small alterations to the code so that on hovering over the map it tells you the country not the abbreviation. I did try to make my own UK maps of Constituency boundaries and Counties but using this technique but no matter how hard I tried I couldn’t get it to work. If you do get it to work, please let me know. Note that in both the Europe and World map the data may not be accurate and is only there for illustrative purposes.

Read More

Pivot Chart Tracker

Sales mix by sale type
One of the dashboard charts

I was setting up some trackers at work the other day using some OLAP cubes in Excel across a number of different variables (about 20) to track monthly sales which I could refresh each month. Once I’d set the sales tracker up I realised that I wanted to look at average price across the same variables so I made a copy of the spreadsheet and went through each of the pivot tables changing  them to track average price. When I then wanted to look at sales mix (%sales that month) I thought there must be a better way so decided to write some VBA to do all of this for me.
Read More

Summarising data in R

Summarising data

One of the most frequent tasks I do is summarising data using either proc sql or proc means with code like this:

proc means data=inputdata nway missing noprint;
    class var1 var2;
    var var3 var4;
    output out=outputdata (drop = _type_ _freq_) sum=;

Given that I use it in SAS a lot I’m going to assume that I’ll use it in R a lot so it seems like the next sensible thing to learn.
Read More

UK Train Data

One thing I keep on wanting to get around to looking at is what publicly available data is available on the trains in the UK and the answer to a few questions?

  • What proportions of the trains in the UK are run by foreign governments?
    • At a rough guess it is in the region of 20% based on not very much
    • How would this be measured? Passenger miles?
    • Why can the French, Dutch and German governments run UK rail franchises but not the UK government? It makes no sense!
  • What subsidies are given to the different franchises?
  • Are the original tenders released publicly?
    • Do the train companies live up to these documents?

Read More

First attempt at simple analysis in R - Part 1

It’s time to start some analysis, albeit very basis analysis. I want to look at the interaction between the ONS Rural score and the average Broadband speed. This will be done using the postcode file created in my previous post. I’m assuming that the more rural a place is the slower its broadband will be. Is this actually the case?

The aim of this exercise is to learn so R skills not do some rigorous analysis. This means that some rather broad and potentially foolish assumptions will be made with the data to make some things easier to code given my novice R skills.
Read More