Friday, July 31, 2015

Machine learning visual experiment


R2D3 is an experiment in expressing statistical thinking or machine learning with interactive visual design. In machine learning, computers apply statistical learning techniques to automatically identify patterns in data. These techniques can be used to make highly accurate predictions.
In this case the authors, Stephanie and Tony, design a machine learning process to determine whether a house of unknown location is in San Francisco or New York, based on other parameters such as altitude, price, sq feet,…



In most cases they seem to apply a decision tree rather than a regression: at each step the data is split on one variable, because high or low values of it (high altitude, for example) are more likely to come from one city than the other. By applying these splits sequentially across the variables and training on recorded data, the authors can get a very accurate estimate of the location of any given house, provided the explanatory variables are available.
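
To make the idea of splitting on one variable concrete, here is a minimal sketch in Python. It is not the authors' code: the toy data, the `homes` list and the `best_threshold` helper are all hypothetical, and the rule is simply "above the threshold means San Francisco". It tries each midpoint between consecutive elevations and keeps the one that classifies the most homes correctly.

```python
# Hypothetical toy data, NOT the R2D3 dataset: (elevation in metres, city).
homes = [
    (3, "NY"), (8, "NY"), (12, "NY"), (25, "NY"),
    (20, "SF"), (45, "SF"), (70, "SF"), (110, "SF"),
]


def best_threshold(samples):
    """Return (threshold, accuracy) for the rule 'value > threshold -> SF'."""
    best = (None, 0.0)
    values = sorted({v for v, _ in samples})
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        # A home is classified correctly when the rule agrees with its label.
        correct = sum((v > t) == (city == "SF") for v, city in samples)
        acc = correct / len(samples)
        if acc > best[1]:  # keep the first threshold reaching the best accuracy
            best = (t, acc)
    return best


threshold, accuracy = best_threshold(homes)
print(f"split at {threshold} m, accuracy {accuracy:.3f}")
```

A single split like this misclassifies some homes; repeating the same search on each resulting group, with other variables such as price, is what builds up a full tree.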

Saturday, July 4, 2015

Coloured time series

Here is an idea (I saw it on Flowingdata.com) for graphing many time series together in a simple, well-designed way.

The thing is that when you try to put many graphs together, the Y-axis of each one becomes too short and changes are really hard to see.


This kind of graph can be converted into a colour graph, which is much easier to understand. The Y-axis is fixed to a given level for all graphs. If one of the time series goes above that level, a darker area starts at that point, drawn again from Y=0. Negative numbers can be drawn in a reddish colour.

Looking at the colours makes it much clearer when a time series is up or down.
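
The folding described above (this looks like what is usually called a horizon chart) can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function name `horizon_bands` and the parameters `band_height` and `n_bands` are made up for the example. For each value it returns its sign and how much of each band is filled, where band 0 is the lightest colour and each later band is a darker layer restarting from Y=0.

```python
def horizon_bands(values, band_height, n_bands):
    """Split each value into stacked colour bands of fixed height.

    Returns one (sign, fills) pair per value. `fills[k]` is the height
    drawn in band k, between 0.0 and band_height. Magnitudes larger than
    band_height * n_bands are clipped to the darkest band.
    """
    rows = []
    for v in values:
        sign = "neg" if v < 0 else "pos"  # negatives would be drawn reddish
        magnitude = abs(v)
        fills = []
        for band in range(n_bands):
            # Part of the magnitude that falls inside this band.
            fill = min(max(magnitude - band * band_height, 0.0), band_height)
            fills.append(fill)
        rows.append((sign, fills))
    return rows


bands = horizon_bands([0.5, 1.5, 2.5, -0.4], band_height=1.0, n_bands=3)
for sign, fills in bands:
    print(sign, fills)
```

Each band can then be drawn as a filled area in a progressively darker shade, so every series keeps the same short Y-axis while large values still stand out.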