Sunday, January 1, 2012

Zipf

An interesting question about geography and population is how citizens distribute themselves within a country. Most of them obviously live in cities, but how big is the biggest city compared to the rest? Is the population more concentrated than before?

Zipf's law is “an empirical law formulated using mathematical statistics, refers to the fact that many types of data studied in the physical and social sciences can be approximated with a Zipfian distribution” (Wikipedia). It also says “The appearance of the distribution in rankings of cities by population was first noticed by Felix Auerbach in 1913. Empirically a data set can be tested to see if Zipf's law applies by running the regression log R = a - b log n where R is the rank of the datum, n is its value and a and b are constants. Zipf's law applies when b = 1. When this regression is applied to cities a better fit has been found with b = 1.07.”
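To make the regression concrete, here is a minimal sketch in Python of how b could be estimated for one list of city populations. The function name zipf_coefficient and the sample sizes are my own illustration, not part of the original analysis; the sizes are synthetic and exactly Zipfian (n = C / R), so the fit should return b close to 1.

```python
import numpy as np

def zipf_coefficient(populations):
    """Fit log R = a - b log n by ordinary least squares; return (b, a)."""
    # Sort city sizes from largest to smallest so rank 1 is the biggest city.
    n = np.sort(np.asarray(populations, dtype=float))[::-1]
    rank = np.arange(1, len(n) + 1)
    # polyfit returns (slope, intercept) for degree 1; the slope equals -b.
    slope, intercept = np.polyfit(np.log(n), np.log(rank), 1)
    return -slope, intercept

# Synthetic, perfectly Zipfian sizes (an assumption for illustration only).
sizes = [1_000_000 / r for r in range(1, 9)]
b, a = zipf_coefficient(sizes)
print(f"b = {b:.3f}, a = {a:.3f}")
```

Natural logarithms are used here; a different base changes a but not b.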

I analyzed the 8 biggest cities in each of 122 countries to see whether 1.07 holds. I found that the average b coefficient across those 122 countries is 0.36, far enough from 1 that I did not bother to test whether the difference from 1 is statistically significant.
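The per-country calculation could look roughly like the sketch below: keep the 8 biggest cities of each country, fit b for each, and average the results. The country names and populations are made up for illustration (they are not the Maxmind data), and the helper zipf_b is hypothetical.

```python
import numpy as np

def zipf_b(populations, top=8):
    """Estimate b in log R = a - b log n using only the `top` largest cities."""
    n = np.sort(np.asarray(populations, dtype=float))[::-1][:top]
    rank = np.arange(1, len(n) + 1)
    slope, _ = np.polyfit(np.log(n), np.log(rank), 1)
    return -slope

# Made-up city populations per country (illustration only).
cities = {
    "Country A": [1_000_000 / r for r in range(1, 11)],     # sizes fall as 1/R
    "Country B": [1_000_000 / r**2 for r in range(1, 11)],  # sizes fall as 1/R^2
}

per_country = {name: zipf_b(pops) for name, pops in cities.items()}
average_b = float(np.mean(list(per_country.values())))

for name, coef in per_country.items():
    print(f"{name}: b = {coef:.2f}")
print(f"average b = {average_b:.2f}")
```

With real data, the dictionary would hold one list of city populations per country, and the average would run over all 122 of them.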

For those who want to recalculate it, the data comes from Maxmind.com.

I assumed that the Zipf coefficient depends mainly on how poor and how big the country is. In theory, the poorer the country, the lower the Zipf coefficient (less concentration and more rural population). Moreover, the bigger the country, the more dispersed the population. So Russia, for example, should show a more dispersed population.

The results of the regression are: (the Zipf coefficient is negative, so remember to flip the sign of the explanatory variables' coefficients.)
