Signed chi-squares

While I was a member of the Census Research Unit of the University of Durham, working on People in Britain – a census atlas, I found that the ratio maps were totally misleading. Moreover, it was only possible to use the colour cue to map patterns in Great Britain at the 1 km grid square resolution (see below). I therefore tried to accommodate all three of Sally’s requirements in a single mapping measure. The signed chi-square measure was designed to achieve this. However, as this metric measures the magnitude of deviation of observed values (O) from the expected values (E), there is a need to state the value for expectations up front.
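The formula itself is not spelled out above. As a minimal sketch, assuming the commonly used form sign(O - E) × (O - E)² / E, with E obtained by applying an expected rate to each area's base population, the measure might be computed as follows. The function name, its parameters and the example figures are illustrative assumptions, not taken from People in Britain.

```python
def signed_chi_square(observed, base_population, expected_rate):
    """Signed chi-square for one area, in an ASSUMED form:
    sign(O - E) * (O - E)**2 / E, where E = base_population * expected_rate.
    """
    expected = base_population * expected_rate  # E, e.g. from the national average
    if expected <= 0:
        return 0.0  # assumption: an empty area carries no evidence either way
    deviation = observed - expected  # O - E
    magnitude = deviation ** 2 / expected
    return magnitude if deviation >= 0 else -magnitude


# Made-up figures: (unemployed males, economically active males).
small_area = (2, 4)       # ratio 50%, but only two people
large_area = (300, 2000)  # ratio 15%, but 300 people
for unemployed, active in (small_area, large_area):
    print(unemployed / active, signed_chi_square(unemployed, active, 0.05))
```

With an expected rate of 5%, the small area has by far the higher ratio, yet the large area gets the higher signed chi-square score, because the measure weighs the size of the deviation against the size of the population at risk.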

People in Britain was only aiming to provide descriptive maps of circumstances as they were in 1971; i.e. it was only intended to map the outcome of past processes. For this, statistical averages (such as the national average) were sufficient. However, for prescriptive and diagnostic purposes the values for expectation need to reflect national aspirations.

Here is a map from People in Britain, showing the distribution of male unemployment. Red and blue represent low and high unemployment respectively. The experimental 1 km population census data facilitated the mapping of population trends at a resolution never seen before.

Signed chi-square has also been used (by researchers who could program) with other enumerated data, for example in epidemiological studies. Unfortunately, as Bob Barr noted in Dale and Marsh (1993), it is not available within proprietary GIS systems.

Returning to unemployment in Humberside, we find that the signed chi-square map brings male unemployment into crisper focus. The worst 5% of areas are much more concentrated in the inner areas of Hull and Grimsby and some (not all) outer council estates; the seasonal unemployment in Bridlington is also picked up. The two cases with extreme ratios have now become too prominent to be true and made me visit the map library.

The prime areas with good employment prospects in 1971 are indicated by red arrows. They include Scunthorpe, Beverley, West Hull and Cleethorpes.
The distribution pattern is closer to local perception of good and bad areas; perceptions which were being reflected by the residential preferences of those who enjoyed better employment prospects.

Some implications of changing the metrics can be studied using Lorenz curves. To construct these curves:

We sort the data units into ascending order on each metric. We then normalise the ranks to make the graphs comparable and show the ranks along the x-axis.

For each metric, we construct a line graph which shows the cumulative percentage of unemployed males, starting with 0% and ending with 100% of unemployed males.
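The two steps above can be sketched as follows. This is a minimal illustration rather than the program used for the graphs discussed here; the function name, the dictionary keys and the four made-up areas are assumptions.

```python
def lorenz_points(areas, metric):
    """Return (normalised rank %, cumulative % of unemployed males) pairs.

    areas  -- list of dicts with (assumed) keys 'unemployed' and 'active'
    metric -- function giving each area's score; areas are sorted ascending
    """
    ordered = sorted(areas, key=metric)
    total = sum(a["unemployed"] for a in ordered)
    if total == 0:
        raise ValueError("no unemployed males in the data")
    points = [(0.0, 0.0)]  # every curve starts at 0%
    running = 0
    for rank, area in enumerate(ordered, start=1):
        running += area["unemployed"]
        # Ranks are normalised to 0-100 so curves for different metrics are comparable.
        points.append((100.0 * rank / len(ordered), 100.0 * running / total))
    return points


# Made-up areas, for illustration only.
areas = [
    {"unemployed": 0, "active": 10},
    {"unemployed": 2, "active": 4},
    {"unemployed": 8, "active": 400},
    {"unemployed": 90, "active": 900},
]
by_count = lorenz_points(areas, lambda a: a["unemployed"])
by_ratio = lorenz_points(areas, lambda a: a["unemployed"] / a["active"])
```

In this toy data, the worst quarter of areas (a single area) holds 90% of the unemployed when areas are sorted on counts, but only 2% when they are sorted on ratios.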

These graphs show the dramatic change in the ranking of areas on a good to bad scale with a change of metrics. Look initially at the blue line which represents areas sorted on the number of unemployed males. This is extremely bowed, which shows that the distribution of male unemployment is very skewed. Some 60% of unemployed males live in the worst 5% of areas (demarcated by the red vertical line).

Now look at the red graph, for the same areas sorted on ratio values. With ratios, only 10% of the unemployed would be found in the worst areas. This implies that governments using ratio measures can redress poverty in the worst areas with a much smaller budget. However, the mass of the unemployed males would continue to live in non-target areas. Indeed, many researchers have been pointing out that the bulk of the deprived live outside the designated action areas.

If we now looked at the plot for signed chi-squares, the worst 5% of areas would include 47% of the unemployed since we have attempted to balance numbers and ratios.

I must say that I was perplexed by this graph when I first saw it. If we look at the best areas, we see something strange. The red arrow in the middle shows that 45% of km squares have full employment. Yet, the best 10% of areas on chi-score contain some 13% of the unemployed.

To understand this, we need to look also at the Lorenz curves for employed males, i.e. the complement of the unemployed.

Note now that the 45% of areas with full employment are not only geographically dispersed as we saw on the maps, they also contain only 6% of working males. In contrast, the best 10% of areas on signed chi-squares contain 30% of the employed males. So, if you are in direct marketing, signed chi-squares is clearly a better indicator than ratios for you.

These curves mean very little without reference to the geographic maps of the distribution of competing ‘worst areas’ on various metrics, which I have already shown. Isolated pictures are easily misread!

My research with signed chi-squares showed me that there was scope for simultaneous consideration of all 3 requirements outlined by Sally Holtermann. This metric seems to give a better indication of the location of worst cases of male unemployment. However:

It is an arbitrary formulation which requires further research.
It was designed for ranking – not scaling population units.

Most importantly, the ranking depends on the values for expectation.
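A small made-up example illustrates this dependence. It assumes that the signed chi-square takes the common form sign(O - E) × (O - E)² / E; the function names and the figures are hypothetical.

```python
def signed_chi_square(observed, base_population, expected_rate):
    """ASSUMED form: sign(O - E) * (O - E)**2 / E."""
    expected = base_population * expected_rate
    if expected <= 0:
        raise ValueError("expected count must be positive")
    deviation = observed - expected
    magnitude = deviation ** 2 / expected
    return magnitude if deviation >= 0 else -magnitude


# Made-up areas: name -> (unemployed males, economically active males)
areas = {"A": (10, 100), "B": (60, 1000)}


def worst_area(expected_rate):
    """Name of the area with the highest signed chi-square score."""
    return max(areas, key=lambda name: signed_chi_square(*areas[name], expected_rate))


print(worst_area(0.04))   # expectation of 4%
print(worst_area(0.055))  # expectation of 5.5%
```

With a 4% expectation, area B ranks worst; raising the expectation to 5.5% makes area A the worst, although neither area's data have changed.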

Statistical averages may be adequate for indicating endemic patterns resulting from past processes. In social planning, we need to measure deviation from target levels for unemployment. This clearly is dependent on the state of the economy and other socio-political influences.

The worst cases on the signed chi-square maps for Hull also show something else. For example, the council estate (to the north of Hull) had many symptoms of deprivation. But, in 1971, this was indicative of people poverty rather than place poverty. Many of the people on this estate had been moved out of more run down, small houses with no inside toilets or bathrooms into what was regarded as better accommodation. So, even the signed chi-square maps could be misread without local knowledge.

The Institution of Analysts and Programmers
