The Zung Jung Average (中庸平均值)

An average which balances the characteristics of the mean and the median

(English version)

Alan Kwan

This article is not about mahjong; it is a mathematical paper.

Introduction

Two commonly used "averages" are the arithmetic mean and the median. The arithmetic mean (hereafter simply "mean") has the advantage that all data values are taken into account, but has the weakness that it is much affected by extreme data values. The median, on the other hand, has the advantage that it is not affected by extreme data values, but the weakness that it largely "ignores" them.

This paper proposes the Zung Jung average, which tries to balance the two. It takes all data values into account while reducing the impact of extreme data values.

Observations

We observe that the mean satisfies the following equation:

sum ( xi - mean(X) ) = 0

The median, in turn, satisfies the following equation:

sum ( sgn( xi - median(X) ) ) = 0

The idea of the Zung Jung average is to use a function which reduces the impact of large differences ( xi - y ) without going so far as to reduce them all to unity, as the sign function does. Hence we use the square root.
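The two equations above can be checked numerically. The following is a minimal sketch on a hypothetical sample (the data and the helper name sgn are assumptions, not from the original). The values are chosen to be distinct, since with ties at the median the sign sum need not be exactly zero.

```python
from statistics import mean, median

def sgn(d):
    """Sign function: returns -1, 0 or +1."""
    return (d > 0) - (d < 0)

# Hypothetical sample with one extreme value (not from the article).
data = [1, 2, 4, 8, 100]

# Sum of raw differences from the mean.
mean_residual = sum(x - mean(data) for x in data)

# Sum of the signs of the differences from the median.
sign_residual = sum(sgn(x - median(data)) for x in data)

print(mean(data), mean_residual)
print(median(data), sign_residual)
```

The extreme value 100 pulls the mean far above most of the data, while the median only registers that 100 lies on the upper side.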

Definition

The Zung Jung average ZJ(X) is defined as the value y which satisfies the following equation:

sum ( sgn( xi - y ) * sqroot(abs( xi - y )) ) = 0

In other words, instead of using the raw value of the differences (as for the mean) or applying the sign function (as for the median), we take their square roots (while preserving the signs).

For a couple of simple examples, if the data values are [0, 0, 1], the Zung Jung average is 1 / (2^2 + 1) = 0.2. If the data values are [0, 0, 0, 1], the Zung Jung average is 1 / (3^2 + 1) = 0.1.
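The first value can be checked by hand. Assuming 0 < y < 1, each of the two zeros contributes -sqrt(y) to the sum and the single 1 contributes +sqrt(1 - y), so the defining equation reduces to:

```latex
\begin{aligned}
-2\sqrt{y} + \sqrt{1 - y} &= 0 \\
1 - y &= 4y \\
y &= \frac{1}{2^2 + 1} = 0.2
\end{aligned}
```

The same steps with n zeros and a single 1 give y = 1 / (n^2 + 1), which yields 0.1 for n = 3.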

Computation

The above examples are by no means representative of typical cases. For typical data containing more than two distinct values, there is no known closed-form formula for the Zung Jung average. The best method of computing it seems to be iterated linear interpolation: evaluate the left-hand side of the above equation at successive values of y until one is found that gives a result close enough to zero. Since the left-hand side is a continuous and strictly decreasing function of y, this method should be safe and efficient. For most data, the mean and the median should be good starting values for y. However, in some cases the mean and the median can be identical or very close while the Zung Jung average does not lie between the two; in such cases, some other starting values should be used.
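A minimal sketch of such a numerical search follows. Note that it uses plain bisection rather than the linear interpolation described above, as a simpler stand-in. It also brackets the root with the smallest and largest data values instead of the mean and the median: the left-hand side is non-negative at the minimum and non-positive at the maximum, so the root always lies between them. The function names and the tolerance are assumptions made for illustration.

```python
import math

def zj_residual(data, y):
    """Left-hand side of the defining equation, evaluated at y."""
    return sum(math.copysign(math.sqrt(abs(x - y)), x - y) for x in data)

def zung_jung_average(data, tol=1e-12, max_iter=200):
    """Approximate ZJ(X) by bisection on [min(data), max(data)]."""
    lo, hi = min(data), max(data)
    for _ in range(max_iter):
        if hi - lo <= tol:
            break
        mid = (lo + hi) / 2
        if zj_residual(data, mid) > 0:
            lo = mid  # residual still positive: the root is above mid
        else:
            hi = mid  # residual zero or negative: the root is at or below mid
    return (lo + hi) / 2

print(zung_jung_average([0, 0, 1]))
print(zung_jung_average([0, 0, 0, 1]))
print(zung_jung_average([-4, -1, 0, 2, 3]))
```

The last call illustrates the caveat about starting values: for the hypothetical data [-4, -1, 0, 2, 3] the mean and the median are both 0, yet the left-hand side is positive at y = 0, so the Zung Jung average lies slightly above 0. Bracketing by the minimum and maximum handles this case without special treatment.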

The computation is extremely tedious if carried out by hand, even for a small number of data points; this is the main drawback of the Zung Jung average. However, the computation can easily be performed on a computer.

Applications

The Zung Jung average balances the characteristics of the arithmetic mean and the median, and combines the strengths of both: it takes every data value into account while limiting the influence of extreme values. It can therefore serve as a good, representative average for many practical purposes.

Higher Orders

A variation is to use higher-order roots instead of the square root. This makes the average behave more like the median and less like the mean:

sum ( sgn( xi - y ) * (abs( xi - y ))^(1/r) ) = 0

Of particular interest is the cube root (r=3), because for odd r the real r-th root preserves the sign automatically, so the sgn and abs functions are no longer needed:

sum ( ( xi - y )^(1/3) ) = 0

Whether this allows for a more convenient method of calculating the average value is an open topic for further research.

Note that when r=1, we simply have the arithmetic mean. As r approaches infinity, the average approaches the median.
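The effect of the order r can be illustrated with a small sketch. It solves the order-r equation by bisection between the smallest and largest data values; the function name, the iteration count, and the choice of bisection are assumptions made for illustration, not part of the original method.

```python
import math

def zj_average_order(data, r, iterations=200):
    """Approximate the order-r average by bisection (hypothetical helper)."""
    def residual(y):
        # Signed r-th root of each difference, summed over the data.
        return sum(math.copysign(abs(x - y) ** (1.0 / r), x - y) for x in data)

    lo, hi = min(data), max(data)
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if residual(mid) > 0:
            lo = mid  # root is above mid
        else:
            hi = mid  # root is at or below mid
    return (lo + hi) / 2

data = [0, 0, 1]
for r in (1, 2, 3, 10):
    print(r, zj_average_order(data, r))
```

For the data [0, 0, 1] the equation can also be solved by hand, giving 1 / (2^r + 1): this is the mean 1/3 for r=1, the Zung Jung average 0.2 for r=2, and a value that shrinks toward the median 0 as r grows.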



Alan KWAN Shiu Ho / tarot@netvigator.com / created 24 Jan 2009

© 2009 Alan KWAN Shiu Ho