Welford’s Algorithm

Most of the time, general statistics calculations are easy. Take all data points you have, find the average, the standard deviation, and you have 90% of stuff you need to present a result in a reasonable and familiar manner. However, what if you have streaming data?

Well, then you have a Welford's method. This magical algorithm enables you to calculate both average and standard deviation as data arrives without wasting a lot of memory accumulating it all.

So, of course, I wrote a C# code for it. To use it, just add something like this:

var stats = new WelfordVariance();
while(true) {
stats.Add(something.getValue());
output.SetMean(stats.Mean);
output.SetStandardDeviation(stats.StandardDeviation);
}

Algorithm is not perfect and will sometime slightly differ from classically calculated standard deviation but it's generally within a spitting distance and uses minimum of memory. It's hard to find better bang-for-buck when it comes to large datasets.

Leave a Reply

Your email address will not be published. Required fields are marked *