Wednesday, November 21, 2012

Processing a Moving Average using a Window Function with a Perl script

I typically collect a lot of data from a wide variety of devices using SNMP and store it in flat files, which a cron job then parses and plots with gnuplot.  This lets me visualize the data and spot trends or anomalies.  Sometimes it is useful to see the data averaged over a specific window size, and I use the following Perl script to do that.  It steps through the data one window at a time and consolidates the data points inside each "window" into a single value: the average of those points.  Each output line carries the timestamp of the last point in its window, and any points left over at the end that do not fill a complete window are dropped.

The following command would read inputfile, compute the average for a window of size 3, then write the result to outputfile.

aveWinCalc.pl -i inputfile -o outputfile -w 3

Example input datafile (separated by \t):
2012-11-01:08:00:00    1200
2012-11-01:09:00:00    1225
2012-11-01:10:00:00    1312
2012-11-01:11:00:00    1355
...
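
To make the consolidation concrete, here is a minimal sketch that applies a window of 3 to just the four values shown above. The helper name block_averages is hypothetical and is not part of aveWinCalc.pl; it only mirrors the behaviour described in this post, where full windows are averaged and leftover points are dropped.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of aveWinCalc.pl): average consecutive,
# non-overlapping windows of $size values. An incomplete final window
# is dropped.
sub block_averages {
    my ($size, @values) = @_;
    die "window size must be positive\n" if $size < 1;
    my @averages;
    for (my $i = 0; $i + $size <= @values; $i += $size) {
        my $total = 0;
        $total += $_ for @values[$i .. $i + $size - 1];
        push @averages, $total / $size;
    }
    return @averages;
}

# The four sample values from the example input above.
my @sample = (1200, 1225, 1312, 1355);

# Only the first three values form a full window of 3, so a single
# average, (1200 + 1225 + 1312) / 3, is printed: 1245.67
printf "%.2f\n", $_ for block_averages(3, @sample);
```

The fourth value (1355) produces no output here because it does not fill a window on its own; in a longer file it would be averaged with the two rows that follow it.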

------------------------------------------------------------------------


#!/usr/bin/perl

# file: aveWinCalc.pl
# Written by: Stephen B. Johnson
# 2012-11-20
# Step through the data one window at a time and compute the
# average of each window of the specified size.
#
# my input datafile format: (separated by \t)
# YYYY-MM-DD:hh:mm:ss value
#



use warnings;
use Getopt::Std;

getopts("i:o:w:");

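# Report usage errors on STDERR and exit with a non-zero status.
if (!$opt_i) {
   die "no input file specified (use -i)\n";
}

if (!$opt_o) {
   die "no output file specified (use -o)\n";
}

# Prompt for the window size if it was not given on the command line.
if (!$opt_w) {
   print "must specify a window size\n-> ";
   $opt_w = <STDIN>;
   chomp $opt_w if defined $opt_w;
}

# A zero, negative, or non-numeric window would break the loop below.
if (!defined $opt_w || $opt_w !~ /^[1-9][0-9]*$/) {
   die "window size must be a positive integer\n";
}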


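# Three-argument open keeps odd characters in a filename from being
# interpreted as part of the open mode.
open (INFILE, "<", $opt_i) or die "cannot open $opt_i for reading: $!\n";
open (OUTFILE, ">", $opt_o) or die "cannot open $opt_o for writing: $!\n";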

@data = <INFILE>;

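# Walk the data one full window at a time; a trailing partial window is skipped.
my $i = 0;
while ($i < scalar(@data) - ($opt_w - 1)) {
   my $wintotal = 0;
   my $date;
   for (my $j = $i; $j < ($i + $opt_w); $j++) {
      chomp(my $line = $data[$j]);
      my ($stamp, $val) = split(/\t/, $line);
      $date = $stamp;
      $wintotal += $val;
   }
   # Average once per window, after all of its values have been summed.
   my $ave = $wintotal / $opt_w;
   # $date holds the timestamp of the last sample in the window.
   print OUTFILE "$date\t$ave\n";

   $i += $opt_w;
}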

close INFILE;
close OUTFILE or die "error writing output file: $!\n";
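
------------------------------------------------------------------------

The script above advances by a full window each time, so it writes one averaged point per window. If an overlapping (sliding) moving average is wanted instead, with one output point per input sample once the window is full, a running total avoids re-summing every window. The following is a minimal sketch under that assumption; sliding_average is a hypothetical name, the sample rows are the four shown in the example input, and this variant is not what aveWinCalc.pl does.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical variant (not part of aveWinCalc.pl): overlapping moving average.
# Takes a window size and a list of [timestamp, value] pairs. Returns one
# [timestamp, average] pair for each position where the window is full,
# labelled with the timestamp of the newest sample in the window.
sub sliding_average {
    my ($size, @rows) = @_;
    die "window size must be a positive integer\n"
        unless $size =~ /^[1-9][0-9]*$/;
    my $total = 0;
    my @out;
    for my $k (0 .. $#rows) {
        $total += $rows[$k][1];                          # newest sample enters
        $total -= $rows[$k - $size][1] if $k >= $size;   # oldest sample leaves
        push @out, [ $rows[$k][0], $total / $size ] if $k >= $size - 1;
    }
    return @out;
}

# The four sample rows from the example input, in YYYY-MM-DD:hh:mm:ss form.
my @rows = (
    [ '2012-11-01:08:00:00', 1200 ],
    [ '2012-11-01:09:00:00', 1225 ],
    [ '2012-11-01:10:00:00', 1312 ],
    [ '2012-11-01:11:00:00', 1355 ],
);

# With a window of 3 this prints two lines:
#   2012-11-01:10:00:00  1245.67
#   2012-11-01:11:00:00  1297.33
printf "%s\t%.2f\n", @$_ for sliding_average(3, @rows);
```

The trade-off is output size: the sliding form keeps nearly as many points as the input, which gives a smoother curve in gnuplot, while the windowed form in aveWinCalc.pl shrinks the file by a factor of the window size.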