2012-08-24

mongodb-mapreduce and highcharts

During our company's hack day, we analyzed some big data stored in MongoDB. We wanted to find out whether the short-term and long-term moving averages of XXXXX prices can predict future price trends.
BACKGROUND:
http://www.hardrightedge.com/tactics/indicators/movingaverages.htm
http://winninginvestor.quickanddirtytips.com/The-Top-5-Chart.aspx
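To make the idea concrete, here is a minimal Python sketch of a simple moving average and of a crossover check between a short and a long window. The function names, the window sizes and the sample prices are illustrative assumptions and are not taken from the hackday code.

```python
def simple_moving_average(values, window):
    """Return the simple moving average for every full window in `values`."""
    if window <= 0:
        raise ValueError("window must be positive")
    averages = []
    running_sum = 0.0
    for i, value in enumerate(values):
        running_sum += value
        if i >= window:
            # Drop the value that just left the window.
            running_sum -= values[i - window]
        if i >= window - 1:
            averages.append(running_sum / window)
    return averages


def crossover_signals(values, short_window, long_window):
    """Return (index, 'up' or 'down') wherever the short SMA crosses the long SMA."""
    if short_window >= long_window:
        raise ValueError("short_window must be smaller than long_window")
    long_sma = simple_moving_average(values, long_window)
    # Skip the leading short averages so both series start at the same index.
    short_sma = simple_moving_average(values, short_window)[long_window - short_window:]
    signals = []
    for i in range(1, len(long_sma)):
        prev_diff = short_sma[i - 1] - long_sma[i - 1]
        diff = short_sma[i] - long_sma[i]
        if prev_diff <= 0 < diff:
            signals.append((i + long_window - 1, "up"))
        elif prev_diff >= 0 > diff:
            signals.append((i + long_window - 1, "down"))
    return signals


# Hypothetical prices: a decline followed by a recovery.
sample_prices = [5, 4, 3, 2, 1, 2, 3, 4, 5, 6]
print(crossover_signals(sample_prices, 2, 4))  # [(6, 'up')]
```

An "up" signal means the short-term average has moved above the long-term one, which is the kind of event the background articles treat as a possible trend change.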
THE SCRIPT TO MAP-REDUCE DATA IS:
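// Map: derive a bucket key from seq / 100 and emit the sum of the high
// and low valuations per unit of land size for each valid record.
m = function () {
    if (this.landsize > 0) {
        var seq_key = NumberInt(this.seq / 100);
        if (this.valuation_range_high > 0 && this.valuation_range_low > 0) {
            emit(seq_key, {seq:seq_key, avg_price:0, price:(this.valuation_range_high + this.valuation_range_low) / this.landsize, count:1});
        }
    }
}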



// Reduce: sum the partial prices and counts emitted for one key.
r = function (key, values) {
    var result = {seq:key, avg_price:0, price:0, count:0};
    values.forEach(function (v) {
        result.price += v.price;
        result.count += v.count;
    });
    return result;
}
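// Finalize: the summed price is (high + low) per record, so divide by
// count and by 2 to get the average mid-point price.
function finalizef(key, value) {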
    if (value.count > 0) {
        value.avg_price = value.price / value.count / 2;
    }
    return value;
}

s = db.rpdata.mapReduce(m, r, {finalize:finalizef, out : "price"})

RESULT:
{
 "result" : "price",
 "timeMillis" : 730604,
 "counts" : {
  "input" : 19477443,
  "emit" : 7345947,
  "reduce" : 26799,
  "output" : 19468
 },
 "ok" : 1,
}
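To check what the job computes without a MongoDB instance, the map, reduce and finalize steps above can be mimicked in plain Python on a handful of records. This is only a sketch: the sample records are made up, the field names are copied from the script, and integer division stands in for NumberInt on the assumption that seq is a non-negative integer.

```python
from collections import defaultdict


def map_record(record):
    """Mirror the map step: return (key, value) or None when nothing is emitted."""
    if record["landsize"] <= 0:
        return None
    high = record["valuation_range_high"]
    low = record["valuation_range_low"]
    if high <= 0 or low <= 0:
        return None
    key = record["seq"] // 100  # assumption: stands in for NumberInt(seq / 100)
    return key, {"avg_price": 0.0, "price": (high + low) / record["landsize"], "count": 1}


def reduce_values(values):
    """Mirror the reduce step: sum prices and counts for one key."""
    total = {"avg_price": 0.0, "price": 0.0, "count": 0}
    for value in values:
        total["price"] += value["price"]
        total["count"] += value["count"]
    return total


def finalize(value):
    """Mirror the finalize step: average mid-point price per unit of land."""
    if value["count"] > 0:
        value = dict(value, avg_price=value["price"] / value["count"] / 2)
    return value


def run_map_reduce(records):
    """Group emitted values by key, then reduce and finalize each group."""
    grouped = defaultdict(list)
    for record in records:
        emitted = map_record(record)
        if emitted is not None:
            key, value = emitted
            grouped[key].append(value)
    return {key: finalize(reduce_values(values)) for key, values in grouped.items()}


# Made-up records; the last two are skipped by the map step.
SAMPLE_RECORDS = [
    {"seq": 100, "landsize": 100, "valuation_range_high": 300, "valuation_range_low": 100},
    {"seq": 150, "landsize": 200, "valuation_range_high": 700, "valuation_range_low": 500},
    {"seq": 205, "landsize": 50, "valuation_range_high": 150, "valuation_range_low": 50},
    {"seq": 210, "landsize": 0, "valuation_range_high": 150, "valuation_range_low": 50},
    {"seq": 220, "landsize": 10, "valuation_range_high": 0, "valuation_range_low": 5},
]

for key, value in sorted(run_map_reduce(SAMPLE_RECORDS).items()):
    print(key, value["count"], value["avg_price"])
# 1 2 2.5
# 2 1 2.0
```

Records that fail either check in the map step emit nothing, which is why the "emit" count in the job output is well below the "input" count.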
THE PYTHON SCRIPT TO LOAD DATA FROM MONGODB:
# numpy is assumed here as the source of mean() and std()
from numpy import mean, std
# Connection was removed in PyMongo 3; MongoClient replaces it
from pymongo import MongoClient

...

    def get_price_from_db(self):
        connection = MongoClient()
        db = connection.bigdata
        price_records = db.price

        avg_prices = []
        for record in price_records.find():
            avg_prices.append(record['value']['avg_price'])

        mean_value = mean(avg_prices)
        std_value = std(avg_prices)

        normalized_prices = []
        for price in avg_prices:
            # use the z-score to strip outliers
            if abs((price - mean_value) / std_value) < 1.0:
                normalized_prices.append(price)
        return normalized_prices
Code on github
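The z-score filter used in the loading script can also be tried on its own, without a database. The sketch below uses only the standard library; the function name, the default threshold argument and the sample values are assumptions made for illustration. It uses the population standard deviation, which is also the default of numpy.std.

```python
from statistics import mean, pstdev


def filter_by_z_score(values, threshold=1.0):
    """Keep the values whose absolute z-score is below `threshold`."""
    if not values:
        return []
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        # All values are identical, so there is nothing to strip.
        return list(values)
    return [v for v in values if abs((v - mu) / sigma) < threshold]


# Made-up average prices with one obvious outlier.
print(filter_by_z_score([10, 11, 9, 10, 100]))  # [10, 11, 9, 10]
```

A threshold of 1.0 is strict: for roughly normal data it discards about a third of the values, so a larger threshold may be a better fit when only extreme outliers should go.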
REFERENCES:
moving_averages
http://en.wikipedia.org/wiki/Moving_average
http://en.wikipedia.org/wiki/Z-score
http://www.mongodb.org/display/DOCS/MapReduce
http://highcharts.com/demo/
http://docs.python.org/library/basehttpserver.html
http://www.tanzilli.com/python_httpserver