MapReduce with MongoDB

Installing and running MongoDB on a Mac

1. Install MongoDB
$ sudo port install mongodb
2. Create a directory for the database files
$ mkdir -p /foo/bar/mongodb_data

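MongoDB needs a fairly large amount of disk space, largely because it preallocates its data files. The details are explained at the following link.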
http://www.mongodb.org/pages/viewpage.action?pageId=17596968

3. Start the database
$ mongod --dbpath=/foo/bar/mongodb_data
Sun Oct 17 23:26:51 MongoDB starting : pid=1601 port=27017 dbpath=/foo/bar/mongodb_data 64-bit 
Sun Oct 17 23:26:51 db version v1.6.3, pdfile version 4.5
 :
4. Connect from the client
$ mongo
MongoDB shell version: 1.6.3
connecting to: test
> 
> show dbs;
admin
local
test

The shell initially connects to a database named 'test'. Running 'show dbs;' lists the databases that currently exist.

Playing with MapReduce

1. Insert the test data
> isono = {
...   'members' : [
...     { 'name' : 'namihey', 'age' : 58, 'gender' : 'male', 'favorite_fruits' : ['apple', 'orange'] },
...     { 'name' : 'fune', 'age' : 55, 'gender' : 'female', 'favorite_fruits' : ['peach', 'pear'] },
...     { 'name' : 'katsuwo', 'age' : 11, 'gender' : 'male', 'favorite_fruits' : ['banana'] },
...     { 'name' : 'wakame', 'age' : 9, 'gender' : 'female', 'favorite_fruits' : ['kiwi', 'apple'] }
...   ]
... }
 :
> fuguta = {
...   'members' : [
...     { 'name' : 'masuwo', 'age' : 29, 'gender' : 'male', 'favorite_fruits' : ['orange'] },
...     { 'name' : 'sazae', 'age' : 27, 'gender' : 'female', 'favorite_fruits' : ['banana', 'apple'] },
...     { 'name' : 'tarawo', 'age' : 4, 'gender' : 'male', 'favorite_fruits' : [] }
...   ]
... }
 :
> namino = {
...   'members' : [
...     { 'name' : 'norisuke', 'age' : 26, 'gender' : 'male', 'favorite_fruits' : ['strawberry', 'orange', 'banana'] },
...     { 'name' : 'taiko', 'age' : 24, 'gender' : 'female', 'favorite_fruits' : ['strawberry', 'pineapple'] },
...     { 'name' : 'ikura', 'age' : 2, 'gender' : 'male', 'favorite_fruits' : ['peach', 'orange'] }
...   ]
... }
 :
> db.SundayNightSyndrome.insert(isono);
> db.SundayNightSyndrome.insert(fuguta);
> db.SundayNightSyndrome.insert(namino);
2. Define the map callback (goal: count the members by gender)
> m = function(){
...     this.members.forEach(
...         function(z){
...             emit( z.gender , { count : 1 } );
...         }
...     );
... };
function () {
    this.members.forEach(function (z) {emit(z.gender, {count:1});});
}
3. Define the reduce callback
> r = function( key , values ){
...     var total = 0;
...     for ( var i=0; i<values.length; i++ )
...         total += values[i].count;
...     return { count : total };
... };
function (key, values) {
    var total = 0;
    for (var i = 0; i < values.length; i++) {
        total += values[i].count;
    }
    return {count:total};
}
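
One property of the reduce function above is worth spelling out: it returns a value of the same shape ({ count : n }) as the values emitted by map. That matters because MongoDB may call reduce more than once for the same key, feeding earlier reduce results back in as values. The sketch below is plain JavaScript meant to be run outside the mongo shell (for example with Node.js, which is an assumption of this example and not part of the setup described here). The two batches of values are made up for illustration.

```javascript
// Same reduce function as in step 3.
var r = function (key, values) {
    var total = 0;
    for (var i = 0; i < values.length; i++) {
        total += values[i].count;
    }
    return { count: total };
};

// Hypothetical emitted values for one key, split into two batches.
var batchA = [{ count: 1 }, { count: 1 }, { count: 1 }];
var batchB = [{ count: 1 }, { count: 1 }];

// Reducing everything in a single call...
var allAtOnce = r('male', batchA.concat(batchB));

// ...should give the same answer as reducing each batch first and then
// reducing the two partial results (a "re-reduce").
var reReduced = r('male', [r('male', batchA), r('male', batchB)]);

console.log(allAtOnce.count, reReduced.count); // prints "5 5"
```

If reduce returned a bare number instead of { count : total }, the second call would read values[i].count from a number and the totals would no longer agree.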
4. Run the MapReduce job
> res = db.SundayNightSyndrome.mapReduce(m,r);
{
  "result" : "tmp.mr.mapreduce_1287071655_3",
  "timeMillis" : 19,
  "counts" : {
    "input" : 3,
    "emit" : 10,
    "output" : 2
  },
  "ok" : 1,
}
5. Look at the results
> db[res.result].find();
{ "_id" : "female", "value" : { "count" : 4 } }
{ "_id" : "male", "value" : { "count" : 6 } }
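
To see where these numbers come from, the job can be imitated in plain JavaScript without a database. This is only a rough sketch, not how MongoDB implements mapReduce: the emit function, the groups object and the emitCount counter are hypothetical stand-ins written for this example, and the test data is cut down to the one field the map function reads.

```javascript
// Test data from step 1, reduced to the gender field.
var docs = [
    { members: [{ gender: 'male' }, { gender: 'female' }, { gender: 'male' }, { gender: 'female' }] }, // isono
    { members: [{ gender: 'male' }, { gender: 'female' }, { gender: 'male' }] },                       // fuguta
    { members: [{ gender: 'male' }, { gender: 'female' }, { gender: 'male' }] }                        // namino
];

var groups = {};    // key -> array of emitted values
var emitCount = 0;

// Hypothetical stand-in for the emit() that MongoDB provides to map functions.
function emit(key, value) {
    if (!groups[key]) groups[key] = [];
    groups[key].push(value);
    emitCount++;
}

// map and reduce functions from steps 2 and 3.
var m = function () {
    this.members.forEach(function (z) {
        emit(z.gender, { count: 1 });
    });
};
var r = function (key, values) {
    var total = 0;
    for (var i = 0; i < values.length; i++) {
        total += values[i].count;
    }
    return { count: total };
};

// Map phase: call m once per document, with the document bound to `this`.
docs.forEach(function (doc) { m.call(doc); });

// Reduce phase: this sketch calls reduce exactly once per key.
var results = Object.keys(groups).sort().map(function (key) {
    return { _id: key, value: r(key, groups[key]) };
});

console.log(JSON.stringify({ input: docs.length, emit: emitCount, output: results.length }));
// {"input":3,"emit":10,"output":2}
console.log(JSON.stringify(results));
// [{"_id":"female","value":{"count":4}},{"_id":"male","value":{"count":6}}]
```

The three figures line up with the "counts" section of the mapReduce output: 3 input documents, 10 emits (one per family member) and 2 output keys.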
6. Change the map callback (goal: count the members by favorite fruit)
> m = function(){
...     this.members.forEach(
...         function(z){
...             z.favorite_fruits.forEach(
...                 function (y) {
...                     emit( y , { count : 1 } );
...                 }
...             )
...         }
...     );
... };
function () {
    this.members.forEach(function (z) {z.favorite_fruits.forEach(function (y) {emit(y, {count:1});});});
}
7. Run the MapReduce job again
> res = db.SundayNightSyndrome.mapReduce(m,r);
{
  "result" : "tmp.mr.mapreduce_1287071940_4",
  "timeMillis" : 21,
  "counts" : {
    "input" : 3,
    "emit" : 17,
    "output" : 8
  },
  "ok" : 1,
}
8. Look at the results
> db[res.result].find();
{ "_id" : "apple", "value" : { "count" : 3 } }
{ "_id" : "banana", "value" : { "count" : 3 } }
{ "_id" : "kiwi", "value" : { "count" : 1 } }
{ "_id" : "orange", "value" : { "count" : 4 } }
{ "_id" : "peach", "value" : { "count" : 2 } }
{ "_id" : "pear", "value" : { "count" : 1 } }
{ "_id" : "pineapple", "value" : { "count" : 1 } }
{ "_id" : "strawberry", "value" : { "count" : 2 } }
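
The "emit" : 17 and "output" : 8 figures reported in step 7 can be checked by hand. The following plain JavaScript sketch (runnable outside the mongo shell; the families, emitted and tally names are made up for this example) walks the favorite_fruits arrays the same way the nested forEach in the map function does, then tallies the emitted keys.

```javascript
// favorite_fruits arrays from step 1, one inner array per family member.
var families = {
    isono:  [['apple', 'orange'], ['peach', 'pear'], ['banana'], ['kiwi', 'apple']],
    fuguta: [['orange'], ['banana', 'apple'], []],   // tarawo has no favorites
    namino: [['strawberry', 'orange', 'banana'], ['strawberry', 'pineapple'], ['peach', 'orange']]
};

// Collect every key the nested forEach would emit.
var emitted = [];
Object.keys(families).forEach(function (family) {
    families[family].forEach(function (fruits) {
        // An empty array means the inner loop body never runs, so nothing is emitted.
        fruits.forEach(function (fruit) { emitted.push(fruit); });
    });
});

// Tally per fruit, which is the effect of reducing { count : 1 } values.
var tally = {};
emitted.forEach(function (fruit) {
    tally[fruit] = (tally[fruit] || 0) + 1;
});

console.log(emitted.length);            // 17
console.log(Object.keys(tally).length); // 8
console.log(JSON.stringify(tally));
```

A member with an empty favorite_fruits array, such as tarawo, contributes no emits at all, which is why the emit count is 17 and not a multiple of the 10 members.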

Notes

MongoDB uses SpiderMonkey for server-side JavaScript execution. The mongod project requires a file js.lib when linking. The page linked below details how to build js.lib.

Note: V8 JavaScript support is under development.

http://www.mongodb.org/display/DOCS/Building+Spider+Monkey