Google Paper on Parallel EM Algorithm using MapReduce
I hadn’t seen much discussion of this on the web, so I thought I would post the link to this…
Amazon EC2 Considered Harmful
“The TruckNumber is the size of the smallest set of people in a project such that, if all of them…
MPI Cluster with Python and Amazon EC2 (part 2 of 3)
Today I posted a public AMI which can be used to run a small Beowulf cluster on Amazon EC2 and…
PyCon 2008 ElasticWulf Slides
Here are the ElasticWulf slides from my talk. The video will eventually be posted to the PyCon site. Elasticwulf Pycon…
Some Datasets Available on the Web
The Datawrangling blog was put on the back burner last May while I focused on my startup. Now that I…
Amazon Web Services Public Datasets
Amazon announced their Hosted Public Data Sets service today, and I expect it to be a game changer. Finding and…
Conversation with Eric Siegel on Predictive Analytics World
The Predictive Analytics World Conference is taking place Feb 18-19, 2009 in San Francisco, CA, and seems to have an…
Adding all the numbers
Once again I was just adding the numbers in the stimulus package of the US Senate. The economics pendulum has swung…
R Successor Language ‘Tea’ announced
The article was first announced on 2008/4/1. Here is a reprint. The successor to the popular open source language R…
David Meerman Scott speaks to Tom H. C. Anderson about PR
Anderson Analytics Marketing Guru Round Table Discussion, with David M. Scott. Market researchers are often in a great position…