PyCon 2008 ElasticWulf Slides
Here are the ElasticWulf slides from my talk. The video will eventually be posted to the PyCon site.
Some Datasets Available on the Web
The Datawrangling blog was put on the back burner last May while I focused on my startup. Now that I…
Amazon Web Services Public Datasets
Amazon announced their Hosted Public Data Sets service today, and I expect it to be a game changer. Finding and…
The Colbert Bump in Amazon Data
Last month, I took a position as Director of Advanced Analytics at Juice. I’m primarily a machine learning guy, so…
Hidden Video Courses in Math, Science, and Engineering
Over the last few years, a large number of open courseware directories and video lecture aggregators have popped up on…
Python Montage Code for Displaying Arrays
This post will show how to replicate the MATLAB montage function using Python. The Data Wrangling blog seems to be…
Google Paper on Parallel EM Algorithm using MapReduce
I hadn’t seen much discussion of this on the web, so I thought I would post the link to this…
Amazon EC2 Considered Harmful
“The TruckNumber is the size of the smallest set of people in a project such that, if all of them…
MPI Cluster with Python and Amazon EC2 (part 2 of 3)
Today I posted a public AMI that can be used to run a small Beowulf cluster on Amazon EC2 and…
Conversation with Eric Siegel on Predictive Analytics World
The Predictive Analytics World Conference is taking place Feb 18-19, 2009 in San Francisco, CA, and seems to have an…