Information Management has recently published an interesting article about automating analytics. Written by Bill Franks and Scott VanValkenburgh, it explains when automating analytics works and when it doesn’t.
I think automating data mining is among the top five open issues in the field. It’s not an easy task, but it can save a lot of time (and thus cost) when one needs to generate hundreds or thousands of models each day. According to the authors (and I agree with them):
“The reality is that there is no magic button. […] The initial processes still need to be built and automated. Maintenance and modifications also need to be performed on the processes over time as business needs, data structures or other factors change.”
I think some tasks, such as ETL, scoring and quality checks, can be automated, which frees up resources for data miners to build models. If you’re interested in this subject, you may want to read an older (2006) post with a lot of interesting comments on Data Mining Research: Garbage in, Garbage out. My opinion has changed over the past four years: I’m now convinced that at least some parts of the data mining process can be automated. However, the limits of automating data mining are still not clear to me. I guess it’s an open issue for the data mining field.
One good point highlighted by the authors is the lifespan of a model. They advise not to “try to use an existing model for a brand new question, and don’t let a process outlast its reasonable life span.”
Read the full article at Information Management. Also, feel free to share your opinion about automating analytics: do you think data mining can be automated? If not, why? If so, to what extent?