The first in an occasional series in which a review of recent posts on SmartData Collective reveals the following nuggets:
Hadoop for the big guys
Hadoop is not for everyone. It is very powerful open source software for highly scalable distributed computing. It implements the MapReduce distributed computing model in use at some very large computing powerhouses. In general, I don’t believe it will be of immediate use to the average enterprise; it is for the big guys with high-end problems. My recommendation is that all CTOs at least download it at home and try it out just for familiarity. (I’m running Hadoop on my home systems now so I can kick the tires.)
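For readers who have not met MapReduce before, the idea can be sketched in a few lines of plain Python. This is only an illustration of the map, shuffle, and reduce steps on a single machine, not Hadoop's API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample documents are hypothetical and chosen for this example.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group emitted values by key, as a framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce step: collapse all values for one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run the three steps in sequence on a list of documents."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Hypothetical sample input for illustration only.
    docs = ["big data for the big guys", "data for everyone"]
    print(word_count(docs))
```

In a real cluster the map and reduce calls would run in parallel on many machines, with the framework handling the grouping and data movement; that distribution is the part that makes the approach interesting at very large scale.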
Capturing markets with BI
Many articles have described good BI as a great defense in times of economic stress. I would go beyond this and state that the real BI pioneers will take advantage of these capabilities to capture markets from their less well-informed competitors and to steer a course away from areas of business that may bring other, less foresighted organizations down. I look forward to seeing case studies that bear this out appear over the next few years.
Newtonian versus Darwinian management styles
Despite my Newtonian view (probably shared by a vast majority in business and IT) that an organization is a big machine that simply needs more gears, pulleys and dials to better operate it, many of my blogs and articles describe the importance of behavioral change management. (A good example is my blog, “Why Do You Have to Be a Sociologist to Implement Performance Management?”) I realize there is also a Darwinian view that an organization is like an organism, and we must acknowledge its sense-and-respond behavior. A balance of the Newtonian and Darwinian management styles is needed for a healthy organization.
Know the business
A data quality analyst who isn’t business oriented would be someone who merely reads a report and says, “Ah! Bad data again! We’d better fix it.” An analytical data quality expert would be intimate with the data. He would look at the trends, outliers, and more, and would understand what the data is saying about the business. So when there’s a quality threshold spike on a data object, he immediately knows it’s related to what the business is doing. This will save investigation time and money.
“Computational knowledge engine” for the Web?
If Wolfram has built a breakthrough tool to support information seeking, then he should let it prove itself by unveiling it and letting other people test it. We aren’t talking about some kind of esoteric science where only a few intellectuals can hope to understand it. Rather, his product purports to be some kind of search / answer / knowledge engine. It’s 2009, and we’re all used to the general vision. What we’re holding our breath for is execution.
Selling your vision with the elevator pitch
One of the success traits of a good data champion is that they have vision and they can sell it. Working with others within your organization to develop a vision is important, but the data champion is the primary marketer of the vision. Successful data champions understand the power of the elevator pitch and are willing to use it to promote the data governance vision to all who will listen. The term “elevator pitch” describes a sales message that can be delivered in the time span of an elevator ride. The pitch should have a clear, consistent message and reflect your goals to make the company more efficient through data governance. The more effective the pitch, the more interested your colleagues will become.
So what’s with the custom code?
First-generation data warehouses were built as reporting systems. But people quickly recognized the need for data provisioning (e.g., moving data between systems), and data warehouses morphed into storehouses for analytic data. This was out of necessity: developers didn’t have the knowledge or skills to retrieve data from operational systems. The data warehouse became a data provisioning platform not because of architectural elegance, but because of resource and skills limitations. If everyone’s needs are homogeneous and well-defined, using the data warehouse for data provisioning is just fine. The flaw of hub-and-spoke is that it doesn’t address issues of timeliness and latency. After all, if it did, why would programmers still be writing custom code for data provisioning?