Hadoop
Data Variety: What It's All About
Data variety stands out among the three Vs of big data in the 2012 big data survey conducted by NewVantage Partners. One of the survey's findings is that companies are focusing more on data variety than on data volume, both now and over the next three years.
Preserving Big Data to Live Forever
Preserving knowledge for generations is no easy task. Key components of this massive undertaking include decisions about technology, architecture, data storage, and data accessibility. What steps can you take to architect a solution that keeps your own data safeguarded and accessible over the long term?
Healthcare, Risk Aversion, and Big Data Case Studies
With so much waste and opportunity in the healthcare industry, it should be no surprise that quite a few software vendors are focusing on Big Data, and not just behemoths like IBM. Start-ups like Explorys, Humedica, Apixio, and scores of others have entered the space.
Hadoop Toolbox: When to Use What
Hadoop and Big Data have almost become synonymous. But Hadoop is not just Hadoop now. Over time it has evolved into a big herd of various tools, each meant to serve a different purpose. Glued together, they give you a power-packed combo. Here's my short intro to some very useful tools.
Hadoop + Ubuntu: The Big Fat Wedding
Now, here is a treat for all you Hadoop and Ubuntu lovers. Last month, Canonical, the organization behind the Ubuntu operating system, partnered with MapR, one of the Hadoop heavyweights, in an effort to make Hadoop available as an integrated part of Ubuntu through its repositories.
Zynga: A Big Data Company Masquerading as a Gaming Company
Online game developer Zynga operates at such a large scale that it delivers one petabyte of content on a regular day. The company has built a flexible cloud server center that can easily add up to 1,000 servers in just 24 hours. So, big data is truly big at Zynga, but how does the company cope with it?
Will Big Data Finally Turn CRM Into Something Valuable?
Customer Relationship Management has always involved data, but most of it used to be structured data. New techniques make it possible to process, store and analyse massive amounts of unstructured data as well, which means CRM can finally become a true revenue driver.
What Are Big Data, Hadoop and HDFS? 3 Must-Watch YouTube Videos
Here are three great videos on Big Data, Hadoop and HDFS: "Explaining Big Data," "Introducing Apache Hadoop: The Modern Data Operating System," and "Hadoop Tutorial: Intro to HDFS." I hope you enjoy them; do let me know in the comments if they are useful.
Walmart Makes Big Data Part of Its DNA
Walmart's big data efforts are a good example of what becomes possible when big data is truly incorporated into the DNA of a company. Already, Walmart is able to optimize the local assortment of its stores based on customer commentary on social media.
Data Visualization and BI Tools Selection
Data Visualization plays a very significant role in the world of Business Intelligence (BI). By efficiently identifying trends and patterns, data visualization tools help the user quickly understand and relate to the data, without having to painstakingly sift through it. Here are some criteria to consider.
Informatica: Establishing Order from Information Chaos
I recently attended the annual Informatica analyst summit to get the latest on that company’s strategy and plans. The data integration provider offers a portfolio of information management software that supports today’s big data and information optimization needs.
Big Data Success Stories: Take Them with a Grain of Salt
For every successful case study listed in Harvard Business Review, Fortune or the like on “Big Data” success, there are likely thousands of failures. Hidden behind all these “success stories,” there is a figurative graveyard of big data failures that we never see.
Is Hadoop Knowledge a Must-Have for Today's Big Data Scientist?
Individuals skilled at data manipulation, modeling and programming, often titled “data scientists,” are in great demand due to the growing need to analyze and get value from big data. But how important is it that they understand emerging technologies such as Hadoop?
Analytics, Graph Search, APIs: Is Facebook Struggling with Big Data?
Facebook has admitted to a major bug in Insights that resulted in inaccurate data. The social analytics problem is the second piece of evidence that Facebook is struggling with its own massive data store. Facebook engineers have also admitted to struggling with Graph Search.
Resampling Data in Hadoop with RHadoop
Uri Laserson has created an excellent guide to resampling from a large data set in Hadoop. Resampling is an important step in fitting ensemble models (including random forests and other bagging techniques), and Uri provides a step-by-step guide to resampling with RHadoop.
The moderated business community for business intelligence, predictive analytics, and data professionals.
SmartData Collective

“Data variety is indeed both a challenge and an opportunity. I work for Gnip and we provide social data from a variety of sources and are constantly talking about what we call The Social Cocktail. We normalize the streams to help businesses overcome some of the challenges presented in this article's The Curse and Challenge of Data Variety section. Our customers are using multiple data sources ...”
“Interesting post... I've been an admirer of Rick's for years and had no idea that this is what he is interested in right now. Good for SAP for having him at SAPPHIRE.”