MapReduce
HStreaming for Hadoop and MapReduce
HStreaming is an analytics platform built on top of Hadoop and MapReduce. It allows analysis of unstructured and structured data in real time, a significant improvement over plain Hadoop and MapReduce, which are built for batch processing.
When Data Flows Faster Than It Can Be Processed
With big data come a few challenges. What can we do when data flows faster than it can be processed? There is a solution that benefits everyone (users, companies such as Google, Amazon, Netflix, Facebook or Twitter, and clients): better use of data science.
Hadoop Toolbox: When to Use What
Hadoop and Big Data have almost become synonymous. But Hadoop is no longer just Hadoop. Over time it has evolved into a big herd of tools, each meant to serve a different purpose. Glued together, they give you a power-packed combo. Here's my short intro to some very useful tools.
Hadoop + Ubuntu: The Big Fat Wedding
Now, here is a treat for all you Hadoop and Ubuntu lovers. Last month, Canonical, the organization behind the Ubuntu operating system, partnered with MapR, one of the Hadoop heavyweights, in an effort to make Hadoop available as an integrated part of Ubuntu through its repositories.
A Unified Environment For Big Data Analytics
We are reaching a crossroads in the world of analytics. Over the years, we have seen analytic environments and data environments begin to merge through the advent of in-database processing within relational database engines. But is this causing a case of "2 steps forward, 4 steps back"?
Will Big Data Finally Turn CRM Into Something Valuable?
Customer Relationship Management has always involved data, but most of it used to be structured data; new techniques make it possible to process, store and analyse massive amounts of unstructured data as well, which means CRM can finally become a true revenue driver.
Walmart Makes Big Data Part of Its DNA
Walmart's big data efforts are a good example of what becomes possible when big data is truly incorporated into the DNA of a company. Already, Walmart is able to optimize the local assortment of its stores based on customer commentary on social media.
Big Data Success Stories: Take Them with a Grain of Salt
For every successful case study listed in Harvard Business Review, Fortune or the like on “Big Data” success, there are likely thousands of failures. Hidden behind all these “success stories,” there is a figurative graveyard of big data failures that we never see.
Is Hadoop Knowledge a Must-Have for Today's Big Data Scientist?
Individuals skilled at data manipulation, modeling and programming, often titled “data scientists,” are in great demand due to the growing need to analyze and get value from big data. But how important is it that they understand emerging technologies such as Hadoop?
Analytics, Graph Search, APIs: Is Facebook Struggling with Big Data?
Facebook has admitted to a major bug in Insights that resulted in inaccurate data. But the social analytics problem is only the second bit of evidence that Facebook is struggling to make sense of its own massive data store. Facebook engineers have also admitted to struggling with Graph Search.
5 Must-Watch YouTube Videos on Big Data
Here’s a compilation of my five favorite YouTube videos on the topic of Big Data, including a talk about Hadoop and MapReduce given by Mike Olson, the CEO of Cloudera.
Data Analytics Evolution at LinkedIn - Key Takeaways
At the Teradata Partners Conference 2012, held this week near Washington, D.C., Simon Zhang’s talk on “Data Sciences and Analytics Evolution @LinkedIn” provided many useful insights for organizations wanting to expand into decision making using data analytics built on Big Data ecosystems.
A Social Media Listening Post - Closing the Feedback Loop
Some emerging “Big Data” platforms offer the perfect tool for monitoring public sentiment toward a company or brand, even in the face of the rapid explosion in data volumes from social media, which could easily overwhelm traditional BI analytics tools.
100 Petabytes of Data in Poop?
University of California computer scientist Dr. Larry Smarr is a man on a mission to measure everything his body consumes, performs, and yes, discharges. For Dr. Smarr, this data collection has a goal: to fine-tune his ecosystem in order to beat a potentially incurable disease. Is this kind of rigorous information collection and analysis the future of healthcare?
Can Big Data Analytics Solve “Too Big to Fail” Banking Complexity?
Can “Big Data” technologies such as MapReduce/Hadoop, or even more mature technologies like BI/Data Warehousing, help banks make better sense of their own complex internal systems and processes, let alone the tangled and interdependent global financial markets?
The moderated business community for business intelligence, predictive analytics, and data professionals.
“Mike, we are seeing an increase in businesses seeking specialized skills to help address challenges that arose with the era of big data. The HPCC Systems platform from LexisNexis helps to fill this gap by allowing data analysts themselves to own the complete data lifecycle. Designed by data scientists, ECL is a declarative programming language used to express data algorithms across the entire ...”
“Data variety is indeed both a challenge and an opportunity. I work for Gnip and we provide social data from a variety of sources and are constantly talking about what we call The Social Cocktail. We normalize the streams to help businesses overcome some of the challenges presented in this article's The Curse and Challenge of Data Variety section. Our customers are using multiple data sources ...”