Hadoop

A Big Data Cheat Sheet: What Executives Want to Know

May 21, 2015 by Tamara Dull

The Big Data MOPS Series. 

What can Hadoop do that my data warehouse can’t? The short answer is: (1) Store any and all kinds of data more cheaply and (2) process all this data more quickly (and cheaply). The longer answer is: They say that 20% of the data we deal with today is structured data. I also call this traditional, relational data. The other 80% is semi-structured or unstructured data, and this is what I call “big” data.[read more]

The Data Lake Debate: Conclusion (With Apologies to the Rolling Stones)

April 30, 2015 by Jill Dyché

The Data Lake Debate.

In an homage to the Rolling Stones, I blithely suggested that if you try sometimes, you get what you need—be it more funding, access to third-party data, a more effective executive sponsor, or a Hadoop distribution provider. It’s not an easy decision, but I’d call the data lake debate a draw. After all, when it comes to the verdict on whether or not a data lake is a worthwhile investment, the success stories will start to emerge. In the meantime, I’m happy to watch those stories unfold. Time, after all, is on my side. Yes it is.[read more]

The Data Lake Debate: Pro Delivers Final Rebuttal and Summary

April 27, 2015 by Tamara Dull

Okay, this is where the rubber meets the road. I have three minutes (or ~450 words) to respond to Anne’s final statement and summarize why I still believe a data lake is essential for any organization to take full advantage of its data. Let’s get started![read more]

The Data Lake Debate: The Final Word from Negative

April 22, 2015 by Anne Buff

The Data Lake Debate.

Well, it seems you took the gloves off this time, Tamara. I appreciate the valiant effort and your passionate belief in the Hadoop ecosystem. However, given your revisit to the definition of the data lake and clarifications about Hadoop, I find it important to repeat the resolution we are debating: “a data lake is essential for any organization to take full advantage of its data”. We are not debating whether a data ecosystem is essential – just the data lake.[read more]

The Data Lake Debate: Pro Delivers First Rebuttal

April 10, 2015 by Tamara Dull

Data Lake Debate.

In my opening argument, I defined the data lake as a storage repository that holds a vast amount of raw data in its native format, including structured, semi-structured, and unstructured data. I also mentioned that a data lake can take on different shapes and sizes, and provided these examples: A single data lake; or a data lake with multiple data ponds—similar in concept to a data warehouse/data mart model; or multiple, decentralized data lakes; or a virtual data lake to reduce data movement.[read more]

Not All Hadoop Users Drop ACID

April 8, 2015 by Paige Roberts

ACID.

As a general rule, Hadoop users have had to give up ACID and settle for the new standard, BASE, but like so many things in the data wrangling industry, that’s changing fast. This may come as a shock to a lot of current Hadoop users and to database users considering a switch to Hadoop, but using Hadoop doesn’t mean you have to give up your ACID habit.[read more]

The Data Lake Debate: Pro Cross-Examines Con

April 6, 2015 by Tamara Dull

The Data Lake Debate.

As is to be expected, Anne, your arguments against building a data lake are both persuasive and passionate. You’ve made some great points, my friend, but you’re making this way too easy for me. Before I jump into my rebuttal [my next post], I’d like to clarify a few things that you brought up. I’ve boiled it down to three questions. What say you?[read more]

The Data Lake Debate: Negative Puts a Stake in the Ground

April 1, 2015 by Anne Buff

The Data Lake Debate.

While the idea of a data lake sounds like fun, don’t go jumping in just yet. There are critical factors to consider before taking the plunge and saying that “a data lake is essential for any organization to take full advantage of its data.” Not only is a data lake not essential for any organization, a data lake may in fact be detrimental for those who take the plunge prematurely.[read more]

Data Lakes and Network Optimization: What’s Next for Telecommunications and Big Data

March 31, 2015 by Sameer Nori

Telecommunication. 

Relational data warehouses served communications service providers well in the past, but it’s time to start thinking beyond columns and rows. Unstructured data will be the fuel that powers risk management and decision-making in the near future. And to use all sorts of data to its fullest potential, we need new ways of storing, accessing and analyzing that data.[read more]

The Data Lake Debate: Questioning the Pro

March 27, 2015 by Anne Buff

The Data Lake Debate.

Technology is not the answer for every big data issue (well, any data issue, for that matter). I get it: Hadoop and the concept of data lakes are hot topics. However, just because they are trending in the world of technology does not mean that they will solve critical business issues such as taking full advantage of an organization’s data. I stand firm that data storage, data lake or any other type, is not the essential element for an organization to take full advantage of its data.[read more]

The Data Lake Debate: Pro is Up First

March 20, 2015 by Tamara Dull

The Data Lake Debate column.

To data lake or not to data lake? That is the question du jour, precipitated by the big data tsunami that hit our enterprise shores a few years ago. Unfortunately, the answer to this question is not so cut-and-dried, as we can see by this small sampling of headlines: Gartner says beware of the data lake fallacy. Gartner gets the ‘data lake’ concept all wrong.[read more]

The Data Lake Debate: The Introduction

March 6, 2015 by Jill Dyché

Data Lake Debate column.

Will filling up your data lake help or hurt the cause? On the one hand, a data lake full of raw, multi-structured, and heterogeneous data from across systems and business processes could be the proverbial “single version of truth” that up until now has been just the unconsummated hope of many an executive.[read more]

Hygienic Hadoop Data Lakes Not Just Happenstance

February 23, 2015 by Paul Barsch

Risky Business column.

If you work with Hadoop on a daily basis, you already know that data cannot simply be dumped into Hadoop’s file system and be of high value to rank-and-file business users. To be sure, data management is crucial if you’re planning on Hadoop serving as a true lake or “hub” for all your organization’s data.[read more]

10 Amazing Data Analytics Platforms Everyone Should Know About

February 18, 2015 by Bernard Marr

Big Data Guru column.

The past few years have seen an explosion in the number of platforms available for big data analytical tasks. The open source Hadoop framework is free to use, but very technical to set up and not specialized for any particular job or industry. To use it in your business, you need a “platform” to operate it from.[read more]

Aligning Big Data

January 18, 2015 by Martyn Jones

Aligning big data.

This is an overview of the realignment and placement of Big Data into a more generalized architectural framework, an architecture that integrates data warehousing (DW 2.0), business intelligence and statistical analysis.[read more]