Hadoop: this open source software platform, managed by the Apache Software Foundation, has proven to be very helpful in storing and managing vast amounts of data cheaply and efficiently.
But what exactly is Hadoop, and what makes it so special? Basically, it's a way of storing enormous data sets across distributed clusters of servers and then running distributed analysis applications on each cluster, so the computation happens where the data lives.
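To make the idea concrete, here is a minimal in-memory sketch of the map/reduce pattern that Hadoop applies across a cluster. This is an illustration only, not Hadoop's actual API: the chunk data and function names are hypothetical, and a real job would run each map task on the server that holds that chunk.

```python
from collections import defaultdict

def map_phase(chunk):
    # Each "node" maps over its own local chunk of the data set,
    # emitting (word, 1) pairs. (Hypothetical stand-in for a Hadoop mapper.)
    return [(word, 1) for word in chunk.split()]

def reduce_phase(pairs):
    # The reduce step sums counts per word across all nodes.
    totals = defaultdict(int)
    for word, count in pairs:
        totals[word] += count
    return dict(totals)

# Two "nodes", each holding part of the data set locally.
chunks = ["big data on node one", "big data on node two"]

# Each node runs map locally; only the small intermediate results
# (not the raw data) would travel over the network to the reducers.
intermediate = [pair for chunk in chunks for pair in map_phase(chunk)]
counts = reduce_phase(intermediate)
print(counts["big"])   # the word "big" appears once per chunk
```

The key point the sketch shows is that the bulky raw data never moves; only the compact intermediate pairs are shuffled to the reduce step, which is why the approach scales so well.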
It's designed to be robust, in that your Big Data applications will continue to run even when individual servers — or clusters — fail. And it's also designed to be efficient, because it doesn't require your applications to shuttle huge volumes of data across your network.
Hadoop is almost completely modular, which means that you can swap out almost any of its components for a different software tool. That makes the architecture incredibly flexible, as well as robust and efficient.