For many years Hadoop was the leading open source Big Data framework, but recently the newer and more advanced Spark has taken over. Spark is reported to run some workloads up to 100 times faster, but it does not have its own distributed storage system. For this reason many projects install Spark on top of Hadoop, so that Spark's advanced analytics can work on data stored in the Hadoop Distributed File System (HDFS).
What really gives Spark the edge is speed. Spark handles most of its operations "in memory": it copies the data from distributed physical storage into far faster RAM and works on it there, instead of writing back to disk between steps. As a result, Spark handles advanced data processing tasks such as real-time stream processing and machine learning much faster than Hadoop can. This faster handling of dynamic data gives Spark the upper hand over Hadoop.
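To make the in-memory idea concrete, here is a toy sketch in plain Python. It is not Spark's API and says nothing about real cluster performance; the function names and the sample file are hypothetical and exist only for illustration. It simply counts how often a job that makes several passes over the same data has to go back to disk, with and without keeping the data in RAM.

```python
import os
import tempfile


def make_sample_file():
    """Write the numbers 1..100 to a temporary file, one per line (illustrative data)."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w") as f:
        for n in range(1, 101):
            f.write(f"{n}\n")
    return path


def sum_without_cache(path, passes):
    """Re-read the file from disk on every pass over the data."""
    disk_reads = 0
    total = 0
    for _ in range(passes):
        with open(path) as f:
            numbers = [int(line) for line in f]
        disk_reads += 1
        total += sum(numbers)
    return total, disk_reads


def sum_with_cache(path, passes):
    """Read the file once, keep the data in memory, and reuse it on every pass."""
    with open(path) as f:
        cached = [int(line) for line in f]
    disk_reads = 1
    total = 0
    for _ in range(passes):
        total += sum(cached)
    return total, disk_reads
```

Both functions return the same total, but the cached version touches the disk only once however many passes are made. Iterative jobs such as machine learning, which revisit the same data many times, are where that difference matters most.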
However, the two frameworks are not mutually exclusive, and they do not perform exactly the same tasks. In fact, using them together can give better results than using either one on its own.
 
 For more information visit:
 http://www.forbes.com/sites/bernardmarr/2015/06/22/spark-or-hadoop-which-is-the-best-big-data-framework/

