Writing for 10/21

by Charlie Benck

I found that, according to Cade Metz, "Hadoop" is the next big thing in database technology because it allows for the harnessing of massive quantities of data, a capability in high demand at companies such as Facebook, whose user bases number in the millions or billions. Hadoop began as open source software used by Yahoo for database management and became widespread after it was adopted by Facebook. More recently, "Greenplum", a tech company based in California, has produced a new version of Hadoop that offers not only the ability to catalog tremendous amounts of data but also to ask questions of it significantly faster than was previously possible (over 100 times faster than Impala, a similar product). However, PivotalHD, the new version of Hadoop, will not complete a query if even a single server crashes, whereas Hadoop previously could run across a large number of servers and bounce between them in the case of a crash. In short, the article states that the "traditional database" will always have a place in the market, but that Hadoop has reinvented the way businesses small and large analyze their data.
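The fault-tolerance contrast described above can be sketched in miniature. This is a toy scheduler, not Hadoop's or PivotalHD's actual code: the worker list, the `run_job` function, and the `retry_on_failure` flag are all invented for illustration. It only shows the general idea that a MapReduce-style job can reschedule a task onto another server when one crashes, while a fail-fast query engine aborts the whole query instead.

```python
def run_job(tasks, workers, retry_on_failure):
    """Toy scheduler: run each task on the first healthy worker.

    retry_on_failure=True mimics Hadoop's behavior of bouncing a task
    to another server after a crash; False mimics a fail-fast query
    engine that aborts as soon as any server is down.
    """
    results = []
    for task in tasks:
        candidates = list(workers)
        while candidates:
            worker = candidates.pop(0)
            if worker["crashed"]:
                if retry_on_failure:
                    continue  # reschedule the task on the next server
                raise RuntimeError("query aborted: a server crashed")
            results.append((task, worker["name"]))
            break
        else:
            raise RuntimeError("no healthy workers left")
    return results


workers = [
    {"name": "w1", "crashed": True},   # one server is down
    {"name": "w2", "crashed": False},
]

# Hadoop-style job: the crash of w1 is absorbed, work lands on w2.
print(run_job(["t1", "t2"], workers, retry_on_failure=True))

# Fail-fast engine: the same crash kills the entire query.
try:
    run_job(["t1", "t2"], workers, retry_on_failure=False)
except RuntimeError as err:
    print(err)
```

Under this (simplified) model it is easy to see why a single crash among thousands of servers is nearly certain, which is exactly the concern raised in the next paragraph.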

I was confused as to why the article claimed that Hadoop, or rather PivotalHD, was the future of databases if a single machine crashing causes everything to stop. I was under the impression that companies like Facebook run thousands of servers and as such would not be inclined to use a service that relies on every machine being operational at all times. I was also left with questions about how one database could be better at asking questions of itself than another, and about what exactly separates the different versions of the software.