Day 1 Write Up

Day 1 Write Up

by Killian Perrigon -
Number of replies: 0

https://www.wired.com/2013/02/pivotal-hd-greenplum-emc/

Hadoop's popularity picked up when facebook was still a small Silicon Valley project. After facebook began blowing up, other companies such as Greenplum and Oracle started looking into and then implementing Hadoop into their own databases. However, it was still looked at as a "non-important" alternative to the traditional database. However, that is changing. Greenplum has been working on changing it for the last two years into what will be known as Pivotal HD. Pivotal HD will be much more efficient at answering queries than the existing version. It was being revamped to act more like a relational database, allowing you to ask a lot of questions in rapid succession using SQL. A downfall of Pivotal HD, however, is that if the server crashes, the query is forced to stop, which is unlike what most have been used to when using Hadoop which runs over multiple small servers and can keep running queries even if one of the servers slams.

Overall, Pivotal HD is a very unique advanced version of Hadoop. It can run large queries and spit out answers efficiently, utilizing techniques from both Relational Database and SQL, with a slight downfall when or if the server crashes, you’ll have to restart your query. It is the future of the database.

A few other notes from today's class session, we learned about Database Management System (DBMS), which handles massive amounts of data (terabytes on terabytes), and how it efficiently responds to thousands of queries per second and also effectively keeps all user's information safe, while also still maintaining a guaranteed 99.9% uptime. The key concepts of the DBMS include the data model (the set of records, how it uses XML and graphs as the format), schema vs data (types vs variables), the Data definition language (DDL, setting up the schema), and lastly Data Manipulation or Query Language (DML, which is used for querying and modifying the database).