CSC230-3 2019: Day 2 Reflection

Relational algebra is a formal language that is used to form languages like SQL. The simplest query in relational algebra is just the name of the relation, and the result would be the whole relation. A sigma is used for the command "select", so it picks out certain rows from the original relation and returns only a subset of the relation. The "and" operator (caret) can be used if you want to base this new subset of of two different conditions. To choose only a portion of the columns you can use the "project" operator, which is a pi symbol. If you want only a subset of the rows and the columns, you can use the project operator on the select operator to get only a portion of both.

One key difference between relational algebra and SQL is that in relational algebra there will never be two rows with the exact same values in every column because duplicate values are always removed. Since SQL is based on "multi-sets" duplicate values are not removed. To combine two relations you can use the cross-product (cartesian product) operator, just like we practiced in class on Monday. When used with other operators like sigma, this can be useful if you want to select information from multiple tables that meet certain conditions. This can also be done using Natural Join (bow tie) and you do not have to add a condition for equality like you do with cross-product. Finally, the Theta Join does essentially the same thing with a "theta condition" that must be passed. This type of join is more commonly used in database management systems.

The union operator can be used to combine information vertically (add rows to a table). The difference operator can be used to remove values that appear in another relation. The intersection operator returns any rows that are the same in two different relations. The rename operator reassigns the schema and may be necessary when trying to join two relations.