# DDIA - chp 2
## Introduction
- Each abstraction layer hides the complexity of the layers below it by providing a clean data model.
- Focus on data storage and querying
## Relation model vs Document model
- Relation model
- Still dominant nowadays
- The goal is to hide that implementation detail behind a cleaner interface
- NoSQL
- Scalability
- Free and open sourced
- Specialized operations
- Dynamic and expressive data model
### The object-relational database mismatch
- Object programming issue (ORM)
- Example

```json
{
"user_id": 251,
"first_name": "Bill",
"last_name": "Gates",
"summary": "Co-chair of the Bill & Melinda Gates... Active blogger.",
"region_id": "us:91",
"industry_id": 131,
"photo_url": "/p/7/000/253/05b/308dd6e.jpg",
"positions": [
{"job_title": "Co-chair", "organization": "Bill & Melinda Gates Foundation"},
{"job_title": "Co-founder, Chairman", "organization": "Microsoft"}
],
"education": [
{"school_name": "Harvard University", "start": 1973, "end": 1975},
{"school_name": "Lakeside School, Seattle", "start": null, "end": null}
],
"contact_info": {
"blog": "http://thegatesnotes.com",
"twitter": "http://twitter.com/BillGates"
}
}
```
### Many-to-one and Many-to-many relationships
- Why Id instead of fixed readable text
- Consistent style and spelling across profiles
- Avoiding ambiguity
- Localization support
- Better search
- Example

### Are Document database repeating history
- History
- Hierarchical model → relation model → document model
- Hierarchical model
- document based tree structure
- Network model
- The only way of accessing a record was to follow a path from a root record along these chains of
links. This was called an access path
- Traversal of a linked list
- Relation model
- Query optimizer
- Comparison to document databases
- Different to hierarchical model: storing nested records within their parent record rather than in a separate
table.
## Relational Versus Document Databases Today
### Document database pros
- Good for document-like structured application
- Schema flexibility
- Example
```json
if (user && user.name && !user.first_name) {
// Documents written before Dec 8, 2013 don't have first_name
user.first_name = user.name.split(" ")[0];
}
```
```sql
ALTER TABLE users ADD COLUMN first_name text;
UPDATE users SET first_name = split_part(name, ' ', 1); -- PostgreSQL
UPDATE users SET first_name = substring_index(name, ' ', 1); -- MySQL
```
### Document database cons
- Cannot refer directly to a nested
item within a document
- many-to-many relationships: The poor support for joins
- Data locality for queries
### Convergence of document and relational databases
- Relational database support document data type
- Document database side, RethinkDB supports relational-like joins
## Query Languages for Data
### Imperative query
- Example
```sql
function getSharks() {
var sharks = [];
for (var i = 0; i < animals.length; i++) {
if (animals[i].family === "Sharks") {
sharks.push(animals[i]);
}
}
return sharks;
}
```
### Declarative query
- Parallel execution
- Better optimization
- Example
```sql
SELECT * FROM animals WHERE family = 'Sharks';
```
### Declarative Queries on the Web
- Declarative


- Imperative query

## MapReduce Querying
- SQL version

- MongoDB’s MapReduce

- Result

- MongoDB aggregation pipeline

## Graph-Like Data Models
### Property Graphs
- Representing a property graph using a relational schema

### The Cypher Query Language
- Cypher query

- Cypher query to find people who emigrated from the US to Europe

- Declarative query optimization
### Graph Queries in SQL
- Example


### Triple-Stores and SPARQL
- Subset of the data in Figure 2-5, represented as Turtle triples

- Another form

- The semantic web
- The RDF data model

- The SPARQL query language

### The Foundation: Datalog
- A subset of the data in Figure 2-5, represented as Datalog facts

- The same query as Example 2-4, expressed in Datalog


## Summary
- Hierarchical data model
- Relation data model
- NoSQL database
- Document databases
- Graph databases
- Other domain specified database