Quick Primer to Grad databases
===
**Problems on Transactions and Concurrency**:
- table bytes = tuple size (from query) * number of tuples, number of tuples per page = byte per page / bytes for a single tuple. Number of pages needed for the entire table = table size / byte of a single page. Say there are `X` unique/different values, from a certain condition, do (number of tuples / unique values) = number of records we need to consider.
**Query Optimization**
- Cost is in terms of I/O operations
- Nested Loops -
- total block transfer or COST = br+mr*bs
- r outer relation, s inner relation
- $P_n$ will be tuples per page in `n`
- BNLJ -
- Needs an additional parameter b (buffer pages)
- cost = M + N*ceil(M/(b-2)) – {ignores memory for hash tables}
- b-2 of those pages are for outer relation R.
- Sort merge join -
- Sort both relations
- Use external merge sort if needed
- Hash Joins -
-
**Problems on ARIES**
- Analysis -
- Check which log records aren't ended or committed, add them to dirty page table
- Remove ended transactions from transaction table (not committed)
- Add unfinished xacts to the TT
- *How to answer*: Draw tables in the same shape
- Redo -
- Start with smallest LSN
- For any pageID being accessed, check in database, if the corresponding pageLSN of the data in the database is less than the currentPageLSN, then redo, and update pageLSN to currentPageLSN
- Write like this: LSN XY -> Action1, Action2
- Append an END record at last with last LSN and END operation
- Undo -
- Find last LSN with unfinished TXN, and write toUndo = {thatLSN}
- Write "We undo XYZ operation, and append a CLR to log" for all unfinished txns
- ToUndo={60}
- Next, we need to undo LSN 60. Again we undo it and write the following CLR record:
- eg: 150 3 D 140 CLR 40
- Write a new log table at the end with the title "Extra log records" - it starts and ends with END
**MongoDB**
```mongodb
show dbs // all db
db // current db
use this_db // switch
db.dropDatabase()
show collections
db.createCollection('posts')
db.posts.insert( [
{title: "blog post 1 title", body: "blog post 1 content"},
{title: "blog post 2 title", body: "blog post 2 content"},
])
db.posts.find()
/* returns a cursor - show 20 results - "it" to display more */
db.posts.find({'title' : 'blog 1 title'})
/* update only specific fields */
db.posts.update({"category": "technology"}, {$set: {"category": 'computer science'}})
//upsert complete row
db.posts.update({ '_id' : 1 },
{
title: 'Post one',
body: 'New body for post 1',
},
{
upsert: true
})
// incremenent field value
db.posts.update({ "_id": 1 },
{
$inc: {
views: 5
}
})
// delete
db.posts.remove({ title: 'Post 1' })
# ascending order
db.posts.find().sort({ title: 1 }).pretty()
# descending order
db.posts.find().sort({ title: -1 }).pretty()
/* Create Index on single field */
db.posts.createIndex({"title": 1})
/* Create compound Index */
db.posts.createIndex({"title": 1, "date": 1})
Question:
{
_id: '01039',
city: 'HAYDENVILLE',
loc: [ -72.703178, 42.381799 ],
pop: 1387,
state: 'MA'
}
Write the following queries in MongoDB:
1. Find all the zipcodes where the population is either less than 5000 or greater than
50000. Return only the zipcode ("_id") and the city (“city”).
Ans: db.zips.find( { $or: [ {pop : { $lt: 5000}}, {pop : { $gt: 50000}}] }, {city:1})
```