Quick Primer to Grad databases === **Problems on Transactions and Concurrency**: - table bytes = tuple size (from query) * number of tuples, number of tuples per page = byte per page / bytes for a single tuple. Number of pages needed for the entire table = table size / byte of a single page. Say there are `X` unique/different values, from a certain condition, do (number of tuples / unique values) = number of records we need to consider. **Query Optimization** - Cost is in terms of I/O operations - Nested Loops - - total block transfer or COST = br+mr*bs - r outer relation, s inner relation - $P_n$ will be tuples per page in `n` - BNLJ - - Needs an additional parameter b (buffer pages) - cost = M + N*ceil(M/(b-2)) – {ignores memory for hash tables} - b-2 of those pages are for outer relation R. - Sort merge join - - Sort both relations - Use external merge sort if needed - Hash Joins - - **Problems on ARIES** - Analysis - - Check which log records aren't ended or committed, add them to dirty page table - Remove ended transactions from transaction table (not committed) - Add unfinished xacts to the TT - *How to answer*: Draw tables in the same shape - Redo - - Start with smallest LSN - For any pageID being accessed, check in database, if the corresponding pageLSN of the data in the database is less than the currentPageLSN, then redo, and update pageLSN to currentPageLSN - Write like this: LSN XY -> Action1, Action2 - Append an END record at last with last LSN and END operation - Undo - - Find last LSN with unfinished TXN, and write toUndo = {thatLSN} - Write "We undo XYZ operation, and append a CLR to log" for all unfinished txns - ToUndo={60} - Next, we need to undo LSN 60. Again we undo it and write the following CLR record: - eg: 150 3 D 140 CLR 40 - Write a new log table at the end with the title "Extra log records" - it starts and ends with END **MongoDB** ```mongodb show dbs // all db db // current db use this_db // switch db.dropDatabase() show collections db.createCollection('posts') db.posts.insert( [ {title: "blog post 1 title", body: "blog post 1 content"}, {title: "blog post 2 title", body: "blog post 2 content"}, ]) db.posts.find() /* returns a cursor - show 20 results - "it" to display more */ db.posts.find({'title' : 'blog 1 title'}) /* update only specific fields */ db.posts.update({"category": "technology"}, {$set: {"category": 'computer science'}}) //upsert complete row db.posts.update({ '_id' : 1 }, { title: 'Post one', body: 'New body for post 1', }, { upsert: true }) // incremenent field value db.posts.update({ "_id": 1 }, { $inc: { views: 5 } }) // delete db.posts.remove({ title: 'Post 1' }) # ascending order db.posts.find().sort({ title: 1 }).pretty() # descending order db.posts.find().sort({ title: -1 }).pretty() /* Create Index on single field */ db.posts.createIndex({"title": 1}) /* Create compound Index */ db.posts.createIndex({"title": 1, "date": 1}) Question: { _id: '01039', city: 'HAYDENVILLE', loc: [ -72.703178, 42.381799 ], pop: 1387, state: 'MA' } Write the following queries in MongoDB: 1. Find all the zipcodes where the population is either less than 5000 or greater than 50000. Return only the zipcode ("_id") and the city (“city”). Ans: db.zips.find( { $or: [ {pop : { $lt: 5000}}, {pop : { $gt: 50000}}] }, {city:1}) ```