# <center><i class="fa fa-edit"></i> Navigating MongoDB: Indexing and Aggregation Pipeline </center>
###### tags: `Internship`
:::info
**Goal:**
To gain a basic understanding of MongoDB fundamentals. Focus on indexing and aggregation pipeline.
- [x] Aggregation Framework
- [x] `sort()` and `limit()`
- [x] Introduction to Indexes
- [x] Introduction to Data Modeling
- [x] Upsert - Update or Insert?
**Resources:**
[MongoDB Basics Online Course](https://university.mongodb.com/courses/M001/about)
[MongoDB 常用Query指令](/b2Y7lgR1RaeQyMHp9RGHdQ)
[Database Commands](https://www.mongodb.com/docs/manual/reference/command/)
:::
### Aggregation Framework
- Exceeds filtering capabilities of MQL
- compute
- reshape
- reoganize
- Non-filtering stages do not modify the data. Work with the data in the cursor
- Syntax is in form of a pipleine where stages are excuted in the order they are listed in
**Operators**
- `$group`: takes the incoming stream of data, and siphons it into multiple distinct reservoirs
**Procedures**
- Connect to Atlas Cluster
```
mongo "mongodb+srv://<username>:<password>@<cluster>.mongodb.net/admin"
```
- Navigate to database
```
use sample_airbnb
```
- Find all documents that have `Wifi` as one of the amenities. Only include `price` and `address` in the resulting cursor:
```
db.listingsAndReviews.find({ "amenities": "Wifi" },
{ "price": 1, "address": 1, "_id": 0 }).pretty()
```
- **SYNTAX**: Using the aggregation framework find all documents that have `Wifi` as one of the amenities. *Only include* `price` and `address` in the resulting cursor:
```
db.listingsAndReviews.aggregate([
{ "$match": { "amenities": "Wifi" } },
{ "$project": { "price": 1,
"address": 1,
"_id": 0 }}]).pretty()
```
- Find one document in the collection and only include the `address` field in the resulting cursor
```
db.listingsAndReviews.findOne({ },{ "address": 1, "_id": 0 })
```
- Project only the address field value for each document, then group all documents into one document per `address.country` value
```
db.listingsAndReviews.aggregate([ { "$project": { "address": 1, "_id": 0 }},
{ "$group": { "_id": "$address.country" }}])
```
- Project only the `address` field value for each document, then group all documents into one document per `address.country` value, and count one for each document in each group
```
db.listingsAndReviews.aggregate([
{ "$project": { "address": 1, "_id": 0 }},
{ "$group": { "_id": "$address.country",
"count": { "$sum": 1 } } }
])
```
### `sort()` and `limit()`
- Both cursor methods (applied to the results set that lives in the cursor)
- MongoDB assumes that you first sort then limit regardless of the order of these methods typed
**Procedures**
- Connect to Atlas Cluster
```
mongo "mongodb+srv://<username>:<password>@<cluster>.mongodb.net/admin"
```
- Commands
```
use sample_training
db.zips.find().sort({ "pop": 1 }).limit(1)
db.zips.find({ "pop": 0 }).count()
db.zips.find().sort({ "pop": -1 }).limit(1)
db.zips.find().sort({ "pop": -1 }).limit(10)
db.zips.find().sort({ "pop": 1, "city": -1 })
```
### Introduction to Indexes
- Analogous to the index of a book
- Special data structure that stores a small portion of the collection's data set in an easy to traverse form
- Optimizes queries -> avoids sorting
**Procedures**
- Connect to Atlas Cluster
```
mongo "mongodb+srv://<username>:<password>@<cluster>.mongodb.net/admin"
```
- Commands
```
use sample_training
db.trips.find({ "birth year": 1989 })
db.trips.find({ "start station id": 476 }).sort( { "birth year": 1 } )
db.trips.createIndex({ "birth year": 1 })
db.trips.createIndex({ "start station id": 1, "birth year": 1 })
```
### Introduction to Data Modeling
- A way to organize fields in a document to support your application performance and querying capabilities
- Evolving application implies an evolving data model
::: warning
Rule of Thumb: Data is stored in the way that it is used
:::
Example: two patients. Patient 1 has medical records and prescriptions, prefer to be contacted by email. Patient 2 does not. Want to optimize collection for docotors to use.
What doctors need:
- current prescriptions
- diagnoses
- patient contact info
- ability to cross reference medication info
- side effects
- allergies
- additional info
Solution:
- Patient Records collection: structure data that is most helpful for our application
- Medications collections: for various medications
- Everything that is reguarly queried together is stores together for faster retieval
### Upsert - Update or Insert?
- Hybrid of `update` and `insert` commands
- Syntax: `db.<collection>.updateOne({<query>}, {<update>}, {"upsert": true})`
- By default, upsert is false
- When set to true it will either `update` or `insert`
- `update`: Documents that match the criteria for the update operation
- `insert`: No documents that match the same criteria
**Procedures**
- Common Commands
```
db.iot.updateOne({ "sensor": r.sensor, "date": r.date,
"valcount": { "$lt": 48 } },
{ "$push": { "readings": { "v": r.value, "t": r.time } },
"$inc": { "valcount": 1, "total": r.value } },
{ "upsert": true })
```