---
tags: ironhack, lecture,
---

<style>
  .markdown-body img[src$=".png"] {background-color:transparent;}

  .alert-info.lecture,
  .alert-success.lecture,
  .alert-warning.lecture,
  .alert-danger.lecture {
    box-shadow:0 0 0 .5em rgba(64, 96, 85, 0.4); margin-top:20px;margin-bottom:20px;
    position:relative;

    ddisplay:none;
  }

  .alert-info.lecture:before,
  .alert-success.lecture:before,
  .alert-warning.lecture:before,
  .alert-danger.lecture:before {
    content:"👨‍🏫\A"; white-space:pre-line; display:block;margin-bottom:.5em;
    /*position:absolute; right:0; top:0;
    margin:3px;margin-right:7px;*/
  }

  b {
    --color:yellow;
    font-weight:500;
    background:var(--color);
    box-shadow:0 0 0 .35em var(--color),0 0 0 .35em;
  }

  .skip {
    opacity:.4;
  }
</style>

![Ironhack Logo](https://i.imgur.com/1QgrNNw.png)

# MongoDB | Data Models

## Learning Goals

After this lesson, you will be able to:

- Structure your documents and collections by using data models

## Introduction

:::info lecture
MongoDB allows great flexibility in how data is organized, unlike relational databases.

---

**We can nest objects/arrays inside a single document.**

![](https://i.imgur.com/zBIjkpH.png)

whereas in a relational database everything sits on a single level:

![](https://i.imgur.com/DBDjRpo.png)
:::

One of the advantages of Mongo is its flexible schema: documents in the same collection can have different structures. Even though a single collection could hold documents with widely varying fields, the most common scenario is to group documents with similar properties in the same collection.

When designing the data model of a database, we have to consider the types of data that the application will use. The key is to understand the structure of the documents and how the relationships between them are represented.
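To make the flexible schema concrete, here is a minimal sketch in plain JavaScript. The `users` array is only an in-memory stand-in for a collection, and the second document's shape (`address`, `phones`) is an assumption made up for illustration; the point is simply that two documents in the same collection do not have to share the same fields.

```javascript
// In-memory stand-in for a MongoDB collection (illustration only;
// a real collection would live in the database).
const users = [
  { _id: 1, name: "Willy", email: "willydow@gmail.com" },
  // Same collection, different shape: no email, but a nested object and an array.
  // These extra fields are hypothetical and only serve the example.
  { _id: 2, name: "Joe", address: { city: "Faketon", state: "MA" }, phones: ["123412399"] }
];

// Collect the top-level field names used by each document.
const fieldsPerDocument = users.map((doc) => Object.keys(doc).sort());

console.log(fieldsPerDocument);
// [ [ '_id', 'email', 'name' ], [ '_id', 'address', 'name', 'phones' ] ]
```

Both documents would be accepted side by side, which is why the choice of structure is a design decision rather than something the database enforces by default.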
## Document Structure

:::info lecture
Besides nesting things inside a single document (**SUBDOCUMENT or EMBED**), we can also link to other documents (**REFERENCE**).

![](https://i.imgur.com/GRDKDpb.png)
:::

The critical decision in designing data models for MongoDB applications revolves around the structure of documents and how it represents relationships between data. Two tools allow us to express these relationships:

- Referencing documents
- Embedding documents

Let's look at both in depth, but first let's understand what **relations** means.

## Referencing documents - Relations

References store the relationships between data by including links or references from one document to another. Applications can resolve these references to access the related data. Broadly, these are normalized data models.

Relations are associations between documents of different collections through their `_id`.

Let's assume we have `contact` and `access` documents. If we wanted to relate a `contact` and an `access` to a particular `user`, we could simply add a property that relates them through the `user_id`:

:::info lecture
Here, `user_id` refers to a document in another collection. It is that document's `ObjectId` that we reference.
:::

![](https://i.imgur.com/GmBjx9W.png)

If we want to get the contact information of a particular user, we first get the `_id` of the user, and then make a second query to obtain the contact information based on that `_id`.

:::info lecture
In this example, `Willy` is one document and his address, `123 Fake Street`, is a separate one. The two are linked by the `user_id` property in the address.
:::

**User** Collection

```javascript
{
  _id: ObjectId("593e7a1a4299c306d656200f"),
  name: "Willy",
  lastName: "Doe",
  email: "willydow@gmail.com",
  birthday: ISODate("1990-12-14T01:00:00.000Z"),
  phone: "123412399"
}
```

**Address** Collection

```javascript
{
  _id: ObjectId("59f30dd86f0b06a96e31bbb9"),
  // Reference to the _id of the user document above
  user_id: ObjectId("593e7a1a4299c306d656200f"),
  street: "123 Fake Street",
  city: "Faketon",
  state: "MA",
  zip: "12345"
}
```

## Embedding documents

Another way of relating documents is by **embedding them**, that is, saving the related document inside the main one, as you can see in the following picture:

:::info lecture
Thanks to MongoDB's flexibility, we can just as well store **sub-objects inside a single document**: these are called subdocuments or embedded documents.
:::

![](https://i.imgur.com/yrliwPP.png)

The same example, this time using embedded documents:

:::info lecture
The same example again, except that `address` is now a subdocument of `Willy`.
:::

```javascript
{
  _id: ObjectId("593e7a1a4299c306d656200f"),
  name: "Willy",
  lastName: "Doe",
  email: "willydow@gmail.com",
  birthday: ISODate("1990-12-14T01:00:00.000Z"),
  phone: "123412399",
  address: {
    street: "123 Fake Street",
    city: "Faketon",
    state: "MA",
    zip: "12345"
  }
}
```

### Multiple Sub-documents

When a document needs to hold several subdocuments of the same kind, we store them in an array. In this case, `addresses` is an array of objects.

:::info lecture
Even better, this also works with arrays: `Willy` now contains 2 address subdocuments.
:::

```javascript
{
  _id: ObjectId("593e7a1a4299c306d656200f"),
  name: "Willy",
  lastName: "Doe",
  email: "willydow@gmail.com",
  birthday: ISODate("1990-12-14T01:00:00.000Z"),
  phone: "123412399",
  addresses: [
    {
      street: "123 Fake Street",
      city: "Faketon",
      state: "MA",
      zip: "12345"
    },
    {
      street: "1 Some Other Street",
      city: "Boston",
      state: "MA",
      zip: "12345"
    }
  ]
}
```

:::danger lecture
Note that (top-level) documents have a size limit of 16 MB.
:::

:::warning
:bomb: Be aware of how many subdocuments you embed into a single one. **Keep in mind that a document in Mongo cannot be more than 16 MB in size.**
:::

## Defining Your Document Schema

:::info lecture
The whole art therefore lies in architecting your data by choosing between embedding and referencing. Each has its advantages and drawbacks...

![](https://i.imgur.com/HsZUh4I.png)

![](https://i.imgur.com/5wWuI8e.png)
:::

You should start the schema design process by considering the application's query requirements. You should model the data in a way that takes advantage of the document model's flexibility.

:::warning
:cactus: Perhaps one of the more difficult parts of Mongo is designing the structure of our data. To do this efficiently, we need to know how the application will use the database: what data we want to get, how often we get that data, what size the documents can be, etc.
:::

Let's take a look at some patterns!

### Model One-to-One Relationships with Embedded Documents

Consider the following example that maps `patron` and `address` relationships. The example illustrates the advantage of embedding over referencing if you need to view one data entity in the context of the other. In this one-to-one relationship between `patron` and `address` data, the `address` belongs to the `patron`.

In the normalized data model, the `address` document contains a reference to the `patron` document.

:::info lecture
In this referencing example, if our application often/always needs both pieces of information at the same time...
:::

```javascript
// patron document
{
  _id: ObjectId("593e7a1a4299c306d656200f"),
  name: "Joe Bookreader"
}

// address document, referencing the patron through patron_id
{
  patron_id: ObjectId("593e7a1a4299c306d656200f"),
  street: "123 Fake Street",
  city: "Faketon",
  state: "MA",
  zip: "12345"
}
```

If the application frequently retrieves the address data together with the name information, then with referencing it needs to issue multiple queries to resolve the reference.
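The two-step lookup described above can be sketched in plain JavaScript. This is only an illustration under stated assumptions: the `patrons` and `addresses` arrays are in-memory stand-ins for two collections, ids are plain strings instead of `ObjectId` values, and `findPatronWithAddress` is a hypothetical helper, not part of any MongoDB API.

```javascript
// In-memory stand-ins for the two collections (illustration only).
// Ids are plain strings here; in MongoDB they would be ObjectId values.
const patrons = [
  { _id: "593e7a1a4299c306d656200f", name: "Joe Bookreader" }
];
const addresses = [
  {
    patron_id: "593e7a1a4299c306d656200f",
    street: "123 Fake Street",
    city: "Faketon",
    state: "MA",
    zip: "12345"
  }
];

// Hypothetical helper: resolves the reference with two lookups,
// mirroring the two queries an application would send to the database.
function findPatronWithAddress(name) {
  // Lookup 1: find the patron and read its _id.
  const patron = patrons.find((p) => p.name === name);
  if (!patron) return null;

  // Lookup 2: find the address whose patron_id points at that _id.
  const address = addresses.find((a) => a.patron_id === patron._id) || null;

  return { ...patron, address };
}

const result = findPatronWithAddress("Joe Bookreader");
console.log(result.address.city); // Faketon
```

Against a real database, each `find` on an array would be a separate round trip, which is the cost the referencing model pays when both pieces of data are always needed together.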
The better data model would be to embed the address data in the patron data, as in the following document:

:::info lecture
...then it makes more sense to opt for embedding:
:::

```javascript
{
  _id: ObjectId("593e7a1a4299c306d656200f"),
  name: "Joe Bookreader",
  address: {
    street: "123 Fake Street",
    city: "Faketon",
    state: "MA",
    zip: "12345"
  }
}
```

:::info lecture
👌 Indeed, accessing both pieces of information now takes a single query (instead of 2).
:::

**With the embedded data model, your application can retrieve the complete patron information with one query.**

### Model One-to-Many Relationships with Embedded Documents

The same example also illustrates the advantage of embedding over referencing when you need to view many data entities in the context of another. In this one-to-many relationship between `patron` and `address` data, the `patron` has multiple `address` entities.

In the normalized data model, the `address` documents contain a reference to the `patron` document.

:::info lecture
This is all the more true when there are multiple addresses...
:::

```javascript
// patron document
{
  _id: ObjectId("593e7a1a4299c306d656200f"),
  name: "Joe Bookreader"
}

// address documents, each referencing the patron through patron_id
{
  patron_id: ObjectId("593e7a1a4299c306d656200f"),
  street: "123 Fake Street",
  city: "Faketon",
  state: "MA",
  zip: "12345"
}

{
  patron_id: ObjectId("593e7a1a4299c306d656200f"),
  street: "1 Some Other Street",
  city: "Boston",
  state: "MA",
  zip: "12345"
}
```

If your application frequently retrieves the address data together with the name information, then it needs to issue multiple queries to resolve the references.

A better schema would be to embed the address data entities in the patron data, as in the following document:

:::info lecture
Here we get all the information in a single query, rather than 3.
:::

```javascript
{
  _id: ObjectId("593e7a1a4299c306d656200f"),
  name: "Joe Bookreader",
  addresses: [
    {
      street: "123 Fake Street",
      city: "Faketon",
      state: "MA",
      zip: "12345"
    },
    {
      street: "1 Some Other Street",
      city: "Boston",
      state: "MA",
      zip: "12345"
    }
  ]
}
```

**With the embedded data model, your application can retrieve the complete patron information with one query.**

### Model One-to-Many Relationships with Document References

:::info lecture
However, relying only on embedding can also have drawbacks...
:::

Consider the following example that maps `publisher` and `book` relationships. The example illustrates the advantage of referencing over embedding to avoid repetition of the `publisher` information.

Embedding the `publisher` document inside the `book` document would lead to the repetition of the `publisher` data, as the following documents show:

:::info lecture
Here, for example, because the `publisher` is embedded, its information is duplicated across the books. If we want to update our publisher, we have to do so in every book that carries it:
:::

```javascript
{
  title: "MongoDB: The Definitive Guide",
  author: [ "Kristina Chodorow", "Mike Dirolf" ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English",
  publisher: {
    name: "O'Reilly Media",
    founded: 1980,
    location: "CA"
  }
}

{
  title: "50 Tips and Tricks for MongoDB Developer",
  author: "Kristina Chodorow",
  published_date: ISODate("2011-05-06"),
  pages: 68,
  language: "English",
  publisher: {
    name: "O'Reilly Media",
    founded: 1980,
    location: "CA"
  }
}
```

To avoid repetition of the `publisher` data, use references and keep the `publisher` information in a separate collection from the `book` collection.

:::info lecture
Here it would make more sense to store our publishers in their own collection and link to them by reference from our books.
:::

```javascript
// publisher document, stored once in its own collection
{
  _id: ObjectId("593e7a1a2312c306d321323f"),
  name: "O'Reilly Media",
  founded: 1980,
  location: "CA"
}

// book documents, each referencing the publisher through publisher_id
{
  _id: ObjectId("593e7a1a4299c306d656200f"),
  title: "MongoDB: The Definitive Guide",
  author: [ "Kristina Chodorow", "Mike Dirolf" ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English",
  publisher_id: ObjectId("593e7a1a2312c306d321323f")
}

{
  _id: ObjectId("593e7b2b4299c306d656299d"),
  title: "50 Tips and Tricks for MongoDB Developer",
  author: "Kristina Chodorow",
  published_date: ISODate("2011-05-06"),
  pages: 68,
  language: "English",
  publisher_id: ObjectId("593e7a1a2312c306d321323f")
}
```

---

When using references, the growth of the relationship determines where to store the reference. If the number of books per `publisher` is small with limited growth, storing the `book` references inside the `publisher` document may sometimes be useful. Otherwise, if the number of books per `publisher` is unbounded, this data model would lead to mutable, growing arrays, as in the following example:

```javascript
{
  name: "O'Reilly Media",
  founded: 1980,
  location: "CA",
  books: [123456789, 234567890, ...]
}

{
  _id: ObjectId("593e7a1a4299c306d656200f"),
  title: "MongoDB: The Definitive Guide",
  author: [ "Kristina Chodorow", "Mike Dirolf" ],
  published_date: ISODate("2010-09-24"),
  pages: 216,
  language: "English"
}

{
  _id: ObjectId("593e7b2b4299c306d656299d"),
  title: "50 Tips and Tricks for MongoDB Developer",
  author: "Kristina Chodorow",
  published_date: ISODate("2011-05-06"),
  pages: 68,
  language: "English"
}
```

## Independent Practice (10 min)

Let's practice modeling! Here are some common problems. Please pair with another student and propose how to implement the schema for the following scenarios.

:::warning
In this exercise, deciding which fields go into which collection is less important. Focus on deciding whether to **use relations** or **embed documents**.
:::

### Twitter

- Users
- Tweets
- Followers
- Favorites

:::info lecture
[![](https://docs.google.com/drawings/d/e/2PACX-1vQPW4-sojYAiUmACSTpSDqTIAFhzysRtfYHf8Liipt8DXVOrtBEBd4pE_9k0wbWH1WXTmYvKEg9iV37/pub?w=1370&h=556)](https://docs.google.com/drawings/d/1BRjRieiYRmLoqsCzz_mEeWsSdvjfBw8XcpUP9nbYqYg/edit?folder=0AAQNm5fx5y30Uk9PVA)
:::

### Airbnb

- Users
- Homes
- Bookings
- Reviews

### Spotify

- Users
- Artists
- Songs
- Albums
- Genres
- Favorites

## Summary

In this lesson, we learned how to design databases using embedded documents and relations.

## Extra Resources

Here are some interesting articles if you want to learn more about designing documents.

- [6 Rules of Thumb for MongoDB Schema Design: Part 1](http://blog.mongodb.org/post/87200945828/6-rules-of-thumb-for-mongodb-schema-design-part-1)
- [6 Rules of Thumb for MongoDB Schema Design: Part 2](http://blog.mongodb.org/post/87892923503/6-rules-of-thumb-for-mongodb-schema-design-part-2)
- [Thinking in Documents: Part 1](https://www.mongodb.com/blog/post/thinking-documents-part-1?jmp=docs)
- [Thinking in Documents: Part 2](https://www.mongodb.com/blog/post/thinking-documents-part-2)
- [Data Modeling Article](https://www.infoq.com/articles/data-model-mongodb)