Social Movie Service
===

## Table of Contents

[TOC]

# Architecture Diagram

![](https://i.imgur.com/VixwZtU.png)

1. Different client applications call the API gateway.
2. Requests coming into the API gateway are authenticated and authorized by the User / Auth service. For login requests, the service verifies the credentials (username and password) and we pass back an authorization token in the response. For non-login requests, we take the token from the request's authorization header and validate it.
3. We can set up user roles/groups and establish rules to allow or deny access to certain routes.
4. Our User service handles requests that involve:
   * Friending/unfriending requests
   * Adding/updating information
5. User graph databases

   a. **Users & Relationships** - We'll need to store user information and relationships in a graph database. The User service will use this to add/modify relationships between users. The Recommendation service will only read from the database to determine a user's friends.

   b. **User Movie Preferences** - This will contain the movies that a user likes, their movie ratings, favorite genres, favorite actors, favorite directors, etc. The Recommendation service will use these relationships to generate recommendations.
6. The Recommendation service will handle requests for recommended movies. Using a graph database, we can send a user recommendations based on, but not limited to:
   * Friends' favorites
   * Favorite genre
   * Favorite actor
   * Average user ratings
7. Our Movie service serves requests pertaining to movies. We don't need to authorize the endpoints in this service since they won't manipulate any data, but we'll need to rate limit users to prevent abuse.

   a. **Movie Metadata** - This will contain relationships between casts, actors, and movies. This will be used by the Recommendation service to generate recommendations.

   b.
**Trailer Videos & Image** - We can introduce a cache for our Movie service to cache popular movie metadata. This will lower latency and lighten the load on the Movie database.

   c. **CDN** - We can move popular trailer videos and images into CDNs for quicker access.
8. We'll need to grab movie data from an external source such as [OMDb API](http://www.omdbapi.com/) and update our databases. A worker process that runs on a schedule will pull the data and load it into our relational and graph databases.
9. We could be pulling trailer videos from different sources, and some will likely arrive in formats that our system doesn't support. We should encode our videos to a consistent format.
10. We will store videos and images in a distributed file storage system.

# Data Sharding

This system will have billions of users and hundreds of millions of movies, actors, and directors. Data sharding will allow us to distribute the load.

User Data
---

**User & Relationships** - We would want to keep this in a single shard so that relationships between users in different countries are preserved. Instead of sharding, we can set up replicas for each geographical region or country.

**User Movie Preferences** - We can also set up replicas for each geographical region or country.

**User Database** - We can have a shard for each country and store user data based on the user's country of residence. However, we will need to keep this data in sync with the **User & Relationships** and **User Movie Preferences** graph databases.

Movie Data
---

**Movie Database** - We'll generate our own internal MovieID in the Movie Data Loader process. Using this ID and consistent hashing, we can evenly distribute movie data.

**Movie Metadata** - We would need to keep all the movies in the world together, because casts and directors produce and act in movies worldwide. However, we can shard by more granular relationships such as actor-to-movie, director-to-movie, and genre-to-movie.
The Movie Data Loader process can handle this splitting.

# Deliverables

GitHub Project
---

https://github.com/ber2go/imdb-service

Examples of how to run the system and its tests are in the README.md.

What has been implemented?
---

![](https://i.imgur.com/kcyA7vN.png)

I had initially planned to create the User service and Recommendation service for this project, where the Recommendation service would serve as a reverse proxy for the User service. This was so that the two services could communicate over gRPC. This is not how it is laid out in the Architecture Diagram, but it would have been the quickest way for me to learn gRPC and fulfill the requirement. Unfortunately, learning about graph databases and Neo4j took all the time that I had.

![](https://i.imgur.com/X1MXrxA.png)

Here are the endpoints that were implemented. There's an **openapi.yml** file in the repository with more details on how to use these endpoints.

I believe the three most important endpoints are **Create user**, **Add a user as friend**, and **Return a list of movies that the user's friends liked**. To return the movies that a user's friends liked, I first needed to be able to create users and establish a relationship between the User nodes via **Add a user as friend**.

**Add a user as friend** is not how I envisioned establishing relationships in this system, but I needed a quick implementation to establish relationships between nodes. It should be a two-step process where a user sends a friend request to another user, and that user then needs to approve the request.

![](https://i.imgur.com/GxTB17T.png)
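
To make the envisioned two-step friend flow concrete, here is a minimal sketch in Python. It uses an in-memory stand-in for the graph database rather than Neo4j, and every name in it (`SocialGraph`, `send_friend_request`, `approve_friend_request`, `movies_liked_by_friends`) is hypothetical and not taken from the repository. It only illustrates the idea that a friendship edge is created on approval, and that friends' liked movies are found by walking those edges.

```python
from collections import defaultdict


class SocialGraph:
    """In-memory stand-in for the user graph (hypothetical, not the repo's Neo4j code)."""

    def __init__(self):
        self.users = set()
        self.friends = defaultdict(set)  # user -> friends (undirected edge)
        self.pending = defaultdict(set)  # recipient -> users who sent a request
        self.likes = defaultdict(set)    # user -> liked movie titles

    def _require(self, *names):
        # Reject operations on users that were never created.
        for name in names:
            if name not in self.users:
                raise KeyError(f"unknown user: {name}")

    def create_user(self, name):
        self.users.add(name)

    def like_movie(self, user, movie):
        self._require(user)
        self.likes[user].add(movie)

    def send_friend_request(self, sender, recipient):
        # Step 1: record the request; no friendship edge exists yet.
        self._require(sender, recipient)
        if sender == recipient:
            raise ValueError("a user cannot befriend themselves")
        if recipient not in self.friends[sender]:
            self.pending[recipient].add(sender)

    def approve_friend_request(self, recipient, sender):
        # Step 2: only an approved request creates the edge, in both directions.
        if sender not in self.pending[recipient]:
            raise ValueError("no pending request from this user")
        self.pending[recipient].discard(sender)
        self.friends[sender].add(recipient)
        self.friends[recipient].add(sender)

    def movies_liked_by_friends(self, user):
        # Union of the liked movies of every direct friend, sorted for stable output.
        self._require(user)
        liked = set()
        for friend in self.friends[user]:
            liked |= self.likes[friend]
        return sorted(liked)


graph = SocialGraph()
for name in ("alice", "bob", "carol"):
    graph.create_user(name)
graph.like_movie("bob", "Alien")
graph.like_movie("carol", "Heat")

graph.send_friend_request("alice", "bob")
print(graph.movies_liked_by_friends("alice"))  # [] because bob has not approved yet
graph.approve_friend_request("bob", "alice")
print(graph.movies_liked_by_friends("alice"))  # ['Alien']
```

In a graph database the same lookup would be a traversal from the user node across friendship edges to liked movie nodes. The pending set here plays the role that a separate "requested" relationship could play in the graph.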