Design a search engine prototype
===
The task is to design a search system in the spirit of Google. This is a very basic prototype, spanning no more than 300-400 lines of code.
Some of the basic components of a search engine are:
## Web crawler
Read and parse all the web pages and store their content.
Inputs: web pages containing `href` links and words inside `p` tags.
### link1.com (page data is read as a string)
"""
<body>
<p>Dog</p>
<p>Dog</p><p>Cat</p><p>Fish</p><p>Fish</p><p>Fish</p><p>Fish</p><p>Fish</p>
<a href="https://link2.com">second website</a>
<a href="https://link3.com">third website</a>
</body>
"""
### link2.com
"""
<body>
<p>Dog</p>
<p>Dog</p><p>Dog</p><p>Dog</p><p>Dog</p><p>Cat</p><p>Fish</p>
<a href="https://link3.com">third website</a>
</body>
"""
### link3.com
"""
<body>
<p>Dog</p>
<p>Dog</p><p>Dog</p><p>Cat</p><p>Fish</p><p>Cat</p><p>Cat</p><p>Cat</p><p>Cat</p>
<a href="https://link3.com">third website</a>
</body>
"""
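
The crawling step over the sample pages above could be sketched roughly as follows. This is a minimal sketch, not a definitive implementation: it assumes Python's standard `html.parser`, and the names `PageParser`, `crawl`, and `PAGES` are hypothetical. Instead of making network requests, `PAGES` is an in-memory stand-in holding condensed copies of the three sample pages (same word counts and links). The `seen` set keeps the crawler from looping on link3.com, which links to itself.

```python
from collections import deque
from html.parser import HTMLParser


class PageParser(HTMLParser):
    """Collects words found inside <p> tags and URLs found in <a href=...>."""

    def __init__(self):
        super().__init__()
        self.words = []
        self.links = []
        self._in_p = False

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_p = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "p":
            self._in_p = False

    def handle_data(self, data):
        # Only text inside <p> counts as page content; link text is ignored.
        if self._in_p:
            self.words.extend(data.lower().split())


def crawl(start_url, fetch):
    """Breadth-first crawl from start_url; returns {url: [words]}."""
    seen = {start_url}
    queue = deque([start_url])
    content = {}
    while queue:
        url = queue.popleft()
        html = fetch(url)
        if html is None:  # unknown page: skip it
            continue
        parser = PageParser()
        parser.feed(html)
        content[url] = parser.words
        for link in parser.links:
            if link not in seen:  # avoid revisiting pages (and self-links)
                seen.add(link)
                queue.append(link)
    return content


# Hypothetical in-memory "web": condensed copies of the sample pages.
PAGES = {
    "https://link1.com": "<body>" + "<p>Dog</p>" * 2 + "<p>Cat</p>" + "<p>Fish</p>" * 5
    + '<a href="https://link2.com">second website</a>'
    + '<a href="https://link3.com">third website</a></body>',
    "https://link2.com": "<body>" + "<p>Dog</p>" * 5 + "<p>Cat</p><p>Fish</p>"
    + '<a href="https://link3.com">third website</a></body>',
    "https://link3.com": "<body>" + "<p>Dog</p>" * 3 + "<p>Fish</p>" + "<p>Cat</p>" * 5
    + '<a href="https://link3.com">third website</a></body>',
}

content = crawl("https://link1.com", PAGES.get)
print(len(content))  # 3
```

Passing the fetch function as an argument keeps the crawler testable; a real prototype could swap `PAGES.get` for a function that downloads the page over HTTP.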
## Web Index
Ranks web URLs by word count.
Maintains a count of each word for every website.
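
One possible shape for the index is a mapping from each URL to its word counts. The sketch below assumes the crawler hands over a dictionary of URL to word list; the names `crawled` and `build_index` are hypothetical, and the data mirrors the word counts of the three sample pages.

```python
from collections import Counter

# Hypothetical crawler output: URL -> lowercased words taken from <p> tags.
crawled = {
    "https://link1.com": ["dog"] * 2 + ["cat"] + ["fish"] * 5,
    "https://link2.com": ["dog"] * 5 + ["cat", "fish"],
    "https://link3.com": ["dog"] * 3 + ["fish"] + ["cat"] * 5,
}


def build_index(crawled):
    """Return {url: Counter(word -> occurrences)} for every crawled page."""
    return {url: Counter(words) for url, words in crawled.items()}


index = build_index(crawled)
print(index["https://link2.com"]["dog"])  # 5
```

A `Counter` returns 0 for words it has never seen, so lookups for absent words do not need special handling.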
## Retriever
The user enters a query, say `dog`; the response should be a list of all the URLs in descending order of the word count for `dog`.
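
A retriever along these lines might look like the sketch below. The function name `search` and the hard-coded `index` (matching the sample pages' word counts) are hypothetical. Two choices here are assumptions rather than part of the brief: the query is lowercased before lookup, and URLs where the word never appears are left out of the result.

```python
# Hypothetical index: URL -> {word: count}, matching the sample pages.
index = {
    "https://link1.com": {"dog": 2, "cat": 1, "fish": 5},
    "https://link2.com": {"dog": 5, "cat": 1, "fish": 1},
    "https://link3.com": {"dog": 3, "cat": 5, "fish": 1},
}


def search(index, query):
    """Return URLs sorted by how often the query word appears, highest first."""
    word = query.strip().lower()
    # Assumption: pages with a zero count are dropped from the results.
    hits = [(url, counts.get(word, 0)) for url, counts in index.items()]
    hits = [hit for hit in hits if hit[1] > 0]
    hits.sort(key=lambda hit: hit[1], reverse=True)
    return [url for url, _ in hits]


print(search(index, "dog"))
# ['https://link2.com', 'https://link3.com', 'https://link1.com']
```

Python's sort is stable, so pages with equal counts keep the order in which they appear in the index.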