---
tags: freelance, work-pending, python, scraping, nigeria, legalfyre
title: Sample - Case Law Extraction - NWLR
---

We want to scrape every article on this site into a PostgreSQL database. [Here's the data model](https://raw.githubusercontent.com/authwit/bot-caselaws/main/data/craft-ERD.svg?token=AAFPGRW2ZNQQKP5XG2R4QQ3AZO7QM)

### Credentials

- URL: https://nwlronline.com/report
- username: runsm.an@gmail.com
- password: QweCFn4bdZze

To understand how we want to store the data, let's use an example from https://nwlronline.com/readpage?q=resultHeader&id=MTUxNl8xXzEyNg==&signature=$2y$10$WPg/QwsCYJ3nhQARPGCGa.lyk/gTwQcK3SS9fOddriv7Dp2D9kkje&exp_id=$2y$10$uHPh17hi6oFCr3Qa7DRUwei4QTTAUaXTJTspUL..Yuk3jVeQI3YKC (this is a single article).

### Data Points

[Using this example.](https://nwlronline.com/readpage?q=resultHeader&id=MTUxNl8xXzEyNg==&signature=$2y$10$WPg/QwsCYJ3nhQARPGCGa.lyk/gTwQcK3SS9fOddriv7Dp2D9kkje&exp_id=$2y$10$uHPh17hi6oFCr3Qa7DRUwei4QTTAUaXTJTspUL..Yuk3jVeQI3YKC)

With this example on hand, here are the sections to scrape for each data point (some values will need to be transformed). [Here's the database schema](https://raw.githubusercontent.com/authwit/bot-caselaws/main/data/craft-ERD.svg?token=AAFPGRQL7VYIUVXGRJOSFUDAZO3QA) ([Dropbox copy](https://www.dropbox.com/s/88pbg6iceld88ng/craft-ERD.svg?dl=0))

![](https://static.authw.it/Abiodun_v_FRN_Part1516_001.png)
![](https://static.authw.it/Abiodun_v_FRN_Part1516_00.png)
![](https://static.authw.it/Abiodun_v_FRN_Part1516_2.png)
![](https://static.authw.it/Abiodun_v_FRN_Part1516_3.png)
![](https://static.authw.it/httpsstatic.authw.itAbiodun_v_FRN_Part1516_4.png)
![](https://static.authw.it/Abiodun_v_FRN_Part1516_5_1.png)
![](https://static.authw.it/Abiodun_v_FRN_Part1516_6_1.png)

1. 00.1 -> `cases.name` | `cases.name_abbreviation` | `citations.cite` | 00.3 -> `cases.first_page` | 00.1 -> `citations.type` = 'nominative', referencing `cases` via `case_id`
3. `parties.name` | `parties.type` ("appellants" for the first party, "respondents" for the others)
2. `courts.name`
3. `citations.type` = 'official' | `citations.cite` | `name_abbreviation` = 'scng', referencing `cases` via `case_id`
4. `judges.name`
5. `cases.decision_date`
6. `matters.name`
7. `matters.description`
8. `issues.text`
9. `summaries.text`, referencing `cases` via `case_id`
10. `cases_citations` via `case_id` and `case_cites_to_id`
11. `opinions.text` | `author_id` via `judges` | `type` = 'majority' | `cases` via `cases_id`
12. `cases.status`

### PostgreSQL Credentials

- Here's an [initial SQL dump](https://www.dropbox.com/s/lamrt2jmqe32qfh/craft.initial.sql?dl=0)
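
To make the Data Points mapping concrete, here is a minimal sketch of how one scraped article could be turned into rows before loading. It assumes the scraper has already pulled each data point into a plain dictionary. The dictionary keys (`parties`, `nominative_cite`, `official_cite`, and so on), the function name `build_rows`, and the sample values are hypothetical; only the table and column names come from the list above. It covers just the `cases`, `parties`, and `citations` rows and does not write to the database.

```python
def build_rows(scraped: dict, case_id: int) -> dict:
    """Map one scraped article (hypothetical dict layout) to rows keyed by table name."""
    case_row = {
        "name": scraped["name"],
        "name_abbreviation": scraped["name_abbreviation"],
        "first_page": scraped["first_page"],
        "decision_date": scraped["decision_date"],
        "status": scraped["status"],
    }

    # Rule from the brief: the first party is "appellants", all others "respondents".
    party_rows = [
        {"name": name, "type": "appellants" if index == 0 else "respondents"}
        for index, name in enumerate(scraped["parties"])
    ]

    # Each case gets a 'nominative' and an 'official' citation, both linked via case_id.
    citation_rows = [
        {"case_id": case_id, "type": "nominative", "cite": scraped["nominative_cite"]},
        {"case_id": case_id, "type": "official", "cite": scraped["official_cite"]},
    ]

    return {"cases": [case_row], "parties": party_rows, "citations": citation_rows}


# Placeholder input, not real data from the site.
sample = {
    "name": "Party A v. Party B",
    "name_abbreviation": "A v. B",
    "first_page": 1,
    "decision_date": "2000-01-01",
    "status": "placeholder-status",
    "parties": ["Party A", "Party B", "Party C"],
    "nominative_cite": "PLACEHOLDER NOMINATIVE CITE",
    "official_cite": "PLACEHOLDER OFFICIAL CITE",
}
rows = build_rows(sample, case_id=1)
```

Keeping the mapping as a pure function, separate from both the HTTP scraping and the database inserts, makes the transformation rules easy to check against the annotated screenshots before any data is loaded.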