# Meeting 01-June-2021

## Attendees

- Labra
- Dani F
- Pablo
- Guillermo
- Jorge

## Discussions

- Creation of the [Weso-scholia](https://github.com/weso/weso-scholia) repo to keep our own code snippets and issues
- Dockerized Scholia version available [here](https://github.com/weso/scholia/issues/3)
- Research on HDT (Pablo)
  - Problems downloading the [Wikidata dump](https://dumps.wikimedia.org/wikidatawiki/entities/): 80 GB at 15 GB/day
  - Problems running a demo with Fuseki
  - Discussion about using HDT with Scholia
    - SPARQL queries in Scholia depend on Blazegraph. [Example query](https://github.com/fnielsen/scholia/blob/master/scholia/app/templates/country_authors.sparql)
- Research on Blazegraph (Jorge)
  - Deployed Blazegraph on Docker
- Common tasks to do:
  - Run Selenium tests on our Scholia to check the performance of the existing Scholia
  - Measure the new performance with Selenium
- Approach 1: Optimize subqueries
  - Run subqueries like the [author's search subquery](https://github.com/fnielsen/scholia/blob/fb743b9bd15f4a6006f2c03067d00ef3c233e41a/scholia/app/templates/country_authors.sparql#L7) and store the intermediate results in an intermediate Blazegraph store
  - Rewrite the big SPARQL query to use the previous results
- Approach 2: Queries over subsets
  - Generate a Scholia-based Wikidata subset
  - Keep the same properties as Wikibase
  - Map the items from Wikidata
    - Example: [Daniel Mietchen](https://www.wikidata.org/wiki/Q20895785) -> [Q34](http://example.org/Q34)
    - Keep a link to the related Wikidata item
- Approach 3: Hybrid approach
- Question from Guillermo: Do we really need to use Wikibase, or could we just use our own Blazegraph?

## Actions

- Add an issue to the Scholia repo documenting what we had to do and ask if they are planning to use Docker (Guillermo)
- Create an endpoint with Fuseki + HDT for a subset of Wikidata (Pablo)
- Run Selenium tests on our Scholia to check the performance of the existing Scholia. Check [this repo](https://github.com/nunogit/scholia-junit-selenium).
- Generate an empty Wikibase with the same properties as Wikidata. Check the WBStack option.
- Study Blazegraph's SPARQL extensions (Jorge)
- Ask in Skype:
  - Guillermo: Use Blazegraph directly for subsetting instead of Wikibase
  - Repo of Selenium tests: is that the one?
  - Feature in [this repo](https://github.com/nunogit/scholia-junit-selenium): "Run the queries on a cache server to speed up Scholia operation with a caching proxy". Is that so?
  - Next meeting?
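
The item mapping discussed under Approach 2 (queries over subsets) could look roughly like the sketch below. It is only an illustration, not something decided in the meeting: the `ItemMapper` class, the `http://example.org/` namespace (borrowed from the example link in the notes), and the choice of `owl:sameAs` as the link back to Wikidata are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple

# Real Wikidata entity namespace.
WIKIDATA_ENTITY = "http://www.wikidata.org/entity/"
# Assumption: local namespace taken from the example in the notes.
LOCAL_ENTITY = "http://example.org/"
# Assumption: owl:sameAs as the "link to related Wikidata item";
# a dedicated Wikibase property could be used instead.
LINK_PROPERTY = "http://www.w3.org/2002/07/owl#sameAs"


@dataclass
class ItemMapper:
    """Hypothetical helper: assigns local Q-ids to Wikidata items."""

    next_number: int = 1
    local_by_wikidata: Dict[str, str] = field(default_factory=dict)

    def local_id(self, wikidata_qid: str) -> str:
        """Return the local id for a Wikidata item, creating it on first use."""
        if wikidata_qid not in self.local_by_wikidata:
            self.local_by_wikidata[wikidata_qid] = f"Q{self.next_number}"
            self.next_number += 1
        return self.local_by_wikidata[wikidata_qid]

    def link_triple(self, wikidata_qid: str) -> Tuple[str, str, str]:
        """Return a (subject, predicate, object) triple linking local -> Wikidata."""
        local = self.local_id(wikidata_qid)
        return (LOCAL_ENTITY + local, LINK_PROPERTY, WIKIDATA_ENTITY + wikidata_qid)


# Usage: start numbering at 34 to mirror the Q20895785 -> Q34 example.
mapper = ItemMapper(next_number=34)
print(mapper.link_triple("Q20895785"))
```

Keeping the mapping in one place means the same Wikidata item always resolves to the same local item, and the link triple can be loaded alongside the subset so queries can still reach the original Wikidata entity.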