
Neo4j Doc
===

:::info
We can test on Ken's demo @ http://35.187.240.245:2251/

[Set-Up](#Set-Up)
[Creation](#Creation)
[Update]
[Fundamental Query](#Fundamental-Query)
[Composite Query](#Composite-Query)
[Sample Queries and Translation](#Sample-Queries-and-Translation)
:::

### Set Up

#### DB visualisations
[Arcade Analytics](https://arcadeanalytics.com/)
[neo4j like querying](https://github.com/AdrianInsua/neo4j-dashboard)
[JQA](https://github.com/softvis-research/jqa-dashboard) with [demo]()

#### Start VM and Neo4j Server
Ensure you are in the Capstone ASKIE project
GCP -> VM instance -> start VM
GCP -> Deployment Manager -> get address and password

Either:
Connect via the web/local Neo4j app using the address and password

Or:
```
pip install neo4j
```
```python
from neo4j import GraphDatabase

# Replace address, username and password with the values from Deployment Manager.
driver = GraphDatabase.driver("bolt://address:7687", auth=("username", "password"), encrypted=False)
query = "MATCH (n) RETURN n.name LIMIT 5"  # placeholder, substitute any Cypher query

with driver.session() as session:
    result = session.run(query)
    for record in result:
        print(record.values()[0])  # first column of each returned row
driver.close()
```
Results are returned as a `Result` object (`BoltStatementResult` in older 1.x drivers) that yields one `Record` per row.
A detailed breakdown of the graph data types is [here](https://neo4j.com/docs/api/python-driver/current/types/graph.html)

Useful packages: [py2neo](https://py2neo.org/) and neomodel
```
pip install py2neo
pip install neomodel
```

### Creation
Create nodes with their properties, then create the relationship between them
```
LOAD CSV WITH HEADERS FROM "file:///val.csv" AS row
MERGE (p1:Node {name: row.head})
MERGE (p2:Node {name: row.tail})
WITH p1, p2, row
CALL apoc.create.relationship(p1, row.relation, {}, p2) YIELD rel
RETURN rel
```

### Fundamental Query

#### 1st Degree by name
Return the 1st degree links of a particular node by name
Outputs JSON; use `RETURN x` to return nodes instead
```
MATCH (n)-[r]-(x)
WHERE (n.name = 'U.S.')
RETURN COLLECT({head:n.name ,tail:x.name,relation:type(r)}) AS jsonOutput
```

#### by relationship
Return node-node links with a particular relationship
Outputs JSON
```
MATCH (n)-[r:conflict]-(x)
RETURN COLLECT({head:n.name ,tail:x.name,relation:type(r)}) AS jsonOutput
```

#### Multiple options for properties
```
MATCH (n)-[r]-(x)
WHERE (n.name in ['U.S.','California'])
RETURN COLLECT({head:n.name ,tail:x.name,relation:type(r)}) AS jsonOutput
```

### Composite Query

#### 2nd degree
```
MATCH (x)-[r]-(y)
WHERE (x.name = "United States")
WITH y
MATCH (y)-[r]-(z)
RETURN COLLECT(DISTINCT {head:y.name ,tail:z.name,relation:type(r)}) AS jsonOutput
```

### Ideas
https://github.com/FerreroJeremy/ln2sql

### UI Ideas
Left panel holds the tools (collapsible) (create, delete, amend, search)
Middle panel for the main graph (or largest cluster)
- On create, delete, amend -> zoom into the appropriate nodes

Right panel for information display (closed by default)
- On clicking a node in the middle graph, open the right panel and open a new tab

Scrub CSS, pending massive improvement
![](https://i.imgur.com/aoNJl4g.png)
![](https://i.imgur.com/IKvL9FO.png)

## Sample Queries and Translation
Zhou Zhi & Qing Ze: Familiarize yourselves with the current knowledge graph and work with Sam/Martin to discover examples of queries useful to them
- Eg 5 queries/questions for “Startup News” and “E-Commerce” respectively
- Eg “How many entities invested in company X?”
- Eg “For X brand, what is their cheapest product?”

Resources: “GraphPage” on the streamlit demo (http://35.187.240.245:2251/)
- It shows what relation types there are, eg Investor, FoundedBy, Price etc
- It shows what attributes each node and each edge has (click “Schema”)
- These are useful for seeing what kinds of queries are possible
- Click “CustomQuery”: this allows you to enter any Cypher query and see the result

Another option: build the docker system locally and use the Neo4j browser to try queries
https://git.reddragon.ai/RedDragonAI/ASKIE/src/master/ent_link_flask_api/wikidata

1. Popularity of X item. Search for all instances where X item appears
```
MATCH (n) WHERE n.label = "Google" RETURN * LIMIT 5
// or
MATCH (n)-[r]-(x) WHERE (n.label = 'Google') RETURN n,type(r),x
```
2. Aggregation of X items' entity/relations across multiple entities.
```
MATCH (n)-[r]-(x) WHERE (n.label = 'Google') RETURN n,type(r),x,count(*)
```
3. Importance of X items, counting in-degrees and out-degrees
```
// out-degree (relationships leaving n)
MATCH (n)-[r]->() WHERE n.label = 'Google' RETURN COUNT(r)
// in-degree (relationships arriving at n)
MATCH (n)<-[r]-() WHERE n.label = 'Google' RETURN COUNT(r)
```
4. Given a list of items, what is the most popular/important/best funded
```
MATCH (n)
// WHERE n.label IN []
WITH max(n.someproperty) AS p1
MATCH (b)
WHERE b.someproperty = p1
// AND b.label IN []
RETURN b
```
5. Relative importance of X items, comparing aggregated in-degrees and out-degrees
```
// repeat qn 3, i.e.
MATCH (n)-[r]->() WHERE n.label = 'Google'
WITH count(r) AS r1
MATCH (x)-[r]->() WHERE x.label = 'business'
RETURN r1, count(r) AS r2
```
6. Items most seen together, comparing the number of times items share the same entity/relation. Aids in clustering
```
// cluster
CALL gds.alpha.scc.stream({
  nodeProjection: 'Entity',
  relationshipProjection: 'InstanceOf'
})
YIELD nodeId, componentId
RETURN gds.util.asNode(nodeId).label AS Name, componentId AS Component
ORDER BY Component DESC

// return largest cluster
// (assumes componentId has already been written back to the nodes as a property)
MATCH (u:Entity)
RETURN u.componentId AS Component, count(*) AS ComponentSize
ORDER BY ComponentSize DESC
LIMIT 1
```
7. Given a list of items, compare based on a specific entity/relation
```
MATCH (n) WHERE n.label IN ["a","b"] RETURN n
```
8. Comparing similarity of entities. Checking which entities have the most similar kinds of relations to other entities.
```
```
9. Reconstructing a site from sitelinks/data etc
```
```
10. Temporal comparison of relation/entity across past data
```
```

Goal is to have a generic plug-and-play set of queries

[CB Insights report: Venture Capital Funding Report Q2 2020 ](https://www.cbinsights.com/research/report/venture-capital-q2-2020/)
**Types of queries**
- deal activity per quarter
- deals by geography, then compared
- number of IPO exits, compared QoQ
- highest quarter historically

[CB Insights report: 50 Future Unicorns](https://www.cbinsights.com/research/report/future-unicorn-startups-billion-dollar-companies/)
![](https://i.imgur.com/XWIss9h.png)
**Types of queries**
- company's financial health
- company product type (enterprise tools, search, customer protection etc)
- company by country
- type of industry (fintech, digital, security)
- type of market in country (emerging market, developed market)
- company by funding
- company by state

[CB Insights report: Here Are The Top AI Unicorns In Asia](https://www.cbinsights.com/research/asia-ai-unicorns-q1-20/)
**Types of queries**
- location of unicorns
- investor in unicorns
- type of product (computer vision for retail etc)

[CB Insights report: AI In Asia: The Impact Of Covid-19 On Funding, Exits, Valuations, And R&D](https://www.cbinsights.com/research/report/artificial-intelligence-asia/)
**Types of queries**
- acquisitions of specific companies (Intel spent X on Y)
- capital flows (companies from X country invested in Y country)

Open questions
- Arena: founded, last funded
- Uber: show which articles the answer is taken from, then give the answer, or provide partial answers that make up the eventual answer
- How do we add the relationships that are needed? Do new relationships need to be added automatically?
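
The name-based lookups under Fundamental Query write the node name straight into the Cypher string. A minimal sketch of the same 1st degree lookup using a query parameter instead, assuming the `name` property and `jsonOutput` shape used above. The helper name `first_degree_links` and the parameter name `names` are hypothetical, not part of the existing project.

```python
# Hypothetical helper: names below are illustrative, not from the project code.
FIRST_DEGREE_QUERY = """
MATCH (n)-[r]-(x)
WHERE n.name IN $names
RETURN COLLECT({head: n.name, tail: x.name, relation: type(r)}) AS jsonOutput
"""


def first_degree_links(session, names):
    """Return the 1st degree links of the named nodes as a list of dicts.

    `session` is any object with a neo4j-style run(query, **parameters)
    method, for example the object yielded by `driver.session()`.
    """
    # The driver sends `names` separately from the query text,
    # so quotes inside a name cannot break the Cypher statement.
    result = session.run(FIRST_DEGREE_QUERY, names=list(names))
    record = result.single()  # COLLECT without grouping yields one row
    return record["jsonOutput"]
```

With a live connection this would be called as `first_degree_links(session, ["U.S.", "California"])` inside a `with driver.session() as session:` block.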

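
For the degree counts in sample query 3, direction matters: `(n)-[r]->()` matches relationships leaving `n` and `(n)<-[r]-()` matches relationships arriving at `n`. A small pure-Python sketch for cross-checking those counts offline, assuming the triples come from a CSV with `head`, `relation` and `tail` columns like the one loaded under Creation, with each row read as a `head -> tail` edge. The function name `degree_counts` and the sample rows are made up for illustration.

```python
from collections import Counter


def degree_counts(triples):
    """Count in-degree and out-degree per node name from head/relation/tail rows."""
    in_degree = Counter()
    out_degree = Counter()
    for row in triples:
        out_degree[row["head"]] += 1  # edge leaves the head node
        in_degree[row["tail"]] += 1   # edge arrives at the tail node
    return in_degree, out_degree


# Made-up sample rows, not taken from the real knowledge graph.
sample = [
    {"head": "A", "relation": "Investor", "tail": "B"},
    {"head": "A", "relation": "Investor", "tail": "C"},
    {"head": "C", "relation": "FoundedBy", "tail": "A"},
]

in_deg, out_deg = degree_counts(sample)
print("in-degree of A:", in_deg["A"])
print("out-degree of A:", out_deg["A"])
```

The counts for a node should match what the two `COUNT(r)` queries return for it, provided the graph was built from the same rows.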