# Assignment 3: Neo4j Graph Databases
### Group Members:
kibriya-shaikhshahaalam.inamdar@stud.uni-bamberg.de
syed-mamoon.ahmed@stud.uni-bamberg.de
klint.sulstarova@stud.uni-bamberg.de
---
### TASK 1 Use Case: Equi Join
#### 1.1 List of people with their department Max Time: 1h
```typescript
MATCH (p:Person)-[:WORKS_IN]->(d:Department)
RETURN p.firstname, p.lastname, d.depname
```
#### 1.2 Number of emails sent out per department Max Time: 0,5h
```typescript
MATCH(e:Email)<-[EMAIL_FROM]-(p:Person)-[WORKS_IN]->(d:Department)
RETURN d.depname, count(e);
```
#### 1.3 Number of emails received per department Max Time: 0,5h
```typescript
MATCH (e:Email)-[EMAIL_TO]->(p:Person)-[WORKS_IN]->(d:Department)
RETURN d.depname , count(e)
```
*Adding and removing < sign in the relation has significant effect on the result, not sure why*
---
### TASK 2 Use Case: Theta Join
#### 2.1 Correlation between salary and number of emails Max Time: 2h
```typescript
MATCH (p:Person) -[:WORKS_IN]- (d:Department)
WITH d, avg(p.salary) AS avg
MATCH (p:Person) -[:EMAIL_FROM]- (e:Email)
WITH p, count(e) AS amount, d, avg
MATCH (p:Person) -[:WORKS_IN]- (d:Department)
RETURN d.depname,
SUM(CASE
WHEN p.salary > avg
THEN amount
ELSE null
END)
AS aboveAvg,
SUM(CASE WHEN p.salary < avg
THEN amount
ELSE null
END)
AS underAvg
```
---
### TASK 3 Use Case: Missing values
#### 3.1 Find missing values Max Time: 1h
```typescript
MATCH(e:Email)
WHERE e.emailId IS NULL OR e.email_date IS NULL
RETURN COUNT(e);
```
*No null values found in either emailId or emailDate.*
---
### TASK 4 Use Case: Range queries
#### 4.1 Emails between two dates Max Time: 0,5h
```typescript
MATCH (e:Email)
WHERE e.email_date >= "2001-09-01 00:00:00+0000"
AND e.email_date <= "2001-10-31 00:00:00+0000"
RETURN e;
```
*date column uses “yyyy-mm-dd HH:mm:ssZ” format for timestamp.*
#### 4.2 Emails between two dates for Larry John May Max Time: 0,2h
```typescript
MATCH (p:Person)-[:EMAIL_TO]-(e:Email)
WHERE p.firstname = "Larry" and p.lastname ="May"
AND e.email_date > "2001-09-01" AND e.email_date < "2001-10-31"
RETURN e;
```
---
### TASK 5 Use Case: Network analysis
#### 5.1 Network size by e-mail Max Time: 1h
Since it is a fully-connected network, it is a single-hop connection to everyone from Larry May by email.
#### 5.2 Network size by "knows" relation Max Time: 0,5h
```typescript
MATCH (larry:Person {firstname:"Larry"}), (others:Person)
WHERE others.firstname <> "Larry"
WITH others,
length(shortestPath((larry) -[:KNOWS*]- (others))) AS hops
RETURN others.firstname, others.lastname, hops
```
#### 5.3 2-hop email network Max Time: 0,5h
```typescript
MATCH (larry:Person {firstname:"Larry"}), (others:Person)
WHERE others.firstname <> "Larry"
WITH others, length(shortestPath((larry) -[:KNOWS*..2]- (others))) AS hops
RETURN others.firstname, others.lastname, hops
```
#### 5.4 Count outgoing edges Max Time: 1h
```typescript
MATCH (sender:Person) -[:EMAIL_FROM]-
(email:Email) -[:EMAIL_TO]-
(receiver:Person )
WITH sender,
count(receiver) AS num_of_recipients
WHERE num_of_recipients = 7
RETURN DISTINCT sender.firstname ,sender.lastname
```