DB Mappings
=====
> 20171203 version
Cofacts uses Elasticsearch to store all its data, even the relational data. The Elasticsearch "schema", i.e. the [index mappings](https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping.html), is defined in [rumors-db](https://github.com/cofacts/rumors-db/).
This document contains a visualization of the schema and the reasoning behind its design.
## Mapping
```graphviz
graph mappings{
rankdir=LR;
node[shape=record];
replyrequests [fixedsize="true" width="2" height="1.5" label="
<id>𝐫𝐞𝐩𝐥𝐲𝐫𝐞𝐪𝐮𝐞𝐬𝐭𝐬
|
<user>
userId\l
appId\l
|
<timestamps>
createdAt\l
updatedAt\l
"]
articlereplyfeedbacks [fixedsize="true" width="3" height="2" label="
<id>𝐚𝐫𝐭𝐢𝐜𝐥𝐞𝐫𝐞𝐩𝐥𝐲𝐟𝐞𝐞𝐝𝐛𝐚𝐜𝐤𝐬
|
userId\l
appId\l
|
<score> score\l
|
createdAt\l
updatedAt\l
"]
replies [label="
<id>𝐫𝐞𝐩𝐥𝐢𝐞𝐬
|
userId\l
appId\l
|
<type>
type\l
|
text\l
|
{
hyperlinks
\n(nested)
|
{
<url>
url\l
|
<title>
title\l
|
<summary>
summary\l
}
}
|
reference\l
|
createdAt\l
"]
articles [label="
<id>
𝐚𝐫𝐭𝐢𝐜𝐥𝐞𝐬
|
<requests>
replyRequestCount\l
lastRequestedAt\l
|
createdAt\l
updatedAt\l
|
text\l
|
userId\l
appId\l
|
{
references\n
(nested)
|
{
type\l
permalink\l
createdAt\l
|
userId\l
appId\l
}
}
|
{
articleReplies
\n(nested)
|
{
<reply>
replyId\l
|
<feedbackcounts>
positiveFeedbackCount\l
negativeFeedbackCount\l
|
<replytype>
replyType\l
|
userId\l
appId\l
|
status\l
|
createdAt\l
updatedAt\l
}
}
|
{
hyperlinks
\n(nested)
|
{
<url>
url\l
|
<title>
title\l
|
<summary>
summary\l
}
}
|
<tags>tags\l
"]
tags[label="
<id>𝐭𝐚𝐠𝐬
|
<title>title\l
|
description\l
|
userId\l
appId\l
"]
urls[label="
<id>𝐮𝐫𝐥𝐬
|
<url>url\l
canonical\l
|
<title>title\l
|
<summary>summary\l
|
html\l
topImageUrl\l
|
fetchedAt\l
"]
articles:id -- replyrequests:id [headlabel="n",taillabel="1"];
articles:requests -- replyrequests:timestamps [style="dotted"];
articles:tags -- tags:title [headlabel="m",taillabel="n"];
articles:url -- urls:url [headlabel="1",taillabel="n"];
articles:id -- articlereplyfeedbacks:id [headlabel="n",taillabel="1"];
articles:reply -- articlereplyfeedbacks:id [headlabel="n",taillabel="1"];
articles:reply -- replies:id [headlabel="1",taillabel="n"];
articles:feedbackcounts -- articlereplyfeedbacks:score [style="dotted"];
articles:replytype -- replies:type [style="dotted"];
articles:title -- urls:title [style="dotted"];
articles:summary -- urls:summary [style="dotted"];
replies:url -- urls:url [headlabel="1",taillabel="n"];
replies:title -- urls:title [style="dotted"];
replies:summary -- urls:summary [style="dotted"];
}
```
## Requirements
The DB structure above is designed to fulfill the requirements listed in <https://github.com/cofacts/rumors-api/issues/35>, including:
### Filters
- Get articles I have replied to
- Get articles whose replies all have only negative feedback
- Get articles that have replies marked with "Contains truth" / "Contains misinformation" / "Not article"
- Article tag (TBD in [cofacts/rumors-api#32](https://github.com/cofacts/rumors-api/issues/32))
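
The reply-related filters above map naturally onto Elasticsearch `nested` queries against the `articleReplies` field shown in the diagram. Below is a minimal sketch that only builds the query bodies as plain Python dicts. The helper names and the `"NOT_ARTICLE"` value are hypothetical, since this document does not state how reply types are stored.

```python
def replied_by(user_id, app_id):
    """Articles with at least one articleReplies entry written by this user."""
    return {
        "nested": {
            "path": "articleReplies",
            "query": {
                "bool": {
                    # Both terms must match within the SAME nested object.
                    "filter": [
                        {"term": {"articleReplies.userId": user_id}},
                        {"term": {"articleReplies.appId": app_id}},
                    ]
                }
            },
        }
    }


def has_reply_type(reply_type):
    """Articles with at least one reply of the given (cached) reply type."""
    return {
        "nested": {
            "path": "articleReplies",
            "query": {"term": {"articleReplies.replyType": reply_type}},
        }
    }


# Combine both filters; "NOT_ARTICLE" is an assumed enum value.
body = {
    "query": {
        "bool": {"filter": [replied_by("some-user", "some-app"),
                            has_reply_type("NOT_ARTICLE")]}
    }
}
```

Because `replyType` is cached on each `articleReplies` entry, neither filter needs to touch the `replies` index.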
### Sortings
- Using the last reported time
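
Assuming "last reported time" corresponds to the `lastRequestedAt` field cached on `articles` in the diagram (an interpretation, not something this document states outright), a search body with that sort could be sketched as follows. `search_body` is a hypothetical helper name.

```python
def search_body(query=None, order="desc", size=10):
    """Build a search request body sorted by the cached lastRequestedAt field."""
    if order not in ("asc", "desc"):
        raise ValueError("order must be 'asc' or 'desc'")
    return {
        "query": query if query is not None else {"match_all": {}},
        "sort": [{"lastRequestedAt": {"order": order}}],
        "size": size,
    }
```

Since the timestamp lives on the article document itself, the sort does not need to look at `replyrequests`.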
### Future directions
> Not supported yet, but should be achievable by adding fields or cached fields.
- Editors would like to know if there are any new replies from the other editors to the articles they have already replied to. http://beta.hackfoldr.org/cofacts/https%253A%252F%252Fhackmd.io%252Fs%252FHJy19V-E-
- Editors want their own personalized page: see their reply count, the number of articles they have replied to, and "draft replies" (TBD) / inbox scenario
- http://beta.hackfoldr.org/cofacts/https%253A%252F%252Fhackmd.io%252Fs%252FSJlw9zD94W
- https://hackmd.io/s/rkdgg_aKb#inbox-scenario
- Sort articles by the number of "Contains truth" replies, or by the number of "Contains truth" replies minus the number of "Contains misinformation" replies
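
As an illustration of the last item, the value to sort on could be derived from the `replyType` already cached in `articleReplies` and stored as an extra cached field. This is only a sketch: `truth_score` and the two reply type strings are hypothetical, and no such field exists in the mapping above.

```python
def truth_score(article,
                truth="CONTAINS_TRUTH",
                misinformation="CONTAINS_MISINFORMATION"):
    """Number of 'contains truth' replies minus 'contains misinformation' replies."""
    types = [ar.get("replyType") for ar in article.get("articleReplies", [])]
    return types.count(truth) - types.count(misinformation)


article = {
    "articleReplies": [
        {"replyId": "r1", "replyType": "CONTAINS_TRUTH"},
        {"replyId": "r2", "replyType": "CONTAINS_TRUTH"},
        {"replyId": "r3", "replyType": "CONTAINS_MISINFORMATION"},
    ]
}
score = truth_score(article)  # 2 truth replies - 1 misinformation reply
```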
### Design choice
* Use nested objects as much as we can [to speed up search and take sparsity into consideration](https://www.elastic.co/guide/en/elasticsearch/guide/current/parent-child-performance.html).
* Fields required by filters should be cached. For example, articles can be filtered by reply type and reply count, so these values should be cached in the `articles` index.
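
Cached fields have to be refreshed whenever their source data changes. The sketch below recomputes `positiveFeedbackCount` and `negativeFeedbackCount` for one `articleReplies` entry from the scores of its `articlereplyfeedbacks`. It assumes `score` is positive for an upvote and negative for a downvote, which this document does not specify, and the function names are hypothetical.

```python
def tally_feedback(scores):
    """Count positive and negative feedback scores for one article-reply pair."""
    scores = list(scores)
    return {
        "positiveFeedbackCount": sum(1 for s in scores if s > 0),
        "negativeFeedbackCount": sum(1 for s in scores if s < 0),
    }


def refresh_article_reply(article_reply, scores):
    """Return a copy of the nested articleReplies entry with fresh cached counts."""
    return {**article_reply, **tally_feedback(scores)}


entry = {"replyId": "r1", "positiveFeedbackCount": 0, "negativeFeedbackCount": 0}
updated = refresh_article_reply(entry, [1, 1, -1])
```

The trade-off is the usual one for denormalized data: reads stay a single query on `articles`, while every write to `articlereplyfeedbacks` must also update the matching article.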