# Setting Up Blackfoot Database on MySQL (Example Walkthrough)
## Still need to think about:
- [x] Columns for each table
- [x] Word components?
- [x] Data type for each column
- Length of `VARCHAR` columns: 1000 characters for phrase fields
- [x] `README` table on NocoDB
- [x] Default column order for each table (important?)
- Use order from spreadsheets
- [x] `ON DELETE` (and `ON UPDATE`) behavior for foreign keys (`CASCADE`, `SET NULL`, or just do nothing) - [relevant MySQL docs here](https://dev.mysql.com/doc/refman/8.0/en/create-table-foreign-keys.html)
- `ON DELETE` default `NO ACTION`; `ON UPDATE CASCADE`
- [x] How the presence of the foreign key links will restrict the ways in which data is uploaded/updated
## Setting up the database
### Create a Database/Schema
Either **one** of the follow commands below works:
```sql
CREATE SCHEMA IF NOT EXISTS testdb;
CREATE DATABASE IF NOT EXISTS testdb;
```
You can specify the character set and collation for the database/schema upon creation of the database/schema:
```sql
CREATE DATABASE testdb CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
```
### Select the Database/Schema
```sql
USE testdb;
```
## Creating tables
### Sources
```sql
CREATE TABLE sources (
Source_ID varchar(100) NOT NULL,
Author varchar(255),
`Year` varchar(25),
Title varchar(1000),
Subtitle varchar(1000),
Booktitle varchar(1000),
Booksubtitle varchar(1000),
Editor varchar(255),
Edition int,
`Type` varchar(50),
Note varchar(1000),
Publisher varchar(1000),
Institution varchar(1000),
Address varchar(100),
Journaltitle varchar(1000),
Volume int,
`Number` int,
Series varchar(1000),
Issuetitle varchar(1000),
`Pages` varchar(100),
ISBN varchar(25),
Dialect text,
Orthography text,
Provenance text,
LabSourceComments text,
PRIMARY KEY (Source_ID)
);
```
### Words
To create the Words table in MySQL, your command might look something like this:
```sql
CREATE TABLE words (
Word_ID varchar(100) NOT NULL,
Source varchar(255) NOT NULL,
`Page` int,
Dialect varchar(100),
Speaker varchar(100),
OriginalWord varchar(255),
WordTranslation varchar(255),
OriginalCategory varchar(100),
OriginalUR varchar(255),
LabWordCategory varchar(50),
OriginalPhrase varchar(1000),
OriginalPhraseTranslation varchar(1000),
OriginalPhraseUR varchar(1000),
OriginalPartialWord varchar(150),
OriginalPartialWordTranslation varchar(255),
OriginalPartialWordCategory varchar(50),
OriginalPartialWordUR varchar(255),
CitedFrom varchar(255),
OriginalComments text,
LabComments text,
PRIMARY KEY (Word_ID),
FOREIGN KEY (Source) REFERENCES sources(Source_ID) ON UPDATE CASCADE
);
```
#### Old templates below.
```sql
CREATE TABLE words (
Source varchar(255),
Word_ID varchar(100) NOT NULL, -- primary key should be NOT NULL
OriginalWord varchar(255),
OriginalTranslation varchar(255),
LabWordCategory varchar(50),
PRIMARY KEY (Word_ID) -- specify primary key column
);
```
**Note:** If you did not specify the character set and the collation on database/schema creation, you can specify it on table creation:
```sql
CREATE TABLE words (
Source varchar(255),
Word_ID varchar(100) NOT NULL,
OriginalWord varchar(255),
OriginalTranslation varchar(255),
LabWordCategory varchar(50),
PRIMARY KEY (Word_ID)
) DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
```
### Lemmas
Similarly, for the Lemmas table:
```sql
CREATE TABLE lemmas (
Lemma_ID varchar(100) NOT NULL,
LabLemma varchar(255),
LabLemmaCategory varchar(50),
LabLemmaTranslation varchar(255),
PRIMARY KEY (Lemma_ID)
);
```
### Stems
For the Stems table:
```sql
CREATE TABLE stems (
Stem_ID varchar(100) NOT NULL,
Word_ID varchar(100),
ContainedIn varchar(100),
Precedence int,
LabStem varchar(255),
LabStemCategory varchar(50),
Lemma_ID varchar(100),
LabStemComments text,
PRIMARY KEY (Stem_ID),
FOREIGN KEY (Word_ID) REFERENCES words(Word_ID) ON UPDATE CASCADE, -- foreign key on table creation
FOREIGN KEY (Lemma_ID) REFERENCES lemmas(Lemma_ID) ON UPDATE CASCADE,
FOREIGN KEY (ContainedIn) REFERENCES stems(Stem_ID) -- recursive structure
);
```
In addition to the table's primary key, the script above also specifies 3 foreign keys, referencing the Words table, the Lemmas table, and the Stems table itself.
\
**Note:** `ON UPDATE CASCADE` indicates that when an update occurs on the referenced column from the parent table (for example, `Word_ID` from Words), it must automatically update the matching rows in the child table (`Word_ID` in Stems) with the new value.
### Morphemes
For the Morphemes table:
```sql
CREATE TABLE morphemes (
Morpheme_ID varchar(100) NOT NULL,
Stem_ID varchar(100),
Precedence int,
LabMorpheme varchar(255),
LabMorphemeCategory varchar(50),
Lemma_ID varchar(100),
LabMorphemeComments text,
PRIMARY KEY (Morpheme_ID),
FOREIGN KEY (Stem_ID) REFERENCES stems(Stem_ID) ON UPDATE CASCADE,
FOREIGN KEY (Lemma_ID) REFERENCES lemmas(Lemma_ID) ON UPDATE CASCADE
);
```
## Uploading data from CSV files into tables
### General things to keep in mind
- If you're uploading entries into a table with a foreign key constraint, make sure that the values in the relevant columns are present in the table that the foreign key references. In other words, for example, before uploading new entries into the Stems table, make sure all the `Word_ID` values for these new entries exist in the Words table, and all the `Lemma_ID` values exist in the Lemmas table.
- It might make sense to update the Words and Lemmas tables before updating the Stems table, and update the Stems table before updating the Morphemes table.
- The column names in the last line of the scripts given below should be listed in the order that they appear in the corresponding CSV files, not in the order that they appear in the MySQL tables.
### Words
```sql
LOAD DATA LOCAL INFILE 'mockwords_dash.csv' -- file path
INTO TABLE words -- table name
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\r\n' -- might just be '\n' if you're not using Windows...?
IGNORE 1 ROWS -- ignore first row, which contains column names
(Source, Word_ID, OriginalWord, OriginalTranslation, LabWordCategory);
```
**Note:** The column names in the last line of the script above are given in the order that they appear in the CSV file, not in the order that they appear in the MySQL table.
### Lemmas
```sql
LOAD DATA LOCAL INFILE 'mocklemmas_dash.csv'
INTO TABLE lemmas
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 ROWS
(Lemma_ID, LabLemma, LabLemmaCategory, LabLemmaTranslation);
```
### Stems
Because the `ContainedIn` column in the Stems table is a foreign key column, but we also know that it might be empty, we first store it in a variable `@vContainedIn`, then, if it is an empty string, we replace it with a `NULL` so that the foreign key constraint is not violated.
```sql
LOAD DATA LOCAL INFILE 'mockstems_dash.csv'
INTO TABLE stems
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 ROWS
(Word_ID, Stem_ID, LabStem, LabStemCategory, @vContainedIn, Lemma_ID) -- read into variable @vContainedIn
SET ContainedIn = NULLIF(@vContainedIn, ''); -- set value to NULL if empty string read
```
### Morphemes
```sql
LOAD DATA LOCAL INFILE 'mockmorphs_dash.csv'
INTO TABLE morphemes
FIELDS TERMINATED BY ','
OPTIONALLY ENCLOSED BY '"'
LINES TERMINATED BY '\r\n'
IGNORE 1 ROWS
(Stem_ID, Morpheme_ID, LabMorpheme, LabMorphemeCategory, Lemma_ID);
```
## Other useful commands
### Adding a foreign key constraint
You can use `ALTER TABLE` to add constraints after the creation of a table.
If we wish to make the column `Lemma_ID` in the Stems table a foreign key *after* having already created the tables, we can do:
```sql
ALTER TABLE stems
ADD FOREIGN KEY (Lemma_ID) REFERENCES lemmas(Lemma_ID) ON UPDATE CASCADE;
```
### Dropping a foreign key constraint
```sql
ALTER TABLE stems DROP FOREIGN KEY stems_ibfk_5;
```
### Temporarily disabling foreign key checks
[See this link](https://tableplus.com/blog/2018/08/mysql-how-to-temporarily-disable-foreign-key-constraints.html) (I haven't tested this personally but it should work).
### Looking at constraints
We can look at the constraints on a given table:
```sql
SELECT COLUMN_NAME, CONSTRAINT_NAME, REFERENCED_COLUMN_NAME, REFERENCED_TABLE_NAME
FROM information_schema.KEY_COLUMN_USAGE
WHERE TABLE_NAME = 'stems'; -- replace with table name of choice
```