# Setting Up Blackfoot Database on MySQL (Example Walkthrough) ## Still need to think about: - [x] Columns for each table - [x] Word components? - [x] Data type for each column - Length of `VARCHAR` columns: 1000 characters for phrase fields - [x] `README` table on NocoDB - [x] Default column order for each table (important?) - Use order from spreadsheets - [x] `ON DELETE` (and `ON UPDATE`) behavior for foreign keys (`CASCADE`, `SET NULL`, or just do nothing) - [relevant MySQL docs here](https://dev.mysql.com/doc/refman/8.0/en/create-table-foreign-keys.html) - `ON DELETE` default `NO ACTION`; `ON UPDATE CASCADE` - [x] How the presence of the foreign key links will restrict the ways in which data is uploaded/updated ## Setting up the database ### Create a Database/Schema Either **one** of the follow commands below works: ```sql CREATE SCHEMA IF NOT EXISTS testdb; CREATE DATABASE IF NOT EXISTS testdb; ``` You can specify the character set and collation for the database/schema upon creation of the database/schema: ```sql CREATE DATABASE testdb CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci; ``` ### Select the Database/Schema ```sql USE testdb; ``` ## Creating tables ### Sources ```sql CREATE TABLE sources ( Source_ID varchar(100) NOT NULL, Author varchar(255), `Year` varchar(25), Title varchar(1000), Subtitle varchar(1000), Booktitle varchar(1000), Booksubtitle varchar(1000), Editor varchar(255), Edition int, `Type` varchar(50), Note varchar(1000), Publisher varchar(1000), Institution varchar(1000), Address varchar(100), Journaltitle varchar(1000), Volume int, `Number` int, Series varchar(1000), Issuetitle varchar(1000), `Pages` varchar(100), ISBN varchar(25), Dialect text, Orthography text, Provenance text, LabSourceComments text, PRIMARY KEY (Source_ID) ); ``` ### Words To create the Words table in MySQL, your command might look something like this: ```sql CREATE TABLE words ( Word_ID varchar(100) NOT NULL, Source varchar(255) NOT NULL, `Page` int, Dialect varchar(100), Speaker varchar(100), OriginalWord varchar(255), WordTranslation varchar(255), OriginalCategory varchar(100), OriginalUR varchar(255), LabWordCategory varchar(50), OriginalPhrase varchar(1000), OriginalPhraseTranslation varchar(1000), OriginalPhraseUR varchar(1000), OriginalPartialWord varchar(150), OriginalPartialWordTranslation varchar(255), OriginalPartialWordCategory varchar(50), OriginalPartialWordUR varchar(255), CitedFrom varchar(255), OriginalComments text, LabComments text, PRIMARY KEY (Word_ID), FOREIGN KEY (Source) REFERENCES sources(Source_ID) ON UPDATE CASCADE ); ``` #### Old templates below. ```sql CREATE TABLE words ( Source varchar(255), Word_ID varchar(100) NOT NULL, -- primary key should be NOT NULL OriginalWord varchar(255), OriginalTranslation varchar(255), LabWordCategory varchar(50), PRIMARY KEY (Word_ID) -- specify primary key column ); ``` **Note:** If you did not specify the character set and the collation on database/schema creation, you can specify it on table creation: ```sql CREATE TABLE words ( Source varchar(255), Word_ID varchar(100) NOT NULL, OriginalWord varchar(255), OriginalTranslation varchar(255), LabWordCategory varchar(50), PRIMARY KEY (Word_ID) ) DEFAULT CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci; ``` ### Lemmas Similarly, for the Lemmas table: ```sql CREATE TABLE lemmas ( Lemma_ID varchar(100) NOT NULL, LabLemma varchar(255), LabLemmaCategory varchar(50), LabLemmaTranslation varchar(255), PRIMARY KEY (Lemma_ID) ); ``` ### Stems For the Stems table: ```sql CREATE TABLE stems ( Stem_ID varchar(100) NOT NULL, Word_ID varchar(100), ContainedIn varchar(100), Precedence int, LabStem varchar(255), LabStemCategory varchar(50), Lemma_ID varchar(100), LabStemComments text, PRIMARY KEY (Stem_ID), FOREIGN KEY (Word_ID) REFERENCES words(Word_ID) ON UPDATE CASCADE, -- foreign key on table creation FOREIGN KEY (Lemma_ID) REFERENCES lemmas(Lemma_ID) ON UPDATE CASCADE, FOREIGN KEY (ContainedIn) REFERENCES stems(Stem_ID) -- recursive structure ); ``` In addition to the table's primary key, the script above also specifies 3 foreign keys, referencing the Words table, the Lemmas table, and the Stems table itself. \ **Note:** `ON UPDATE CASCADE` indicates that when an update occurs on the referenced column from the parent table (for example, `Word_ID` from Words), it must automatically update the matching rows in the child table (`Word_ID` in Stems) with the new value. ### Morphemes For the Morphemes table: ```sql CREATE TABLE morphemes ( Morpheme_ID varchar(100) NOT NULL, Stem_ID varchar(100), Precedence int, LabMorpheme varchar(255), LabMorphemeCategory varchar(50), Lemma_ID varchar(100), LabMorphemeComments text, PRIMARY KEY (Morpheme_ID), FOREIGN KEY (Stem_ID) REFERENCES stems(Stem_ID) ON UPDATE CASCADE, FOREIGN KEY (Lemma_ID) REFERENCES lemmas(Lemma_ID) ON UPDATE CASCADE ); ``` ## Uploading data from CSV files into tables ### General things to keep in mind - If you're uploading entries into a table with a foreign key constraint, make sure that the values in the relevant columns are present in the table that the foreign key references. In other words, for example, before uploading new entries into the Stems table, make sure all the `Word_ID` values for these new entries exist in the Words table, and all the `Lemma_ID` values exist in the Lemmas table. - It might make sense to update the Words and Lemmas tables before updating the Stems table, and update the Stems table before updating the Morphemes table. - The column names in the last line of the scripts given below should be listed in the order that they appear in the corresponding CSV files, not in the order that they appear in the MySQL tables. ### Words ```sql LOAD DATA LOCAL INFILE 'mockwords_dash.csv' -- file path INTO TABLE words -- table name FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\r\n' -- might just be '\n' if you're not using Windows...? IGNORE 1 ROWS -- ignore first row, which contains column names (Source, Word_ID, OriginalWord, OriginalTranslation, LabWordCategory); ``` **Note:** The column names in the last line of the script above are given in the order that they appear in the CSV file, not in the order that they appear in the MySQL table. ### Lemmas ```sql LOAD DATA LOCAL INFILE 'mocklemmas_dash.csv' INTO TABLE lemmas FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\r\n' IGNORE 1 ROWS (Lemma_ID, LabLemma, LabLemmaCategory, LabLemmaTranslation); ``` ### Stems Because the `ContainedIn` column in the Stems table is a foreign key column, but we also know that it might be empty, we first store it in a variable `@vContainedIn`, then, if it is an empty string, we replace it with a `NULL` so that the foreign key constraint is not violated. ```sql LOAD DATA LOCAL INFILE 'mockstems_dash.csv' INTO TABLE stems FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\r\n' IGNORE 1 ROWS (Word_ID, Stem_ID, LabStem, LabStemCategory, @vContainedIn, Lemma_ID) -- read into variable @vContainedIn SET ContainedIn = NULLIF(@vContainedIn, ''); -- set value to NULL if empty string read ``` ### Morphemes ```sql LOAD DATA LOCAL INFILE 'mockmorphs_dash.csv' INTO TABLE morphemes FIELDS TERMINATED BY ',' OPTIONALLY ENCLOSED BY '"' LINES TERMINATED BY '\r\n' IGNORE 1 ROWS (Stem_ID, Morpheme_ID, LabMorpheme, LabMorphemeCategory, Lemma_ID); ``` ## Other useful commands ### Adding a foreign key constraint You can use `ALTER TABLE` to add constraints after the creation of a table. If we wish to make the column `Lemma_ID` in the Stems table a foreign key *after* having already created the tables, we can do: ```sql ALTER TABLE stems ADD FOREIGN KEY (Lemma_ID) REFERENCES lemmas(Lemma_ID) ON UPDATE CASCADE; ``` ### Dropping a foreign key constraint ```sql ALTER TABLE stems DROP FOREIGN KEY stems_ibfk_5; ``` ### Temporarily disabling foreign key checks [See this link](https://tableplus.com/blog/2018/08/mysql-how-to-temporarily-disable-foreign-key-constraints.html) (I haven't tested this personally but it should work). ### Looking at constraints We can look at the constraints on a given table: ```sql SELECT COLUMN_NAME, CONSTRAINT_NAME, REFERENCED_COLUMN_NAME, REFERENCED_TABLE_NAME FROM information_schema.KEY_COLUMN_USAGE WHERE TABLE_NAME = 'stems'; -- replace with table name of choice ```