# Solr dev notes
- Current release: 8.8.2
- Documentation: https://solr.apache.org/guide/8_8/index.html
- Solr schema design: https://solr.apache.org/guide/8_8/documents-fields-and-schema-design.html
:::info
**Important:**
Solr is not a database. It is a search index. Content in Solr should generally be treated as ephemeral.
:::
## General
- Two operating modes: standalone and cloud (SolrCloud).
- Certain advanced query operations (for example, Parallel SQL) are only available on cloud installs.
- iSamples installations will use SolrCloud.
- Interaction is through an HTTP API, with messages in JSON (preferred) or XML.
- A "collection" is roughly analogous to a database in SQL.
- A "collection" is defined by a combination of service configuration and a schema.
- The schema defines field types and fields that are available in a collection. Fields may be defined as created on demand, though this is generally discouraged for our work.
## Collections
- Collections can be created using the [collections API](https://solr.apache.org/guide/8_8/collections-api.html) or from the command line
- Collections may be spread over multiple shards (per server) and across multiple servers (cloud)
- Collections may be replicated across data centers
- Collection structure can impact certain types of query operations (e.g. joins require single shard collections)
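
As a rough illustration of the collections API route, the sketch below only builds the URL for a `CREATE` call and sends nothing. The host, port, and the collection name `isb_demo` are assumptions for illustration. `numShards=1` is chosen because joins require single shard collections.

```python
from urllib.parse import urlencode

SOLR_BASE = "http://localhost:8983/solr"  # assumption: local install on the default port


def create_collection_url(name, num_shards=1, replication_factor=1,
                          config_name="_default", base=SOLR_BASE):
    """Build a Collections API CREATE request URL (nothing is sent)."""
    params = {
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,  # 1 keeps join queries possible
        "replicationFactor": replication_factor,
        "collection.configName": config_name,
    }
    return f"{base}/admin/collections?{urlencode(params)}"


url = create_collection_url("isb_demo")  # hypothetical collection name
print(url)
```

Requesting the printed URL (for example with `curl`) against a running SolrCloud node would create the collection.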
## Fields
- Generally stick with the [recommended field types](https://solr.apache.org/guide/8_8/field-types-included-with-solr.html)
- `string` fields are handled literally, whereas `text` fields have analyzers applied (tokenization, stop words, synonym filters, etc.)
- Fields may be single-valued or multivalued
- Field values may be stored or not, indexed or not, and may have a default value or not
- Field types of initial interest to iSamples:
  - StrField (e.g. identifiers)
  - TextField (e.g. abstract text, descriptions)
  - DatePointField
  - DateRangeField
  - FloatPointField (possibly DoublePointField if high precision is required)
  - LatLonPointSpatialField
- Possibly also:
  - BBoxField
  - SpatialRecursivePrefixTreeFieldType
- Fields may be populated through [copy fields](https://solr.apache.org/guide/8_8/copying-fields.html).
  For example, to keep both a literal title (`title_str`) and an analyzed text title (`title_txt`),
  define a copy field rule with `title_str` as the source and `title_txt` as the destination.
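
The copy field arrangement described above could be expressed as Schema API commands along these lines. This is a minimal sketch: the field type names `string` and `text_general` are assumed from the `_default` configset, and the stored / indexed choices are illustrative rather than a recommendation.

```python
import json


def title_schema_commands():
    """Schema API commands: two fields plus a copy field rule between them."""
    return {
        "add-field": [
            # Literal title, kept exactly as supplied.
            {"name": "title_str", "type": "string", "stored": True, "indexed": True},
            # Analyzed title, filled from title_str by the copy field rule.
            {"name": "title_txt", "type": "text_general", "stored": False, "indexed": True},
        ],
        "add-copy-field": {"source": "title_str", "dest": ["title_txt"]},
    }


payload = json.dumps(title_schema_commands(), indent=2)
print(payload)  # body for a POST to http://localhost:8983/solr/<collection>/schema
```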
## Adding / editing documents (records)
- Every document must have a unique identifier
- Documents are sent to the collection's `update/` endpoint.
- Multiple documents can be sent per request
- Documents are stored immediately but searchable only after commit
- Commits are generally best managed by the server (triggered after a set elapsed time, number of documents, or backlog size)
- See tutorials and some code in [isb_lib](https://github.com/isamplesorg/isamples_inabox/blob/main/isb_lib/core.py#L44) and also the `scripts/sesar_things.py` and `scripts/geome_things.py` scripts in the [iSamples in a box repo](https://github.com/isamplesorg/isamples_inabox)
- Deletes can target a single document or all documents matching a query; see the [tutorial example](https://solr.apache.org/guide/8_8/solr-tutorial.html#deleting-data)
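
The add and delete operations listed above might look roughly like this as JSON requests. The sketch only constructs the requests and does not send them. The document ids, the `title_str` values, the `id` unique key, and the 5 second `commitWithin` are assumptions for illustration; `commitWithin` leaves the actual commit to the server.

```python
import json
from urllib.request import Request

SOLR_BASE = "http://localhost:8983/solr"  # assumption: local install on the default port


def update_request(collection, body, commit_within_ms=5000, base=SOLR_BASE):
    """Build (but do not send) a JSON POST to the collection's update endpoint."""
    url = f"{base}/{collection}/update?commitWithin={commit_within_ms}"
    return Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical records; each carries the unique identifier in `id`.
docs = [
    {"id": "demo:1", "title_str": "First sample"},
    {"id": "demo:2", "title_str": "Second sample"},
]
add_req = update_request("isb_rel", docs)  # several documents in one request
delete_one = update_request("isb_rel", {"delete": {"id": "demo:1"}})
delete_matching = update_request("isb_rel", {"delete": {"query": "title_str:Second*"}})

# To send for real: urllib.request.urlopen(add_req) against a running Solr.
print(add_req.full_url)
```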
## Searching
Many options are available for searching Solr collections. Most will be simple queries using the [common query parameters](https://solr.apache.org/guide/8_8/common-query-parameters.html).
[Faceting](https://solr.apache.org/guide/8_8/faceting.html) will also be commonly used. Faceting basically provides the unique values and their occurrence counts for a field.
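
A simple faceted query might be put together as sketched below. The field name `source` and the response fragment are hand-written for illustration, not real results. By default Solr returns field facets as a flat list alternating value and count, which `facet_counts` pairs up into a dictionary.

```python
from urllib.parse import urlencode

SOLR_BASE = "http://localhost:8983/solr"  # assumption: local install on the default port


def facet_query_url(collection, q, facet_field, rows=10, base=SOLR_BASE):
    """Build a select URL that also asks for value counts on one field."""
    params = [
        ("q", q),
        ("rows", rows),
        ("facet", "true"),
        ("facet.field", facet_field),
        ("facet.mincount", 1),
        ("wt", "json"),
    ]
    return f"{base}/{collection}/select?{urlencode(params)}"


def facet_counts(response, facet_field):
    """Turn the flat [value, count, value, count, ...] list into a dict."""
    flat = response["facet_counts"]["facet_fields"][facet_field]
    return dict(zip(flat[0::2], flat[1::2]))


# Hand-written fragment in the default facet layout, for illustration only.
example_response = {
    "facet_counts": {"facet_fields": {"source": ["SESAR", 3, "GEOME", 1]}}
}
print(facet_query_url("isb_rel", "*:*", "source"))
print(facet_counts(example_response, "source"))
```
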
Other types of searches likely to be important to iSamples include:
- [Spatial search](https://solr.apache.org/guide/8_8/spatial-search.html)
- [Graph traversal](https://solr.apache.org/guide/8_8/graph-traversal.html)
- [Parallel SQL](https://solr.apache.org/guide/8_8/parallel-sql-interface.html)
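
For spatial search, two common filter forms are a distance filter (`geofilt`) and a rectangular range query. The sketch below only formats the filter strings; `location` is a hypothetical LatLonPointSpatialField name and the coordinates are arbitrary.

```python
from urllib.parse import urlencode


def geofilt(field, lat, lon, distance_km):
    """Filter keeping documents within distance_km of the point (lat, lon)."""
    return f"{{!geofilt sfield={field} pt={lat},{lon} d={distance_km}}}"


def bbox_range(field, south, west, north, east):
    """Range filter over a rectangle, lower-left corner TO upper-right corner."""
    return f"{field}:[{south},{west} TO {north},{east}]"


# `location` is a hypothetical field name; coordinates are arbitrary.
fq = geofilt("location", 32.22, -110.97, 25)
params = urlencode({"q": "*:*", "fq": fq, "rows": 10})
print(fq)
print(bbox_range("location", 30, -115, 35, -105))
print(params)  # query string to append to .../select?
```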
## OS X install
Use [homebrew](https://brew.sh/). Basically:
```
brew install solr
```
Solr admin user interface: http://localhost:8983/
Start in the foreground (`-f`: foreground terminal, `-c`: cloud mode, `-p`: port, defaults to 8983); stop with Ctrl-C:
```
solr start -f -c
```
Start as a service (use `brew services stop solr` to stop):
```
brew services start solr
```
OS X Solr startup properties, in:
```
/usr/local/Cellar/solr/8.8.2/homebrew.mxcl.solr.plist
```
Default configuration:
```
/usr/local/Cellar/solr/8.8.2/server/solr/configsets/_default/conf
```
Create a core (standalone mode) or collection (SolrCloud mode) using the default configuration:
```
solr create -c collection_name
```
## Ubuntu setup
Solr config:
```
# /etc/default/solr.in.sh
ZK_HOST="localhost:2181/solr"
ZK_CLIENT_TIMEOUT="30000"
SOLR_HOST="127.0.0.1"
SOLR_WAIT_FOR_ZK="30"
SOLR_PID_DIR="/var/solr"
SOLR_HOME="/var/solr/data"
LOG4J_PROPS="/var/solr/log4j2.xml"
SOLR_LOGS_DIR="/var/solr/logs"
SOLR_PORT="8983"
```
Set up ZooKeeper:
```
bin/solr zk mkroot /solr -z localhost:2181
server/scripts/cloud-scripts/zkcli.sh \
-z localhost:2181 \
-cmd bootstrap \
-solrhome /var/solr/data
```
Create a core (a collection, when running in cloud mode) as the `solr` user:
```
sudo su - solr
/opt/solr/bin/solr create -c isb_rel
```
Then turn off automatic field creation:
```
solr config -c isb_rel -p 8983 \
-action set-user-property \
-property update.autoCreateFields \
-value false
```
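
The same user property can also be set over HTTP through the Config API. The sketch below builds that request without sending it; the host and port are assumptions matching the `SOLR_PORT` setting above.

```python
import json
from urllib.request import Request


def disable_autocreate_request(collection, base="http://localhost:8983/solr"):
    """Build (but do not send) the Config API call that turns off field guessing."""
    body = {"set-user-property": {"update.autoCreateFields": "false"}}
    return Request(
        f"{base}/{collection}/config",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = disable_autocreate_request("isb_rel")
# To send for real: urllib.request.urlopen(req) against a running Solr.
print(req.full_url, req.data.decode("utf-8"))
```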