# 012 Internal
## git-annex
- why
- you can manage data in terms of operations: copy, delete, etc., but that doesn't scale
- you can manage data in terms of "I need this data here" or "delete this data if you can verify two other copies exist"
- Basics under the hood
- symlinks
- metadata in git
- Alternatives
- git-lfs
- dvc
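To make the symlink idea concrete, here is a toy sketch in plain shell. It is not git-annex's real on-disk layout: the `objects` directory, the bare checksum used as a key, and the temporary working directory are all assumptions for illustration, and it assumes GNU `sha256sum` is installed.

```shell
#!/bin/sh
# Toy illustration only: a content-addressed store with a symlink left in
# place of the file. Directory name and key format are made up for this sketch.
set -eu

work=$(mktemp -d)          # throwaway working directory (assumption)
cd "$work"
mkdir objects

printf 'data\n' > data1

# Name the stored object after a checksum of its content.
key=$(sha256sum data1 | cut -d' ' -f1)

# Move the content into the store and leave a symlink behind.
mv data1 "objects/$key"
ln -s "objects/$key" data1

ls -l data1                # data1 now points at objects/<checksum>
cat data1                  # reading through the symlink prints: data
```

The symlink is small enough to commit to git, while the content it points to can live (or not live) in any given copy of the repository.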
Mechanics:
- Create the repository
- git init
- git annex init 'rkdarst computer'
- Single repo
- echo "data" > data1
- git annex add data1
- git commit
- explore what happens: symlink, objects dir
- ls -l
- git status
- git show
- ls .git/annex/objects/...
- cat the file
- git annex list
- create new files
- echo "data2" > data2
- dd if=/dev/urandom of=large bs=1M count=100
- git add data2
- This is not annexed
- git annex add large
- git commit
- git annex drop data1
- This raises an error, since it would remove the only copy
- git annex numcopies
- Try to edit the file
- We'll go to editing again later
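The single-repo steps above can be sketched as one script. This is a hedged sketch, not a transcript: the temporary directory, the repository description, and the placeholder git identity are assumptions, and it is meant to be saved and run as a script rather than pasted line by line. It assumes a filesystem that supports symlinks and a git-annex version where plain `git add` does not annex files by default.

```shell
#!/bin/sh
set -eu
command -v git-annex >/dev/null 2>&1 || { echo "git-annex not found; skipping"; exit 0; }

work=$(mktemp -d)                          # throwaway location (assumption)
cd "$work"
git init -q sample-project
cd sample-project
git config user.name "Demo User"           # placeholder identity
git config user.email "demo@example.com"
git annex init 'demo computer'

echo "data" > data1
git annex add data1
git commit -q -m "Add data1 to the annex"

ls -l data1                                # a symlink into .git/annex/objects/
git annex list                             # one row per file, one column per repository

echo "data2" > data2
git add data2                              # plain git add: not annexed
git commit -q -m "Add data2 to git"

# data1 exists only here, so drop is expected to refuse.
if git annex drop data1; then
    echo "unexpected: drop succeeded"
else
    echo "drop refused: no other copy could be verified"
fi

git annex numcopies                        # show the current numcopies setting
```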
- Editing files
- git annex unlock
- git annex add ; git commit
- git annex direct (older git-annex versions only; direct mode has since been removed in favor of unlocked files)
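A minimal sketch of the unlock, edit, re-add cycle follows. The repository name `edit-demo`, the temporary directory, and the git identity are hypothetical; the sketch assumes `git annex add` is used to store the edited content, which should work across repository versions.

```shell
#!/bin/sh
set -eu
command -v git-annex >/dev/null 2>&1 || { echo "git-annex not found; skipping"; exit 0; }

work=$(mktemp -d)                          # throwaway location (assumption)
cd "$work"
git init -q edit-demo                      # hypothetical repository name
cd edit-demo
git config user.name "Demo User"           # placeholder identity
git config user.email "demo@example.com"
git annex init 'demo computer'

echo "data" > data1
git annex add data1
git commit -q -m "Add data1"

git annex unlock data1                     # make the file editable
echo "edited" > data1
git annex add data1                        # store the new content in the annex
git commit -q -m "Edit data1"

cat data1                                  # prints: edited
```

The old content stays in the annex until it is explicitly cleaned up, so earlier commits can still be checked out.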
- Remotes
- concept of remote
- git remotes: git repositories
- git special remotes:
- Make a regular remote
- cd ..
- git clone sample-project sample-2
- cd sample-2
- git annex list
- We see 'origin' is a remote
- git annex get data1
- ls
- git annex get large
- git annex sync
- This is the universal command for "update what I have to all other repos"
- cd ../sample-project
- git annex list
- How does the original repository relate to the clone?
- git annex drop large
- git remote add other ../sample-2
- git annex drop large
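The regular-remote walkthrough above, as a hedged end-to-end sketch. The file is 1 MB instead of the 100 MB in the notes to keep it quick; the temporary directory, repository descriptions, and git identity are assumptions. The first `drop` is expected to be refused because the original repository has no remote it can ask; the second should succeed once the clone is registered as `other`.

```shell
#!/bin/sh
set -eu
command -v git-annex >/dev/null 2>&1 || { echo "git-annex not found; skipping"; exit 0; }

work=$(mktemp -d)                          # throwaway location (assumption)
cd "$work"

# First repository with one annexed file.
git init -q sample-project
cd sample-project
git config user.name "Demo User"           # placeholder identity
git config user.email "demo@example.com"
git annex init 'first repo'
dd if=/dev/urandom of=large bs=1M count=1 2>/dev/null
git annex add large
git commit -q -m "Add large"

# Clone it: the clone gets the symlink, but not the content yet.
cd ..
git clone -q sample-project sample-2
cd sample-2
git config user.name "Demo User"
git config user.email "demo@example.com"
git annex init 'second repo'
git annex get large                        # fetch content from 'origin'
git annex sync                             # share history and location tracking

# Back in the original, which has no remotes configured yet.
cd ../sample-project
git annex drop large || echo "drop refused: cannot reach another copy"

git remote add other ../sample-2
git annex sync
git annex drop large                       # 'other' can now confirm its copy
git annex list
```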
- Make a special remote
- mkdir /home/rkdarst/git/sample-data
- git annex initremote directory type=directory encryption=shared directory=/home/rkdarst/git/sample-data
- git annex list
- git annex copy -t directory large
- git annex list
- ls ../sample-data/...
- git remote add ; git annex enableremote (???)
- git annex sync -F
- git annex list
- git annex get -A
- git annex numcopies
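A hedged sketch of the directory special remote steps. Two deliberate differences from the notes: the store lives in a temporary directory instead of `/home/rkdarst/git/sample-data`, and it uses `encryption=none` rather than `encryption=shared` so the sketch does not depend on a working gpg setup. The git identity and the 1 MB file size are also assumptions.

```shell
#!/bin/sh
set -eu
command -v git-annex >/dev/null 2>&1 || { echo "git-annex not found; skipping"; exit 0; }

work=$(mktemp -d)                          # throwaway location (assumption)
cd "$work"
store="$work/sample-data"                  # stands in for the path in the notes
mkdir "$store"

git init -q sample-project
cd sample-project
git config user.name "Demo User"           # placeholder identity
git config user.email "demo@example.com"
git annex init 'demo computer'
dd if=/dev/urandom of=large bs=1M count=1 2>/dev/null
git annex add large
git commit -q -m "Add large"

# Create the special remote, named 'directory' as in the notes.
git annex initremote directory type=directory encryption=none directory="$store"
git annex copy --to directory large
git annex list                             # 'directory' shows up as a column

git annex drop large                       # the special remote still has a copy
git annex get large                        # fetch it back from the special remote
```

Unlike a regular remote, the directory holds only file content, not a git repository, so it cannot be cloned or synced with git itself.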
- special remote
- what is a special remote?
- git annex initremote
- directory remote
- Allas/google drive remote?
- source ~/.aws/credentials
- git annex initremote type=S3 encryption=shared host=a3s.fi bucket=ga-demo-1 protocol=https port=443
Advanced features:
- metadata!
- encryption
- git data https://github.com/AaltoSciComp/git-data
- git annex watch / assistant
- wanted rules and groups
- map
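As one small taste of the metadata feature, here is a hedged sketch that tags a file and then finds it by that tag. The field name `author`, the value `alice`, the repository name, and the git identity are all hypothetical.

```shell
#!/bin/sh
set -eu
command -v git-annex >/dev/null 2>&1 || { echo "git-annex not found; skipping"; exit 0; }

work=$(mktemp -d)                          # throwaway location (assumption)
cd "$work"
git init -q metadata-demo                  # hypothetical repository name
cd metadata-demo
git config user.name "Demo User"           # placeholder identity
git config user.email "demo@example.com"
git annex init 'demo computer'

echo "data" > data1
git annex add data1
git commit -q -m "Add data1"

git annex metadata -s author=alice data1   # attach a field to the file's content
git annex metadata data1                   # show the stored metadata
git annex find --metadata author=alice     # list files matching the field
```

The metadata is attached to the content key and recorded on the git-annex branch, so it travels with `git annex sync` like location tracking does.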