# Proposal: Use aos-dev/go-storage to replace storage.ExternalStorage
## Background
[dumpling][dumping] uses `storage.ExternalFileWriter` to support data export.
`storage.ExternalFileWriter` exposes the following API:
```golang
type ExternalFileWriter interface {
	// Write writes to buffer and if chunk is filled will upload it
	Write(ctx context.Context, p []byte) (int, error)
	// Close writes final chunk and completes the upload
	Close(ctx context.Context) error
}
```
To support multipart uploads, `storage.ExternalStorage` creates a struct that carries the upload ID and the completed parts:
```golang
type S3Uploader struct {
	svc           s3iface.S3API
	createOutput  *s3.CreateMultipartUploadOutput
	completeParts []*s3.CompletedPart
}
```
`S3Uploader` uploads a new part on every call to `Write` and completes the multipart upload in `Close`.
Based on this design, [dumpling][dumping]'s main data export logic is as follows:
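The write-then-complete flow described above can be sketched without a real S3 client. This is a minimal illustration, not the actual `S3Uploader`: the `partUploader` interface, the in-memory `memUploader`, and `partWriter` are all hypothetical names introduced here, and the sketch assumes one part per `Write` call.

```go
package main

import (
	"bytes"
	"context"
	"fmt"
)

// partUploader is a hypothetical stand-in for a multipart-capable backend.
// It is not the API of storage.ExternalStorage or of go-storage.
type partUploader interface {
	UploadPart(ctx context.Context, index int, data []byte) error
	Complete(ctx context.Context, parts int) error
}

// memUploader keeps parts in memory so the sketch runs without a real service.
type memUploader struct {
	parts  [][]byte
	object []byte // assembled when Complete is called
}

func (m *memUploader) UploadPart(_ context.Context, index int, data []byte) error {
	if index != len(m.parts) {
		return fmt.Errorf("unexpected part index %d", index)
	}
	// Copy the data because the caller may reuse its buffer.
	m.parts = append(m.parts, append([]byte(nil), data...))
	return nil
}

func (m *memUploader) Complete(_ context.Context, parts int) error {
	if parts != len(m.parts) {
		return fmt.Errorf("expected %d parts, have %d", parts, len(m.parts))
	}
	m.object = bytes.Join(m.parts, nil)
	return nil
}

// partWriter has the same method set as ExternalFileWriter:
// every Write uploads one part, and Close completes the upload.
type partWriter struct {
	up    partUploader
	parts int
}

func (w *partWriter) Write(ctx context.Context, p []byte) (int, error) {
	if err := w.up.UploadPart(ctx, w.parts, p); err != nil {
		return 0, err
	}
	w.parts++
	return len(p), nil
}

func (w *partWriter) Close(ctx context.Context) error {
	return w.up.Complete(ctx, w.parts)
}

// runDemo writes two chunks and returns the assembled object and the part count.
func runDemo() (string, int, error) {
	ctx := context.Background()
	backend := &memUploader{}
	w := &partWriter{up: backend}
	for _, chunk := range []string{"INSERT INTO t VALUES\n(1),\n", "(2);\n"} {
		if _, err := w.Write(ctx, []byte(chunk)); err != nil {
			return "", 0, err
		}
	}
	if err := w.Close(ctx); err != nil {
		return "", 0, err
	}
	return string(backend.object), len(backend.parts), nil
}

func main() {
	object, parts, err := runDemo()
	fmt.Printf("%q %d %v\n", object, parts, err)
}
```

Because the caller only sees `Write` and `Close`, the backend behind `partUploader` can be swapped without changing the export logic.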
```golang
func WriteInsert(pCtx *tcontext.Context, cfg *Config, meta TableMeta, tblIR TableDataIR, w storage.ExternalFileWriter) (n uint64, err error) {
	...
	wp := newWriterPipe(w, cfg.FileSize, cfg.StatementSize, cfg.Labels)
	...
	for fileRowIter.HasNext() {
		...
		for fileRowIter.HasNext() {
			lastBfSize := bf.Len()
			if selectedField != "" {
				if err = fileRowIter.Decode(row); err != nil {
					pCtx.L().Error("fail to scan from sql.Row", zap.Error(err))
					return counter, errors.Trace(err)
				}
				row.WriteToBuffer(bf, escapeBackslash)
			} else {
				bf.WriteString("()")
			}
			counter++
			wp.AddFileSize(uint64(bf.Len()-lastBfSize) + 2) // 2 is for ",\n" and ";\n"
			...
			fileRowIter.Next()
			shouldSwitch := wp.ShouldSwitchStatement()
			if fileRowIter.HasNext() && !shouldSwitch {
				bf.WriteString(",\n")
			} else {
				bf.WriteString(";\n")
			}
			if bf.Len() >= lengthLimit {
				select {
				case <-pCtx.Done():
					return counter, pCtx.Err()
				case err = <-wp.errCh:
					return counter, err
				case wp.input <- bf:
					bf = pool.Get().(*bytes.Buffer)
					if bfCap := bf.Cap(); bfCap < lengthLimit {
						bf.Grow(lengthLimit - bfCap)
					}
					AddCounter(finishedRowsCounter, cfg.Labels, float64(counter-lastCounter))
					lastCounter = counter
				}
			}
			if shouldSwitch {
				break
			}
		}
		if wp.ShouldSwitchFile() {
			break
		}
	}
	...
	if bf.Len() > 0 {
		wp.input <- bf
	}
	close(wp.input)
	<-wp.closed
	...
	return counter, wp.Error()
}
```
[dumpling][dumping] creates a buffer and calls `ExternalFileWriter.Write` every time the buffer grows to at least 1048576 bytes (1 MiB), as the `bf.Len() >= lengthLimit` check shows.
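The size-based flushing can be shown in isolation. The following is a simplified sketch, not dumpling's code: `flushRows` and `demoFlush` are hypothetical names, the channel-based writer pipe and buffer pool are replaced by a plain callback, and the limit is set to a few bytes so the behaviour is easy to follow.

```go
package main

import (
	"bytes"
	"fmt"
)

// flushRows appends each row to a buffer and calls flush once the buffer
// holds at least limit bytes, mirroring the `bf.Len() >= lengthLimit` check.
// Any remainder is flushed at the end. It returns the number of flush calls.
// The flush callback must not retain p, because the buffer is reused.
func flushRows(rows []string, limit int, flush func(p []byte) error) (int, error) {
	var bf bytes.Buffer
	flushes := 0
	emit := func() error {
		if err := flush(bf.Bytes()); err != nil {
			return err
		}
		flushes++
		bf.Reset()
		return nil
	}
	for _, row := range rows {
		bf.WriteString(row)
		if bf.Len() >= limit {
			if err := emit(); err != nil {
				return flushes, err
			}
		}
	}
	if bf.Len() > 0 {
		if err := emit(); err != nil {
			return flushes, err
		}
	}
	return flushes, nil
}

// demoFlush runs flushRows with a tiny limit and records the flushed sizes.
func demoFlush() ([]int, int, error) {
	var sizes []int
	rows := []string{"(1),\n", "(2),\n", "(3),\n", "(4);\n"} // 5 bytes each
	n, err := flushRows(rows, 8, func(p []byte) error {
		sizes = append(sizes, len(p))
		return nil
	})
	return sizes, n, err
}

func main() {
	sizes, n, err := demoFlush()
	fmt.Println(sizes, n, err)
}
```

With a real backend, each flush would correspond to one `Write` call, so the limit directly determines how large each uploaded chunk is.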
## Proposal
Integrating with every storage service is a real burden for an application, especially one with complicated business logic of its own. I therefore propose replacing `storage.ExternalStorage` with [aos-dev/go-storage].
[aos-dev/go-storage] is an application-oriented unified storage layer for Golang. Its design goals are **Production ready**, **High performance** and **Vendor agnostic**. go-storage aims to support as many services as possible, including S3, GCS, OSS, COS, Kodo (qiniu), QingStor, and even Dropbox (contributed by the community).
### Benefits
- go-storage is maintained by a dedicated team that focuses on storage, and it is licensed under [Apache-2.0](https://github.com/aos-dev/go-storage/blob/master/LICENSE).
- go-storage supports 10 storage services and may support more in the future.
- go-storage has all services tested via CI: https://github.com/aos-dev/go-service-s3/actions/workflows/intergration-test.yml
- go-storage is a general storage layer designed for different workloads, so it should not limit [dumpling][dumping] as its requirements grow.
### Drawbacks
- go-storage needs to support all features that [dumpling][dumping] currently supports, such as SSE, as described in issue [go-service-s3#51](https://github.com/aos-dev/go-service-s3/issues/51).
- [dumpling][dumping] needs to parse its configuration to construct go-storage's `Storager`.
## Implementation
For the first stage, we can replace only the `Write` and `Close` calls without touching other parts of the project.
- Change the config parsing to support constructing go-storage's `Storager`.
- Way A: Use go-storage's [Multiparter](https://github.com/aos-dev/go-storage/blob/master/types/operation.generated.go) to replace `storage.ExternalFileWriter`.
- Way B: Use go-storage to implement `storage.ExternalFileWriter`.
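The config-parsing step in the list above might start from something like the following sketch. It assumes the output location is given as a URL such as `s3://bucket/prefix?region=...`, which is an assumption of this example; `storageConfig` and `parseStorageURL` are hypothetical names, and the translation of the result into go-storage's own constructor arguments is deliberately left out.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// storageConfig is a hypothetical, backend-neutral description of the target.
type storageConfig struct {
	Service string            // e.g. "s3", taken from the URL scheme
	Bucket  string            // taken from the URL host
	Prefix  string            // path inside the bucket, without the leading slash
	Options map[string]string // remaining query parameters, e.g. region
}

// parseStorageURL splits a storage URL into its service, bucket, prefix,
// and options. Only the first value of each query parameter is kept.
func parseStorageURL(raw string) (*storageConfig, error) {
	u, err := url.Parse(raw)
	if err != nil {
		return nil, err
	}
	if u.Scheme == "" || u.Host == "" {
		return nil, fmt.Errorf("storage URL %q needs a scheme and a bucket", raw)
	}
	cfg := &storageConfig{
		Service: u.Scheme,
		Bucket:  u.Host,
		Prefix:  strings.TrimPrefix(u.Path, "/"),
		Options: map[string]string{},
	}
	for key, values := range u.Query() {
		if len(values) > 0 {
			cfg.Options[key] = values[0]
		}
	}
	return cfg, nil
}

func main() {
	cfg, err := parseStorageURL("s3://my-bucket/dump/2021?region=us-east-1")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%+v\n", *cfg)
}
```

Keeping the parsed form independent of any one service means the same step can feed either Way A or Way B.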
## Rationale
### `io/fs`
The `fs.FS` interface (package `io/fs`) has been part of the standard library since Go 1.16. However, it is designed to work with files rather than bytes or streams, and it lacks support for object storage's multipart objects.
### `spf13/afero`
[afero](https://github.com/spf13/afero) is another FileSystem Abstraction System for Go. As its name implies, it also works with files.
There is no official support for S3-like services, but there is a community-built one: [afero-s3](https://github.com/fclairamb/afero-s3/). It uses `S3Manager` in `Write` operations, which means users can't control the logic of the underlying multipart object.
---
[dumping]: https://github.com/pingcap/dumpling
[aos-dev/go-storage]: https://github.com/aos-dev/go-storage