# isucon10 final environment setup

https://github.com/matsuu/aws-isucon

https://github.com/matsuu/vagrant-isucon/tree/master/isucon10-final-standalone

## server

aws ec2 t2.large

- 1: `13.231.182.10` application (xsuportal)
- 2: `52.198.229.213` real benchmarker

username: `ubuntu`

SSH keys for all members have already been added.

## Summary of modifications

- `benchmarker/main.go`
  ```diff=88
  - agent.DefaultTLSConfig.InsecureSkipVerify = false
  + agent.DefaultTLSConfig.InsecureSkipVerify = true
  ```
- `benchmarker/scenario/benchmarker.go`
  ```diff=276
  - tlsConfig = grpc.WithTransportCredentials(credentials.NewClientTLSFromCert(nil, ""))
  + // tlsConfig = grpc.WithTransportCredentials(credentials.NewClientTLSFromCert(nil, ""))
  ```
- Append to `/etc/hosts`
  ```
  172.31.20.0 app1.isucon.unipota.me
  172.31.17.1 app2.isucon.unipota.me
  172.31.45.139 bench.isucon.unipota.me
  ```
- `/home/isucon/env` @ app.t.isucon.dev
  ```diff
   MYSQL_HOSTNAME=localhost
   MYSQL_PORT=3306
   MYSQL_USER=isucon
   MYSQL_PASS=isucon
   MYSQL_DATABASE=xsuportal
  +BENCHMARK_SERVER_HOST=app.t.isucon.dev
  +BENCHMARK_SERVER_PORT=50051
  ```

## setup

- Setup procedure
  - https://github.com/matsuu/vagrant-isucon/tree/master/isucon10-final-standalone
  - https://github.com/isucon/isucon10-final
- `sudo -i -u isucon`
- Start the services
  ```
  sudo systemctl start envoy
  sudo systemctl start xsuportal-api-golang.service
  sudo systemctl start xsuportal-web-golang.service
  ```
- benchmarker
  - Running the benchmarker from a separate server fails with an error like the one below, so this needs fixing
    ```
    x509: cannot validate certificate for 172.31.33.9 because it doesn't contain any IP SANs
    ```
    - Caused by the self-signed certificate?
  - `benchmarker/main.go:88`
    ```diff
    - agent.DefaultTLSConfig.InsecureSkipVerify = false
    + agent.DefaultTLSConfig.InsecureSkipVerify = true
    ```
  - `/etc/hosts`
    ```
    13.231.182.10 app.t.isucon.dev
    52.198.229.213 bench.t.isucon.dev
    ```
  - Where the certificate is generated when the EC2 image is built
    - https://github.com/matsuu/aws-isucon/blob/main/isucon10-final/provision.sh#L21
  - [How do I use SANs with openSSL instead of common name? - Stack Overflow](https://stackoverflow.com/questions/64814173/how-do-i-use-sans-with-openssl-instead-of-common-name)
    - Passing `-addext 'subjectAltName = DNS:app.t.isucon.dev' -addext 'subjectAltName = DNS:bench.t.isucon.dev'` to the `openssl` command might work?
    - ```
      openssl req -x509 -subj "/CN=*.t.isucon.dev" \
        -addext "subjectAltName = DNS:app.t.isucon.dev, DNS:bench.t.isucon.dev" \
        -sha256 -nodes -days 3650 -newkey rsa:2048 \
        -keyout secrets/san-key.pem -out secrets/san-cert.pem
      ```
      - This prints `req: Use -help for summary.`, so something about the invocation seems to be wrong
  - [Know about SAN Certificate and How to Create With OpenSSL](https://geekflare.com/san-ssl-certificate/)
    - > SAN stands for “Subject Alternative Names” and this helps you to have a single certificate for multiple CN (Common Name).
  - Pass a runtime environment variable to systemd-run
    - `-E GODEBUG=x509ignoreCN=0`

    ```
    sudo systemd-run \
      --working-directory=/home/isucon/benchmarker \
      --pipe \
      --wait \
      --collect \
      --uid=$(id -u) \
      --gid=$(id -g) \
      --slice=benchmarker.slice \
      --service-type=oneshot \
      -E GODEBUG=x509ignoreCN=0 \
      -p AmbientCapabilities=CAP_NET_BIND_SERVICE \
      -p CapabilityBoundingSet=CAP_NET_BIND_SERVICE \
      -p LimitNOFILE=2000000 \
      -p TimeoutStartSec=110s \
      ~isucon/benchmarker/bin/benchmarker \
        -exit-status \
        -tls \
        -target app.t.isucon.dev:443 \
        -host-advertise bench.t.isucon.dev \
        -push-service-port 1001 \
        -tls-cert /etc/ssl/private/tls-cert.pem \
        -tls-key /etc/ssl/private/tls-key.pem
    ```
    ```
    07:03:41.881460 ISUCON10 benchmarker e858b2588a199f9c7407baacf48b53126b8aeed6+dirty
    07:03:41.881569 ===> PREPARE
    07:03:41.975203 ERR: prepare: critical: http: Post "https://app.t.isucon.dev:443/initialize": x509: certificate signed by unknown authority
    07:03:41.975267 ===> SCORE
    Count:
    (0 * 1.0) + 0 - 0(err: 0, timeout: 0)
    Pass: false / score: 0 (0 - 0)
    Fail reason: Critical error
    ```
  - `benchmarker/main.go:88`
    ```diff
    - agent.DefaultTLSConfig.InsecureSkipVerify = false
    + agent.DefaultTLSConfig.InsecureSkipVerify = true
    ```
    ```
    07:05:33.772142 ISUCON10 benchmarker e858b2588a199f9c7407baacf48b53126b8aeed6+dirty
    07:05:33.772247 ===> PREPARE
    07:05:33.971151 ERR: prepare: critical: http-server-error: 不正な HTTP ステータスコード: 503 (POST: /initialize)
    07:05:33.971226 ===> SCORE
    Count:
    ```
    ```shell
    $ journalctl -u xsuportal-web-golang.service | grep /initialize
    May 08 05:55:10 ip-172-31-42-155 .x[1894]: {"time":"2021-05-08T05:55:10.3458802Z","id":"","remote_ip":"127.0.0.1","host":"localhost:9292","method":"POST","uri":"/initialize","user_agent":"benchmarker-initializer","status":200,"error":"","latency":787534298,"latency_human":"787.534298ms","bytes_in":34,"bytes_out":21}
    May 08 05:55:18 ip-172-31-42-155 .x[1894]: {"time":"2021-05-08T05:55:18.878274989Z","id":"","remote_ip":"127.0.0.1","host":"localhost:9292","method":"POST","uri":"/initialize","user_agent":"benchmarker-initializer","status":200,"error":"","latency":357342096,"latency_human":"357.342096ms","bytes_in":34,"bytes_out":21}
    ```
- Envoy
  - config: `/etc/envoy/config.yaml`
  - log: `/var/log/envoy/access.log`

  ```shell
  isucon@ip-172-31-39-27:~$ curl --insecure https://13.231.182.10
  <!doctype html><html lang="ja" class="has-background-grey-lighter"><head><title>XSUCON Portal</title><meta name="viewport" content="width=device-width,initial-scale=1"/><meta name="xsu:api-base-url" content="/"/><link href="/packs/vendor.css" rel="stylesheet"></head><body><div id="app"></div><script src="/packs/vendor.js"></script><script src="/packs/audience.js"></script></body></html>
  isucon@ip-172-31-39-27:~$ curl --insecure https://app.t.isucon.dev/
  upstream connect error or disconnect/reset before headers. reset reason: connection failureisucon
  ```
  ```shell
  $ curl -vk https://app.t.isucon.dev
  * Trying ::1:443...
  * TCP_NODELAY set
  * Connected to app.t.isucon.dev (::1) port 443 (#0)
  * ALPN, offering h2
  * ALPN, offering http/1.1
  * successfully set certificate verify locations:
  * CAfile: /etc/ssl/certs/ca-certificates.crt
    CApath: /etc/ssl/certs
  * TLSv1.3 (OUT), TLS handshake, Client hello (1):
  * TLSv1.3 (IN), TLS handshake, Server hello (2):
  * TLSv1.2 (IN), TLS handshake, Certificate (11):
  * TLSv1.2 (IN), TLS handshake, Server key exchange (12):
  * TLSv1.2 (IN), TLS handshake, Server finished (14):
  * TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
  * TLSv1.2 (OUT), TLS change cipher, Change cipher spec (1):
  * TLSv1.2 (OUT), TLS handshake, Finished (20):
  * TLSv1.2 (IN), TLS handshake, Finished (20):
  * SSL connection using TLSv1.2 / ECDHE-RSA-CHACHA20-POLY1305
  * ALPN, server accepted to use h2
  * Server certificate:
  * subject: CN=*.t.isucon.dev
  * start date: May 1 03:54:40 2021 GMT
  * expire date: Apr 29 03:54:40 2031 GMT
  * issuer: CN=*.t.isucon.dev
  * SSL certificate verify result: self signed certificate (18), continuing anyway.
  * Using HTTP2, server supports multi-use
  * Connection state changed (HTTP/2 confirmed)
  * Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
  * Using Stream ID: 1 (easy handle 0x5643cf49d820)
  > GET / HTTP/2
  > Host: app.t.isucon.dev
  > user-agent: curl/7.68.0
  > accept: */*
  >
  * Connection state changed (MAX_CONCURRENT_STREAMS == 2147483647)!
  < HTTP/2 503
  < content-length: 91
  < content-type: text/plain
  < date: Sat, 08 May 2021 07:39:23 GMT
  < server: envoy
  <
  * Connection #0 to host app.t.isucon.dev left intact
  upstream connect error or disconnect/reset before headers. reset reason: connection failure
  ```

The `/etc/hosts` entry was wrong :kan: (the curl output above shows `app.t.isucon.dev` resolving to `::1`)

- AWS ![](https://i.imgur.com/IKYfIL5.png)

webapp/golang/cmd/xsuportal/main.go

```go=188
host := util.GetEnv("BENCHMARK_SERVER_HOST", "localhost")
port, _ := strconv.Atoi(util.GetEnv("BENCHMARK_SERVER_PORT", "50051"))
```

Is this the benchmarker on the xsucon side?
← Yes

- Set `BENCHMARK_SERVER_HOST` in `/home/isucon/env` on server 1
  - `BENCHMARK_SERVER_HOST=app.t.isucon.dev`
  - `sudo systemctl daemon-reload`
  - `sudo systemctl restart xsuportal-web-golang.service`

```
08:38:28.014903 ISUCON10 benchmarker e858b2588a199f9c7407baacf48b53126b8aeed6+dirty
08:38:28.015092 ===> PREPARE
08:38:37.792012 Language: go
08:38:37.792034 HTTP: https://app.t.isucon.dev:443/(tls=true)
08:38:37.792040 gRPC: localhost:50051
08:38:37.792120 ===> LOAD
08:38:37.792148 LOAD INFO
  Registration Open at: 2021-05-08 08:38:38 +0000 UTC
  Contest Start at: 2021-05-08 08:38:48 +0000 UTC
  Contest Freeze at: 2021-05-08 08:39:28 +0000 UTC
  Contest Ends at: 2021-05-08 08:39:38 +0000 UTC
08:38:48.134783 ERR: load: benchmarker-receive: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:50051: connect: connection refused"
08:38:48.147122 ERR: load: benchmarker-receive: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:50051: connect: connection refused"
08:38:48.165834 ERR: load: benchmarker-receive: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:50051: connect: connection refused"
08:38:48.192635 ERR: load: benchmarker-receive: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:50051: connect: connection refused"
08:38:48.204953 ERR: load: benchmarker-receive: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:50051: connect: connection refused"
08:38:48.218591 ERR: load: benchmarker-receive: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:50051: connect: connection refused"
08:38:48.233141 ERR: load: benchmarker-receive: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:50051: connect: connection refused"
08:38:48.273034 ERR: load: benchmarker-receive: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:50051: connect: connection refused"
08:38:48.279541 ERR: load: benchmarker-receive: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:50051: connect: connection refused"
08:38:48.294305 ERR: load: benchmarker-receive: rpc error: code = Unavailable desc = connection error: desc = "transport: Error while dialing dial tcp 127.0.0.1:50051: connect: connection refused"
```

```
mysql> select * from benchmark_jobs;
+----+---------+--------+----------------------+--------+-----------+-----------------+--------+--------+------------+-------------+----------------------------+----------------------------+
| id | team_id | status | target_hostname      | handle | score_raw | score_deduction | reason | passed | started_at | finished_at | created_at                 | updated_at                 |
+----+---------+--------+----------------------+--------+-----------+-----------------+--------+--------+------------+-------------+----------------------------+----------------------------+
|  1 |      10 |      0 | xsu-contestant-00010 | NULL   |      NULL |            NULL | NULL   |   NULL | NULL       | NULL        | 2021-05-08 08:45:37.050763 | 2021-05-08 08:45:37.050763 |
|  2 |       2 |      0 | xsu-contestant-00002 | NULL   |      NULL |            NULL | NULL   |   NULL | NULL       | NULL        | 2021-05-08 08:45:37.050891 | 2021-05-08 08:45:37.050891 |
```

Error details (the message says that a leaderboard older than expected was returned)

```
08:45:48.073575 ERR: load: invalid-response: 期待より古い内容のリーダーボードが返却されています(GET /api/contestant/dashboard)
08:45:48.074067 OLDER LEADERBOARD: 
 2021-05-08 08:45:48.047563828 +0000 UTC requested at
 2021-05-08 08:45:37.000182 +0000 UTC latest finish
 2021-05-08 08:45:35.000182 +0000 UTC allowed cache time
 1970-01-01 00:00:00 +0000 UTC leadeboard max time
 2021-05-08 08:46:17 +0000 UTC frozen time
 2021-05-08 08:45:48.07405842 +0000 UTC now time
```

The real benchmarker code that emits the error above (benchmarker/scenario/verify.go)

```go=315
AdminLogger.Printf("OLDER LEADERBOARD: \n %s requested at\n %s latest finish\n %s allowed cache time\n %s leadeboard max time\n %s frozen time\n %s now time\n", requestedAt, sLatestMarkedAt, allowedMaxTime.Add(-cacheTime), maxMarkedAt, contest.ContestFreezesAt, time.Now().UTC())
if allowCache {
	return errorInvalidResponse("期待より古い内容のリーダーボードが返却されています(GET /api/audience/dashboard)")
} else {
	return errorInvalidResponse("期待より古い内容のリーダーボードが返却されています(GET /api/contestant/dashboard)")
}
```

webapp/golang/cmd/xsuportal/main.go

```golang
func (*ContestantService) EnqueueBenchmarkJob(e echo.Context) error {
	// (omitted)
	_, err = tx.Exec(
		"INSERT INTO `benchmark_jobs` (`team_id`, `target_hostname`, `status`, `updated_at`, `created_at`) VALUES (?, ?, ?, NOW(6), NOW(6))",
		team.ID,
		req.TargetHostname,
		int(resourcespb.BenchmarkJob_PENDING),
	)
```

https://github.com/fullstorydev/grpcurl

```shell
$ go/bin/grpcurl app.t.isucon.dev:50051 xsuportal.proto.services.bench.BenchmarkQueue/ReceiveBenchmarkJob
Failed to dial target host "app.t.isucon.dev:50051": tls: first record does not look like a TLS handshake
$ go/bin/grpcurl -plaintext app.t.isucon.dev:50051 xsuportal.proto.services.bench.BenchmarkQueue/ReceiveBenchmarkJob
Error invoking method "xsuportal.proto.services.bench.BenchmarkQueue/ReceiveBenchmarkJob": failed to query for service descriptor "xsuportal.proto.services.bench.BenchmarkQueue": server does not support the reflection API
```

https://github.com/fullstorydev/grpcurl#proto-source-files

> To use grpcurl on servers that do not support reflection, you can use .proto source files.
As described above, passing the `.proto` file lets grpcurl reach the service from server 2

```shell
$ go/bin/grpcurl -plaintext -proto proto/xsuportal/services/bench/receiving.proto app.t.isucon.dev:50051 xsuportal.proto.services.bench.BenchmarkQueue/ReceiveBenchmarkJob
{
  "jobHandle": {
    "jobId": "1",
    "handle": "CiLAofVx0z13pdtnaH7G8A==",
    "targetHostname": "xsu-contestant-00008",
    "contestStartedAt": "2021-05-08T13:59:49Z",
    "jobCreatedAt": "2021-05-08T13:59:49.054096Z"
  }
}
```

[isucon10-final/webapp/tools at master · isucon/isucon10-final](https://github.com/isucon/isucon10-final/tree/master/webapp/tools)

/etc/hosts

```
#public
#13.231.182.10 app.t.isucon.dev
#52.198.229.213 bench.t.isucon.dev
#private
172.31.42.155 app.t.isucon.dev
172.31.39.27 bench.t.isucon.dev
```

Try setting `gRPC: app.t.isucon.dev:443`

```
13:30:28.134857 ERR: load: benchmarker-receive: rpc error: code = Unavailable desc = connection error: desc = "transport: authentication handshake failed: x509: certificate signed by unknown authority"
```

Try reverting to the original certificate

```
13:42:24.377349 ERR: load: benchmarker-receive: rpc error: code = Unavailable desc = connection error: desc = "transport: authentication handshake failed: x509: certificate relies on legacy Common Name field, use SANs or temporarily enable Common Name matching with GODEBUG=x509ignoreCN=0"
```

Remove the `-tls` option when running the benchmarker

```
13:44:50.817450 ===> PREPARE
13:44:50.820464 ERR: prepare: critical: http: Post "http://app.t.isucon.dev:443/initialize": EOF
```

Set `-target 13.231.182.10`

```
13:46:12.458471 ===> PREPARE
13:46:12.461016 ERR: prepare: critical: http-server-error: 不正な HTTP ステータスコード: 301 (POST: /initialize)
```

Disabling TLS in the real benchmarker's gRPC client finally produced a positive score :tada:

https://github.com/isucon/isucon10-final/blob/master/benchmarker/scenario/benchmarker.go#L276

```diff
- tlsConfig = grpc.WithTransportCredentials(credentials.NewClientTLSFromCert(nil, ""))
+ // tlsConfig = grpc.WithTransportCredentials(credentials.NewClientTLSFromCert(nil, ""))
```

- Just removing the `-tls` option passed to the real benchmarker was not enough
- `web` (HTTP) had to keep TLS enabled, while `api` (gRPC) had to have TLS disabled

[grpc-go/grpc-auth-support.md at master · grpc/grpc-go](https://github.com/grpc/grpc-go/blob/master/Documentation/grpc-auth-support.md#enabling-tls-on-a-grpc-server)

> Enabling TLS on a gRPC server

```go
creds, err := credentials.NewServerTLSFromFile(certFile, keyFile)
server := grpc.NewServer(grpc.Creds(creds))
```

Maybe rewriting ~/webapp/golang/cmd/benchmark_server/main.go to enable TLS would do it?

```golang=284
creds, err := credentials.NewServerTLSFromFile("/home/isucon/secrets/san-cert.pem", "/home/isucon/secrets/san-key.pem")
if err != nil {
	log.Fatalf("Failed to generate credentials %v", err)
}
server := grpc.NewServer(grpc.Creds(creds))
```

Running the real benchmarker against this failed with `"transport: authentication handshake failed: x509: certificate signed by unknown authority"` :sad_parrot:

If anything, I am curious how this setup worked in the actual contest...

## Notes

- How to use journalctl https://qiita.com/aosho235/items/9fbff75e9cccf351345c

[そもそも RPC ってなんだ - Qiita](https://qiita.com/il-m-yamagishi/items/8709de06be33e7051fd2) (What is RPC in the first place?)

[内部実装から理解するgRPC - Qiita](https://qiita.com/immrshc/items/075fcbf22b5f5e48f9ec) (Understanding gRPC from its internal implementation)