EZID Operational Error
Issue description
- EZID processes can't connect to the Amazon RDS (Relational Database Service) that is running MySQL and keeps receiving one of these errors.
- EZID was down for ~3 minutes and operations resumed as soon as the RDS was up.
Observations:
- Something terribly wrong happend at 12:22 PM (Pacific Time) on the
rds-ias-ezid-search
production server.
Note: We had observed similar service failure on August 16, 2019
- The server came back up at ~12:25 PM (Pacific Time)
- Ways to regenerate the error:
Errors
Error 1
OperationalError: (2013, 'Lost connection to MySQL server during query')
Error 2
OperationalError: (2003, "Can't connect to MySQL server on 'rds-ias-ezid-search-prd…' (111)")
Error 3
OperationalError: (2006, 'MySQL server has gone away')
Possible solutions?
Given that all the services are encountering issues during a specifc time frame, it might be useful to check if they are failing over any AWS RDS Events and reviewing MySQL error logs.
How To:
Check AWS RDS Events
Review MySQL error logs
Handle specific cases
If the failure occurs due to any of the following reasons:
- connection times out
- If a connection times out, increase the timeouts for MySQL by increasing the wait_timeout and interactive_timeout parameters by using a custom parameter group.
- the query that is generating the error is retrieving a large data set
- Increase the max_allowed_packet size parameter by using a custom parameter group.
- If millions of rows are being sent as part of one or more queries
- Try increasing net_read_timeout from its default of 30 seconds to 60 seconds or longer, sufficient for the data transfer to complete.
- You have encountered a bug where the server died while executing the query.
- Get the query that crashed the server and try to fix it.
Logs
EZID API/UI logs
/ezid/apache/ezid/SITE/logs/transactions.log
2019-08-22 12:22:00,199 1d3ae118c51211e9b6b30a9901e9a631 BEGIN getMetadata doi:10.7284/115059 anonymous anonymous anonymous anonymous False
2019-08-22 12:22:00,210 1d3ae118c51211e9b6b30a9901e9a631 END BADREQUEST
2019-08-22 12:22:11,079 23b70012c51211e9b6b30a9901e9a631 BEGIN getMetadata doi:10.7284/900617 anonymous anonymous anonymous anonymous False
2019-08-22 12:22:11,091 23b70012c51211e9b6b30a9901e9a631 END BADREQUEST
2019-08-22 12:22:31,959 - STATUS pid=17922 threads=22 running activeOperations=0 waitingRequests=0 activeDataciteOperations=0 updateQueueLength=0 binderQueueLength=0 dataciteQueueLength=2 crossrefQueue:archived/unsubmitted/submitted=5/0/0 downloadQueueLength=0 activeSearches=0 operationCount=6
2019-08-22 12:22:37,637 338b7040c51211e993c60a9901e9a631 BEGIN getMetadata doi:10.7284/901508 anonymous anonymous anonymous anonymous False
2019-08-22 12:22:37,649 338b7040c51211e993c60a9901e9a631 END BADREQUEST
2019-08-22 12:22:43,378 36f76dbac51211e9932d0a9901e9a631 BEGIN getMetadata doi:10.7284/107166 anonymous anonymous anonymous anonymous False
2019-08-22 12:22:43,390 36f76dbac51211e9932d0a9901e9a631 END BADREQUEST
2019-08-22 12:22:48,183 39d4ad54c51211e9932d0a9901e9a631 BEGIN getMetadata doi:10.7284/904436 anonymous anonymous anonymous anonymous False
2019-08-22 12:22:48,195 39d4ad54c51211e9932d0a9901e9a631 END BADREQUEST
2019-08-22 12:22:49,393 3a8d395ac51211e993c60a9901e9a631 BEGIN search/count - invemardoi ark:/99166/p9dz03470 invemar ark:/99166/p9jm23k00 owner invemardoi
2019-08-22 12:22:49,410 3a8d395ac51211e993c60a9901e9a631 END SUCCESS 1699
2019-08-22 12:22:49,411 3a900374c51211e993c60a9901e9a631 BEGIN search/results - invemardoi ark:/99166/p9dz03470 invemar ark:/99166/p9jm23k00 -updateTime 0 10 owner invemardoi
2019-08-22 12:22:49,415 3a900374c51211e993c60a9901e9a631 END SUCCESS 10
2019-08-22 12:22:53,155 3ccb4f54c51211e993c60a9901e9a631 BEGIN search/count - invemardoi ark:/99166/p9dz03470 invemar ark:/99166/p9jm23k00 owner invemardoi keywords doi.org/10.5072/FK2VQ3122Q
2019-08-22 12:22:53,405 3ccb4f54c51211e993c60a9901e9a631 END SUCCESS 0
2019-08-22 12:22:53,406 3cf19f42c51211e993c60a9901e9a631 BEGIN search/results - invemardoi ark:/99166/p9dz03470 invemar ark:/99166/p9jm23k00 -updateTime 0 10 owner invemardoi keywords doi.org/10.5072/FK2VQ3122Q
2019-08-22 12:22:53,634 3cf19f42c51211e993c60a9901e9a631 END SUCCESS 0
2019-08-22 12:23:04,438 43850556c51211e993c60a9901e9a631 BEGIN mintIdentifier ark:/99999/fk4 merritt ark:/99166/p9td9n725 merritt ark:/99166/p9pn8xd8d
2019-08-22 12:23:04,438 43850556c51211e993c60a9901e9a631 END SUCCESS ark:/99999/fk4ff51c9m
2019-08-22 12:23:04,439 438515e6c51211e993c60a9901e9a631 BEGIN createIdentifier ark:/99999/fk4ff51c9m merritt ark:/99166/p9td9n725 merritt ark:/99166/p9pn8xd8d
2019-08-22 12:23:04,473 438515e6c51211e993c60a9901e9a631 END SUCCESS
2019-08-22 12:23:04,546 439577d8c51211e993c60a9901e9a631 BEGIN setMetadata ark:/99999/fk4ff51c9m merritt ark:/99166/p9td9n725 merritt ark:/99166/p9pn8xd8d
2019-08-22 12:23:04,577 439577d8c51211e993c60a9901e9a631 END SUCCESS
2019-08-22 12:23:11,315 479e652ec51211e993c60a9901e9a631 BEGIN getMetadata doi:10.7284/901440 anonymous anonymous anonymous anonymous False
2019-08-22 12:23:11,328 479e652ec51211e993c60a9901e9a631 END BADREQUEST
2019-08-22 12:23:17,725 4b704bc2c51211e993c60a9901e9a631 BEGIN search/count - invemardoi ark:/99166/p9dz03470 invemar ark:/99166/p9jm23k00 owner invemardoi keywords https:
2019-08-22 12:23:32,739 - STATUS pid=17922 threads=22 running activeOperations=0 waitingRequests=0 activeDataciteOperations=0 updateQueueLength=0 binderQueueLength=0 dataciteQueueLength=2 crossrefQueue:archived/unsubmitted/submitted=5/0/0 downloadQueueLength=0 activeSearches=1 operationCount=12
2019-08-22 12:23:36,202 5673ba86c51211e986c90a9901e9a631 BEGIN mintIdentifier ark:/88435/dsp01 pudiglib ark:/99166/p9df6k49v pudiglib ark:/99166/p9j678z36
2019-08-22 12:23:36,202 5673ba86c51211e986c90a9901e9a631 END SUCCESS ark:/88435/dsp011v53k084w
2019-08-22 12:23:36,202 5673cb16c51211e986c90a9901e9a631 BEGIN createIdentifier ark:/88435/dsp011v53k084w pudiglib ark:/99166/p9df6k49v pudiglib ark:/99166/p9j678z36
2019-08-22 12:23:39,296 5673cb16c51211e986c90a9901e9a631 END SUCCESS
2019-08-22 12:23:39,456 58645a1cc51211e986c90a9901e9a631 BEGIN setMetadata ark:/88435/dsp011v53k084w pudiglib ark:/99166/p9df6k49v pudiglib ark:/99166/p9j678z36 _target http:
2019-08-22 12:23:43,197 58645a1cc51211e986c90a9901e9a631 END SUCCESS
2019-08-22 12:23:44,279 5b441bfac51211e9932d0a9901e9a631 BEGIN search/count - invemardoi ark:/99166/p9dz03470 invemar ark:/99166/p9jm23k00 owner invemardoi keywords https:
2019-08-22 12:23:44,847 5b9ac414c51211e9b6b30a9901e9a631 BEGIN search/count - invemardoi ark:/99166/p9dz03470 invemar ark:/99166/p9jm23k00 owner invemardoi keywords doi.org/10.5072/FK2VQ3122Q
2019-08-22 12:23:46,689 5cb408ecc51211e9882d0a9901e9a631 BEGIN getMetadata doi:10.7284/117233 anonymous anonymous anonymous anonymous False
2019-08-22 12:23:47,616 5cb408ecc51211e9882d0a9901e9a631 END BADREQUEST
2019-08-22 12:23:48,538 5b9ac414c51211e9b6b30a9901e9a631 END SUCCESS 0
2019-08-22 12:23:48,539 5dce22b2c51211e9b6b30a9901e9a631 BEGIN search/results - invemardoi ark:/99166/p9dz03470 invemar ark:/99166/p9jm23k00 -updateTime 0 10 owner invemardoi keywords doi.org/10.5072/FK2VQ3122Q
2019-08-22 12:23:50,488 5ef7a80cc51211e986c90a9901e9a631 BEGIN search/count - invemardoi ark:/99166/p9dz03470 invemar ark:/99166/p9jm23k00 owner invemardoi
2019-08-22 12:23:50,833 5dce22b2c51211e9b6b30a9901e9a631 END SUCCESS 0
2019-08-22 12:23:50,856 5ef7a80cc51211e986c90a9901e9a631 END SUCCESS 1699
2019-08-22 12:23:50,857 5f2fddeec51211e986c90a9901e9a631 BEGIN search/results - invemardoi ark:/99166/p9dz03470 invemar ark:/99166/p9jm23k00 -updateTime 0 10 owner invemardoi
2019-08-22 12:23:52,535 5f2fddeec51211e986c90a9901e9a631 END SUCCESS 10
2019-08-22 12:23:59,568 4b704bc2c51211e993c60a9901e9a631 END ERROR OperationalError: (2013, 'Lost connection to MySQL server during query')
2019-08-22 12:23:59,568 5b441bfac51211e9932d0a9901e9a631 END ERROR OperationalError: (2013, 'Lost connection to MySQL server during query')
2019-08-22 12:24:00,873 - ERROR register_async._daemonThread/binder OperationalError: (2003, "Can't connect to MySQL server on 'rds-ias-ezid-search-prd.cmcguhglinoa.us-west-2.rds.amazonaws.com' (111)")
2019-08-22 12:24:00,873 - ERROR register_async._daemonThread/datacite OperationalError: (2003, "Can't connect to MySQL server on 'rds-ias-ezid-search-prd.cmcguhglinoa.us-west-2.rds.amazonaws.com' (111)")
2019-08-22 12:24:00,873 - ERROR backproc._backprocDaemon OperationalError: (2003, "Can't connect to MySQL server on 'rds-ias-ezid-search-prd.cmcguhglinoa.us-west-2.rds.amazonaws.com' (111)")
2019-08-22 12:24:05,864 - ERROR download._daemonThread OperationalError: (2003, "Can't connect to MySQL server on 'rds-ias-ezid-search-prd.cmcguhglinoa.us-west-2.rds.amazonaws.com' (111)")
2019-08-22 12:24:05,910 - ERROR backproc._backprocDaemon OperationalError: (2003, "Can't connect to MySQL server on 'rds-ias-ezid-search-prd.cmcguhglinoa.us-west-2.rds.amazonaws.com' (111)")
2019-08-22 12:24:05,928 - ERROR register_async._daemonThread/binder OperationalError: (2003, "Can't connect to MySQL server on 'rds-ias-ezid-search-prd.cmcguhglinoa.us-west-2.rds.amazonaws.com' (111)")
2019-08-22 12:24:05,956 - ERROR register_async._daemonThread/datacite OperationalError: (2003, "Can't connect to MySQL server on 'rds-ias-ezid-search-prd.cmcguhglinoa.us-west-2.rds.amazonaws.com' (111)")
2019-08-22 12:24:10,941 - ERROR backproc._backprocDaemon OperationalError: (2003, "Can't connect to MySQL server on 'rds-ias-ezid-search-prd.cmcguhglinoa.us-west-2.rds.amazonaws.com' (111)")
2019-08-22 12:24:10,960 - ERROR register_async._daemonThread/binder OperationalError: (2003, "Can't connect to MySQL server on 'rds-ias-ezid-search-prd.cmcguhglinoa.us-west-2.rds.amazonaws.com' (111)")
2019-08-22 12:24:10,991 - ERROR register_async._daemonThread/datacite OperationalError: (2003, "Can't connect to MySQL server on 'rds-ias-ezid-search-prd.cmcguhglinoa.us-west-2.rds.amazonaws.com' (111)")
2019-08-22 12:24:15,917 - ERROR download._daemonThread OperationalError: (2003, "Can't connect to MySQL server on 'rds-ias-ezid-search-prd.cmcguhglinoa.us-west-2.rds.amazonaws.com' (111)")
2019-08-22 12:24:15,963 - ERROR backproc._backprocDaemon OperationalError: (2003, "Can't connect to MySQL server on 'rds-ias-ezid-search-prd.cmcguhglinoa.us-west-2.rds.amazonaws.com' (111)")
2019-08-22 12:24:15,994 - ERROR register_async._daemonThread/binder OperationalError: (2003, "Can't connect to MySQL server on 'rds-ias-ezid-search-prd.cmcguhglinoa.us-west-2.rds.amazonaws.com' (111)")
2019-08-22 12:24:16,026 - ERROR register_async._daemonThread/datacite OperationalError: (2003, "Can't connect to MySQL server on 'rds-ias-ezid-search-prd.cmcguhglinoa.us-west-2.rds.amazonaws.com' (111)")
2019-08-22 12:24:25,117 739bb096c51211e9934b0a9901e9a631 BEGIN mintIdentifier ark:/88435/dsp01 pudiglib ark:/99166/p9df6k49v pudiglib ark:/99166/p9j678z36
2019-08-22 12:24:25,118 739bb096c51211e9934b0a9901e9a631 END SUCCESS ark:/88435/dsp01x346d703k
2019-08-22 12:24:25,118 739bc1c6c51211e9934b0a9901e9a631 BEGIN createIdentifier ark:/88435/dsp01x346d703k pudiglib ark:/99166/p9df6k49v pudiglib ark:/99166/p9j678z36
2019-08-22 12:24:25,180 739bc1c6c51211e9934b0a9901e9a631 END SUCCESS
2019-08-22 12:24:25,349 73bf000ac51211e9934b0a9901e9a631 BEGIN setMetadata ark:/88435/dsp01x346d703k pudiglib ark:/99166/p9df6k49v pudiglib ark:/99166/p9j678z36 _target http:
2019-08-22 12:24:25,387 73bf000ac51211e9934b0a9901e9a631 END SUCCESS
EZID Link Checker logs
Useful links:
- AWS rds-mysql-server-gone-away
- dev.mysql.com : MySQL server has gone away
- dev.mysql.com : Can't connect to MySQL server
- dev.mysql.com : Lost connection to MySQL server