# hunting the web for forgotten mirrors/dumps of the mythical "Batvirus" mysql db

- me: @100ideas ([twitter](https://twitter.com/100ideas), [github](https://github.com/100ideas))
- this doc: https://hackmd.io/@100ideas/BkfDLbmtd
- [twitter list of D.R.A.S.T.I.C. contributors](https://twitter.com/i/lists/1394818088223219715)
- background on missing db
  - D.R.A.S.T.I.C. report: ["An Investigation into the WIV Databases that were Taken Offline"](https://drasticresearch.org/2021/02/19/an-investigation-into-the-wiv-databases-that-were-taken-offline/)

## goal: find dump of 61.5MB "batvirus.whiov.ac.cn" mysql db (or just fragments / csvs / partial records etc)

1) started by familiarizing myself w/ the Chinese bioinformatic sites & dataset indexes that are still public today (they have gotten a lot more elaborate than what is visible in the historic archives!)
2) also searched for orphaned files and ids in wayback machine, timetravel.mementoweb.org

Not feeling very lucky, but at least here are some notes on various Chinese bioinfo sites.

## want to find the content of this data repository - seems like it is an index of all the other ones?

results: all the links to it and other interesting indexes like pathogenic virus sequence repositories just time out for me. (located in north america)

## 2021-05-19 notes

**病毒遗传资源数据库** Virus Genetic Resources Database
- http://www.nsdata.cn/resource/serviceInfoDetail?sheetId=7453

The Virus Genetic Resources Database (genetic resources / clone library) collects basic information on cloned functional genes of important pathogens, including the biological background of the cloned genes, gene descriptions, encoded proteins, functions, and reference literature; it also establishes and maintains nucleic acid detection methods for these pathogens.
Wuhan Institute of Virology, Chinese Academy of Sciences 2020-11-19 18:31:17

#### also

National Genomics Data Center 2019 Novel Coronavirus Resource
- appears to be open access
- https://bigd.big.ac.cn/ncov/?lang=en
- Genome Warehouse sub-db
  - https://bigd.big.ac.cn/gwh/browse/virus/coronaviridae
  - appears to just mirror ncbi for betacoronavirus sequences (collected in china)

China National GeneBank DataBase: CNGBdb
- https://db.cngb.org/

China ScienceDataBank
- was able to create account w/o hassle
  - but note, phone number needs to look like this:
    - "0852-66247673" or "18999999999"
    - no spaces or hyphens
- https://www.scidb.cn/
- **we want the "batvirus" mysql db, doi:10.11922/sciencedb.768**
  - used to be at http://www.sciencedb.cn/dataSet/downloads/768
  - new scidb seems to have changed its url scheme - resources now have an opaque "dataSetId" separate from the number at the end of the doi.
  - example:
    - "Checklist of Vascular Plants in Shannan Prefecture":
      - doi: 10.11922/sciencedb.796
      - url: https://www.scidb.cn/en/detail?dataSetId=633694461330718721&language=en_US&dataSetType=journal&code=5c36e22c13f6b34064283d5e
      - download url ends up being: https://www.scidb.cn/api/sdb-statistics-service/download?dataSetId=633694461330718721&dataSetType=journal&size=
  - **so we need to know the new "dataSetId"** (doi suffix -> dataSetId => shared prefix + remainder)
    - 788 -> 633694461322330116 => 6336944613 22330116
    - 768 -> ?????????????????? => 6336944613 ????????
    - 796 -> 633694461330718721 => 6336944613 30718721

CovDB - Coronavirus Database (v3)
- http://covdb.popgenetics.net/v3/
- "CoVdb extensively collects published coronavirus data and have taken in genomes of 5709 strains after the update in May 22, 2020. The strains were collected from 32 organisms and in the years from 1941 to present, 2020 (Figure 1)."
- http://www.chinacdc.cn/jkzt/crb/zl/szkb_11803/

a selection of databases listed at nsdata.cn (re wiv cov)
--------------------------------------------------------

nsdata.cn: created an account... but still not sure how/if to access dbs
- these all time out - probably need to tunnel into china via a china vpn, or perhaps even need an "official" account
  - http://www.viruses.nsdc.cn/vri.jsp
  - http://www.viruses.nsdc.cn/byyczyklk.jsp
  - http://www.viruses.nsdc.cn/chinavpi/
  - http://mrs.im.ac.cn/info.do?db=refseq_release
  - http://srs.im.ac.cn/srs71bin/cgi-bin/wgetz?-page+LibInfo+-lib+EPDSEQ
  - http://www.emgene.csdb.cn/~showEntity[cn.csdb.emgene.genome.tbGenomeSpecies].vpage

historical site archives for reference
- sciencedb.cn https://web.archive.org/web/20200517063406/http://www.sciencedb.cn/dataSet/handle/768
- archived snapshot of old viruses.nsdc.cn db https://web.archive.org/web/20200317030410/http://www.viruses.nsdc.cn/main.jsp
- sitelist of old msis virus site: https://web.archive.org/web/*/msis.nsdc.cn/*
- cool diagrams/catalog of viruses https://web.archive.org/web/20200125220716/http://www.mgc.ac.cn/cgi-bin/DBatVir/main.cgi
- Virus Genetic Resources Database https://web.archive.org/web/20200317030410/http://www.viruses.nsdc.cn/byyczyklk.jsp
- (2017) "Chinese Professional Database of Viral Pathogen Investigation" https://archive.vn/Rktr7 (http://www.viruses.nsdc.cn/chinavpi/)
  - "The "China Viral Pathogen Investigation Professional Database" is based on the "Viral Pathogen Investigation Project of Important Natural Hosts and Vector Insects in China""
  - https://web.archive.org/web/20190425020455/http://www.viruses.nsdc.cn/chinavpi/

non-exhaustive list of china virus databases listed on nsdata.cn
------------------------------------------------------------------

**动物疫病病原综合应用数据库** "Characteristic Database of Viral Pathogens Carried by Wild Animals"
- http://www.nsdata.cn/resource/list?code=1803710

and

**中国病毒性病原调查专业数据库** "Database of Viral Pathogen Investigation in China"
- https://web.archive.org/web/20190425020455/http://www.viruses.nsdc.cn/chinavpi/

**中国微生物与病毒主题数据库** China Microbial and Virus Subject Database
- http://www.micro.csdb.cn

**野生动物携带病毒病原数据库** Wild Animals Carrying Virus Pathogen Database

2020-11-19 18:31:17

**病毒性病原本底数据库** Viral Disease Original Database

Records detection and analysis data for viral pathogens carried by the collected samples, covering the viral diseases carried by different species in representative natural focus areas (Xinjiang, Qinghai, Hubei, Yunnan).

**高通量病原检测数据库** High-throughput pathogen detection database
- http://rsr.csdb.cn/general/toDataDetail?rsDataId=5ed85897aa664cc6600ed7a4
- http://www.viruses.nsdc.cn/chinavpi/detect.jsp

Through bioinformatics analysis, establish Pan-PCR and multiple-primer combined PCR experimental techniques for viruses carried by different species (bats, birds, mice, ticks, mosquitoes); establish...

Wuhan Institute of Virology, Chinese Academy of Sciences

**病毒资源数据库** Virus Resource Database
- http://rsr.csdb.cn/general/toDataDetail?rsDataId=5ed85898aa664cc6600ed806
- http://www.viruses.nsdc.cn/vri.jsp

The virus resource database integrates databases with independent intellectual property rights, including virus preservation databases, virus species databases, type specimen databases, and virus species databases, and establishes a large-scale "Chinese virus resource database" whose virus resources cover various virus databases, including human medical virus database, animal virus database, zoonotic virus database, wild animal virus database, natural foci virus database, emerging infectious disease pathogen database, insect virus database, plant virus database, bacterial virus...

**病毒性病原遗传资源数据库** Viral pathogen genetic resources database
- http://rsr.csdb.cn/general/toDataDetail?rsDataId=5ed85897aa664cc6600ed7a8
- http://www.viruses.nsdc.cn/chinavpi/giv.jsp

Records isolation and identification data for viral pathogens in different animal species, along with information on new pathogen isolates, and establishes a database of viral genetic resources from bats, mosquitoes, ticks, rodents, and birds in China, providing indispensable source material for major research such as vaccine and drug development.

Wuhan Institute of Virology, Chinese Academy of Sciences

*HIT* **重要病毒性病原检测专业数据库** Professional database of important viral pathogen detection
- http://rsr.csdb.cn/general/toDataDetail?rsDataId=5ed85875aa664cc6600ecd0a
- http://www.virus.csdb.cn/

Building on the Institute's key research fields, this is a professional database for the detection of important viral pathogens, including human respiratory viruses (10-16 common and more common types)...

**病毒性病原历史疫情数据库** Historical Epidemic Database of Viral Pathogens

Records a series of historical epidemic data, including outbreaks of key pathogens. The data should include the start and end time of each outbreak, the spread of the epidemic, the number of infections, and the number of deaths (rate), and should restore as much as possible of the habitat data of the foci at the time of the outbreak. Secondary data processing is then carried out to organize the historical epidemic database by region, host, pathogen, and habitat.

**病毒性病原调查基础数据库** Basic Database of Viral Pathogen Investigation
- http://www.nsdata.cn/resource/serviceInfoDetail?sheetId=7391

Records the project's annual survey and sample collection data for the main animals and vector insects in Xinjiang, Qinghai, Hubei and Yunnan (including: habitat data, collection methods, pretreatment methods, collection implementation process, pretreatment implementation process).

Wuhan Institute of Virology, Chinese Academy of Sciences

---

**病毒编目数据库** Virus cataloging database

The virus cataloging database mainly contains the basic information of more than 7,800 virus strains, including each virus's English name, Chinese name, classification, host, collection time, location, source, quantity, original literature, and biosafety level, as well as its physicochemical characteristics, genome information, and classification information, to maximize the background information on each species; the database provides a convenient, friendly search interface that can search by virus classification, such as family name, species...

Wuhan Institute of Virology, Chinese Academy of Sciences 2020-11-19 18:31:17

**病毒敏感细胞数据库** Virus Sensitive Cell Database

The virus-sensitive cell bank mainly collects and preserves virus-sensitive human and animal cell lines (strains) in China; researches and develops new cell culture technologies and new technologies for the preservation and quality control of cell lines (strains); and provides standardized cell lines (strains) and related services nationwide for research in virology and biotechnology in China.
Wuhan Institute of Virology, Chinese Academy of Sciences 2020-11-19 18:31:17

http://www.nsdata.cn/resource/serviceInfoDetail?sheetId=7389

**中国病毒性病原调查专业库** China Viral Pathogen Investigation Professional Database

The project is based on national strategic biological resources and life science research needs. Building on the collection and preservation of virus resources in China and on information about virus biological characteristics, it aims to establish a complete virus data collection, virus resource preservation and genetic resource database, and to build a virus resource information sharing platform. It unifies standard data specifications, integrates existing virus resource preservation databases, and establishes a standardized Chinese virus species database, virus cataloging database, virus resource preservation...

Wuhan Institute of Virology, Chinese Academy of Sciences 2020-11-19 18:31:17

**病毒核酸检测分析数据库** Viral nucleic acid detection and analysis database
- http://www.nsdata.cn/resource/serviceInfoDetail?sheetId=6397

Based on the pathogenic nucleic acid analysis maps of common respiratory viruses and digestive tract viruses, genomic mutation data within individuals, and standard pathogen database data, a pathogen polymorphism database is established to provide systematic reference data for pathogen detection and diagnosis.
Wuhan Institute of Virology, Chinese Academy of Sciences 2020-11-19 18:31:17

**动物疫病病原综合应用数据库** Comprehensive Application Database of Animal Disease Pathogens
- http://www.nsdata.cn/resource/serviceInfoDetail?sheetId=7451

The comprehensive application database of animal disease pathogens integrates animal disease pathogen data and related virus strain information, and provides independent research epidemiological survey data, including: animal disease pathogen isolate information, biological characteristics, antigenic phenotype, pathogenicity, genomics, genetic stability and molecular evolution characteristics, epidemiological analysis and geographic display systems, etc.; can intuitively realize the graphical display of geographic information in biological information systems such as molecular epidemiological data, epidemic dynamics, etc.; can integrate...

Wuhan Institute of Virology, Chinese Academy of Sciences 2020-11-19 18:31:17

---

**自然疫原性病毒综合应用数据库** Comprehensive Application Database of Natural Epidemic Viruses
- http://www.nsdata.cn/resource/serviceInfoDetail?sheetId=7455

The comprehensive application database of natural foci viruses integrates natural foci virus data and related virus strain information, and provides independent research epidemiological survey data, including: natural foci virus isolate information, biological characteristics, antigenic phenotypes, pathogenicity, genomics, genetic stability and molecular evolution characteristics, epidemiological analysis and geographic display systems, etc.; it can intuitively realize the graphical display of geographic information in biological information systems such as molecular epidemiological data and epidemic dynamics;...
Wuhan Institute of Virology, Chinese Academy of Sciences 2020-11-19 18:31:17

**特异病毒快速检测诊断数据库** Specific virus rapid detection and diagnosis database

Analyze the comprehensive information of specific infectious disease pathogens, design specific amplification primers and detection probes, and establish a database of specific pathogen detection probes. Complete relevant evaluations such as effectiveness and save the evaluation data into the database to provide data support for further data mining.

Wuhan Institute of Virology, Chinese Academy of Sciences 2020-11-19 18:31:17

**流感病毒基因组数据库** Influenza Virus Genome Database

The influenza virus genome database contains the entire genome data of influenza viruses, including the original sequence, CDS sequence and protein sequence, etc., for the use of researchers.

Beijing Institute of Genomics, Chinese Academy of Sciences 2020-11-19 18:31:17

**病毒血清学检测分析数据库** Virus serological detection and analysis database

All viral infections can cause the body's humoral immune response, which can often be used for diagnostic purposes. The humoral response caused by different viruses is very different. Some virus antibodies are neutralizing antibodies. The detection of these antibodies can help diagnose viral infections.
Wuhan Institute of Virology, Chinese Academy of Sciences 2020-11-19 18:31:17

**昆虫病毒综合应用数据库** Insect Virus Comprehensive Application Database

The Insect Virus Comprehensive Application Database integrates insect virus data and related strain information, and provides independent research epidemiological survey data, including: natural foci virus isolate information, biological characteristics, antigenic phenotype, pathogenicity, genomics, genetic stability and molecular evolution characteristics, epidemiological analysis and geographic display systems, etc.; can intuitively realize the graphical display of geographic information of biological information systems such as molecular epidemiological data, epidemic dynamics, etc.; can integrate common scores...

Wuhan Institute of Virology, Chinese Academy of Sciences 2020-11-19 18:31:17

**中国病毒资源基础数据库** China Virus Resource Basic Database

The China Virus Resource Database mainly collects the preservation information of various live virus strains held by the virology community in China, including important scientific information such as background information, genetic information, epidemic characteristics and exchanges, in order to enhance the scientific value of virus strain resources, promote the continuous accumulation and integration of research data in virology, and play an important supporting role in virological research. Aiming at the key research areas of virology, based on independent research data and standardized mass data, relying on different analysis modules,...
Wuhan Institute of Virology, Chinese Academy of Sciences 2020-11-19 18:31:17

**鸟类样线调查数据子库** Sub-database of bird transect survey data

Due to the special natural ecological environment in the Qinghai Lake basin, it has become an important habitat for wild animals on the Qinghai-Tibet Plateau. The resources of wild animals and plants are extremely rich. Among them, there are 35 species of nationally protected first- and second-class animals, accounting for 32.3% of the national total. In recent years, Tibetan antelopes, wild yaks, eagles and other animals have died in large numbers after repeated poisonings caused by hunting or grassland rodent control. The continuous land and shrinkage of the Bird Island have also caused a large number of migratory birds to migrate. This sub-database records the bird transect patrol surveys carried out since 2006...

Computer Network Information Center, Chinese Academy of Sciences 2019-11-27 02:51:56

**鸟类环湖调查数据子库** Sub-database of bird around-the-lake survey data

Due to the special natural ecological environment in the Qinghai Lake basin, it has become an important habitat for wild animals on the Qinghai-Tibet Plateau. The resources of wild animals and plants are extremely rich. Among them, there are 35 species of nationally protected first- and second-class animals, accounting for 32.3% of the national total. In recent years, Tibetan antelopes, wild yaks, eagles and other animals have died in large numbers after repeated poisonings caused by hunting or grassland rodent control. The continuous land and shrinkage of the Bird Island have also caused a large number of migratory birds to migrate. This sub-database is based on the results of the bird patrol surveys carried out since 2006...

Computer Network Information Center, Chinese Academy of Sciences

---

**Virus cell culture technical standard (SOP) database**

Use mammalian cell lines to isolate and culture samples of enteroviruses, respiratory viruses and other infectious diseases, and use nucleic acid molecular detection, immunoserology and other identification methods to establish a new, rapid and efficient virus isolation, culture and identification system, and to build a database of new virus culture technologies and standards.

Wuhan Institute of Virology, Chinese Academy of Sciences 2020-11-19 18:31:17

**Przewalski's gazelle around-the-lake survey data sub-database**

Due to the special natural ecological environment in the Qinghai Lake basin, it has become an important habitat for wild animals on the Qinghai-Tibet Plateau. The resources of wild animals and plants are extremely rich. Among them, there are 35 species of nationally protected first- and second-class animals, accounting for 32.3% of the national total. Przewalski's gazelle is a rare species unique to China. The number is less than 300, even rarer than giant pandas. It only lives in the Qinghai Lake basin. In recent years, the number of Przewalski's gazelle has become less and less. This sub-database describes the results of the Przewalski's gazelle patrol surveys carried out since 2006...
Computer Network Information Center, Chinese Academy of Sciences 2019-11-27 02:51:56

**Przewalski's gazelle transect survey data sub-database**

Due to the special natural ecological environment in the Qinghai Lake basin, it has become an important habitat for wild animals on the Qinghai-Tibet Plateau. The resources of wild animals and plants are extremely rich. Among them, there are 35 species of nationally protected first- and second-class animals, accounting for 32.3% of the national total. Przewalski's gazelle is a rare species unique to China. The number is less than 300, even rarer than giant pandas. It only lives in the Qinghai Lake basin. In recent years, the number of Przewalski's gazelle has become less and less. This sub-database describes the Przewalski's gazelle transect patrol surveys carried out around the lake since 2006...

Computer Network Information Center, Chinese Academy of Sciences 2020-11-19 18:31:17

**Influenza Comprehensive Application Database**

The Influenza Comprehensive Application Database integrates influenza virus data and related strain information, and provides independent research epidemiological survey data, including: influenza virus isolate information, biological characteristics, antigenic phenotype, pathogenicity, genomics, genetic stability and molecular evolution characteristics, epidemiological analysis and geographic display systems, etc.; it can dynamically realize the graphical display of geographic information in the biological information system of influenza virus, including: pathogens, flow information, time and space information, transmission...

Wuhan Institute of Virology, Chinese Academy of Sciences 2020-11-19 18:31:17

**Animal name database**

This database collects the Latin names, Chinese names and English names of 30,000 species of animals (including fish, amphibians, reptiles, birds and mammals) in the world, and can display the upper taxa of the species in question.
Institute of Zoology, Chinese Academy of Sciences 2020-11-19 18:31:17

**Animal Habitat Database**

The animal habitat database mainly covers important animal habitat information in key research areas in the western region. Based on the analysis of surface vegetation index, surface radiation temperature and remote sensing landscape information, it provides analytic data on animal habitat background, migration background and dynamic changes in key research areas. It also includes information on the habitat of animals in the study area, the surface vegetation index of the migration area, the surface radiation temperature, and the photosynthetically active radiation information.

Institute of Remote Sensing Applications, Chinese Academy of Sciences 2020-11-19 18:31:17

mailto drasticgroup@protonmail.com
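
#### appendix: url helpers (sketch)

Going back to the scidb.cn url scheme noted under China ScienceDataBank and the wayback machine searches from the goal section: a rough Python sketch of both lookups, building urls only and fetching nothing. The scidb download pattern is copied from the one dataset (796) observed in these notes, so it is an assumption that other datasets follow it. The function names, the `KNOWN_IDS` table, and the choice of the Wayback Machine CDX endpoint with these query parameters are illustrative choices of mine, not something the sites document here.

```python
"""URL helpers for the lookups in these notes (sketch; nothing is fetched)."""
from os.path import commonprefix
from urllib.parse import urlencode

SCIDB_BASE = "https://www.scidb.cn"
# assumption: the public Wayback Machine CDX search endpoint
WAYBACK_CDX = "https://web.archive.org/cdx/search/cdx"

# doi suffix -> dataSetId pairs recorded in the notes; 768 is still unknown
KNOWN_IDS = {
    "788": "633694461322330116",
    "796": "633694461330718721",
}


def scidb_download_url(dataset_id, dataset_type="journal"):
    """Rebuild the download url pattern observed for dataset 796."""
    query = urlencode(
        {"dataSetId": dataset_id, "dataSetType": dataset_type, "size": ""}
    )
    return f"{SCIDB_BASE}/api/sdb-statistics-service/download?{query}"


def shared_id_prefix(ids):
    """Longest leading run of characters common to all given dataSetIds."""
    return commonprefix(list(ids))


def wayback_cdx_query(url_prefix, limit=50):
    """Build a CDX query url listing archived captures under url_prefix."""
    query = urlencode(
        {"url": url_prefix, "matchType": "prefix", "output": "json", "limit": limit}
    )
    return f"{WAYBACK_CDX}?{query}"


if __name__ == "__main__":
    print(shared_id_prefix(KNOWN_IDS.values()))
    print(scidb_download_url(KNOWN_IDS["796"]))
    print(wayback_cdx_query("www.sciencedb.cn/dataSet/"))
```

Two known ids are thin evidence: the shared prefix only narrows where the dataSetId for 768 might sit, it does not determine it. The CDX query is a way to list whatever captures exist under the old `sciencedb.cn/dataSet/` paths before opening them one by one.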