
# Bot activity detected

## Possible ways to detect web bots via libraries

1. JS and cookie challenge - if a client cannot pass it, it is suspicious -- not always reliable
2. https://github.com/RoBYCoNTe/js-bot-detector -- does not work
3. https://www.npmjs.com/package/isbot-fast -- does not work
4. https://www.npmjs.com/package/express-device -- not good
5. https://www.npmjs.com/package/es6-crawler-detect -- does not work

## User agent from browser

We can check the browser's user agent. We found this [function](https://www.coditty.com/code/how-to-detect-search-crawlers-using-javascript); it can filter out part of the bot traffic.

```javascript=
// Returns true when the browser's user agent matches a known crawler signature.
function botCheck() {
    // Alternation of known bot/crawler user-agent fragments (matched case-insensitively).
    var botPattern = "(googlebot\/|Googlebot-Mobile|Googlebot-Image|Google favicon|Mediapartners-Google|bingbot|slurp|java|wget|curl|Commons-HttpClient|Python-urllib|libwww|httpunit|nutch|phpcrawl|msnbot|jyxobot|FAST-WebCrawler|FAST Enterprise Crawler|biglotron|teoma|convera|seekbot|gigablast|exabot|ngbot|ia_archiver|GingerCrawler|webmon |httrack|webcrawler|grub.org|UsineNouvelleCrawler|antibot|netresearchserver|speedy|fluffy|bibnum.bnf|findlink|msrbot|panscient|yacybot|AISearchBot|IOI|ips-agent|tagoobot|MJ12bot|dotbot|woriobot|yanga|buzzbot|mlbot|yandexbot|purebot|Linguee Bot|Voyager|CyberPatrol|voilabot|baiduspider|citeseerxbot|spbot|twengabot|postrank|turnitinbot|scribdbot|page2rss|sitebot|linkdex|Adidxbot|blekkobot|ezooms|dotbot|Mail.RU_Bot|discobot|heritrix|findthatfile|europarchive.org|NerdByNature.Bot|sistrix crawler|ahrefsbot|Aboundex|domaincrawler|wbsearchbot|summify|ccbot|edisterbot|seznambot|ec2linkfinder|gslfbot|aihitbot|intelium_bot|facebookexternalhit|yeti|RetrevoPageAnalyzer|lb-spider|sogou|lssbot|careerbot|wotbox|wocbot|ichiro|DuckDuckBot|lssrocketcrawler|drupact|webcompanycrawler|acoonbot|openindexspider|gnam gnam spider|web-archive-net.com.bot|backlinkcrawler|coccoc|integromedb|content crawler spider|toplistbot|seokicks-robot|it2media-domain-crawler|ip-web-crawler.com|siteexplorer.info|elisabot|proximic|changedetection|blexbot|arabot|WeSEE:Search|niki-bot|CrystalSemanticsBot|rogerbot|360Spider|psbot|InterfaxScanBot|Lipperhey SEO Service|CC Metadata Scaper|g00g1e.net|GrapeshotCrawler|urlappendbot|brainobot|fr-crawler|binlar|SimpleCrawler|Livelapbot|Twitterbot|cXensebot|smtbot|bnf.fr_bot|A6-Indexer|ADmantX|Facebot|Twitterbot|OrangeBot|memorybot|AdvBot|MegaIndex|SemanticScholarBot|ltx71|nerdybot|xovibot|BUbiNG|Qwantify|archive.org_bot|Applebot|TweetmemeBot|crawler4j|findxbot|SemrushBot|yoozBot|lipperhey|y!j-asr|Domain Re-Animator Bot|AddThis)";
    var re = new RegExp(botPattern, 'i');
    var userAgent = navigator.userAgent;
    // RegExp.test already returns a boolean, so no if/else is needed.
    return re.test(userAgent);
}
```

If this function returns `true`, we do not send the get-settings request.

## Algorithm

![](https://i.imgur.com/g2M1n7d.png)

After analyzing user profiles on the site, including rakeLiveChatBotId: 789, we described this algorithm.

Bot features:

* `isOnline` field is null
* empty `activity` array

The absence of these values indicates that the site was open for only a very short time.

> **The isOnline field** is filled only after a successful connection with STOMP
>
> The first element of the **activity array** appears a few ms after the basic settings have been obtained.

We researched best practices for filtering bot, spider, and search-engine activity, but no functionality or rule set has been created yet that handles this 100% of the time. Google Analytics has this [functionality](https://www.searchenginewatch.com/2014/07/31/google-analytics-helps-you-filter-spider-and-bot-traffic/), but it relies only on a database of known bots. We also checked whether we could use this filter separately, but Google does not provide a separate API for it.
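
The profile heuristic from the Algorithm section (a null `isOnline` field plus an empty `activity` array) could be sketched roughly as follows. This is only an illustration: the function name `looksLikeBot` and the profile shape `{ isOnline, activity }` are assumptions based on the field names above, and requiring both signals at once is also an assumption, not something confirmed by the diagram.

```javascript
// Hypothetical helper: the profile shape ({ isOnline, activity }) is assumed
// from the field names in the Algorithm section, not from the real schema.
function looksLikeBot(profile) {
    // isOnline stays null (or missing) when no STOMP connection ever succeeded.
    var neverConnected = profile.isOnline == null;
    // activity stays empty (or missing) when the visit ended before the first entry.
    var noActivity = !Array.isArray(profile.activity) || profile.activity.length === 0;
    // Assumption: both signals must hold before a profile is flagged.
    return neverConnected && noActivity;
}
```

For example, `looksLikeBot({ isOnline: null, activity: [] })` flags the profile, while a profile with `isOnline: false` or at least one activity entry is not flagged. Requiring both signals keeps the check conservative, at the cost of missing bots that stay long enough to connect.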