### Secret Google Algorithm Exposed
Has part of the secret Google search algorithm just been exposed? In a shocking turn of events, Google accidentally published over 2,500 pages of internal Search API documentation, and it indicates that the company has been lying to us about key aspects of how its search engine operates. This revelation could mark one of the most pivotal moments in SEO history, potentially transforming our entire approach to search optimisation. Welcome to this week's marketing news roundup. I'm Mark Webster, co-founder of Authority Hacker and strategic advisor at Marketing Pros. This video is brought to you by Search Intelligence. More about them later. But for now, let's dive into exactly what these leaked documents reveal about the inner workings of Google's algorithm.
### Google’s Internal Documentation
In a groundbreaking revelation, Rand Fishkin from SparkToro has publicised internal documentation from Google Search's Content Warehouse API. These documents reveal a treasure trove of information about the data Google collects from its users and its various products. The documents were published to a public GitHub code repository and captured by an external automated documentation service. According to Danny Goodwin from Search Engine Journal, this was an accident, not an intentional leak. The mistake was corrected on May 7th, but the automated documentation remains live. The documents were apparently discovered by Erfan Azimi, CEO of EA Eagle Digital, and Fishkin confirmed this in a comment on Azimi's video. To ensure authenticity, Fishkin reached out to several ex-Googlers, who confirmed the documents appear legitimate. This isn't a case of outdated information coming to light: with a March 27th publish date, these documents appear very recent, potentially even reflecting the current algorithm. Fishkin enlisted Mike King from iPullRank, a highly skilled technical SEO, to analyse the technical aspects of the documentation. King released his own findings and confirmed the leak contains an extraordinary amount of previously unconfirmed information about Google's inner workings.
### Click Data and Ranking
We'll dive into all these findings in a second, but I want to start by tempering expectations. Fishkin makes it clear that the features mentioned are not conclusive proof of a Google ranking factor. The leak provides valuable insights, but it's not a complete blueprint of Google's algorithm. There's no mention of how these potential factors are weighted within the algorithm, and some of the elements may be experimental or no longer in use. However, the documents distinctly label features that have been deprecated. This strongly suggests that everything without a deprecated label is currently active in the algorithm. Much of what's included also corroborates the testimonies of Google executives, adding credibility. Let's start with the most shocking revelations. Google has lied to us about quite a few things. First is their use of click data. Google analyst Gary Illyes has said clicks are not directly used in rankings. But in reality, the Navboost ranking system, which is mentioned 84 times in the documentation, has a specific module for click signals. It gets pretty dense, but the gist is that Google uses clicks and post-click behaviour as part of its ranking algorithms. They track good clicks and bad clicks, and even record the date of the last good click on a document. According to King, this could mean that content decay, or traffic loss over time, is partially caused by a page not driving the expected amount of good clicks for its position in the SERPs. In other words, if users engage less deeply over time, or pogo-stick back to the search results and click another result, your rankings will likely fall. This system makes Google mimic social networks, where engagement is the number one metric deciding how widely a piece gets distributed. Rand Fishkin must feel particularly good about this one, as Google's Gary Illyes once called his theory that Google uses click data "made up crap".
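To make that idea concrete, here's a minimal, purely hypothetical sketch of how good clicks, bad clicks, and the date of the last good click could combine into a single signal. The leak names those signals, but not how they're weighted or combined, so the formula, field names, and decay half-life below are all invented for illustration.

```python
from dataclasses import dataclass
from datetime import date

# Purely hypothetical sketch -- the leak names signals like good clicks,
# bad clicks, and the last good click, but not how they are combined.
@dataclass
class ClickSignals:
    good_clicks: int       # clicks followed by satisfied engagement
    bad_clicks: int        # clicks followed by a quick return to the SERP
    last_good_click: date  # most recent "good" click on the document

def click_quality(signals: ClickSignals, today: date, half_life_days: int = 90) -> float:
    """Toy score: reward good clicks, penalise pogo-sticking, decay with staleness."""
    total = signals.good_clicks + signals.bad_clicks
    if total == 0:
        return 0.0
    satisfaction = signals.good_clicks / total
    days_since_good = (today - signals.last_good_click).days
    freshness = 0.5 ** (days_since_good / half_life_days)  # exponential decay
    return satisfaction * freshness

# A page that hasn't earned a good click in months scores lower than one
# with the same click mix but a recent good click.
print(click_quality(ClickSignals(80, 20, date(2024, 1, 5)), date(2024, 6, 1)))
print(click_quality(ClickSignals(80, 20, date(2024, 5, 25)), date(2024, 6, 1)))
```

The takeaway is the shape, not the numbers: a page that stops earning good clicks for its position decays even if its historic click mix was healthy.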
### Domain Authority and Chrome Data
Google also claimed they don't care about domain authority. However, the documents reveal Google has a feature called site authority. We don't know how this is used, but it suggests that the concept of website authority exists and is used in the Google algorithm. Shocker, I know. Google search advocate John Mueller also claimed that they don't use Chrome data for ranking. However, one module related to page quality scores features a site-level measure of views from Chrome, and another module related to generating sitelinks has a Chrome-related attribute as well. Google has also denied the existence of a sandbox for new websites that limits their ability to rank until they reach a certain age. But the documentation indicates Google has an attribute called hostAge used to limit new sites. This means that newer sites may be held back in rankings until they're deemed trustworthy. Google's apparent lies aside, let's move on to some actionable insights for business owners from this leak.
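For context, here's a rough sketch of what a site-level record built around those leaked attribute names might look like. The names siteAuthority and hostAge come from the documentation; the field types, scales, Chrome view counts, and the age threshold are assumptions made purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative only: siteAuthority and hostAge are attribute names reported in
# the leak; the field types, scales, and how Google combines them are unknown.
@dataclass
class SiteRecord:
    domain: str
    site_authority: float             # leak's "siteAuthority" -- scale unknown
    chrome_site_views: Optional[int]  # site-level view counts sourced from Chrome
    host_age_days: int                # leak's "hostAge" -- tied to limiting new hosts

def is_probably_sandboxed(site: SiteRecord, min_age_days: int = 180) -> bool:
    """Toy heuristic: treat very new hosts as rank-limited until they age."""
    return site.host_age_days < min_age_days

print(is_probably_sandboxed(SiteRecord("example.com", 42.0, 120_000, 45)))   # True
print(is_probably_sandboxed(SiteRecord("example.org", 61.0, 900_000, 900)))  # False
```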
### Actionable Insights for Business Owners
One of the most interesting insights comes from Erfan Azimi, the apparent source of the information. He says that the Navboost ranking system is essentially a personalised algorithm on steroids. Azimi claims that users' actions influence the rankings other users in their location and demographic see. Fishkin expands on this with an example: if many people in Seattle search for Lehman Brothers and scroll until they find the Lehman Brothers theatre website, Google will quickly realise that's what people in Seattle want when they type the query, and searchers there will stop seeing the Wikipedia link to the failed financial firm. He continues by saying that if you can create demand for your website among enough likely searchers in the regions you're targeting, you may be able to ignore the classic SEO signals, like links and optimised content, and still rank. There are many other findings, too. For example, Google explicitly stores the author of a page and whether an entity on that page is also the author. Google cares who your authors are, which validates an important element of EEAT. There's also clear evidence of algorithmic demotions for a variety of reasons. These include an anchor mismatch demotion, applied when a link's anchor text doesn't appear relevant to the page it links to; a SERP demotion, which likely has to do with user dissatisfaction as measured by clicks; a nav demotion, which likely applies to pages with bad navigation or user experience; an exact match domain demotion, which means exact match domains don't get as much value; and a location demotion, which suggests Google attempts to associate pages with a location and rank them accordingly. Page titles are still crucial. The title match score variable indicates that how well your page title matches the query is a valuable ranking factor. Content freshness is also important, as is ensuring your publish and last-updated dates are consistent across structured data, page titles, XML sitemaps, and anywhere else dates are listed. There's also a series of metrics dedicated to fighting link spam. Basically, Google can detect spikes in spammy anchor text and negate negative SEO attacks. It's comforting to know that, especially considering the disavow tool may not be around much longer: King couldn't find any mention of disavow data, and John Mueller has hinted that Google may be removing the disavow tool soon. Google also tracks domain registration dates, likely to inform the sandboxing of new content and of domains that have changed ownership. This has likely become more relevant with the expired domain abuse spam policy. Interestingly, Google has a label for small personal websites. While the definition is unclear, this could be used to either boost or demote such sites. So who wants to bet HouseFresh has this label? Finally, Google tracks the average weighted font size of terms and anchor text in documents. This is likely because many people don't use heading formatting properly, so Google uses font size to help determine the intended headings on a page.
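To illustrate how those stacked demotions might work in practice, here's a toy sketch. The leak names the demotion flags but says nothing about their weights, so every multiplier below is invented for the example.

```python
# Hypothetical sketch: the leak names demotion flags (anchor mismatch, SERP,
# nav, exact-match domain, location) but not their weights. The multipliers
# below are made up purely to illustrate the idea of stacking demotions.
DEMOTION_MULTIPLIERS = {
    "anchor_mismatch": 0.85,     # link anchors look irrelevant to the target page
    "serp_demotion": 0.80,       # user dissatisfaction measured via clicks
    "nav_demotion": 0.90,        # poor navigation / user experience
    "exact_match_domain": 0.95,  # EMDs get less value than they used to
    "location_demotion": 0.90,   # page demoted for a mismatched locale
}

def apply_demotions(base_score: float, flags: set[str]) -> float:
    """Multiply the base score by each triggered demotion's factor."""
    score = base_score
    for flag in flags:
        score *= DEMOTION_MULTIPLIERS.get(flag, 1.0)
    return score

print(apply_demotions(1.0, {"serp_demotion", "exact_match_domain"}))  # ~0.76
```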
### Importance of Building a Brand
We've covered some of the biggest findings, but we couldn't fit everything worth learning into this video. SEOs around the world will be analysing this release for months to come, and if there are any more important revelations, we'll report them here. So take a moment to make sure you're subscribed so you don't miss out. Arguably, the most important takeaway is that building a brand matters more than anything else if you want to do well in search these days. Fishkin says, "If there's one piece of universal advice I had for marketers seeking to broadly improve their organic search rankings and traffic, it would be 'build a notable, popular, well-recognized brand in your space outside of Google search'." This means that search marketing has flipped 180 degrees. Up until a few years ago, SEO was a discovery engine you could use to build up other channels. But now people find you on other channels, prefer your content in search, and give you a boost in search as a result. Basically, building a brand outside of Google eventually pays you back in organic Google traffic. This would explain the mysterious case of CJ Eats, a DR 28 site that recently shot up to 1 million estimated monthly visits, according to Ahrefs, during a period when many similar recipe sites tanked heavily in recent rounds of updates. He's probably doing so well because he's focused on external marketing. His TikTok has 1.1 million followers, and his Instagram is close behind at 1 million. Many searchers recognise his brand when they search for recipes, and they click his content over others they don't recognise. They also spend more time on the page because they trust him more. This means higher click-through rates and dwell times, and Google ranks the site higher than his competitors. We discussed this case and brand building in more detail on our latest podcast; the link is in the description. To round up the story, Fishkin sums up the situation nicely: for most small and medium businesses and newer creators or publishers, SEO is likely to show poor returns until you've established credibility, navigational demand, and a strong reputation among a sizable audience. Oh, and we forgot one potential ranking factor. King suggests that the indexing tier of a page impacts its link value. Think of it like Google organising its index into three levels: the most important and frequently updated content is stored in flash memory, which is the fastest way to retrieve information; less important content is kept on solid state drives; and content that isn't updated often is stored on regular hard drives. This means that the higher a page's tier, the more valuable a link from it is. King claims that fresh content is seen as high quality, so you want links from pages that are either fresh or in the top tier. This is why links from higher-ranked pages and news pages improve your rankings more effectively than, say, links added to older content. This confirms digital PR is one of the best ways to obtain high-quality links at the moment.
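Here's a small sketch of King's tier interpretation, with made-up weights, just to show the shape of the idea: the same link passes different value depending on which storage tier its source page sits in. The tier names come from the description above; the numbers are assumptions.

```python
# Sketch of King's interpretation, with invented weights: links from pages in
# the fastest ("flash") index tier pass more value than links from pages on
# the SSD or HDD tiers, where staler, less important content lives.
TIER_WEIGHTS = {"flash": 1.0, "ssd": 0.6, "hdd": 0.3}

def link_value(base_value: float, source_tier: str) -> float:
    """Weight a link's value by the index tier of the page it comes from."""
    return base_value * TIER_WEIGHTS.get(source_tier, 0.3)

# A fresh news page (flash tier) passes more value than an old archive page.
print(link_value(10.0, "flash"))  # 10.0
print(link_value(10.0, "hdd"))    # 3.0
```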
And speaking of digital PR, here's a quick word from our sponsor, Search Intelligence. They're a digital PR agency who recently capitalised on Coldplay frontman Chris Martin's admission that he only eats one meal per day. They contacted journalists to provide expert commentary from their client, Bulk.com. Their data-driven approach looked at trends on both Google and TikTok to find insights that fitness and nutrition experts were sure to love. Sure enough, they landed 40 links from super high-quality websites, including Healthline and Huffington Post. If you're looking for a digital PR campaign to supercharge your link building, head on over to search-intelligence.com or reintelligence.co.uk.
### AI-Generated Overviews
The rollout of AI-generated overviews in US search results is taking a disastrous turn. Mainstream media outlets like the New York Times, BBC, and CNBC have reported numerous inaccuracies and bizarre responses from these AI overviews, and many users are begging Google for a way to turn them off. Some of the more ridiculous results have been widely shared on social media, including recommendations to use non-toxic glue to get cheese to stick to pizza, or to eat one small rock per day, an answer apparently sourced from the satirical publication The Onion. Yes, the same Onion that once convinced North Korea that Kim Jong-Un had been voted the sexiest man alive. These over-the-top answers are fun to laugh at, but the real issue is the potential for Google's incorrect answers to actually hurt people. For example, one AI answer recommended drinking vegetable juice as a home remedy for a burst appendix, and another shared some supposedly well-known health benefits of smoking tobacco, such as better lung volume and a reduced risk of cancer. To be fair to Google, this isn't entirely on them. Many trolls have hopped on the Google hate train and created fake AI overview screenshots, and many people have asked weird questions designed to elicit a harmful answer. Google spokesperson Lara Levin stated that "many of the examples we've seen have been uncommon queries, and we've also seen examples that were doctored or that we couldn't reproduce". But regardless of the number of doctored fakes versus real screenshots floating around, public uproar has forced Google to take a step back. Cyrus Shepard, owner of Zyppy SEO, said there seem to be far fewer AI overviews than there were just a few days ago. Real estate marketer Matt McGee also reported that less than 5% of the real estate queries he tested have AI overviews, and Lily Ray said she can hardly get AI overviews to trigger at all right now. A report from Kevin Indig in his newsletter Growth Memo goes even further by attempting to gauge the impact of AI overviews on organic traffic. He included 1,675 health queries in his experiment and found that 42% of them show AI overviews. That seems like quite a lot, especially for the health niche, though the experiment ran from May 7th to May 21st, which is likely before Google started to turn the dial down. His analysis found that AI overviews often reduce traffic to the cited URLs. That said, the impact can vary depending on the user's intent, with some complex queries potentially seeing increased traffic. Even when AI overviews don't cite a domain, there's still a slight decline in its organic traffic, though the fact that the drop is only slight indicates users still engage with organic results despite the presence of AI content. This is a small experiment in a single niche, and we'll need much more data to draw conclusions. However, the decrease of AI overviews in the search results and the early data showing that the traffic decline is slight are both encouraging signs for site owners. On top of that, we haven't seen any traffic-drop horror stories since the release of AI overviews like we did with the HCU or the March core update. Which of the findings from Google's leaked documents do you think is the most important? How do you plan to incorporate this new information into your SEO efforts? Let us know in the comments below. We'll be answering them for the next week. If you enjoyed this video, then make sure to check out this video right here.