Imagine wielding the power to propel your website to the pinnacle of search results, outshining competitors and capturing the attention of countless users.
What if you could unlock the secrets behind Search Engine Results Pages (SERPs) and harness their potential to transform your online presence?
This article delves into the intricate world of SERPs, unravelling the interplay between crawling bots, indexing algorithms, and the ranking factors that determine which content rises to the top. From the nuances of relevance calculations to the critical elements of quality assessment, I’ll guide you through the labyrinth of search engine optimization. Prepare to master the art of digital visibility and learn how to navigate the ever-evolving landscape of online search with the precision of a seasoned SEO expert.
What is a SERP?
A SERP, or Search Engine Results Page, is the list of results that a search engine displays in response to a user’s query. It typically includes organic search results, paid ads, and various SERP features such as Featured Snippets, Knowledge Graphs, and video results. In some cases, the SERP will also include filters or query refinements to allow you to get more specific results.
SERPs have evolved significantly with the adoption of Natural Language Processing (NLP) technologies. Google’s BERT model, introduced in 2018, changed how search engines interpret queries by considering context and user intent rather than just matching keywords. This advancement allows search engines to understand natural language more effectively, providing more relevant results to users’ questions [4].
How does a search engine generate a SERP?
Search engines are complex machines. I’m not going to pretend that I know how it all works, but I will cover the foundations—the machine’s structure.
While search engines have evolved, these structures have remained and will continue to do so for the foreseeable future. However, like anything on the web, things can change faster than a politician’s views, so there may be a time when this model will become obsolete. I’ll be sure to keep you informed.
Understanding the underpinnings of a search engine will help you devise how to get your website to rank on a search engine. It gives you the tools to diagnose problems and take advantage of opportunities. It also allows you to adapt to changes when they occur. You’ll be set to crush your competition.
Below is the process that search engines use to build their results pages. I will step through the order from content discovery to displaying the results. This will help you understand a search engine’s steps and help you troubleshoot where you need to focus your time and efforts. For example, if your website isn’t in the index, you need to check that search engines can find and crawl your content.
Crawling
Crawling is the process search engines use to discover content on the web. Search engines use bots (software designed to do a specific task) to go from page to page by following links, grabbing each page’s content and storing it.
It looks something like this:
The cool thing about bots is that they are automated, so this process is never-ending. Search engines are continuously discovering new content.
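As a rough illustration of that loop, here is a minimal sketch of a crawler in Python. It assumes a made-up, in-memory set of pages (`FAKE_WEB`) instead of real HTTP requests, and the names `LinkExtractor` and `crawl` are mine, not any search engine’s. Real bots also handle politeness rules, scheduling, and errors, all of which are left out here.

```python
from collections import deque
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collects the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


# Hypothetical in-memory "web": URL -> HTML. A real bot would fetch over HTTP.
FAKE_WEB = {
    "/home": '<a href="/about">About</a> <a href="/blog">Blog</a>',
    "/about": '<a href="/home">Home</a>',
    "/blog": '<a href="/blog/post-1">Post</a> <a href="/home">Home</a>',
    "/blog/post-1": "<p>No links here.</p>",
}


def crawl(start_url, web):
    """Breadth-first crawl: follow links and store each page's content once."""
    stored = {}
    queue = deque([start_url])
    while queue:
        url = queue.popleft()
        if url in stored or url not in web:
            continue  # already visited, or the link leads nowhere
        html = web[url]
        stored[url] = html  # "grab the page's content and store it"
        extractor = LinkExtractor()
        extractor.feed(html)
        queue.extend(extractor.links)  # newly discovered pages to visit
    return stored


pages = crawl("/home", FAKE_WEB)
print(list(pages))
```

Starting from a single page, the bot reaches every page that is linked from somewhere it has already been. A page with no inbound links would never be found this way.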
FAQs about crawling
How often do search engines crawl websites?
Crawling frequency varies with a website’s popularity, how often its content is updated, and the site’s overall authority. You can’t control this directly, so make high-quality content, publish as frequently as you can, and get other people to talk about you.
How do search engines find content?
Search engines use various methods to discover content across the web. Here are the most common ways:
- Sitemaps
- Domain registration
- User submissions
- Backlinks
- Social media
- RSS feeds
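To make the first item concrete, here is a small sketch of how a bot might read the URLs out of a sitemap. The XML follows the public sitemaps.org format, but the URLs are placeholders and the function name `sitemap_urls` is hypothetical.

```python
import xml.etree.ElementTree as ET

# A tiny example sitemap. The URLs are placeholders.
SITEMAP_XML = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc><lastmod>2024-01-15</lastmod></url>
  <url><loc>https://example.com/blog/what-is-a-serp</loc></url>
</urlset>"""

# Sitemap elements live in this XML namespace.
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Return every <loc> URL listed in a sitemap document."""
    root = ET.fromstring(xml_text)
    return [loc.text.strip() for loc in root.findall("sm:url/sm:loc", NS)]


urls = sitemap_urls(SITEMAP_XML)
print(urls)
```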
Can I control which pages are crawled?
Yes, website owners can use tools like robots.txt, meta robots tags, and canonical tags to guide search engine crawlers. The key word here is guide. Crawlers are supposed to follow these directives, but for various reasons, they might not.
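If you want to see how a well-behaved crawler reads robots.txt, Python’s standard library includes a parser for it. The sketch below uses made-up rules and a made-up bot name (`ExampleBot`), and it only shows what a compliant bot would do, not what every bot actually does.

```python
from urllib.robotparser import RobotFileParser

# Example rules: block the admin area, allow everything else.
ROBOTS_TXT = """\
User-agent: *
Disallow: /admin/
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

# A compliant bot asks before fetching each URL.
blog_allowed = parser.can_fetch("ExampleBot", "https://example.com/blog/post")
admin_allowed = parser.can_fetch("ExampleBot", "https://example.com/admin/login")

print(blog_allowed)   # allowed by "Allow: /"
print(admin_allowed)  # blocked by "Disallow: /admin/"
```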
How do I know if my pages are being crawled?
You can use tools like Google Search Console to check if your pages are being crawled and identify any issues that have arisen.
Do search engines process content during the crawling phase?
Search engines do not process content during the crawling phase. Crawling is focused on discovering and fetching web pages. The actual processing and understanding of content occurs during the indexing phase.
Indexing
To index something is to organize information so that it is easier to retrieve. Once a search engine has discovered your content, it must organize it alongside all of the other content it holds so that it can be retrieved.
In computer science, indexing refers to building a data structure that speeds up data retrieval. This could involve structures such as hash maps, binary search trees, or B-trees. The goal is to reduce the time it takes to find specific data within a larger dataset. For instance, when searching for a term in a database, an index lets the system locate the relevant information without scanning through every entry sequentially.
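Here is a toy version of that idea, using a hash map (a Python dictionary) that maps each word to the documents containing it. This structure is usually called an inverted index. It is a common choice for text search, but I’m using it as an illustration, not claiming it is what any particular search engine runs. The documents and function names are made up.

```python
from collections import defaultdict

# Hypothetical mini-corpus: document id -> text.
DOCS = {
    1: "how search engines crawl the web",
    2: "how to bake sourdough bread",
    3: "search engines rank pages by relevance",
}


def build_index(docs):
    """Map every word to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index


def lookup(index, query):
    """Return ids of documents containing every word in the query."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = None
    for term in terms:
        ids = index.get(term, set())
        result = set(ids) if result is None else result & ids
    return result


index = build_index(DOCS)
print(lookup(index, "search engines"))
print(lookup(index, "sourdough engines"))
```

The lookup touches only the entries for the query words, so it never has to read document 2 to answer a query about search engines.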
The most common analogy of this step is a filing cabinet or folders on your desktop. Search engines like Google use sophisticated systems to organize content so that when a specific query is typed into a search bar, all relevant content can be retrieved faster than Superman can change his suit.
The indexing step has four essential duties:
- Analyze the textual content, images, videos, tags, and attributes of the page
- Attempt to understand and categorize the collected content
- Use NLP techniques to comprehend the semantic meaning of the content
- After analysis, tag the content for quick retrieval
The indexing step is where search engines start processing the content they find. Because the internet is full of multimedia, one of the first steps is categorizing content by type (video, image, text, etc.). Once the content type has been determined, the content can be processed by tools specifically designed to understand that content type. For example, LLMs (Large Language Models) like ChatGPT are great at analyzing text but struggle with images.
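A simple way to picture the categorizing step is a router that looks at a page’s Content-Type header and hands the content to the right kind of tool. This is only a sketch: the analyzer names in `ANALYZERS` are placeholders I invented, and real pipelines are far more involved.

```python
def media_category(content_type):
    """Reduce an HTTP Content-Type value to a coarse content category."""
    mime = content_type.split(";", 1)[0].strip().lower()
    top_level = mime.split("/", 1)[0]
    if top_level in {"text", "image", "video", "audio"}:
        return top_level
    return "other"


# Hypothetical analyzer names, one per content category.
ANALYZERS = {
    "text": "language_model",
    "image": "vision_model",
    "video": "video_pipeline",
    "audio": "speech_to_text",
    "other": "generic_parser",
}


def route(content_type):
    """Pick the analyzer that should process this content type."""
    return ANALYZERS[media_category(content_type)]


print(route("text/html; charset=utf-8"))
print(route("image/png"))
print(route("application/pdf"))
```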
Because two words spelled the same can have different meanings, today’s search engines try to understand the semantic meaning of the content they analyze. They want to know the context and the intent of the content, and they want to see the relationships between topics and concepts. So, gone are the days of keyword stuffing.
Ranking
Ranking is likely part of the indexing process, but I want to give it the royal treatment because it’s vital to SEO (search engine optimization). Crawling and indexing are table stakes; ranking is where your SEO strategy will make gains.
Relevance
Like the bouncer at Hakkasan, relevance is the first barrier to entry on a results page. If your page isn’t relevant to the query, it won’t make it onto the results list.
What is relevance?
By definition, relevance refers to how one topic relates to another. However, relationships can be hard to define. A more technical, nuanced definition: something (A) is relevant to a task (T) if it increases the likelihood of accomplishing the goal (G) implied by T. Essentially, relevance is how close two topics are to each other and whether they complement each other. Pertinence, applicability, and appropriateness are all synonyms of, and relevant to, relevance.
Surprisingly, using math makes relevance easier to understand. This is good because search engines don’t understand words. They understand math. I will now work through a few technical points, but I promise it’s easy.
Advances in NLP (Natural Language Processing) allow us to represent words as vectors. Remember this: a vector is an ordered list of numbers that marks a point in space. For example, an airport’s position can be written as a latitude and a longitude, and that pair of numbers is a vector that pilots can navigate between.
Previously, each word mapped to a single point, no matter which sentence it appeared in. That meant the word bank in bank robber and river bank occupied the same spot on the map. Up to this point, search engines didn’t understand context. When Google released BERT (Bidirectional Encoder Representations from Transformers) in 2018, it significantly advanced the ability of machines to understand human language by also considering the context of a word in a sentence. After that point, the word bank in bank robber and river bank would map to different points in space, allowing machines to better capture semantics.
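There are several ways to measure how close two points are, such as Euclidean distance, the dot product, and cosine similarity. (k-means, which often comes up in this context, is a clustering algorithm that relies on such a measure rather than being a measure itself.) For search engines, one of the most widely used is cosine similarity.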
What is cosine similarity?
Cosine similarity is a mathematical metric widely used in Natural Language Processing (NLP) to measure the similarity between two vectors in an inner product space. It is particularly effective for comparing text data, as it focuses on the orientation of vectors rather than their magnitude, making it robust for tasks involving text of varying lengths or word frequencies.
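In plain terms, cosine similarity is the dot product of two vectors divided by the product of their lengths. The sketch below uses made-up three-number vectors to show why orientation matters more than magnitude: a short document and a long document pointing in the same direction score as near-identical, even though they are far apart by Euclidean distance.

```python
import math


def cosine_similarity(a, b):
    """Dot product of a and b divided by the product of their lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)


# Toy vectors, invented for illustration.
short_doc = [1.0, 2.0, 0.0]
long_doc = [10.0, 20.0, 0.0]  # same direction, ten times the magnitude
unrelated = [0.0, 0.0, 5.0]   # points somewhere else entirely

same_direction = cosine_similarity(short_doc, long_doc)
different_direction = cosine_similarity(short_doc, unrelated)
straight_line_gap = math.dist(short_doc, long_doc)  # Euclidean distance

print(same_direction)       # very close to 1: same orientation
print(different_direction)  # 0: nothing in common
print(straight_line_gap)    # large, despite the identical orientation
```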
What’s important to note here is that cosine similarity over these vectors doesn’t count word frequency the way the previous SEO gold standard, TF-IDF (term frequency-inverse document frequency), does. It measures the semantic similarity of text. This is one of the reasons you will find content in a SERP that doesn’t contain the search query in its text.
Key takeaway
For SEOs, this means that for your content to show up in a SERP for a specific keyword, your content must meet a threshold of cosine similarity.
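To picture that threshold, here is a minimal sketch that scores a few pages against a query and keeps only those above a cut-off. Everything in it is an assumption for illustration: the vectors are invented, the 0.75 threshold is arbitrary, and search engines do not publish the values they actually use.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))


THRESHOLD = 0.75  # hypothetical cut-off, chosen only for this example

# Invented embeddings for a query and three pages.
query_vector = [0.9, 0.1, 0.0]
page_vectors = {
    "/serp-guide": [0.8, 0.2, 0.1],
    "/seo-basics": [0.6, 0.5, 0.1],
    "/sourdough": [0.0, 0.1, 0.9],
}


def relevant_pages(query, pages, threshold):
    """Keep only the pages whose similarity to the query meets the threshold."""
    scores = {url: cosine_similarity(query, vec) for url, vec in pages.items()}
    return {url: score for url, score in scores.items() if score >= threshold}


results = relevant_pages(query_vector, page_vectors, THRESHOLD)
print(sorted(results))
```

The sourdough page never makes the list, no matter how good it is, because it fails the relevance check first.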
Quality
Once a search engine determines your content is relevant to a query, it starts to look at the quality of the content in the results set. This is where a search engine decides who gets the top spot and who gets relegated to the bottom.
Quality can take more work to quantify. What’s quality to you may not be quality to me. However, there are several items that I believe search engines love when it comes to quality.
- Website authority (backlinks and their quality)
- Page popularity (backlinks and traffic)
- (Expected) Click-through rate
- Page errors (technical + spelling and grammar)
- Return rate, meaning how often visitors head straight back to the results page (I came, I saw, I puked, I left)
- Engagement rate
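One way to think about how these items could combine is a weighted score that sorts the pages that already passed the relevance check. This is purely a thought experiment: the weights, the 0-to-1 scales, and the `Page` fields below are all invented, and nobody outside the search engines knows the real formula.

```python
from dataclasses import dataclass


@dataclass
class Page:
    url: str
    authority: float     # all signals on a made-up 0-to-1 scale
    popularity: float
    expected_ctr: float
    engagement: float
    error_rate: float    # lower is better
    return_rate: float   # lower is better


def quality_score(page):
    """Combine the signals using hypothetical weights (not a real formula)."""
    return (
        0.30 * page.authority
        + 0.20 * page.popularity
        + 0.20 * page.expected_ctr
        + 0.20 * page.engagement
        - 0.05 * page.error_rate
        - 0.05 * page.return_rate
    )


# Invented pages that are assumed to have passed the relevance check.
candidates = [
    Page("/thin-content", 0.3, 0.2, 0.2, 0.1, 0.6, 0.8),
    Page("/serp-guide", 0.9, 0.8, 0.6, 0.7, 0.1, 0.2),
    Page("/seo-basics", 0.5, 0.4, 0.3, 0.4, 0.05, 0.3),
]

ranked = sorted(candidates, key=quality_score, reverse=True)
print([page.url for page in ranked])
```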
Key takeaway
Create content that people will like and share. Make your page easy to use and engage with. Help people get deeper into your website.
Conclusion
From understanding the complexities of crawling and indexing to grasping the nuances of relevance and quality in ranking algorithms, the journey to the top of search results requires knowledge, strategy, and persistence. Creating high-quality, relevant content that resonates with your audience and adhering to technical best practices can significantly improve your website’s visibility and performance in search results.
However, navigating the ever-changing landscape of SEO can be challenging, especially as search engines continue to evolve their algorithms and introduce new technologies like BERT. If you find yourself overwhelmed by the complexities of search engine optimization or want to ensure you’re making the most of your digital presence, it may be time to seek expert guidance.
A professional can provide tailored strategies to boost your website’s performance, help you understand the latest SEO trends, and ultimately drive more traffic and conversions for your business. Don’t let your website get lost in the vast sea of online content – take action now to secure your place at the top of the SERPs and watch your online presence soar.