Researchers tested multiple ways to optimize a website for AI search and identified which changes boost visibility. They increased the visibility of smaller, lower-ranked websites by 115%, giving them the ability to outrank larger corporate sites that normally dominate the top of the search results.
The researchers, from Princeton University, Georgia Tech, Allen Institute for AI and IIT Delhi, observed that their Generative Engine Optimization technique, called GEO, was able to boost visibility in general by up to 40%.
Nine optimization techniques were tested across multiple knowledge domains (such as legal, history and science), and the researchers identified which ones worked, which did nothing and which actually made rankings worse.
Of particular interest is that some of the techniques worked especially well for specific knowledge domains, while three worked particularly well across all kinds of sites.
The researchers underlined the ability of GEO to democratize the top of the search results, writing:
“This finding underscores the potential of GEO as a tool to democratize the digital space.
Importantly, many of these lower-ranked websites are often created by small content creators or independent businesses, who traditionally struggle to compete with larger corporations that dominate the top rankings in search engine results.”
Tested On Perplexity.ai
The researchers tested their methods on the Perplexity.ai search engine and on an AI search engine modeled on Bing Chat, and found that the Perplexity.ai results were similar to those from the engine modeled on Bing Chat.
In section 6 of the research paper they observe:
“We find, similar to our generative engine Quotation Addition performs the best in Position-Adjusted Word Count with a relative improvement of 22% over the baseline. Further, methods that performed well in our generative engine such as Cite Sources, Statistics Addition show high improvements of up to 9% and 37% on the two metrics.”
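The article does not reproduce the paper's formula for Position-Adjusted Word Count. As a rough sketch, assuming the metric credits a source with the words of the response sentences that cite it and discounts sentences that appear later in the response, it could be computed along these lines. The function name, the input format and the exponential decay are illustrative assumptions, not the researchers' code.

```python
import math

def position_adjusted_word_count(sentences, source_id):
    """Share of a response attributed to one source, discounting later sentences.

    `sentences` is a list of (text, cited_source_ids) pairs in response order.
    The exponential decay is an assumed weighting, chosen for illustration.
    """
    total_words = sum(len(text.split()) for text, _ in sentences)
    if total_words == 0:
        return 0.0
    n = len(sentences)
    score = 0.0
    for position, (text, cited) in enumerate(sentences):
        if source_id in cited:
            # Earlier sentences keep more of their word count.
            score += len(text.split()) * math.exp(-position / n)
    return score / total_words


# Toy response: source 1 is cited first and last, source 2 only in the middle.
response = [
    ("Citing reliable sources tends to help visibility.", {1}),
    ("Keyword stuffing does not.", {2}),
    ("Quotations from experts also help.", {1, 3}),
]
early = position_adjusted_word_count(response, 1)
late = position_adjusted_word_count(response, 2)
```

In this toy response, source 1 scores higher than source 2 because it is cited in more words and in the opening sentence.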
Tested On AI Search Modeled On Bing Chat
The researchers tested their methods on a generative search engine they built, modeled on the Bing Chat workflow, as well as on Perplexity.ai. The paper describes the engine as follows:
“We describe a generative engine, which includes several backend generative models and a search engine for source retrieval.
A Generative Engine (GE) takes as input a user query q_u and returns a natural language response r, where P_U represents personalized user information, such as preferences and history.
Generative Engines are comprised of two crucial components:
a.) A set of generative models G = {G_1, G_2, …, G_n}, each serving a specific purpose like query reformulation or summarization, and
b.) A search engine SE that returns a set of sources S = {s_1, s_2, …, s_m} given a query q.
We present a representative workflow…, which at the time of writing, closely resembles the design of BingChat. This workflow breaks down the input query into a set of simpler queries that are easier to consume for the search engine.”
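The workflow in the quoted passage can be sketched roughly as follows. This is a minimal illustration, assuming stand-in functions for the generative models and the search engine: `generative_engine`, `toy_reformulate`, `toy_search` and `toy_summarize` are hypothetical names, and a real engine would call language models and a web index where this sketch uses toy logic.

```python
from typing import Callable, List

def generative_engine(
    query: str,
    reformulate: Callable[[str], List[str]],
    search: Callable[[str], List[str]],
    summarize: Callable[[str, List[str]], str],
) -> str:
    """Break the query into simpler ones, retrieve sources, then write a response."""
    sources: List[str] = []
    for sub_query in reformulate(query):
        for source in search(sub_query):
            if source not in sources:  # keep each source once, in retrieval order
                sources.append(source)
    return summarize(query, sources)


# Toy stand-ins so the sketch runs end to end.
def toy_reformulate(query: str) -> List[str]:
    return [part.strip() for part in query.split(" and ")]

def toy_search(sub_query: str) -> List[str]:
    return [f"page about {sub_query}"]

def toy_summarize(query: str, sources: List[str]) -> str:
    cited = "; ".join(f"[{i}] {s}" for i, s in enumerate(sources, start=1))
    return f"Answer to '{query}' based on: {cited}"

answer = generative_engine(
    "citations and statistics", toy_reformulate, toy_search, toy_summarize
)
```

The point of the sketch is the ordering of the steps: the response is written only from the sources the search step returns, which is why the content of those sources affects how visible each one is in the answer.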
Search Queries Used For Testing
The researchers created a benchmark from nine different sources, containing 10,000 search queries across multiple knowledge domains and different levels of complexity. For example, some of the queries required reasoning to answer.
The research paper explains:
“…we curate GEO-BENCH, a benchmark consisting of 10K queries from multiple sources, repurposed for generative engines, along with synthetically generated queries. The benchmark includes queries from nine different sources, each further categorized based on their target domain, difficulty, query intent, and other dimensions.”
Here is a list of the nine search query sources:
1. MS MARCO
2. ORCAS-I
3. Natural Questions
4. AllSouls: This dataset contains essay questions from “All Souls College, Oxford University”
5. LIMA: contains challenging questions requiring Generative Engines to not only aggregate information but also perform suitable reasoning to answer the question
6. Davinci-Debate
7. Perplexity.ai Discover: These queries are sourced from Perplexity.ai’s Discover section, which is an updated list of trending queries
8. ELI-5: This dataset contains questions from the ELI5 subreddit
9. GPT-4 Generated Queries: To supplement diversity in query distribution, we prompt GPT-4 to generate queries ranging from various domains (eg: science, history) and based on query intent (eg: navigational, transactional) and based on difficulty and scope of generated response (eg: open-ended, fact-based)
Nine Ranking Strategies Tested
The researchers tested nine different methods for optimizing websites, tracking how the different approaches worked for different kinds of searches such as Law & Government, business, science, people & society, health, history and other topics.
They discovered that each kind of niche topic responded well to different optimization strategies.
The nine tested strategies are:
- Authoritative: Changing the writing style to be more persuasive in authoritative claims
- Keyword Optimization: Adding more keywords from the search query
- Statistics Addition: Changing existing content to include statistics instead of interpretative information
- Cite Sources: Adding citations from reliable sources
- Quotation Addition: Adding quotes and citations from high-quality sources
- Easy-to-Understand: Making the content simpler to understand
- Fluency Optimization: Making the content more articulate
- Unique Words: Adding words that are less widely used, rare and unique but without changing the meaning of the content
- Technical Terms: Adding both unique and technical terms wherever it makes sense to do so and without changing the meaning of the content
Which Methods Worked The Best?
The top three optimization strategies were:
- Cite Sources
- Quotation Addition
- Statistics Addition
Those three strategies achieved relative improvements of 30-40% compared to the baselines.
The researchers wrote about the success of these strategies:
“These methods, which involve adding relevant statistics (Statistics Addition), incorporating credible quotes (Quotation Addition), and including citations from reliable sources (Cite Sources) in the website content, require minimal changes to the actual content itself.
Yet, they significantly improve the website’s visibility in Generative Engine responses, enhancing both the credibility and richness of the content.”
The Fluency Optimization and Easy-to-Understand methods were also useful, improving visibility by 15-30%.
The researchers interpreted these results as showing that AI search engines value both the content and the presentation of the content.
What Optimization Strategies Didn’t Work
The researchers were surprised to discover that using persuasive and authoritative tones in the content generally did not improve rankings in AI search engines as much as the other approaches did.
Similarly, the method of adding more keywords from the search query into the content didn’t work either. In fact, keyword optimization performed worse than the baseline by 10%.
Optimizations Worked Differently Across Knowledge Domains
An interesting finding in the report is that the optimization that worked best depended on the knowledge domain (legal, government, science, history, etc.).
They found that content related to the Historical domain ranked better when the “Authoritative” optimization was applied, where more persuasive language was used.
The Citation optimization, where the content was improved with citations from authoritative sources, worked significantly better for factual search queries.
Adding statistics worked well for Law and Government related questions. Statistics also worked well for “opinion” questions, where a searcher asks the AI for its opinion about something.
The researchers observed:
“This suggests that the incorporation of data-driven evidence can enhance the visibility of a website in particular contexts especially these.”
Adding quotations worked well for the People & Society, Explanation, and History knowledge domains. The researchers interpreted those results to mean that perhaps the AI search engine prefers “authenticity” and “depth” for these kinds of questions.
The researchers concluded that making domain-specific optimizations was the best approach to take.
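The domain findings above could be captured in a simple lookup table. The sketch below is an illustration only: the dictionary keys and the `suggest_strategies` helper are hypothetical names, the fallback to the three top strategies is an assumption based on the overall results reported earlier, and the mapping restates this article's summary, not the paper's full result tables.

```python
TOP_STRATEGIES = ["Cite Sources", "Quotation Addition", "Statistics Addition"]

# Category -> strategies the article reports as working well for it.
DOMAIN_STRATEGIES = {
    "history": ["Authoritative", "Quotation Addition"],
    "factual": ["Cite Sources"],
    "law & government": ["Statistics Addition"],
    "opinion": ["Statistics Addition"],
    "people & society": ["Quotation Addition"],
    "explanation": ["Quotation Addition"],
}

def suggest_strategies(category: str) -> list:
    """Return the strategies reported for a category, else the general top three."""
    return DOMAIN_STRATEGIES.get(category.strip().lower(), TOP_STRATEGIES)


history_plan = suggest_strategies("History")
fallback_plan = suggest_strategies("cooking")  # not covered, so the top three apply
```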
Low Ranked Websites Improved Rankings With GEO
The good news from this research is that normally low-ranked websites will benefit from these strategies for optimizing for AI search engines.
“Interestingly, websites that are ranked lower in SERP, which typically struggle to gain visibility, benefit significantly more from GEO than those ranked higher.
For instance, the Cite Sources method led to a substantial 115.1% increase in visibility for websites ranked fifth in SERP, while on average the visibility of the top-ranked website decreased by 30.3%.
…the application of GEO methods presents an opportunity for these small content creators to significantly improve their visibility in Generative Engine responses.
By enhancing their content using GEO, they can reach a wider audience, thereby leveling the playing field and allowing them to compete more effectively with larger corporations in the digital space.”
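The percentages in the quoted passage are relative changes against a baseline. As a worked illustration, the calculation looks like this. The visibility scores below are hypothetical, chosen only so the arithmetic lands on the quoted percentages, and `relative_change` is an illustrative helper name.

```python
def relative_change(baseline: float, optimized: float) -> float:
    """Percentage change of the optimized score relative to the baseline."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    return (optimized - baseline) / baseline * 100.0


# Hypothetical visibility scores, not taken from the paper.
fifth_ranked = relative_change(baseline=10.0, optimized=21.51)  # about +115.1
top_ranked = relative_change(baseline=30.0, optimized=20.91)    # about -30.3
```

A gain of 115.1% means the score more than doubled, while the top-ranked site's 30.3% loss means it kept roughly 70% of its original visibility.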
Game Changer For SEO
This research study shows a new path for SEO when it comes to AI-based search engines. Those who said that AI search was going to defeat SEO spoke too soon. This research appears to show that SEO will eventually evolve into GEO in order to compete in the next generation of AI search engines.
Read the research study here: