Improving content quality at scale with AI
I’ve learned through experience to be cautious when the words “content” and “scale” appear close together in SEO, because it’s usually code for creating content in large volumes, primarily for search engines.
We’ve seen, time and time again, this approach end in disaster when the search engines work out what is going on.
When used correctly, however, AI can be a powerful assistant for SEO and help us work out how to improve the quality of our content.
What we’re going to do
Our goal is to use an automated process to find “intent gaps” in our content.
To do this, all in real time, we will:
- Crawl our content URLs.
- Analyze the text content on the page with ChatGPT.
- Compare this to an intent map of Google’s People Also Ask data to determine where we have gaps in our content.
The result will be a spreadsheet that potentially saves us hundreds of hours by automatically listing questions that our content does not answer, which Google has already determined are related to the page’s intent.
Tools we need
- Screaming Frog SEO Spider: This popular web crawler recently released v20, which, among other things, includes a new feature we will use to execute custom JavaScript while crawling, meaning we can extract data as we go.
- OpenAI API: The OpenAI API will allow us to programmatically interact with ChatGPT for content analysis. Summarizing and reviewing content, rather than creating it, is one of the strongest uses for large language model (LLM) systems.
- AlsoAsked API: AlsoAsked is the only tool with an async/sync API that allows us to programmatically query and access People Also Ask data in any language and region supported by Google.
Why this approach is so powerful
People Also Ask (PAA) data
We’re using PAA data for this project because it has several distinct advantages over other types of keyword data:
Intent clustering by Google
Google uses PAA boxes to help users refine queries, but they also serve as a feedback loop of interaction data that helps Google understand what users, on average, want from a query.
The term ‘intent’ generally refers to the overall goal a user wants to complete, and that intent can span several searches. Google’s research has shown that complex tasks take, on average, eight searches to complete.
In the above example, Google knows that when users have the intent of learning how to change a car battery, one of the most common searches they will perform on this journey is asking which terminal to take off first.
We also know that Time To Result (TTR) is one of the metrics Google uses to measure its own performance: essentially, how quickly a user has completed their mission and fulfilled their intent. It therefore makes sense that we can improve our content and reduce TTR by covering searches that are in close ‘intent proximity’ to the topic of our article.
If we can make the content more useful, we’re improving its chances of ranking well. No other source of keyword data can provide such detail on queries that come up as ‘zero volume’ keywords on traditional research tools.
Recency
No other source of search data surfaces new queries and updates as quickly as People Also Ask. As I write this, GPT-4o was released four days ago, yet major keyword research tools still (incorrectly) report zero searches for “GPT-4o”:
For the same search term, you can see that Google’s People Also Ask feature has already been updated with numerous queries about GPT-4o, asking if it’s free and how it’s better.
Being the first to publish on a particular topic is a huge advantage in SEO. Not only are you almost guaranteed to rank if you’re one of the first sites to produce the content, but new topics usually attract an early flurry of links to those first sites, which helps sustain rankings.
The recency of the data also means it’s an excellent way to see if your content needs updating to align with the current search intent, which is not static.
Step-by-step approach
1. Update Screaming Frog to v20.1 or later
Before we begin, it’s worth checking that you have the latest version of Screaming Frog. CustomJS was introduced in v20.0, and since v20.1 the AlsoAsked + ChatGPT snippet has been packaged with the installer, so you don’t need to add it manually.
Screaming Frog can update itself directly from within the program, as long as only one instance is running. To find this option, go to Help > Check for Updates; applying the update will require a restart.
2. Crawl site URLs
Although we will not run this process on all URLs, we need a list of URLs to choose from. The easiest way to achieve this is to start a standard crawl of your website with Screaming Frog and select the HTML filter to view pages.
If your site requires client-side JavaScript to render content and links, don’t forget to go into Configuration > Spider > Rendering and change Rendering from Text Only to JavaScript.
3. Select content URLs
Although this process can work on all different types of pages, it tends to offer the most value on informational pages. We must also consider that each URL we query will consume OpenAI tokens and AlsoAsked credits.
For this reason, I would recommend starting with your content URLs. For this example website, I will look at blog posts, which I know all have /blogs/ in the URL.
Screaming Frog offers a quick way to show only these URLs by typing ‘/blogs/’ into the filter box at the top right.
Your URL pattern may differ, and it doesn’t matter if there is no obvious URL pattern, as Screaming Frog offers a powerful Custom Search to filter based on page rules.
For this example, I will simply select and Copy the URLs I am interested in, although it would also be possible to export them to a spreadsheet if you have a large number you want to work through.
4. Import CustomJS
The new CustomJS option can be found under the Configuration > Custom > Custom JavaScript menu.
This will open the Custom JavaScript window. In the bottom right, click the + Add from Library button to load a list of pre-packaged custom JavaScript that ships with Screaming Frog.
Scroll down and select (AlsoAsked+ChatGPT) Find unanswered questions and click Insert.
5. Configure API keys
We’re not quite ready to go yet. We now need to edit the imported JavaScript to add our API keys, but don’t worry, that’s really easy!
Once the CustomJS is imported, you need to click on this edit icon:
You should now see the JavaScript code in the editor window. There are two parts you need to edit, which are in capitals: ‘ENTER CHATGPT API KEY’ and ‘ENTER ALSOASKED API KEY’.
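For orientation, the placeholders you are replacing will look something like the snippet below. The exact variable names in the packaged script may differ; the point is simply that each capitalized placeholder gets swapped for the corresponding key.

```javascript
// Illustrative only: check the packaged script for the exact variable names it uses.
const OPENAI_API_KEY = 'ENTER CHATGPT API KEY';      // paste your OpenAI secret key here
const ALSOASKED_API_KEY = 'ENTER ALSOASKED API KEY'; // paste your AlsoAsked API key here
```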
OpenAI API key
You can create an OpenAI key from https://platform.openai.com/api-keys
When you click Create new secret key, you’ll be prompted for a name and the project it’s attached to. You can call these whatever you like. OpenAI will then generate a secret key (be careful never to share this!), which you can copy to your clipboard and paste into your Screaming Frog CustomJS edit window.
The cost of ChatGPT will depend on token usage, which in turn depends on the length of the pages we analyze. Before deploying anything, it’s worth double-checking the spending limits you have set up to make sure you don’t unexpectedly go over budget.
AlsoAsked API key
AlsoAsked API access requires a Pro account, which provides 1,000 queries every month, although you can buy additional credits if you need to do more.
The costs here are much easier to predict: a single URL costs $0.06, or as little as $0.03 with bulk Pay As You Go credits. That means you could fully analyze 1,000 URLs of content for as little as $30, work that would take days to do manually.
With a Pro account, you can create an API key.
Once again, give the key a name you will recognize, leave the ‘environment’ set to ‘Live’ and click ‘Create key’.
This will generate an API key to paste into the Screaming Frog CustomJS edit window.
6. Review settings
Configure PAA language and region
AlsoAsked supports all of the same languages and regions that Google offers, so if your website is not in English or targeting Great Britain, you can configure these two settings within the JavaScript from line 25 onwards.
You can use any ISO 639 language code and ISO 3166 country code. Note that Google’s coverage of People Also Ask data is much lower for non-English languages.
Occasionally, English results will be returned as a fallback if no results exist for the chosen region/language combination, as there are often intent commonalities.
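As an example, a French site targeting France might set values like the ones below. The variable names here are illustrative; check the packaged script from line 25 onwards for the exact settings it uses.

```javascript
// Illustrative only: the packaged script defines equivalent settings from around line 25.
const language = 'fr'; // ISO 639 language code, e.g. 'en', 'fr', 'de'
const region = 'fr';   // ISO 3166 country code, e.g. 'gb', 'us', 'fr'
```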
Customize the ChatGPT prompt
The current prompt used in the script for ChatGPT is:
- List the questions in this JSON array ${JSON.stringify(questions)} which are not answered in the text content of this page, but would make sense to answer in context to the rest of the content. Output the questions that are not answered in a JSON array of strings within an object called unanswered_questions.
There may be ways to improve the output with more specific prompting related to your content by editing the middle part of the prompt (the description of which questions should be kept). This is worth playing around with to see where you get the best output for your website.
To improve the output, the prompt also asks ChatGPT to return not every unanswered question, but only the unanswered questions that would make sense to answer given the rest of the content on the page.
Warning: The beginning and end of the prompt specify formats, variables and objects that are used elsewhere in the script. If you change these without adjusting the script, it will likely break.
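For reference, this is roughly how the prompt is assembled and how the reply is parsed. It is a sketch rather than the exact packaged code: the function names are illustrative, questions stands for the array of PAA questions the script has fetched, and chatGptResponse for the raw text ChatGPT returns.

```javascript
// Sketch only: the packaged script may differ in detail, but the output contract is the same.
function buildPrompt(questions) {
  return `List the questions in this JSON array ${JSON.stringify(questions)} which are not answered ` +
    `in the text content of this page, but would make sense to answer in context to the rest of the content. ` +
    `Output the questions that are not answered in a JSON array of strings within an object called unanswered_questions.`;
}

// ChatGPT is expected to reply with JSON shaped like:
// { "unanswered_questions": ["Is GPT-4o free to use?", "Which battery terminal do you take off first?"] }
function parseReply(chatGptResponse) {
  return JSON.parse(chatGptResponse).unanswered_questions;
}
```

This is why the warning above matters: the unanswered_questions object name and the JSON array format are what the rest of the script relies on when it writes results into the crawl.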
Check H1 inputs
We prompt for People Also Ask data using the contents of the first heading (H1) on the target URL.
This means that if the page does not have a readable H1 tag, the script will fail, but I’m sure that, as we’re all SEOs, nobody will be in that position.
With a little coding, it is possible to change this variable to pass other parameters, such as the title tag, to fetch People Also Ask data, although our experiments have shown that H1s tend to be the best bet as they are a good description of the page’s content.
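If you do want to experiment, the change is small. Here is a minimal sketch, assuming the variable that seeds the People Also Ask lookup works roughly as described above (the names are illustrative, not the exact ones in the packaged script):

```javascript
// Read the H1 as the seed query, falling back to the <title> tag if no H1 is present.
const h1 = document.querySelector('h1');
const seedQuery = h1 ? h1.textContent.trim() : document.title;
```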
7. Run List crawl for selected URLs
Use the Clear button at the top of the Screaming Frog interface to start a new crawl, and then select the Mode menu and change the crawl type to List.
Important: As you will be running Custom JavaScript, you must ensure your rendering mode is set to JavaScript in Configuration > Spider > Rendering or the script will not execute.
The Upload button will now let you import your list of URLs. You can simply select Paste if you copied your URLs to the clipboard as I did. If you exported them to a file, select From a File…
8. View results
Your results will be in the Custom JavaScript tab, which you can find by clicking the down arrow to the right of the tabs and selecting Custom JavaScript.
Here, you will find your URLs, along with a list of PAA-based questions that ChatGPT has determined are not answered within your content but would make sense to answer.
Once the crawl is complete, you can use the Export button to produce a convenient spreadsheet for review.
Fit this in with your current SOPs
There are many ways to gather data to improve your content, from qualitative user feedback to looking at quantitative metrics within analytics. This is just one method.
This particular methodology is extremely useful because it provides inspiration based on actual search data while playing to the strength of LLMs: summarizing existing content and putting it into context, rather than generating it.
With some extra tooling, it would be possible to build these kinds of checks in as you are producing content and even on scheduled crawls to alert content creators when new gaps appear.
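As a starting point, the same unanswered-question check can be run outside Screaming Frog, for example against a draft before it is published. Below is a rough Node.js sketch that assumes you already have a list of PAA questions for the topic; the function name, model choice and prompt wording are illustrative, not the packaged script.

```javascript
// Rough sketch: check a draft against a saved list of PAA questions before publishing.
// Requires Node 18+ for the built-in fetch. Names and prompt wording are illustrative.
async function findGaps(openaiKey, draftText, paaQuestions) {
  const res = await fetch('https://api.openai.com/v1/chat/completions', {
    method: 'POST',
    headers: {
      'Authorization': `Bearer ${openaiKey}`,
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      model: 'gpt-4o',
      response_format: { type: 'json_object' },
      messages: [{
        role: 'user',
        content: `List the questions in this JSON array ${JSON.stringify(paaQuestions)} that are not ` +
          `answered in the following draft. Reply as JSON with an "unanswered_questions" array of strings. ` +
          `Draft: ${draftText}`
      }]
    })
  });
  const data = await res.json();
  return JSON.parse(data.choices[0].message.content).unanswered_questions;
}
```

Wire something like this into a CMS webhook or a scheduled crawl and you can flag new gaps to content creators as soon as the People Also Ask data shifts.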