Validate keyword difficulty and thin content, checking if Quora or Reddit rank w/ Python

One piece of advice that is pretty common—especially for those whose audience is primarily niche site builders or new affiliates—is to target keywords that have thin content, as these will usually be far easier to rank for, due to the lack of competition.

It’s sound advice—and I’d definitely recommend the same to anyone trying to get a brand new site to rank. But how do you check for thin content? Well, the usual go-to is to look for phrases that have sites like Reddit and Quora ranking, as not only do these indicate thin content, they also offer a helpful resource as a jumping-off point for any article you intend to write to rank for these terms.

So how do you check this? Well, one way would be to check manually—but frankly, this would be a pretty poor use of time. Instead, use my script, which takes a list of keywords, pulls the first page results using a SERP API, and filters the keywords, returning only those that have either of these two sites ranked somewhere on the first page.
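
Before we get to the full script, here’s a minimal sketch of the filtering idea, just to make it concrete. The get_top_urls() helper is a hypothetical placeholder—in the actual script below, the URLs come from ScrapingRobot’s SERP API—but the core check is the same.

# Minimal sketch: keep a keyword if Quora or Reddit appears anywhere
# in its first-page results. get_top_urls() is a hypothetical stand-in
# for a SERP lookup.
THIN_CONTENT_SITES = ("quora.com", "reddit.com")

def filter_easy_keywords(keywords, get_top_urls):
    easy = []
    for keyword in keywords:
        urls = get_top_urls(keyword)  # e.g. the top 10 organic result URLs
        if any(site in url.lower() for url in urls for site in THIN_CONTENT_SITES):
            easy.append(keyword)
    return easy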

The only reason I’m writing this article now is that I came across a tweet today from Craig Campbell sharing this tip with his followers—so I decided to take the 30 minutes or so needed to put this article and script together to help people automate the process.

Things you’ll need

To use this script—and just like my last article, on how to find guest posting opportunities automatically—you’re going to need a ScrapingRobot account.

You can create an account for free, which gives you 5,000 credits per month—more than enough for most people. In fact, I’ve been hammering my free account over the past couple of days and I still have a couple thousand credits left.

I must say, their product is pretty awesome. Not just in terms of the insanely generous number of credits you get for free, but also the product itself—I’ve not had any issues using it these past few days, it’s worked perfectly. I’m a big fan and highly recommend it—hopefully they get their act together and open up their affiliate program soon…

The code

Historically, I’ve been pretty bad with instructions—putting everything in a single file and not making much effort to explain what’s going on. I suppose it’s not surprising—I’m someone who likes to work almost exclusively with .txt files, which I’m sure most people would think is crazy.

This is why—and again, just like my last article—I’m making much more of an effort to split all these example scripts into separate files, so you can get a better idea of what’s going on and how these all work together.

So, let’s break down the code, its parts, and what each respective part does, giving you a brief overview of how it all works.

index.py

Not a lot to say here really—this is the file you’ll run. You’ll need to remember to include your ScrapingRobot token, and change the country code if you want to check against a local version of Google—for example, switch out “US” for “UK” if you’re interested in Google UK.

That’s it really, no further explanation needed here.

index.py

from importkeywords import *
from scrapingrobot import *
from writecsv import *

scrapingrobot_token = "xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx" #SCRAPINGROBOT TOKEN HERE
country = 'US' #COUNTRY CODE HERE

def main():
    keywords = import_keywords()
    easy_keywords = check_who_ranks(keywords, country, scrapingrobot_token)
    print(write_csv(easy_keywords))

if __name__ == "__main__":
    main()

importkeywords.py

This file contains the function that imports your list of keywords, which you need to provide as a text file named “keywords.txt” in the same directory as the rest of the script.

All it does is read the text file and convert it into a list—one keyword per line, ignoring blank lines—ready for use with the rest of the script.

importkeywords.py

def import_keywords():
    # Read keywords.txt (one keyword per line) and return it as a list,
    # skipping any blank lines
    with open("keywords.txt", "r") as file:
        data = [line.strip() for line in file if line.strip()]
    print("keywords imported!")
    return data
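
For reference, “keywords.txt” is just plain text with one keyword per line—something like this (the keywords below are purely an illustration):

keywords.txt

best budget mechanical keyboard
how to clean white trainers
is coffee good for plants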

scrapingrobot.py

This is where all the action happens. The script loops through each keyword, pulls the first 10 results from ScrapingRobot, then iterates over those results to check each URL for a match on either “quora.com” or “reddit.com”—meaning that one of these sites ranks somewhere in the top 10.

When a match is found, the keyword is appended to the “easy_keywords” list, which is returned once all keywords have been checked and the loop has finished running.

There’s also a one-second delay between each iteration of the loop. I couldn’t find an explicit rate limit in the ScrapingRobot documentation, but this delay has kept me clear of any issues so far.

scrapingrobot.py

import time
import requests

def check_who_ranks(keywords, country, token):
    easy_keywords = []
    matches = ["quora.com", "reddit.com"]
    for i, keyword in enumerate(keywords):
        print("processing "+str(i+1)+" of "+str(len(keywords))+"...")
        time.sleep(1)  # one-second pause between requests
        url = "https://api.scrapingrobot.com?token="+token
        payload = {
            "url": "https://www.google.com",
            "module": "GoogleScraper",
            "params": {
                "proxyCountry": country.upper(),
                "countryCountry": country.lower(),
                "query": keyword,
                "num": 10,
            }
        }
        headers = {
            "accept": "application/json",
            "content-type": "application/json"
        }
        response = requests.post(url, json=payload, headers=headers)
        data = response.json()
        if "result" in data:
            results = data["result"]["organicResults"]
            for result in results:
                # Append the keyword once if either site appears in a top-10 URL,
                # then break so the same keyword isn't added twice
                if any(site in result["url"].lower() for site in matches):
                    easy_keywords.append(keyword)
                    break
    return easy_keywords

writecsv.py

This file does exactly what it says on the tin—this is the function that writes the “easy_keywords” list to a CSV file named “easy_keywords.csv”, output to the same directory as the rest of the script.

If you want to change the file name—or add something like a timestamp to it—this is the place to do it; there’s a small example of the timestamp variation after the code below.

writecsv.py

import csv
 
def write_csv(data):
    # Write each keyword in the list as its own row of easy_keywords.csv
    with open('easy_keywords.csv', 'w', newline='') as file:
        writer = csv.writer(file)
        for row in data:
            writer.writerow([row])
    return 'written to csv!'
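
As an example of the kind of tweak mentioned above, here’s one way you could add a timestamp to the output file name. This isn’t part of the script itself—just a sketch of the variation, using Python’s built-in datetime module:

import csv
from datetime import datetime

def write_csv(data):
    # Same as above, but with the current date and time in the file name,
    # e.g. easy_keywords_2024-01-31_14-30-00.csv
    filename = 'easy_keywords_' + datetime.now().strftime('%Y-%m-%d_%H-%M-%S') + '.csv'
    with open(filename, 'w', newline='') as file:
        writer = csv.writer(file)
        for row in data:
            writer.writerow([row])
    return 'written to ' + filename + '!'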

How to use it

To use the script, you don’t need to know much about programming—just follow this list of instructions and copy and paste the code given above.

  • If you don’t already have Python, you’ll need to install it—you can refer to the official Python getting started guide here.

  • There are a few dependencies, namely “time”, “requests”, and “csv”. The “time” and “csv” modules ship with Python, but you may need to install “requests” yourself with “pip install requests”.

  • Copy and create each of the files given above, putting them in the same folder—it’s important to keep the file names exactly the same or it won’t work. You can, however, name the folder whatever you like.

  • Open up a terminal, navigate to the folder you put everything in—using “cd” to change directories—and enter “py index.py” into the command line (depending on your setup, the command may be “python index.py” or “python3 index.py” instead).

  • That’s it—enjoy!

Summing it up

Well, that’s all—this was a nice easy one for a change! Definitely a welcome break from some of the more complicated bits and pieces I’m working on at the minute.

This tip—looking for low-competition keywords based on the fact that either Reddit or Quora rank well for them—is a sound one, and definitely something worth focusing on initially, as it’s in places like this where the early SEO wins are to be had for any new site.

The script and method laid out here should be easy enough to follow and use, even for those with little or no programming knowledge. Follow my bulleted steps under “how to use it” and you shouldn’t run into any problems.

As always, if you made it this far—thanks for reading! And if you have any issues getting the script up and running, give me a shout in the comments and I’ll try to help where I can.
