What is Indexing API

The Indexing API is an API provided by Google that allows for the quick summoning of Google's crawler for pages that need to be indexed immediately, such as job postings or real-time events.

The Indexing API offers the following functionalities:

  • Update URLs
    Notifies Google about new URLs to be crawled or informs them about updated content for URLs previously submitted.
  • Remove URLs
    Notifies Google after removing pages from the server, prompting Google to remove the corresponding pages from its index and avoid recrawling those URLs.
  • Get Request Status
    Allows checking various notifications related to the specified URL that Google has received recently.
  • Send Batch Indexing Requests
    Reduces the number of required HTTP connections for a site's client by combining up to 100 calls into a single HTTP request.

How to Use the Indexing API

In this article, I will introduce how to use the Indexing API in Google Colab.

Create a Google Cloud Project

First, go to the Service Accounts page in the Google Cloud console and create or select a project. For this article, I have created and selected a project named sandbox.

Create a Service Account

Click on "CREATE SERVICE ACCOUNT" to create a service account. In this article, I have created a service account with the name indexing-api. Enter the service name and click the "DONE" button.

Indexng API | 1
Indexng API | 2

Select "Manage keys" for the added service account.

Indexng API | 3

Click "Add KEY" > "Create new key." Choose the type of key as JSON and click the "CREATE" button. Once created, the key will be downloaded, so make sure to keep it safe.

Indexng API | 4
Indexng API | 5

Enable the Indexing API

Enable the Indexing API by searching for "indexing api" in the search bar, selecting the Indexing API, and clicking on the "Enable" button. If it says "API Enabled," then it is successfully enabled.

Indexng API | 6
Indexng API | 7

Verify Ownership in Google Search Console

Verify ownership by going to the Ownership Verification page in Google Search Console.

Indexng API | 8

Make sure "You are a verified owner" is displayed.

Add Owner in Google Search Console

Open the Permission Management page in Google Search Console, select the verified website, and add the address listed in the client_email of the JSON key for the service account created in Google Cloud as an owner.

Indexng API | 9

Execute Indexing API in Google Colab

Open Google Colab, and upload the JSON key file for the service account.

Then, paste the following code:

from oauth2client.service_account import ServiceAccountCredentials
import httplib2

import time

SCOPES = ["https://www.googleapis.com/auth/indexing"]
ENDPOINT = "https://indexing.googleapis.com/v3/urlNotifications:publish"

# Write path of json key file you got from GCP console.
JSON_KEY_FILE = "sandbox-aaaaaaaaaaaa.json"

credentials = ServiceAccountCredentials.from_json_keyfile_name(JSON_KEY_FILE, scopes=SCOPES)

http = credentials.authorize(httplib2.Http())

# Define contents here as a JSON string.
# This example shows a simple update request.
# Other types of requests are described in the next step.

# Write the website urls you want Google crawl
urls = [
    "https://io.traffine.com/id/articles/minikube",
    "https://io.traffine.com/id/articles/kubectl",
]

for url in urls:
    content = """{
    "url": "{url}",
    "type": "URL_UPDATED"
    }""".replace('{url}', url)

    response, content = http.request(ENDPOINT, method="POST", body=content)

    if response.status == 200:
        print(f"Succeeded. Status: {response.status} {url}")
    else:
        print(f"Failed. Status: {response.status} {url}")
        print(content)

    time.sleep(5)

After running the code, the following output will be printed:

Succeeded. Status: 200 https://io.traffine.com/id/articles/minikube
Succeeded. Status: 200 https://io.traffine.com/id/articles/kubectl

After sending the requests, the respective URLs will be crawled within a maximum of about 2 minutes. In some cases, it can be as quick as a few seconds.

References

https://developers.google.com/search/apis/indexing-api/v3/prereqs
https://developers.google.com/search/apis/indexing-api/v3/using-api

Ryusei Kakujo

researchgatelinkedingithub

Focusing on data science for mobility

Bench Press 100kg!