2023-07-03

Automating Sitemap Updates Using GitHub Actions

Introduction

As part of SEO, registering your sitemap with Google Search Console is an important step, especially for new websites. A sitemap helps Google's web crawlers find and index your pages more efficiently, which can improve your site's visibility in search results.

Once you've registered your sitemap with Google Search Console, you might think your job is done. However, this is not the case. In reality, Google doesn't check your sitemap every time your site is crawled.

https://developers.google.com/search/docs/crawling-indexing/sitemaps/build-sitemap

Therefore, it is important to regularly notify Google of updates to your sitemap, especially if you are frequently adding or modifying content on your site.

Notifying Google of Sitemap Updates

Initial Manual Update through Google Search Console

The most straightforward method to update your sitemap is through Google Search Console itself. Here, you can manually resubmit your sitemap for crawling. However, doing this every day can be quite tedious and time-consuming.

Automating Updates with URL Access

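To automate this, Google provides a simple mechanism: sending a request to a specific "ping" URL notifies Google that your sitemap has changed. The request does not modify the sitemap itself; it only asks Google to fetch it again. For example, requesting the following URL notifies Google about the sitemap at https://example.com/sitemap.xml: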

http://www.google.com/ping?sitemap=https://example.com/sitemap.xml
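The same request can be sent from a terminal. The following is a minimal sketch, assuming curl is installed; the function names build_ping_url and ping_sitemap are hypothetical helpers introduced only for illustration, and the ping URL is the one shown above.

```shell
#!/usr/bin/env bash
# Sketch: notify Google about a sitemap by requesting the ping URL.
# build_ping_url and ping_sitemap are hypothetical helper names.

# Compose the ping URL for a given sitemap URL.
build_ping_url() {
  local sitemap_url="$1"
  printf 'http://www.google.com/ping?sitemap=%s\n' "$sitemap_url"
}

# Send the GET request. --fail makes curl exit non-zero on an HTTP error,
# so a failed ping is not silently ignored.
ping_sitemap() {
  curl --fail --silent --show-error --output /dev/null "$(build_ping_url "$1")"
}

# Print the URL that would be requested (no network access needed).
build_ping_url "https://example.com/sitemap.xml"
```

Calling ping_sitemap "https://example.com/sitemap.xml" would perform the actual request; checking its exit status is one way to detect a ping that did not go through.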

Automating Sitemap Updates Using GitHub Actions

GitHub Actions lets you automate, customize, and execute software development workflows directly in your repository. One of its features is scheduled workflows, which are defined with cron syntax (cron is the time-based job scheduler in Unix-like operating systems). You can use this feature to notify Google of your sitemap on a regular schedule.

To accomplish this, add the following workflow file to your repository:

.github/workflows/update-sitemap.yml
name: Update Sitemap

on:
  schedule:
    - cron: '0 15 * * *'

jobs:
  update-sitemap:
    runs-on: ubuntu-latest
    steps:
      - name: Ping to Google
        run: |
          curl -X GET "http://www.google.com/ping?sitemap=https://io.traffine.com/sitemap.xml"

This workflow tells GitHub Actions to run a curl command that sends a GET request to the ping URL, thereby notifying Google of the sitemap. The schedule section controls when the workflow runs. Here, the cron expression 0 15 * * * (minute 0, hour 15, every day of the month, every month, every day of the week) runs the workflow every day at 15:00 UTC.
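Because the schedule is evaluated in UTC, it can help to check what the chosen hour means in your own time zone before committing the workflow. The following is a small sketch, assuming GNU date (for the -d option); Japan Standard Time is used purely as an example zone, and utc_to_jst is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: convert a UTC timestamp to a local time zone.
# Assumes GNU date. 'JST-9' is a POSIX TZ string meaning UTC+9,
# chosen here only as an example zone.
utc_to_jst() {
  TZ='JST-9' date -d "$1 UTC" '+%Y-%m-%d %H:%M'
}

# 15:00 UTC on the given day, expressed in UTC+9.
utc_to_jst '2023-07-03 15:00'
# prints: 2023-07-04 00:00
```

In this example zone, a run at 15:00 UTC corresponds to midnight local time, so the ping would go out at the start of each local day.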

References

https://developers.google.com/search/docs/crawling-indexing/sitemaps/build-sitemap

Ryusei Kakujo

