How To Use IndexNow API With Python For Bulk Indexing


IndexNow is a protocol developed by Microsoft Bing and adopted by Yandex that lets site owners and SEO professionals easily notify search engines, through an API, when a webpage has been updated.

And today, Microsoft announced that it is making the protocol easier to implement by ensuring that submitted URLs are shared between search engines.

Given its positive implications and the promise of a faster indexing experience for publishers, the IndexNow API should be on every SEO professional’s radar.

Using Python to automate URL submission to the IndexNow API, or to make a single API request for bulk URL indexing, can make managing IndexNow more efficient for you.

In this tutorial, you’ll learn how to do just that, with step-by-step instructions for using the IndexNow API to submit URLs to Microsoft Bing in bulk with Python.

Note: The IndexNow API is similar to Google’s Indexing API, with just one difference: the Google Indexing API is only for job advertisements or for broadcast web pages that contain a video object.

Google announced that it will test the IndexNow API but hasn’t updated us since.

Bulk Indexing Using IndexNow API with Python: Getting Started

Below are the necessities to understand and implement the IndexNow API tutorial.

Below are the Python packages and libraries that will be used for the Python IndexNow API tutorial.

  • Advertools (must).
  • Pandas (must).
  • Requests (must).
  • Time (optional).
  • JSON (optional).

Before getting started, learning the basics can help you understand this IndexNow API and Python tutorial better. We will be using an API Key and a .txt file to provide authentication, along with specific HTTP Headers.

[Image: IndexNow API usage steps with Python.]

1. Import The Python Libraries

To use the necessary Python libraries, we will use the “import” command.

  • Advertools will be used for sitemap URL extraction.
  • Requests will be used for making the GET and POST requests.
  • Pandas will be used for taking the URLs in the sitemap into a list object.
  • The “time” module is to prevent a “Too many requests” error with the “sleep()” method.
  • JSON is for possibly modifying the POST JSON object if needed.

Below, you will find all of the necessary import lines for the IndexNow API tutorial.

import advertools as adv
import pandas as pd
import requests
import json
import time

2. Extracting The Sitemap URLs With Python

To extract the URLs from a sitemap file, different web scraping methods and libraries can be used, such as Requests or Scrapy.

But to keep things simple and efficient, I will use my favorite Python SEO package: Advertools.

With only a single line of code, all of the URLs within a sitemap can be extracted.

sitemap_urls = adv.sitemap_to_df("https://www.example.com/sitemap_index.xml")

The “sitemap_to_df” function of Advertools can extract all the URLs and other sitemap-related tags such as “lastmod” or “priority.”

Below, you can see the output of the “adv.sitemap_to_df” command.

[Image: Sitemap URL extraction can be performed via Advertools’ “sitemap_to_df” method.]

All of the URLs and dates are stored in the “sitemap_urls” variable.
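Because the extracted data includes dates, one possible use is keeping only recently changed URLs before submission. The sketch below is a minimal illustration under stated assumptions: the demo data frame, the recently_modified() helper, and the cutoff date are hypothetical stand-ins, and only the “loc” and “lastmod” column names are taken from the sitemap output described above.

```python
import pandas as pd

# Hypothetical stand-in for the data frame returned by adv.sitemap_to_df();
# only the "loc" and "lastmod" columns are assumed here.
sitemap_urls_demo = pd.DataFrame(
    {
        "loc": [
            "https://www.example.com/a/",
            "https://www.example.com/b/",
            "https://www.example.com/c/",
        ],
        "lastmod": ["2021-10-01", "2021-11-15", None],
    }
)


def recently_modified(df, since):
    """Return the URLs whose lastmod is on or after `since` (a date string)."""
    # Parse lastmod into timestamps; missing or unparsable values become NaT.
    lastmod = pd.to_datetime(df["lastmod"], errors="coerce", utc=True)
    cutoff = pd.Timestamp(since, tz="UTC")
    # Comparisons against NaT are False, so rows without a date are skipped.
    mask = lastmod >= cutoff
    return df.loc[mask, "loc"].drop_duplicates().to_list()


recent_urls = recently_modified(sitemap_urls_demo, "2021-11-01")
print(recent_urls)
```

With a real sitemap, the same helper could be called on the “sitemap_urls” data frame, provided it actually contains a “lastmod” column.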

Since sitemaps are useful sources for search engines and SEOs, Advertools’ sitemap_to_df method can be used for many different tasks, including a Sitemap Python Audit.

But that’s a topic for another time.

3. Take The URLs Into A List Object With “to_list()”

Python’s Pandas library has a method for taking a data frame column (data series) into a list object: to_list().

Below is an example usage:

sitemap_urls["loc"].to_list()

Below, you can see the result:

[Image: Pandas’ “to_list” method can be used with Advertools for listing the URLs.]

All URLs within the sitemap are now in a Python list object.

4. Understand The URL Syntax Of IndexNow API Of Microsoft Bing

Let’s take a look at the URL syntax of the IndexNow API.

Here’s an example:

https://<searchengine>/indexnow?url=url-changed&key=your-key

The URL syntax represents the variables and their relations to each other within the RFC 3986 standards.

  • The <searchengine> represents the search engine name that you will use the IndexNow API for.
  • The “?url=” parameter determines the URL that will be submitted to the search engine via the IndexNow API.
  • “&key=” is the API Key that will be used within the IndexNow API.
  • “&keyLocation=” provides proof that you are the owner of the website that the IndexNow API will be used for.

The “&keyLocation” parameter brings us to the API Key and its “.txt” version.
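Since the submitted URL is itself passed as a query parameter, it may be safer to percent-encode it. The following is a minimal sketch, not the method used in this tutorial: build_indexnow_url() and its parameter names are hypothetical, and the only library call is urlencode() from Python’s standard library.

```python
from urllib.parse import urlencode


def build_indexnow_url(page_url, key, key_location=None, search_engine="www.bing.com"):
    """Build a single-URL IndexNow GET request URL with encoded parameters."""
    params = {"url": page_url, "key": key}
    # keyLocation is only added when a key file location is supplied.
    if key_location:
        params["keyLocation"] = key_location
    # urlencode percent-encodes reserved characters such as ":", "/", "?" and "&".
    return f"https://{search_engine}/indexnow?{urlencode(params)}"


example = build_indexnow_url(
    "https://www.example.com/page?id=1&lang=en",
    "22bc7c564b334f38b0b1ed90eec8f2c5",
)
print(example)
```

Encoding matters mostly for URLs that carry their own query strings; without it, a “&” inside the submitted URL could be read as the start of a new parameter.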

5. Gather The API Key For IndexNow And Upload It To The Root

You’ll need a valid key to use the IndexNow API.

Use this link to generate the Microsoft Bing IndexNow API Key.

[Image: There is no limit for generating the IndexNow API Key.]

Clicking the “Generate” button creates an IndexNow API Key.

When you click on the download button, it will download the “.txt” version of the IndexNow API Key.

[Image: The IndexNow API Key can be generated at Microsoft Bing’s stated address.]
[Image: Downloaded IndexNow API Key as a txt file.]

The API key serves as both the name of the TXT file and the text inside it.

[Image: The IndexNow API Key in the TXT file should be the same as the name of the file and the actual API Key value.]
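The relationship between the file name and the file content can be checked locally before uploading. The sketch below is only an illustration: the helper names are hypothetical, and the key format rule (8 to 128 characters made of letters, digits, and dashes) is an assumption based on the public IndexNow documentation that should be verified there.

```python
import re
from pathlib import Path

# Assumption: keys are 8-128 characters of letters, digits, and dashes.
KEY_PATTERN = re.compile(r"[A-Za-z0-9-]{8,128}")


def is_valid_key(key):
    """Check the key against the assumed character and length rules."""
    return KEY_PATTERN.fullmatch(key) is not None


def write_key_file(key, folder="."):
    """Write '<key>.txt' containing only the key and return its path."""
    if not is_valid_key(key):
        raise ValueError("key does not match the expected format")
    path = Path(folder) / f"{key}.txt"
    path.write_text(key, encoding="utf-8")
    return path


def key_file_matches(path):
    """Return True when the file name (without .txt) equals the file content."""
    path = Path(path)
    return path.stem == path.read_text(encoding="utf-8").strip()
```

Calling key_file_matches() on the downloaded file is a quick way to catch an accidentally renamed or edited key file before it goes to the server.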

The next step is uploading this TXT file to the root of the website’s server.

Since I use FileZilla for my FTP, I have uploaded it easily to my web server’s root.

[Image: By putting the .txt file into the web server’s root folder, the IndexNow API setup can be completed.]

The next step is performing a simple for loop example for submitting all of the URLs within the sitemap.

6. Submit The URLs Within The Sitemap With Python To IndexNow API

To submit a single URL to IndexNow, you can use a single “requests.get()” instance. But to make it more useful, we will use a for loop.

To submit URLs in bulk to the IndexNow API with Python, follow the steps below:

  1. Create a key variable with the IndexNow API Key value.
  2. Replace the <searchengine> section with the search engine that you want to submit URLs to (Microsoft Bing, or Yandex, for now).
  3. Assign all of the URLs from the sitemap within a list to a variable.
  4. Assign the URL of the “txt” file within the root of the web server to a variable.
  5. Place the URL, key, and key location URL within the formatted string.
  6. Start your for loop, and use “requests.get()” for all of the URLs within the sitemap.

Below, you can see the implementation:

key = "22bc7c564b334f38b0b1ed90eec8f2c5"
# URL of the key's .txt file in the web server's root.
location = "https://www.example.com/22bc7c564b334f38b0b1ed90eec8f2c5.txt"
url = sitemap_urls["loc"].to_list()
for i in url:
    endpoint = f"https://bing.com/indexnow?url={i}&key={key}&keyLocation={location}"
    response = requests.get(endpoint)
    print(i)
    print(endpoint)
    print(response.status_code, response.content)
    #time.sleep(5)

If you’re concerned about sending too many requests to the IndexNow API, you can use the Python time module to make the script wait between every request.
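One way to go a step beyond a fixed pause is to wait longer whenever a 429 status code comes back. The following is a minimal sketch under stated assumptions: submit_with_backoff() and its parameters are hypothetical, and the “send” callable stands in for whatever performs the request (for example, a small wrapper around “requests.get()” that returns the status code).

```python
import time


def submit_with_backoff(send, urls, pause=1.0, max_retries=3, sleep=time.sleep):
    """Call send(url) for each URL, waiting longer after every 429 response.

    `send` is any callable that returns an HTTP status code; it is injected
    so the sketch can be tried without network access.
    """
    results = {}
    for url in urls:
        delay = pause
        for attempt in range(max_retries + 1):
            status = send(url)
            if status != 429 or attempt == max_retries:
                break
            # Too many requests: wait, then double the delay for the next try.
            sleep(delay)
            delay *= 2
        results[url] = status
        # Fixed pause between different URLs.
        sleep(pause)
    return results
```

The returned dictionary maps each URL to the last status code seen for it, so URLs that still end on 429 after all retries can be collected and resubmitted later.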

Here you can see the output of the script:

[Image: The empty string as the request’s response body represents the success of the IndexNow API request, according to Microsoft Bing’s IndexNow documentation.]

The 200 Status Code means that the request was successful.

With the for loop, I have submitted 194 URLs to Microsoft Bing.

According to the IndexNow Documentation, the HTTP 200 Response Code signals that the search engine is aware of the change in the content or the new content. But it doesn’t necessarily guarantee indexing.
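For logging purposes, a status code can be translated into a short message. The lookup below is a hedged sketch: only the meanings of 200 and 429 come from this tutorial, while the other entries are assumptions based on the public IndexNow documentation and should be checked against it. The explain_status() helper is hypothetical.

```python
# 200 and 429 are described in this tutorial; the remaining entries are
# assumptions taken from the public IndexNow documentation and should be verified.
INDEXNOW_STATUS = {
    200: "OK: the search engine is aware of the submitted URL.",
    202: "Accepted: URL received, key validation pending (assumed).",
    400: "Bad request: invalid format (assumed).",
    403: "Forbidden: the key is not valid (assumed).",
    422: "Unprocessable: URLs do not belong to the host or the key does not match (assumed).",
    429: "Too many requests were made to the IndexNow API.",
}


def explain_status(code):
    """Return a short description for an IndexNow response status code."""
    return INDEXNOW_STATUS.get(code, "Unexpected status code; check the documentation.")


print(explain_status(200))
print(explain_status(429))
```

Whatever the message says, a 200 response still only means that the submission was received, not that the URL will be indexed.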

For instance, I have used the same script for another website. After 120 seconds, Microsoft Bing says that 31 results are found. And conveniently, it shows four pages.

The only problem is that on the first page there are only two results, and it says that the URLs are blocked by Robots.txt even if the blocking was removed before submission.

This can happen if the robots.txt was modified to unblock some URLs before using the IndexNow API, because it seems that Bing does not check the Robots.txt again.

Thus, if you previously blocked them, they try to index your website but still use the previous version of the robots.txt file.

[Image: It shows what will happen if you use the IndexNow API while blocking Bingbot via Robots.txt.]

On the second page, there is only one result:

[Image: Microsoft Bing might use a different indexation and pagination method than Google. The second page shows only one among the 31 results.]

On the third page, there is no result, and it shows Microsoft Bing Translate for translating the string within the search bar.

[Image: It shows that sometimes Microsoft Bing infers the “site” search operator as a part of the query.]

When I checked Google Analytics, it showed that Bing still hadn’t crawled the website or indexed it. I know this is true as I also checked the log files.

[Image: Google and Bing Indexing Processes]

Below, you will see the Bing Webmaster Tool’s report for the example website:

[Image: Bing Webmaster Tools Report]

It says that I submitted 38 URLs.

The next step will involve the bulk request with the POST Method and a JSON object.

7. Perform An HTTP Post Request To The IndexNow API

To perform an HTTP POST request to the IndexNow API for a set of URLs, a JSON object should be used with specific properties.

  • host represents the hostname of the website that the submitted URLs belong to.
  • key represents the API Key.
  • keyLocation represents the location of the API Key’s txt file within the web server.
  • urlList represents the URL set that will be submitted to the IndexNow API.
  • Headers represent the POST Request Headers that will be used, which are “Content-Type” and its “charset.”

Since this is a POST request, “requests.post()” will be used instead of “requests.get().”

Below, you will find an example of a set of URLs submitted to Microsoft Bing’s IndexNow API.

data = {
  "host": "www.example.com",
  "key": "22bc7c564b334f38b0b1ed90eec8f2c5",
  "keyLocation": "https://www.example.com/22bc7c564b334f38b0b1ed90eec8f2c5.txt",
  "urlList": [
    'https://www.example.com/technical-seo/http-header/',
    'https://www.example.com/python-seo/nltk/lemmatize',
    'https://www.example.com/pagespeed/broser-hints/preload',
    'https://www.example.com/python-seo/nltk/stemming',
    'https://www.example.com/python-seo/categorize-queries/',
    'https://www.example.com/python-seo/nltk/tokenization',
    'https://www.example.com/review/oncrawl/',
    'https://www.example.com/technical-seo/hreflang/',
    'https://www.example.com/technical-seo/multilingual-seo/'
      ]
}
headers = {"Content-Type": "application/json; charset=utf-8"}
# json=data serializes the dictionary into a JSON request body.
r = requests.post("https://www.bing.com/indexnow", json=data, headers=headers)
r.status_code, r.content

In the example above, we have performed a POST Request to index a set of URLs.

We have used the “data” object for the “json” parameter of “requests.post(),” and the headers object for the “headers” parameter.

Since we POST a JSON object, the request should have the “Content-Type: application/json” header with “charset=utf-8.”
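The JSON object can also be assembled programmatically. The sketch below is a minimal illustration that assumes, as in the IndexNow documentation, that “host” is the hostname of the website the URLs belong to; build_payload() is a hypothetical helper that reads the host from the key file URL and leaves out URLs from other hosts.

```python
import json
from urllib.parse import urlparse


def build_payload(urls, key, key_location):
    """Build an IndexNow POST body, keeping only URLs on the key file's host."""
    host = urlparse(key_location).netloc
    url_list = [u for u in urls if urlparse(u).netloc == host]
    return {"host": host, "key": key, "keyLocation": key_location, "urlList": url_list}


payload = build_payload(
    [
        "https://www.example.com/technical-seo/hreflang/",
        "https://blog.example.com/other-host/",  # different host, left out
    ],
    key="22bc7c564b334f38b0b1ed90eec8f2c5",
    key_location="https://www.example.com/22bc7c564b334f38b0b1ed90eec8f2c5.txt",
)
print(json.dumps(payload, indent=2))
```

The resulting dictionary could then be passed to “requests.post()” through its “json” parameter.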

After I made the POST request, 135 seconds later, my live logfile analysis dashboard started to show the immediate hits from the Bingbot.

[Image: Bingbot Log File Analysis]

8. Create Custom Function For IndexNow API To Save Time

Creating a custom function for the IndexNow API is useful to decrease the time that will be spent on code preparation.

Thus, I have created two different custom Python functions to use the IndexNow API for bulk requests and individual requests.

Below, you will find an example for only the bulk requests to the IndexNow API.

The custom function for bulk requests is called “submit_url_set.”

Even if you just fill in the parameters, you will still be able to use it properly.

from urllib.parse import urlparse

def submit_url_set(set_: list, key, location, host="https://www.bing.com", headers=None):
    # Default headers for a JSON payload.
    if headers is None:
        headers = {"Content-Type": "application/json; charset=utf-8"}
    data = {
        # The JSON "host" is the website's own hostname, read here from the key file URL.
        "host": urlparse(location).netloc,
        "key": key,
        "keyLocation": location,
        "urlList": set_
    }
    # The "host" parameter is the search engine address; requests go to its /indexnow path.
    r = requests.post(f"{host}/indexnow", json=data, headers=headers)
    return r.status_code

An explanation of this custom function:

  • The “set_” parameter is to provide a list of URLs.
  • The “key” parameter is to provide an IndexNow API Key.
  • The “location” parameter is to provide the location of the IndexNow API Key’s txt file within the web server.
  • “host” is to provide the search engine host address.
  • “headers” is to provide the headers that are necessary for the IndexNow API.

I have defined some of the parameters with default values, such as “host” for Microsoft Bing. If you want to use it for Yandex, you will need to state it while calling the function.

Below is an example usage:

submit_url_set(set_=sitemap_urls["loc"].to_list(), key="22bc7c564b334f38b0b1ed90eec8f2c5", location="https://www.example.com/22bc7c564b334f38b0b1ed90eec8f2c5.txt")

If you want to extract sitemap URLs with a different method, or if you want to use the IndexNow API for a different URL set, you will need to change the “set_” parameter value.

Below, you will see an example of the custom Python function for the IndexNow API for only individual requests.

def submit_url(url, location, key="22bc7c564b334f38b0b1ed90eec8f2c5"):
    # "url" is a list of URLs; each one is submitted with its own GET request.
    for i in url:
        endpoint = f"https://bing.com/indexnow?url={i}&key={key}&keyLocation={location}"
        response = requests.get(endpoint)
        print(i)
        print(endpoint)
        print(response.status_code, response.content)
        #time.sleep(5)

Since this is a for loop, you can submit more URLs one by one. The search engine can prioritize these types of requests differently.

Some of the bulk requests will include non-important URLs, so the individual requests might be seen as more reasonable.

If you want to include the sitemap URL extraction within the function, you should include Advertools in the functions themselves.

Tips For Using The IndexNow API With Python

An Overview of How The IndexNow API Works, Capabilities & Uses

  • The IndexNow API doesn’t guarantee that your website or the URLs that you submitted will be indexed.
  • You should only submit URLs that are new or for which the content has changed.
  • The IndexNow API impacts the crawl budget.
  • Microsoft Bing has a threshold for the URL Content Quality and Calculation of the Crawl Need for a URL. If the submitted URL is not good enough, they may not crawl it.
  • You can submit up to 10,000 URLs.
  • The IndexNow API suggests submitting URLs even if the website is small.
  • Submitting the same pages many times within a day can block the IndexNow API from crawling the redundant URLs or the source.
  • The IndexNow API is useful for sites where the content changes frequently, like every 10 minutes.
  • IndexNow API is useful for pages that are gone and are returning a 404 response code. It lets the search engine know that the URLs are gone.
  • IndexNow API can be used for notifying of new 301 or 302 redirects.
  • The 200 Status Response Code means that the search engine is aware of the submitted URL.
  • The 429 Status Code means that you made too many requests to the IndexNow API.
  • If you put a “txt” file that contains the IndexNow API Key into a subfolder, the IndexNow API can be used only for that subfolder.
  • If you have two different CMS, you can use two different IndexNow API Keys for two different site sections.
  • Subdomains need to use a different IndexNow API key.
  • Even if you already use a sitemap, using IndexNow API is useful because it efficiently tells the search engines of website changes and reduces unnecessary bot crawling.
  • All search engines that adopt the IndexNow API (Microsoft Bing and Yandex) share the URLs that are submitted between each other.
[Image: IndexNow API documentation and usage tips can be found above.]
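For URL lists larger than the 10,000-URL limit mentioned in the tips, the list can be split into batches before submission. The following is a minimal sketch: chunk_urls() and the demo URLs are hypothetical, and it assumes the limit applies to each bulk request.

```python
def chunk_urls(urls, size=10_000):
    """Yield consecutive slices of `urls` with at most `size` items each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


# Hypothetical URL list, only used to show the batch sizes.
demo_urls = [f"https://www.example.com/page-{n}/" for n in range(25_000)]
batch_sizes = [len(batch) for batch in chunk_urls(demo_urls)]
print(batch_sizes)
```

Each batch could then be sent as its own bulk request, ideally with a pause in between to avoid the 429 Status Code.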

In this IndexNow API tutorial and guideline with Python, we have examined a new search engine technology.

Instead of waiting to be crawled, publishers can notify the search engines to crawl when there is a need.

IndexNow reduces the use of search engine data center resources, and now you know how to use Python to make the process more efficient, too.

More resources:

An Introduction To Python & Machine Learning For Technical SEO

How to Use Python to Monitor & Measure Website Performance

Advanced Technical SEO: A Complete Guide


Featured Image: metamorworks/Shutterstock
