[Original] Google SEO Tutorial: Getting New Pages Crawled Right Away with the Google Indexing API

Author: geekzl admin   Published: 2020-11-29 21:11:45


Also known as: How to use the Google Indexing API with Node.js

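In the previous post, "Google SEO News: The Request Indexing Feature Is Suspended", we mentioned that Google paused the Request Indexing feature on October 14, 2020. I also promised to share a workaround, the Google Indexing API, so this post walks through how to use it.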

Get the private key file for the Indexing API (JSON format)

Open the Google service accounts page

Service account details

From https://console.cloud.google.com/iam-admin/serviceaccounts/details/

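Visit https://console.developers.google.com/apis/credentials?project= or https://console.cloud.google.com/projectselector2/iam-admin/serviceaccounts?supportedpurview=project, then click the Create Key button to download the file containing the API key (JSON format is recommended).

[Screenshot: Google service accounts page]

After downloading, rename the file to service_account.json; the code later in this post loads it under that name.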
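
Before wiring the key into a script, it can help to confirm that the downloaded file contains the two fields the code in this post reads, `client_email` and `private_key`. The following is a minimal sketch only; the helper name `missingKeyFields` is hypothetical and not part of any Google library:

```javascript
// Hypothetical helper: lists the fields this post's script needs
// but that are absent (or empty) in a parsed service account key object.
function missingKeyFields(key) {
    const required = ["client_email", "private_key"];
    return required.filter(function (field) {
        return typeof key[field] !== "string" || key[field].length === 0;
    });
}

// Placeholder object for illustration; in practice pass require("./service_account.json").
const sample = { client_email: "indexing-api-runner@xxx.iam.gserviceaccount.com" };
console.log(missingKeyFields(sample)); // [ 'private_key' ]
```

An empty array means both fields are present.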

Note the service account email address

On the Google service accounts page in Google Cloud, find the service account's email address (Email for Service account):

indexing-api-runner@xxx.iam.gserviceaccount.com

Write it down; you will need it later.

Grant the service account permission in the site settings

Google Search Console:

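[Screenshot: Google Search Console site settings]

[Screenshot: Google Search Console site settings (2)]

[Screenshot: Google Search Console site settings (3)]

If you skip this step, running the Node.js code later in this post returns the following error: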

{
  "error": {
    "code": 403,
    "message": "Permission denied. Failed to verify the URL ownership.",
    "status": "PERMISSION_DENIED"
  }
}
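
Since this 403 is the most likely failure on a first run, a script can check for it and print a hint instead of the raw JSON. This is a sketch under assumptions: `explainIndexingError` is a hypothetical helper, and it only recognizes the error shape shown above.

```javascript
// Hypothetical helper: turns an Indexing API error body into a short hint.
// Only the 403 PERMISSION_DENIED shape shown above is handled specifically.
function explainIndexingError(body) {
    if (!body || !body.error) {
        return "no error";
    }
    const err = body.error;
    if (err.code === 403 && err.status === "PERMISSION_DENIED") {
        return "Grant the service account email permission for the site in Search Console.";
    }
    return "Unhandled error " + err.code + ": " + err.message;
}

console.log(explainIndexingError({
    error: {
        code: 403,
        message: "Permission denied. Failed to verify the URL ownership.",
        status: "PERMISSION_DENIED"
    }
}));
```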

Node.js code for calling the Google Indexing API

Use the Node.js library google-api-nodejs-client to obtain an OAuth token:

Node.js environment setup:

npm install googleapis
npm install request

The original code is:

var request = require("request");
var { google } = require("googleapis");
var key = require("./service_account.json");

const jwtClient = new google.auth.JWT(
    key.client_email,
    null,
    key.private_key,
    ["https://www.googleapis.com/auth/indexing"],
    null
);

jwtClient.authorize(function (err, tokens) {
    if (err) {
        console.log(err);
        return;
    }
    let options = {
        url: "https://indexing.googleapis.com/v3/urlNotifications:publish",
        method: "POST",
        // Your options, which must include the Content-Type and auth headers
        headers: {
            "Content-Type": "application/json"
        },
        auth: { "bearer": tokens.access_token },
        // Request body: the page URL and the notification type.
        json: {
            "url": "https://www.geekzl.com/why-name-jekyll.html",
            "type": "URL_UPDATED"
        }
    };
    request(options, function (error, response, body) {
        // Log a network-level failure; otherwise print the API's JSON reply
        if (error) {
            console.log(error);
            return;
        }
        console.log(body);
    });
});
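
The `json` field in the code above is the request body: a page URL plus a notification type. Besides `URL_UPDATED`, the Indexing API also accepts `URL_DELETED` for pages that have been removed. The sketch below builds and checks that body before it is sent; `buildNotification` is a hypothetical helper name, not something the googleapis library provides.

```javascript
// Notification types accepted by urlNotifications:publish.
const NOTIFICATION_TYPES = ["URL_UPDATED", "URL_DELETED"];

// Hypothetical helper: builds the JSON body for one notification.
// Defaults to URL_UPDATED when no type is given.
function buildNotification(url, type) {
    const kind = type || "URL_UPDATED";
    if (NOTIFICATION_TYPES.indexOf(kind) === -1) {
        throw new Error("Unknown notification type: " + kind);
    }
    // new URL() throws on strings that are not absolute URLs.
    const parsed = new URL(url);
    return { url: parsed.href, type: kind };
}

console.log(buildNotification("https://www.geekzl.com/why-name-jekyll.html"));
```

The returned object can be assigned to the `json` option in place of the literal.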

Running the script directly with node ./indexing.js fails with:

Error while trying to retrieve access token { FetchError: request to https://oauth2.googleapis.com/token failed, reason: connect ETIMEDOUT 216.58.200.10:443
    at ClientRequest.<anonymous> (/Users/hesk/Documents/localize-spreadsheet-bot/node_modules/node-fetch/lib/index.js:1453:11)
    at ClientRequest.emit (events.js:180:13)
    at TLSSocket.socketErrorListener (_http_client.js:395:9)
    at TLSSocket.emit (events.js:180:13)
    at emitErrorNT (internal/streams/destroy.js:64:8)
    at process._tickCallback (internal/process/next_tick.js:178:19)
  message: 'request to https://oauth2.googleapis.com/token failed, reason: connect ETIMEDOUT 216.58.200.10:443',
  type: 'system',
  errno: 'ETIMEDOUT',
  code: 'ETIMEDOUT',
  config:
   { method: 'POST',
     url: 'https://oauth2.googleapis.com/token',
     data: 'code=4%2FQgFCT-LEUxcnDljD1DMn9olKwYQVJ9bVxiaZJMmUgT7fPAyu5Gc14Ro&client_id=875537178561-5j56883h195m8e8lrggah3fes3gh253t.apps.googleusercontent.com&client_secret=6IkI8HtPvcmXU7XORCgKg7TR&redirect_uri=urn%3Aietf%3Awg%3Aoauth%3A2.0%3Aoob&grant_type=authorization_code&code_verifier=',
     headers:
      { 'Content-Type': 'application/x-www-form-urlencoded',
        'User-Agent': 'google-api-nodejs-client/3.1.2',
        Accept: 'application/json' },
     params: {},
     paramsSerializer: [Function: paramsSerializer],
     body: 'code=4%2FQgFCT-LEUxcnDljD1DMn9olKwYQVJ9bVxiaZJMmUgT7fPAyu5Gc14Ro&client_id=875537178561-5j56883h195m8e8lrggah3fes3gh253t.apps.googleusercontent.com&client_secret=6IkI8HtPvcmXU7XORCgKg7TR&redirect_uri=urn%3Aietf%3Awg%3Aoauth%3A2.0%3Aoob&grant_type=authorization_code&code_verifier=',
     validateStatus: [Function: validateStatus],
     responseType: 'json' } }

[Screenshot: Google Indexing API Node.js exception]

Solution:

Route the Node.js code through a proxy: find the address of a proxy that can reach Google from your network and set it in the Node.js code.

process.env.http_proxy = 'http://10.179.8.31:9090';  /* Set proxy */
process.env.HTTPS_PROXY = 'http://10.179.8.31:9090';
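
Hard-coding the proxy address ties the script to one machine. As a sketch, the same two variables can be filled from a single setting instead; `applyProxy` is a hypothetical helper and `INDEXING_PROXY` is an assumed variable name, not something Node.js or Google defines.

```javascript
// Hypothetical helper: copies a proxy address into the two variables
// that the snippet above sets by hand. Does nothing when no address is given.
function applyProxy(env, proxyUrl) {
    if (!proxyUrl) {
        return false;
    }
    env.http_proxy = proxyUrl;
    env.HTTPS_PROXY = proxyUrl;
    return true;
}

// INDEXING_PROXY is an assumed variable name chosen for this example.
applyProxy(process.env, process.env.INDEXING_PROXY);
```

With this in place, the script could be started as `INDEXING_PROXY=http://10.179.8.31:9090 node ./indexing.js`, and it runs without a proxy when the variable is unset.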

Of course, if your browser can already reach Google (for example, through a Chrome extension such as Chrome上网助手), you can run your Node.js code directly on Repl.it.

Test Node.js online:

Repl.it - Node.js Online Compiler and IDE - Fast, Powerful, Free

https://repl.it/languages/nodejs

File structure:

[Screenshot: directory layout of the Google Indexing API Node.js code]

The improved Node.js code:

var request = require("request");
var { google } = require("googleapis");
var key = require("./service_account.json");

process.env.http_proxy = 'http://10.179.8.31:9090';  /* Set proxy */
process.env.HTTPS_PROXY = 'http://10.179.8.31:9090';

const jwtClient = new google.auth.JWT(
    key.client_email,
    null,
    key.private_key,
    ["https://www.googleapis.com/auth/indexing"],
    null
);

jwtClient.authorize(function (err, tokens) {
    if (err) {
        console.log(err);
        return;
    }
    let options = {
        url: "https://indexing.googleapis.com/v3/urlNotifications:publish",
        method: "POST",
        // Your options, which must include the Content-Type and auth headers
        headers: {
            "Content-Type": "application/json"
        },
        auth: { "bearer": tokens.access_token },
        // Request body: the page URL and the notification type.
        json: {
            "url": "https://www.geekzl.com/jsdelivr-not-update.html",
            "type": "URL_UPDATED"
        }
    };
    request(options, function (error, response, body) {
        // Log a network-level failure; otherwise print the API's JSON reply
        if (error) {
            console.log(error);
            return;
        }
        console.log(body);
    });
});
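
The `request` package used above has been deprecated since 2020. As an alternative, the JWT client from google-auth-library exposes its own `request` method that sends authorized requests, so the second dependency may not be needed. The sketch below assumes that method; `buildPublishRequest` and `publish` are hypothetical helper names, and the client is passed in as a parameter so the snippet itself runs without credentials.

```javascript
const PUBLISH_URL = "https://indexing.googleapis.com/v3/urlNotifications:publish";

// Hypothetical helper: request options for one notification.
function buildPublishRequest(pageUrl, type) {
    return {
        url: PUBLISH_URL,
        method: "POST",
        data: { url: pageUrl, type: type || "URL_UPDATED" }
    };
}

// client is expected to be the jwtClient from the code above;
// any object with a promise-returning request(options) method works.
async function publish(client, pageUrl, type) {
    const response = await client.request(buildPublishRequest(pageUrl, type));
    return response.data;
}

// Usage with the real client (needs googleapis and service_account.json):
//   publish(jwtClient, "https://www.geekzl.com/jsdelivr-not-update.html")
//       .then(console.log)
//       .catch(console.error);
```

Keeping the option-building step separate from the network call makes it possible to check the request shape without contacting Google.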

Run indexing.js again:

bravo@BR MINGW64 /d/coding/GitHub/google-index-api
$ node ./indexing.js

Result:

{
  urlNotificationMetadata: {
    url: 'https://www.geekzl.com/jsdelivr-not-update.html',
    latestUpdate: {
      url: 'https://www.geekzl.com/jsdelivr-not-update.html',
      type: 'URL_UPDATED',
      notifyTime: '2020-10-16T08:14:24.510420447Z'
    }
  }
}
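
A successful reply nests the useful fields under `urlNotificationMetadata.latestUpdate`. When many URLs are submitted, a one-line summary per reply is easier to scan than the full object. A small sketch, assuming the response shape shown above; `summarizePublishResponse` is a hypothetical helper name.

```javascript
// Hypothetical helper: pulls the fields of interest out of a publish response body.
// Returns null when the body does not have the shape shown above (e.g. an error reply).
function summarizePublishResponse(body) {
    const meta = body && body.urlNotificationMetadata;
    if (!meta || !meta.latestUpdate) {
        return null;
    }
    return {
        url: meta.url,
        type: meta.latestUpdate.type,
        notifyTime: meta.latestUpdate.notifyTime
    };
}

console.log(summarizePublishResponse({
    urlNotificationMetadata: {
        url: "https://www.geekzl.com/jsdelivr-not-update.html",
        latestUpdate: {
            url: "https://www.geekzl.com/jsdelivr-not-update.html",
            type: "URL_UPDATED",
            notifyTime: "2020-10-16T08:14:24.510420447Z"
        }
    }
}));
```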

References:

Google's officially supported Node.js client library for accessing Google APIs
googleapis.dev/nodejs/googleapis/latest/

Support for authorization and authentication with OAuth 2.0, API Keys and JWT (Service Tokens) is included.

Auth error: ETIMEDOUT #283 - set proxy

From https://github.com/googleapis/google-auth-library-nodejs/issues/283#issuecomment-563285724

How to get new pages or site updates indexed by Google quickly

From https://builtvisible.com/how-do-you-get-new-pages-indexed-or-your-site-re-crawled/

How to request Google to re-crawl my website?

From https://stackoverflow.com/questions/9466360/how-to-request-google-to-re-crawl-my-website

Prerequisites for using the Google Indexing API

From https://developers.google.com/search/apis/indexing-api/v3/prereqs

Google Indexing API - 403 'Forbidden Response'

Index API: Permission denied. Failed to verify the URL ownership.
From https://support.google.com/webmasters/thread/4763732?hl=en

Understanding service accounts (official Google documentation)
From https://cloud.google.com/iam/docs/understanding-service-accounts#managing_service_account_keys


