SEO for the Official Documentation Site

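Approach: feed our existing documentation to the search engines.

Search engine crawlers (spiders)

Crawler User-Agents

Google crawler

Google's crawlers typically send one of the following UAs: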

  • Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
  • Googlebot/2.1 (+http://www.googlebot.com/bot.html)
  • Googlebot/2.1 (+http://www.google.com/bot.html)
  • Googlebot-Image/1.0

The last one is the Google Images crawler.

Baidu crawler

  • Mobile UA: Mozilla/5.0 (Linux;u;Android 4.2.2;zh-cn;) AppleWebKit/534.46 (KHTML,like Gecko) Version/5.1 Mobile Safari/10600.6.3 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)
  • PC UA: Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)
  • Newly added rendering UAs:
    • Mobile UA: Mozilla/5.0 (iPhone; CPU iPhone OS 9_1 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Version/9.0 Mobile/13B143 Safari/601.1 (compatible; Baiduspider-render/2.0; +http://www.baidu.com/search/spider.html)
    • PC UA: Mozilla/5.0 (compatible; Baiduspider-render/2.0; +http://www.baidu.com/search/spider.html)

Sogou crawler

Sogou's crawlers send the following UAs:

  • Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07)
  • Image spider: Sogou Pic Spider/3.0(+http://www.sogou.com/docs/help/webmasters.htm#07)

360 Search crawler

The 360 Search crawler sends the following UA:

  • Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36; 360Spider

Shenma crawler

  • The Shenma spider's User-Agent is YisouSpider

Bing crawler

  • Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm)

Yandex crawler

  • Yandex is a Russian search engine; its UA is Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots)

UA summary

| Search engine | Bot | User-Agent |
| --- | --- | --- |
| Google | googlebot | Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html) |
| Google | googlebot | Googlebot/2.1 (+http://www.googlebot.com/bot.html) |
| Google | googlebot | Googlebot/2.1 (+http://www.google.com/bot.html) |
| Google | googlebot | Googlebot-Image/1.0 |
| Baidu | Baiduspider | Mobile UA: Mozilla/5.0 (Linux;u;Android 4.2.2;zh-cn;) AppleWebKit/534.46 (KHTML,like Gecko) Version/5.1 Mobile Safari/10600.6.3 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html) |
| Baidu | Baiduspider | PC UA: Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html) |
| Baidu | Baiduspider | Rendering mobile UA: Mozilla/5.0 (iPhone; CPU iPhone OS 9_1 like Mac OS X) AppleWebKit/601.1.46 (KHTML, like Gecko) Version/9.0 Mobile/13B143 Safari/601.1 (compatible; Baiduspider-render/2.0; +http://www.baidu.com/search/spider.html) |
| Baidu | Baiduspider | Rendering PC UA: Mozilla/5.0 (compatible; Baiduspider-render/2.0; +http://www.baidu.com/search/spider.html) |
| Sogou | sogou web spider | Sogou web spider/4.0(+http://www.sogou.com/docs/help/webmasters.htm#07) |
| Sogou | sogou pic spider | Image spider: Sogou Pic Spider/3.0(+http://www.sogou.com/docs/help/webmasters.htm#07) |
| 360 | 360Spider | Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/50.0.2661.102 Safari/537.36; 360Spider |
| Shenma | YisouSpider | YisouSpider |
| Bing | bingbot | Mozilla/5.0 (compatible; bingbot/2.0; +http://www.bing.com/bingbot.htm) |
| Yandex | YandexBot | Mozilla/5.0 (compatible; YandexBot/3.0; +http://yandex.com/bots) |
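
To illustrate how the bot tokens listed here can be used, the following is a minimal sketch that maps a User-Agent string to a search engine with a case-insensitive substring match. The names `BOT_TOKENS` and `identify_crawler` are hypothetical and exist only for this example. A UA string is easy to forge, so this only reports what the client claims to be.

```python
from typing import Optional

# Bot tokens taken from the UA list in this article (lowercased for matching).
# "sogou" covers both "Sogou web spider" and "Sogou Pic Spider".
BOT_TOKENS = {
    "googlebot": "Google",
    "baiduspider": "Baidu",
    "sogou": "Sogou",
    "360spider": "360",
    "yisouspider": "Shenma",
    "bingbot": "Bing",
    "yandexbot": "Yandex",
}


def identify_crawler(user_agent: str) -> Optional[str]:
    """Return the search engine a UA claims to belong to, or None."""
    ua = user_agent.lower()
    for token, engine in BOT_TOKENS.items():
        if token in ua:
            return engine
    return None
```

Because the match is a substring test, variants such as Googlebot-Image/1.0 and Baiduspider-render/2.0 fall under the same engine as the main crawler.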

Detecting the UA

Test environment

Machine IP: xx.xx.229.139

Working directory: /opt/www/doc.rongcloud.net/xxxxxxx

Nginx service configuration: /usr/local/openresty/nginx/conf/vhosts/doc.rongcloud.net.conf

Nginx configuration changes

        location / {
                root /opt/www/docqa.rongcloud.net/xxxxxxx;
+                default_type text/html;
+                if ( $http_user_agent ~* "(Baiduspider)|(360Spider)|(bingbot)|(YandexBot)" ){
+                        root /opt/www/doc.rongcloud.net/xxxxxxx/docs;
+                }
                try_files $uri $uri/ /index.html;
                error_page 404 /index.html;
        }
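+        # Static assets are always served from the site root; the dot is escaped so only real extensions match
+        location ~* \.(js|css|png|jpg|svg)$ {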
+                root /opt/www/doc.rongcloud.net/xxxxxxx;
+        }
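
The routing decision in this configuration can be sketched in Python to make the logic easier to check. This is an illustration only, not part of the deployment: the function name `pick_root` and the constant names are hypothetical, the paths are copied from the configuration, and the asset pattern is written with the dot escaped. Nginx prefers a matching regex location over the prefix location `/`, so the asset check comes first.

```python
import re

# Same alternation as the `if` block; `~*` in Nginx means case-insensitive.
CRAWLER_RE = re.compile(r"(Baiduspider)|(360Spider)|(bingbot)|(YandexBot)", re.IGNORECASE)
# Static asset extensions handled by the separate regex location.
ASSET_RE = re.compile(r"\.(js|css|png|jpg|svg)$", re.IGNORECASE)

# Paths copied from the configuration (the elided directory name stays elided).
DEFAULT_ROOT = "/opt/www/docqa.rongcloud.net/xxxxxxx"
ASSET_ROOT = "/opt/www/doc.rongcloud.net/xxxxxxx"
SEO_ROOT = "/opt/www/doc.rongcloud.net/xxxxxxx/docs"


def pick_root(uri: str, user_agent: str) -> str:
    """Return the document root that would serve this request."""
    if ASSET_RE.search(uri):
        # The regex location wins over `location /`, whatever the UA is.
        return ASSET_ROOT
    if CRAWLER_RE.search(user_agent):
        # Listed crawlers get the pre-rendered HTML under docs/.
        return SEO_ROOT
    return DEFAULT_ROOT
```

Ordinary browsers keep getting the regular site, while the listed crawlers are pointed at the generated HTML files.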

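Note: during deployment, some directories on the machine had the same names as generated files. Deleting those directories resolves the conflict.

Generating the SEO files

  • Convert the md files to HTML and add the default type (default_type text/html in the Nginx configuration)

  • Add a header and footer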
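
The two steps above can be sketched roughly as follows. This is not the script used for the site: `HEADER`, `FOOTER`, `md_to_html_minimal`, `build_page` and `convert_tree` are hypothetical names, and the converter is a toy that handles only `#` headings and plain paragraphs (a real build would use a full Markdown renderer). The sketch also assumes the output files are written without an extension so that they match URLs such as /guides/intro, which is why a default type of text/html is needed.

```python
import html
from pathlib import Path

# Hypothetical header and footer; the real site would use its own markup.
HEADER = '<!DOCTYPE html>\n<html>\n<head><meta charset="utf-8"><title>{title}</title></head>\n<body>\n'
FOOTER = "</body>\n</html>\n"


def md_to_html_minimal(md_text: str) -> str:
    """Toy converter: only '#' headings and plain paragraphs are handled."""
    parts = []
    for block in md_text.strip().split("\n\n"):
        block = block.strip()
        if not block:
            continue
        text = " ".join(block.split())
        if text.startswith("#"):
            level = min(len(text) - len(text.lstrip("#")), 6)
            title = html.escape(text.lstrip("#").strip())
            parts.append(f"<h{level}>{title}</h{level}>")
        else:
            parts.append(f"<p>{html.escape(text)}</p>")
    return "\n".join(parts)


def build_page(md_text: str, title: str) -> str:
    """Wrap the converted body with the header and footer."""
    return HEADER.format(title=html.escape(title)) + md_to_html_minimal(md_text) + "\n" + FOOTER


def convert_tree(src_dir: Path, out_dir: Path) -> None:
    """Write one extensionless HTML file per .md file, mirroring the layout."""
    for md_path in src_dir.rglob("*.md"):
        target = out_dir / md_path.relative_to(src_dir).with_suffix("")
        target.parent.mkdir(parents=True, exist_ok=True)
        page = build_page(md_path.read_text(encoding="utf-8"), md_path.stem)
        target.write_text(page, encoding="utf-8")
```

A call such as `convert_tree(Path("md"), Path("docs"))` would turn `md/guides/intro.md` into `docs/guides/intro`; both directory names here are placeholders.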

Verify Baidu SEO: the UA contains Baiduspider

curl 'https://doc.rongcloud.net/livevideoroom/Android/2.X/guides/intro' \
  -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9' \
  -H 'Accept-Language: zh-CN,zh;q=0.9' \
  -H 'Cache-Control: no-cache' \
  -H 'Connection: keep-alive' \
  -H 'Cookie: im-token-login=0; ' \
  -H 'Pragma: no-cache' \
  -H 'Sec-Fetch-Dest: document' \
  -H 'Sec-Fetch-Mode: navigate' \
  -H 'Sec-Fetch-Site: same-origin' \
  -H 'Sec-Fetch-User: ?1' \
  -H 'Upgrade-Insecure-Requests: 1' \
  -H 'User-Agent: Mozilla/5.0 (Linux; Android 8.0.0; SM-G955U Build/R16NW) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.141 Mobile Safari/537.36 Baiduspider' \
  -H 'sec-ch-ua: ""' \
  -H 'sec-ch-ua-mobile: ?0' \
  -H 'sec-ch-ua-platform: ""' \
  --compressed

Verify Google: the UA contains googlebot

curl 'https://doc.rongcloud.net/livevideoroom/Android/2.X/guides/intro' \
  -H 'Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3;q=0.9' \
  -H 'Accept-Language: zh-CN,zh;q=0.9' \
  -H 'Cache-Control: no-cache' \
  -H 'Connection: keep-alive' \
  -H 'Cookie: im-token-login=0; trajectoryId=M_G0bm3NaS8EyOqQ379wAg; ' \
  -H 'Pragma: no-cache' \
  -H 'Sec-Fetch-Dest: document' \
  -H 'Sec-Fetch-Mode: navigate' \
  -H 'Sec-Fetch-Site: same-origin' \
  -H 'Sec-Fetch-User: ?1' \
  -H 'Upgrade-Insecure-Requests: 1' \
  -H 'User-Agent: Mozilla/5.0 (Linux; Android 8.0.0; SM-G955U Build/R16NW) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/87.0.4280.141 Mobile Safari/537.36 GOOGLEBOT' \
  -H 'sec-ch-ua: ""' \
  -H 'sec-ch-ua-mobile: ?0' \
  -H 'sec-ch-ua-platform: ""' \
  --compressed
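
The same check can be scripted. Below is a minimal sketch using only the Python standard library; `build_request` and `fetch_as` are hypothetical helper names, and only the User-Agent and Cache-Control headers from the curl commands are reproduced, on the assumption that the other browser headers do not affect the result.

```python
import urllib.request
from typing import Tuple


def build_request(url: str, user_agent: str) -> urllib.request.Request:
    """Create a GET request that presents the given User-Agent."""
    return urllib.request.Request(
        url,
        headers={"User-Agent": user_agent, "Cache-Control": "no-cache"},
    )


def fetch_as(url: str, user_agent: str, timeout: float = 10.0) -> Tuple[int, str]:
    """Fetch the page as the given UA and return (status code, body text)."""
    with urllib.request.urlopen(build_request(url, user_agent), timeout=timeout) as resp:
        body = resp.read().decode("utf-8", errors="replace")
        return resp.status, body


# Example usage (performs real network requests):
# url = "https://doc.rongcloud.net/livevideoroom/Android/2.X/guides/intro"
# _, as_spider = fetch_as(url, "Mozilla/5.0 (compatible; Baiduspider/2.0; +http://www.baidu.com/search/spider.html)")
# _, as_browser = fetch_as(url, "Mozilla/5.0 (Windows NT 10.0; Win64; x64)")
# print(as_spider != as_browser)
```

If the crawler branch is active, the body returned for the spider UA should be the generated static HTML rather than the page an ordinary browser receives.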


URL submission

  • Baidu URL submission [https://ziyuan.baidu.com/linksubmit/url]
  • 360 Search site submission [http://info.so.360.cn/site_submit.html]
  • 360 Search news source submission [http://info.so.com/news_submit.html]
  • Sogou site submission [http://fankui.help.sogou.com/index.php/web/web/index]
  • Google URL submission [https://www.google.com/webmasters/tools/submit-url]
  • Bing site submission [https://www.bing.com/toolbox/submit]

Webmaster platforms

Baidu Search

  • https://ziyuan.baidu.com/site/siteverify?id=1113540423#/

360 Search

  • https://zhanzhang.so.com/sitetool/site_manage

Sogou

  • https://zhanzhang.sogou.com/index.php/dashboard/websiteAdd

Bing

  • https://www.bing.com/webmasters?siteUrl=https%3A%2F%2Fdoc.rongcloud.cn