Nodejs 爬虫

阿里云国内75折 回扣 微信号:monov8
阿里云国际,腾讯云国际,低至75折。AWS 93折 免费开户实名账号 代冲值 优惠多多 微信号:monov8 飞机:@monov6


本章节主要用的是

const cheerio = require('cheerio')
const request = require('request')
const fs = require('fs')

1 cheerio 读取html元素

2 request 请求

3 fs操作文件

爬取的网站    ​​微信个性头像高清图片_微信头像图片大全​

知识点

1. request(url).pipe() 返回的是文件流

2.fs.createWriteStream 写入文件流

全部代码

const cheerio = require('   ')
const request = require('request')
const fs = require('fs')
const headers = {
//Host: "www.netbian.com",
'User-Agent': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.55 Safari/537.36',
//"Cookie": "__yjs_duid=1_aad00e5279c2256037ecf95bb8dd16b11639024945696; yjs_js_security_passport=00a97397b008ebbfa879ba38e7989c0cb150fd05_1639055269_js; Hm_lvt_14b14198b6e26157b7eba06b390ab763=1639033176,1639033789,1639033793,1639055270; Hm_lpvt_14b14198b6e26157b7eba06b390ab763=1639055270"
}


const Repier = (options) => {
for (let i = 1; i <= options.page; i++) {
let optUrl = i == 1 ? options.url : options.url + `list_${i}.html`
request({
url: optUrl,
headers
}, (_, response, body) => {
console.log(optUrl)
const $ = cheerio.load(body)
const list = $('.pics img')
console.log(list,$)
list.each(async (index, item) => {
const url = $(item).attr('src')
if (url) {
try {
await request(url).pipe(fs.createWriteStream('./img/' +i+ '-'+index+ '.jpg'))
console.log(i, '----', index, url)
}
catch (e) {
console.log(e, 'error')
}
}
})
})
}
}

Repier({
url: 'http://www.duoziwang.com/head/gexing/',
page: 10
})
阿里云国内75折 回扣 微信号:monov8
阿里云国际,腾讯云国际,低至75折。AWS 93折 免费开户实名账号 代冲值 优惠多多 微信号:monov8 飞机:@monov6
标签: NodeJS