lightcrawler轻型爬虫

联合创作 · 2023-09-29 13:52

lightcrawler 可以抓取网站并通过 Google lighthouse 运行网站。


npm install --save-dev lightcrawler

lightcrawler --url https://atom.io/ --config lightcrawler-config.json

lightcrawler-config.json看起来像这样:


{
"extends": "lighthouse:default",
"settings": {
"crawler": {
"maxDepth": 2,
"maxChromeInstances": 5
},
"onlyCategories": [
"Accessibility",
"Performance",
"Best Practices"
],
"onlyAudits": [
"accesskeys",
"aria-allowed-attr",
"external-anchors-use-rel-noopener",
"geolocation-on-start",
"no-document-write",
"no-mutation-events",
"no-old-flexbox",
"time-to-interactive",
"user-timings",
"viewport",
"without-javascript"
]
}
}

 

浏览 21
点赞
评论
收藏
分享

手机扫一扫分享

编辑 分享
举报
评论
图片
表情
推荐
点赞
评论
收藏
分享

手机扫一扫分享

编辑 分享
举报