A Python library more powerful than requests that can double your crawler's throughput
Source: Internet

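What is a coroutine?
How are coroutines better than multithreading?
Where coroutines fit and where they do not
A first look at httpx, an asynchronous HTTP framework
What is httpx
Installation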
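As a minimal sketch of what the headings above refer to: coroutines are cooperatively scheduled on a single thread and hand control back to the event loop only at an `await`. The names `worker` and `run_demo` are hypothetical and exist only for this illustration; `asyncio.sleep(0)` stands in for any real wait such as a network call.

```python
import asyncio
import threading


async def worker(name, events):
    # Record when this coroutine starts and which OS thread it runs on.
    events.append((name, "start", threading.get_ident()))
    # Yield to the event loop; a real crawler would await network I/O here.
    await asyncio.sleep(0)
    events.append((name, "end", threading.get_ident()))


async def run_demo():
    events = []
    # Run both coroutines concurrently on the same event loop.
    await asyncio.gather(worker("A", events), worker("B", events))
    return events


if __name__ == "__main__":
    for name, phase, thread_id in asyncio.run(run_demo()):
        print(name, phase, thread_id)
```

Both workers start before either one finishes, and every event carries the same thread id. That is why coroutines suit I/O-bound work such as crawling, where most time is spent waiting, and bring little benefit to CPU-bound work, which never reaches an `await` to give up control.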
pip install httpx
Best practices
import httpx
import threading
import time


def sync_main(url, sign):
    # Blocking request: the next call cannot start until this one returns.
    status_code = httpx.get(url).status_code
    print(f'sync_main: {threading.current_thread()}: {sign}: {status_code}')


sync_start = time.time()
# Issue 200 requests one after another.
for i in range(200):
    sync_main(url='http://www.baidu.com', sign=i)
sync_end = time.time()
print(sync_end - sync_start)
sync_main: <_MainThread(MainThread, started 4471512512)>: 192: 200
sync_main: <_MainThread(MainThread, started 4471512512)>: 193: 200
sync_main: <_MainThread(MainThread, started 4471512512)>: 194: 200
sync_main: <_MainThread(MainThread, started 4471512512)>: 195: 200
sync_main: <_MainThread(MainThread, started 4471512512)>: 196: 200
sync_main: <_MainThread(MainThread, started 4471512512)>: 197: 200
sync_main: <_MainThread(MainThread, started 4471512512)>: 198: 200
sync_main: <_MainThread(MainThread, started 4471512512)>: 199: 200
16.56578803062439
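import asyncio
import httpx
import threading
import time


async def async_main(client, url, sign):
    # Awaiting the request lets the event loop run other coroutines meanwhile.
    response = await client.get(url)
    status_code = response.status_code
    print(f'async_main: {threading.current_thread()}: {sign}: {status_code}')


async def main():
    # Share one AsyncClient so all requests reuse the same connection pool.
    async with httpx.AsyncClient() as client:
        tasks = [async_main(client, url='http://www.baidu.com', sign=i) for i in range(200)]
        # gather() replaces asyncio.wait(), which rejects bare coroutines on Python 3.11+.
        await asyncio.gather(*tasks)


async_start = time.time()
asyncio.run(main())
async_end = time.time()
print(async_end - async_start)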
async_main: <_MainThread(MainThread, started 4471512512)>: 56: 200
async_main: <_MainThread(MainThread, started 4471512512)>: 99: 200
async_main: <_MainThread(MainThread, started 4471512512)>: 67: 200
async_main: <_MainThread(MainThread, started 4471512512)>: 93: 200
async_main: <_MainThread(MainThread, started 4471512512)>: 125: 200
async_main: <_MainThread(MainThread, started 4471512512)>: 193: 200
async_main: <_MainThread(MainThread, started 4471512512)>: 100: 200
4.518340110778809
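The gap between the two timings above (about 16.6 s versus 4.5 s) most likely comes from overlapping the time spent waiting on the network. The sketch below reproduces that effect without any network access, under the assumption that each request is dominated by waiting: `fake_fetch`, `run_sequential`, `run_concurrent` and `timed` are hypothetical helpers, and `asyncio.sleep` stands in for `await client.get(url)`.

```python
import asyncio
import time


async def fake_fetch(delay):
    # Stand-in for an HTTP request: waits without blocking the event loop.
    await asyncio.sleep(delay)
    return 200


async def run_sequential(n, delay):
    # Each wait finishes before the next begins, so total time is about n * delay.
    return [await fake_fetch(delay) for _ in range(n)]


async def run_concurrent(n, delay):
    # All waits overlap, so total time is about one delay.
    return await asyncio.gather(*(fake_fetch(delay) for _ in range(n)))


def timed(coro):
    # Run a coroutine to completion and return (result, elapsed seconds).
    start = time.perf_counter()
    result = asyncio.run(coro)
    return result, time.perf_counter() - start


if __name__ == "__main__":
    _, sequential = timed(run_sequential(20, 0.05))
    _, concurrent = timed(run_concurrent(20, 0.05))
    print(f"sequential: {sequential:.2f}s, concurrent: {concurrent:.2f}s")
```

Real requests will not speed up by the same factor as this simulation, because bandwidth, server limits and the client's connection pool all cap how many requests are truly in flight at once.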

