National Statistical Division Codes and Urban-Rural Classification Codes [Crawler Code] [JSON + CSV Formats]

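Page URL: http://www.stats.gov.cn/tjsj/tjbz/tjyqhdmhcxhfdm/2021/11/01/01/110101001.html I recently needed the latest administrative division data. The National Bureau of Statistics publishes this data but does not provide it as a downloadable file, so I wrote a crawler to scrape all of it. The default output format is JSON, and a separate tool converts the JSON to CSV.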
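The JSON-to-CSV step can be sketched roughly as follows. This is a minimal illustration, not the author's tool: the post does not show the crawler's JSON schema, so the nested `code` / `name` / `children` layout, the sample rows, and the function names `flatten` and `to_csv` are all assumptions.

```python
import csv
import io
import json

# Hypothetical sample data: the real crawler's JSON schema is not shown
# in the post, so a nested code/name/children tree is assumed here.
SAMPLE = json.loads("""
[{"code": "110000000000", "name": "北京市", "children": [
    {"code": "110100000000", "name": "市辖区", "children": [
        {"code": "110101000000", "name": "东城区", "children": []}
    ]}
]}]
""")


def flatten(nodes, parent_code=""):
    """Yield one (code, name, parent_code) row per node, depth first."""
    for node in nodes:
        yield node["code"], node["name"], parent_code
        yield from flatten(node.get("children", []), node["code"])


def to_csv(nodes):
    """Return the flattened tree as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["code", "name", "parent_code"])
    writer.writerows(flatten(nodes))
    return buf.getvalue()


if __name__ == "__main__":
    print(to_csv(SAMPLE), end="")
```

Storing the parent code on every row keeps the hierarchy recoverable from the flat CSV, which is one reasonable choice among several.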

Continue Reading

微图坊 Crawler [Chrome Support] [22.08.21] [Windows]

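Changelog:
1. Fixed an issue where broken links on some pages prevented downloads after the directory had been created.
2. Fixed download failures in login mode when the view limit was exceeded; the process now exits early.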

Continue Reading

微图坊 Crawler [22.06.16] [Windows]

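Changelog:
1. Fixed an issue where the album list could not be fetched for some categories.
2. Log in again each day when you start the program.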

Continue Reading

微图坊 Crawler [22.06.07] [Windows]

Change Log:

1. Install the newest Chrome before using this program.
2. Open Chrome and log in to v2ph.com.
3. The spider stops automatically after crawling 16 albums.

Usage:

(venv) PS F:\Pycharm_Projects\meitulu-spider> python .\v2ph.py
Arguments:
         -a <download all site images>
         -q <query the image with keywords>
         -h <display help text, just this>
Option Arguments:
         -p <image download path>
         -r <random index category list>
         -c <single category url>
         -e <early stop, work in site crawl mode only>
         -s <site url eg: https://www.v2ph.com (no last backslash "/")>
****************************************************************************************************
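For readers curious how such a flag set might be wired up, here is a rough sketch using `argparse`. It is not the actual `v2ph.py` code: which flags take a value (`-q`, `-p`, `-c`, `-s`) and which are plain switches (`-a`, `-r`, `-e`), the defaults, and the names `build_parser` and `parse` are all assumptions inferred from the help text above.

```python
import argparse


def build_parser():
    """Build a parser mirroring the flags in the help text (assumed semantics)."""
    parser = argparse.ArgumentParser(prog="spider")  # -h is added by argparse
    parser.add_argument("-a", dest="download_all", action="store_true",
                        help="download all site images")
    parser.add_argument("-q", dest="query", metavar="KEYWORDS",
                        help="query the image with keywords")
    parser.add_argument("-p", dest="path", default=".",
                        help="image download path")
    parser.add_argument("-r", dest="random", action="store_true",
                        help="random index category list")
    parser.add_argument("-c", dest="category", metavar="URL",
                        help="single category url")
    parser.add_argument("-e", dest="early_stop", action="store_true",
                        help="early stop, work in site crawl mode only")
    parser.add_argument("-s", dest="site", default="https://www.v2ph.com",
                        help="site url without a trailing '/'")
    return parser


def parse(argv):
    """Parse argv and normalize the site URL."""
    args = build_parser().parse_args(argv)
    # The help text asks for no trailing "/", so strip one if present.
    args.site = args.site.rstrip("/")
    return args


if __name__ == "__main__":
    print(parse(["-a", "-e", "-p", "downloads"]))
```

Stripping the trailing slash in code, rather than only documenting the requirement, makes the tool tolerant of the most likely input mistake.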

Continue Reading

KU138 Crawler [22.05.23] [Windows]

****************************************************************************************************
USAGE:
spider -h <help> -a <all> -q <search>
Arguments:
         -a <download all site images>
         -q <query the image with keywords>
         -h <display help text, just this>
Option Arguments:
         -p <image download path>
         -r <random index category list>
         -c <single category url>
         -e <early stop, work in site crawl mode only>
         -s <site url eg: https://www.v2ph.com (no last backslash "/")>
****************************************************************************************************

Continue Reading

微图坊 Crawler [22.05.16] [Windows]

Usage arguments:

****************************************************************************************************
USAGE:
spider -h <help> -a <all> -q <search>
Arguments:
         -a <download all site images>
         -q <query the image with keywords>
         -h <display help text, just this>
Option Arguments:
         -p <image download path>
         -r <random index category list>
         -c <single category url>
         -e <early stop, work in site crawl mode only>
         -s <site url eg: https://www.v2ph.com (no last backslash "/")>
****************************************************************************************************

Continue Reading