mirror of https://github.com/vimagick/dockerfiles.git synced 2024-12-23 01:39:27 +02:00
dockerfiles/crawlee/README.md
2023-03-08 15:03:19 +08:00

crawlee

Crawlee is a web scraping and browser automation library.

$ docker-compose build
Building crawlee
Successfully built xxxxxxxxxxxx
Successfully tagged crawlee:latest

$ docker-compose run --rm crawlee
INFO  BasicCrawler: Starting the crawl
INFO  BasicCrawler: Processing ...
Crawler finished.

$ tree data
├── datasets
│   └── default
│       ├── 000000001.json
│       ├── 000000002.json
│       ├── 000000003.json
│       └── 000000004.json
├── key_value_stores
└── request_queues
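
The scraped items end up as numbered JSON files under data/datasets/default, as the tree output above shows. A minimal sketch for loading them back, assuming each file holds a single JSON document and that the script runs from the directory containing data (the helper name load_dataset is hypothetical, not part of Crawlee):

```python
import json
from pathlib import Path


def load_dataset(dataset_dir="data/datasets/default"):
    """Parse every *.json file in dataset_dir and return the results in filename order.

    Assumption: each file contains exactly one JSON document.
    """
    items = []
    # Zero-padded names (000000001.json, ...) sort lexicographically in crawl order.
    for path in sorted(Path(dataset_dir).glob("*.json")):
        with path.open(encoding="utf-8") as fh:
            items.append(json.load(fh))
    return items


if __name__ == "__main__":
    records = load_dataset()
    print(f"loaded {len(records)} item(s)")
```

Files without a .json extension are skipped, so stray notes or logs in the same directory do not break the load.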