mirror of https://github.com/lorien/awesome-web-scraping.git synced 2025-02-21 19:06:39 +02:00

Merge pull request #126 from ray-102/master

Added Extractnet an ML based crawler in Python
lorien 2021-07-24 21:29:06 +03:00 committed by GitHub
commit 44e7083e3a
GPG Key ID: 4AEE18F83AFDEB23

@@ -385,6 +385,7 @@ Libraries for extracting web contents.
* [trafilatura](https://github.com/adbar/trafilatura) - Fast extraction of main text and comments along with structure, conversion to TXT, CSV & XML.
* [advertools](https://github.com/eliasdabbas/advertools) - A customizable crawler to analyze SEO and content of pages and websites.
* [photon](https://github.com/s0md3v/Photon) - Incredibly fast crawler designed for OSINT
* [extractnet](https://github.com/currentsapi/extractnet) - Machine Learning based content and metadata extraction in Python 3
## WebSocket