mirror of
https://github.com/lorien/awesome-web-scraping.git
synced 2024-11-28 08:48:58 +02:00
44 lines
2.2 KiB
Markdown
# Awesome Web Scraping

A list of tools, programming libraries, and web services used for web scraping and data processing.

Feel free to give feedback or ask web scraping questions in the Telegram groups [@grablab](https://t.me/grablab) (English) and [@grablab_ru](https://t.me/grablab_ru) (Russian).

## Programming Libraries

* [Python](https://github.com/lorien/web-scraping/blob/master/python.md)
* [PHP](https://github.com/lorien/web-scraping/blob/master/php.md)
* [Ruby](https://github.com/lorien/web-scraping/blob/master/ruby.md)
* [JavaScript](https://github.com/lorien/web-scraping/blob/master/javascript.md)
* [Go](https://github.com/lorien/web-scraping/blob/master/golang.md)

## Other Things

* [Learning Web Scraping](https://github.com/lorien/learning-web-scraping) - a list of articles and books that teach web scraping
* [Web Scraping Services](https://github.com/lorien/web-scraping/blob/master/web_services.md)
* [Console tools](https://github.com/lorien/web-scraping/blob/master/console_tools.md)
* [dhamaniasad / HeadlessBrowsers](https://github.com/dhamaniasad/HeadlessBrowsers) - a list of (almost) all headless web browsers in existence
* [DNS over HTTPS providers](https://github.com/curl/curl/wiki/DNS-over-HTTPS) - a list of DNS over HTTPS providers

## Captcha Solving Services

These two links point to the same captcha service; they differ only in UI language:

* [https://2captcha.com](https://2captcha.com/?from=3019071) - English UI language
* [https://rucaptcha.com](https://rucaptcha.com/?from=3019071) - Russian UI language

## Proxy Server Marketplaces

The largest marketplaces in the world, with offers from hundreds of sellers and services:

* https://www.blackhatworld.com/forums/proxies-for-sale.112/
* https://forum.antichat.com/forums/147/

## How to Contribute to this List

See the [Contributing](https://github.com/lorien/web-scraping/blob/master/CONTRIBUTING.md) guide.

## Credits

The list was initially based on data from these sources: [awesome-python](https://github.com/vinta/awesome-python), [awesome-php](https://github.com/ziadoz/awesome-php), [awesome-ruby](https://github.com/markets/awesome-ruby), [ruby-nlp](https://github.com/diasks2/ruby-nlp), and [awesome-javascript](https://github.com/sorrycc/awesome-javascript).