mirror of
https://github.com/lorien/awesome-web-scraping.git
synced 2024-11-28 08:48:58 +02:00
# Awesome Web Scraping

A list of tools, programming libraries, and web services used in web scraping.

Web scraping chats: [@grablab](https://t.me/grablab) (English) and [@grablab_ru](https://t.me/grablab_ru) (Russian).

## Programming Libraries

* [Python](http://github.com/lorien/web-scraping/blob/master/python.md)
* [PHP](http://github.com/lorien/web-scraping/blob/master/php.md)
* [Ruby](http://github.com/lorien/web-scraping/blob/master/ruby.md)
* [JavaScript](http://github.com/lorien/web-scraping/blob/master/javascript.md)
* [Go](http://github.com/lorien/web-scraping/blob/master/golang.md)

## Other Things

* [Web Scraping Services](http://github.com/lorien/web-scraping/blob/master/web_services.md)
* [Console tools](http://github.com/lorien/web-scraping/blob/master/console_tools.md)
* [dhamaniasad / HeadlessBrowsers](https://github.com/dhamaniasad/HeadlessBrowsers) - a list of (almost) all headless web browsers in existence
* [lorien / awesome-python-dev](https://github.com/lorien/awesome-python-dev) - a list of tools for debugging, profiling, and analyzing Python programs

## Captcha Solving Services

* [2captcha.com](https://2captcha.com/?from=3019071) - 2captcha.com
* [anti-gate.com](http://getcaptchasolution.com/ijykrofoxz) - anti-gate.com

## Proxy Services

* [proxyrack.com](http://www.proxyrack.com/access/aff/go/lorien) - Over 1,250,000 Unique Private IPs With Instant Setup
* [webshare.io](https://proxy.webshare.io/register/?referral_code=g41thaun6ip6) - 5000+ HTTP server proxies
* [proxium.ru](http://proxium.ru/partner/5c47bdaae8359) - extremely cheap small proxy lists
* [scrapinghub.com/crawlera](https://scrapinghub.com/crawlera?rfsn=4147167.2ec6da) - Smart Proxy Network that automatically rotates IPs

## About awesome-web-scraping

Make this list better! Your contributions are always welcome! See the [contributing how-to](https://github.com/lorien/web-scraping/blob/master/CONTRIBUTING.md). To add a new language to the Programming Libraries section, use [new_language_template](http://github.com/lorien/web-scraping/blob/master/new_language_template.md) as a starting point.

The list was initially based on data from these sources: [awesome-python](https://github.com/vinta/awesome-python), [awesome-php](https://github.com/ziadoz/awesome-php), [awesome-ruby](https://github.com/markets/awesome-ruby), [ruby-nlp](https://github.com/diasks2/ruby-nlp), and [awesome-javascript](https://github.com/sorrycc/awesome-javascript).