mirror of https://github.com/lorien/awesome-web-scraping.git synced 2024-11-24 08:32:19 +02:00
# Web-scraping Web Services
## Web Data Extraction Services
* [Dataflow kit](https://dataflowkit.com) - Turn website data into structured data with a simple point-and-click toolkit
* [ProxyCrawl](https://proxycrawl.com) - Crawl and scrape any website without blocks, captchas or proxies
* [ScraperAPI](https://www.scraperapi.com) - A service that manages proxies and headless browsers, exposing a single API endpoint to scrape any URL.
* [import.io](https://import.io/)
* [ScraperWiki](https://scraperwiki.com/about)
* [Mozenda](https://www.mozenda.com/)
* [PhantomJs.Cloud](https://phantomjscloud.com/)
* [CloudScrape](http://cloudscrape.com/)
* [DiffBot](http://www.diffbot.com/)
* [Apify](https://www.apify.com/) - A serverless web scraping, data extraction and web automation platform
* [Portia](http://scrapinghub.com/portia/); also on GitHub: [scrapinghub/portia](https://github.com/scrapinghub/portia)
* [Dexi](https://dexi.io)
* [Morph.io](https://morph.io) - A free, fully [open-source](https://github.com/openaustralia/morph) service
* [Page.REST](https://page.rest/)
* [ParseHub](https://www.parsehub.com/)
* [WrapAPI](https://wrapapi.com/)
* [Agenty](https://www.agenty.com/)
* [ScrapingBee](https://www.scrapingbee.com/) - A web scraping API that handles rotating proxies and headless browsers.
* [SerpApi](https://serpapi.com/) - Real-time API to access structured search results of search engines.
* [ScrapingAnt](https://scrapingant.com/) - Web Scraping API with thousands of residential proxies and headless Chrome cluster.
* [Zyte](https://www.zyte.com/) - Web data extraction services and platform, from the lead maintainers of Scrapy.