mirror of https://github.com/lorien/awesome-web-scraping.git synced 2025-02-21 19:06:39 +02:00

464 Commits

Author SHA1 Message Date
Some User
b5f82fc622 Fix spaces 2022-06-24 14:47:34 +03:00
lorien
c4b75ab321
Merge pull request #141 from milahu/patch-2
fix links
2022-06-03 10:31:13 +07:00
lorien
15f02d7a20
Merge pull request #142 from otsch/patch-2
Add crwlr packages and Symfony DomCrawler
2022-06-03 10:28:14 +07:00
otsch
394e973b96
Add crwlr packages and Symfony DomCrawler 2022-06-02 23:09:46 +02:00
milahu
46d625c36a
sort by github stars 2022-04-18 08:54:38 +02:00
milahu
1c3879c18a
fix links 2022-04-18 08:51:15 +02:00
lorien
db64c54277
Merge pull request #140 from roniemartinez/patch-2
Add Dude to python web scraping frameworks
2022-03-18 13:03:22 +03:00
Ronie Martinez
22589140be
Add Dude to python web scraping frameworks
Added Dude (https://github.com/roniemartinez/dude)
2022-03-18 01:55:37 +01:00
lorien
72521cc512
Update CONTRIBUTING.md 2022-03-16 12:57:23 +03:00
lorien
b666583354
Update CONTRIBUTING.md 2022-03-16 12:56:19 +03:00
lorien
7c39d7e6a7
Update CONTRIBUTING.md 2022-03-16 12:54:58 +03:00
lorien
a1cc8f4482
Merge pull request #139 from gidoneli/patch-3
Update web_services.md
2022-03-16 12:50:58 +03:00
lorien
fa4a1f9c59
Add captcha solving libraries 2022-03-16 12:48:45 +03:00
Gidoneli
eab9ed7740
Update web_services.md
Request to add the data collector
2022-03-16 11:39:57 +02:00
lorien
9e1145e29e
Remove links to outdated python lists 2022-03-16 12:35:20 +03:00
lorien
0d64dfd6cc
Merge pull request #137 from dvlop/patch-2
fix url to caddy proxy server repository
2022-03-16 12:32:47 +03:00
dvlop
5c3dbef58b
fix url to caddy proxy server repository
wrong url of caddy cross-platform HTTP/2 web server
2022-03-15 00:41:39 +02:00
lorien
2d7b6f0d02
Merge pull request #135 from rand-net/master
Adds dribbble-py - a dribbble.com scraper
2022-03-07 17:40:57 +03:00
rand-net
9d637fcafb
Merge pull request #1 from rand-net/rand-net-patch-1
Adds dribbble-py - a dribbble.com scraper
2022-03-07 13:16:17 +00:00
rand-net
a8bdec29ee
Adds dribbble-py - a dribbble.com scraper 2022-03-07 13:14:22 +00:00
lorien
46319dba34
Update README.md 2022-03-02 00:39:52 +03:00
lorien
2ec1e070c7
Update CONTRIBUTING.md 2022-02-28 12:41:01 +03:00
lorien
01f15b9652
Merge pull request #132 from thiagola92/patch-2
Update python.md: pyparsing link
2022-02-28 12:34:48 +03:00
Thiago Lages de Alencar
b4190c7e22
Update python.md
Link was down, switched to github repository
2022-01-09 05:18:06 -03:00
lorien
fda6018fb5
Merge pull request #129 from AnderRV/add-zenrows
Add ZenRows to Web-data Extracting Services
2021-10-22 00:51:13 +03:00
Ander
594e11e213 Add ZenRows to Web-data Extracting Services 2021-10-20 16:01:54 +02:00
lorien
5004b1b71f
Update python.md 2021-08-10 17:15:23 +03:00
lorien
4e91f7bd49
Merge pull request #128 from ProxiesAPI/add-proxies-api
Add proxies api
2021-08-06 17:33:05 +03:00
Girish Lakshminarayana
d7767a04f0 added Proxies API 2021-08-06 19:33:11 +05:30
Girish Lakshminarayana
672f1f4e38 add Proxies API 2021-08-06 19:30:23 +05:30
lorien
44e7083e3a
Merge pull request #126 from ray-102/master
Added Extractnet an ML based crawler in Python
2021-07-24 21:29:06 +03:00
lorien
06162ffbb6
Update README.md 2021-05-10 16:38:55 +03:00
lorien
84b604f27f
Update README.md 2021-05-10 16:37:37 +03:00
lorien
f7b1dd25b2
Merge pull request #127 from 0xflotus/patch-2
fixed small error
2021-04-05 16:22:34 +03:00
0xflotus
66da267fa5
fixed small error 2021-04-05 00:48:14 +02:00
ray-102
d36d03b647
Added ExtractNet 2021-03-07 14:20:53 +08:00
lorien
ac41d69b17
Merge pull request #125 from zseta/master
Add Zyte to services
2021-03-03 16:50:16 +03:00
Attila Tóth
f8c41043b8
Add formerly 2021-03-02 17:31:04 +01:00
Attila Tóth
d388d73cb7
Minor fix 2021-03-02 17:30:29 +01:00
Attila Tóth
22b9f3530f
Add Zyte to services 2021-03-02 17:29:23 +01:00
lorien
cbf58b61f3
Merge pull request #124 from kami4ka/patch-2
ScrapingAnt added
2021-02-17 12:02:18 +03:00
Oleg Kulyk
ea7699d14e
ScrapingAnt added 2021-02-05 23:12:45 +02:00
lorien
68548b984e
Update python.md 2020-12-12 18:42:28 +03:00
lorien
1d3672a3d7
Merge pull request #123 from Proteusiq/master
Moved photon
2020-12-04 00:36:46 +03:00
Prayson Wilfred Daniel
ff18f3f2dd
moved photon to correct category
Moved photo to web extraction
2020-12-03 19:26:03 +01:00
Prayson Wilfred Daniel
46b1749888
Update python.md
added seleniumbase, photon, and frontera
2020-12-01 09:27:28 +01:00
Gregory Petukhov
df08f74dde
Merge pull request #121 from Proteusiq/master
added gazpacho
2020-10-30 19:10:22 +03:00
Gregory Petukhov
3b2d45ade8
Merge pull request #120 from ZilvinasKucinskas/feature/add-search-engine-data-extraction-web-service
Add SerpApi To Web-data Extracting Services
2020-10-30 19:07:11 +03:00
Prayson Wilfred Daniel
a65a3d4baa
added gazpacho
gazpacho is a simple, fast, and modern web scraping library
2020-10-30 10:51:39 +01:00
Zilvinas Kucinskas
79484a3102
Add SerpApi to web-data extracting services 2020-10-29 18:34:00 +02:00