2017-05-15 18:42:40 +02:00

# Perl Web Scraping

This list contains Perl libraries related to web scraping and data processing.

* [Perl Web Scraping](#perl-web-scraping)
* [Network](#network)
* [Web-Scraping Frameworks](#web-scraping-frameworks)
* [HTML/XML Parsing](#htmlxml-parsing)
* [Text Processing](#text-processing)
* [Specific Formats Processing](#specific-formats-processing)
* [Natural Language Processing](#natural-language-processing)
* [Browser Automation and Emulation](#browser-automation-and-emulation)
* [Multiprocessing](#multiprocessing)
* [Queue](#queue)
* [Email](#email)
* [URL and Network Address Manipulation](#url-and-network-address-manipulation)
* [Web Content Extracting](#web-content-extracting)
* [Asynchronous](#asynchronous)
* [WebSocket](#websocket)
* [DNS Resolving](#dns-resolving)
* [Computer Vision](#computer-vision)
* [Proxy Server](#proxy-server)
* [Other Perl Lists](#other-perl-lists)

## Network

* General
    * TODO
* Asynchronous
    * TODO

## Web-Scraping Frameworks

* Full Featured Crawlers
    * TODO
* Other
    * TODO

## HTML/XML Parsing

2018-09-06 15:01:37 +02:00

* TODO

## Text Processing

*Libraries for parsing and manipulating plain text.*

* General
    * TODO

## Specific Formats Processing

*Libraries for parsing and manipulating specific text formats.*

* General
    * TODO
* Something
    * TODO

## Natural Language Processing

*Libraries for working with human languages.*

* TODO

## Browser Automation and Emulation

* TODO

## Multiprocessing

* TODO

## Asynchronous

*Libraries for asynchronous network programming.*

* TODO

## Queue

* TODO

## Email

*Libraries for parsing email.*

* TODO

## URL and Network Address Manipulation

*Libraries for parsing/modifying URLs and network addresses.*

* URL
    * TODO
* Network Address
    * TODO

## Web Content Extracting

*Libraries for extracting web content.*

* Text and Metadata from HTML Pages
    * TODO

## WebSocket

*Libraries for working with WebSocket.*

* TODO

## DNS Resolving

* TODO

## Computer Vision

* TODO

## Proxy Server

* TODO

## Other Perl Lists

* TODO