mirror of
https://github.com/MontFerret/ferret.git
synced 2025-01-18 03:22:02 +02:00
Merge branch 'master' of github.com:MontFerret/ferret
commit 84b71fdbc6
20
README.md
@@ -4,7 +4,7 @@
 ![ferret](https://raw.githubusercontent.com/MontFerret/ferret/master/assets/intro.jpg)
 
 ## What is it?
-```ferret``` is a web scraping system aiming to simplify data extraction from the web for such things like machine learning and analytics.
+```ferret``` is a web scraping system aiming to simplify data extraction from the web for such things like ui testing, machine learning and analytics.
 Having it's own declarative language, ```ferret``` abstracts away technical details and complexity of the underlying technologies, helping to focus on the data itself.
 It's extremely portable, extensible and fast.
 
@@ -18,15 +18,15 @@ Once the page gets loaded, we iterate over all elements in search results and as
 The final for loop filters out empty elements that might be because of inaccurate use of selectors.
 
 ```aql
-LET g = DOCUMENT("https://www.google.com/", true)
+LET google = DOCUMENT("https://www.google.com/", true)
 
-INPUT(g, 'input[name="q"]', "ferret")
-CLICK(g, 'input[name="btnK"]')
+INPUT(google, 'input[name="q"]', "ferret")
+CLICK(google, 'input[name="btnK"]')
 
-WAIT_NAVIGATION(g)
+WAIT_NAVIGATION(google)
 
 LET result = (
-FOR result IN ELEMENTS(g, '.g')
+FOR result IN ELEMENTS(google, '.g')
 RETURN {
 title: ELEMENT(result, 'h3 > a'),
 description: ELEMENT(result, '.st'),
@@ -35,9 +35,9 @@ LET result = (
 )
 
 RETURN (
-FOR i IN result
-FILTER i.title != NONE
-RETURN i
+FOR page IN result
+FILTER page.title != NONE
+RETURN page
 )
 ```
 
@@ -60,7 +60,7 @@ But due to the domain specifics, there are some differences in how things work.
 
 ## WIP
 Be aware, the the project is under heavy development. There is no documentation and some things may change in the final release.
-For query syntax, you may go to [ArrangoDB web site](https://docs.arangodb.com/3.3/AQL/index.html) and use AQL docs as docs for FQL - since they are identical.
+For query syntax, you may go to [ArangoDB web site](https://docs.arangodb.com/3.3/AQL/index.html) and use AQL docs as docs for FQL - since they are identical.
 
 
 ## Installation