From 46c7dea9befe6a843c338634e4755f974881ff58 Mon Sep 17 00:00:00 2001
From: Tim Voronov <ziflex@gmail.com>
Date: Sun, 28 Mar 2021 23:16:55 -0400
Subject: [PATCH] Removed documentation from README

---
 README.md | 553 +-----------------------------------------------------
 1 file changed, 1 insertion(+), 552 deletions(-)

diff --git a/README.md b/README.md
index ea762146..f8664b10 100644
--- a/README.md
+++ b/README.md
@@ -36,555 +36,4 @@ It is extremely portable, extensible, and fast.
 * Embeddable
 * Extensible
 
-### Show me some code
-The following example demonstrates the use of dynamic pages.    
-We load the main Google Search page, type a search criteria into the input box, and then click the search button.   
-The click action triggers a redirect, so we wait until the page we were redirected to finishes loading.   
-Once the results page is loaded, we iterate over all elements in the search results and assign output to a variable.   
-
-```aql
-LET google = DOCUMENT("https://www.google.com/", {
-    driver: "cdp",
-    userAgent: "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_14_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/76.0.3809.87 Safari/537.36"
-})
-
-HOVER(google, 'input[name="q"]')
-WAIT(RAND(100))
-INPUT(google, 'input[name="q"]', @criteria, 30)
-
-WAIT(RAND(100))
-
-WAIT_ELEMENT(google, '.UUbT9')
-WAIT(RAND(100))
-CLICK(google, 'input[name="btnK"]')
-
-WAIT_NAVIGATION(google)
-
-FOR result IN ELEMENTS(google, '.g')
-    // filter out extra elements like videos and 'People also ask'
-    FILTER TRIM(result.attributes.class) == 'g'
-    RETURN {
-        title: INNER_TEXT(result, 'h3'),
-        description: INNER_TEXT(result, '.rc > div:nth-child(2) span'),
-        url: INNER_TEXT(result, 'cite')
-    }
-```
-
-You can find more examples [here](./examples).
-
-
-### Motivation
-Nowadays, data is everything and who owns data - owns the world.    
-I have worked on multiple data-driven projects where data was an essential part of a system, and I realized how repetitive it is to write scraping code.
-Other scraping libraries require lots of boilerplate code and tend to encourage an imperative approach to extracting data.
-After some time looking for a tool that would let me declare which data I needed (instead of imperatively instructing it how to extract it), I decided to build my own solution.    
-```ferret``` project is an ambitious initiative trying to bring the universal platform for writing scrapers without the hassle of other scrapers.
-
-### Inspiration
-FQL (Ferret Query Language) is meant to feel like writing a database query.
-It is heavily inspired by [AQL](https://www.arangodb.com/) (ArangoDB Query Language).
-But, due to the domain specifics, there are some differences in syntax and how things work.     
-
-
-## Installation
-
-### Binary
-You can download the latest binaries from [here](https://github.com/MontFerret/ferret/releases).
-
-### Source code
-#### Production
-* Go >=1.11
-* Chrome or Docker
-
-#### Development
-* GNU Make
-* ANTLR4 >=4.8
-
-
-```sh
-go get github.com/MontFerret/ferret
-```
-
-### Environment
-
-In order to use all Ferret features, you will need to have Chrome either installed locally or running in Docker.
-For ease of use, we recommend to running Chromium inside a Docker container.
-You can probably use most Chromium-based headless images, but we've put together [an image that's ready to go](https://github.com/MontFerret/chromium):
-
-```sh
-docker pull montferret/chromium
-docker run -d -p 9222:9222 montferret/chromium
-```
-
-If you'd rather see what's happening during query execution, just start launch Chrome from your host with the remote debugging port set:
-
-```sh
-chrome.exe --remote-debugging-port=9222
-```
-
-## Quick start
-
-### Browserless mode
-
-If you want to try out ```fql```, you can get started without Chrome or a Chromium container.
-Executing the `ferret` CLI without any options will open `ferret` in REPL mode.
-```
-ferret
-```
-
-```ferret``` will run in REPL mode.
-
-```shell
-Welcome to Ferret REPL
-Please use `exit` or `Ctrl-D` to exit this program.
->%
->LET doc = DOCUMENT('https://news.ycombinator.com/')
->FOR post IN ELEMENTS(doc, '.storylink')
->RETURN post.attributes.href
->%
-
-```
-
-**Note:** symbol ```%``` is used to start and end multi-line queries. You also can use the heredoc format.
-
-If you want to execute a query stored in a file, just pass a file name:
-
-```
-ferret ./docs/examples/static-page.fql
-```
-
-```
-cat ./docs/examples/static-page.fql | ferret
-```
-
-```
-ferret < ./docs/examples/static-page.fql
-```
-
-
-### Browser mode
-
-By default, ``ferret`` loads HTML pages directly via HTTP protocol, because it's faster.    
-But, nowadays, more and more websites are rendered with JavaScript, and this 'old school' approach does not really work.    
-For these dynamic websites, you may fetch documents using Chrome or Chromium via Chrome DevTools protocol (aka CDP).    
-First, you need to make sure that you launched Chrome with ```remote-debugging-port=9222``` flag (see "Environment" in this README for instructions on setting this up).    
-Second, you need to pass the address to ```ferret``` CLI.    
-
-```
-ferret --cdp http://127.0.0.1:9222
-```
-
-**NOTE:** By default, ```ferret``` will try to use this local address as a default one.
-You only need to explicitly pass the parameter if you are using a different port number or remote address.
-
-Alternatively, you can tell CLI to launch Chrome for you.
-
-```shell
-ferret --cdp-launch
-```
-
-Once ```ferret``` knows how to communicate with Chrome, you can use the function ```DOCUMENT(url, isDynamic)```, setting ```isDynamic``` to ```{driver: "cdp"}``` for dynamic pages:
-
-```shell
-Welcome to Ferret REPL
-Please use `exit` or `Ctrl-D` to exit this program.
->%
->LET doc = DOCUMENT('https://soundcloud.com/charts/top', { driver: "cdp" })
->WAIT_ELEMENT(doc, '.chartTrack__details', 5000)
->LET tracks = ELEMENTS(doc, '.chartTrack__details')
->FOR track IN tracks
->    LET username = ELEMENT(track, '.chartTrack__username')
->    LET title = ELEMENT(track, '.chartTrack__title')
->    RETURN {
->       artist: username.innerText,
->        track: title.innerText
->    }
->%
-```
-
-```shell
-Welcome to Ferret REPL
-Please use `exit` or `Ctrl-D` to exit this program.
->%
->LET doc = DOCUMENT("https://github.com/", { driver: "cdp" })
->LET btn = ELEMENT(doc, ".HeaderMenu a")
-
->CLICK(btn)
->WAIT_NAVIGATION(doc)
->WAIT_ELEMENT(doc, '.IconNav')
-
->FOR el IN ELEMENTS(doc, '.IconNav a')
->    RETURN TRIM(el.innerText)
->%
-```
-
-### Embedded mode
-
-```ferret``` is a very modular system.
-It can be embedded into your Go application in only a few lines of code.
-
-Here is an example of a short Go application that defines an `fql` query, compiles it, executes it, then returns the results.
-
-
-```go
-package main
-
-import (
-	"context"
-	"encoding/json"
-	"fmt"
-	"os"
-
-	"github.com/MontFerret/ferret/pkg/compiler"
-	"github.com/MontFerret/ferret/pkg/drivers"
-	"github.com/MontFerret/ferret/pkg/drivers/cdp"
-	"github.com/MontFerret/ferret/pkg/drivers/http"
-)
-
-type Topic struct {
-	Name        string `json:"name"`
-	Description string `json:"description"`
-	URL         string `json:"url"`
-}
-
-func main() {
-	topics, err := getTopTenTrendingTopics()
-
-	if err != nil {
-		fmt.Println(err)
-		os.Exit(1)
-	}
-
-	for _, topic := range topics {
-		fmt.Println(fmt.Sprintf("%s: %s %s", topic.Name, topic.Description, topic.URL))
-	}
-}
-
-func getTopTenTrendingTopics() ([]*Topic, error) {
-	query := `
-		LET doc = DOCUMENT("https://github.com/topics")
-
-		FOR el IN ELEMENTS(doc, ".py-4.border-bottom")
-			LIMIT 10
-			LET url = ELEMENT(el, "a")
-			LET name = ELEMENT(el, ".f3")
-			LET description = ELEMENT(el, ".f5")
-
-			RETURN {
-				name: TRIM(name.innerText),
-				description: TRIM(description.innerText),
-				url: "https://github.com" + url.attributes.href
-			}
-	`
-
-	comp := compiler.New()
-
-	program, err := comp.Compile(query)
-
-	if err != nil {
-		return nil, err
-	}
-
-	// create a root context
-	ctx := context.Background()
-
-	// enable HTML drivers
-	// by default, Ferret Runtime does not know about any HTML drivers
-	// all HTML manipulations are done via functions from standard library
-	// that assume that at least one driver is available
-	ctx = drivers.WithContext(ctx, cdp.NewDriver())
-	ctx = drivers.WithContext(ctx, http.NewDriver(), drivers.AsDefault())
-
-	out, err := program.Run(ctx)
-
-	if err != nil {
-		return nil, err
-	}
-
-	res := make([]*Topic, 0, 10)
-
-	err = json.Unmarshal(out, &res)
-
-	if err != nil {
-		return nil, err
-	}
-
-	return res, nil
-}
-
-```
-
-## Extras
-
-### Extensibility
-
-With ```ferret```'s modular system, you can also extend its standard library.
-
-In this example, we define a `transform` function in Go, then register that function with ```ferret```, making it available for use in ```fql``` queries.
-
-```go
-package main
-
-import (
-	"context"
-	"encoding/json"
-	"fmt"
-	"os"
-	"strings"
-
-	"github.com/MontFerret/ferret/pkg/compiler"
-	"github.com/MontFerret/ferret/pkg/runtime/core"
-	"github.com/MontFerret/ferret/pkg/runtime/values"
-)
-
-func main() {
-	strs, err := getStrings()
-
-	if err != nil {
-		fmt.Println(err)
-		os.Exit(1)
-	}
-
-	for _, str := range strs {
-		fmt.Println(str)
-	}
-}
-
-func getStrings() ([]string, error) {
-	// function implements is a type of a function that ferret supports as a runtime function
-	transform := func(ctx context.Context, args ...core.Value) (core.Value, error) {
-		// it's just a helper function which helps to validate a number of passed args
-		err := core.ValidateArgs(args, 1, 1)
-
-		if err != nil {
-			// it's recommended to return built-in None type, instead of nil
-			return values.None, err
-		}
-
-		// this is another helper functions allowing to do type validation
-		err = core.ValidateType(args[0], core.StringType)
-
-		if err != nil {
-			return values.None, err
-		}
-
-		// cast to built-in string type
-		str := args[0].(values.String)
-
-		return values.NewString(strings.ToUpper(str.String() + "_ferret")), nil
-	}
-
-	query := `
-		FOR el IN ["foo", "bar", "qaz"]
-			// conventionally all functions are registered in upper case
-			RETURN TRANSFORM(el)
-	`
-
-	comp := compiler.New()
-
-	if err := comp.RegisterFunction("transform", transform); err != nil {
-		return nil, err
-	}
-
-	program, err := comp.Compile(query)
-
-	if err != nil {
-		return nil, err
-	}
-
-	out, err := program.Run(context.Background())
-
-	if err != nil {
-		return nil, err
-	}
-
-	res := make([]string, 0, 3)
-
-	err = json.Unmarshal(out, &res)
-
-	if err != nil {
-		return nil, err
-	}
-
-	return res, nil
-}
-```
-
-You can completely turn off the ```ferret``` standard library, as follows:
-
-```go
-comp := compiler.New(compiler.WithoutStdlib())
-```
-
-After disabling ```stdlib```, you can register your own implementation of functions from standard library.    
-
-If you only need a subset of the ```stdlib``` functions, you can only have those enabled by disabling the entire ```stdlib```, then registering the individual packages that are needed:    
-
-```go
-package main
-
-import (
-    "github.com/MontFerret/ferret/pkg/compiler"
-    "github.com/MontFerret/ferret/pkg/stdlib/strings"
-)
-
-func main() {
-    comp := compiler.New(compiler.WithoutStdlib())
-
-    comp.RegisterFunctions(strings.NewLib())
-}
-```
-
-### Proxy
-
-By default, ```ferret``` does not attempt to use a proxy. This is due to an inability to CDP-compatible browsers to use an arbitrary proxy. If you need to use a proxy, it should be defined while launching the browser.
-
-However, if you are querying static pages, you can define a proxy while launching ``ferret``` from the CLI or from embedded applications.
-
-#### CLI example
-
-```sh
-ferret --proxy=http://localhost:8888 my-query.fql
-```
-
-#### Embedded example
-
-```go
-package main
-
-import (
-    "context"
-    "encoding/json"
-    "fmt"
-    "os"
-	
-    "github.com/MontFerret/ferret/pkg/compiler"
-    "github.com/MontFerret/ferret/pkg/drivers"
-    "github.com/MontFerret/ferret/pkg/drivers/http"
-)
-
-func run(q string) ([]byte, error) {
-    proxy := "http://localhost:8888"
-    comp := compiler.New()
-    program := comp.MustCompile(q)
-
-    // create a root context
-    ctx := context.Background()
-
-    // we inform the driver what proxy to use
-    ctx = drivers.WithContext(ctx, http.NewDriver(http.WithProxy(proxy)), drivers.AsDefault())
-
-    return program.Run(ctx)
-}
-
-```
-
-### Cookies
-
-#### Get, Set, Delete
-For more precise work, you can set/get/delete cookies manually before and after loading the page:
-
-```
-LET doc = DOCUMENT("https://www.google.com", {
-    driver: "cdp",
-    cookies: [
-         {
-             name: "foo",
-             value: "bar"
-         }
-    ]
-})
-
-COOKIE_SET(doc, { name: "baz", value: "qaz"}, { name: "daz", value: "gag" })
-COOKIE_DEL(doc, "foo")
-
-LET c = COOKIE_GET(doc, "baz")
-
-FOR cookie IN doc.cookies
-    RETURN cookie.name
-	
-```
-
-#### Access previously-set cookies (non-incognito mode)
-
-By default, ``CDP`` driver execute each query in an incognito mode in order to avoid collisions from cookies persisted by previous queries.   
-However, sometimes you might want access to persisted cookies (e.g. to avoid re-authenticating with a site).   
-In order to do that, we need to configure the driver to execute all queries in non-incognito tabs.
-
-Here is how to do that:
-
-##### CLI example
-
-```sh
-ferret --cdp-keep-cookies my-query.fql
-```
-
-##### Embedded example
-
-```go
-package main
-
-import (
-	"context"
-	"encoding/json"
-	"fmt"
-	"os"
-
-	"github.com/MontFerret/ferret/pkg/compiler"
-	"github.com/MontFerret/ferret/pkg/drivers"
-	"github.com/MontFerret/ferret/pkg/drivers/cdp"
-)
-
-func run(q string) ([]byte, error) {
-	comp := compiler.New()
-	program := comp.MustCompile(q)
-
-	// create a root context
-	ctx := context.Background()
-
-	// we inform the driver to keep cookies between queries
-	ctx = drivers.WithContext(
-		ctx,
-		cdp.NewDriver(cdp.WithKeepCookies()),
-		drivers.AsDefault(),
-	)
-
-	return program.Run(ctx)
-}
-```
-
-##### Query
-```
-LET doc = DOCUMENT("https://www.google.com", {
-    driver: "cdp",
-    keepCookies: true
-})
-```
-
-### File System
-
-```ferret``` can also read and write to the file system.
-
-#### Write example
-```
-USE IO::FS
-
-LET favicon = DOWNLOAD("https://www.google.com/favicon.ico")
-
-RETURN WRITE("google.favicon.ico", favicon)
-```
-
-#### Read example
-```
-USE IO::FS
-
-LET urls_data = READ("urls.json")
-LET urls = JSON_PARSE(urls_data)
-
-FOR url IN urls
-    RETURN DOCUMENT(url)
-```
-
-## References
-
-Further documentation is available [at our website](https://www.montferret.dev/docs/introduction/)
+Documentation is available [at our website](https://www.montferret.dev/docs/introduction/).