Ville Skyttä
b5d03c0079
feat(regexlexer): compile in RE2 compatibility mode
...
To better match vanilla Go regexps and support some additional
constructs that might be present in Pygments rules.
https://github.com/dlclark/regexp2#re2-compatibility-mode
2021-05-17 14:09:19 +10:00
Siavash Askari Nasr
2e23e7f215
regexp2 uses a group's number as its name, so the name check isn't needed
2021-05-08 18:48:49 +10:00
mlpo
ff6eedba72
Fix: sort words in descending order of length before regex generation (#496)
...
* Fix: sort words in descending order of length before regex generation
* Avoid code duplication in Raku lexer
2021-05-08 09:10:18 +10:00
Siavash Askari Nasr
225e1862d3
Pass *LexerState as context to emitters
...
Useful for accessing named capture groups and context set by
`mutators` and other fields and methods LexerState provides.
2021-05-07 22:55:54 +10:00
Siavash Askari Nasr
dcfd826b25
Add support for named capture groups
2021-05-06 21:34:28 +10:00
Alec Thomas
7e282be495
Update golangci-lint so we can force use of LazyLexer.
2021-04-29 12:08:28 +10:00
Cameron Moore
59126c5b32
Add NewLazyLexer to defer rules definitions and reduce init costs (#449)
...
Add NewLazyLexer and MustNewLazyLexer which accept a function that
returns the rules for the lexer. This allows us to defer the rules
definitions until they're needed.
Lexers in a, g, s, and x packages have been updated to use the new lazy
lexer.
2021-02-08 12:16:49 +11:00
Alec Thomas
5da831672d
Fix a few bugs including sub-lexers adding additional newlines when
...
EnsureNL is true.
2021-02-06 20:13:50 +11:00
Alec Thomas
e62d93f4aa
Add a timeout to regexes.
...
This avoids pathologically bad match times. Fixes #378.
2020-07-08 20:23:13 +10:00
Alec Thomas
2b9ea60d89
Split PHP into two lexers - PHP and PHTML.
...
The former is pure PHP code while the latter is PHP code in <? ?> tags,
within HTML.
Fixes #210.
2020-06-30 21:00:09 +10:00
Alec Thomas
ee4284bb40
Add a Rules.Merge() helper function.
...
Might be useful for #363.
2020-05-16 16:04:21 +10:00
satotake
34d9c7143b
Add new TokeniseOption EnsureLF (#336)
...
* Add new TokeniseOption EnsureLF
ref #329
* Use efficient process suggested by @chmike
2020-03-04 18:56:47 +11:00
Alec Thomas
28dcb8565c
Fixes #305.
2019-11-24 12:49:34 +11:00
Alec Thomas
bbc59ac372
Emit error tokens when there's a group mismatch.
...
Also don't panic/recover, as we no longer use panic to report "real"
errors.
Fixes #295.
2019-10-24 17:03:35 +11:00
Alec Thomas
73d11b3c45
Clear background colour for TTY formatters.
2019-10-15 21:08:17 +11:00
Alec Thomas
ea14dd8660
Fixed a fundamental bug where ^ would always match.
...
The engine was always passing a string sliced to the current position,
resulting in ^ always matching. Switched to use
FindRunesMatchStartingAt.
Fixes #242.
2019-06-12 12:32:20 +10:00
Alec Thomas
2105c68ed2
Implemented a weird little Pygments rule that I missed.
...
> If the RegexLexer encounters a newline that is flagged as an error
> token, the stack is emptied and the lexer continues scanning in the
> 'root' state. This can help producing error-tolerant highlighting for
> erroneous input, e.g. when a single-line string is not closed.
Fixes #246.
2019-04-22 18:22:58 +10:00
Alec Thomas
da5ac60d8c
Add golangci-lint and fix all lint issues.
2018-12-31 22:46:59 +11:00
Daniel Eloff
9c3abeae1d
Tokens by value (#187)
...
This results in about an 8% improvement in speed.
2018-11-04 10:22:51 +11:00
Kenneth Shaw
95d0a9381b
Fix Dollar-Quoted Strings (postgres + cql)
...
This commit refactors code from the markdown lexer into the chroma
package, and alters the PostgreSQL and CQL lexers to make use of it.
Additionally, an example markdown with the various sublexers is added.
2018-06-12 09:16:18 +07:00
Alec Thomas
f315512f5c
Add support for Go templates.
...
These are exposed as go-text-template and go-html-template.
Fixes #105.
2018-03-18 21:57:34 +11:00
Alec Thomas
3020e2ea8c
Fix bug with nested newlines.
...
Fixes #124.
Also reinstitute lexer tests that disappeared during package split.
2018-03-03 10:16:21 +11:00
Alec Thomas
35126f9a94
Implement rudimentary JSX lexer based on https://github.com/fcurella/jsx-lexer/blob/master/jsx/lexer.py
...
Fixes #111.
2018-02-07 22:11:40 +11:00
Alec Thomas
ce3d6bf527
Invert default "ensure newline" behaviour so that it is opt-in.
...
See #47.
2017-09-30 14:41:05 +10:00
Alec Thomas
573c1d157d
Ensure a newline exists at the end of files.
...
Fixes #42.
2017-09-29 21:59:52 +10:00
Alec Thomas
d5083b3f7c
Big changes to the style and colour APIs.
...
- Styles now use a builder system, to enforce immutability of styles.
- Corrected and cleaned up how style inheritance works.
- Added a brightening function to colours
- HTML formatter will now automatically pick line and highlight colours
if they are not provided in the style. This is done by slightly
darkening or lightening.
Fixes #21.
2017-09-23 22:09:46 +10:00
Alec Thomas
9d7539a4cd
Fix bug in Turtle lexer.
2017-09-22 23:27:40 +10:00
Alec Thomas
a5a3b67010
Reprocess all rules after a LexerMutator is applied.
2017-09-22 23:14:32 +10:00
Alec Thomas
2ce2ec7f65
Fix bug with empty states.
2017-09-22 22:40:00 +10:00
Alec Thomas
0bb853fb4f
Convert Include to a LexerMutator.
...
Fixes #18.
2017-09-22 22:29:17 +10:00
Alec Thomas
1724aab879
Implement compile-time lexer mutators.
...
This should fix #15.
2017-09-21 20:02:53 +10:00
Alec Thomas
60797cc03f
Add tracing + better error recovery.
2017-09-21 17:52:28 +10:00
Alec Thomas
e2d6abaa64
Document and add iterator panic recovery.
2017-09-20 23:06:23 +10:00
Alec Thomas
cc0e4a59ab
Switch to an Iterator interface.
...
This is to solve an issue where writers returned by the Formatter
were often stateful, but this fact was not obvious to the API consumer,
and failed in interesting ways.
2017-09-20 22:19:36 +10:00
Alec Thomas
36ead7258a
Use utf8.RuneCountInString() rather than len() :(
...
Fixes #10. Thanks @curio77.
2017-09-20 20:36:25 +10:00
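
The difference behind the `utf8.RuneCountInString()` change can be shown in a short sketch. `byteAndRuneLen` is a hypothetical helper for illustration, not code from the commit. In Go, `len()` on a string counts bytes, so any offset arithmetic done in runes drifts as soon as the input contains a multi-byte character.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// byteAndRuneLen is a hypothetical helper that returns both measurements
// of a string: its length in bytes and its length in runes (code points).
func byteAndRuneLen(s string) (bytes, runes int) {
	return len(s), utf8.RuneCountInString(s)
}

func main() {
	// U+00E9 (e with acute accent) takes two bytes in UTF-8.
	b, r := byteAndRuneLen("h\u00e9llo")
	fmt.Println(b) // 6
	fmt.Println(r) // 5
}
```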
Alec Thomas
44b23f97b4
Split Regexp lexer into its own file.
2017-09-20 20:19:33 +10:00