mirror of https://github.com/alecthomas/chroma.git synced 2025-02-13 13:28:27 +02:00

559 Commits

Author SHA1 Message Date
Alec Thomas
cc2dd5b8ad Version 2 of Chroma
This cleans up the API in general, removing a bunch of deprecated stuff,
cleaning up circular imports, etc.

But the biggest change is switching to an optional XML format for the
regex lexer.

Having lexers defined only in Go is not ideal for a couple of reasons.
Firstly, it impedes a significant portion of contributors who use Chroma
in Hugo, but don't know Go. Secondly, it bloats the binary size of any
project that imports Chroma.

Why XML? YAML is an abomination and JSON is not human editable. XML
also compresses very well (e.g. the Go template lexer XML compresses from
3239 bytes to 718).

Why a new syntax format? All major existing formats rely on the
Oniguruma regex engine, which is extremely complex and for which there
is no Go port.

Why not earlier? Prior to the existence of fs.FS this was not a viable
option.

Benchmarks:

    $ hyperfine --warmup 3 \
        './chroma.master --version' \
        './chroma.xml-pre-opt --version' \
        './chroma.xml --version'
    Benchmark 1: ./chroma.master --version
      Time (mean ± σ):       5.3 ms ±   0.5 ms    [User: 3.6 ms, System: 1.4 ms]
      Range (min … max):     4.2 ms …   6.6 ms    233 runs

    Benchmark 2: ./chroma.xml-pre-opt --version
      Time (mean ± σ):      50.6 ms ±   0.5 ms    [User: 52.4 ms, System: 3.6 ms]
      Range (min … max):    49.2 ms …  51.5 ms    51 runs

    Benchmark 3: ./chroma.xml --version
      Time (mean ± σ):       6.9 ms ±   1.1 ms    [User: 5.1 ms, System: 1.5 ms]
      Range (min … max):     5.7 ms …  19.9 ms    196 runs

    Summary
      './chroma.master --version' ran
        1.30 ± 0.23 times faster than './chroma.xml --version'
        9.56 ± 0.83 times faster than './chroma.xml-pre-opt --version'

A slight increase in init time, but I think this is okay given the
increase in flexibility.

And binary size difference:

    $ du -h lexers.test*
    $ du -sh chroma*
    8.8M	chroma.master
    7.8M	chroma.xml
    7.8M	chroma.xml-pre-opt

Incompatible changes:

- (*RegexLexer).SetAnalyser: changed from func(func(text string) float32) *RegexLexer to func(func(text string) float32) Lexer
- (*TokenType).UnmarshalJSON: removed
- Lexer.AnalyseText: added
- Lexer.SetAnalyser: added
- Lexer.SetRegistry: added
- MustNewLazyLexer: removed
- MustNewLexer: changed from func(*Config, Rules) *RegexLexer to func(*Config, func() Rules) *RegexLexer
- Mutators: changed from func(...Mutator) MutatorFunc to func(...Mutator) Mutator
- NewLazyLexer: removed
- NewLexer: changed from func(*Config, Rules) (*RegexLexer, error) to func(*Config, func() Rules) (*RegexLexer, error)
- Pop: changed from func(int) MutatorFunc to func(int) Mutator
- Push: changed from func(...string) MutatorFunc to func(...string) Mutator
- TokenType.MarshalJSON: removed
- Using: changed from func(Lexer) Emitter to func(string) Emitter
- UsingByGroup: changed from func(func(string) Lexer, int, int, ...Emitter) Emitter to func(int, int, ...Emitter) Emitter
v2.0.0-alpha1
2022-01-27 15:22:00 +11:00
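
The signature changes to NewLexer and MustNewLexer above (taking `func() Rules` instead of `Rules`), together with the removal of the separate lazy constructors, suggest that rules are now built on demand. The sketch below shows that general pattern with sync.Once. The `Rules` and `Lexer` types and both functions here are hypothetical stand-ins written for this example, not Chroma's implementation.

```go
package main

import (
	"fmt"
	"sync"
)

// Rules and Lexer are hypothetical stand-ins, not Chroma's types.
type Rules map[string][]string

type Lexer struct {
	name  string
	once  sync.Once
	build func() Rules
	rules Rules
}

// NewLexer stores the rules constructor without calling it,
// so creating a lexer costs almost nothing at init time.
func NewLexer(name string, build func() Rules) *Lexer {
	return &Lexer{name: name, build: build}
}

// Rules builds the rule set on first use and caches the result.
func (l *Lexer) Rules() Rules {
	l.once.Do(func() { l.rules = l.build() })
	return l.rules
}

func main() {
	calls := 0
	l := NewLexer("demo", func() Rules {
		calls++
		return Rules{"root": {`\d+`, `\w+`}}
	})
	fmt.Println("constructor calls before use:", calls)
	l.Rules()
	l.Rules()
	fmt.Println("constructor calls after two uses:", calls)
}
```

Deferring the constructor keeps program start-up cheap when many lexers are registered but only one is used.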
Alec Thomas
d491f1b5c1 Helper scripts for on-the-fly running chroma{,d}. 2022-01-27 10:09:13 +11:00
silverwind
4ba9a05fee Add mariadb aliases
Fixes: https://github.com/alecthomas/chroma/issues/590
2022-01-16 10:29:45 +11:00
pu
0b2b5eec0a
Add[styles/gruvbox]: vim gruvbox theme for chroma (#592) 2022-01-14 00:18:06 +11:00
Alec Thomas
36bdd4b988 Benchmark Java instead of Go. v0.10.0 2022-01-12 21:49:38 +11:00
Jelle Hulter
b01c8fcab6
Added OpenEdgeABL lexer (#585) 2022-01-04 03:22:50 +11:00
James
decf9d3229
Highlight MSBuild project files for C#, F#, and C++ as XML. (#584) 2022-01-02 18:16:11 +11:00
Martin Stühmer
3bdc3fbe7a
Removed mimetype for the Lexer bicep (#574) 2021-12-20 18:12:21 +11:00
Tobias Neitzel
ac2891ff2a
Add support for more complex command prompts (#583)
The bashsession lexer only recognized command prompts starting with
one of the characters '#$%>'. This commit adds support for a prefixed
block of the form `[.*@.*]`, which is a common layout for command prompts.
2021-12-20 18:04:05 +11:00
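
A rough sketch of the prompt shape described in the commit above: an optional `[...@...]` block followed by one of the characters '#$%>'. The pattern and the `splitPrompt` helper are illustrative assumptions using Go's standard regexp package; they are not the pattern the bashsession lexer actually uses.

```go
package main

import (
	"fmt"
	"regexp"
)

// promptRE is an illustrative pattern, not the lexer's own:
// an optional "[...@...]" block, one prompt character, then the command.
var promptRE = regexp.MustCompile(`^(\[[^\]]*@[^\]]*\])?\s*[#$%>]\s*(.*)$`)

// splitPrompt returns the command that follows a prompt.
// ok is false when the line does not start with a recognized prompt.
func splitPrompt(line string) (cmd string, ok bool) {
	m := promptRE.FindStringSubmatch(line)
	if m == nil {
		return "", false
	}
	return m[2], true
}

func main() {
	for _, line := range []string{
		"$ ls -la",
		"[user@host ~]$ make test",
		"total 0",
	} {
		cmd, ok := splitPrompt(line)
		fmt.Printf("%q -> %q %v\n", line, cmd, ok)
	}
}
```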
MsK`
d38fcfcce0 fixed scheme characters 2021-12-13 14:36:15 +11:00
Nicolas Jeannerod
06f476d11a
Add lexer for Typed and Untyped Plutus Core (#579) 2021-12-02 21:23:30 +11:00
Yang Yang
3f5761f9fe Fix background color of xcode-dark style 2021-11-30 19:03:01 +11:00
Siavash Askari Nasr
d6e61d3cbe Use span for .line, put code inside <code>
Apparently only "phrasing content" is permitted inside `pre` and
`code`, and `div` is not phrasing content, so replace it with `span`.
The `div` wasn't necessary anyway.

It also makes sense to put code inside a `code` tag (except for line
numbers).

Removed `overflow:auto` and `width: auto` as they didn't seem to be
necessary and actually prevented the horizontal scroll bar from appearing
when content didn't fit in the viewport.
2021-11-10 19:52:49 +11:00
Siavash Askari Nasr
5ed5054d73 Put lines in a div & code lines in a span, Add WrapLongLines Option 2021-11-09 19:14:33 +11:00
Stefan van der Walt
7cefa29980
Add WitchHazel by Thea Flowers (#570)
https://witchhazel.thea.codes/
2021-11-09 19:12:41 +11:00
Joseph Schorr
0fcd2d8e0e Add lexer/detector for zed
Reference: https://docs.authzed.com/reference/schema-lang

Used by project SpiceDB (https://github.com/authzed/spicedb), a database system for managing security-critical permissions checking, inspired by Google's Zanzibar paper
2021-10-27 15:18:05 +11:00
Martin Stühmer
202379826e
feat: Added .bicep lexer #562 (#564) 2021-10-23 21:48:00 +11:00
Guillaume Corré
175c35e532 Handle enclosing backtick in Kotlin lexer for identifiers, close #565 2021-10-23 08:50:29 +11:00
Martin Stühmer
4c36740e5e
Refinement of the C# lexer (#563) 2021-10-21 14:03:33 +11:00
Alec Thomas
0e8972ef9b Fix chromad build. 2021-10-17 09:07:08 +11:00
Alec Thomas
6520148857 Re-add release workflow. v0.9.4 2021-10-17 08:46:20 +11:00
Tom Lebreux
22ed667b6d Add binary number to go lexer v0.9.3 2021-09-28 07:50:31 +10:00
Tom Lebreux
785554e8b1 Add sieve lexer
Generated by pygments2chroma.py
2021-09-28 07:50:06 +10:00
Alec Thomas
4b2b42da1d Tidy go.mod 2021-09-27 14:40:46 +10:00
Alec Thomas
9a8a647afb Report file pattern errors when a lexer is initialised.
See #555
2021-09-27 14:27:46 +10:00
Alec Thomas
07a127dd74 Test for compiling all lexer patterns. 2021-09-27 14:19:39 +10:00
Steven Penny
23160b3058 lexers/internal: remove danwakefield/fnmatch
This package appears to be old and unmaintained, and it gets pulled into the
go.sum of every single package importing Chroma. Also, fnmatch doesn't seem to do
any error checking on the pattern, while the standard package does. Finally, the
standard package path/filepath is already imported, so switching appears to cost
nothing.
2021-09-27 13:37:55 +10:00
toshimaru
368394959b Update go modules
- mattn/go-colorable
- mattn/go-isatty
- stretchr/testify
2021-09-27 09:43:14 +10:00
Patrice Chalin
7259f5b254 BashSession lexer 2021-09-19 15:09:58 +10:00
Siavash Askari Nasr
4d7154e8c7 Raku: Fix unterminated heredoc fixes #547
- Fix heredoc
- Move testdata to raku directory
- Add testdata for unterminated heredoc
2021-09-19 08:41:00 +10:00
Guillermo León
82795e1420 Update to the last version of microsoft grammar file 2021-09-18 18:20:49 +10:00
Siddhant N Trivedi
04ed07d7b4
Update rust lexer (#548) 2021-09-18 17:11:51 +10:00
Alec Thomas
d1f987668b Add JSON test. 2021-09-11 11:33:47 +10:00
Andy Yankovsky
f7c1454f13 Support comments in JSON
The JSON spec doesn't allow comments, but many implementations support them anyway. Having comments is very convenient, for example when showing code in blog posts and articles.
2021-09-11 11:21:32 +10:00
Dmitry Titov
02951cec42
Implement 1S:Enterprise (#545) 2021-09-04 10:20:45 +10:00
Ville Skyttä
4436c497ee lexers: add Debian user configuration file backups to ignored suffixes
https://manpages.debian.org/bullseye/ucf/ucf.1.en.html
2021-08-30 07:36:04 +10:00
Siavash Askari Nasr
26f16a6f7c Raku: Improve nested regex and code, function designs
- Improve nested regex and code
- Improve function designs
- Add escape character support to single quotes and regex
- Fixes and cleanups
2021-08-29 07:10:23 +10:00
Siavash Askari Nasr
4f779665e9 Raku: Add more test data 2021-08-29 07:10:23 +10:00
Siavash Askari Nasr
22fac1fc0f Raku: Fix Match hash access, $<variable><key> 2021-08-29 07:10:23 +10:00
Siavash Askari Nasr
2535d1a99e Raku: Fix operators that come after < 2021-08-29 07:10:23 +10:00
Siavash Askari Nasr
bb38ae204f Raku: Fix detecting adverbs, plus a little cleanup 2021-08-29 07:10:23 +10:00
Stefan kruger
2eba3ce9a1 Update APL lexer to cope with current Dyalog APL
* Names can start with underscore
* Missing APL primitive ops: ⍥@⌺⌶⍢
* Missing APL primitive funcs: ⊆⍸
2021-08-20 08:28:03 +10:00
Siavash Askari Nasr
b71f4c6607 Raku: Fix incorrectly matching closing brackets as opening 2021-08-15 21:39:18 +10:00
Alec Thomas
7966c29526
Update README.md 2021-08-01 21:51:06 +10:00
Alec Thomas
61cfd7d7a7
Update README.md 2021-08-01 21:50:36 +10:00
Alec Thomas
594e3117f8
Update README.md 2021-08-01 21:49:50 +10:00
Siavash Askari Nasr
f4ffd6cea9 Fix Raku colon pair, function adverb and POD declaration 2021-08-01 21:48:42 +10:00
Alec Thomas
2cff0c9b1f Switch from Circle to GHA. 2021-08-01 14:38:38 +10:00
Nelo Mitranim
fb1dd01cfb [kotlin] expensive char list -> char classes
This reduces the lexer's init time by a factor of about 1000, from ≈350ms
to ≈350μs on my machine.
2021-07-31 19:07:24 +10:00
Nelo Mitranim
eafea0d771 replace expensive char lists with char classes
Huge hardcoded character lists have a cost. In some current lexers, such
as Haskell and JavaScript, this bloats lexer init time to 60-80ms on
some current systems, as opposed to sub-ms for many others.

Replacing them with character classes such as `\p{L}` seems to
eliminate this cost, reducing lexer init time to the norm (around 1ms).
In addition, this significantly reduces and simplifies the code.

The current tests pass, but there may be inaccuracies not covered by
tests. This requires a review.

This change is likely to cause edge case regressions, as the sets of
characters considered "letters" vary between languages. However, Chroma
lexers don't aim to be perfectly accurate. Performance should be just as
much a goal as accuracy. I believe this tradeoff to be justified.

This commit leaves at least two lexers unfixed: Julia and Kotlin.
Judging by the code, they might have the same issue, and should also be
addressed.
2021-07-27 22:04:33 +10:00
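
A minimal sketch of the character-class approach described in the commit above: an identifier pattern built from Unicode classes such as `\p{L}` rather than a hardcoded list of letters. It uses Go's standard regexp package as a stand-in, and the pattern and the `isIdent` helper are assumptions for illustration, not taken from any Chroma lexer.

```go
package main

import (
	"fmt"
	"regexp"
)

// identRE uses Unicode classes instead of a hardcoded character list:
// a letter or underscore, then letters, decimal digits, or underscores.
var identRE = regexp.MustCompile(`^[\p{L}_][\p{L}\p{Nd}_]*$`)

// isIdent reports whether s looks like an identifier under this pattern.
func isIdent(s string) bool {
	return identRE.MatchString(s)
}

func main() {
	for _, s := range []string{"foo", "größe", "変数_1", "1abc"} {
		fmt.Printf("%q %v\n", s, isIdent(s))
	}
}
```

As the commit notes, a class like `\p{L}` may not match exactly the set of letters a given language accepts, so edge cases can differ from a hand-written list.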