chroma

go/chroma

mirror of https://github.com/alecthomas/chroma.git synced 2025-02-13 13:28:27 +02:00

Author	SHA1	Message	Date
Alec Thomas	cc2dd5b8ad	Version 2 of Chroma
This cleans up the API in general, removing a bunch of deprecated stuff, cleaning up circular imports, etc. But the biggest change is switching to an optional XML format for the regex lexer.
Having lexers defined only in Go is not ideal for a couple of reasons. Firstly, it impedes a significant portion of contributors who use Chroma in Hugo, but don't know Go. Secondly, it bloats the binary size of any project that imports Chroma.
Why XML? YAML is an abomination and JSON is not human editable. XML also compresses very well (eg. Go template lexer XML compresses from 3239 bytes to 718).
Why a new syntax format? All major existing formats rely on the Oniguruma regex engine, which is extremely complex and for which there is no Go port.
Why not earlier? Prior to the existence of fs.FS this was not a viable option.
Benchmarks:
    $ hyperfine --warmup 3 \
        './chroma.master --version' \
        './chroma.xml-pre-opt --version' \
        './chroma.xml --version'
    Benchmark 1: ./chroma.master --version
      Time (mean ± σ): 5.3 ms ± 0.5 ms [User: 3.6 ms, System: 1.4 ms]
      Range (min … max): 4.2 ms … 6.6 ms 233 runs
    Benchmark 2: ./chroma.xml-pre-opt --version
      Time (mean ± σ): 50.6 ms ± 0.5 ms [User: 52.4 ms, System: 3.6 ms]
      Range (min … max): 49.2 ms … 51.5 ms 51 runs
    Benchmark 3: ./chroma.xml --version
      Time (mean ± σ): 6.9 ms ± 1.1 ms [User: 5.1 ms, System: 1.5 ms]
      Range (min … max): 5.7 ms … 19.9 ms 196 runs
    Summary
      './chroma.master --version' ran
        1.30 ± 0.23 times faster than './chroma.xml --version'
        9.56 ± 0.83 times faster than './chroma.xml-pre-opt --version'
A slight increase in init time, but I think this is okay given the increase in flexibility.
And binary size difference:
    $ du -h lexers.test*
    $ du -sh chroma*
    8.8M chroma.master
    7.8M chroma.xml
    7.8M chroma.xml-pre-opt
Incompatible changes:
- (RegexLexer).SetAnalyser: changed from func(func(text string) float32) RegexLexer to func(func(text string) float32) Lexer
- (TokenType).UnmarshalJSON: removed
- Lexer.AnalyseText: added
- Lexer.SetAnalyser: added
- Lexer.SetRegistry: added
- MustNewLazyLexer: removed
- MustNewLexer: changed from func(Config, Rules) RegexLexer to func(Config, func() Rules) RegexLexer
- Mutators: changed from func(...Mutator) MutatorFunc to func(...Mutator) Mutator
- NewLazyLexer: removed
- NewLexer: changed from func(Config, Rules) (RegexLexer, error) to func(Config, func() Rules) (*RegexLexer, error)
- Pop: changed from func(int) MutatorFunc to func(int) Mutator
- Push: changed from func(...string) MutatorFunc to func(...string) Mutator
- TokenType.MarshalJSON: removed
- Using: changed from func(Lexer) Emitter to func(string) Emitter
- UsingByGroup: changed from func(func(string) Lexer, int, int, ...Emitter) Emitter to func(int, int, ...Emitter) Emitter
v2.0.0-alpha1	2022-01-27 15:22:00 +11:00
Alec Thomas	d491f1b5c1	Helper scripts for on-the-fly running chroma{,d}.	2022-01-27 10:09:13 +11:00
silverwind	4ba9a05fee	Add mariadb aliases Fixes: https://github.com/alecthomas/chroma/issues/590	2022-01-16 10:29:45 +11:00
pu	0b2b5eec0a	Add[styles/gruvbox]: vim gruvbox theme for chroma (#592)	2022-01-14 00:18:06 +11:00
Alec Thomas	36bdd4b988	Benchmark Java instead of Go. v0.10.0	2022-01-12 21:49:38 +11:00
Jelle Hulter	b01c8fcab6	Added OpenEdgeABL lexer (#585)	2022-01-04 03:22:50 +11:00
James	decf9d3229	Highlight MSBuild project files for C#, F#, and C++ as XML. (#584)	2022-01-02 18:16:11 +11:00
Martin Stühmer	3bdc3fbe7a	Removed mimetype for the Lexer bicep (#574)	2021-12-20 18:12:21 +11:00
Tobias Neitzel	ac2891ff2a	Add support for more complex command prompts (#583) The bashsession lexer only recognized command prompts starting with one of the characters '#$%>'. This commit adds support for a prefixed block of the form `[.@.]`, which is a common layout for command prompts.	2021-12-20 18:04:05 +11:00
MsK`	d38fcfcce0	fixed scheme characters	2021-12-13 14:36:15 +11:00
Nicolas Jeannerod	06f476d11a	Add lexer for Typed and Untyped Plutus Core (#579)	2021-12-02 21:23:30 +11:00
Yang Yang	3f5761f9fe	Fix background color of xcode-dark style	2021-11-30 19:03:01 +11:00
Siavash Askari Nasr	d6e61d3cbe	Use `span` for `.line`, put code inside `<code>` Apparently only "phrasing content" is permitted inside `pre` and `code`, and `div` is not phrasing content, so replace it with `span`; `div` wasn't necessary anyway. Also, it makes sense to put code inside the `code` tag (except for line numbers). Removed `overflow:auto` and `width: auto` as they didn't seem to be necessary and actually prevented the horizontal scroll bar from appearing when content didn't fit in the viewport.	2021-11-10 19:52:49 +11:00
Siavash Askari Nasr	5ed5054d73	Put lines in a div & code lines in a span, Add WrapLongLines Option	2021-11-09 19:14:33 +11:00
Stefan van der Walt	7cefa29980	Add WitchHazel by Thea Flowers (#570) https://witchhazel.thea.codes/	2021-11-09 19:12:41 +11:00
Joseph Schorr	0fcd2d8e0e	Add lexer/detector for zed Reference: https://docs.authzed.com/reference/schema-lang Used by project SpiceDB (https://github.com/authzed/spicedb), a database system for managing security-critical permissions checking, inspired by Google's Zanzibar paper	2021-10-27 15:18:05 +11:00
Martin Stühmer	202379826e	feat: Added .bicep lexer #562 (#564)	2021-10-23 21:48:00 +11:00
Guillaume Corré	175c35e532	Handle enclosing backtick in Kotlin lexer for identifiers, close #565	2021-10-23 08:50:29 +11:00
Martin Stühmer	4c36740e5e	Refinement of the C# lexer (#563)	2021-10-21 14:03:33 +11:00
Alec Thomas	0e8972ef9b	Fix chromad build.	2021-10-17 09:07:08 +11:00
Alec Thomas	6520148857	Re-add release workflow. v0.9.4	2021-10-17 08:46:20 +11:00
Tom Lebreux	22ed667b6d	Add binary number to go lexer v0.9.3	2021-09-28 07:50:31 +10:00
Tom Lebreux	785554e8b1	Add sieve lexer Generated by pygments2chroma.py	2021-09-28 07:50:06 +10:00
Alec Thomas	4b2b42da1d	Tidy go.mod	2021-09-27 14:40:46 +10:00
Alec Thomas	9a8a647afb	Report file pattern errors when a lexer is initialised. See #555	2021-09-27 14:27:46 +10:00
Alec Thomas	07a127dd74	Test for compiling all lexer patterns.	2021-09-27 14:19:39 +10:00
Steven Penny	23160b3058	lexers/internal: remove danwakefield/fnmatch This package appears to be old and unmaintained, and it gets pulled into the go.sum of every single package importing Chroma. Also fnmatch doesn't seem to do any error checking on the pattern, while the standard package does. Also, the standard package path/filepath is already imported, so it appears not to cost anything to change it.	2021-09-27 13:37:55 +10:00
toshimaru	368394959b	Update go modules - mattn/go-colorable - mattn/go-isatty - stretchr/testify	2021-09-27 09:43:14 +10:00
Patrice Chalin	7259f5b254	BashSession lexer	2021-09-19 15:09:58 +10:00
Siavash Askari Nasr	4d7154e8c7	Raku: Fix unterminated heredoc fixes #547 - Fix heredoc - Move testdata to raku directory - Add testdata for unterminated heredoc	2021-09-19 08:41:00 +10:00
Guillermo León	82795e1420	Update to the last version of microsoft grammar file	2021-09-18 18:20:49 +10:00
Siddhant N Trivedi	04ed07d7b4	Update rust lexer (#548)	2021-09-18 17:11:51 +10:00
Alec Thomas	d1f987668b	Add JSON test.	2021-09-11 11:33:47 +10:00
Andy Yankovsky	f7c1454f13	Support comments in JSON JSON spec doesn't allow comments, but many implementations support them anyway. Having comments is very convenient, for example, when showing code in blog posts and articles.	2021-09-11 11:21:32 +10:00
Dmitry Titov	02951cec42	Implement 1S:Enterprise (#545)	2021-09-04 10:20:45 +10:00
Ville Skyttä	4436c497ee	lexers: add Debian user configuration file backups to ignored suffixes https://manpages.debian.org/bullseye/ucf/ucf.1.en.html	2021-08-30 07:36:04 +10:00
Siavash Askari Nasr	26f16a6f7c	Raku: Improve nested regex and code, function designs - Improve nested regex and code - Improve function designs - Add escape character support to single quotes and regex - Fixes and cleanups	2021-08-29 07:10:23 +10:00
Siavash Askari Nasr	4f779665e9	Raku: Add more test data	2021-08-29 07:10:23 +10:00
Siavash Askari Nasr	22fac1fc0f	Raku: Fix Match hash access, $<variable><key>	2021-08-29 07:10:23 +10:00
Siavash Askari Nasr	2535d1a99e	Raku: Fix operators that come after `<`	2021-08-29 07:10:23 +10:00
Siavash Askari Nasr	bb38ae204f	Raku: Fix detecting adverbs, plus a little cleanup	2021-08-29 07:10:23 +10:00
Stefan kruger	2eba3ce9a1	Update APL lexer to cope with current Dyalog APL * Names can start with underscore * Missing APL primitive ops: ⍥@⌺⌶⍢ * Missing APL primitive funcs: ⊆⍸	2021-08-20 08:28:03 +10:00
Siavash Askari Nasr	b71f4c6607	Raku: Fix incorrectly matching closing brackets as opening	2021-08-15 21:39:18 +10:00
Alec Thomas	7966c29526	Update README.md	2021-08-01 21:51:06 +10:00
Alec Thomas	61cfd7d7a7	Update README.md	2021-08-01 21:50:36 +10:00
Alec Thomas	594e3117f8	Update README.md	2021-08-01 21:49:50 +10:00
Siavash Askari Nasr	f4ffd6cea9	Fix Raku colon pair, function adverb and POD declaration	2021-08-01 21:48:42 +10:00
Alec Thomas	2cff0c9b1f	Switch from Circle to GHA.	2021-08-01 14:38:38 +10:00
Nelo Mitranim	fb1dd01cfb	[kotlin] expensive char list -> char classes This reduces the lexer's init time by about x1000, from ≈350ms to ≈350μs on my machine.	2021-07-31 19:07:24 +10:00
Nelo Mitranim	eafea0d771	replace expensive char lists with char classes Huge hardcoded character lists have a cost. In some current lexers, such as Haskell and JavaScript, this bloats lexer init time to 60-80ms on some current systems, as opposed to sub-ms for many others. Replacing them with character classes such as `\p{L}` seems to eliminate this cost, reducing lexer init time to the norm (around 1ms). In addition, this significantly reduces and simplifies the code. The current tests pass, but there may be inaccuracies not covered by tests. This requires a review. This change is likely to cause edge case regressions, as the sets of characters considered "letters" vary between languages. However, Chroma lexers don't aim to be perfectly accurate. Performance should be just as much a goal as accuracy. I believe this tradeoff to be justified. This commit leaves at least two lexers unfixed: Julia and Kotlin. Judging by the code, they might have the same issue, and should also be addressed.	2021-07-27 22:04:33 +10:00
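
The trade-off described in eafea0d771 can be illustrated with Go's standard `regexp` package. This is only a sketch under stated assumptions: the `explicitLetters` helper, the chosen rune ranges, and the sample words are hypothetical stand-ins for Chroma's hardcoded lists, not code taken from the repository.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// explicitLetters builds an anchored pattern that spells out every rune in
// the given inclusive ranges, mimicking a large hardcoded character list.
// Hypothetical helper for illustration: the ranges must contain only letters,
// so nothing needs escaping inside the brackets.
func explicitLetters(ranges [][2]rune) string {
	var sb strings.Builder
	sb.WriteString("^[")
	for _, r := range ranges {
		for c := r[0]; c <= r[1]; c++ {
			sb.WriteRune(c)
		}
	}
	sb.WriteString("]+$")
	return sb.String()
}

var (
	// Latin letters plus lowercase Greek and Cyrillic, listed one by one.
	listRe = regexp.MustCompile(explicitLetters([][2]rune{
		{'a', 'z'}, {'A', 'Z'}, {'α', 'ω'}, {'а', 'я'},
	}))
	// The Unicode "letter" class covers these scripts and all others.
	classRe = regexp.MustCompile(`^\p{L}+$`)
)

func main() {
	for _, word := range []string{"hello", "αβγ", "привет", "日本語", "x1"} {
		fmt.Printf("%-8s list=%v class=%v\n",
			word, listRe.MatchString(word), classRe.MatchString(word))
	}
}
```

The two patterns agree on the Latin, Greek and Cyrillic samples, but only `\p{L}` accepts `日本語`. That is the kind of edge-case difference the commit message warns about: a language's own notion of "letter" may be narrower or wider than the Unicode class.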


559 Commits