mirror of https://github.com/alecthomas/chroma.git synced 2025-02-09 13:23:51 +02:00

516 Commits

Author SHA1 Message Date
FlyingStitchman
f3ff20b1f5
fix(nix): nix lexer missing '=' operator (#1031)
The nix lexer was missing the '=' operator in its operators rule. This
meant that in some cases (like the first '=' in a code block), it would
emit an error token instead of the proper highlighting.
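As a quick sanity check, here is a minimal sketch (assuming the chroma v2 import paths) that tokenizes a plain nix assignment and reports any error tokens; with '=' in the operators rule there should be none:

``` go
package main

import (
	"fmt"

	"github.com/alecthomas/chroma/v2"
	"github.com/alecthomas/chroma/v2/lexers"
)

func main() {
	lexer := lexers.Get("nix")
	it, err := lexer.Tokenise(nil, "x = 1;")
	if err != nil {
		panic(err)
	}
	// With '=' in the operators rule, no Error tokens should be
	// emitted for a plain assignment like this one.
	for tok := it(); tok != chroma.EOF; tok = it() {
		if tok.Type == chroma.Error {
			fmt.Printf("error token: %q\n", tok.Value)
		}
	}
}
```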

Before:

![screenshot](https://github.com/user-attachments/assets/a34dd5e1-ea6b-4898-a8ef-81cf9b9f40fe)
After:

![screenshot](https://github.com/user-attachments/assets/eb59697c-b828-4c33-bb12-19f67432a99d)
2024-12-25 19:37:32 +11:00
JakobDev
99c5b1e0b7
Import NSIS Lexer from Pygments (#1026)
This imports `pygments.lexers.installers.NSISLexer` from Pygments into
Chroma.

Implements #459
2024-12-03 08:24:09 +11:00
Dennis Wagelaar
b07ff27bb0
Add Eclipse ATL language (https://eclipse.dev/atl/) (#1024)
I would like to use ATL syntax highlighting on the ATL website, which
is generated using Hugo.
2024-12-02 11:34:48 +11:00
Carter Li
df769f9366
feat(JSON): support .jsonc extension (#1022)
JSONC is JSON with comments. The JSON lexer already supports comments.
2024-11-19 14:08:00 +11:00
cloudchamb3r
539d031254
Remove whitespace tokenizing rule in markdown lexer (#1008)
Fixes #744
Related issue: go-gitea/gitea#32136

The current markdown lexer tokenizes by whitespace, but in markdown
syntax this implementation can cause errors.
2024-11-14 19:45:38 +11:00
Frederik Kvartborg Albertsen
8d04e7e2f1
add any as a builtin type for go (#1021)
Thanks for this awesome Go module.

I have noticed that the builtin `any` type in Go isn't highlighted, so I
have added it in this PR. :)
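For reference, `any` has been a predeclared alias for `interface{}` since Go 1.18, so a snippet like this should now highlight it as a builtin type:

``` go
package main

import "fmt"

// any is a predeclared alias for interface{} (Go 1.18+); the lexer
// now highlights it like the other builtin types.
func describe(v any) string {
	return fmt.Sprintf("%T: %v", v, v)
}

func main() {
	fmt.Println(describe(42))
}
```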
2024-11-14 12:32:04 +11:00
Steven Kalt
23ed7b98d2
fix(typescript): highlight string literal type parameters (#1010)
Fixes #1009
2024-11-14 08:48:23 +11:00
Johan Walles
e76e1e2233
Add CSV lexer (#1005) 2024-11-11 08:03:57 +11:00
Gusted
5e7b53e590
fix: add underscore parsing in numbers for haskell (#1020) 2024-11-02 10:48:24 +11:00
John Olheiser
2e669a272c
Add Jsonnet Lexer (#1011)
No open issue, but I wanted this for myself so I figured I'd open a PR
as well. 🙂

I've used the third example from the jsonnet site as the test:
https://jsonnet.org/

The lexer was converted using the pygments converter, and here's a
quick screenshot of the playground:

![Screenshot 2024-10-15 115533](https://github.com/user-attachments/assets/d0b6d054-4e42-4911-ba67-670bdc55bf3e)

It seems to have a small problem with strings as keys, but otherwise
looks close.

Signed-off-by: jolheiser <git@jolheiser.com>
2024-10-16 05:29:14 +11:00
oliverpool
114c584bd6
Add Typst Lexer (#1007) 2024-10-07 09:05:36 +11:00
Fredrare
a56e228d32
Update TypeScript lexer to allow nested generics (#1002)
https://github.com/alecthomas/chroma/issues/425 shows that generics are
not correctly identified in general; they are instead treated as JSX
elements. I proposed a simple workaround in the comments, adding a
space between `<` and the word next to it, but I believe most people
will either not find that workaround or will find it rather
unappealing.

For this reason, I made the JSX rules recursive and added a `","`
`Punctuation` token inside, so that any number of type parameters can
be used and generics can be nested. While I am not really fond of this
hack, given that generics are already treated as JSX elements, I think
this is a fair and easy enough solution for most cases.

#### Before
<img width="359" alt="Screenshot 2024-09-20 at 9 28 05 PM"
src="https://github.com/user-attachments/assets/b03c2c8a-3278-438b-8803-00eb62cc4a17">

#### With spacing solution
<img width="392" alt="Screenshot 2024-09-20 at 9 30 13 PM"
src="https://github.com/user-attachments/assets/89289476-c92a-41df-b893-5ab289fa96aa">

#### With recursive JSX and `","` `Punctuation` token
<img width="362" alt="Screenshot 2024-09-20 at 9 55 11 PM"
src="https://github.com/user-attachments/assets/d77d892e-667d-4fb4-93cf-8227d5bd4b17">
2024-09-22 07:46:46 +10:00
Aru Sahni
10c0c524c4
Update the Materialize lexer (#1001)
Syncing over the latest keywords for the Materialize lexer.

Thank you for the review!
2024-09-21 03:22:07 +10:00
Jannis
a6994de218
add beef syntax and tests (#995)
This adds #994, Beeflang.
The syntax itself is just a modified version of C# with a few keywords
added or removed.
I've also added a test and ran it until it worked.

I am unsure if the mime_type is relevant to keep in there, since I
don't know where that is actually used.
Another thing I'm unsure of is how the selection of which
language/syntax to use for a file actually works, i.e. does this
actually choose Beef syntax highlighting over Brainfuck when used on a
Beef (.bf) file? For that I would probably need help.

Co-authored-by: Booklordofthedings <Booklordofthedings@tutanota.com>
2024-08-28 02:46:33 +10:00
Gusted
4d11870090
Don't output extra whitespace in YAML multiline (#993)
This resolves a particular issue with parsing YAML multiline, for
example:
```yaml
a: |
  multiline literal
  line 2
```

The regex used would capture the amount of indentation in the third
capture group and then use that as a kind of "status" to know which
lines are part of the indented multiline. However, because it's a
captured group, it has to be assigned a token, which was
`TextWhitespace`. This meant that the indentation was output after the
multiline. Technically it should be a non-capturing group, but then it
could no longer be referred to in the regex. Therefore I've gone with
the solution of adding a new token, Ignore, which is not emitted by
the iterator and can safely be used to make use of capture groups
without having them show up in the output.
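Schematically (hypothetical pattern, not the actual YAML lexer rule, and assuming the new token type is exposed as chroma.Ignore), the idea looks like this:

``` go
package main

import "github.com/alecthomas/chroma/v2"

// The final capture group matches the indentation that drives the
// multiline matching, but emitting it as Ignore means the iterator
// drops it instead of outputting stray whitespace.
var multilineRule = chroma.Rule{
	Pattern: `(\|)(\n)( +)`,
	Type:    chroma.ByGroups(chroma.Punctuation, chroma.TextWhitespace, chroma.Ignore),
}

func main() { _ = multilineRule }
```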

## Before

![image](https://github.com/user-attachments/assets/c29353c5-9e15-4f14-a733-57a60fb51910)

## After

![image](https://github.com/user-attachments/assets/57b5d129-a9d3-4b84-ae1f-dc05182b9ad3)
2024-08-23 06:58:31 +10:00
Koki Fukuda
895a0488b5
Port Minecraft lexers from Pygments (#992)
Ported lexers for mcfunction and SNBT from Pygments using the
`pygments2chroma_xml.py` script.

While doing so, I encountered the lack of `LiteralNumberByte` in
TokenType, so I've added the type and regenerated tokentype_enumer.go.
2024-08-23 06:04:41 +10:00
Aru Sahni
9221c6d919
Update the Materialize lexer (#987) 2024-08-03 07:38:08 +10:00
Mikhail Sorochan
3044bf5f32
Go lexer: single line comment without consuming endline, disable EnsureNL (#984)
This PR changes `CommentSingle` so that it no longer consumes the
trailing newline as part of the comment.
That solves the problem of a single-line comment not being parsed at
the end of a line or at the end of the file, which was reported
earlier as the reason single-line comments were not highlighted
properly.

Disabling `EnsureNL: true` avoids adding an unnecessary newline
element to `Text` and `CommentSymbol` tokens. Using chroma in a
console with syntax highlighting was unusable because of this, since
typing e.g. `b := ` added a newline each time a space was at the end
when the host app asked `quick` for highlighted text.

Tokens behavior:
Before:
``` go
t.Run("Single space", func(t *testing.T) {
        tokens, _ := chroma.Tokenise(Go, nil, " ")
        expected := []chroma.Token{
                {chroma.Text, " \n"},
        }
        assert.Equal(t, expected, tokens)
})
t.Run("Assignment unfinished", func(t *testing.T) {
        tokens, _ := chroma.Tokenise(Go, nil, "i = ")
        expected := []chroma.Token{
                { chroma.NameOther, "i" },
                { chroma.Text, " " },
                { chroma.Punctuation, "=" },
                { chroma.Text, " \n" },
        }
        assert.Equal(t, expected, tokens)
})
t.Run("Single comment", func(t *testing.T) {
        tokens, _ := chroma.Tokenise(Go, nil, "// W")
        expected := []chroma.Token{
                { chroma.CommentSingle, "// W\n" },
        }
        assert.Equal(t, expected, tokens)
})
```

After:
``` go
t.Run("Single space", func(t *testing.T) {
        tokens, _ := chroma.Tokenise(Go, nil, " ")
        expected := []chroma.Token{
                {chroma.Text, " "},
        }
        assert.Equal(t, expected, tokens)
})
t.Run("Assignment unfinished", func(t *testing.T) {
        tokens, _ := chroma.Tokenise(Go, nil, "i = ")
        expected := []chroma.Token{
                { chroma.NameOther, "i" },
                { chroma.Text, " " },
                { chroma.Punctuation, "=" },
                { chroma.Text, " " },
        }
        assert.Equal(t, expected, tokens)
})
t.Run("Single comment", func(t *testing.T) {
        tokens, _ := chroma.Tokenise(Go, nil, "// W")
        expected := []chroma.Token{
                { chroma.CommentSingle, "// W" },
        }
        assert.Equal(t, expected, tokens)
})
```
2024-07-23 02:19:08 +10:00
Vlad Dimov
40e5e9989e
Add JSONata Lexer (#983)
Add lexer for [JSONata](https://docs.jsonata.org/overview.html)
including test data.
2024-07-12 05:24:31 +10:00
wackbyte
76772513ea
feat(lexers/hare): add done keyword (#979)
Adds the `done` keyword described in [this blog
post](https://harelang.org/blog/2024-04-01-introducing-for-each-loops-in-hare/)
and introduced in
[0.24.2-rc1](https://lists.sr.ht/~sircmpwn/hare-dev/%3CD2FVIMECX52X.2MYXMW7OG4UIK@cmpwn.com%3E).
2024-07-06 21:40:16 +10:00
Aru Sahni
d1034f8fe1
Update the Materialize lexer (#978)
This introduces some additional keywords. I've also scripted this on
our end, hence the changes in formatting and in the encoding of
certain characters in attribute values.

This also includes:

- Some tests
- Updates to the README to call out the `--csrf-key` argument for
  chromad; without it, securecookie throws an error.
2024-06-29 07:27:47 +10:00
Joe Mooring
2060600780
styles: Fix Gleam alias (#973) 2024-06-09 08:46:02 +10:00
Simran
d7cf9378e7
AQL: Add builtin functions introduced in v3.12 (#968)
New functions added in 3.12.0 (`PARSE_COLLECTION()`, `PARSE_KEY()`,
`RANDOM()`, `REPEAT()`, `TO_CHAR()`) plus the current state of the
upcoming 3.12.1 (`ENTRIES()`).

Some functions were also removed (`CALL_GREENSPUN()`, `PREGEL_RESULT()`)
but I think it makes sense to keep them in the definition so that
highlighting older queries still works.
2024-05-30 08:12:17 +10:00
Paul Jolly
1e983e734d
lexers/cue: support CUE attributes (#961)
Currently the following CUE results in the chroma lexer producing an
error token for the '@':

    value: string @go(Value)

This code is, however, valid CUE. '@go' is an attribute.

This change adds lexer support for attributes in CUE.
2024-04-28 19:32:07 +10:00
Pi-Cla
9347b550ab
Add Gleam syntax highlighting (#959)
Resolves #958
2024-04-12 15:54:38 +10:00
Gusted
736c0ea4cd
Typescript: Several fixes (#952) 2024-04-03 10:57:12 +11:00
Gusted
e5c25d045e
Org: Keep all newlines (#951)
- Move the `*` operator inside the capturing group so all characters
that match `\s` are preserved, not just the first one.
- Add tests.
- Resolves https://github.com/alecthomas/chroma/issues/845

<details><summary>Input</summary>

```orgmode
Generic text.

/Italicized/

*strong*

=Heading=

~Codeblock~

+StrikeTrough+

_Underlined_
```

</details>

## Before
![Screen Shot 2024-04-01 at 23 26 15](https://github.com/alecthomas/chroma/assets/25481501/0ddc4b34-c434-4798-9b2a-0773b2be4f05)

## After
![Screen Shot 2024-04-01 at 23 25 52](https://github.com/alecthomas/chroma/assets/25481501/ad46591d-8d3c-421e-8564-d3d36abe6fa1)
2024-04-02 08:40:43 +11:00
Gusted
5f83664c2a
Vue: Handle more edge cases (#950)
- The parsing of `v-` directives expected them to always end in `">`;
however, other attributes may follow a `v-` directive. It also assumed
there couldn't be any spaces in the value of the directive. This could
result in incorrect lexing where almost the whole file was marked as a
LiteralString.
- Handle `-` in HTML element names.
- Explicitly mark `=` as an operator token.
- Tests added.
- Ref: https://codeberg.org/forgejo/forgejo/issues/2945

## Before
![Screen Shot 2024-04-01 at 13 58 18](https://github.com/alecthomas/chroma/assets/25481501/e33d87fe-7788-40ca-a3c5-bbd964b91cb5)

## After
![Screen Shot 2024-04-01 at 23 01 17](https://github.com/alecthomas/chroma/assets/25481501/b6b0fec0-6dfc-4dfe-93cd-8a7c42a05ef7)
2024-04-02 08:02:14 +11:00
farcop
2580aaaa8e
Add Bazel bzlmod support into Python lexer (#947)
https://bazel.build/concepts/build-ref

"such a boundary marker file could be MODULE.bazel, REPO.bazel, or in
legacy contexts, WORKSPACE or WORKSPACE.bazel."
2024-03-14 07:33:16 +11:00
Gusted
4e60c816eb
C#: Allow for empty comments (#943) 2024-03-04 11:01:17 +11:00
Andrzej Lichnerowicz
fe5dde8cf4
Add Lexer for NDISASM (#933)
The Netwide Disassembler outputs text that is assembly prefixed with
memory offsets and opcode bytes.
2024-02-28 00:17:29 +11:00
Paul Jolly
898d467079
lexers/cue: support definitions and dollars in field names (#935)
'$' is valid in a bare field name in CUE. It is not special. It happens
to be used by convention at the start of field names, but that is about
it.

Definitions start with '#'.

Add a "section" of tests that cover the various types of field. There
are no errors in new "tests", whereas before (i.e. without the change to
the CUE lexer) there would have been.
2024-02-27 07:45:59 +11:00
Francis Lavoie
381050ba00
Major updates to Caddyfile lexer (#932)
* Major updates to Caddyfile lexer

* yaml editorconfig

* nolint false positive
2024-02-20 20:08:27 +11:00
W. Michael Petullo
7ce2caf57a
Fix lexers check when built with newer Go (#928)
Newer versions of Go emit '\f' rather than "\u000c" for formfeeds. This
patch adjusts the expected values for the lexers tests to match.

Fixes: https://github.com/alecthomas/chroma/issues/927

See also: https://github.com/golang/go/issues/64346 for the change to
Go.

Signed-off-by: W. Michael Petullo <mike@flyn.org>
2024-02-15 07:37:06 +11:00
Abhinav Gupta
506e36f9e0
fix(lexers/go): "~" is a valid token (#926)
With the introduction of generics,
tilde is a valid punctuation token in Go programs.
https://go.dev/ref/spec#Operators_and_punctuation

This updates the punctuation regex for the Go lexer,
and adds a test to ensure that it's treated as such.
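For reference, the token appears in type constraints; for example:

``` go
package main

import "fmt"

// The tilde in a constraint means "any type whose underlying type is
// int (or int64)"; this punctuation was introduced with generics in
// Go 1.18.
type Integer interface {
	~int | ~int64
}

func sum[T Integer](xs []T) T {
	var total T
	for _, x := range xs {
		total += x
	}
	return total
}

func main() {
	fmt.Println(sum([]int{1, 2, 3}))
}
```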
2024-02-12 14:51:38 +11:00
silverwind
4c6fdb1d11
Add .avsc to JSON lexer (#920) 2024-01-26 07:20:16 +11:00
Jamie Tanna
ee60f7e738
Add missing token types for Rego + add Rego to README (#919) 2024-01-24 07:45:58 +11:00
Jamie Tanna
ae36e63369
Add support for Rego syntax (#918) 2024-01-23 08:01:11 +11:00
Luke Marzen
ebc34cf4aa
fix file extension typo, remove redundant parens (#914) 2024-01-15 11:26:41 +11:00
Philipp Hagenlocher
641b06ff8c
Fix type operators not being recognised in Haskell (#913) 2024-01-09 12:20:39 +11:00
Silvan Calarco
a8704a8f0b
Add lexer for RPMSpec (#907) 2024-01-03 08:45:31 +11:00
Luke Marzen
eb47752470
Add lexer for Promela (#906) 2024-01-02 16:11:09 +11:00
JakobDev
016768bc14
Add desktop entry lexer (#903) 2023-12-19 22:44:28 +11:00
renovate[bot]
76039a532f
chore(deps): update all non-major dependencies (#897)
* chore(deps): update all non-major dependencies

* fix: fixes for test and lint

Tests were slow because we were deep-comparing lexers. This is
unnecessary, as a shallow comparison suffices.
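As a rough illustration of why (hypothetical type and names, not the actual test code), deep equality walks every rule while shallow equality is a single pointer comparison:

``` go
package main

import (
	"fmt"
	"reflect"
)

// lexerConfig stands in for a registered lexer with a large rule set.
type lexerConfig struct {
	Name  string
	Rules []string
}

func main() {
	a := &lexerConfig{Name: "Go", Rules: make([]string, 1000)}
	b := a
	// Deep comparison walks all 1000 rules on every check...
	fmt.Println(reflect.DeepEqual(a, b))
	// ...while pointer equality suffices when both sides refer to
	// the same registered lexer instance.
	fmt.Println(a == b)
}
```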

---------

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: Alec Thomas <alec@swapoff.org>
2023-12-08 14:22:40 +11:00
Aru Sahni
678b799ad6
Add a lexer for the Materialize SQL dialect (#896) 2023-12-08 11:21:23 +11:00
Mohammed Anas
d77dc8ab51
Add Hare lexer (#882) 2023-11-29 10:00:30 +11:00
Daniel
234600703b
Add an Alloy lexer (#892) 2023-11-26 08:52:31 +11:00
Daniel
08be6f023f
Add an Agda lexer (#891) 2023-11-25 19:54:04 +11:00
codiacdev
d9f6ed634e
added: rule to objectpascal lexer (#888)
* added: another rule for objectpascal lexer

* updated: objectpascal lexer testdata

* fixed: processing of control (escape) characters

* updated: test data for objectpascal lexer
2023-11-24 06:53:33 +11:00
silverwind
c5948a61b1
Add Vagrantfile to Ruby lexer (#890) 2023-11-24 06:50:30 +11:00