mirror of https://github.com/pbnjay/grate.git synced 2024-12-12 13:35:18 +02:00
A Go native tabular data extraction package. Currently supports .xls, .xlsx, .csv, .tsv formats.
cmd expose data types through interface 2021-02-22 00:01:17 -05:00
commonxl expose data types through interface 2021-02-22 00:01:17 -05:00
simple expose data types through interface 2021-02-22 00:01:17 -05:00
xls bugfix and quiet warnings 2021-02-22 01:07:09 -05:00
xlsx refactor sheets and formatting so we can use for type detection 2021-02-21 23:29:48 -05:00
.gitignore misc cleanups 2021-02-14 14:16:46 -05:00
errs.go more consistent error handling 2021-02-12 10:44:23 -05:00
go.mod of course when I backport they release 1.16... 2021-02-17 02:17:02 -05:00
grate.go expose data types through interface 2021-02-22 00:01:17 -05:00
LICENSE switch to a less restrictive license 2021-02-17 12:27:12 -05:00
README.md switch to a less restrictive license 2021-02-17 12:27:12 -05:00

grate

A Go native tabular data extraction package. Currently supports .xls, .xlsx, .csv, .tsv formats.

Why?

Grate focuses on speed and stability first, and makes no attempt to parse charts, figures, or other content types that may be embedded in the input files. It tries to perform as few allocations as possible and errs on the side of caution.

There are certainly still some bugs and edge cases, but we have run it successfully on a set of 400k .xls and .xlsx files to catch many bugs and error conditions. Please file an issue with any feedback and additional problem files.

Usage

Grate provides a simple standard interface for all supported filetypes, allowing access to both named worksheets in spreadsheets and single tables in plaintext formats.

package main

import (
    "fmt"
    "log"
    "os"
    "strings"

    "github.com/pbnjay/grate"
    _ "github.com/pbnjay/grate/simple" // tsv and csv support
    _ "github.com/pbnjay/grate/xls"
    _ "github.com/pbnjay/grate/xlsx"
)

func main() {
    if len(os.Args) < 2 {
        log.Fatalf("usage: %s <file>", os.Args[0])
    }
    wb, err := grate.Open(os.Args[1]) // open the file
    if err != nil {
        log.Fatal(err)
    }
    sheets, err := wb.List() // list available sheets
    if err != nil {
        log.Fatal(err)
    }
    for _, s := range sheets { // enumerate each sheet name
        sheet, err := wb.Get(s) // open the sheet
        if err != nil {
            log.Fatal(err)
        }
        for sheet.Next() { // enumerate each row of data
            row := sheet.Strings() // get the row's content as []string
            fmt.Println(strings.Join(row, "\t"))
        }
    }
    wb.Close()
}
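
The blank imports in the example are there only for their side effects. That suggests each format package registers itself with the core package on import, the same pattern `database/sql` drivers and `image` decoders use in the standard library. The sketch below illustrates that pattern using only the standard library. It is not grate's code: `Rows`, `OpenFunc`, `Register`, `Open`, `errNotThisFormat`, and the toy `.tsv` opener with canned data are all hypothetical names made up for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// Rows is a hypothetical, pared-down row iterator (illustration only).
type Rows interface {
	Next() bool
	Strings() []string
}

// OpenFunc tries to open a file in one specific format.
type OpenFunc func(name string) (Rows, error)

// errNotThisFormat tells Open to try the next registered format.
var errNotThisFormat = errors.New("not this format")

// openers holds every registered format, keyed by name.
var openers = map[string]OpenFunc{}

// Register adds a format. A real format package would call it from init().
func Register(format string, fn OpenFunc) { openers[format] = fn }

// Open asks each registered format in turn until one accepts the file.
func Open(name string) (Rows, error) {
	for _, fn := range openers {
		rows, err := fn(name)
		if errors.Is(err, errNotThisFormat) {
			continue
		}
		return rows, err
	}
	return nil, fmt.Errorf("%s: no registered format can open this file", name)
}

// sliceRows iterates over rows already held in memory.
type sliceRows struct {
	data [][]string
	pos  int
}

func (r *sliceRows) Next() bool        { r.pos++; return r.pos <= len(r.data) }
func (r *sliceRows) Strings() []string { return r.data[r.pos-1] }

// openTSV is a toy opener: it accepts any *.tsv name and returns canned rows
// instead of reading from disk.
func openTSV(name string) (Rows, error) {
	if !strings.HasSuffix(name, ".tsv") {
		return nil, errNotThisFormat
	}
	return &sliceRows{data: [][]string{{"id", "name"}, {"1", "grate"}}}, nil
}

// In a separate package this init would run because of a blank import.
func init() { Register("tsv", openTSV) }

func main() {
	rows, err := Open("demo.tsv")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for rows.Next() {
		fmt.Println(strings.Join(rows.Strings(), "\t"))
	}
	if _, err := Open("demo.ods"); err != nil {
		fmt.Println("error:", err)
	}
}
```

With this design the caller only ever touches `Open` and the `Rows` interface, so supporting another format means adding one more import rather than changing the calling code.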

License

All source code is licensed under the MIT License.