Details
-
Bug
-
Status: Closed
-
Major
-
Resolution: Fixed
-
0.16.0, 0.17.0
-
None
-
I'm running this on Ubuntu 20.04.4 LTS with Go version 1.16.3. I was able to reproduce this issue with both thrift 0.16.0 and the latest master.
Description
We're using the Thrift compiler to generate Go code for our parquet-go library. The parquet file format uses Thrift to serialize many of its internal data structures. A user advised us in an issue report that invalid input can cause the generated Go parser code for parquet's thrift data structure to crash, see e.g. here:
panic: runtime error: makeslice: cap out of range
goroutine 1 [running]:
github.com/fraugster/parquet-go/parquet.(*ColumnMetaData).ReadField2(0xc0001342c0, 0x6e5230, 0xc0000160c8, 0x6e6990, 0xc000118aa0, 0x2000f, 0x0)
/home/andreas/go/src/github.com/fraugster/parquet-go/parquet/parquet.go:7800 +0xd9
github.com/fraugster/parquet-go/parquet.(*ColumnMetaData).Read(0xc0001342c0, 0x6e5230, 0xc0000160c8, 0x6e6990, 0xc000118aa0, 0x0, 0x0)
/home/andreas/go/src/github.com/fraugster/parquet-go/parquet/parquet.go:7611 +0x56a
github.com/fraugster/parquet-go/parquet.(*ColumnChunk).ReadField3(0xc000060360, 0x6e5230, 0xc0000160c8, 0x6e6990, 0xc000118aa0, 0x3000c, 0x0)
/home/andreas/go/src/github.com/fraugster/parquet-go/parquet/parquet.go:9071 +0xa5
github.com/fraugster/parquet-go/parquet.(*ColumnChunk).Read(0xc000060360, 0x6e5230, 0xc0000160c8, 0x6e6990, 0xc000118aa0, 0x0, 0x0)
/home/andreas/go/src/github.com/fraugster/parquet-go/parquet/parquet.go:8965 +0x730
github.com/fraugster/parquet-go/parquet.(*RowGroup).ReadField1(0xc000060300, 0x6e5230, 0xc0000160c8, 0x6e6990, 0xc000118aa0, 0x1000f, 0x0)
/home/andreas/go/src/github.com/fraugster/parquet-go/parquet/parquet.go:9584 +0x19d
github.com/fraugster/parquet-go/parquet.(*RowGroup).Read(0xc000060300, 0x6e5230, 0xc0000160c8, 0x6e6990, 0xc000118aa0, 0x0, 0x0)
/home/andreas/go/src/github.com/fraugster/parquet-go/parquet/parquet.go:9480 +0x271
github.com/fraugster/parquet-go/parquet.(*FileMetaData).ReadField4(0xc000118a00, 0x6e5230, 0xc0000160c8, 0x6e6990, 0xc000118aa0, 0x4000f, 0x0)
/home/andreas/go/src/github.com/fraugster/parquet-go/parquet/parquet.go:11886 +0x1bd
github.com/fraugster/parquet-go/parquet.(*FileMetaData).Read(0xc000118a00, 0x6e5230, 0xc0000160c8, 0x6e6990, 0xc000118aa0, 0x673ea0, 0x1)
/home/andreas/go/src/github.com/fraugster/parquet-go/parquet/parquet.go:11753 +0x985
github.com/fraugster/parquet-go.readThrift(0x6e5230, 0xc0000160c8, 0x6e2060, 0xc000118a00, 0x6e20c0, 0xc00000e0d8, 0x0, 0x0)
/home/andreas/go/src/github.com/fraugster/parquet-go/helpers.go:108 +0xe7
github.com/fraugster/parquet-go.ReadFileMetaDataWithContext(0x6e5230, 0xc0000160c8, 0x6e3750, 0xc000074ea0, 0xc00005e201, 0x0, 0x0, 0x0)
/home/andreas/go/src/github.com/fraugster/parquet-go/file_meta.go:69 +0x56a
github.com/fraugster/parquet-go.ReadFileMetaData(0x6e3750, 0xc000074ea0, 0x1, 0x1, 0x0, 0x0)
/home/andreas/go/src/github.com/fraugster/parquet-go/file_meta.go:19 +0x8a
github.com/fraugster/parquet-go.NewFileReaderWithOptions(0x6e3750, 0xc000074ea0, 0xc00011fde8, 0x1, 0x1, 0x28, 0x67afa0, 0x489001)
/home/andreas/go/src/github.com/fraugster/parquet-go/file_reader.go:37 +0x472
github.com/fraugster/parquet-go.NewFileReader(0x6e3750, 0xc000074ea0, 0x0, 0x0, 0x0, 0x1872827b00000009, 0x625fd794, 0xc00011fe60)
/home/andreas/go/src/github.com/fraugster/parquet-go/file_reader.go:140 +0xa5
github.com/fraugster/parquet-go.FuzzFileReader(0x7fbfa1bfe000, 0x40, 0x40, 0x4)
/home/andreas/go/src/github.com/fraugster/parquet-go/reader_fuzz.go:13 +0xca
go-fuzz-dep.Main(0xc00011ff30, 0x9, 0x9)
go-fuzz-dep/main.go:36 +0x1b8
main.main()
github.com/fraugster/parquet-go/go.fuzz.main/main.go:31 +0xcd
exit status 2
parquet.go:7800 in this case is this line of code:
_, size, err := iprot.ReadListBegin(ctx)
if err != nil {
return thrift.PrependError("error reading list begin: ", err)
{{ }}}
tSlice := make([]Encoding, 0, size)
When debugging this, I've able to track this down to the generated code for ReadListBegin et al not checking the value of size before passing it on to make. This can cause the third parameter (slice capacity) to be negative, which in turn causes the panic above. I'm currently preparing a PR that fixes this issue which I will post shortly.