Details
- Type: Bug
- Status: Closed
- Priority: Major
- Resolution: Fixed
- 0.5
- None
- Environment: OSX, Ruby 1.9.2, Thrift Gem version 0.5.0
Description
I ran into an encoding issue in the Thrift library, specifically in the BufferedTransport class.
I wrote a few tests to give you a concrete example:
# encoding: utf-8
require 'spec_helper'

describe "encoding" do
  before do
    transport = Thrift::BufferedTransport.new(Thrift::Socket.new(MR_CONFIG['host'], 9090))
    protocol = Thrift::BinaryProtocol.new(transport)
    @client = Apache::Hadoop::Hbase::Thrift::Hbase::Client.new(protocol)
    transport.open()
    @table_name = "encoding_test"
    @column_family = "info:"
  end

  it "should create a new table" do
    column = Apache::Hadoop::Hbase::Thrift::ColumnDescriptor.new
    @client.createTable(@table_name, [column]).should be_nil
  end

  it "should save standard caracteres" do
    m = Apache::Hadoop::Hbase::Thrift::Mutation.new
    m.column = "info:first_name"
    m.value = "Vincent"
    m.value.encoding.should == Encoding::UTF_8
    @client.mutateRow(@table_name, "ID1", [m]).should be_nil
  end

  it "should save UTF8 caracteres" do
    m = Apache::Hadoop::Hbase::Thrift::Mutation.new
    m.column = "info:first_name"
    m.value = "Thorbjørn"
    m.value.encoding.should == Encoding::UTF_8
    @client.mutateRow(@table_name, "ID1", [m]).should be_nil
  end

  it "should destroy the table" do
    @client.disableTable(@table_name).should be_nil
    @client.deleteTable(@table_name).should be_nil
  end
end
The spec fails when it tries to save the UTF-8 string containing the character 'ø'.
Here is the output:
1) encoding should save UTF8 caracteres
   Failure/Error: @client.mutateRow(@table_name, "ID1", [m]).should be_nil
   incompatible character encodings: ASCII-8BIT and UTF-8
   # /Users/vincentp/.rvm/gems/ruby-1.9.2-p0/gems/thrift-0.5.0/lib/thrift/transport/buffered_transport.rb:59:in `write'
   # /Users/vincentp/.rvm/gems/ruby-1.9.2-p0/gems/thrift-0.5.0/lib/thrift/protocol/binary_protocol.rb:107:in `write_string'
   # /Users/vincentp/.rvm/gems/ruby-1.9.2-p0/gems/thrift-0.5.0/lib/thrift/client.rb:35:in `write'
   # /Users/vincentp/.rvm/gems/ruby-1.9.2-p0/gems/thrift-0.5.0/lib/thrift/client.rb:35:in `send_message'
   # ./lib/thrift/hbase.rb:289:in `send_mutateRow'
   # ./lib/thrift/hbase.rb:284:in `mutateRow'
   # ./spec/thrift/cases/encoding_spec.rb:37:in `block (2 levels) in <top (required)>'
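The same error class can be reproduced in plain Ruby 1.9+, without Thrift or HBase. The following is a minimal sketch, assuming the write buffer already holds binary (ASCII-8BIT) data with at least one non-ASCII byte when the UTF-8 value is appended. `naive_append` and `binary_append` are hypothetical helpers written for illustration; they are not the gem's code.

```ruby
# encoding: utf-8

# Hypothetical helper (not from the Thrift gem): appends data to a
# buffer directly, the way a plain String#<< would.
def naive_append(buffer, data)
  buffer << data
end

# Hypothetical workaround: reinterpret the data as raw bytes first.
# dup keeps the caller's string and its encoding untouched.
def binary_append(buffer, data)
  buffer << data.dup.force_encoding(Encoding::BINARY)
end

# Array#pack returns an ASCII-8BIT string; 0xC8 is a non-ASCII byte.
header = [200].pack('N')

# Works: the UTF-8 string contains only ASCII characters.
naive_append(header.dup, "Vincent")

begin
  # Fails: both sides contain non-ASCII bytes in different encodings.
  naive_append(header.dup, "Thorbjørn")
rescue Encoding::CompatibilityError => e
  puts e.message
end

# Works: 4 header bytes plus the 10 UTF-8 bytes of the name.
out = binary_append(header.dup, "Thorbjørn")
puts out.bytesize
```

In this sketch, appending succeeds once the value is treated as raw bytes, which also explains why the ASCII-only name "Vincent" passes. Whether the same change is the right fix inside the gem is a separate question.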
Let me know if you need any other details. Thank you!
Attachments
Issue Links
- relates to THRIFT-1224 Cannot insert UTF-8 text (Closed)