[HIVE-6140] trim udf is very slow - ASF JIRA

Details

Type: Bug
Status: Open
Priority: Major
Resolution: Unresolved
Affects Version/s: None
Fix Version/s: None
Component/s: UDF
Labels: None

Description

Paraphrasing the report from cartershanklin:

I used the attached Perl script to generate 500 million two-character strings, each of which includes a space. I loaded the output using:
create table letters (l string);
load data local inpath '/home/sandbox/data.csv' overwrite into table letters;
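
The attached temp.pl is not reproduced in this ticket. As a rough stand-in for anyone who wants to rebuild a similar data set, here is a minimal Python sketch. The function names `make_row` and `write_rows` are hypothetical, and placing the space after the letter is an assumption (chosen so that both queries in the report match the same rows); the original script may have behaved differently.

```python
import random
import string


def make_row(rng):
    """Return a two-character string: one lowercase letter plus a space.

    Assumption: the space always comes second, e.g. 'l '.
    """
    return rng.choice(string.ascii_lowercase) + " "


def write_rows(path, count, seed=0):
    """Write `count` rows, one per line, to `path`."""
    rng = random.Random(seed)  # fixed seed keeps the file reproducible
    with open(path, "w", newline="\n") as out:
        for _ in range(count):
            out.write(make_row(rng) + "\n")
```

Calling `write_rows("data.csv", 500_000_000)` would approximate the size described in the report; a much smaller count is enough to check that the two queries return the same result.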
Then I ran this SQL script:
select count(l) from letters where l = 'l ';
select count(l) from letters where trim(l) = 'l';

First query (plain equality): 170 seconds
Second query (with trim): 514 seconds
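
To put the two timings in perspective, the short Python sketch below derives the slowdown factor and the extra wall-clock time per row from the reported figures. The helper names are hypothetical, and the per-row number is elapsed time for the whole job divided by the row count, not measured CPU time inside the UDF.

```python
ROWS = 500_000_000    # rows in the letters table, per the report
PLAIN_SECONDS = 170   # where l = 'l '
TRIM_SECONDS = 514    # where trim(l) = 'l'


def slowdown(plain, trimmed):
    """Ratio of the trim query's runtime to the plain query's runtime."""
    return trimmed / plain


def extra_ns_per_row(plain, trimmed, rows):
    """Additional wall-clock nanoseconds per row attributable to trim."""
    return (trimmed - plain) * 1e9 / rows


print(round(slowdown(PLAIN_SECONDS, TRIM_SECONDS), 2))                  # 3.02
print(round(extra_ns_per_row(PLAIN_SECONDS, TRIM_SECONDS, ROWS), 1))    # 688.0
```

In other words, adding trim makes the query roughly three times slower, which works out to about 688 extra nanoseconds of elapsed time per two-character row.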

Attachments

temp.pl
05/Jan/14 00:10
0.1 kB
Thejas Nair

People

Assignee: Anandha L Ranganathan

Reporter: Thejas Nair

Votes: 0

Watchers: 3

Dates

Created: 04/Jan/14 01:04

Updated: 11/Jan/14 20:10