Details
-
Bug
-
Status: Resolved
-
Major
-
Resolution: Fixed
Vectorization: Implement StringExpr::find() (Teddy Choi, reviewed by Gopal V)
Description
Currently, the LIKE expression implementation is a naive StringExpr::equals() loop.
For an input of N bytes and a pattern of M bytes, this has a worst-case complexity of O((N-M)*M), which is not an issue with small patterns or small inputs but becomes costly as either grows.
The pattern matching is currently optimized for matches, while in clickstream data the opposite is true in general: most rows do not match the pattern.
From the common crawl data, the following run will go through the same
select count(1) from uservisits_orc_data where useragent like "%Opera%" and searchword LIKE "%fruit%";
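To make the cost difference concrete, here is a minimal Java sketch that contrasts the naive loop described above with a search that skips ahead on mismatches. It is an illustration only, not the committed patch: the class name LikeFindSketch and the methods naiveFind and horspoolFind are hypothetical, and Boyer-Moore-Horspool is an assumed choice of algorithm; the ticket does not say which technique StringExpr::find() uses.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration; names and algorithm choice are assumptions,
// not taken from the Hive source tree.
public class LikeFindSketch {

    // Naive scan: try the pattern at every offset of input[start, start+len).
    // Worst case O((N-M)*M) byte comparisons.
    static int naiveFind(byte[] input, int start, int len, byte[] pattern) {
        int m = pattern.length;
        for (int i = start; i + m <= start + len; i++) {
            int j = 0;
            while (j < m && input[i + j] == pattern[j]) {
                j++;
            }
            if (j == m) {
                return i; // offset of the first match
            }
        }
        return -1; // no match
    }

    // Boyer-Moore-Horspool: on a mismatch, shift the window by a distance
    // derived from the last byte of the window, so non-matching input is
    // skipped in steps of up to M bytes.
    static int horspoolFind(byte[] input, int start, int len, byte[] pattern) {
        int m = pattern.length;
        if (m == 0) {
            return start;
        }
        if (len < m) {
            return -1;
        }
        // Default shift is the full pattern length.
        int[] shift = new int[256];
        Arrays.fill(shift, m);
        // Bytes that occur in the pattern (except its last position) get a shorter shift.
        for (int k = 0; k < m - 1; k++) {
            shift[pattern[k] & 0xFF] = m - 1 - k;
        }
        int end = start + len;
        int i = start;
        while (i + m <= end) {
            int j = m - 1;
            while (j >= 0 && input[i + j] == pattern[j]) {
                j--;
            }
            if (j < 0) {
                return i; // offset of the first match
            }
            i += shift[input[i + m - 1] & 0xFF];
        }
        return -1; // no match
    }

    public static void main(String[] args) {
        byte[] agent = "Mozilla/5.0 Opera/9.80".getBytes(StandardCharsets.UTF_8);
        byte[] pattern = "Opera".getBytes(StandardCharsets.UTF_8);
        System.out.println(naiveFind(agent, 0, agent.length, pattern));
        System.out.println(horspoolFind(agent, 0, agent.length, pattern));
    }
}
```

Both methods return the offset of the first occurrence or -1, so a LIKE "%Opera%" check reduces to testing the result against -1. The skip table costs a small fixed setup per pattern, which pays off when most rows do not contain the pattern.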