  1. Apache Drill
  2. DRILL-5899

Simple pattern matchers can work with DrillBuf directly

      Description

      For the four simple patterns we support (startsWith, endsWith, contains, and constant), we do not need the overhead of charSequenceWrapper. We can work with DrillBuf directly, which avoids the isAscii check and the UTF-8 decoding for each row.
      UTF-8 guarantees that the encoding of one valid character is never a prefix of the encoding of another valid character. So, instead of decoding the varChar value from each row we process, we encode the patternString once during setup and do a raw byte comparison. Instead of bounds checking and reading one byte at a time, we get the whole buffer in one shot and use that for the comparison.
      This improved the overall performance of the filter operator by around 20%.
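
      A minimal sketch of the idea described above, not the actual Drill implementation. It assumes a plain java.nio.ByteBuffer as a stand-in for DrillBuf, and the class name RawBytePatternMatcher and its method signatures are hypothetical. The pattern is encoded to UTF-8 once in the constructor, and each row is then matched by comparing raw bytes in the range [start, end) without decoding.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration only; ByteBuffer stands in for DrillBuf.
final class RawBytePatternMatcher {
  // Pattern bytes, encoded once during setup rather than once per row.
  private final byte[] pattern;

  RawBytePatternMatcher(String patternString) {
    this.pattern = patternString.getBytes(StandardCharsets.UTF_8);
  }

  // True if the bytes at [offset, offset + pattern.length) equal the pattern.
  // Callers must ensure that range lies inside the row's [start, end) range.
  private boolean regionMatches(ByteBuffer buf, int offset) {
    for (int i = 0; i < pattern.length; i++) {
      if (buf.get(offset + i) != pattern[i]) {
        return false;
      }
    }
    return true;
  }

  boolean startsWith(ByteBuffer buf, int start, int end) {
    return end - start >= pattern.length && regionMatches(buf, start);
  }

  boolean endsWith(ByteBuffer buf, int start, int end) {
    return end - start >= pattern.length
        && regionMatches(buf, end - pattern.length);
  }

  boolean contains(ByteBuffer buf, int start, int end) {
    // Naive scan over every candidate offset; no decoding is needed because
    // a valid UTF-8 pattern cannot match starting in the middle of a character.
    for (int i = start; i + pattern.length <= end; i++) {
      if (regionMatches(buf, i)) {
        return true;
      }
    }
    return false;
  }

  boolean constant(ByteBuffer buf, int start, int end) {
    return end - start == pattern.length && regionMatches(buf, start);
  }
}
```

      In this sketch a matcher is built once per pattern and reused for every row, so the only per-row work is the byte comparison itself.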

              People

              • Assignee:
                ppenumarthy Padma Penumarthy
                Reporter:
                ppenumarthy Padma Penumarthy
                Reviewer:
                salim achouche
              • Votes:
                0
                Watchers:
                3

                Dates

                • Created:
                  Updated:
                  Resolved: