Details
-
Task
-
Status: Closed
-
Major
-
Resolution: Fixed
-
None
-
None
Description
BigQuery offers the CONTAINS_SUBSTR function that returns TRUE if a substring is present in an expression and FALSE if it is not. A basic example of this may be seen in [1].
The expression can take many forms (more info in the linked doc) which makes its implementation more complex than other string functions that only accept arguments of type STRING. For instance, the expression to be searched can be a column or table reference.
The function also has an optional third argument called json_scope where the user can indicate the scope of JSON data (keys, values, or both) to be searched for the substring.
I am curious if anyone has thoughts on how the search of rows or tables could be implemented. I have a basic implementation (that supports expressions of type STRING and nothing else) that I will open a draft PR for as a starting point. To me, the challenge is implementing the additional features like the JSON_SCOPE argument (seen in [2]) and performing a cross field search as seen in [3].
[1] SELECT CONTAINS_SUBSTR("hello", "he"); would return TRUE.
[2] SELECT CONTAINS_SUBSTR(JSON '("lunch":"soup")', "lunch",
json_scope=>"JSON_VALUES") AS result;
would return FALSE.
[3]
{{SELECT *
FROM Recipes
WHERE CONTAINS_SUBSTR(
(SELECT AS STRUCT Recipes.* EXCEPT (Lunch, Dinner)),
'potato'
);}} would return:
Breakfast | Lunch | Dinner |
---|---|---|
Potato pancakes | Toasted cheese sandwich | Beef stroganoff |
Attachments
Issue Links
- links to