In this post, I’m going to experiment with the VALUEIN()
formula inside FILTER()
to better understand how it really works.
is a formula that returns a boolean indicating if a certain value is in a list. For example, if we want to know if ‘US-001’ is in the ‘CustTable’, we would do:VALUEIN()
VALUEIN("US-001", CustTable, CustTable.AccountNum)
And we’ll get true if it does exist:
Now, let’s try the same in ‘DAT’, where I don’t have any customer:
So it’s a simple-to-use formula, and we could use it to filter records. Let’s say we have this table:
And we want to filter from ‘CustTable’ all the records where ‘AccountNum’ is in the field ‘FieldString1’ from our table. We could do it with WHERE()
or with FILTER()
. WHERE()
will work in memory, while FILTER()
will transform our formula into a SQL query, so the latter is more efficient (WHERE VS FILTER).
FILTER(CustTable, VALUEIN(CustTable.AccountNum, TST_table001, TST_table001.FieldString1))
And we’ll get:
So everything in order. Now, let’s try the same but with an empty ‘TST Table’:
Also great, so it seems this problem doesn’t apply:
The main problem with this formula is this:
So if we are using FILTER()
, a SQL statement with a lot of ORs will be created, and sometimes this SQL statement will be longer than the max length allowed for a SQL statement. They give us the option to use VALUEINLARGE()
, but the problem is that this formula only accepts INT and INT64, basically RECIDs. Sometimes an EXISTJOIN
would be used, and this should fix our max character limit… so let’s try it:
We’ll be able to check our queries in ‘Organization administration > Electronic reporting > Configuration debug logs’.
SELECT T1.ACCOUNTNUM,T1.RECID FROM CUSTTABLE T1
WHERE ((T1.PARTITION=5637144576) AND (T1.DATAAREAID=N'usmf')) AND EXISTS (SELECT 'x' FROM TST_TABLE001 T2
WHERE (((T2.PARTITION=5637144576) AND (T2.DATAAREAID=N'usmf')) AND (T1.ACCOUNTNUM=T2.FIELDSTRING1)))
ORDER BY T1.ACCOUNTNUM
We shouldn’t have problems with too many ORs, but let’s try the same with an already filtered list:
SELECT T1.ACCOUNTNUM,T1.RECID FROM CUSTTABLE T1 WHERE
(((PARTITION=5637144576) AND (DATAAREAID=N'usmf')) AND
(((((((ACCOUNTNUM=?) OR (ACCOUNTNUM=?)) OR (ACCOUNTNUM=?))
OR (ACCOUNTNUM=?)) OR (ACCOUNTNUM=?)) OR (ACCOUNTNUM=?))
OR (ACCOUNTNUM=?))) ORDER BY T1.ACCOUNTNUM
In this case, we have a problem. The system is not able to do any 'EXIST JOIN', 'IN …'
or any expression that would keep the SQL length acceptable.
OK, now just for the fun of it, let’s break it by adding thousands of records in ‘TST Table’:
Now, let’s try the same but filter the ‘TST Table’ so it returns 0 records:
The query that it is using is:
SELECT T1.ACCOUNTNUM,T1.RECID FROM CUSTTABLE T1
WHERE ((PARTITION=5637144576) AND (DATAAREAID=N'usmf'))
ORDER BY T1.ACCOUNTNUM
It seems that now we are getting all the records, so the bug feature we thought was fixed, wasn’t.
I’m gonna try one last thing with the ‘CustTable’ filtered:
Now I’m getting the correct records because it is combining the queries, FILTER()
, and VALUEIN()
.
SELECT T1.ACCOUNTNUM,T1.RECID FROM CUSTTABLE T1
WHERE (((T1.PARTITION=5637144576) AND (T1.DATAAREAID=N'usmf'))
AND ((T1.ACCOUNTNUM<>?)
OR (T1.ACCOUNTNUM<>?)))
AND EXISTS (SELECT 'x' FROM TST_TABLE001 T2
WHERE (((T2.PARTITION=5637144576) AND (T2.DATAAREAID=N'usmf'))
AND (T1.ACCOUNTNUM=T2.FIELDSTRING1))) ORDER BY T1.ACCOUNTNUM" />
Conclusions
Based on limited experimentation and not necessarily on documentation, VALUEIN()
will:
- If the second parameter is a filtered list and it has 0 records, and we are using it inside a
FILTER
formula, it will return all records of the table we are filtering. But if the second parameter is an unfiltered list, it will return zero records. - An
EXIST JOIN
or a similar SQL statement will be created if the second parameter of theVALUEIN()
is a non-filtered list. - If the second parameter is a filtered list, the query will have a lot of ORs. If the list keeps growing in production, sooner or later we’ll get an error during execution.
Alternatives
If we think we cannot use VALUEIN()
because it will convert our formula into a query with a series of ORs, we may:
- Use
VALUEINLARGE()
if we want to joinINT
orINT64
fields. - Create a boolean in the table we want to filter that tells us if the same value exists in the other table, and then filter the table with those records where the boolean is true.
- Filter the final table inside the first, then use either
ALLITEMS()
andLISTDISTINCT()
if we have a list inside a list, or if we have a record inside a list, justFILTER()
the first table records with non-null second table records.
Any of these alternatives can have huge repercussions on efficiency, so we must understand how the size of the tables will evolve and find the best solution.
Leave a Reply