A friend of mine pinged me last week to get my recommendation on the usage of OR-ed conditions. There is actually a good technical reason why they should not be used… and a few semantic arguments too… Other than that, I am quite happy to use them!
A little background
A business rule is a statement that prescribes a set of actions when conditions are met.
If the shopping cart amount is over $25 and the mode of shipping is “regular”
Then shipping is free
Rule design best practices warn the user about the dangers of mixing alternative conditions that lead to the same actions. For example:
If the shopping cart amount is over $25 and the mode of shipping is “regular”
or the customer is a “prime” customer
Then shipping is free
Best practices dictate that this piece of logic should be implemented as 2 distinct rules. The question we will discuss today is why.
RETE does not like OR
I believe this is the source of the heat that ORs have suffered since the dawn of rules engines. Since the Rete algorithm does not support them natively, an OR-ed rule ends up being replicated to accommodate the equivalent logic. Business rules management systems do that automatically, but there may be some side effects.
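To make the replication concrete, here is a minimal Python sketch of what happens conceptually to the shipping rule above. The `Rule` class and the `split_or` and `fires` helpers are hypothetical names chosen for illustration, not any product's API; a real engine does this work inside its rule compiler.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Facts = Dict[str, object]
Condition = Callable[[Facts], bool]


@dataclass
class Rule:
    name: str
    # Alternatives are OR-ed together; each alternative is a list of AND-ed tests.
    alternatives: List[List[Condition]]
    action: str


def split_or(rule: Rule) -> List[Rule]:
    """Replicate an OR-ed rule into one AND-only rule per alternative."""
    return [
        Rule(f"{rule.name}#{i}", [alt], rule.action)
        for i, alt in enumerate(rule.alternatives, start=1)
    ]


def fires(rule: Rule, facts: Facts) -> bool:
    """A rule fires when every test of at least one alternative holds."""
    return any(all(test(facts) for test in alt) for alt in rule.alternatives)


free_shipping = Rule(
    name="free_shipping",
    alternatives=[
        [lambda f: f["cart_amount"] > 25, lambda f: f["shipping_mode"] == "regular"],
        [lambda f: f["customer_type"] == "prime"],
    ],
    action="shipping is free",
)

# The single OR-ed rule becomes two rules that share the same action.
replicated = split_or(free_shipping)
```

The original rule and the set of replicated rules reach the same decision for any given set of facts; only the number of rules the network has to hold changes.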
Logically, this is not a big deal of course, and the example above is simple. In my career, however, I have seen individual rules that heavily use ORs, with over 10 composite conditions, each made of 10 OR-ed individual conditions. For instance, consider a rule like this one:
If (A1 or A2 or A3 or A4 or A5 or A6 or A7 or A8 or A9 or A10)
or (B1 or B2 or B3 or B4 or B5 or B6 or B7 or B8 or B9 or B10)
or (C1 or C2 or C3 or C4 or C5 or C6 or C7 or C8 or C9 or C10)
or (D1 or D2 or D3 or D4 or D5 or D6 or D7 or D8 or D9 or D10)
or (E1 or E2 or E3 or E4 or E5 or E6 or E7 or E8 or E9 or E10)
etc.
With ten such groups, this single rule translates to 100 rules in the Rete network, and even more nodes connected in every possible way. Because each node keeps track of the list of objects that satisfy it, memory usage can skyrocket. Performance issues can then appear in the rules compiler, as it computes and assembles the Rete network, or at execution time.
This combinatorial explosion can get out of hand when the rulebase involves thousands of rules or more. Experts in benchmarks and performance tuning typically discourage the usage of ORs for this reason.
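The arithmetic behind that count can be sketched in a few lines of Python, assuming the engine generates one rule per OR-ed alternative as described above. The function names are hypothetical. The second function goes beyond the example in this post: it shows what would happen if the groups were AND-ed together rather than OR-ed, in which case the counts multiply instead of adding up.

```python
from math import prod
from typing import List


def rules_when_groups_are_ored(group_sizes: List[int]) -> int:
    """Every individual condition is its own alternative, so the sizes add up."""
    return sum(group_sizes)


def rules_when_groups_are_anded(group_sizes: List[int]) -> int:
    """One rule per combination of one condition from each group (assumption:
    the engine expands the rule into AND-only rules)."""
    return prod(group_sizes)


# Ten composite conditions, each made of ten OR-ed individual conditions.
ten_groups_of_ten = [10] * 10
ored_count = rules_when_groups_are_ored(ten_groups_of_ten)

# Only two AND-ed groups of ten are enough to reach the same count.
anded_count = rules_when_groups_are_anded([10, 10])
```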
OR gets in the way of Flexibility
Rules architects also discourage the usage of OR-ed conditions because:
- It could lead to convoluted decisioning logic that hinders the readability of the rule
- It combines separate business concepts that may need to evolve differently
Let’s get back to the shipping example above. Over time, marketers might offer free 2-day shipping to prime customers but only free regular shipping for orders that are $25 or more. It is intuitive that each segment should be managed independently. There is no need to artificially force them in the same rule.
One of the key premises of business rules technology is to isolate business policies into atomic pieces of logic, instead of keeping them in spaghetti code. ORs open the door to bad practice, as we lose the atomicity of those policies.
What is the ELSE semantic?
Else is another disputed practice, but let’s not digress… Let’s assume we are using ELSEs in conjunction with ORs.
Although it is absolutely legal from a syntax standpoint, and logically sound, to use both in a single rule, it might take rules writers some effort to wrap their heads around what the ELSE path would be. This is similar to double negations. This type of reasoning demands attention and rigor, especially when “special values” such as unknown or unavailable come into play. What if the shopping cart contains less than $25 of goods, but we do not know whether the shopper is prime or not?
As more options are thrown into the same rule, it looks more complicated and offers more opportunities for confusion and mistakes. This exacerbates the previous point on maintainability.
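The question raised above (a cart under $25 and an unknown prime status) can be explored with a small Python sketch. It assumes a three-valued logic in which `None` stands for "unknown"; that is one common convention, not necessarily how any given rules engine treats special values, and the function names are hypothetical.

```python
from typing import Optional

Tri = Optional[bool]  # True, False, or None for "unknown"


def tri_and(a: Tri, b: Tri) -> Tri:
    """AND: a single False decides the outcome, otherwise unknown propagates."""
    if a is False or b is False:
        return False
    if a is None or b is None:
        return None
    return True


def tri_or(a: Tri, b: Tri) -> Tri:
    """OR: a single True decides the outcome, otherwise unknown propagates."""
    if a is True or b is True:
        return True
    if a is None or b is None:
        return None
    return False


def free_shipping_condition(cart_amount: float, shipping_mode: str, is_prime: Tri) -> Tri:
    """Condition of the OR-ed shipping rule, evaluated with unknowns allowed."""
    return tri_or(tri_and(cart_amount > 25, shipping_mode == "regular"), is_prime)


# Cart under $25, prime status unknown: the condition is neither true nor false.
result = free_shipping_condition(20.0, "regular", None)
```

Under this assumption the condition comes out as unknown, so neither the THEN path nor the ELSE path is clearly justified. An engine that silently treats unknown as false would take the ELSE path and charge shipping to a customer who may well be prime.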
So should we ban ORs?
Not so fast… There are some cases where ORs really belong in the rule definition. Rule design excellence is about knowing when to use them and what other options are available to replace the ORs we do not want.
Bad Performance? How about list membership?
Separating the rules manually to avoid the OR explosion does not solve the performance issue. The example above would perform exactly the same whether the rules are manually or automatically exploded. The solution is elsewhere…
In most cases, a large number of ORs in a single rule is meant to express list membership. For example:
If state is CA or state is NY or state is VA
Then apply sales taxes
The tests all refer to the same attribute, state, and can be replaced by a check for list membership, using a keyword such as IN:
If state in CA, NY, VA
Then apply sales taxes
The list membership check can dramatically improve runtime performance, because the engine evaluates a single test instead of replicating the rule for each value. The rule ends up being a little more readable as well.
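In Python terms, the two versions of the sales tax rule look like the sketch below. The function and constant names are made up for illustration; the point is only that both forms give the same answer, while the second keeps the list of states in one place.

```python
# States taken from the example above.
TAXED_STATES = frozenset({"CA", "NY", "VA"})


def applies_sales_tax_with_ors(state: str) -> bool:
    """One equality test per state, chained with OR."""
    return state == "CA" or state == "NY" or state == "VA"


def applies_sales_tax_with_in(state: str) -> bool:
    """A single membership test against the list of states."""
    return state in TAXED_STATES
```

Adding a state to the second version means changing the list, not the rule.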
Or defining actual group membership?
In some cases, conditions might be looking at different attributes but they might still be defining membership to a “group”.
If gender is female
or race in African-American, Asian American, Hispanic American, Native American
or age is older than 40
or disability is true
Then minority is true
This is quite frequent in business policies that define business terms. Business Rules implementations tend to capture those definitions directly in the business rules. We recommend capturing those as actual business terms that can be reviewed by business owners and leveraged consistently in business rules.
As in the example above, you want business terms to be computed and become part of the data model, like any other characteristic, computed or not, of the document to be processed.
The value here is that you isolate the OR-ed computation so that it can be leveraged in simple AND-ed rules as much as possible. Those membership relationships are typically defined and maintained globally, so that makes sense.
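Here is a minimal Python sketch of that idea, using the definition from the example above. The `Applicant` class, the `qualifies_for_program` rule and its income threshold are hypothetical; they only show how a computed business term keeps the downstream rule AND-only.

```python
from dataclasses import dataclass

# Values taken from the example above.
MINORITY_RACES = frozenset(
    {"African-American", "Asian American", "Hispanic American", "Native American"}
)


@dataclass
class Applicant:
    gender: str
    race: str
    age: int
    disability: bool
    income: float

    @property
    def minority(self) -> bool:
        """Computed business term: the OR-ed definition lives in one place."""
        return (
            self.gender == "female"
            or self.race in MINORITY_RACES
            or self.age > 40
            or self.disability
        )


def qualifies_for_program(applicant: Applicant) -> bool:
    """Hypothetical rule: a plain AND of two terms, with no OR in sight."""
    return applicant.minority and applicant.income < 50_000


applicant = Applicant(gender="male", race="Other", age=45, disability=False, income=42_000)
decision = qualifies_for_program(applicant)
```

If the definition of the term changes, only the `minority` property is touched; every rule that relies on it stays as it is.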
And when you need ORs…
Then you can certainly use them when needed. The key, in my opinion, is to find ways to easily understand the decisioning logic, to untangle the logic in a manner of speaking.
What I have found effective is the use of Fluid Metaphors, which allow you to turn your original syntax, on demand, into decision tables, trees or graphs. This really helps address the concerns on maintainability:
- If rules were created independently but lead to the same action, use a decision graph to see them graphically linked
- If OR-ed conditions were merged in a single rule, use a decision table or any other metaphor to visualize the individual paths for that rule
The only concern left is really the semantic interpretation of ORs and ELSEs combined. But I would argue that ELSEs are a worse Evil than ORs… I guess I have another post already figured out 😉