What to choose: Better data, or better algorithms?
An eternal question of this big data age is: what to choose, better data or better algorithms?
So far, most [but not all!] of the deception users we interacted with seem to be using their deception tools as "a better IDS." Hence our discussion of the business case for deception (here and here) was centered on detecting threats.
Naturally, there are many detection tool categories (SIEM, UEBA / UBA, EDR, NTA, and plenty of other yet-unnamed ones) that promise exactly that - better threat detection and/or detection of "better" threats!
During one of the recent "deception calls" it dawned on us what separates "deception as detection" from those other tools:
Deception tools rely on "better source data", such as the attacker's authentication logs, the attacker's traffic, files that the attacker touched, etc., while most other tools rely on "better data analysis" of data such as all logins, all traffic, all files touched, etc.
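To make the contrast concrete, here is a toy sketch in Python. It is an illustration under stated assumptions, not how any particular deception or UEBA product works: the event fields (`account`, `source`), the decoy account names, and the naive z-score threshold are all invented for the example.

```python
from collections import Counter
from statistics import mean, pstdev

# Hypothetical decoy accounts that no legitimate user should ever touch.
DECOY_ACCOUNTS = {"svc_backup_old", "admin_test"}


def detect_by_source(events):
    """'Better source data': any login to a decoy account is an alert.

    No baseline or tuning is needed, because the data itself is suspicious.
    """
    return [e for e in events if e["account"] in DECOY_ACCOUNTS]


def detect_by_analysis(events, z_threshold=2.0):
    """'Better data analysis': look at ALL logins and flag outlier sources.

    A deliberately naive z-score on login counts per source, assumed here
    purely for illustration; real tools use far richer models.
    """
    counts = Counter(e["source"] for e in events)
    values = list(counts.values())
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # every source looks the same, nothing stands out
    return sorted(s for s, c in counts.items() if (c - mu) / sigma > z_threshold)
```

The first function only ever sees attacker-generated events, so every hit is worth a look. The second must ingest everything and depends on a threshold that somebody has to choose and maintain, which is roughly where the "operational burden" difference comes from.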
So, can we say which one is better? Until we can have a cage match of a deception vendor with, say, a UEBA vendor, we probably won't know for sure.
The largest enterprises (the proverbial "security 1%-ers") will "buy one of each" (as usual) and the smaller ones will wait for a product that combines both feature sets with a firewall.
For example, one of the interviewees outlined an elegant scenario where a deception tool and a UBA / UEBA tool are used together. We hesitate to say that this is the future for everybody, but it was an interesting example of the "strength-based" approach to tools…
Still, "detection by better source data" has unique appeal to people who are just not willing to "explore all data." Our contacts report "low friction", better signal/noise, low/no "false positives" and low operational burden for deception tools [used for detection].
Hence, even though the "all data + smart algorithms" approach may be philosophically superior (since looking at ALL data will theoretically allow you to detect all threats, but … can we really have ALL data?), some organisations are choosing "decoy-sourced data" and seem happy with their decisions.