Do you remember when wide-open S3 buckets led to countless data breaches? I sure do, and after a while it became a meme of sorts: it seemed like half of all publicized "breaches" were just data buckets sitting around, waiting for someone to tap into them. While I'm sure there are still some out there, they're a lot less prevalent than before.
But did we learn our lesson? I doubt it. There's a disturbing trend I've noticed for a while now: AI pilots are put into production, with access to corporate data, and then just...abandoned.
Many of these pilots will be internal to corporate networks, but many are likely set up as cloud-accessible and locked behind only simple authentication. After all, these are just pilots, designed to test responsiveness and data accuracy; they might not have SAML/SSO set up or require VPN access. This could become the next prevalent data breach category in the near future as many companies jump on the AI bandwagon.
I'm not anti-AI (unless we're talking about a post-apocalyptic, Terminator-style world), and I use various iterations each week, whether it's ChatGPT, Gemini, or LinkedIn's AI helper. But I also recognize that this type of technology is being deployed for ease of access and is meant to streamline the way we access data.
The two issues I'm going to talk about here aren't applicable only to AI; they apply to all technology deployments. But I believe they will make AI data facilitation concerns more visible, just like S3 buckets were, and before that, iSCSI drives (yes, there are iSCSI drives sitting on the open internet; no, I don't know why someone would do that).
The first issue is about sunsetting. Sunsetting is all about retiring a piece of technology once its purpose has been served, it has become redundant, or it has been replaced with a newer model/version/implementation.
The second issue is about lax access controls during testing/POC/pilots. It's common to set up a POC with simple auth because the intent is to show the capabilities of the technology platform, not to secure access and prevent unauthorized use.
Let's talk about sunsetting.
As I mentioned above, sunsetting is common in many areas of IT, most often performed as part of routine tasks like updating an application version, replacing an older server with a newer one, or upgrading to a new OS version. It is not, however, often considered for other IT or IT-adjacent technologies. There's OT (Operational Technology) consisting of manufacturing equipment, there are time clocks for HR/Payroll, there are application stacks that are integrated on a back end and "just work" for many years, and there are pilots for new technology.
To this day I have never seen sunsetting built into a project plan, let alone a plan for what happens if the project is deemed lacking and does not move forward. This sets up the common scenario of misalignment with business goals, where projects like AI pilots can slip through the cracks and simply be forgotten by leadership. That, in turn, risks audit failures, circumvented access controls, and data leakage. Insurance carriers can deny a claim if they find that security attestations conflicted with the reality of an abandoned system having unfettered access to vital company data.
I believe sunsetting should become a standard part of project management for dealing with a canceled implementation. Project management normally focuses on getting a product to the deployment phase, but rarely includes a complete roll-back path if the decision is not to move forward past the pilot. If project managers started including a simple process path in their templates for removing the integrations and spinning down the implementations, we, as an industry, could make a lot of progress on risk prevention.
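To make that concrete, here's a minimal sketch of what such a process path could look like if it were tracked alongside the project itself. It's purely illustrative: the task names, owners, and structure are my assumptions, not an established standard.

```python
from dataclasses import dataclass, field

@dataclass
class SunsetTask:
    """One decommissioning step for a canceled or completed pilot."""
    description: str
    owner: str
    done: bool = False

@dataclass
class SunsetPlan:
    project: str
    tasks: list[SunsetTask] = field(default_factory=list)

    def outstanding(self) -> list[SunsetTask]:
        """Tasks that still block sign-off on the decommission."""
        return [t for t in self.tasks if not t.done]

# Example: a hypothetical AI pilot being rolled back after a "no-go" decision.
plan = SunsetPlan(
    project="ai-sales-assistant-pilot",
    tasks=[
        SunsetTask("Revoke service accounts and API keys", owner="IT Security"),
        SunsetTask("Remove data-store integrations/connectors", owner="Data Eng"),
        SunsetTask("Delete or archive cached corporate data", owner="Data Eng"),
        SunsetTask("Remove the SSO/IdP app registration", owner="IT Security"),
        SunsetTask("Spin down cloud resources and DNS entries", owner="Infra"),
        SunsetTask("Record completion for audit/insurance attestations", owner="PMO"),
    ],
)

for task in plan.outstanding():
    print(f"[BLOCKING] {task.description} (owner: {task.owner})")
```

The point isn't the tooling; it's that the roll-back steps exist in the template from day one, so a "no-go" decision triggers a checklist instead of silence.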
Lax access controls during pilots should never be a thing, and we should all put a stop to it now. I've been guilty of this, just as we all have been at one point, but in today's world of easy SAML/SSO we should disallow basic auth on each new project and require federated sign-on before we ever reach the pilot phase.
And there are good reasons for this, even if we're not "concerned" about security yet.
Let's imagine a scenario where a company wants to use AI to deliver approximate, conversational answers from company data for the Sales and Product teams, and assume you want the pilot limited to a controlled group. You do not want John in Accounting and Mary in Marketing participating in the first pilot, because the requirements for this project came from the Sales and Product Development departments. It might seem easy to implement basic auth because only a few stakeholders will examine it, but what if they like it and ask you to include a dozen other employees in the POC? Are you going to set up a batch of additional basic auth accounts for those employees as well? What if a senior leader co-opts the project and insists on immediately opening it up to the Engineering, Purchasing, and Marketing teams?
It would be far easier to set up federated, group-based access with a SAML/SSO connector from the beginning instead of provisioning basic auth accounts and changing everything later, which could result in confusion, deletion of work-in-progress user data, and so on. It's far easier to change group membership than to manage basic auth at scale.
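As a rough illustration, here's what that group-based check might look like in application code after the IdP has authenticated the user. The attribute key and group names here are hypothetical; real IdPs like Okta or Entra ID deliver groups as SAML assertion attributes whose exact names depend on your configuration.

```python
# A minimal sketch of group-based authorization after a SAML/SSO login.
# The "groups" attribute key and the group names are assumptions for
# illustration; check your IdP's assertion mapping for the real values.

ALLOWED_GROUPS = {"pilot-sales", "pilot-product"}  # hypothetical IdP groups

def is_authorized(saml_attributes: dict[str, list[str]]) -> bool:
    """Check whether the authenticated user belongs to a pilot group."""
    user_groups = set(saml_attributes.get("groups", []))
    return bool(user_groups & ALLOWED_GROUPS)

# Expanding the pilot to Engineering is one line here (or zero lines, if
# membership is managed purely in the IdP), not a batch of new accounts:
ALLOWED_GROUPS.add("pilot-engineering")
```

Expanding the pilot from two teams to five then becomes an IdP group change, not a scramble to create, and later clean up, a pile of one-off accounts.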
Putting both of these concerns together, I believe real risk to data through AI channels is developing at a rapid pace, and I truly hope we all start to treat sunsetting as part of our projects and incorporate SAML/SSO authentication as early as possible on projects like these in the future.