
Dr. Rishabh Das
Dr. Rishabh Das is an Assistant Professor at the Scripps College of Communication, Ohio University. He has over a decade of hands-on experience operating, troubleshooting, and supervising control systems in the oil and gas industry. His research portfolio includes virtualization of Industrial Control Systems (ICS), threat modeling, penetration testing in ICS, active network monitoring, and the application of Machine Learning (ML) in cybersecurity.
Large Language Models (LLMs) and public chatbots are revolutionizing the tech industry. LLMs are speeding up coding, helping cybersecurity practitioners learn sector-specific knowledge, and accelerating numerous technical workflows.
An LLM takes information or questions from the end user and returns meaningful analysis or answers. The challenge is that most publicly hosted LLMs tend to memorize end-user-provided artifacts, meaning that if an end user uploads or shares confidential details about critical infrastructure assets, that information can become public. This creates a new cybersecurity challenge.
Training and awareness around the safe use of GenAI can alleviate some of these concerns, but for detecting LLM communications on sensitive networks, passive monitoring tools can help security teams take rapid action.
Our first article focused on several ways to identify unauthorized end-user interaction with a public Large Language Model (LLM). This article explores several advanced detection techniques, including how to watermark sensitive data to identify future leaks and how to use the existing cybersecurity tech stack to detect local LLM tools.
Client TLS communication analysis (Correlating JA3 and ALPN)
In the first article, we explained how server-side JA3 hashes (JA3S) can fingerprint communication with a domain known to serve GenAI.
In this method, detection flips the viewpoint to the client. GenAI applications such as the OpenAI Python SDK produce a distinctive JA3 hash and advertise specific Application-Layer Protocol Negotiation (ALPN) values. A passive network monitoring tool can correlate the JA3 hash with the ALPN value, thus fingerprinting potential GenAI usage. The technique’s efficacy stems from how the JA3 hash condenses the TLS ClientHello: the TLS version, cipher suites, extensions, and signature algorithms are packaged into a 32-character MD5 hash. Since most SDKs use Python HTTP libraries like urllib3 or httpx, their ClientHello differs significantly from traditional browser and SCADA traffic. The detection system can use a list of in-use SDKs, operating system versions, and software to establish a known baseline.
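As a minimal sketch of the correlation step, the Python below scans a Zeek ssl.log (JSON format) that has been enriched with the community ja3 package. Every hash in the baseline and SDK lists is a placeholder; they would be replaced with fingerprints measured from your own approved clients and lab captures.

```python
import json

# Placeholder baseline: JA3 hashes observed from approved clients
# (browsers, SCADA software) on this network.
KNOWN_CLIENT_JA3 = {
    "579ccef312d18482fc42e2b822ca2430",  # placeholder: corporate browser build
    "b32309a26951912be7dba376398abc3b",  # placeholder: SCADA HMI client
}

# Placeholder JA3 hashes captured in a lab from GenAI SDK traffic
# (e.g., the OpenAI Python SDK over httpx).
GENAI_SDK_JA3 = {
    "3b5074b1b5d032e5620f69f9f700ff0e",  # placeholder: Python httpx client
}

def scan_ssl_log(path="ssl.log"):
    """Flag TLS sessions whose JA3 and ALPN together match a GenAI SDK profile."""
    with open(path) as fh:
        for line in fh:
            rec = json.loads(line)
            ja3 = rec.get("ja3")             # field added by the Zeek ja3 package
            alpn = rec.get("next_protocol")  # negotiated ALPN, e.g. "h2"
            if ja3 is None:
                continue
            if ja3 in GENAI_SDK_JA3 and alpn == "h2":
                print(f"[ALERT] likely GenAI SDK: {rec['id.orig_h']} -> "
                      f"{rec.get('server_name')} (ja3={ja3})")
            elif ja3 not in KNOWN_CLIENT_JA3:
                print(f"[INFO] unbaselined ClientHello: {rec['id.orig_h']} "
                      f"(ja3={ja3}, alpn={alpn})")

if __name__ == "__main__":
    scan_ssl_log()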
Analyzing network telemetry
Endpoints interacting with a public LLM’s website exhibit a specific network traffic pattern. The end user typically submits a large query or data segment to the LLM as a prompt. In response, the model streams small chunks back to the endpoint at 50-100 ms intervals. A passive network monitoring tool can watch for a traffic pattern resembling a burst of outgoing data from an endpoint to a public LLM, followed by responses trickling back into the network.
Most LLM services, such as Gemini and Anthropic’s Claude, use token streaming to respond rapidly to user questions. The model sends chunks of roughly 20 to 40 tokens with payload sizes of 128-130 bytes. We simulated this behavior, inspected the results in Wireshark, and found the pattern consistent.
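A minimal offline version of this check, run with scapy against a saved capture, might look like the sketch below. The subnet prefix, the capture file name, and the thresholds (adapted from the intervals and payload sizes above) are all placeholders to tune per environment.

```python
from collections import defaultdict
from scapy.all import rdpcap, IP, TCP  # pip install scapy

CLIENT_NET = "10.0.0."   # placeholder internal prefix; adjust per site

def flows(pcap_path):
    """Group TCP payload packets into (client, server) flows with direction."""
    out = defaultdict(list)
    for pkt in rdpcap(pcap_path):
        if IP in pkt and TCP in pkt and len(pkt[TCP].payload) > 0:
            src, dst = pkt[IP].src, pkt[IP].dst
            key = (src, dst) if src.startswith(CLIENT_NET) else (dst, src)
            direction = "out" if src.startswith(CLIENT_NET) else "in"
            out[key].append((float(pkt.time), direction, len(pkt[TCP].payload)))
    return out

def looks_like_token_stream(events):
    """Heuristic from the article: an outbound burst (the prompt), then many
    small (~128-130 byte) inbound chunks arriving at ~50-100 ms intervals."""
    inbound = [(t, size) for t, d, size in events if d == "in"]
    small = [s for _, s in inbound if 64 <= s <= 256]
    if len(small) < 20:                       # need a sustained stream
        return False
    gaps = [b - a for (a, _), (b, _) in zip(inbound, inbound[1:])]
    paced = [g for g in gaps if 0.03 <= g <= 0.15]
    return len(paced) / max(len(gaps), 1) > 0.5

for (client, server), events in flows("capture.pcap").items():
    if looks_like_token_stream(sorted(events)):
        print(f"[ALERT] token-stream pattern: {client} -> {server}")
```

In a live deployment the same heuristic would run on flow records from the passive monitoring tool rather than on a pcap, but the burst-then-trickle logic is identical.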
Detecting local LLMs
When end users train or run an LLM locally, they risk sensitive data being integrated into the model. That, in turn, creates a risk of proprietary data leakage that could expose confidential information to the public. This is why keeping track of any local LLM usage within the organization’s network boundaries is crucial.
Most endpoint detection and response tools can combine several artifacts to trigger alerts and notify the security team. Training and inference cycles in a local LLM require heavy GPU and VRAM utilization. The models also load several characteristic files (like *.gguf, tokenizer.model, and libvllm.so) and typically exhibit near-constant PCIe cycles. While an endpoint running any GPU-based application might also show heavy VRAM usage, the key is to combine several artifacts, such as the file artifacts and PCIe write cycles, into a single detection trigger. Passive monitoring tools can also play a key role in alerting a security team to these threats. They can integrate with endpoint telemetry sources like Sysmon to detect instances where end users are loading key files like torch.*.pyd or libcuda.so.
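As an illustration of combining artifacts, here is a minimal Python sketch using psutil and nvidia-smi. The file patterns come from the list above, while the VRAM threshold and the requirement that both signals fire together are illustrative choices, not tuned values.

```python
import fnmatch
import subprocess
import psutil  # pip install psutil

# File artifacts the article associates with local LLM tooling.
SUSPECT_PATTERNS = ["*.gguf", "*tokenizer.model", "*libvllm.so*",
                    "*torch*.pyd", "*libcuda.so*"]

def gpu_memory_used_mib():
    """Total VRAM in use, via nvidia-smi (returns 0 if unavailable)."""
    try:
        out = subprocess.check_output(
            ["nvidia-smi", "--query-gpu=memory.used",
             "--format=csv,noheader,nounits"], text=True)
        return sum(int(v) for v in out.split())
    except (OSError, subprocess.CalledProcessError, ValueError):
        return 0

def suspect_processes():
    """Processes holding open files that match known LLM artifacts."""
    hits = []
    for proc in psutil.process_iter(["pid", "name"]):
        try:
            for f in proc.open_files():
                if any(fnmatch.fnmatch(f.path, p) for p in SUSPECT_PATTERNS):
                    hits.append((proc.info["pid"], proc.info["name"], f.path))
        except (psutil.AccessDenied, psutil.NoSuchProcess):
            continue
    return hits

# Combine artifacts before alerting, as described above: file evidence AND
# sustained VRAM pressure (the 8000 MiB threshold is an arbitrary example).
hits = suspect_processes()
if hits and gpu_memory_used_mib() > 8000:
    for pid, name, path in hits:
        print(f"[ALERT] possible local LLM: pid={pid} {name} holds {path}")
```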
How to watermark your critical data
Numerous research papers indicate that GenAI text-generation models can memorize sequences from their training data. A security team could inject a synthetic tag, like a fake equipment ID number or a non-existent PLC tag, into the documentation and project files of critical components. The security team can then periodically scan public LLM responses or leak sites to assess whether the tracked data was accidentally leaked to the public domain.
The embedded random string can look like {Company-alias}-{Asset Class}-{Asset identifier assigned by the security team}, where (see the sketch after this list):
- The “Company-alias” is one of a series of names that can be mapped back to the specific organization. The general recommendation is to rotate through a series rather than use a constant string, which makes adversarial reconnaissance harder if an alias is leaked.
- The “Asset Class” highlights the asset type. The security team can use “IT” or “OT”, depending on the asset, or have a more granular asset class.
- The “Asset identifier assigned by the security team” can be a unique identifier assigned to specific files, like ladder logic and project files.
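To make the format concrete, here is a minimal generation sketch. The alias pool and signing key are placeholders, and the short HMAC suffix is an optional extension beyond the format above: it lets the team later verify that a discovered tag is genuine rather than guessed.

```python
import hmac
import hashlib
import secrets

# Placeholder pool of rotating company aliases; only the security team
# holds the mapping back to the real organization.
ALIAS_POOL = ["NORTHPEAK", "BLUEFIR", "CEDARLINE"]

SIGNING_KEY = b"replace-with-a-vaulted-secret"  # placeholder key

def make_watermark(asset_class: str, asset_id: str) -> str:
    """Build a {Company-alias}-{Asset Class}-{Asset identifier} tag with a
    short HMAC suffix so real leaks can be told apart from guessed tags."""
    alias = secrets.choice(ALIAS_POOL)
    body = f"{alias}-{asset_class}-{asset_id}"
    check = hmac.new(SIGNING_KEY, body.encode(), hashlib.sha256).hexdigest()[:6]
    return f"{body}-{check}"

# Example: tag a PLC project file before it enters documentation.
print(make_watermark("OT", "PLC7741"))   # e.g. CEDARLINE-OT-PLC7741-a91c3e
```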
The security team can track and run periodic checks for the embedded strings. Any positive hits should trigger an incident response. For advanced analysis, I encourage readers to explore the field of “membership inference,” which aims to determine whether a specific data point was used to train a given artificial intelligence model.
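As a starting point for those periodic checks, the sketch below searches scraped text (for example, a leak-site dump saved to disk) for the tag pattern. The alias alternation and file name mirror the placeholders from the generation sketch and would be adjusted to your own pool.

```python
import re

# Regex mirroring the tag format above (HMAC suffix optional); swap the
# alias alternation for your own rotating pool.
TAG_RE = re.compile(
    r"\b(NORTHPEAK|BLUEFIR|CEDARLINE)-(IT|OT)-([A-Z0-9]+)(?:-[0-9a-f]{6})?\b")

def scan_dump(path: str):
    """Scan a scraped page or leak-site dump for embedded watermark tags."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        for lineno, line in enumerate(fh, 1):
            for m in TAG_RE.finditer(line):
                print(f"[HIT] {path}:{lineno} -> {m.group(0)}")  # open an incident

scan_dump("leak_dump.txt")  # placeholder input file
```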
LLMs are the next big tech advancement, and as security personnel, we are just scratching the surface when it comes to securing them. More time and research will undoubtedly raise more questions about LLMs’ and GenAI’s true potential, as well as the threats and vulnerabilities that accompany them. Clearly, monitoring usage and expanding systematically are key to safely integrating this technology as we continue our efforts to secure critical infrastructure.