Analysis

Where "Self-Awareness" Comes From

COMPONENT BREAKDOWN

1. System Prompt

Explicit instructions and configuration injected into every conversation turn.

"Your workspace is at /root/.nanobot"

2. Tool Outputs

Real-time feedback from the OS environment via executed commands.

$ ls -la /root
> .nanobot .bashrc

3. General Knowledge

Pre-training on Linux systems allows the LLM to infer standard behaviors.

Inference: "Configs are usually in .json files"

WHEN THESE SOURCES CONTRADICT OR FAIL

Confident Guesswork

When information is missing, the LLM hallucinates plausible but incorrect details to maintain the persona.

Result: Illusion Breaks

Unnoticed Inconsistencies

The agent cannot reconcile contradictory data from different sources (e.g., Prompt Time vs. System Time).

Result: Logic Errors

Prompt Manipulation

Since "self" is just text in the prompt, attackers can inject instructions to redefine the agent's reality.

Result: Data Leaks

Previous

NANOBOT CASE STUDY

05 / 06 Next