Dadam

AI, Reverse and Pwn enthusiast, I talk about research papers I find interesting & CTF write-ups.

  • HOME
  • ARTICLES
  • CATEGORIES
  • TAGS
  • ARCHIVES
  • ABOUT

Articles

The Myth, The Model, The Sandwich: Meet Claude Mythos

Anthropic published more than 300 pages alongside Claude Mythos Preview. I read them all. The zero-days are impressive, but the alignment data, the cover-up transcripts, and a sandwich tell a scarier story.

Apr 9, 2026 Announcement

Hidden in Plain State: Poisoning Hybrid LLMs Where Nobody Looks (1/3)

Hybrid LLMs like Qwen3.5 mix classical attention with recurrent layers. I found that corrupting the recurrent state, invisible to every monitoring tool, causes the model to silently derail during generation.

Mar 31, 2026 Personal research

I can make your LLM believe that Donald Trump is OpenAI's CEO, and it's your fault 🤠

The underestimated attack vector hiding inside every AI assistant

Feb 23, 2026 Research Papers, USENIX Security Symposium 2025

Recently Updated

  • The Myth, The Model, The Sandwich: Meet Claude Mythos
  • Hidden in Plain State: Poisoning Hybrid LLMs Where Nobody Looks (1/3)
  • I can make your LLM believe that Donald Trump is OpenAI's CEO, and it's your fault 🤠

Trending Tags

llm-security Poisoning threat adversarial-ml Anthropic dataset Frontier-AI hybrid-architecture News qwen

© 2026 Adam Chtourou. Some rights reserved.

Using the Chirpy theme for Jekyll.
