Pushing Boundaries With Claude Code

Claude Code stormed onto the programming scene when Anthropic launched it in February of this year. It moved what Andrej Karpathy has called “The Autonomy Slider” from around a three to a solid eight. What this means is that you can give Claude Code direction, it will come up with a plan to accomplish the desired outcome, and it will run for an extended period, taking multiple steps, evaluating its own decision making, and course correcting along the way, until it has accomplished the goal.

Wordfence uses Claude Code extensively in our organization. In fact, every team member has a Max subscription, which costs us a total of around $70,000 per year. This includes team members in dev, ops, QA, threat intelligence, incident response, customer service, finance, executive and our marketing department, including our film team. In fact, one of our film team members, who is not a developer, is playing around with developing project management software for the film industry using Claude Code.

What we’ve found is that AI, and Claude Code in particular, are powerful accelerators and amplifiers of human talent. Kerry Boyte is Defiant Inc’s CEO and I am CTO, and we are the company founders and owners. Our approach to this AI revolution, which is well underway, is to give every team member wings. We have been working hard for over 2 years to turn every team member into an expert AI practitioner, which gives our team, the software vendors that we work with, and the website owners that we protect a huge and widening advantage over our adversaries.

In this post I discuss how our team uses Claude Code and techniques we’ve come up with to get far more out of Claude Code. This is not a fluff piece, or something written to signal intent or capability. It’s a practical guide that you can use immediately to put Claude Code to work in your organization, whether you are deeply technical, an executive, or part of a non-technical team. I include a high level view of Claude Code in the organization, integrating Claude Code with humans in your org, hands on tools, tips and techniques, ways to think about Claude Code, and ways to use Claude Code, that will help you get the most out of it.

This post assumes that you’ve installed Claude Code, have configured it to connect to Anthropic’s servers, and have gotten started with the basics like creating a simple Python or PHP application. It is targeted at anyone interested in becoming a power user of the most powerful AI autonomous agent on the market at the time of this writing.

Key Concepts:

Context window: The length of conversation that an LLM (AI model) can ingest, understand and produce a response for. If you’re chatting with a chat bot, you have a conversation history, and that history has a maximum length before the LLM starts forgetting things because they scroll out of the window. Most LLMs have a context window of between 128,000 tokens (around 70,000 to 100,000 words) and 200,000 tokens. Google’s Gemini 2.5 Pro stands out in that it can handle 1 million tokens in its context window.
MCP Server: A server that provides tools to an agent and explains how to use them. Your customer ticketing system could have an MCP server which would allow an agent to read and reply to customer service tickets, along with updating their status. You simply point an agent like Claude Code at an MCP server and it figures out how to use it.
Chat: The way GPT-4 worked, where you say something and the LLM replies with something. No autonomous activity taking place.
Autonomous Agent: You say something, the agent plans what it’s going to do and then takes multiple steps, executing that plan and changing the plan as needed, as it makes progress and sees the output of each step. An agent like this can run for seconds or hours.

Why Claude Code?

Claude Code is to agents what GPT-4 was to LLM chatbots: A profound leap forward in the AI space. Claude Code was launched in February of this year, is a terminal and text based autonomous coding agent, and it radically outperforms IDE based Agents along with competing offerings from Google and OpenAI. In my mind there is no debate about the supremacy of Claude Code as things currently stand. I do think a credible competitor must emerge, because competition in the space is essential. Currently there is none. I’m not hedging my bets, which I hope will convey how impressed I am.

Consider Gemini CLI, a fast-follower product by Google that tries to mimic Claude Code and provides much of the same functionality. I evaluated Gemini CLI with much excitement, given that Gemini 2.5 Pro has a 1 million token context window. I fully embraced the product, and found myself disappointed as Gemini would repeatedly give up on a development task and recommend I use an available command line tool. It’s as if it was saying “That’s OK son, we gave it a good try, but let’s leave it up to the professionals”. Claude Code, by contrast, approaches coding with a “Failure is not an option” mindset and gets it done, by any means necessary. While Gemini CLI is “meh”, Claude Code absolutely “kills the game”.

Claude Code is so good, it has developers in a panic on Reddit and social media when the API is briefly unavailable. Used correctly, it can create and scale an application in a language you’ve never learned, using shell commands you don’t know and orchestration that you have no understanding of, and it does so with an incredibly high degree of success.

Claude Code crosses that boundary of being weirdly good. It has that “what is this black magic?” effect on developers, along with leaving us all saying “just take my money and tell me whose dishes I have to wash to retain access”.

Claude Code is so addictive that it makes burnt out OG programmers passionate about their craft again, and makes you rush through your coffee and bathroom breaks to get back to the keyboard.

I’ve personally created a highly competitive web application from the ground up that I won’t be discussing here, and on a regular basis I see developers on Reddit creating novel applications that have a high probability of success as a business in hours or days. On my weekends I’ve been playing with writing a C++ based software defined radio application, capturing large chunks of spectrum and analyzing them in real-time. I’m not an RF engineer and my C++ knowledge withered and died two decades ago, but with Claude Code I can explore these heights and these depths. I walk into the kitchen in the mornings and tell my wife how much I love living in the year 2025. The future!!

Simultaneously there is a lot of misinformation around Claude Code, which is unfortunate given how powerful an enabler it is. Specifically, videos of someone running Claude Code in many terminals concurrently and claiming it is doing useful work should be dismissed out of hand. Furthermore, anyone claiming multi-hour unattended runs of Claude Code should be dismissed out of hand. No likes. No subscribes. Just move on. It’s bull. But Claude Code is decidedly not bull. What it is capable of, as a highly autonomous programming agent, is startlingly good.

At the time of this writing, Claude Code is head and shoulders the best autonomous agent on the market. Anthropic has also been incredibly generous with their $200 a month MAX plan which, for a long time, appeared to have effectively no token limit, although recently that is changing. Even in the face of some limits on usage, we find Claude Code to be extremely powerful, and as long as you’re not abusing the system, the quotas that Anthropic is providing are plenty.

Claude Code is simply the best in the AI assisted programming space, and it provides a huge step up in capability and productivity for any programmer, operations engineer, QA engineer, and in other fields as you’ll see below.

Overcoming Friction Towards Organization-wide Adoption of AI

The goal of this post is to help other developers, organizations and executives gain the unfair advantage that we have. My mantra for about 2 years when it comes to AI is as follows:

“If it feels like cheating, you’re doing it right.”

If you’re a Claude Code power user, you know exactly the feeling I’m describing. That feeling of “Is this even allowed?”.

I’ve also used this expression as a cudgel to counteract the pre-existing bias that most people have: that AI is bad, dangerous, scary, ethically questionable, controversial, contemptible, and throws the entire future of the human race into question. While a debate on these matters is beyond the scope of this post, I can tell you that as a Chief Technology Officer in an organization responsible for securing a large portion of the Web, these views, whatever their source and their basis, have made my job of equipping my cybersecurity defenders with the best available tools to protect our customers, incredibly difficult.

To counteract this bias away from AI, I’ve worked hard to instill a strong bias towards AI by:

- Signing up our entire team for the Claude Max $200 per month plan. That’s about $70K per year.
- Insisting that they use it, and describing what a disadvantage it would be to not be AI literate in 6 to 18 months when their colleagues and other organizations are.
- Explaining that, in this AI revolution, whether they work for us or someone else, the best way to assure job security is to become an expert at using the tools.
- Bringing usage examples, troubleshooting conversations, novel uses of AI and the latest AI and Claude Code news to every team call.

Doing this has equipped my team of defenders with an unfair advantage over attackers targeting our 5 million customers and their WordPress websites. Today they bring their Claude Code and AI ideas, novel uses and enthusiasm to our team calls. This makes me very happy, given the challenges we face as Wordfence, and given that our adversaries also have these tools.

Uses of Claude Code Across the Organization

At Wordfence/Defiant Inc, we have adopted Claude Code across our entire organization, including in development, QA, operations, customer service, and at the executive level. None of these roles use Claude Code as you might expect. Our mission critical code is still hand written, our customer service responses are still written by a human and our mission critical operations tasks are still performed by a human in our ops team. But Claude Code has been a massive accelerator for us. Here are a few examples of how we use Claude Code:

Developers:
- Can have Claude Code one-shot (write in a single attempt) utility scripts
- In a Laravel upgrade Claude Code can do a first pass with proposed changes for consideration
- Going from Vue 2 to Vue 3, Claude Code can do a first pass and make changes for consideration
- Claude Code can evaluate dev code for a mission critical feature and identify issues.
- Claude Code can evaluate DB performance bottlenecks and identify issues with proposed solutions
- For any work Claude Code does it can use ‘gh’ to submit a pull request and open, update or close an issue.
- Claude Code can implement a sophisticated new WP plugin and then, as it gets more complex, extend it with the help of Serena to manage code complexity.
- Claude Code is great at docker orchestration and creating the orchestration config files and environment
- Claude Code is great at creating planning docs which Claude Code will then use for implementation. I’ve spent 5 hours working on a planning doc with only Claude Code making the edits, and then had Claude Code implement the doc in 10 minutes.
Operations:
- Claude Code is incredible at writing Python scripts for operations tasks.
- Claude Code can identify performance bottlenecks in dev code, e.g. a transaction code block that wraps slow operations while locking records.
- Claude Code is great at coming up with complex pipe and redirection shell commands to accomplish something.
- Claude Code can process log files, either by reading and interpreting them, searching for items, or one-shotting a Python script to transform a log file or even pre-process the log file for Claude Code’s own consumption so that it can do something with it.
- Claude Code is great at orchestration, whether you’re using wp-env or docker directly or something else, have Claude Code do your orchestration and create reusable config files.
Software Quality Assurance (SQA):
- An example is best in this case. One of our senior QA folks cloned a big repo and had Claude Code take a run at a bug in our payment processing which is quite complex. It found the issue and had a back and forth conversation with the analyst. Together they used ‘gh’ to open a new issue on the repo, explain the problem, explain the root cause, the impact, and include a proposed solution with specific references to areas of code for our devs. Claude Code did the heavy lifting for the code analysis, finding the root cause, proposing a fix and actually writing the bug and opening it using ‘gh’.
- Claude Code gives your QA analysts new superpowers.
Customer Service
- We’re currently using an MCP server for our customer service ticketing system. The workflow is that our CS engineer runs Claude Code, which has access to the code and documentation, and points it to the ticket. Claude Code researches the issue, perhaps discusses it with the CS engineer, and then updates the ticket with a private note including extensive resources that our CS engineer can use to answer the customer’s question. It saves a ton of research time and adds a dedicated pair programmer to our CS engineer’s workflow.
Executive Level
- I’m CTO of the organization and Claude Code lets me rapidly prototype new applications including evaluating feasibility of solutions.
- I can also use either the git command line, gh or a GitHub MCP server to rapidly get up to speed on code changes, monitor dev progress on an issue or milestone, evaluate PRs and add comments if needed.
- I’m particularly sensitive to performance issues being introduced into our code, given the operational costs, and certain frameworks abstract away the underlying storage mechanisms, so I’ve used Claude Code to do a performance audit on one particularly large and complex code base to great effect.
- I’m using Claude Code to write this. Just kidding. This is, as Tank said in The Matrix, 100% pure old fashioned home grown human output, born free right here in the real world. “A genuine child of Zion”. [And I think it’s a tragedy that Marcus Chong, who played Tank, wasn’t in any of the sequels]

Context Engineering

Here is Andrej Karpathy on Context Engineering: (via Simon Willison)

“…context engineering is the delicate art and science of filling the context window with just the right information for the next step.”

Context Engineering is a concept that gained traction about a month ago, around the 27th of June 2025, across the AI social media landscape. As Daniel Kahneman put it, “What you see is all there is”, and this also applies to an AI agent. Specifically, what is IN an AI agent’s context window is all it knows about the thing that you want it to do, work on, or think about. What an agent sees in its context window is all there is.

Engineering what an agent sees, or does not see, is Context Engineering.

A few examples for clarity:

Too Little: If Claude Code is working on a code base with a cold start and with an empty CLAUDE.md file, it has zero knowledge, other than the generic knowledge it has from training. It will find out what it needs to know to fulfill the task, but that will take time and, given that models are non-deterministic, that process may vary from run to run, with varying results.
The Right Amount: If you give Claude Code a map of your code and a description of the data structures along with a description of how the code is deployed, it now has that, along with your request in the context window and it is FAR more equipped to fulfill the task with far less initial research to get up to speed.
Too Much: If you’ve just completed implementing a feature, and you’re now moving on to fixing a bug that is unrelated, if you don’t clear the context window, Claude Code has a lot of unrelated information in the context window and your bug fixing efforts aren’t going to be very successful.
An Exploit: If you’ve just connected to a malicious MCP server and, as part of the tool descriptions it puts something in your context window that exfiltrates your private SSH keys, you’ve managed to engineer an exploit into your context window.
Another Exploit: If you’ve grabbed someone’s CLAUDE.md file and put it in the base of your project directory without reviewing it, it could contain instructions to install a malicious backdoor in your system and Claude Code will just follow those instructions. You’ve allowed a stranger to access your context window.
Another Exploit: If you have a public GitHub repository with an automated cron job whereby Claude Code grabs the latest issues that anyone has opened against your project, fixes them, and submits a pull request, anyone can put instructions in an issue to email them your private SSH keys. You’re allowing others to engineer your context window.
Supplementing with Tools: If you’re using Serena, which is a language server that Claude Code can use, and providing Claude Code with those tools, then Claude Code can make calls to Serena to create a cognitive map of your code base, and have that in the context window. Serena is assisting you with engineering your context window to give Claude Code the navigational data it needs to quickly get around your code base, without using excessive amounts of context (tokens) to do that.
Words Matter: The way you say something to Claude Code matters. Using passive language can be problematic because it may be ambiguous. Using uppercase for emphasis is effective. Saying something simply, clearly and unambiguously is the path to Claude Code righteousness. Keep this in mind if you’re having Claude Code write its own prompts. Read them before using them, or face the consequences.

Context Creep: Why Does Claude Code Suck? Why did Anthropic Quantize Their Model? Why Has Quality Declined? Why Is Everything Awful?

There’s a refrain you see on Reddit every week where a user is complaining about a deterioration of some kind in the quality of output and agentic behavior they’re seeing in Claude Code. More often than not, this is one of two issues. Either their application has achieved scale and they need to add a language server like Serena or equivalent, or their context has become much like the third drawer in my kitchen: filled with random things I once thought might be useful additions, and badly in need of a cleanup to throw out the useless stuff.

If you’re noticing a deterioration in the quality of Claude Code, first check what is in your context. In other words, do a bit of context engineering. Check your CLAUDE.md file, check any files that are imported into that, and consider when last you ran /clear to clear the context. Also consider what MCP servers you have connected and what they may be adding to your context window. Lastly you can check /model to make sure you’re still on Opus, but this usually isn’t the issue in my experience.

Claude Code Security

The section on Context Engineering was intentionally placed before this section, because many Claude Code vulnerabilities arise from an attacker gaining malicious access to your context window and being able to engineer your context.

In cybersecurity we have a concept called “Taint Analysis” where we consider data that might be tainted because it is from an outside and potentially malicious source. Looking at Claude Code from a Context Engineering perspective, consider paths that tainted data may take to end up in your context window.

One possibility is that you are using a command line tool or an MCP server to bring tainted data into your context window. An example of this is fetching GitHub issues from a public repository or fetching blog comments from a public blog. Both of these can place tainted data into your context window. The GitHub example is easier to exploit because you’re asking Claude Code to take action based on the content of an issue, and if that content contains malicious instructions, Claude Code is more likely to follow them. If your comment moderation system is built in such a way that Claude Code may treat a comment as instructions, that could create a similar vulnerability.

Another possibility is that you’re connecting to an MCP server that has been compromised by an attacker. For example, a WordPress website may offer an MCP server to agentic clients, and that website may have been compromised, and the tools replaced with malicious tools. The instructions for tool use are placed in your context window as part of the MCP protocol, and an attacker has now managed to engineer your context window to exfiltrate sensitive data from the workstation where you’re running your agent. To summarize: The source of tainted data in your context window is a compromised MCP server.

While terms like Prompt Injection and MCP Tool Poisoning have gained some traction, I prefer to think about this vector as a Context Poisoning Attack. Given the broad range of ways an attacker can drop tainted data into your context window to engineer an attack, I think a broader term is warranted. The mitigation is to protect your context window. You do this by being able to enumerate all sources of data in your context window, and guarding against tainted data entering your context window.
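
One way to make “enumerate all sources of data in your context window” concrete is to keep an explicit inventory and flag anything an outsider can write to. What follows is a minimal Python sketch of that idea, not a feature of Claude Code. The ContextSource class, the tainted_sources function and the example entries are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContextSource:
    """One thing that can place text into the agent's context window (hypothetical model)."""
    name: str
    writable_by_outsiders: bool  # True if someone outside your org can influence its content


def tainted_sources(sources):
    """Return the names of sources that an outsider could use to inject instructions."""
    return [s.name for s in sources if s.writable_by_outsiders]


# Example inventory; every entry here is illustrative, not a real configuration.
inventory = [
    ContextSource("CLAUDE.md written by our team", False),
    ContextSource("issues on a public GitHub repo", True),
    ContextSource("comments on a public blog", True),
    ContextSource("third-party MCP server tool descriptions", True),
]

for name in tainted_sources(inventory):
    print(f"TAINTED: {name}")
```

Anything this flags is a path by which tainted data can reach your context window, and deserves either a human review step or removal from the workflow.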

A Layered Approach to Claude Code Security

In cybersecurity we have what is known as the defender’s dilemma and the attacker’s advantage. The attacker’s advantage is that an attacker only has to be right once. The defender’s dilemma means that we have to be right every time, and a single failure on the defender’s part results in a compromise, and potentially a data breach. Thus we take a layered approach to security to try to gain back some advantage.

A layered approach to security, also known as defense in depth, means that we have multiple controls in place and, if one of them fails, there are others that will not fail and that will stop the attack.

So far we’ve enumerated one layer of Claude Code security, which is to Protect Your Context Window. Let’s come up with a few more layers.

Enforce the principle of minimum access. Only give Claude Code the access that it needs for the task at hand. If you’re using gh or git from the command line, make sure Claude Code only has access to the repos it needs to access to do the job, and limit that access type or level appropriately. For example, if read-only access to a single repo is what is needed, then only grant that with the appropriate token permissions. If you are using an MCP server for your customer service team, make sure Claude Code via the MCP server has the minimum required access to do the job.

Only connect one MCP server at a time. This reduces the probability of having an infected MCP server connected, while simultaneously having a second MCP server connected that can grant an attacker extraordinary access. For example, if you have an MCP server connected to facilitate access to a public GitHub repo, and you have another connected with admin level WordPress access, an attacker could open a malicious issue on the public repo which creates an admin account on your WordPress website.
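
If you want a quick check for this rule, here is a minimal Python sketch. It assumes your MCP servers are listed in a JSON file with a top-level "mcpServers" object keyed by server name, and that the file is called .mcp.json in the project directory. Both the file name and the layout are assumptions here, so verify them against your own setup. The function names are hypothetical.

```python
import json
from pathlib import Path


def configured_mcp_servers(config_text):
    """Return the sorted names of MCP servers found in a JSON config string.

    Assumes a top-level "mcpServers" object keyed by server name.
    """
    config = json.loads(config_text)
    return sorted(config.get("mcpServers", {}))


def check_single_server(config_text):
    """Return a warning string if more than one server is configured, else None."""
    names = configured_mcp_servers(config_text)
    if len(names) > 1:
        return "More than one MCP server configured: " + ", ".join(names)
    return None


if __name__ == "__main__":
    path = Path(".mcp.json")  # assumed location; adjust for your setup
    if path.exists():
        warning = check_single_server(path.read_text(encoding="utf-8"))
        print(warning or "OK: at most one MCP server configured")
    else:
        print("No .mcp.json found in the current directory")
```

This only inspects one configuration file. Servers configured elsewhere would not be seen, so treat it as a reminder rather than a guarantee.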

Confirm tool use and command line use by default. When using Claude Code you’ll be prompted to approve the use of a command line invocation or a call to a tool on an MCP server. You’ll often be asked if you want to approve all future calls and to not be prompted again. Your default should be to not approve all future calls. This ensures you have eyes on what Claude Code is doing, and it can act as a security control to catch any malicious activity. It has the added advantage that you attend Claude Code’s execution and produce robust applications, rather than just YOLOing your way along as you vibe code AI slop.

Never use --dangerously-skip-permissions. Unless you’re running an air-gapped system with no outside access, which would prevent you from running Claude Code in the first place, I would strongly recommend never using --dangerously-skip-permissions. In other words, never use it. Running with --dangerously-skip-permissions removes a human as an additional security control and also leads to a brute force approach to coding, where Claude Code will go astray, get lost, lose track of what it’s doing or invent some new unrelated task. Usually this is mitigated by simply running more --dangerously-skip-permissions Claude Code instances in more terminals, and inevitably filming 20 terminals running Claude Code concurrently, all with --dangerously-skip-permissions, and hoping for a few social media likes.

The Myth of the 12 Hour Claude Code Run and 20 Concurrent Sessions

As with the dot-com boom, with the AI revolution we have self proclaimed experts on social media convincing you that they are at the cutting edge and that you “don’t get it” but if you follow their guidance carefully, you might just be able to keep up. The two most popular patterns are 20-terminal-guy and 12-hour-speedrun-guy.

20-terminal-guy is someone who is running Claude Code with --dangerously-skip-permissions in 20 or more concurrent terminals, supposedly using their own CLAUDE.md and highly customized secret-sauce MCP server, creating startups every hour complete with website, sales, marketing, SaaS product, operations and customer service and leaving you in the dust. 20-terminal-guy’s secret-sauce MCP server is backed by a meme coin on the Solana blockchain and if you get in at the ground floor, not only will you have a startup-printing-machine but you’ll get crypto-rich real quick. This is a real-life example, believe it or not.

12-hour-speedrun-guy has Claude Code working continuously, also using --dangerously-skip-permissions and running unattended. Claude Code has the tendency to invent unrelated tasks, come up with its own product roadmaps and implement them, and forget what it’s supposed to be doing as the context window gets down to around 40% remaining, so it needs frequent guidance. 12-hour-speedrun-guy has carefully chosen a few minutes to film where Claude Code doesn’t lose the plot and is trying to convince you that this is the norm for hours, and that something useful pops out at the end of the run.

Don’t believe their lies. If 20-terminal-guy and 12-hour-speedrun-guy truly had this stuff figured out, they wouldn’t be hustling for likes on social media.

Interrupt Early, Interrupt Often

When I use Claude Code I do approve certain shell commands and sometimes watching Claude Code perform several tasks before it pauses for permission is a thing of beauty that I’m loath to interrupt. But interrupt you must, even when it feels like you’re being rude. Hit ESC. Even if you’re unsure, just hit ESC to interrupt Claude Code. You won’t break anything, and it won’t lose the plot. Anthropic have done a great job of designing Claude Code to be interrupted, while retaining the conversation thread.

When you interrupt Claude Code, tell it why you interrupted in plain English, and explain the mistake that you think it’s making or the misunderstanding you think it has. You don’t have to add a suggestion, but if you have one, it helps. Here are two examples:

“I think you’ve misunderstood what we’re working on. I wanted you to add a contact form feature to the website code, but you appear to be refactoring the entire code base. Please undo any changes that you need to undo, and focus on the task I’ve asked you to do.”

“Stop. I asked you to create a planning document but you’ve started implementing it. Finish the planning document and then stop so that I can read it before we continue.”

Most of your interruptions will be to correct navigational errors, meaning that Claude Code has decided to do something that is premature, unrelated or incorrect, and you need to get it back on the desired path. You define the path. Claude Code isn’t a mind reader. ESC is a way for Claude Code to get an update from the skipper on what the goals are and how the team of you and Claude Code are going to get there. You need to think more like an executive and less like a developer with Claude Code.

Commit Early, Commit Often

Always use git and GitHub. Git gives you version control. GitHub gives you a remote repo that’s protected from Claude Code craziness. You may be like me where you like to do big lifts before each commit – in other words, large code changes before committing. I do this because it’s satisfying looking at large diffs with long commit messages, and I’m also a sadomasochist. Like me, you’re going to have to change the way you work with Claude Code and commit far sooner than you normally would, and commit often.

When you commit, push to the remote repo on GitHub to protect your code. Then, even if Claude Code decides to rm -Rf, you’re OK. I’ve never seen it do that. For the record.

There’s a sickening feeling that starts in the pit of your stomach when you’re asking Claude Code to do a big lift and you realize you haven’t committed for some time. Spare yourself that sensation and commit early, commit often. Not committing means that if the current project goes awry, the consequences are that Claude Code may not be able to back out the changes, and you may have to revert to a PREVIOUS commit, which means you lose all work beyond that point.

I have Claude do my commits using the git command line tool and I have it write my commit messages. If I’ve just cleared the context, I have it look at the diff and base the message on that. I have it also push to remote. I’ll also have it create a new feature branch if needed. Claude Code is very good at using git.
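
If you want the same commit-and-push checkpoint available outside of Claude Code, here is a minimal Python sketch. It assumes git is installed, that you are in a working tree, and that a remote named origin exists. The function names checkpoint_commands and checkpoint are hypothetical, and staging everything with `git add --all` is a choice made for brevity that you may want to narrow.

```python
import subprocess


def checkpoint_commands(message, remote="origin", branch=None):
    """Build the git commands for a stage, commit and push checkpoint."""
    push = ["git", "push", remote]
    if branch is not None:
        push.append(branch)
    return [
        ["git", "add", "--all"],           # stage every change in the working tree
        ["git", "commit", "-m", message],  # exits non-zero if there is nothing to commit
        push,                              # copy the commit to the remote
    ]


def checkpoint(message, remote="origin", branch=None):
    """Run the checkpoint; raises CalledProcessError if any git step fails."""
    for command in checkpoint_commands(message, remote, branch):
        subprocess.run(command, check=True)
```

A call such as `checkpoint("Add contact form validation", branch="feature/contact-form")` would stage, commit and push in one step. The message and branch name in that call are made up for illustration.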

/clear Early, /clear Often

The context window in Claude Code is the history of your conversation thus far since you launched Claude Code or since the last time you ran /clear. When the context window drops down to around 40% left, a message will appear in the bottom right of the terminal counting down the remaining percentage. At around 35% to 40% remaining, Claude Code’s cognitive ability can begin to decline subtly. It’s very subtle, and if you’ve carefully engineered what’s in the context window, you may notice that Claude Code actually benefits from the data and context you’ve provided. You can go all the way down to single digits remaining. But know that Claude Code is at its best with more than 40% context window remaining.

For this reason, you should /clear the context window as soon as you’ve finished a task and are ready to move on to the next task.

And because you should clear context often, you should get good at breaking tasks into bite sized chunks. This is both an art and a skill. Put effort into it and you will be rewarded. Implementing a major new feature or fixing a major bug can be challenging to break into bite sized tasks, but it’s worth the effort. And don’t forget to commit after every step.

What To Do When You’re Running Out Of Context

Sometimes you’ll watch that context window percentage gradually tick down, hit single digits, the hair will stand up on the back of your neck and you’ll break out in a cold sweat, because you understand the consequences. The dreaded Compact.

When Claude Code gets down to zero context window, it will auto-compact, which tries to summarize the context window thus far into a smaller number of tokens, and it will then use that summarized version along with whatever else is added. It’s a disaster. Never allow Claude Code to auto-compact, and don’t bother running a manual compact. The results are always disappointing in my experience.

When I get down below 10%, I have a file called meta-pickup.md which contains the following:

Start by erasing the current pickup.md file in the current working directory. Then create a new file in the current working directory called pickup.md which will help you pick up where you left off.

I’m going to clear this session, which means I’m going to wipe your memory, and then I’m going to start a new session.
When I start the new session, the pickup.md file is the only resource you’ll have to help you pick up where you left off.

So you need to write all the information to pickup.md that you will need to be able to pick up where you left off before I wiped your memory.

Don’t make any assumptions about what we’re going to work on next. So for example, don’t create a section called “Next Possible Steps” or anything that tries to predict what the user will ask of you. You’re not being asked to add planning information, you’re just being asked to preserve your memory.

Just provide yourself with the information in the pickup.md file you need to continue with whatever task the user asks of you once your memory is cleared.

Go ahead and create that file now, then I’m going to clear your memory, and then we’re going to load that file and continue working.

This content exists in the meta-pickup.md file and I invoke it with the following prompt to Claude Code: Read @meta-pickup.md

That’s it. Claude Code will immediately take the action described. Then I run /clear and then I review what is in pickup.md, and then I’ll say to Claude Code: Read @pickup.md and stop.

On the review step, I may make minor changes, which are usually deletions of extraneous stuff that isn’t needed. This is one technique you can use to perform extended tasks or heavy lifts with Claude Code.

Slash Commands with Claude Code

Slash commands are very easy to create. Just place a markdown file in ~/.claude/commands/example.md with ‘example’ replaced with the name of the slash command. This is an underrated and powerful feature because you can create pre-built prompts for your team and pass in arguments. Details in the Claude Code docs.

We use this with our customer service team, who often have repetitive tasks that they can speed up using pre-built prompts with variables.
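As a rough sketch of what a pre-built prompt with a variable can look like, the snippet below writes a command file from Python. It assumes the $ARGUMENTS placeholder described in the Claude Code docs is how text typed after the command gets passed in. The command name summarize-ticket, the prompt wording, and the helper write_slash_command are hypothetical and only there for illustration.

```python
from pathlib import Path


def write_slash_command(name: str, prompt: str, commands_dir: Path) -> Path:
    """Write the prompt text to <commands_dir>/<name>.md and return the path."""
    commands_dir.mkdir(parents=True, exist_ok=True)
    path = commands_dir / f"{name}.md"
    path.write_text(prompt, encoding="utf-8")
    return path


# Hypothetical customer service prompt. $ARGUMENTS is assumed to be replaced
# with whatever is typed after the slash command.
TICKET_PROMPT = (
    "Summarize the following support ticket in three bullet points, "
    "then draft a polite reply:\n\n$ARGUMENTS\n"
)
```

Calling write_slash_command("summarize-ticket", TICKET_PROMPT, Path.home() / ".claude" / "commands") would then create the file that backs /summarize-ticket. Keeping the prompt in a script like this makes it easy to distribute the same command to a whole team.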

CLAUDE.md and importing files into CLAUDE.md

CLAUDE.md is imported into the context window when you launch Claude Code and every time you clear the context window. A well written CLAUDE.md should include the following:

What the application does
What language it’s in and which version of the language and upwards is supported e.g. PHP 7.4 and above.
Any coding styles and peculiarities.
What the directory structure is of the project
Where the executable code lives
Where the database schema lives
If Docker is used and what the Docker structure is and where the files live
What services the application runs, what their structure is i.e. what containers or IPs they listen on and port numbers.
If there’s a repo for the project and what the URL is, and the preferred access method e.g. HTTPS or SSH
If gh is available as a utility
If git on the command line is available
What you want in your commit messages.
Preferred tools e.g. use grep instead of ripgrep, if that’s a preference of yours.
Also mention specific tools e.g. I’ll install Simon Willison’s LLM utility and explain to Claude Code how to make calls to Gemini 2.5 Pro via llm and why it’s useful for long context window operations like reading big files.
How to do certain tasks, like flushing rewrite rules on a web server when making changes, or how to check if a service is running.
Where log files can be found.
If OAuth is used, what the URLs are. Claude Code is pretty good with curl to check or simulate or test things.

That should be a good start for you and give you a sense of what is useful. Translate these kinds of elements into your own application, whatever language it’s written in and however it is architected.

Maintaining CLAUDE.md is critically important. One of the biggest reasons for complaints around Claude Code’s lack of performance is bloat or lack of currency in CLAUDE.md.

The emerging standard is to commit the CLAUDE.md file at the base of a repo in the same directory that a README.md file would be. That means that whatever is in CLAUDE.md should be broadly applicable to any developer coding using Claude Code on the project.

So how do you include content that is specific to an individual developer’s preferences? You can import arbitrary files into CLAUDE.md using the @ operator. And you can put something like this in the project CLAUDE.md file:

@~/claude-imports/CLAUDE-projectname.md

This assumes an individual dev has created the above path in their home directory. If it doesn’t exist, it fails silently.

This is why you should read every CLAUDE.md file in a repo you clone before launching Claude Code on it. You are bringing tainted data, as discussed earlier, into your context window, essentially giving an outside user access to engineer your context window.
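One way to make that review step quicker is to list every CLAUDE.md in a freshly cloned repo along with the files it pulls in. The sketch below is a minimal version of that idea. It assumes an import is an @ followed by a path, at the start of a line or after whitespace, and that relative paths resolve against the directory containing the CLAUDE.md. The names find_imports and audit_repo are hypothetical, and this does not replace actually reading the files.

```python
import re
from pathlib import Path

# Assumed import syntax: "@" followed by a path, at the start of the text or
# after whitespace (so email addresses like a@b.com are not matched).
IMPORT_PATTERN = re.compile(r"(?:^|\s)@([~\w./-]+)")


def find_imports(claude_md_text: str) -> list[str]:
    """Return every @-style import path found in the text, in order."""
    return IMPORT_PATTERN.findall(claude_md_text)


def audit_repo(repo_root: Path) -> None:
    """Print each CLAUDE.md under repo_root and the files it imports."""
    for claude_md in sorted(repo_root.rglob("CLAUDE.md")):
        print(claude_md)
        text = claude_md.read_text(encoding="utf-8", errors="replace")
        for raw in find_imports(text):
            if raw.startswith("~"):
                resolved = Path(raw).expanduser()
            else:
                # Assumption: relative imports resolve against the importing file.
                resolved = claude_md.parent / raw
            status = "exists" if resolved.exists() else "missing"
            print(f"  imports {raw} ({status})")
```

Running audit_repo(Path(".")) from the root of a clone gives you a reading list before you launch Claude Code on it.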

Creating Planning Documents

Creating a planning document is one of the biggest level-up experiences you’ll have with Claude Code. I’ve worked for over 5 hours on a single planning document for an application and had Claude Code implement the code for it in 10 minutes and it worked out of the box. Planning docs are an absolute game changer.

Generally we only employ senior level staff in dev, ops, QA and other fields. One of the benefits of being at this level is that one has a vast amount of systems knowledge. In other words, you know how code is supposed to be architected, you’re familiar with design patterns and their practical uses and misuses, you know how systems fit together and interact from web servers to DB to key based storage, where network latency exists and on and on and on. Claude Code has some of this knowledge, but not specific to your application and can’t read your mind, so has no sense of the vision you have for how your application should be built, and why it should be built that way.

Defining a planning document starts with you crafting a well written and fairly long prompt asking Claude Code to create a planning document for you. Let’s say I want to create a WordPress plugin that provides a real-time chat widget for anonymous users that appears as a fixed widget in the bottom right corner of the browser. I’m going to craft a prompt for that. I haven’t run this prompt or tested it, but I’m fairly sure it will yield excellent results with some iteration on the planning doc. Here goes:

We’re going to work on a planning document for an application. The application is a WordPress plugin that provides a JS based chat widget in the bottom right of the browser. The WordPress plugin is written in PHP and will use the WordPress plugin API. The user will install it in the usual way and it will provide a real-time chat widget that visitors to the site can use to interact with each other. The actual chat widget will be written in JS, but the back-end will need to support websockets, which PHP can’t provide. So as part of this application, we’re going to create a node.js server that can act as the back-end for this application. The node.js server will run on the same domain as the website running WordPress, so we don’t need to worry about cross-domain requests and CORS too much, but you should consider if there are any issues around that. The node.js back-end server and the plugin will all live in the same repo and the node.js code will be distributed with the WordPress plugin with instructions on how to set it up and run it. The WP admin interface will have a menu option to enable or disable the chat server. We won’t be logging any chats for now – they’ll just be passed in real-time between users. A message from one user will be seen by all other users and vice versa. There are no channels. We should create a /nick command so that users can define their nickname, and when they join they’re given a default random nickname. The chat window will appear on all web pages and will take about one fifth to one eighth of the page width/height but this should be definable by the user.

[And so on and so on… ending with] Now please go ahead and create the planning document for this application that a developer will use to implement it. You don’t need to include an implementation schedule and you don’t need to speculate on future features. Remember that your document is the only resource the developer will have to implement this application, so you should include as much detail as possible. Generate the document in markdown format in the planning-docs/ directory. Then stop and don’t implement it. <enter>

My typical prompts for a planning document are longer than this as you can see from the placeholder. Once Claude Code generates the planning document I’ll use a markdown reader to read the planning doc and usually I’ll delete some sections that aren’t needed. I won’t make any changes directly, instead I’ll do that through Claude Code so I can benefit from its capabilities, and I’ll do it in the same context as the planning doc was created without /clear’ing context.

Once I’ve made changes, I’ll re-read the document, make any deletions, then ask Claude Code to make any changes. And iterate this way.

I’ll often discuss a particular architectural element with Claude Code and then ask it to make a change with our shared understanding of the issue.

I’ve iterated like this for extended periods, and once done, I’ll simply do the following:

/clear
Implement @planning-docs/the-planning-doc.md

You’ll notice I’ve used the @ sign a few times in this post. It’s a way to unambiguously point Claude Code to a file and it gives you auto-completion. If auto-completion doesn’t work, it’s because you have a broken symlink in your directory tree somewhere. Anthropic is aware of this issue.

Using planning documents you can perform massive lifts. You can literally one-shot (create in a single run) complex applications that work out of the box. For big projects, I’ve used planning documents for every major step of the way.

Complexity, Scaling Your App and Maintaining Enterprise Applications

Claude Code hits a hard complexity limit which can be jarring if you’ve just one-shotted an application using a planning document and witnessed Claude Code’s cognitive power. It raises the question “You managed to create this thing of beauty out of thin air, why can’t you add this simple feature, or fix this simple bug?”

When Claude Code one-shotted your app, it was able to build smoothly from the known to the unknown. And solving for the unknown meant producing a logical and sensible completion of what came before, which is what LLMs are great at, particularly if they generated the content that preceded the completion.

The trouble arises the moment you type /clear and hit enter.

Suddenly Claude Code is working with a new application that someone created. It doesn’t know it is the author. And it doesn’t know anything about the application. If you add a feature or ask it to fix a bug, Claude Code has to start from scratch, figure out what the application is about, what its structure is, figure out what it does, how it’s supposed to work, and how to get to the part of the code it even needs to work on, never mind extending the code with a consistent coding style, conforming to any class hierarchy, file structure norms, etc.

Importing architecture documents into CLAUDE.md

Creating an architecture document with Claude Code is fairly simple. Try the following series of prompts:

We’re going to create an overview architecture document that provides a high level summary of the code structure in this project. We’re going to include directory structure, where the main code base files are, where any data structures or schema definitions live, the class structure if it’s OO code or function structure if it’s functional, where any config files live, where the documentation lives and any other pertinent information that a junior developer would need to get started working on this code base. Create the document in a file called @CLAUDE-architecture.md

[Claude Code does a bunch of stuff]

/clear

Read @CLAUDE-architecture.md and then do a deeper dive on the code structure of this project, refactoring the CLAUDE-architecture.md document and creating a more detailed description of the code, while preserving existing data, provided it is accurate. Also take a deeper look at data structures and schemas and any data storage and flesh out that part of the document as needed and correct any inaccuracies.

[Claude Code does work]

/clear

Read @CLAUDE-architecture.md and then examine this project, its code and its data, orchestration and storage and correct any inaccuracies in the document and add any additional information that would be pertinent for a junior developer to get started working on this project.

/clear

Now read the document yourself, delete anything you don’t want by hand, and ask Claude Code to fix anything that needs fixing. I would suggest not clearing context as you go through the fixes because some fixes may be relevant to others based on a common misunderstanding by Claude Code.

That’s it. Now I’m of two minds regarding how this document should be used. One approach is to add @CLAUDE-architecture.md to my CLAUDE.md file, which is what I’m currently doing. Another is to leave it out and import it manually when needed. I’m leaning towards importing manually because it gives you situational awareness of what is in your context window. You can try both and see what works for you. Perhaps start with the manual import.

Creating an architecture document gives Claude Code an immediate bootstrap into understanding the project and having a map to get around and get stuff done. It’s a huge step up. If you have a big complex schema, you may want to separate out the schema and DB specific stuff, including stored procs/triggers and table relationships and constraints into a separate document. You could even use mysqldump or a similar utility for whatever your DB engine is to dump the schema without the data, into a schema doc. Then have Claude Code process it into markdown and reference that in your architecture doc using the @ import operator.
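For MySQL or MariaDB, a schema-only dump could look like the sketch below. It uses mysqldump’s --no-data flag to skip row data and --routines to include stored procedures. The database name, user, output path and the helper names schema_dump_command and dump_schema are placeholders, and other DB engines will need their own equivalent utility.

```python
import subprocess
from pathlib import Path


def schema_dump_command(database: str, user: str) -> list[str]:
    """Build a mysqldump command that exports structure only, no row data."""
    return [
        "mysqldump",
        "--no-data",   # skip table contents
        "--routines",  # include stored procedures and functions
        "--triggers",  # include triggers (on by default, stated for clarity)
        f"--user={user}",
        "--password",  # no value given, so mysqldump prompts for it
        database,
    ]


def dump_schema(database: str, user: str, out_file: Path) -> None:
    """Run mysqldump and write the schema to out_file."""
    with out_file.open("w", encoding="utf-8") as handle:
        subprocess.run(schema_dump_command(database, user), stdout=handle, check=True)
```

The resulting SQL file is what you would hand to Claude Code to turn into a markdown schema document.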

THE CATCH: You have to maintain your architecture docs!

That’s the downside of creating a ton of architecture documents for Claude Code. If they go out of date, you end up with irrelevant junk in your context window and you are doing a bad job of context engineering, which is going to lead to Claude Code’s IQ inexplicably decreasing over time until you realize the issue and fix it. You may want to save the above prompts to update your architecture docs in a file or turn them into a slash command.
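A crude way to get a reminder is to compare the document’s modification time against the newest source file. The sketch below does only that, under the assumption that file timestamps are a reasonable proxy for “the code changed after the doc was written”. The helper names and the default suffixes are hypothetical, and a newer source file does not prove the doc is wrong, only that it deserves a look.

```python
from pathlib import Path


def newest_mtime(root: Path, suffixes: tuple[str, ...]) -> float:
    """Return the most recent modification time among matching source files."""
    times = [
        p.stat().st_mtime
        for p in root.rglob("*")
        if p.is_file() and p.suffix in suffixes
    ]
    return max(times, default=0.0)


def doc_is_stale(doc: Path, root: Path, suffixes: tuple[str, ...] = (".php", ".js")) -> bool:
    """True when any matching source file changed after the doc was last written."""
    return newest_mtime(root, suffixes) > doc.stat().st_mtime
```

For example, doc_is_stale(Path("CLAUDE-architecture.md"), Path(".")) could run in CI or a git hook and print a warning when it returns True.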

Serena and how it will save your soul

I was coding a browser based application that included a modal window that is draggable and dockable. In other words, I had a single file with a giant JS class that needed to respond to events in specific ways. The JS had gotten fairly long so Claude Code wasn’t able to read the entire file. So it was ripgrep’ing its way around the code base.

The specific problem I had was having the window dock on corners vs the side of the browser viewport. Claude Code would get the corner logic fixed, but would break something else. Then the something else would get fixed, breaking the corner logic.

Enter Serena which provides language server functionality for a range of popular languages including PHP, Python, TypeScript, Go, Rust, C#, Java, C and C++ and more. What Serena adds is the kind of function and class navigational capability that you have in an IDE.

The docs say: “Serena provides essential semantic code retrieval and editing tools that are akin to an IDE’s capabilities, extracting code entities at the symbol level and exploiting relational structure. When combined with an existing coding agent, these tools greatly enhance (token) efficiency.”

However what I’ve found with Serena is it enhances the capability of Claude Code to scale complexity, not just improving token efficiency. I had struggled for about a day with the issues in my JS modal window class, and stubbornly was not looking at the code because I wanted Claude Code to prove to me it could go it alone. But no amount of architectural doc generation, or planning doc creation could get me out of the hole. With Serena, Claude Code solved the issue in two attempts. The first try didn’t fix it. I simply said that “that didn’t fix it”. The second try nailed it. Unbelievable.

When you hit a certain level of complexity in your code, Serena will almost literally save your soul from the abyss. I’ve had it notice duplicate methods in a class that Claude Code didn’t realize were there because it was just ripgrep’ing around the code and not taking a high level view. I’ve also had Serena improve the generation of reference documentation like the class structure in CLAUDE-architecture.md.

Serena Caveats

Pay attention to this in the current version of the README.md in the Serena repo:

Serena comes with an instruction text, and Claude needs to read it to properly use Serena’s tools. Once in Claude Code, you can ask to “read Serena’s initial instructions” or run /mcp__serena__initial_instructions to load the instruction text. Do this whenever you start a new conversation and after any compacting operation to ensure Claude remains properly configured to use Serena’s tools.

You’ll need to follow this, or whatever the current version of Serena wants you to do, every time you /clear the context. If you don’t do this, it won’t work.

Secondly, Serena is an MCP server that you run locally and if you don’t initialize it, Claude Code may or may not call the Serena tools. So to turn it off you have to actually shut down the MCP server e.g. using CTRL-C. The reason I bring this up is that Serena doesn’t make everything work better. It is only very useful for coding tasks that involve navigating through code, making changes, generating documentation about the code, and so on. You’ll find that sometimes you want it to just go away because it’s being used inappropriately by Claude Code. So to do that you have to kill the MCP server.

Lastly I’ll add that this problem of highly autonomous agents like Claude Code navigating code bases is new and everyone wants it solved, so keep an eye out for even better open source projects like Serena emerging, and keep an eye on the Serena repo.

Switching Models, Getting Throttled, Getting Downgraded

My entire team is on a Claude MAX subscription at $200 a month per seat. We rarely get throttled or downgraded from using Opus 4 to Sonnet 4. You can check which model you’re using with /model and confirm you’re on Opus.

At the time of writing there have been reports of Claude Code limits being tightened for MAX users. I have not experienced this and neither has our team. I think what is happening is there has been flagrant misuse of Claude Code by folks brute force coding with many terminals concurrently, with --dangerously-skip-permissions, particularly social media influencers showing off. Until recently the $200 monthly MAX subscription was essentially unlimited. That no longer appears to be the case, but from a functional perspective, we’re not finding that any new limits are affecting us.

I’ve also seen reports on Reddit of the cognitive capabilities of Claude Code being downgraded. I think this may be related to some of the issues I discuss in this post, like increased application complexity that isn’t being managed, or polluted or stale CLAUDE.md and imported files.

Ultrathink and Considering Test Time Compute

I’m convinced “ultrathink” will be a 2025 neologism. Hopefully. I love it. Anyway, you can use this word ‘ultrathink’ as shorthand when talking to Opus or Sonnet in Claude Code, when asking the model to think deeply about something, which will cause the LLM to do more planning and engage in more chain of thought on an issue.

For example: Ultrathink about how to implement….

Ultrathink is underrated, as demonstrated by the progress made this year using test time compute to compete successfully at the International Math Olympiad. According to OpenAI researcher Alexander Wei: Besides the result itself, I am excited about our approach: We reach this capability level not via narrow, task-specific methodology, but by breaking new ground in general-purpose reinforcement learning and test-time compute scaling.

Also from OpenAI researcher Noam Brown: Also this model thinks for a *long* time. o1 thought for seconds. Deep Research for minutes. This one thinks for hours. Importantly, it’s also more efficient with its thinking. And there’s a lot of room to push the test-time compute and efficiency further.

Increasing test time compute makes a model think longer and self reflect on its own output during computation. Some approaches are:

Chain-of-thought reasoning: Models generate intermediate reasoning steps before arriving at final answers, using more computational steps to work through problems systematically.
Self-reflection and verification: Models can critique their own outputs, check for errors, and refine their responses through multiple passes.
Search and exploration: For complex problems, models might explore multiple solution paths or generate several candidate answers before selecting the best one.
Iterative refinement: Models can revise and improve their outputs through multiple rounds of generation and editing.

We are already entering the realm of test time compute using planning documents as a way for Claude Code to consider and change its own output. Using ULTRATHINK we are having the LLM engage in self-reflection and in chain of thought reasoning.

It would be quite easy to have Claude Code generate five alternative solutions to a problem and then consider which is best, and refine that one.

As you’re thinking about your Claude Code usage strategy, and how to get the best cognitive bang for your per-token buck, consider how much progress is being made with test-time compute. I would suggest thinking of these approaches as cognitive architectures and put your cognitive architecture design hat on when coming up with ways to get more out of Claude Code beyond ULTRATHINK and beyond planning documents.

Putting it Together: A Workflow to Develop Advanced Applications Beyond Your Abilities

I’m going to give you a real world example of an application I’ve just created that is far beyond my own abilities. This will serve as a guide that you can use to push your own capabilities to the absolute limit with Claude Code, and hopefully leave you as excited as I am by the possibilities that Claude Code enables.

I’m a radio ham and my callsign is WT1J, but I don’t have a degree in radio engineering, I’m bad at math, and I’m not great at creating applications that run on the GPU. However, a problem I’m interested in solving is how to listen simultaneously to every voice transmission on the airband VHF radio spectrum, record them all, and monitor them for emergencies and urgent issues.

What this means is that I need to record from 118 MHz to 137 MHz with 25 kHz channel spacing and be able to record up to around 50 concurrent transmissions to a WAV audio file. That’s a big ask. But it solves an important problem. Here’s the approach I took to solving this:

Thankfully Ettus produces a radio called the B210 which can monitor around 60 MHz of bandwidth. So I ordered one from Taiwan and it arrived in a few days. Isn’t the modern postal service incredible?
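To put numbers on that requirement, the arithmetic below works out how many channels the band contains. It assumes channels sit on a 25 kHz grid starting at exactly 118.000 MHz. The constant and function names are illustrative only.

```python
BAND_START_HZ = 118_000_000
BAND_END_HZ = 137_000_000
CHANNEL_SPACING_HZ = 25_000

bandwidth_hz = BAND_END_HZ - BAND_START_HZ          # 19,000,000 Hz, i.e. 19 MHz
channel_count = bandwidth_hz // CHANNEL_SPACING_HZ  # 760 channels


def channel_center_hz(index: int) -> int:
    """Center frequency of channel `index`, counting up from the band start."""
    return BAND_START_HZ + index * CHANNEL_SPACING_HZ
```

So the whole band is 19 MHz wide, which fits comfortably inside the bandwidth the radio can capture, and it splits into 760 channels running from 118.000 MHz up to 136.975 MHz.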

I use Claude.ai as a complementary tool because the ability it has to research things exceeds Claude Code’s capabilities. When I use Claude.ai for research I make the following changes to the default config:

Select Opus 4.1 on the bottom right of the chat input.
Enable Research mode.
Enable Extended Thinking
Verify that Web Search is enabled

I had an initial chat with Claude.ai with only Opus 4.1 enabled and WITHOUT extended thinking or research mode enabled. This helped me get familiar with the problem space, and potential solutions. It helped me understand that what I needed to process these signals in a high performance way is a polyphase channelizer. I googled around to find the latest research on fast implementations and found a scientific paper on GPU accelerated polyphase channelizers.

Then I enabled Claude.ai in the mode above, and gave it the following prompt:

Read this paper and produce a detailed report that provides everything a developer needs to implement a GPU accelerated polyphase channelizer in python using GPU libraries. ULTRATHINK DEEPLY about this and the content of the paper and provide a robust report that a dev can use for implementation.

https://asp-eurasipjournals.springeropen.com/articles/10.1186/1687-6180-2014-141

Here is the resulting report if you’re interested in signals intelligence or just want a sense of the kind of product this approach produces.

Claude.ai gives you the ability to download this as markdown, which is ideal as an interface or transition mechanism between Claude.ai and Claude Code.

I downloaded the MD file, had Claude Code read it and produce a second document that is a hardware, OS and software audit of the machine I’m running on and what I needed to install to get it to a place where I could have Claude Code build the polyphase channelizer.

I then had it implement a class in python that provides the GPU accelerated polyphase channelizer functionality.

I then had it implement a series of test scripts that provide mathematically generated signal data as input to the channelizer and had the tests evaluate the output. This included things like signal fidelity, channel leakage, concurrent signals and so on. This final step is incredibly valuable because it creates a framework that lets Claude Code rapidly test its own software and iterate on fixes. I’m moving towards first writing the tests and then having Claude Code implement the application, which is an approach known as Test Driven Development or TDD, and which is seeing a huge amount of success in the Claude Code community.
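To give a sense of what the channelizer and a synthetic-signal test look like at toy scale, here is a CPU-only sketch in plain Python. It is not the GPU implementation described above and is not taken from the paper. It is the textbook critically sampled form: weight a block of samples with a prototype low-pass filter, fold it into one branch per channel, then take a DFT across the branches. The Hamming-windowed sinc prototype, the 8-channel size and all function names are assumptions chosen for illustration.

```python
import cmath
import math


def prototype_filter(num_channels: int, taps_per_branch: int) -> list[float]:
    """Hamming-windowed sinc low-pass with cutoff at half a channel width."""
    length = num_channels * taps_per_branch
    center = (length - 1) / 2
    taps = []
    for n in range(length):
        t = (n - center) / num_channels
        sinc = 1.0 if t == 0 else math.sin(math.pi * t) / (math.pi * t)
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1))
        taps.append(sinc * window)
    return taps


def channelize(samples: list[complex], num_channels: int,
               taps: list[float]) -> list[list[complex]]:
    """Split samples into num_channels bands; one list of channel bins per step."""
    length = len(taps)
    frames = []
    for start in range(0, len(samples) - length + 1, num_channels):
        # Weight the current block of samples by the prototype filter.
        weighted = [samples[start + n] * taps[n] for n in range(length)]
        # Fold into one polyphase branch per channel (sum every M-th sample).
        folded = [sum(weighted[r::num_channels]) for r in range(num_channels)]
        # A DFT across the branches separates the channels.
        frames.append([
            sum(folded[r] * cmath.exp(-2j * math.pi * k * r / num_channels)
                for r in range(num_channels))
            for k in range(num_channels)
        ])
    return frames


def dominant_channel(frame: list[complex]) -> int:
    """Index of the channel carrying the most energy in one output frame."""
    magnitudes = [abs(value) for value in frame]
    return magnitudes.index(max(magnitudes))


# Synthetic check: a pure tone centered on channel 3 of 8.
tone = [cmath.exp(2j * math.pi * 3 * n / 8) for n in range(128)]
frames = channelize(tone, 8, prototype_filter(8, 4))
print({dominant_channel(frame) for frame in frames})
```

The last three lines are a miniature version of the mathematically generated test signals: the tone is placed on channel 3, so every output frame should report channel 3 as dominant and the script prints {3}. Checks for leakage into neighboring channels or for several simultaneous tones follow the same pattern.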

After adding more tests, and getting a 100% pass rate, I had Claude Code implement an application that uses the GPU accelerated polyphase channelizer to process any transmissions on the entire airband into WAV files, and it did a great job and was successful. I can now process these WAV files through whisperX to produce transcriptions and have Gemma27b running on my local system with Ollama read the transcripts and make a decision about whether there is an emergency on any frequency in the airband.
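The last step, handing a transcript to a local model, might be wired up roughly as follows. This sketch assumes Ollama is listening on its default local port and uses its /api/generate endpoint with streaming turned off. The model tag, the prompt wording and the function names are guesses for illustration, not the setup actually used here.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint
MODEL = "gemma3:27b"  # assumed tag; use whatever name `ollama list` shows


def build_request(transcript: str, frequency_mhz: float) -> dict:
    """Build the JSON payload asking the model for a one-word verdict."""
    prompt = (
        f"The following is a transcript of an airband transmission on "
        f"{frequency_mhz:.3f} MHz. Answer with exactly one word, EMERGENCY or "
        f"ROUTINE.\n\nTranscript:\n{transcript}"
    )
    return {"model": MODEL, "prompt": prompt, "stream": False}


def classify(transcript: str, frequency_mhz: float) -> str:
    """Send the transcript to the local Ollama server and return its answer."""
    body = json.dumps(build_request(transcript, frequency_mhz)).encode("utf-8")
    request = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request, timeout=120) as response:
        return json.loads(response.read())["response"].strip()
```

A one-word answer keeps the downstream logic trivial. Anything flagged EMERGENCY can then be routed to a human along with the WAV file and transcript.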

I have done a similar project in C++ instead of python, which worked quite well, and my C++ is dangerously rusty. Hopefully this gives you a sense of the kinds of powerful new capabilities that Claude Code and AI agents give to us humans.

What We’re Experimenting With

Creating a unified log file is something we’re experimenting with using Fluentd. This includes output from the browser console. It gives Claude Code a single place to check everything as it’s running curl requests, testing things, starting and stopping services and so on. I’ve heard excellent reports from others on the results of this.

Using git commit hooks to lint code and perform other tasks, and having Claude Code do the commits, review the output of any failures and fix them. I’ve also heard of a lot of success with this, and it’s something I want to play with.

Using Playwright MCP Server for browser automation. This solves the issue around Claude Code not being able to “see” what your browser is doing. Without this your only course of action is to drag a screenshot into the Claude Code terminal window, one at a time, which doesn’t scale.
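For the commit hook idea, a pre-commit script could be as small as the sketch below. It assumes a PHP project and uses php -l, PHP’s built-in syntax check, as the linter. Swap in whatever linter your project uses. The function names are hypothetical.

```python
import subprocess
import sys


def staged_files(suffix: str) -> list[str]:
    """Return staged (added, copied or modified) files ending in suffix."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only", "--diff-filter=ACM"],
        capture_output=True, text=True, check=True,
    )
    return [line for line in result.stdout.splitlines() if line.endswith(suffix)]


def lint(files: list[str], linter: list[str]) -> list[str]:
    """Run the linter on each file; return a message for each one that fails."""
    failures = []
    for path in files:
        result = subprocess.run(linter + [path], capture_output=True, text=True)
        if result.returncode != 0:
            failures.append(f"{path}:\n{result.stdout}{result.stderr}")
    return failures


def main() -> int:
    """Lint staged PHP files; a non-zero return value blocks the commit."""
    failures = lint(staged_files(".php"), ["php", "-l"])
    for message in failures:
        print(message, file=sys.stderr)
    return 1 if failures else 0
```

Saved as .git/hooks/pre-commit with a Python shebang line, made executable, and ending with a call to sys.exit(main()), this rejects any commit containing a PHP file that fails the check. When Claude Code runs the commit, the failure output lands in its context and it can fix the problem and try again.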

The Jagged Frontier of Progress and Simon Willison’s Amazing Blog

I would be remiss if I didn’t give a huge shout out to Simon Willison and the incredible job he’s doing on his blog. Hacker News used to be my first visit in the morning over coffee. Now it’s Simon’s blog, then HN. Simon is doing a great job of distilling current events and developments in AI along with providing situational awareness of the overall zeitgeist in the space. In other words, what ideas are resonating and may emerge as new standards, new technologies and new norms.

One of the concepts that Simon articulated in an interview he did is the Jagged Frontier of AI progress. From a transcript:

“There are things that AI is really good at and there’s things that AI is terrible at, but those things are very non-obvious”
“The only way to find out if AI can do a task is to sort of push it through the AI, try it lots of different times”
“People are still finding things that it can’t do, finding things that it can do, and trying to explore those edges”

I love this concept because it describes the broader ecosystem in the AI space. You have open source tools, models, agents, APIs and so on rapidly emerging and in some cases they’re half broken with terrible documentation but are solving critical problems and helping us make huge strides forward.

The edge is jagged, because, while you have some points on the edge that show incredible progress, there are huge gaps in areas like security, scalability, reliability, financial modeling and so much more.

It’s an incredibly exciting time, full of chaos and opportunity. The team at Wordfence have fully embraced AI internally, and we will be sharing upcoming posts discussing how it is providing an advantage as defenders, and as with this post, we’re hoping to put those tools in your hands to empower you as an AI practitioner.

Resources:

The post Pushing Boundaries With Claude Code appeared first on Wordfence.
