Letting Claude poke around the code
So I’ve been aware for a while that there were a handful of rough edges in the scripts — things I kept meaning to go back and clean up but never prioritized because they technically “worked.” Today I decided to actually sit down and work through some of them with Claude Code, which turned into a pretty productive session.
The audit
The first thing I did was have Claude read through demoClasses.py and demoManager.py and just tell me what it saw. These are the newer libraries I’ve been building for working with DEMO.DAT and VOX files — they’re meant to be a cleaner, more reusable foundation than the older extraction scripts.
The list it came back with was longer than I expected. The highlights:
- The demo class has an XML element initialization path that’s explicitly marked # TODO: THIS IS UNFINISHED! It sets a few attributes but never populates self.segments
- demo.toBytes() references self.items, which doesn’t exist; it should be self.segments. It also has an integer vs length bug in the padding check.
- captionChunk.toBytes() is just pass with a TODO comment
- Both audioChunk.toBytes() and demoChunk.toBytes() have a += + self.content typo that would throw a TypeError at runtime
- splitVagChannels() has uninitialized variables and uses 'rb' (read mode) where it should be 'wb' for writing
- parseDemoFile() in demoManager has a bug where the last demo gets the wrong slice of data
- The createXMLDemoData() fallback case calls root.append("unknownChunk", {...}), which isn’t valid for ElementTree
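Two of these are easy to reproduce in isolation. The following is a minimal sketch, not the actual demoClasses code: the function names and the header/content arguments are hypothetical stand-ins. It shows why += + fails on bytes (Python reads the second + as a unary plus, which bytes doesn’t support) and one way to attach a child element in ElementTree, where append() only accepts an already-built Element.

```python
import xml.etree.ElementTree as ET


def broken_to_bytes(header: bytes, content: bytes) -> bytes:
    """Reproduces the typo: the stray '+' is parsed as unary plus."""
    out = header
    out += +content  # raises TypeError: bad operand type for unary +
    return out


def fixed_to_bytes(header: bytes, content: bytes) -> bytes:
    """Same logic with the stray '+' removed."""
    out = header
    out += content
    return out


def add_unknown_chunk(root: ET.Element, attrs: dict) -> ET.Element:
    """Create and attach a child element in one step.

    Element.append() takes a single Element argument, so passing a tag
    string plus a dict fails. SubElement builds the child and attaches it.
    """
    return ET.SubElement(root, "unknownChunk", attrs)
```

Note that ElementTree attribute values need to be strings if the tree is going to be serialized, so any integer lengths or offsets would have to be converted first.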
None of these are surprising — this is actively being built out — but it was useful to have them all listed in one place. I added everything to the Active Issues Kanban.
Documentation review
I also pointed Claude at the docs and asked it to flag anything out of date. A few things came up:
- The README was listing splitDemoFiles.py as the demo splitting script, but the actual file has always been DemoTools/demoSplitter.py
- The Demo instructional had an offset filename inconsistency (offsets.json vs demoOffsets.json) in the STAGE.DIR section
- The Code Documentation for demoClasses was mostly a stub: the class attribute table was missing modified and segments
- The Main App Documentation file had literally one unfinished sentence
- Some chunk type descriptions in the DEMO-VOX-ZMOVIE technical doc didn’t match how the code was actually treating them (the ImHex struct shows u24 for the length field, but the code reads it as u16 plus a padding byte)
- The Tools kanban still had “Radio.dat Documentation” as a TODO, even though the complete guide has been in Technical Docs for a while now
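On the u24 vs u16 mismatch: the two readings give the same length whenever the third byte is zero, which is presumably why the code has worked so far. Here is a rough sketch of the comparison. The little-endian byte order is an assumption made purely for illustration (it isn’t stated above), and both function names are hypothetical.

```python
import struct
from typing import Tuple


def read_length_u24(raw: bytes) -> int:
    """Read the first three bytes as one 24-bit value (assumed little-endian)."""
    return int.from_bytes(raw[:3], "little")


def read_length_u16_pad(raw: bytes) -> Tuple[int, int]:
    """Read a 16-bit length followed by one separate padding byte."""
    length, pad = struct.unpack_from("<HB", raw)
    return length, pad
```

With the bytes 34 12 00, both readings give 0x1234. With 34 12 01, the u24 reading becomes 0x011234 while the u16 reading stays 0x1234 and the extra byte is ignored, so the docs and the code only disagree if that third byte is ever non-zero.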
All of those got logged in the kanban as well. I also had Claude generate a conventions document based on a survey of the codebase — basically a list of how I tend to write Python: how I handle imports, type hints, file I/O, naming, and so on. That’s now saved in Technical Docs as conventions.md. Mostly useful as context for future AI assistance, but also just good to have written down.
Actual fixes
After the audit we branched off main to start working through the issues. We didn’t get to the demoClasses bugs yet — those need more thought. But we did clean up two categories of problems across the broader codebase.
Bare except: clauses
There were five of these in the active scripts (ignoring Old vers/, which is deprecated):
- RadioDatTools.py: a dict lookup that should have been except KeyError:
- mgs-data-splitter.py: two subprocess calls catching everything, and the error message was print(Exception) (the class, not the instance), so it would never actually print the error
- translation/radioDict.py: two places where dict.get() returns None and then None gets concatenated to a string, causing a TypeError. Rather than catching TypeError, we just check for None before concatenating.
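The three patterns look roughly like this. This is a sketch of the general shape, not the code from the scripts: the function names, the "UNKNOWN" fallback, and the choice of exceptions caught around the subprocess call are all illustrative assumptions.

```python
import subprocess
from typing import Optional


def lookup_name(table: dict, key: str) -> str:
    """Catch only the error a dict lookup can actually raise."""
    try:
        return table[key]
    except KeyError:
        return "UNKNOWN"  # hypothetical fallback value


def run_tool(cmd: list) -> bool:
    """Run an external command and report the real error on failure."""
    try:
        subprocess.run(cmd, check=True)
        return True
    except (subprocess.CalledProcessError, OSError) as e:
        # print(Exception) would only show "<class 'Exception'>";
        # printing the bound instance shows what actually went wrong.
        print(e)
        return False


def label_for(table: dict, key: str, prefix: str = "") -> Optional[str]:
    """Check for None up front instead of catching the TypeError later."""
    value = table.get(key)
    if value is None:
        return None
    return prefix + value
```

The None check in the last function is the same approach described for radioDict.py: the missing key is handled where it occurs, so the concatenation never sees a None.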
File handle management
This one was more widespread. The scripts had a mix of open() with explicit .close() calls and with open() context managers, which is inconsistent. More importantly, a couple of spots were missing the .close() entirely:
- demoSplitter.py: demoFile was opened to read the source data and never closed. The offsetFile written during splitting also had no close call.
- RadioDatTools.py: radioFile read the binary data and was never closed.
Everything got standardized to with open() across RadioDatTools.py, RadioDatRecompiler.py, xmlModifierTools.py, demoSplitter.py, demoRejoiner.py, and demoTextInjector.py. The one exception is the log file handle in RadioDatTools.py, which is intentionally kept open for the life of the script.
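For reference, the standardized pattern is just the context-manager form for both reads and writes. A minimal sketch with hypothetical helper names (the real scripts inline this rather than wrapping it in functions, and writing the offsets as JSON is assumed from the offsets filenames mentioned earlier):

```python
import json


def read_binary(path) -> bytes:
    """Read a whole file; the handle is closed even if read() raises."""
    with open(path, "rb") as f:
        return f.read()


def write_offsets(path, offsets: list) -> None:
    """Write a list of offsets as JSON; the file is flushed and closed on exit."""
    with open(path, "w") as f:
        json.dump(offsets, f)
```

The main practical benefit on the write side is that the data is guaranteed to be flushed to disk when the block exits, which an open() without a matching .close() doesn’t promise.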
What’s next
There are still a few items from the conventions review we didn’t get to — the global state situation and the hardcoded paths mixed with argparse. And of course, the actual demoClasses bugs that need real fixes. Those are all sitting in the kanban now.
The fixes we landed today are on a claude-fixes branch on GitHub, not yet merged to main. Before that happens I want to do some actual testing to make sure nothing regressed — particularly the demo pipeline, since a few of those files touched the injector and splitter logic.
Didju Riket
I’m still hesitant about letting Claude go wild across the codebase. I still need to know everything about how it works as I build it going forward. Even so, it’s encouraging to have what is essentially a coding underling that can present its work for me to approve.
More soon. Actually, most of this blog post was written by Claude as a summary of what I’ve done this evening. Personally I’ve been super excited using Claude, but my enthusiasm for working on the undub itself has been waning; not for lack of interest, but mostly because I’m very tired from being a parent and being woken up at 5 every day, and also because my codebase is very large. There are a lot of branches to this project. I always want to pick one branch and stick with it, but often I get a little stuck, and swapping to another branch helps me rejuvenate and get back on track. That always comes with some slippage in my knowledge of what I was working on, though.
I do promise, though, that work is moving forward. When I’m tired I tend to gravitate toward rote translations and making progress on the actual translation. When I can focus on code fixes, that’s where I go.
And I promise this section was written by me :) If you guys have issues with me using Claude to summarize my work, let me know, but it would actually allow me to keep better track of what I do.
Anywho, on to a bit more code review for the evening and then turning in. Talk soon.
J-Rush