Beyond Major Floods: Deep Learning for Detecting Shallow Water Inundation in Agricultural Areas
Three-class Sentinel-1 SAR segmentation for shallow agricultural floods. ResNet-UNet matches DeepLabv3+ at lower cost.
anti big data big data guy · do models understand the world, or only imitate it
A sushi chef learning his cuts, except the fish is a model. I take models apart by hand to see what is really happening inside.
I am currently a Research Collaborator at SDU's Data and Intelligence Lab and at the HKUST DataVISards group, TA for AI at SDU (where I built an interactive AI course), and Founder of SaturoLabs. I have been admitted to the MPhil in Machine Learning and Machine Intelligence at the University of Cambridge, which I begin this October. My research has been accepted at venues including ICML workshops and IEEE VIS, one as an oral, and I release code and model weights wherever I can.
What I work on
I work on the foundations of how models learn. The thing I keep circling is how the learning process shapes the representations a model ends up with, and whether those representations capture the world's real causal structure or only imitate it. That puts me at the intersection of representation learning, causal representation learning, and the theory of learning, with counterfactual reasoning as my favourite test of whether a model understands or only pattern-matches. I like to work phenomenon-first, building small settings where I know the causal ground truth so I can watch what learning actually recovers. These questions sit under generalization, safety, and any real science of deep learning, because a model that only imitates causal structure breaks exactly where we most need it to hold.
Q1. Does the learning process actually install representations capable of genuine counterfactual reasoning, or only ones that imitate it?
Q2. And if it is only imitation, what would a learning process have to do differently to install the real thing?
I am most drawn to groups working on the theory and mechanisms of representation learning, and I am always glad to talk through these questions.
A few papers that best represent my current research direction.
Self-reports are evidence about behavior in a prompt, not proof of a self-model. Across three open instruction models, wrong-source demonstrations pull affect-state reports toward the source family unless mechanism binding holds.
Moral framing is linearly decodable in pretrained models, but only causally routed to judgment after instruction tuning.
Position: mech-interp should import the graph vocabulary of network neuroscience (modularity, rich clubs, motifs) under explicit contracts.
A browser-extension design probe (N=22) reveals three human-AI workflow archetypes for fact-checking data-driven journalism.
Three modalities, one through line.
Peer reviewed and accepted work first, then preprints.
Three-class Sentinel-1 SAR segmentation for shallow agricultural floods. ResNet-UNet matches DeepLabv3+ at lower cost.
Five mechanisms by which AI coding agents make implicit architectural decisions. Vibe architecting, formalised.
A 188-question Bloom's-taxonomy benchmark for LLM knowledge of cloud-native software architecture. 22 model configs evaluated.
A reference architecture for LLM-agent dataset search: BM25 + dense fused via RRF inside a Plan-Retrieve-Evaluate loop.
Moral framing is linearly decodable in pretrained models, but only causally routed to judgment after instruction tuning.
Self-reports are evidence about behavior in a prompt, not proof of a self-model. Across three open instruction models, wrong-source demonstrations pull affect-state reports toward the source family unless mechanism binding holds.
Position: mech-interp should import the graph vocabulary of network neuroscience (modularity, rich clubs, motifs) under explicit contracts.
A browser-extension design probe (N=22) reveals three human-AI workflow archetypes for fact-checking data-driven journalism.
A four-gate evidential standard for safe-fine-tuning defense claims. SafeLoRA fails the full-card pass under matched-action audit.
Position paper: mechanistic interpretability alone should not gate deployment. We propose a calibrated verification regime with a Verification Coverage metric.
Ten segmentation models benchmarked on tiny cardiovascular histology datasets. Rankings are unstable; foundation models generalise best.
Lightweight ML on hyperspectral data matches deep models for fruit ripeness. Three visible wavelengths recover 94% accuracy.
My bachelor thesis, with code and model weights released.
A conformal verifier between autonomous bidders and the Nordic balancing market
With Tim Lukas Adam. Heimdall sits between LLM bidding agents and the grid, and it only lets a bid through if it stays above the operator's loss limit with the coverage they asked for. It holds that guarantee straight through the March 2025 rule change, and every accepted and rejected bid stays auditable.
News, talks, and achievements, most recent first.
Where I've studied and worked, in chronological order.