Apparently one thing LLMs excel at is deanonymization at scale. The original promise of pseudonymity online was social and normative, over and above any question of technical depth: decent people don’t try to unmask you, because why. What strikes me today is how what used to be unacceptably antisocial behavior online is now both automated and unremarkable.
Over the last few weeks, I asked a couple of chatbots what could be known about me from this pseudonymous site, where I am more intentional about what I choose to reveal and conceal. They pulled the obvious but also drew conclusions from a few geographic details I’d mentioned in passing that were both revealing and correct. I also noticed that they only drew from the top two pages of information - anything beyond page two of posts wasn’t part of the compute. Archives are for humans?
People assume that there is some computer magic on the backend where the LLMs connect all your account logins behind the scenes, but no, in fact they do all this through inference, by linking your digital trail, your friends, your breadcrumbs of likes and hearts and follows, and obvs your posts, into a picture of who you are, practically and demographically.
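The inference here is simple to demonstrate. A toy sketch of the idea, with entirely made-up names and traits: each public breadcrumb independently narrows the pool of people you could be, no account linking required. (This is my own illustration, not how any real system is built.)

```python
# Hypothetical candidate pool: names and traits are invented for illustration.
candidates = {
    "alex":  {"coastal city", "bike commuter", "gardener"},
    "blair": {"coastal city", "ferry commuter", "bike commuter"},
    "casey": {"inland city", "ferry commuter"},
}

# Breadcrumbs gleaned from public posts, likes, and follows.
breadcrumbs = {"coastal city", "ferry commuter"}

# Keep only candidates consistent with every breadcrumb:
# each new detail shrinks the set further.
matches = [name for name, traits in candidates.items()
           if breadcrumbs <= traits]

print(matches)  # only "blair" fits every breadcrumb
```

Two unremarkable facts rule out two of three candidates; a handful more would usually leave one.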
The Guardian has a long-running series where readers answer one another’s questions, which gives a pretty good point-in-time survey of how people of a certain demographic are feeling on a given subject. This week’s question is on AI futures: “what would happen to the world if computer said yes?”
It’s an old idea, and one I’ve been drawing from while I tinker with Claude, which is purportedly the best in the game. The “god trick” is baked right into the AI interface: one input, one output, an authoritative-seeming answer offered without named perspectives behind it, trained on text produced overwhelmingly by a narrow demographic that has historically had access to both literacy and publishing, and by programmers and new media drawing from the same well. Smushed together, it gives the impression of consensus where there are in fact many, many loose ends.
I increasingly find it annoying that even “good” AI outputs seem fixated on words like “key,” “core,” “exist,” “actually,” “never,” and possibly the worst sentence structure of all time, “it’s not X, it’s Y.” I’ve begun to recognize how LLMs work like autocorrect for phrases and ideas, drawing from ranked search sources first before fanning out to more obscure ones, trying to determine and assert what’s important to me, a user known by demographics and data. It feels like a big linguistics machine, which is pretty cool in some regards, but also aggressively semantic. The math doesn’t always connect me to what I want to find, because I am situated in my individual context in ways LLMs cannot understand: with my memory, in my body, with my unique experiences, which shape and translate meaning for me as I interact with the world (and the web).
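The “autocorrect for phrases” intuition can be shown with a toy: always continue with the statistically most common next word from a ranked corpus. (A deliberate oversimplification - real LLMs use learned neural weights, not raw counts - but the flattening effect is the same: the “key”s and “core”s win because they are most frequent.)

```python
from collections import Counter

# Tiny invented corpus; note how formulaic phrasing dominates it.
corpus = (
    "the key point is clear . the key point is simple . "
    "the core idea is clear ."
).split()

# Build bigram continuation counts: word -> Counter of next words.
continuations = {}
for prev, nxt in zip(corpus, corpus[1:]):
    continuations.setdefault(prev, Counter())[nxt] += 1

def most_likely_next(word):
    """Return the highest-ranked continuation, like phrase-level autocorrect."""
    return continuations[word].most_common(1)[0][0]

print(most_likely_next("the"))  # the most common path wins: "key"
```

Whatever appears most often in the training text becomes the default continuation, which is exactly how consensus-sounding prose gets smushed out of many loose ends.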
And so for you, in your body and memory and experience. An LLM can approximate the outputs of an experience without having access to the experience itself. Sometimes this is useful, sometimes it’s reckless.
McKenzie Wark for Verso Books on Donna Haraway, in 2015: “Creating any kind of knowledge and power in and against something as pervasive and effective as the world built by postwar techno-science is a difficult task. It may seem easier simply to vacate the field, to try to turn back the clock, or appeal to something outside of it. But this would be to remain stuck in the stage of romantic refusal. Just as Marx fused the romantic fiction that another world was possible with a resolve to understand from the inside the powers of capital itself, so too Haraway begins what can only be a collaborative project for a new international. One not just of laboring men, but of all the stuttering cyborgs stuck in reified relations not of their making.”
Sharing Digital Animal, a new, curated playlist of songs about all the angst and joys of living with modern technology.
I’ve been spending more time in tech spaces online and getting good information from folks like @manton, creator of Micro.blog. Like this reflection on how to think about AI now that vibe coding works. Something I’m thinking about: there’s an emerging tension between those who see value in being able to immediately prototype an idea and the people downstream who have to manage the outputs/code over time. The ability to prove out every idea sounds like a superpower until you’re the one driving and maintaining the results.
Some additional discussion of LLMs, including open-weight models and “staggered openness,” where orgs “release previous versions of proprietary models once a successor is launched, providing limited insight into the architecture while restricting access to the most current innovations.”
“Open source,” “open weight,” and “proprietary” describe different relationships between model producers and users, governing what you see, modify and control. Comparing them isn’t necessarily about “best,” but about whether you’re optimizing for transparency, compliance or performance. Massive investment in proprietary models means the best-resourced research teams, the largest training runs, and the most sophisticated safety work tend to happen behind closed doors.