Yes, I'm going to wade into the subject of AI
The death of the creative professions, or the birth of new media?
There is no doubt that 2023 is the year of AI.
It’s all over the news, and not just the tech pages. From endless puffery about ChatGPT’s ominously increasing powers and concerns about “fake essays” being submitted in schools and colleges (the debate is moving fast toward a point where it’s already been conceded it’ll be impossible to prevent, and it’s merely a matter of establishing whether students have fact-checked the results, or even actually read them) to the implications for creative industries like screenwriting (the Writers’ Guild recently weighed in on the subject, as part of its dogged attempts to get writers properly paid and protected for their work).
It’s a huge subject, so I’m going to wander around it in a few posts over the next couple of weeks, mainly as an attempt to encourage a conversation — because I don’t have any easy answers. And I’m going to start with art, because it’s the area where I have the least skin in the game.
Of several competing AIs in the art space, Midjourney is the one you’re most likely to have seen work from. It’s got a steady lead on the competitors in terms of the speed and focus with which it’s being developed and refined — and its output is (to my eyes) by far the most “artistic”. In the next piece I’ll highlight some of the cutting-edge images the new version is producing, but for now… welcome to the world of Midjourney, my gateway to the dark side.
It’s kind of a long post, but there’s lots of pictures. Make a coffee and settle in.
As part of my job, I produce “decks” for film and television pitches. This is a relatively new way of trying to sell shows, an import from the corporate and startup world: in effect a slide presentation outlining the key elements of the property while attempting to suggest the tone and visual texture. Part of the process involves spending ages on the web trying to track down imagery: not only does this take a long time, but usually you wind up trying to evoke the look piecemeal.
So yesterday I thought, huh: why not try using an AI?
To use Midjourney you have to pay a subscription to access the service and set it up on a Discord server: this is not rocket science, but neither is it a total walk in the park. I went the further step of accessing it via a private server on Discord — not for reasons of privacy, but because otherwise your prompts get lost in the endless stream of other people using it. Once that’s done, you dig in.
This too requires some prep. The AI will respond to a wide variety of text input, from brutally short sentences to very long paragraphs tightly specifying every aspect of subject matter and style. You have to find out what works for you and what doesn’t (and this is where spending a little time on the public server is useful, so you can see how other people are doing it, and the results they’re getting).
So, sure, that’s a “skill” — though hardly on the level of decades spent acquiring the fine motor control involved in real drawing or painting.
It’s immediately clear that Midjourney has an in-built aesthetic, an undercurrent discernible regardless of tweaks you make, a result of the reference material it’s been fed. I’d throw up a bunch of other people’s images here to illustrate, but Substack has a limit on email size. Just Google “Midjourney images” — not now, obviously, but when I’ve finished banging on. It’s a style I happen to like, and so already — despite the artificial intelligence underlying all this — personal and at least semi-artistic choices are in play. Why do we return to certain artists? Because we like their style.
The “better” Midjourney gets, the wider its source catchment area will spread, and the less distinctively Midjourney its images will become (much as the initially cool Instagram filters were replaced with blander “improvements”).
I may not like “late period” Midjourney as much as I like the current version: another inevitable comparison with real art.
It works like this:
You type in your prompt. After a few minutes you’re presented with a grid of four images. You have the option to roll the dice again, resulting in a wholly new grid; or to produce four new variations based on one of the initial images; or to upscale one that’s already there. (Each of these actions counts towards the limited number of interactions you get with the AI under your subscription tier.)
I wanted a picture of a young woman, alone, in a city at night. I added a couple of qualifying adjectives relevant to the project at hand. My very first set of four images resulted, after an upscale, in this:
Which is… pretty damned good. To be clear, this is not a stock image but a unique picture that has never been seen before. Further playing with variations got me to:
And that’s more than good enough, and in fact exactly what I needed for the deck.
A machine did that. In about forty minutes, start to finish, and that includes me going off and making a cup of tea. It cost next to nothing. Further playing resulted in this:
Which also works for me, and is now in the deck. And eventually I wound up producing this, too out-there for my present needs but which I also kind of like:
For someone who can’t paint, but wished he could, there’s an undeniable sense of wonder in all this. With no physical skill whatsoever, you can generate images that look cool. For me it’s especially exciting when it comes to generating environments, because I’ve long been fascinated with places that look real, but aren’t. (I drove my son lightly insane with this when he was playing the iteration of Assassin’s Creed that’s set in a very well-done model of Revolution-era Paris, because I wanted him to explore every single little room and back alley we came across.)
Okay, there’s a tiny bit of “skill” involved. Partly in navigating the not-too-hard tech of it, and then acquiring experience in prompting. Also something more genuinely creative — the ability to envisage a prompt in the first place, and then make aesthetic judgement calls on images you’re presented with: deciding which to work further on, which variations you like, and which you don’t. You need to have an inbuilt vision before you start, and that requires a little artistic ability, or at least a pre-existing visual take. You might then take the Midjourney results across into Photoshop to make more deep-level adjustments. There are cropping decisions to be made, too. You need an eye.
It’s a process. Artistic work is, to some degree at least, being done.
But the thing is…
… I’m not an artist. I’m not seeing my entire livelihood suddenly at risk. I’m not watching aghast as a commission I might once have been paid hundreds of dollars for — money I need for food, rent, my kid’s education — is suddenly fulfilled by a computer operated by a non-artist. And to add insult to injury, everybody else in the world is thinking: “Whoa — cool!” while I suffer complete career apocalypse.
Luckily, the process is far from perfect at this point. Sometimes the initial prompt will result in four choices which all look not-great — either very obviously computer-generated, or simply not right, and at times surprisingly crap, like the AI was drunk — and slogging through variations only makes it worse and worse. It’s not like working with a real artist, where you can say “Love it overall, you’re a genius, but can you take out that dog on the left and move the light source so we get a highlight on the woman’s hair, and lose the handbag, without changing anything else?”
There are other issues. For a different project, I typed in:
man standing alone on a street in a small Welsh village in the twilight, photo
And got the following —
Not bad. Nicely atmospheric. I like #3. But if you look closely at the architecture, especially the doors, you’ll see there’s something a little wonky about it. I happen to love that wonkiness, because it feels off-kilter and uncanny, dreamlike, but it’s a function of the AI understanding what buildings look like, but not how they work. There’s also the small matter that two of the figures are missing their heads.
I later asked Midjourney for a post-apocalyptic vision of Santa Cruz, California, and it generated a landscape that looked nothing like my town, with buildings that went beyond the merely wonky and into looking like they’d been designed by an architect on drugs (which isn’t wholly inappropriate for Santa Cruz, of course).
This will change. Concept artists have a breathing space while that happens. But… it probably won’t be long. And then what?
… the deck imagery stuff is fun and useful, and doesn’t feel like it’s destroying anybody’s job — I’d never have the budget to commission artists or photographers for decks, which anyway won’t be seen beyond internal audiences. Aside from that I’ll only play with Midjourney for my own amusement. But the issues with AI aren’t simply those of encroachment on livelihoods — serious though that is.
There’s also the knotty and increasingly relevant issue of “truth”.
Because check this out. Here’s a real painting of Paris, by Russian artist Evgeny Lushpin. (His work veers hard into the overtly chocolate-boxy and certainly won’t be to everybody’s taste, but I enjoy parts of it. You may like superhero movies, while I think they suck. Let’s agree to let everybody have their own pleasures).
I decided to try an experiment. I typed the following simple prompt into Midjourney:
beautiful village in snowy mountains in the style of Evgeny Lushpin
A few minutes later, I had my grid:
Which is… astonishing. I mean, what the actual fuck? By typing a single sentence into my laptop, I suddenly wound up with four entirely new paintings in the style of a painter I like. They’re mine. I “made” them. That’s… weird.
I went through a couple of variants and regenerations, and wound up with this:
A “painting” that — unless you had access to a catalogue raisonné of Lushpin’s work, which doubtless doesn’t exist — you would be hard-pressed to tell was not the genuine article. It took me about fifteen minutes.
So now there’s a Lushpin in the world… that he didn’t paint.
And suddenly it starts to feel far more personal, because if I came across a novel that someone had spat out of an AI “in the style of Michael Marshall Smith” I would be pissed. Not even because of the potential loss of earnings: but because being Michael Marshall Smith is my preserve. It may not be fun all the time, but I’ve worked at it. I’ve earned it. Being Michael Marshall Smith is my gig, and nobody else’s.
I decided to push the envelope, and typed in:
a panda walks through a beautiful mountains village in the snow, in the style of Evgeny Lushpin
And before I knew it… (well, after a brief tweak or two)…
Mr Lushpin, if by some bizarre chance you see this, I’m sorry. I’m trying to make a point, on your behalf. Because now we’re taking an artist’s style and imputing work to him that he would never have done. Fine if it stays on my computer. Otherwise, not okay.
So at the moment I’m caught between deep concern at the implications for creatives, and an undeniable sense of wonder at what the technology can do — and the fun I can have making up shit in a creative arena I don’t possess the skills for. And I’m a rank amateur at it, too. In the next post on the subject I’ll show you what the cool kids are doing… at which point it’ll start to become more obvious where the dangers lie.
Because beyond art, there’s the next level, the kind of thing shown below — images that (however much we might personally wish they reflect reality) should not exist: because they’re lies. Once our minds have seen something it can’t ever be wholly unseen, and so now a tiny piece of our consciousness lives in a world that isn’t real.
That’s dangerous. And that’s also AI.