9 October, 2023

Midjourney

Chronicling a digital artist's journey through various creative tools and technologies, from Photoshop and 3DS Max to advanced AI models like Wombo, Stable Diffusion, and Midjourney, highlighting the evolution of digital art.

Back in 2012-13, growing desirous of creating cool digital art featuring planets, spaceships etc., I picked up Photoshop and 3DS Max. On DeviantArt I still have a gallery of things I made, rendering 3D models of spaceships in 3DS Max and using Photoshop to create planets, space backgrounds and the like. I’m neither an artist nor a designer, but this began a journey to develop whatever artistry/design chops I might possess. The journey even took me to a delightful little tool called Rainmeter, where using a combination of some basic graphic design and elementary code, one could customize one’s Windows desktop screens any which way. One Rainmeter theme I made was widely plagiarized, saw more than a million downloads, and was even featured in some Japanese graphic design magazine!

The above preamble was to give a sense of my association with digital art/design prior to 2020, when I discovered the lovely app called Wombo. In an era just before Midjourney, Stable Diffusion and the like, Wombo was a relatively “basic” text-to-image generator that would now seem primitive compared to the latest generative AI models. But what fun it was to play with, and I ended up generating hundreds of images. One image in particular stands out, for it is utterly redolent of Mā Kāli:

kali

Playing with Wombo as much as I did prepped me for the soon-to-come generative AI tsunami, but I boarded the latter train with some lag. My earliest forays were with Stable Diffusion in 2022, which were somewhat underwhelming. And generating Indic or Dharmic imagery was out of the question. Faces would be distorted, gods would be generated appearing evil or demonic, and the enterprise was simply disturbing.

Then came Midjourney, and at least from the stuff I saw being generated by others, it appeared to be well ahead of the generative AI that preceded it. It wasn’t long before I began dabbling too, and now in 2023 with Midjourney v5, I’ve generated more than a thousand images of all kinds: Indic, flat graphics, Dharmic, science-fiction, photo-realistic, abstract and more. Frankly, I love it. Where I might have once idled on the phone on apps like Twitter or 9gag, I’m now more frequently serial-prompting Midjourney. Here’s a rundown of basics, tips and tricks I think I’ve picked up.

Getting Started

A bit of a hassle, for Midjourney has no standalone app or website. Instead, one must join Discord, and then join Midjourney’s Discord server. Those with paid accounts (such as myself) can “invite” Midjourney to their own private Discord server. I’m not tech/app lethargic, but getting started this way did feel tedious, especially since I’d never used Discord before. But at the end of the day it’s an app: you log in, you go to the right channel, and you begin prompting. No rocket science.

Warming Up

As a beginner, it’s good to hang out awhile in the newbie channels on Midjourney’s Discord server. Here you can see countless other initiates sending in prompts and the resulting images. It gives one a good starting point, as one gets to see what words people are prompting with, which styles produce which results, and so on.

Structure of a Prompt

There are any number of tutorials, hacks and guides on Midjourney prompting. I’ve seen/read none of them. With some trial and error, the broad structure of a prompt that I’ve settled into is:

<description of scene>, <scene qualifiers>, <scene style>, <settings, if any>

Let’s break the above down.

Description of Scene

As the name says, this is your basic wording of what you want to create. Examples:

  1. a himalayan mountain landscape
  2. scene of intense battle between bronze age Hindu tribes
  3. universe within a universe within a universe within a universe

Things like ‘a’ or ‘the’ are not needed in these. So the first description above could just as well be “himalayan mountain landscape.” The second description could work even without the initial “scene of.”

Scene Qualifiers

By these I mean specific properties, aspects, nuances that I might want to add to the image. Corresponding to the scene descriptions above, these are examples of qualifiers respectively:

  1. bright day and thick snow
  2. ancient Vedic India
  3. dark cosmos

Scene Style

This component often overlaps with the previous, such that in many of my prompts the scene qualifier and scene style are one and the same. But one thing to keep in mind is that by scene style I mean any variety of prompts that define the artistic style of the image. Some examples of styles are:

  • retro comic illustration
  • epic cinematic action shot
  • concept poster for james cameron movie
  • art in the style of Kawase Hasui
  • trending on artstation

With these 3 sections above, here’s what a complete prompt can look like:

scene of intense battle between bronze age Hindu tribes, vedic India, retro comic illustration

Generated variants for above prompt:

sample gen
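The three-part structure above can be sketched as a small helper that joins whichever components are present with commas. This is only an illustration of how the prompt text is assembled before it is typed into Discord; the function name compose_prompt is hypothetical and not part of any Midjourney tooling.

```python
def compose_prompt(description: str, qualifiers: str = "", style: str = "") -> str:
    """Join the non-empty prompt components with commas.

    Order follows the structure described above:
    description of scene, scene qualifiers, scene style.
    (Hypothetical helper, for illustration only.)
    """
    parts = [description, qualifiers, style]
    # Skip components that are empty or only whitespace, since
    # qualifiers and style are often merged or left out.
    return ", ".join(p.strip() for p in parts if p and p.strip())


print(compose_prompt(
    "scene of intense battle between bronze age Hindu tribes",
    "vedic India",
    "retro comic illustration",
))
```

Since the qualifier and the style often overlap, either can be left empty and the helper simply omits it.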

Then there’s the optional settings component:

Settings

This is optional, but one can add a few suffixes to regulate the settings, for example the aspect ratio and the number of iterations. In Midjourney prompts, settings always go at the end and have a particular syntax. Each setting has a keyword that must be preceded by ‘--’ (two hyphens) and followed by the value to configure. Examples:

  1. --ar 3:2, --ar 16:9: these set the aspect ratio of the generated image to 3:2 and 16:9 respectively
  2. --repeat 3: each prompt generates 4 variations, and with the --repeat setting we can increase the count in multiples of 4. So --repeat 3 will generate 12 variations.

There are a few more settings one can configure, like --stylize and --seed. Settings can also be preset by typing /settings and pressing enter.
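As a rough sketch of the suffix syntax, the snippet below appends settings to a prompt as two hyphens, the keyword, then the value, and works out how many images a given repeat count yields. The helper names add_settings and total_images are hypothetical, and the figure of 4 images per run is taken from the description above rather than from any official source.

```python
IMAGES_PER_RUN = 4  # assumption from the text: each prompt generates 4 variations


def add_settings(prompt: str, settings: dict) -> str:
    """Append each setting to the end of the prompt as '--keyword value'."""
    suffix = " ".join(f"--{key} {value}" for key, value in settings.items())
    return f"{prompt} {suffix}" if suffix else prompt


def total_images(repeat: int = 1) -> int:
    """Number of images produced when a prompt is run `repeat` times."""
    return IMAGES_PER_RUN * repeat


print(add_settings("himalayan mountain landscape", {"ar": "16:9", "repeat": 3}))
print(total_images(3))
```

The settings are kept in a dictionary so that they always land after the descriptive text, which is where Midjourney expects them.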

Full Examples

Here are some of my recent images and their complete prompts:

  1. Lord Vishnu in vishwaroop form, intricate tribal flat art

    sampler1
  2. Intricate sculpture of alien planet made completely of lush grass, octane render

    sampler2
  3. Ancient Hindu granite fort, on top of mountain, retro comic illustration, trending on artstation

    sampler3

Tips

General things I’ve picked up while playing:

  • more often than not, Midjourney understands “indian” as the native American variety. So if I want to generate an Indian man, I instead use “Hindu man.”
  • shorter prompts work better, and the results are closer to intent. Longer prompts have usually been hit or miss in my experience.
  • for some reason, Midjourney understands “Hindu mandir” better than it does “Hindu temple” or just “temple.” With the latter, it often generates buildings with mosque-like domes.
  • if creating historical-fiction imagery, it’s important to specify the period. For example, a lot of the historical fiction I generate is set in pre-bronze age eras, so one can’t have soldiers in metal armour and helmets. Adding “neolithic” or “ancient primitive” in such a case shapes the image better.
  • on its own, Midjourney generates images which tend to have a bit of a synthetic, digital feel. A simple ‘trending on artstation’ as scene style in the prompt fixes this and creates images which look more like painted art. This is only one way to manage the issue.
  • while Midjourney does not understand Hindi/Sanskrit, prompts in Devanagari script often generate some amazing and interesting images, even if they have no relation to the meaning of the prompts. It’s fun to play with.
  • as a rule, I stay away from directly depicting devas and gods. There are several nuances to the depiction of deities, down to what specific objects they might hold in which hand. Since I am not an expert on that, playing with deity imagery is borderline-disrespectful. Not that I’ve never tried this, but the caution is ever-present in my mind. Try creating images of Shiva and you’ll find images that do indeed look like him, until you look closer. The cobra around the neck could be missing, the weapon in hand is not a trishul, the third eye is absent, and so on. That’s when you’re forced to ask: is this really a depiction of Shiva, and should I be captioning it as such?
  • Midjourney generates images based on a bank of training images it has been fed. Think this through to realise that on the internet, the word “goddess” is inordinately more often used in relation to a sex goddess than to a Hindu devi. An area of extreme caution, tread lightly if at all.
  • try generating images related to “India” and “travel” or “tourism.” See how often you get something that does not contain the Taj Mahal!
