How-To: Midjourney

A Comprehensive Guide

Published: 07.03.2025
Author: [at] Editorial Team
Category: Basics

The Midjourney research laboratory sees itself not just as the maker of a technical tool but as a pioneer for an expanded human imagination. With over 20 million users as of November 2024, that approach has clearly met with broad approval. After all, generating a first image with AI can be a pretty magical experience. But as with any new technology, it takes time and practice to use its full potential.

This guide looks at how Midjourney works, how it compares to other image generators and, of course, what good prompts look like to create mesmerizing images.

What is Midjourney?

Midjourney is an AI-based image generator that automatically creates digital images from text commands (prompts) entered by the user. Developed by the independent research lab of the same name, the platform uses machine learning algorithms to generate detailed and “creative” visual representations. Midjourney thus makes it possible to generate complex images easily, without requiring in-depth graphical knowledge on the part of the user.

Midjourney was first made available to the public as an open beta in July 2022 and quickly attracted great interest. Since then, several new versions have been released, bringing faster, higher-resolution image generation, improved customization options and more intuitive user interfaces. The latest version, 6.1, was released in July 2024. It offers improved image quality, faster generation (around 25% faster than version 6) and refined details, especially for complex textures and fine features such as eyes and faces. In addition, 6.1 includes new upscale options (“Subtle” and “Creative”) that produce higher-resolution images with improved detail.

How does Midjourney work?

The exact way Midjourney works remains a well-kept secret, but like other image generators, the technology is based on two key machine learning approaches: large language models (LLMs) and diffusion models (DMs).

The large language model enables the AI to capture the meaning of the prompt (a text-based description) and convert it into a vector that serves as a numerical version of the description. This vector then controls the next step, the diffusion. A diffusion model is trained by adding noise to the images in its training dataset and learning to remove that noise step by step until the original image is restored.

Thus, by removing noise from a randomly generated image, Midjourney can generate new images that match the description entered by the user. It usually only takes a minute from entering the prompt to the finished image – a fast journey from idea to visual result.
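
Since Midjourney's internals are not public, the following is only a toy sketch of the general add-noise-then-remove-noise idea, not Midjourney's actual method. The four-value list standing in for an image, the function names and the `noise_level` parameter are all assumptions made for illustration. The sketch also hands the true noise back to the denoising step, whereas a trained model would have to estimate it from the noisy image and the prompt vector.

```python
import math
import random

def add_noise(pixels, noise_level, rng):
    """Forward step: blend the clean signal with Gaussian noise.

    noise_level = 0 keeps the image intact; values close to 1 leave mostly noise.
    """
    keep = math.sqrt(1.0 - noise_level)
    spread = math.sqrt(noise_level)
    noise = [rng.gauss(0.0, 1.0) for _ in pixels]
    noisy = [keep * p + spread * n for p, n in zip(pixels, noise)]
    return noisy, noise

def remove_noise(noisy, predicted_noise, noise_level):
    """Reverse step: subtract the (predicted) noise and rescale."""
    keep = math.sqrt(1.0 - noise_level)
    spread = math.sqrt(noise_level)
    return [(x - spread * n) / keep for x, n in zip(noisy, predicted_noise)]

rng = random.Random(0)
image = [0.1, 0.5, 0.9, 0.3]  # stand-in for real pixel values
noisy, noise = add_noise(image, 0.8, rng)

# A trained model would estimate `noise` from `noisy` and the prompt vector.
# Here the true noise is reused, so the round trip restores the image.
restored = remove_noise(noisy, noise, 0.8)
```

In a real generator, the starting point is pure noise rather than a noised photo, and the removal happens over many small steps instead of one.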

Midjourney vs DALL-E vs Stable Diffusion

Midjourney faces strong competition in the image generator market. Each tool has its advantages and disadvantages, and they sometimes differ greatly from one another. For a better overview, here is a comparison of the three largest current providers.

Characteristic	Midjourney (V6.1)	DALL-E (3)	Stable Diffusion (3)
Quality & realism	High image quality, realistic representations, good level of detail; strengths in photorealism and atmospheric lighting	Highly stylized and detailed images; particularly strong in graphics and illustrations	Realistic scenes, high quality with complex compositions, but sometimes less detail
Prompt accuracy	High fidelity, especially with simple to moderately complex prompts	Good accuracy, especially with simple to complex texts	Strong fidelity, especially with relational and complex prompts
Adjustment options	Many options for style, variation and reference images	Inpainting and interactive editing possible	Supports custom models and adjustments for specific styles
Platform access	Access required via Discord	Access via ChatGPT web platform and via Bing	Open source and can be installed locally, flexibly accessible via API
Pricing	Subscription required; no free version	Integrated into the paid version of ChatGPT or free via Bing	Free in the basic version; higher prices for customized models
Field of application	High precision in creative, commercial and artistic projects	Particularly suitable for stylized and detailed images	Versatile; especially for users who need custom and versatile images

Midjourney vs. DALL-E vs. Stable Diffusion - the conclusion

Midjourney is particularly distinguished by its artistic focus, which makes it possible to create not only photorealistic but also stylized and sophisticated images. The customization options and image fidelity are very good, but access and cost are limiting for some. DALL-E scores with its ease of use and is ideal for beginners, as it is integrated into ChatGPT. Its images are easy to edit, but it offers less artistic freedom and texture than Midjourney. Stable Diffusion is particularly attractive for advanced users who like to customize local models. Its open source availability and flexibility make it ideal for specific style and model customizations, but technical knowledge is required for optimal use.

How can I use Midjourney?

A Discord account is required to use Midjourney, as all interaction takes place via the Discord platform. Any device that supports Discord can be used for this. The setup is quick:

Step 1: Discord Account

Create a Discord account (if you don't already have one).

Step 2: Midjourney Server

Use the link https://discord.gg/midjourney and join the official Midjourney Discord server.

Step 3: Complete your subscription

There is currently no free trial, so a subscription must be taken out right away. To test Midjourney, the Basic plan for $10 per month is the best choice. To subscribe, type the command /subscribe on the Midjourney Discord server. A personal link to the subscription page is then generated.

Step 4: Newbie Room

Once the subscription is active, prompts can be entered in the special channels for newcomers (the newbie rooms), and the AI then generates the corresponding images.

Costs

Midjourney offers several paid subscriptions. There is no free version, and the free trial is currently suspended because the number of users is too high. You can choose from the following subscriptions:

Basic ($10/month): Offers limited GPU usage time, ideal for beginners or casual users.
Standard ($30/month): Unlimited “Relax GPU” time, slightly more GPU resources, but no stealth mode. Good for frequent users.
Pro ($60/month): Unlimited “Relax GPU” time, expanded GPU resources, stealth mode for private projects and multiple parallel jobs. Suitable for power users.
Mega ($120/month): Maximum GPU resources and performance with all features. Ideal for professionals.
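
To make the choice between plans concrete, here is a small sketch that picks the cheapest plan for a given set of needs. The prices and features are taken from the list above. That Basic includes neither unlimited Relax GPU time nor stealth mode is an assumption read from that list, and `cheapest_plan` is a hypothetical helper, not part of any Midjourney tool.

```python
PLANS = [
    # (name, USD per month, unlimited Relax GPU time, stealth mode)
    ("Basic", 10, False, False),
    ("Standard", 30, True, False),
    ("Pro", 60, True, True),
    ("Mega", 120, True, True),
]

def cheapest_plan(need_relax=False, need_stealth=False):
    """Return (name, monthly price) of the cheapest plan that meets the needs."""
    for name, price, relax, stealth in sorted(PLANS, key=lambda plan: plan[1]):
        if (relax or not need_relax) and (stealth or not need_stealth):
            return name, price
    raise LookupError("no plan matches the requested features")

print(cheapest_plan())                   # no special needs
print(cheapest_plan(need_relax=True))    # unlimited Relax GPU time
print(cheapest_plan(need_stealth=True))  # private projects
```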

More detailed information about membership options can be found under Midjourney subscription.

Prompting in Midjourney for beginners

Midjourney uses prompts to create visuals from text descriptions. Creating a good prompt is the key to high-quality images, as it determines the content, style and composition.

1. Syntax and input of prompts:

In Midjourney, prompts are entered via the Discord interface. A prompt starts with the command /imagine, followed by the description.

Example: /imagine a futuristic cityscape at sunset, vibrant colors, ultra-realistic

The AI interprets this input and creates an image based on the description. The quality of the result depends on how clear and detailed the description is. Useful building blocks are:

Topics and objects: Describe what is supposed to be in the picture (e.g. "a forest", "a futuristic building").
Adjectives and details: Add details to refine the image (e.g. "misty forest", "tall futuristic building with glass facade").
Styles and techniques: Instruct the AI in what style to create the image (e.g. "in the style of an impressionist painting" or "photorealism").

For example: /imagine a young cat on a cushion in a pencil sketch style
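
The three building blocks above (subject, details, style) can be assembled mechanically. The following is a minimal sketch; `build_prompt` is a hypothetical helper and the detail "soft morning light" is made up for illustration. It only produces the text that would then be typed into Discord.

```python
def build_prompt(subject, details=(), style=None):
    """Join a subject, optional details and an optional style into one prompt."""
    parts = [subject.strip()]
    parts.extend(d.strip() for d in details if d.strip())
    if style:
        parts.append(style.strip())
    return "/imagine " + ", ".join(parts)

prompt = build_prompt(
    "a young cat on a cushion",
    details=["soft morning light"],
    style="pencil sketch style",
)
print(prompt)
```

Keeping the parts separate makes it easy to swap only the style while the subject stays the same.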

Other styles can be requested in the same way, as can different eras with their typical style characteristics.

2. Options for refining prompts:

Midjourney offers a number of parameters that can be used to further control the output, for example:

--ar: Adjust the aspect ratio, e.g. --ar 16:9 for a wide picture.
--q: Change the quality level, e.g. --q 2 for a higher level of detail (the default is --q 1).
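
As a sketch of how these two parameters attach to a prompt, the hypothetical helper below appends --ar and --q to a description. The format check on the aspect ratio is an assumption about what a sensible value looks like. The quality value is deliberately not validated, because the accepted values differ between model versions.

```python
import re

def add_parameters(prompt, aspect_ratio=None, quality=None):
    """Append --ar and --q to a prompt after a basic sanity check."""
    suffix = []
    if aspect_ratio is not None:
        # Expect two positive whole numbers separated by a colon, e.g. 16:9.
        if not re.fullmatch(r"[1-9]\d*:[1-9]\d*", aspect_ratio):
            raise ValueError(f"aspect ratio must look like 16:9, got {aspect_ratio!r}")
        suffix.append(f"--ar {aspect_ratio}")
    if quality is not None:
        suffix.append(f"--q {quality:g}")
    return " ".join([prompt.strip(), *suffix])

print(add_parameters("/imagine a misty forest", aspect_ratio="16:9", quality=2))
```

Parameters always go at the end of the prompt, after the description.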

Prompt example:

/imagine a futuristic cityscape at sunset, vibrant colors, flying cars, skyscrapers made of glass, ultra-realistic --ar 16:9 --q 2

If you enter this prompt, the following output appears:

When the generation is complete, four image variants are displayed, which can now be edited further:

Select U1, U2, U3 or U4 to scale up one of the variants (larger and more detailed version).
Select V1, V2, V3 or V4 to create variants based on the respective image.
Remix (if enabled): Allows you to re-edit the image with slightly different prompts.

For example, if the color scheme of the fourth image is particularly appealing, but the image does not fully meet your expectations, you can click on “V4” to generate a series of four variants:

In this example, we choose the fourth image again and create an upscaled image by clicking on “U4”:

It is possible to edit the resulting image further:

Upscale (Subtle) and Upscale (Creative): These options increase the image resolution. “Subtle” improves the image without making major changes, while “Creative” uses more artistic freedom and makes the image more lively.
Vary (Subtle) and Vary (Strong): This can be used to create variations of the image. “Subtle” results in slight adjustments, while “Strong” creates striking differences.
Vary (Region): This function allows you to select a specific area of the image and change it, leaving the rest untouched.
Zoom Out 2x, Zoom Out 1.5x and Custom Zoom: These options enlarge the view and add additional elements around the main subject. “Custom Zoom” allows an individual zoom level.
Make Square: Adjusts the aspect ratio to a square format.
Arrows: The image motif can be easily moved in the respective direction using the arrows.

Once the desired result is achieved, the last thing to do is to download it – and the image of a futuristic city with flying cars is ready.

The command /describe

Sometimes there is a clear idea of the image you want, but finding the right prompt can be difficult. Despite numerous adjustments, the result often falls short of expectations. In such cases, the command /describe can be helpful to analyze existing images that are similar to the desired one. This allows you to understand how Midjourney interprets them.

After entering the command, a drag-and-drop box appears to upload the image:

Once the appropriate image has been uploaded, Midjourney will display four options that describe it. Simply select the most suitable option and refine it further to create the perfect prompt for your project. The /describe command can be a valuable tool when Midjourney's outputs don't align with your desired results.

Midjourney Prompt Guide

With the basic principles of image generation in Midjourney covered, the following compact cheat sheet lists important commands and parameters. It helps to further optimize prompts and to make image output more precise and controllable.

Instruction	Notation	Example in the prompt	Function and application
Image generation	/imagine	/imagine a sunset over mountains	Basic command to start image generation. Always followed by a text description of the desired image.
Image description	/describe	/describe [upload image]	Describes an uploaded image in four text variants.
Load images from URL	Insert image URL	/imagine [image URL] a sunset over mountains	Allows you to use image URLs as a starting point for image generation and combine them with textual instructions.
Aspect ratio	--ar	/imagine a sunset over mountains --ar 16:9	Defines the aspect ratio of the image. Default is 1:1. It is possible to change it to, e.g., --ar 16:9 for wider images or --ar 9:16 for portrait formats.
Custom sizes	--w, --h (legacy)	/imagine a sunset over mountains --w 1920 --h 1080	Set a custom image width (--w) and height (--h) in pixels in early model versions. In current versions, the image shape is controlled with --ar instead.
Image quality	--q	/imagine a sunset over mountains --q 2	Increases image quality. Standard is --q 1. Higher values (up to --q 2) increase the level of detail but extend processing time.
Select version	--v	/imagine a sunset over mountains --v 6.1	Selects a specific version of the Midjourney model, for example --v 5 for an older version or --v 6.1 for the latest one.
Stylization	--stylize (--s)	/imagine a sunset over mountains --stylize 1000	Determines how strongly the images are stylized. Values from 0 (barely stylized) to 1000 (extremely stylized) are possible.
Chaos factor	--chaos	/imagine a sunset over mountains --chaos 80	Increases the randomness of the image, with values from 0 to 100; higher values lead to unpredictable and creative results.
Sharpening	--hd (legacy)	/imagine a sunset over mountains --hd	Activated an "HD" mode in early model versions. In current versions, sharper and more detailed images are obtained with the Upscale options.
Repeat a job	--repeat (--r)	/imagine a sunset over mountains --repeat 3	Runs the same prompt several times. Each run produces its own grid of four images.
Exclude certain objects in the image	--no [object/feature]	/imagine a sunset over mountains --no trees	Excludes certain elements from the image; in this case, no trees are generated in the image.
Prioritize details	Descriptive text	/imagine a sunset over mountains, highly detailed	Asks for a higher level of detail through the prompt text itself. Works well for images with a lot of elements that need more detail.
Color balance	Descriptive text	/imagine a sunset over mountains, warm colors	Describes the desired color tone of the image in the prompt text, e.g., warm, cold or vibrant.
Shadow effects	Descriptive text	/imagine a sunset over mountains, deep shadows, realistic lighting	Asks for deeper shadows and realistic lighting in the prompt text, especially in darker scenes.
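
To see how a prompt from the cheat sheet breaks down into description and parameters, here is a deliberately naive sketch. `split_prompt` is a hypothetical helper. It assumes that parameters always start with " --" and come after the description, and it would misread a description that itself contains " --".

```python
def split_prompt(prompt):
    """Split a prompt into its description and a dict of --parameters."""
    text, *chunks = prompt.split(" --")
    description = text.strip()
    if description.startswith("/imagine"):
        description = description[len("/imagine"):].strip()
    params = {}
    for chunk in chunks:
        name, _, value = chunk.strip().partition(" ")
        params[name] = value.strip() or True  # flags without a value become True
    return description, params

description, params = split_prompt(
    "/imagine a sunset over mountains --ar 16:9 --chaos 80 --no trees"
)
print(description)
print(params)
```

A split like this is handy for keeping a log of which parameter combinations produced which images.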

Midjourney Gallery

To conclude, a selection of prompt examples accompanied by images is provided to showcase the possibilities Midjourney offers.

#1: Elephant made of glass

/imagine Elephant made of glass, Kintsugi, orange sunset, national geographic, scenic landscape --ar 16:9

#2: Black Parrot

/imagine minimalistic photography of black parrot, eating oranges, white background --ar 4:3

#3: Wide Wooded Valley

/imagine a large-format picture with a figure leaning on an orange (HEX #FF792B) tower PC in the foreground, in the background a wide wooded valley, with gently sloping cliffs and a bright, cloudy sky, low horizon, in the style of Pumpkin and Fruits by Yayoi Kusama, --ar 1600:1000

#4: Cowboy sitting at a Table

/imagine a cowboy sitting at a table in a tavern, playing poker, cowboy hat pulled low on his face, some cards in his hands, a superior laugh on his lips, frontal view, half-length figure, cards in orange (HEX #FF792B), --ar 16:9

#5: Personification of Language

/imagine a personification of language melting under an orange (HEX #FF792B) sun, surrealism in the style of Salvador Dalí, --ar 8:5
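
Gallery prompts #3 and #5 use --ar 1600:1000 and --ar 8:5, which describe the same image shape. As a small worked example, the sketch below reduces pixel dimensions to the smallest equivalent ratio. `simplify_ratio` is a hypothetical helper for preparing the value passed to --ar.

```python
from math import gcd

def simplify_ratio(width, height):
    """Reduce a width:height pair to the smallest equivalent whole-number ratio."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    divisor = gcd(width, height)
    return f"{width // divisor}:{height // divisor}"

print(simplify_ratio(1600, 1000))  # 8:5
print(simplify_ratio(1920, 1080))  # 16:9
```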

