Have you ever had a creative post idea but couldn’t quite find the right image to go with it? Or maybe you got writer’s block trying to think of a catchy headline? Recent advances in generative AI are starting to let users turn free-flowing ideas into interesting art and creative text. In this article, we’ll look at the current state of AI-generated content and discuss some of the interesting possibilities it opens up for social media.
For example, take a look at the following image, created using Stable Diffusion, one of the latest models in this space. We were going for something futuristic: the merging of animal and machine. We used this image in one of our company LinkedIn posts through the Seenly app.
What do you think? How did the AI do?
In the rest of this article, we’ll cover the four main types of content: images, text, video and audio. For each one, we’ll talk a little about the latest algorithms taking the world by storm.
Text-to-image models underlie most of the new AI-created images currently circulating. With these models, you type a sentence and a machine learning algorithm creates one or more corresponding images.
The advances this year are mainly down to improvements in ‘diffusion models’. The engineering and math behind these models are complicated for outsiders, but in short, this year’s models represent a big leap over the previous best. If you are interested in seeing some of the minutiae, you can find an introduction here.
Stable Diffusion by Stability AI is open source, meaning that you can run the model yourself if you have the experience to deploy it.
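For readers who do want to run it themselves, here is a minimal sketch of what that might look like in Python. It assumes the Hugging Face `diffusers` and `torch` packages are installed and that the publicly released `CompVis/stable-diffusion-v1-4` weights are used; the helper names `build_prompt` and `generate_image`, the output filename, and the example prompt are our own illustrations, not the exact setup behind the image above.

```python
from typing import Sequence


def build_prompt(subject: str, modifiers: Sequence[str] = ()) -> str:
    """Join a subject and optional style modifiers into one comma-separated prompt."""
    parts = [subject.strip()] + [m.strip() for m in modifiers if m.strip()]
    return ", ".join(parts)


def generate_image(prompt: str,
                   out_path: str = "post_image.png",
                   model_id: str = "CompVis/stable-diffusion-v1-4") -> str:
    """Generate one image for `prompt`, save it to `out_path`, and return the path.

    The model id and default filename are assumptions made for this sketch.
    """
    # Imported here so build_prompt works without the heavy dependencies installed.
    import torch
    from diffusers import StableDiffusionPipeline

    # Downloads the model weights on first use (several gigabytes).
    pipe = StableDiffusionPipeline.from_pretrained(model_id)
    pipe = pipe.to("cuda" if torch.cuda.is_available() else "cpu")

    image = pipe(prompt).images[0]  # a PIL image
    image.save(out_path)
    return out_path
```

Calling `generate_image(build_prompt("a fox merged with a machine", ["futuristic", "digital art"]))` would then write a PNG to disk. A GPU is strongly recommended, since generation on a CPU can take several minutes per image.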
If you’re not an expert but would like to play around with the model, it’s also available to try via the Hugging Face machine learning platform.
In the world of text generation, OpenAI’s GPT-3 is the king. The model has 175 billion parameters (read: components), was famously trained on a dataset of roughly 500 billion tokens (word fragments), and reportedly cost around $12 million to train. It has been known to produce some astounding content.
From our experience with GPT-3, it’s much better at producing short texts when given a strongly structured framework. Even though the model is described as AI, it doesn’t really understand the world around us and doesn’t fully grasp context yet. However, if you provide that context and structure, the results are incredible.
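To make the idea of a "strongly structured framework" concrete, here is a small sketch of how such a prompt might be assembled in Python. The function name `build_headline_prompt`, the field labels, and the example topics are all hypothetical; the point is only that the model is given the audience, the tone, and a worked example before being asked to complete the final line.

```python
def build_headline_prompt(topic, audience, tone, examples=()):
    """Build a structured prompt that ends where the model should start writing.

    `examples` is a sequence of (topic, headline) pairs shown to the model
    as a pattern to imitate.
    """
    # State the task, audience, and tone up front so the model has context.
    lines = [f"Write a short social media headline for {audience}. Tone: {tone}.", ""]

    # Show each worked example in the same layout as the final request.
    for example_topic, example_headline in examples:
        lines += [f"Topic: {example_topic}", f"Headline: {example_headline}", ""]

    # Leave the last headline blank for the model to fill in.
    lines += [f"Topic: {topic}", "Headline:"]
    return "\n".join(lines)


prompt = build_headline_prompt(
    topic="AI-generated images for social media",
    audience="marketing managers",
    tone="playful",
    examples=[("Remote work tips", "5 Habits That Make Working From Home Actually Work")],
)
print(prompt)
```

The resulting text can be sent to a text completion model as-is. Keeping the examples and the final request in the same layout is what gives the model a clear pattern to continue.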
The model is not open source but OpenAI provides API access for a reasonable price.
As far as we are aware, GPT-3 powers most of the big service providers in the space including CopyAI and JasperAI. These companies provide support and structure to make it simpler to get better results.
While GPT-3 is the single best large language model right now, there is a decent open source competitor. EleutherAI has created GPT-NeoX-20B, a 20-billion-parameter alternative that performs extremely well on a wide variety of tasks.
An extremely recent addition to the content generation space comes from the team at Meta AI. They have created “Make-A-Video”, which takes text input and converts it into short video clips.
This technology is currently experimental and you need to request special access. However, the paper is available to the public and it’s unlikely to be long before an open source alternative is made available.
Audio generation is the little brother of the content generation world. It hasn’t received nearly the same hype. However, there is a lot of interesting stuff happening in this space.
Audio generation can be broken down further into two separate categories: voice synthesis and music generation.
Resemble AI is doing great work cloning voices from small samples, which can then be used to generate new speech. The results are very convincing.
On the music side, Magenta AI is exploring the role of machine learning in the creative process of music making. For example, look at their work using machine learning to create string ensembles. The results are not quite at the level of the best humans, but much better than the worst!
AI in the audio creation process is sure to become more important in the content generation space. In the future, will we be able to describe music and have it automatically rendered, similar to Stable Diffusion image generation?
In this article, we’ve discussed the latest trends in AI-generated content. 2022 saw a big leap in the ability of these models, resulting in numerous fantastic service offerings. Would you use AI in your content generation stack?
The latest in content generation AI is coming to Seenly in 2023. Stay tuned and sign up for more information.