Like every other basement-dwelling neck-bearded nerd on the intarwebs, I’ve been getting a right froth-on lately about the sudden explosion in Artificial Intelligence [AI] and, more specifically, AI chatbots.
I’ve been having a fairly regular play around with OpenAI’s ChatGPT [at least until they fucked up access to the site by funnelling it through bloody Cloudflare]. And, yesterday, I briefly tried out the first public beta of Google’s Bard [which, between me, you and the gatepost, was slightly underwhelming compared to ChatGPT. But it’s early days yet!].
Anyway, I’ve found ChatGPT to be really useful for those idle questions which flit across your mind and are difficult to do a conventional web search for, or [and this is what I’ve used it for most] for quickly getting a programming hint when I’m trying to remember how to do something in whatever is my current "Flavour of the Month" programming language --which I am determined to learn but will give up on once it gets too boring, too difficult… or both.
I’ve also had a few "philosophical" conversations with ChatGPT, which were quite fun. It can be very good indeed. Almost-but-not-quite good enough to convince you you’re dealing with a thinking machine. But, if you’re of a critical bent, you soon realise that, a lot of the time, it’s just saying 'I don’t know that' in the time-honoured fashion of chatbots, characters in computer games and so-called smart speakers, down the ages. It’s just doing it in a more convincingly evasive way. A bit like a silicon politician.
Anyway, that’s all by the by. I’m not going to bore you more than necessary by waffling on about all the good and clever stuff AI chatbots can do now. There’s enough of that kind of material out there to keep you reading from now til Christingmouse. Instead, I’m going to point and snigger at an area where --much to my surprise, given how good it is generally-- the AI chatbot is a complete and utter dunce. Namely "Art Appreciation" or, to put it less pseudishly, "saying what’s going on in a picture".
Yesterday, I uploaded this 'ere cartoon I drew of the missus giving me a right clip round the ear and asked both ChatGPT and Bard:
'Describe what is happening in this image'
ChatGPT:
In the image, there is a person standing on a rocky cliff overlooking a scenic valley with mountains in the background. The person is standing with their back facing the camera, looking towards the valley. They are wearing a backpack and what appears to be a hat or helmet, and their arms are slightly outstretched to their sides. The person’s body language and the scenic background suggest that they may be enjoying the view or possibly engaging in some form of outdoor activity, such as hiking or mountain climbing. The lighting in the image is soft and diffused, with some shadows visible in the foreground, suggesting that it may be early morning or late afternoon. Overall, the image conveys a sense of awe and appreciation for nature’s beauty.
Bard:
The image shows a group of people gathered around a table, looking at a map. The map is of a city, and the people are pointing to different locations on the map. It seems like they are planning a trip or trying to figure out how to get somewhere.
The image is well-composed, with the people and the map evenly distributed in the frame. The colors are also muted and complementary, which helps to create a calm and peaceful atmosphere.
Overall, the image is a well-crafted and visually appealing depiction of a group of people planning a trip.
The Google Bard answer was so unbelievably shite, I thought I’d give it another go. After all, it was its first day at school:
The image shows a man and a woman standing in front of a large tree. The man is holding a shovel, and the woman is holding a watering can. They are both smiling and look happy to be together. The tree is a beautiful shade of green, and the leaves are healthy and vibrant. The sky is blue and clear, and there are a few puffy white clouds in the distance. The overall scene is peaceful and serene.
It seems like the man and the woman are planting a tree. They may be doing this to commemorate a special occasion, or they may be simply enjoying the process of planting something new. Whatever the reason, they seem to be happy to be spending time together and making the world a little bit more beautiful.
Holy onion Jeebus! It’s not just that the answers are wrong. It’s the way that the AI chatbots are actually inventing details that aren’t there. And, as I alluded to above, that’s all the more surprising given how uncannily "intelligence-faking" I have found these systems in other regards.
But I know what you’re saying: 'Be fair, stío. Those weren’t real photos you gave the AIs to work on. Nor even artworks produced by someone with more talent than a chimpanzee with a paintbrush stuck up its arse!'
And you’re right, dear reader. I thunk as much myself. Which is why I decided to make things a bit easier by giving the chatbots an actual photograph to analyse instead. So I treated them to this snap of the magnificent "Tewwy McCaaaaaahn" AKA Dennis Waterman, being molested by a couple of pissed munters, while singing in a pub.
By the way, if you want to enjoy this spectacle in its full "Major Motion Picture" horror, here it is. But, be warned: have some mind bleach standing by!
Now, I’m not expecting either chatbot to recognise Dennis, or to grade the munterosity of his molesters. But… a bloke, with a woman on either side. At least that much should be obvious, right? Over to you, chatbots…
ChatGPT:
In the image, there are several people gathered in a park or a public space. The foreground of the image shows a man in a white shirt and black pants lying on the ground, with his legs crossed and his hands resting on his chest, appearing to be in a meditative pose. There is a small pile of rocks on his chest, and he seems to be focused on them.
In the background of the image, there are several other people sitting or standing, some of them also in meditative poses. There are trees and grass around them, and the sky appears to be a bright blue. The scene suggests that these people have gathered for a group meditation session, possibly in a natural or peaceful setting.
Bard:
The image shows a group of people sitting in a circle on the ground. They are all looking at a laptop that is in the center of the circle. The laptop is showing a video of a young woman who is talking to the camera. The people in the circle are all smiling and laughing. It seems like they are enjoying the video.
Based on the image, it is likely that the people in the circle are friends or family who are watching a video together. The video could be anything from a funny video to a heartwarming video. The people in the circle seem to be enjoying themselves and each other’s company.
This is such fun! Let’s have just one more. Behold the legendary "Sassige Monster" AKA wee me, fucking around with some barbecuing sausages, from several years back. Now, even at the best of times, I wouldn’t expect either AI to know what the feck is going on here. I mean, I was there and I haven’t a clue. But… just for the "LULZ", let’s give it a go…
ChatGPT:
In the picture, a young woman is standing on a rocky shore looking out towards the ocean. The sun is setting in the background, casting a warm orange and pink glow over the sky and reflecting off the water. The woman is dressed in casual clothing and has her hair blowing in the wind. She appears to be taking in the beautiful scenery and enjoying the peaceful moment. The rocky terrain and crashing waves suggest that she may be at a remote or rugged beach location. Overall, the picture captures a serene and tranquil moment in nature.
Bard:
The image shows a young woman sitting in a chair, reading a book. She is surrounded by a variety of plants, some of which are in pots and some of which are growing in the ground. The woman appears to be relaxed and enjoying the peace and quiet of her surroundings.
The image is likely to evoke feelings of calm and tranquillity in the viewer. It may also inspire a sense of wonder and curiosity about the plants that are growing around the woman.
Right. I think we’ll leave it there. Though I could bore the gonads off you with several more of these howlers. Suffice to say that, not only are they woefully inaccurate, but the AIs seem to completely invent things that aren’t even there.
The most noticeable occurrence of this was with one of the other photos I tried, which was a screengrab taken from a family group Skype call [so the various heads & shoulders of the people involved in the call, arranged in a grid pattern. If you’ve ever used Skype or similar, you’ll know what I mean]. I’m not posting the photo here as the other people involved might not want their ugly mugs paraded on the intarwebs. But anyway, I’m sure you can picture the scene. All the people were either laughing or smiling, as I’d just made one of my hilarious quips [ahem!]. Here’s what ChatGPT made of it:
In the image, there is a person who is standing in the middle of a city street holding a sign that reads "Stop Asian Hate" in bold letters. The person is wearing a face mask, a black hoodie, and black pants. The background shows a row of buildings with storefronts, and there are several parked cars on the street. The person appears to be part of a demonstration or protest against racism and violence towards the Asian community, which has been on the rise in many parts of the world. The person’s action is a visible form of activism and a call for solidarity and action against hate and discrimination towards the Asian community.
Like I say. Not just the woeful inaccuracy. But the complete invention of things that weren’t even there and further fabrication of non-existent detail on the non-existent thing, e.g. '…holding a sign that reads "Stop Asian Hate"…'
Very odd indeed. I can only imagine these things will continue to improve in leaps and bounds, as they have been doing already. But, in the meantime, it’s spooky to be present at this stage, where you almost get the feeling of the Artificial Intelligence trying to make sense of the visual overload of our world… and having some pretty mad hallucinations, in the process.
Anyway, as a final flagellation of the deceased equine, I fed Bard’s description of the Sassige Monster photo into Stable Diffusion --yet another AI, this one of a similar kind to DALL-E, in that it generates images from textual prompts. So see if you can spot the difference between the Sassige Monster photo above and this: