When researchers Matthew Magnani of the University of Maine and Jon Clindaniel of the University of Chicago prompted ChatGPT and DALL-E to depict Neanderthals in text and images, the AIs produced grossly inaccurate portrayals of what these ancient humans looked like and how they lived.
…
“A majority of images depict human-like figures, slightly stooped, with large quantities of body hair. These depictions have more in common with early twentieth-century drawings of Neanderthals than contemporary scientific knowledge,” Magnani and Clindaniel said in a study recently published in Advances in Archaeological Practice.
Exactly what information is used to train generative AI is a matter of speculation, as these programs tend to operate as relative black boxes. We do know, however, that their training data sets are collected by scraping the internet. Even though open access has made many scientific articles publicly available, many more remain paywalled and therefore much harder to scrape. Because of that, and because other copyrighted material is similarly inaccessible, the most easily collected information may be outdated.
