Millions of users have turned to ChatGPT, OpenAI's AI-powered chatbot, to compose term papers, write computer code, and craft fairy tales. The tool can also analyze images: describing their content, answering questions about them, and even identifying specific individuals.
The ultimate goal is to let users upload images of problems, such as a broken-down car engine or a mysterious rash, and have ChatGPT suggest potential solutions. OpenAI, however, has no intention of turning ChatGPT into a facial recognition tool.
Jonathan Mosen has been part of a select group with access to an advanced version of the chatbot that can analyze images. On a recent trip, Mr. Mosen, who is blind and is the chief executive of an employment agency, used the visual analysis to distinguish among the shampoo, conditioner, and shower gel dispensers in a hotel bathroom. The chatbot far outperformed the image analysis software he had used in the past.
For the first time, Mr. Mosen found that he could “interrogate images.” He gave the example of a social media image described simply as a “woman with blond hair looking happy.” When he asked ChatGPT to analyze it, the chatbot responded with a detailed description, identifying a woman in a dark blue shirt taking a selfie in a full-length mirror.
Mr. Mosen could even ask follow-up questions, such as about the woman’s shoes and other elements visible in the mirror’s reflection. “It’s truly remarkable,” said Mr. Mosen, a 54-year-old resident of Wellington, New Zealand, who hosts a podcast about living as a blind person.
GPT-4 Raises Accessibility Concerns Over Facial Information
In March, OpenAI introduced GPT-4, the latest software model powering its AI chatbot. The company highlighted that GPT-4 was “multimodal,” capable of responding to both text and image prompts. While most users were limited to text-based interactions with the chatbot,
Mr. Mosen received early access to the visual analysis feature through Be My Eyes, a startup that connects blind users with sighted volunteers and offers accessible customer service to corporate clients. This collaboration between Be My Eyes and OpenAI aimed to test the chatbot’s “sight” capabilities before its official release to the public.
Recently, however, the app stopped giving Mr. Mosen information about people’s faces, citing privacy concerns. The change left him disappointed; he believes he should have the same access to information as a sighted person.
OpenAI’s Prudent Decision to Limit Facial Recognition
OpenAI limited the chatbot’s facial recognition capability out of concern about releasing a powerful and potentially intrusive feature. According to Sandhini Agarwal, an OpenAI policy researcher, the technology can mainly identify public figures, particularly people with a Wikipedia page.
It lacks the broad face-finding capabilities of tools like Clearview AI and PimEyes, which are built to search for faces across the internet. The tool can recognize OpenAI’s chief executive, Sam Altman, in photos, for instance, but it cannot identify other employees at the company.
Releasing such a feature to the public would push past what U.S. technology companies have generally considered acceptable, and could create legal trouble in jurisdictions such as Illinois and Europe, which require companies to obtain citizens’ consent before using their biometric data, including faceprints.
OpenAI also worried that the tool might make inappropriate judgments about people’s faces, such as assessing their gender or emotional state. OpenAI is working to address these and other safety concerns before the image analysis feature is widely released, Ms. Agarwal explained.
“We are eager for an open dialogue with the public,” Ms. Agarwal added. “If the feedback indicates a preference for not having this feature at all, we are fully receptive to that.”
OpenAI’s Nonprofit Division Seeks “Democratic Input” for AI Guidelines
In addition to gathering feedback from Be My Eyes users, OpenAI’s nonprofit division is exploring ways to incorporate “democratic input” into establishing guidelines for AI systems.
Ms. Agarwal clarified that the development of visual analysis was not unexpected, given that the model was trained on a combination of images and text sourced from the internet. She highlighted that celebrity facial recognition tools, like the one offered by Google, already exist. Google provides an opt-out option for famous individuals who prefer not to be recognized, and OpenAI is considering adopting a similar approach.
According to Ms. Agarwal, OpenAI’s visual analysis can produce “hallucinations” similar to those seen with text prompts. Given a picture of someone on the brink of fame, for instance, the tool might hallucinate a name, associating the image with a different tech CEO rather than the right person.
There have been instances of inaccuracy with the tool’s descriptions as well. For example, Mr. Mosen reported that the tool once confidently described a remote control to him, claiming it had buttons that were not actually present.
Microsoft, which has invested $10 billion in OpenAI, also has access to the visual analysis tool. Some users of Microsoft’s AI-powered Bing chatbot have seen the feature in a limited rollout; when they upload images, they receive a message noting that “privacy blur hides faces from Bing chat.”
AI’s Boundary-Breaking Impact
Sayash Kapoor, a computer scientist and doctoral candidate at Princeton University, used the tool to solve a captcha, a visual security check meant to be passable only by human eyes. Even as it cracked the code and identified the two obscured words, the chatbot noted that “captchas are designed to prevent automated bots like me from accessing certain websites or services.”
Ethan Mollick, an associate professor of innovation and entrepreneurship at the University of Pennsylvania’s Wharton School, said that artificial intelligence is rapidly erasing the boundaries that once distinguished humans from machines.
Last month, the visual analysis tool unexpectedly appeared in Mr. Mollick’s version of Bing’s chatbot, giving him early access without any notice. He has since kept his computer running continuously to avoid losing the feature. In one test, he gave it a photo of condiments in a refrigerator and asked Bing to suggest recipes using those ingredients; the chatbot proposed “whipped cream soda” and a “creamy jalapeño sauce.”
Both OpenAI and Microsoft recognize the power of this technology and its potential privacy implications. A Microsoft spokesperson stated that the company is not divulging technical details about the face-blurring feature but is actively collaborating with OpenAI to ensure the responsible and safe deployment of AI technologies.