The Intelligent Friend - The newsletter that explores how AI changes the way we think and behave, only through scientific papers.
Intro 🖍️
Hi IF readers, how are you? I hope you had a wonderful week. If you ran into some obstacles instead, I hope the next one starts off great. Today we are going to explore a study that caught my attention, not only for the practical implications of its results, but also for what it could bring in terms of applications of AI as an analysis tool. Ready?
The paper in a nutshell 🔦
Title: Large language models can infer psychological dispositions of social media users. Authors: Peters and Matz. Year: 2024. Journal: PNAS Nexus. Link.
Main result: large language models like ChatGPT can infer psychological traits from individuals' social media content despite lacking specific training for this task.
Your social media, your brain?
Think about your Facebook or Instagram profile for a moment. The posts with friends whose captions exclaim that you'll never live "an adventure like this!" again. Or maybe the more enigmatic, "aesthetic" ones, where you simply quote lyrics from a song you like. Social media, as we all know, reveal a part of us - what we choose to share - and contribute to the idea that others have of us. But is the information we share enough to reveal aspects that seem more hidden, such as our psychological traits? And, above all, what does AI have to do with this?
In today's study, which fascinated me, the authors tried to understand whether Large Language Models (LLMs) are capable of deriving our psychological traits from our social media content, and with what precision.
At first glance, this might not seem like a question of primary importance. In reality, this ability has countless potential practical implications.
Knowing a person's psychological traits can mean being able to construct messages in line (as far as possible) with those traits - communications that can, therefore, be more effective.
Although this can be a good thing (especially for health or public-interest messaging), it does not always translate into positive outcomes, and several issues related to privacy and beyond emerge (which we will not explore in this article). However, as we will now discover, this study also has the merit of highlighting once again how quickly these tools are evolving as data-analysis instruments, able to carry out tasks without any task-specific training effort on the part of the user.
What can LLMs infer?
One of LLMs' most notable strengths is their ability to adapt their "knowledge" to unfamiliar scenarios and tasks, highlighting their flexibility and potential. In fact, even though more and more results are emerging showing that different forms of AI and GenAI are more capable at some activities than others, a chatbot like ChatGPT can be useful for many things without us having to spend a lot of time specifying background, as we know.
This also happens in terms of psychological analysis. For instance, although LLMs were not explicitly created to emulate human cognitive processes, studies suggest that their extensive exposure to human-generated language has enabled them to develop characteristics akin to human cognition. These include skills such as understanding others' mental states1 or replicating human decision-making biases2. Moreover, LLMs have demonstrated the ability to craft tailored persuasive messages that align with individual psychological traits and moral frameworks3 (we talked about this also in the last issue).
Building on this foundation, researchers have begun exploring whether LLMs can emulate another distinctly human capability: forming initial impressions of others’ psychological traits without prior interaction.
To understand the basic reasoning, as always, let's take an example. Imagine you know a person and learn that they listen to a certain band, say Radiohead. Then you discover that they also listen to Nirvana, Metallica, and AC/DC. You become friends, and when they invite you to their house you notice particular ways of organizing their things or their work day. Naturally, even involuntarily, you begin to form an opinion about certain aspects of their personality.
According to the authors of today's paper, the reasoning behind LLMs' capacity to derive psychological traits draws on "zero-acquaintance" research4, which has shown that humans can accurately infer personality traits from indirect behavioral cues like room organization5, music preferences6, or social media activity7.
Similarly, prior studies in computational social science show that machine learning models trained on datasets linking self-reported traits with digital footprints can predict personality traits8 from information such as Facebook Likes, playlists, or blog posts9. These findings suggest a parallel between traditional supervised learning methods and the emerging potential of LLMs to make nuanced psychological inferences, as the sketch below illustrates.
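To make that parallel concrete, here is a minimal sketch of the "classic" supervised approach, using entirely synthetic data and made-up feature names (this is not the pipeline of any of the cited papers): a regression model learns to map numeric digital-footprint features to self-reported trait scores.

```python
# Minimal, hypothetical sketch of the "classic" supervised pipeline:
# digital-footprint features in, self-reported trait scores out.
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(42)

# Toy stand-ins: 500 users, 20 numeric footprint features
# (e.g., counts of Likes per page category), one trait score per user.
X = rng.normal(size=(500, 20))                       # footprint features
y = X @ rng.normal(size=20) + rng.normal(size=500)   # synthetic self-reports

model = Ridge(alpha=1.0)
scores = cross_val_score(model, X, y, scoring="r2", cv=5)
print(f"Cross-validated R^2: {scores.mean():.2f}")
```

The point of today's paper is that LLMs skip exactly this training step: no labeled dataset of users is needed before making a prediction.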
What do LLMs understand?
As you know, I don't often go deep into the methodological aspects of a paper, both for reasons of time and because of the difficulty of some elements. However, I think the method the authors used in this study is particularly interesting: how did they determine whether AI (in this case ChatGPT) was able to derive psychological traits?
The study analyzed text data from users of the MyPersonality Facebook app, which combined validated Big Five personality assessments10 with user-donated Facebook data, including status updates. Participants completed a 100-item personality questionnaire, and their 200 most recent status updates were examined.
The Big Five personality traits, also known as the Five-Factor Model (FFM)11, describe five broad dimensions of human personality: Neuroticism (emotional instability), Extraversion (social engagement), Openness (curiosity and creativity), Agreeableness (compassion and cooperation), and Conscientiousness (organization and dependability). Of course, these descriptions are extremely condensed, but they give you an immediate idea of what is meant and what the authors of today's paper actually measured. For further information, I found this reference very useful12.
To evaluate the capacity of LLMs, such as ChatGPT, to infer psychological traits, this study compared personality scores inferred from social media data with self-reported measures.
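To give a feel for what that inference step could look like in practice, here is a minimal sketch using the OpenAI API. The prompt wording, the 1-5 scale, and the JSON output format are my own assumptions for illustration, not the authors' exact protocol.

```python
# Hypothetical sketch: ask an LLM to rate Big Five traits from status updates.
# Prompt and output format are illustrative assumptions, not the paper's setup.
import json
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def infer_big_five(status_updates: list[str]) -> dict[str, float]:
    prompt = (
        "Based on the following Facebook status updates, rate the author's "
        "Big Five traits (openness, conscientiousness, extraversion, "
        "agreeableness, neuroticism) on a 1-5 scale. "
        "Reply with only a JSON object mapping each trait to a number.\n\n"
        + "\n".join(f"- {update}" for update in status_updates)
    )
    response = client.chat.completions.create(
        model="gpt-4",  # the paper compared GPT-3.5 and GPT-4
        messages=[{"role": "user", "content": prompt}],
    )
    return json.loads(response.choices[0].message.content)

print(infer_big_five([
    "Can't wait for the weekend hike!",
    "Another quiet night in with a good book.",
]))
```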
Additionally, and intriguingly, the study examined how the amount of available data influenced inference accuracy. Analyzing correlations between self-reported and inferred personality scores for increasing numbers of status updates (from 20 to 200), the authors confirmed that more extensive data generally improved accuracy. So, at least in this case, giving the model more input to work with (though not by huge margins) increased its accuracy. However, diminishing returns were observed: while some traits, such as Openness, Extraversion, Agreeableness, and Neuroticism, showed continued gains with additional data, others, like Conscientiousness, reached near-maximum accuracy after fewer updates.
This is very interesting: these findings suggest that LLMs can make accurate predictions even with limited input data, though the accuracy varies by trait.
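For the curious, the accuracy analysis itself boils down to a simple correlation between inferred and self-reported scores, computed at each input size. Here is a sketch with placeholder numbers (not the study's data), just to show the shape of the computation.

```python
# Sketch of the accuracy analysis: correlate inferred and self-reported
# trait scores at increasing numbers of status updates. Data is made up.
from scipy.stats import pearsonr

self_reported = [3.2, 4.1, 2.8, 3.9, 3.5]   # one self-reported score per user
inferred_by_n = {                            # n updates -> inferred scores
    20:  [2.9, 3.6, 3.1, 3.3, 3.2],
    100: [3.0, 3.9, 2.9, 3.6, 3.4],
    200: [3.1, 4.0, 2.8, 3.8, 3.5],
}

for n_updates, inferred in sorted(inferred_by_n.items()):
    r, p = pearsonr(self_reported, inferred)
    print(f"{n_updates:>3} updates: r = {r:.2f} (p = {p:.3f})")
```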
However, there are also differences in precision depending on who is being analyzed: for women and younger individuals, the inferred traits aligned more closely with self-reports than for men and older adults.
Therefore, not only do LLMs demonstrate surprising abilities in deriving human psychological traits, but their accuracy also varies depending on certain conditions. Rather than stopping at the surface, one might ask: why? The question, interesting and not at all trivial, already finds some answers in today's study: the observed disparities in accuracy across demographic groups may reflect underlying biases in the training data or differences in online self-expression.
Prior research highlights that LLMs are prone to stereotyping and demographic bias, often mirroring the representation of groups in their training corpora131415.
Takeaways 📮
ChatGPT can understand your traits: ChatGPT, particularly GPT-4, demonstrated the ability to infer psychological traits from social media posts, with accuracy improving as the volume of input data increased;
Different model, different accuracy: Inference accuracy varied by trait and model version, with GPT-4's scores aligning more closely with self-reported traits than GPT-3.5's, indicating potential advancements in newer versions;
Different people, different accuracy: LLMs showed preliminary evidence of greater accuracy in inferring traits for women and younger individuals, raising questions about demographic and bias factors in the models' predictions.
Further research directions 🔭
What specific linguistic features are most correlated with inferred personality trait scores?
How does the accuracy of LLM-based personality inferences vary when applied to contemporary online language or datasets from more diverse social media platforms and populations?
Can advanced prompting strategies (e.g., chain-of-thought prompting, in-context learning) or incorporating user demographic information improve the predictive accuracy of LLMs without amplifying existing biases?
The Highlight 🥷🏻
This is the section where I'd like to highlight the amazing work that several authors do here on Substack, through links to their newsletter or specific pieces I've read. Here are some issues you can't miss reading on this platform. In the last issue I made a list of authors whose work I recommend in general. In the next ones, starting from this one, I will highlight specific issues that I found particularly intriguing.
Fear of self-promotion and related issues like impostor syndrome can have an often overlooked impact on what we do. Here
gives some practical and interesting ideas to apply in what you create.
An interesting issue on how there are important books that we often return to, by
.
What's it like to work in AI? Some answers from many authors and professionals here on Substack.
Thank you for reading this issue of The Intelligent Friend and/or for subscribing. The relationship between humans and AI is a crucial topic, and I am glad to be able to talk about it with you as a reader.
Has a friend of yours sent you this newsletter or are you not subscribed yet? You can subscribe here.
Surprise someone who deserves a gift or who you think would be interested in this newsletter. Share this post with your friend or colleague.
Kosinski, M. (2023). Theory of mind may have spontaneously emerged in large language models. arXiv preprint arXiv:2302.02083.
Hagendorff, T., Fabi, S., & Kosinski, M. (2023). Human-like intuitive behavior and reasoning biases emerged in large language models but disappeared in ChatGPT. Nature Computational Science, 3(10), 833-838.
Matz, S. C., Teeny, J. D., Vaid, S. S., Peters, H., Harari, G. M., & Cerf, M. (2024). The potential of generative AI for personalized persuasion at scale. Scientific Reports, 14(1), 4692.
Albright, L., Kenny, D. A., & Malloy, T. E. (1988). Consensus in personality judgments at zero acquaintance. Journal of Personality and Social Psychology, 55(3), 387.
Gosling, S. D., Ko, S. J., Mannarelli, T., & Morris, M. E. (2002). A room with a cue: Personality judgments based on offices and bedrooms. Journal of Personality and Social Psychology, 82(3), 379.
Rentfrow, P. J., & Gosling, S. D. (2006). Message in a ballad: The role of music preferences in interpersonal perception. Psychological Science, 17(3), 236-242.
Back, M. D., Stopfer, J. M., Vazire, S., Gaddis, S., Schmukle, S. C., Egloff, B., & Gosling, S. D. (2010). Facebook profiles reflect actual personality, not self-idealization. Psychological Science, 21(3), 372-374.
Azucar, D., Marengo, D., & Settanni, M. (2018). Predicting the Big 5 personality traits from digital footprints on social media: A meta-analysis. Personality and Individual Differences, 124, 150-159.
Yarkoni, T. (2010). Personality in 100,000 words: A large-scale analysis of personality and word use among bloggers. Journal of Research in Personality, 44(3), 363-373.
Goldberg, L. R., Johnson, J. A., Eber, H. W., Hogan, R., Ashton, M. C., Cloninger, C. R., & Gough, H. G. (2006). The international personality item pool and the future of public-domain personality measures. Journal of Research in Personality, 40(1), 84-96.
McCrae, R. R., & John, O. P. (1992). An introduction to the five-factor model and its applications. Journal of Personality, 60(2), 175-215.
Costa, P. T., & McCrae, R. R. (1999). A five-factor theory of personality. In Handbook of Personality: Theory and Research (2nd ed.).
Bolukbasi, T., Chang, K. W., Zou, J., Saligrama, V., & Kalai, A. (2016). Man is to computer programmer as woman is to homemaker? Debiasing word embeddings. In Proceedings of the 30th International Conference on Neural Information Processing Systems.
Durmus, E., Nguyen, K., Liao, T. I., Schiefer, N., Askell, A., Bakhtin, A., ... & Ganguli, D. (2023). Towards measuring the representation of subjective global opinions in language models. arXiv preprint arXiv:2306.16388.
Santurkar, S., Durmus, E., Ladhak, F., Lee, C., Liang, P., & Hashimoto, T. (2023, July). Whose opinions do language models reflect?. In International Conference on Machine Learning (pp. 29971-30004). PMLR.