Semantic Analysis of Group Values Structure Using Roget Thesaurus: Automated Algorithm
Keywords:
Semantic analysis, Roget's Thesaurus, Python script, vectorization of values, youth values
Abstract
The article proposes an approach to the vectorization and quantitative analysis of group values. To demonstrate the method, the value structure of a group of young people was analyzed for differences between women and men and for differences in the use of different parts of speech. The group's values were verbalized as free associations with "something most important in life". The resulting array of words was converted into an array of semantic groups using Roget's Thesaurus. Pairwise comparison of the frequency vectors of semantic groups showed a high cosine similarity (0.9664) between the subgroups separated by gender. A chi-square test of the frequencies of individual semantic groups identified the groups for which the gender subgroups differed significantly. Frequency vectors obtained from the arrays of different parts of speech had low cosine similarity in all pairwise comparisons. Nouns were most frequently used to express life values related to cause-and-effect relationships (14% of semantic groups). Adjectives were most often used to express values with a sense of personal predilection (18% of semantic groups). Verbs were most often used to express values related to liking (14% of semantic groups). The developed automated algorithm will be useful for quantitative comparison of values between different groups, as well as for calculating how consistent the values of a target group are with the declared values of commercial brands.
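The pipeline summarized in the abstract (words, then semantic groups, then frequency vectors, then cosine similarity and a chi-square comparison) can be illustrated with a minimal Python sketch. This is not the authors' script. The `TOY_THESAURUS` mapping, the group labels, the sample word lists, and all function names are hypothetical stand-ins for a lookup in a machine-readable edition of Roget's Thesaurus. The 2x2 layout of the chi-square test (one semantic group against all others, per subgroup) is likewise an assumption about how such a comparison could be set up.

```python
from collections import Counter
from math import sqrt

# Hypothetical word -> semantic group mapping (illustration only).
# A real run would look each word up in Roget's Thesaurus instead.
TOY_THESAURUS = {
    "family": "Sociality",
    "friends": "Sociality",
    "love": "Affection",
    "health": "Vitality",
    "money": "Possession",
    "career": "Action",
}


def to_group_counts(words):
    """Count occurrences of each semantic group; unknown words are skipped."""
    return Counter(TOY_THESAURUS[w] for w in words if w in TOY_THESAURUS)


def cosine_similarity(a, b):
    """Cosine similarity of two frequency vectors stored as Counters."""
    groups = sorted(set(a) | set(b))  # shared axis order for both vectors
    dot = sum(a[g] * b[g] for g in groups)
    norm_a = sqrt(sum(a[g] ** 2 for g in groups))
    norm_b = sqrt(sum(b[g] ** 2 for g in groups))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty vector has no direction
    return dot / (norm_a * norm_b)


def chi_square_2x2(count_a, total_a, count_b, total_b):
    """Pearson chi-square statistic (no continuity correction) for one group.

    Rows are the two subgroups; columns are "in this semantic group"
    versus "in any other semantic group".
    """
    table = [[count_a, total_a - count_a], [count_b, total_b - count_b]]
    n = total_a + total_b
    row_totals = [total_a, total_b]
    col_totals = [count_a + count_b, n - count_a - count_b]
    stat = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            if expected > 0:
                stat += (table[i][j] - expected) ** 2 / expected
    return stat


# Made-up associations for two subgroups (not data from the article).
women = to_group_counts(["family", "love", "health", "friends", "family"])
men = to_group_counts(["money", "career", "family", "health", "money"])

similarity = cosine_similarity(women, men)
stat = chi_square_2x2(
    women["Sociality"], sum(women.values()),
    men["Sociality"], sum(men.values()),
)
print(f"cosine similarity: {similarity:.4f}")
print(f"chi-square for 'Sociality': {stat:.4f}")
```

With one degree of freedom, a statistic above 3.841 corresponds to p < 0.05. When many semantic groups are tested in this way, some form of multiple-comparison correction would normally be considered as well.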


