In this article we will apply various statistical analyses with the R programming language (word frequency, bigrams, word co-occurrence and Topic Modelling) to the bibliographical reviews of 1932-1933 from the Spanish journal Índice... more
In this paper, we focus on cross-modal (visual and textual) e-commerce search within the fashion domain. Particularly, we investigate two tasks: 1) given a query image, we retrieve textual descriptions that correspond to the visual... more
Text classification typically performs best with large training sets, but short texts are very common on the World Wide Web. Can we use resampling and data augmentation to construct larger texts using similar terms? Several current... more
The richness of social media data has opened a new avenue for social science research to gain insights into human behaviors and experiences. In particular, emerging data-driven approaches relying on topic models provide entirely new... more
When did European modes of political thought diverge from those that existed in other world regions? We compare Muslim and Christian political advice texts from the medieval period using automated text analysis to identify four major and... more
In this paper, we present a framework that combines automatic and manual approaches to discover themes in disaster-related tweets. As a case study, we decided to focus on tweets related to Typhoon Haiyan, which caused billions of dollars... more
Legal scholars study international courts by analyzing only a fraction of available material, which leaves doubts as to whether their accounts correctly capture the dynamics of international law. In this paper we use dynamic topic... more
Increasingly, management researchers are using topic modeling, a new method borrowed from computer science, to reveal phenomenon-based constructs and grounded conceptual relationships in textual data. By conceptualizing topic modeling as... more
This is the definition given by New York Times bestselling author Chuck Martin for Artificial Intelligence. To be precise, these devices are learning from the consumers who apply them. It was predicted that 2018 would see this learning... more
The aim of this chapter is to show some basic methods using R to analyze text content to discover emergent issues and controversies in diverse corpora. As a specific case study, I investigate the culture of microblogging academics... more
The goal of this teaching material is to provide a better understanding of the concept of digital humanities from various viewpoints of distinguished scholars in the field, identify the characteristic features of digital humanities... more
Recent discussions in the field of Linguistic Landscapes (LL) have emphasized the importance of acknowledging LL 'actors' and their role in interpreting or interacting with language in place (Lou 2016; Banda & Jimaima 2015; Barni & Bagna... more
This thesis was to have been presented for defense in the November 2021 master's degree session in "Sociologia e ricerca sociale" (Sociology and Social Research). Owing to the extraordinarily improper conduct of Prof. Bracciale and Dr.... more
Topic modeling has become very popular in digital humanities. It is an easy and very powerful method for getting an overview of the contents of large textual collections. This has made it very appealing to humanists. But applying topic models in such... more
The term digital transformation (DT) captures the changes in our personal, social, economic, political, cultural and academic life brought about by the rapid diffusion of information and communication technologies. With the increasing... more
A huge amount of potentially dangerous COVID-19 misinformation is appearing online. Here we use machine learning to quantify COVID-19 content among online opponents of establishment health guidance, in particular vaccinations... more
The classification of the emotions expressed in social media is of great importance for its use in related fields such as media and developing technology. Latent Dirichlet Allocation (LDA), a topic modeling... more
We provide a brief, non-technical introduction to the text mining methodology known as topic modeling. We summarize the theory and background of the method and discuss just what kinds of things are found by topic models. Using a text... more
Sociolinguistic research has predominately relied on spoken language to understand how social structures influence and are influenced by communication and interaction. This dissertation, however, turns to the increasingly prevalent and... more
Recent corpus techniques ask literary analysts to bracket the interpretation of meaning so that we may trace the motions of mind. These techniques allow us to think of the mind as being, in some aspect, a high-dimensional space of verbal... more
Simple exploratory text mining and document clustering of journal articles from JSTOR’s Data for Research service. Go to http://dfr.jstor.org/, make a request for data (specify CSV as output format and Word Counts as data type), then once... more
The concept of "life" certainly is of some use to distinguish birds and beavers from water and stones. This pragmatic usefulness has led to its construal as a categorical predicate that can sift out living entities from non-living ones... more
Legitimacy is a crucial factor determining the success of technologies in the early stages of development and for maintaining resource flows as well as public and political support across the technology life cycle. In sustainability... more
Sociology's self-understanding holds that the frequency of economic topics in sociology has peaked twice: first during the classical era between 1890 and 1920 and second after Mark Granovetter’s often cited 1985 article. This paper tests... more
In this work we analyse the evolution of the vulnerability concept in the research streams of climate change adaptation (CCA) and disaster risk reduction (DRR). We combine a traditional literature review with data mining procedures... more
An important text mining problem is to find, in a large collection of texts, documents related to specific topics and then discern further structure among the found texts. This problem is especially important for social sciences, where the... more
This paper suggests the use of automatic topic modeling for large-scale corpora of privacy policies using unsupervised learning techniques. The advantages of using unsupervised learning for this task are numerous. The primary advantages... more
I present a detailed introduction to Topic Models (TM), a family of probabilistic models for (mainly) document modeling. I introduce and motivate the model, and illustrate its applications in Natural Language Processing (NLP), with the... more
Political campaigns mostly run parallel to each other during an election cycle, but intersect when the main candidates face off for televised debates. They offer supporters of these candidates a chance to engage with each other while... more
In this article we apply the method of quantitative textual analysis known as 'topic modelling' to a significant Australian legal text corpus: that of judgments of the High Court of Australia from 1903 to 2015. The High Court of... more
The urban policy mobility literature describes the widespread circulation of policy ideas while highlighting their mutations along the way. At the same time, the literature often analyzes the localization of such ideas by examining their... more
The origins of the pejorative conception of the "medieval" must be traced to the Renaissance, when, with the division of historical time, the Middle Ages came to be seen as a "barbarous" period in contrast to the cultural splendor... more
Drawing on written sources (archaeological material is not used), this article examines the question of when old age began in medieval Rus'. It finds that old age set in between the ages of 50 and 60. Persons who lived to 70,... more
This is the project report of the Network Institute project "Do you see what I am talking about?", which is a follow-up to the earlier project "Polemics Visualized".
The aim of this article is to analyze the discursive background for the teacher characters in the Soviet school story of the postwar period. The 1.8-million-word corpus for the study was compiled from the novels about school and... more
Aspect extraction is one of the key tasks in sentiment analysis. In recent years, statistical models have been used for the task. However, such models without any domain knowledge often produce aspects that are not interpretable in... more
Since 2015 there has been a surge of academic publications and citations focused on consumer food waste. To introduce a special issue of Appetite focused on the drivers of consumer food waste we perform a transdisciplinary and historical... more
Our article presents the aims, functions and use of the TANIT (Text ANalysIs Tools) system. The goal of the TANIT system is to use computational linguistic processing of Hungarian-language texts for the comparative analysis of documents... more
Despite being a relatively new discipline, Chinese Interpreting Studies (CIS) has witnessed tremendous growth in the number of publications and diversity of topics investigated over the past two decades. The number of doctoral... more
Can "distant reading" and digital tools enhance the history of technology by revealing hitherto undetected patterns in the record? Using the parliamentary debates of Britain in the nineteenth century, this essay revisits the history of... more
The typological distinction between pilgrims and tourists has often been drawn in tourism studies. This article aims at complementing this debate by applying computational techniques to analyse discourses in a corpus of blogs from the... more
This paper focuses on developing a hashtag recommendation system for an online social network application with a Peer-to-Peer infrastructure motivated by the BestPeer++ architecture and the BATON overlay structure. A user may invoke a... more
This study introduces a comparative approach to study user comments on the same news content across online platforms while distinguishing between soft and hard news genres. Empirical analysis focuses on Israel’s popular news website Ynet.... more
With the growth of the internet, short texts such as tweets from Twitter, news titles from RSS feeds, or comments from Amazon have become very prevalent. Many tasks need to retrieve information hidden in the content of short texts. So... more