Scientometrics Workshop

at the Extended Semantic Web Conference (ESWC) 2017

May 28th, 2017

Draft Program

09:00 – 09:15 Welcome

09:15 – 09:40 Paper Presentations (15 mins presentation +10 mins Q&A)
Towards an Infrastructure for Understanding and Interlinking Knowledge Co-Creation in European research
Diana Maynard, Adam Funk and Benedetto Lepori

Abstract: This paper describes the initial development of an infrastructure for understanding and visualising knowledge co-creation in European research. Datasets containing information about projects, publications and patents are enhanced with semantic information, enabling the generation of indicators that inform users about the state of the art in a research area and about new trends. This helps to resolve the problem of increasing complexity and multidisciplinarity of emerging scientific and technological research. Ontologies and Semantic Web techniques play a central role in mapping between topics, data, and user queries so that information can be enhanced, interlinked and aggregated.

09:40 – 10:05 Paper Presentations (15 mins presentation +10 mins Q&A)
Scholia and scientometrics with Wikidata
Finn Årup Nielsen, Daniel Mietchen and Egon Willighagen

Abstract: Scholia is a tool to handle scientific bibliographic information through Wikidata. The Scholia Web service creates on-the-fly scholarly profiles for researchers, organizations, journals, publishers, individual scholarly works, and research topics. To collect the data, it queries the SPARQL-based Wikidata Query Service. The display formats available in Scholia include lists of publications for individual researchers and organizations, publications per year, employment timelines, coauthor and topic networks, and citation graphs. The Python package implementing the Web service is also able to format Wikidata bibliographic entries for use in LaTeX/BibTeX.
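To illustrate the kind of request a tool built on the Wikidata Query Service might send, here is a minimal sketch in Python. It is not taken from Scholia's code base. The function names are hypothetical, the query shape is an assumption, and Q42 is only a placeholder item identifier. The sketch uses the Wikidata properties P50 (author) and P577 (publication date) and only builds the query and URL, without sending anything over the network.

```python
from urllib.parse import urlencode

# Public SPARQL endpoint of the Wikidata Query Service.
WDQS_ENDPOINT = "https://query.wikidata.org/sparql"


def build_author_works_query(author_qid: str, limit: int = 10) -> str:
    """Build a SPARQL query listing works whose author (P50) is the given item.

    Hypothetical helper for illustration; not part of Scholia.
    """
    if not (author_qid.startswith("Q") and author_qid[1:].isdigit()):
        raise ValueError("expected a Wikidata item identifier such as 'Q42'")
    return (
        "SELECT ?work ?workLabel ?date WHERE {\n"
        f"  ?work wdt:P50 wd:{author_qid} .\n"
        "  OPTIONAL { ?work wdt:P577 ?date . }\n"
        '  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }\n'
        "}\n"
        "ORDER BY DESC(?date)\n"
        f"LIMIT {int(limit)}"
    )


def build_request_url(query: str) -> str:
    """Return a GET URL asking the endpoint for JSON results."""
    return WDQS_ENDPOINT + "?" + urlencode({"query": query, "format": "json"})


if __name__ == "__main__":
    # Q42 is a placeholder identifier used only to show the query shape.
    query = build_author_works_query("Q42", limit=5)
    print(query)
    print(build_request_url(query))
```

Keeping query construction separate from the HTTP call makes the query easy to inspect or paste into the query service's web interface before any request is sent.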

10:05 – 10:30 Paper Presentations (15 mins presentation +10 mins Q&A)
Leveraging Mathematical Subject Information to Enhance Bibliometric Data
Maria Koutraki, Olaf Teschke, Harald Sack, Fabian Müller and Adam Bannister

Abstract: The field of mathematics is known to be especially challenging from a bibliometric point of view. Its bibliographic metrics are especially sensitive to distortions and are heavily influenced by the subject and its popularity. Therefore, quantitative methods are prone to misrepresentations, and need to take subject information into account. In this paper we investigate how the mathematical bibliography of the abstracting and reviewing service Zentralblatt MATH (zbMATH) could further benefit from the inclusion of MSC2010 mathematical subject information. Furthermore, the mappings of MSC2010 to Linked Open Data resources have been upgraded and extended to also benefit from semantic information provided by DBpedia.

10:30 – 11:00 Coffee

11:00 – 11:25 Paper Presentations (15 mins presentation +10 mins Q&A)
ACE: Big Data approach to Scientific Collaboration Patterns analysis
Andrei Zammit, Kenneth Penza, Foaad Haddod, Joel Azzopardi and Charlie Abela

Abstract: The characteristics of scientific collaboration networks have been extensively analysed and found to be similar to other scale-free networks. Research has furthermore focused on investigating how collaboration patterns between authors evolved over time, by providing insights into different fields of research. Numerous bibliographic datasets, such as DBLP and Microsoft Academic Graph, provide the basis for investigations and analysis of such networks. This paper presents ACE (Academic Collaboration analyzEr), an interactive framework that uses big data technologies and allows for scientific collaboration patterns to be analysed and visualised. Through ACE it is possible to reveal the key authors in particular fields of research, the topological features of the collaboration network, the network trends over time and the relationships between authors and co-authors. Furthermore, ACE allows for the discovery of potentially new collaborations between authors in the same field of research as well as fields where scientists can conduct future joint research work.
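As a small illustration of the collaboration-network idea, the following sketch builds a co-author graph from lists of paper authors and ranks authors by their number of distinct co-authors (degree). It is not ACE's implementation: the function names and toy data are hypothetical, and degree is used here only as one simple stand-in for "key author"; the abstract does not say which measures ACE uses.

```python
from collections import defaultdict
from itertools import combinations


def build_coauthor_graph(papers):
    """Map each author to the set of authors they have co-written a paper with."""
    graph = defaultdict(set)
    for authors in papers:
        unique = sorted(set(authors))
        for author in unique:
            graph[author]  # touch the key so single-author papers still add a node
        for a, b in combinations(unique, 2):
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def top_authors_by_degree(graph, k=3):
    """Return the k authors with the most distinct co-authors (ties broken by name)."""
    return sorted(graph, key=lambda a: (-len(graph[a]), a))[:k]


if __name__ == "__main__":
    # Toy data: each inner list is the author list of one paper.
    papers = [
        ["Alice", "Bob"],
        ["Alice", "Carol"],
        ["Alice", "Dave", "Bob"],
        ["Eve"],
    ]
    graph = build_coauthor_graph(papers)
    print(top_authors_by_degree(graph, k=2))
```

On real bibliographic data the same structure would be built from dataset records, and a distributed framework would replace the in-memory dictionary.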

11:25 – 11:50 Community Presentations (15 mins presentation +10 mins Q&A)
The SMS platform: Enriching scientometrics with linked data
Ali Khalili, Vrije Universiteit Amsterdam, The Netherlands

Abstract: Scientometrics has been strongly dominated by large bibliographic datasets such as WoS and, more recently, Scopus. However, this restricts scientometrics research to a small set of variables. To do richer studies, one needs to combine traditional scientometric data with other relevant datasets. The SMS platform does exactly this: linking and enriching data for studying science. In the presentation we will describe the system and show some usage examples.

11:50 – 12:20 Community Presentations (15 mins presentation +10 mins Q&A)
Andrea Nuzzolese, STLab, ISTC-CNR, Italy

Abstract: ScholarlyData is the new and currently the largest reference linked dataset of the Semantic Web community about papers, people, organisations, and events related to its academic conferences. Originally started from the Semantic Web Dog Food (SWDF), it addresses multiple issues on scholarly data representation and maintenance by (i) adopting a novel data model, (ii) establishing an open source workflow to support the addition of new data from the community and (iii) adopting an entity deduplication methodology. The novel data model consists of a new self-contained ontology, called the conference-ontology, which exploits good ontology design practices and Ontology Design Patterns (ODP) and is aligned to other relevant ontologies in the scholarly domain. The workflow is implemented in an open source tool called cLODg, which supports the production of metadata for conferences and scholarly data in nearly one click. Finally, the entity deduplication methodology relies on blocking techniques to narrow down a list of candidate duplicate URI pairs and exploits supervised classification methods to identify candidate duplicates.
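The blocking step mentioned in the abstract can be sketched as follows. This is a generic illustration, not the ScholarlyData pipeline: the blocking key (the lowercased last token of a person's name), the function names and the example URIs are all assumptions made for this example. The supervised classification step that would then score each candidate pair is not shown.

```python
from collections import defaultdict
from itertools import combinations


def blocking_key(name: str) -> str:
    """Assumed blocking key: the lowercased last token of the name."""
    tokens = name.lower().replace(".", " ").split()
    return tokens[-1] if tokens else ""


def candidate_pairs(records):
    """Group URIs by blocking key and return candidate pairs within each block.

    `records` maps a URI to a display name. Only URIs sharing a key are
    paired, which avoids comparing every URI with every other URI.
    """
    blocks = defaultdict(list)
    for uri, name in records.items():
        blocks[blocking_key(name)].append(uri)
    pairs = set()
    for uris in blocks.values():
        for a, b in combinations(sorted(uris), 2):
            pairs.add((a, b))
    return pairs


if __name__ == "__main__":
    # Hypothetical URIs and names, for illustration only.
    records = {
        "ex:person/1": "Jane Doe",
        "ex:person/2": "J. Doe",
        "ex:person/3": "John Smith",
        "ex:person/4": "Jane Do",
    }
    print(candidate_pairs(records))
```

With four records there are six possible pairs, but blocking leaves a single candidate pair for the classifier to examine. The trade-off is that a key that is too strict can miss true duplicates, such as the misspelled "Jane Do" here.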

12:20 – 12:30 Planning for the afternoon session

12:30 – 14:00 Lunch

14:00 – 14:25 Community Presentations (15 mins presentation +10 mins Q&A)
Saffron: Topic extraction, expert finding and trend analysis from scientific literature
John McCrae, Insight Centre for Data Analytics at the National University of Ireland Galway

Abstract: Saffron is a system that analyzes a collection of documents in order to understand their content deeply. This is performed primarily by extracting terminology and grouping the terms into topics, which are used to construct a heterogeneous network of authors, papers and topics. This network is then used as a basis for further analysis, including expert finding systems, browsable interfaces and, most recently, trend prediction, where we use collections of papers from conference series to predict the future topics of research at major conferences such as ACL and LREC. In this talk, I will describe the Saffron system from a technical perspective and show how it has been applied to the analysis of the scientific literature. Furthermore, I will briefly describe how this kind of analysis applies to commercial settings, including newspaper archives and e-commerce, showing the wider applicability of techniques first developed for scientometrics.

14:25 – 14:50 Community Presentations (15 mins presentation +10 mins Q&A)
PROPEL: Topic and trend analysis
Sabrina Kirrane, WU Wien, Austria

Abstract: For many years Semantic Web technologies have been an area of intense research in the academic community. In this talk, we present our analysis of the academic literature that emerged from the five most popular international publishing venues for Linked Data researchers: the International Semantic Web Conference (ISWC), the Extended Semantic Web Conference (ESWC), the SEMANTiCS conferences, the Semantic Web Journal (SWJ) and the Journal of Web Semantics (JWS). Our analysis, which is limited in scope to the last 10 years (i.e. from 2006 to 2015), examines research trends based on topics that were (i) identified in the tracks, calls for papers, workshops, tutorials and sessions of the top five venues and (ii) extracted from three seminal papers: “The Semantic Web” (Berners-Lee et al., 2001), “The Semantic Web in action” (Feigenbaum et al., 2009), and “A new look at the Semantic Web” (Bernstein et al., 2016).

14:50 – 15:15 Community Presentations (15 mins presentation +10 mins Q&A)
Metrics for Research Evaluation
Anna Tordai, Elsevier, Amsterdam, The Netherlands

Abstract: Research metrics provide institutions with a way to measure progress, understand their strengths, report on their output and support decision making. Elsevier is evolving its research metrics strategy to offer a more balanced, transparent, multi-dimensional view of research performance. The presentation will provide an overview of research metrics at four levels (institutional, journal, article and author), along with some background on how those metrics are calculated and what data they are based on.

15:15 – 15:30 Lightning talks

15:30 – 16:00 Coffee

16:00 – 17:30 World café & closing

19:30 Dinner at Pavel 2