
The Public Knowledge Project (PKP) develops an open source software suite for editorial management, offers services to researchers and leads various research projects on scholarly communication. Can you tell us more about its activities and expertise?

Although PKP is best known for the widely used publishing software, Open Journal Systems (OJS), we offer the academic and scholarly publishing communities much more than a suite of open source software tools. Our work can be grouped into three main areas: 1) the software development that we’re best known for; 2) the publishing services we provide (like hosting, preservation, and indexing); and 3) our research, education, and advocacy.

In software, we don’t just develop OJS: we have also developed software for academic monographs and, in part through this project, we have been supporting the development of software for editing XML documents and for collecting social media metrics.

On the services side, our PKP Publishing Services (PKP | PS) has developed into a professional journal hosting operation, backed by the expertise of the PKP team. The team offers hosting, upgrades, and customizations, among other services, and these developments eventually benefit the entire community. Details can be found here.

Finally, on the research, education, and advocacy side, there is a wide range of activities. As academics, we carry out research on open access, funding models, research metrics, academic incentives, and more. We are also an active voice in the community, participating in many open access, library, and scholarly publishing events and advocating for the thousands of journals that make up our community of users. In addition, PKP School offers online modules (in English and Spanish) on the journal editing process and other related topics.

As part of CO.SHS, PKP is working on two projects, reflecting different facets of PKP’s work. Can you tell us briefly about these?

We have been working on two main projects: the development of a tool to create XML from author submissions, and an open system for collecting alternative metrics.

The intent of the first is to allow for the low-cost and easy creation of documents that can be read by machines and used for the automated creation of HTML and PDF versions for publishing. Such documents (in the JATS XML format) are more common in the life sciences than in the social sciences, so our hope is that by facilitating the creation of these semantically marked-up documents we will help to lower the cost of publishing and make open access easier.
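To make that concrete, here is a minimal, invented JATS fragment. A production file carries far more metadata, but even this much structure is enough for a machine to generate HTML or PDF renditions automatically:

```xml
<!-- Minimal, invented JATS example: every piece of content is
     semantically labelled, so tools can transform it automatically. -->
<article article-type="research-article">
  <front>
    <article-meta>
      <article-id pub-id-type="doi">10.1234/example.doi</article-id>
      <title-group>
        <article-title>An Example Article</article-title>
      </title-group>
      <contrib-group>
        <contrib contrib-type="author">
          <name><surname>Doe</surname><given-names>Jane</given-names></name>
        </contrib>
      </contrib-group>
    </article-meta>
  </front>
  <body>
    <sec>
      <title>Introduction</title>
      <p>Body text goes here, in machine-readable form.</p>
    </sec>
  </body>
</article>
```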

The second project is intended to reduce the reliance on commercial services for the collection of social media metrics. In this case, we have teamed up with ImpactStory to create Paperbuzz, which draws on the data collected by Crossref Event Data and delivers easy-to-use metrics. The Paperbuzz service delivers metrics at the article level through an API, making it easier for journals to highlight where their articles have circulated on the web.
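To illustrate what article-level “metrics through an API” looks like in practice, here is a minimal TypeScript sketch. The /v0/doi/ path and the email parameter reflect Paperbuzz’s public API, but treat the exact shape as an assumption to verify against the Paperbuzz documentation:

```typescript
// Minimal sketch: fetch article-level metrics from the Paperbuzz API.
// The /v0/doi/<doi> path and the email parameter are assumptions based
// on Paperbuzz's public API docs; the response is left untyped to avoid
// assuming field names.

async function fetchPaperbuzz(doi: string, email: string): Promise<unknown> {
  const url =
    `https://api.paperbuzz.org/v0/doi/${encodeURIComponent(doi)}` +
    `?email=${encodeURIComponent(email)}`;
  const res = await fetch(url);
  if (!res.ok) throw new Error(`Paperbuzz request failed: ${res.status}`);
  return res.json();
}

// Usage: dump whatever event data comes back for an article's DOI.
fetchPaperbuzz("10.1371/journal.pone.0025110", "you@example.org")
  .then((data) => console.log(JSON.stringify(data, null, 2)))
  .catch((err) => console.error(err));
```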

As social media and other online platforms become more and more important in scholarly communication, there is still widespread misunderstanding about what we now call “altmetrics” really are. Can you explain to us what they can measure and what their significance is?

Ever since the early days of altmetrics, I have been advocating for people not to dwell so much on the “metrics” side of altmetrics. Studies have shown that these metrics are not highly correlated with citations, that research receives attention for a wide range of reasons, and that the metrics are subject to various biases. Trying to measure something under these conditions is bound to offer limited value, and trying to make comparisons even more so.

For me, the value of altmetrics is in giving us a greater understanding of where and how research circulates. Researchers can use these tools to explore, in new ways, who has been sharing their work and what is being said about it. That continues to be exciting.

What can the research community do to use these new metrics responsibly and hedge against the “metric fixation” problem? Do you think developing a community-driven solution, entirely open, can help address this issue?

I spoke to the first of these questions above, but let me address the second. In short, having a community-driven solution, based on open source and open data, will not in itself remove the impetus to use the metrics for measuring and ranking. However, it does open up the opportunity for the community to move the development of these metrics in new directions, as well as invite scrutiny and transparency into what is being counted, and how. These are important prerequisites for the development of tools and metrics that better serve the academic community. 

The Paperbuzz project will lead to several outputs:

  • a Web service;
  • an API to access metrics;
  • a JavaScript visualization library (PaperbuzzViz) that can be reused and customized;
  • a plugin to easily add PaperbuzzViz in OJS journals;
  • and of course the source code of all these developments. 

Yes, and the plugin has just been released! You can read the announcement here.
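To give a sense of what the plugin and PaperbuzzViz do with this data, here is a hypothetical TypeScript sketch (not the actual PaperbuzzViz code): it tallies events per source and renders a simple list, the kind of summary an article landing page might show. The field names follow Crossref Event Data conventions but are assumptions here:

```typescript
// Hypothetical illustration, not the real PaperbuzzViz API: tally
// Paperbuzz events per source so a journal page can show where an
// article has circulated.

// Assumed field names (modelled on Crossref Event Data conventions);
// check the Paperbuzz API response for the real shape.
interface PaperbuzzEvent {
  source_id: string;   // e.g. "twitter", "wikipedia"
  occurred_at: string; // ISO 8601 timestamp
}

function tallyBySource(events: PaperbuzzEvent[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const e of events) {
    counts.set(e.source_id, (counts.get(e.source_id) ?? 0) + 1);
  }
  return counts;
}

// Render the tally as a simple HTML list for embedding in a page.
function renderTally(counts: Map<string, number>): string {
  const items = [...counts.entries()]
    .map(([source, n]) => `<li>${source}: ${n}</li>`)
    .join("");
  return `<ul>${items}</ul>`;
}
```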

In parallel, you have been working on a Facebook metrics project to feed this data back into Crossref Event Data and increase its coverage. What is your progress on these fronts today?

We have fully developed a mechanism for collecting the number of times articles have been shared on Facebook. There are numerous challenges to doing so, which we presented at the STI conference last year, but we are confident we have a good mechanism in place and a tool to implement it. 

Before we can make this available, we still need to do two things: 1) implement a scheduler to query Facebook without exceeding the allowable rate limits; and 2) finish testing and working with Crossref so that we can send them the data. Once these are done, we will also need to work with Paperbuzz to make sure they can read that data and make it available in their API.
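For point 1, such a scheduler can be as simple as a queue drained at a fixed pace. Here is a minimal TypeScript sketch; the two-second spacing and the queryFacebook placeholder are illustrative assumptions, not PKP’s actual implementation:

```typescript
// Minimal sketch of a rate-limit-friendly scheduler: drain a queue of
// article URLs at a fixed pace instead of querying Facebook in a burst.
// The 2-second spacing and queryFacebook() are placeholders.

const MIN_DELAY_MS = 2000; // stay well under the platform's rate limit

async function queryFacebook(articleUrl: string): Promise<void> {
  // Placeholder for the real call that collects share counts.
  console.log(`querying share counts for ${articleUrl}`);
}

async function drainQueue(urls: string[]): Promise<void> {
  for (const url of urls) {
    await queryFacebook(url);
    // Space requests out so we never exceed the allowable request rate.
    await new Promise((resolve) => setTimeout(resolve, MIN_DELAY_MS));
  }
}

drainQueue([
  "https://example-journal.org/article/1",
  "https://example-journal.org/article/2",
]);
```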

How does Paperbuzz work? What is its coverage? Do you use all of Crossref Event Data’s sources in Paperbuzz?

Yes, it does, and PaperbuzzViz similarly displays anything available. Details on the sources available can be found in the Crossref Event Data User Guide.
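For anyone who wants to inspect those sources directly, the underlying events can also be pulled straight from the Crossref Event Data API. A minimal TypeScript sketch follows; the obj-id and mailto parameters are as documented in the User Guide, but verify the exact shape there:

```typescript
// Sketch: query Crossref Event Data for the events behind a DOI.
// The obj-id and mailto parameters follow the Event Data User Guide;
// the response is kept untyped to avoid assuming field names.

async function fetchEvents(doi: string, email: string): Promise<unknown> {
  const url =
    "https://api.eventdata.crossref.org/v1/events" +
    `?obj-id=${encodeURIComponent(doi)}&mailto=${encodeURIComponent(email)}`;
  const res = await fetch(url);
  if (!res.ok) throw new Error(`Event Data request failed: ${res.status}`);
  return res.json();
}
```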

How will you make sure the data is consistent for the same DOI across different platforms?

Several studies have examined the similarities and overlaps between the different altmetric providers and have found, perhaps unsurprisingly, that there is relatively little consistency. The web is a messy place, the timing of collection matters, and collecting mentions is not an exact science. Each platform, including both Crossref Event Data and Paperbuzz, makes implementation decisions that affect the coverage and the numbers; at least this new service is entirely transparent about those choices.

Will the service be maintained and enhanced (for example to add new data sources or update the visualization library) by PKP in the long run?

We have chosen to build this service on Crossref Event Data because that offers the community some assurance that the underlying data will continue to be available in the long term. ImpactStory similarly has a good history of keeping services running, even after it has moved on to other priorities. Our intent is to keep working with them to keep the service running for as long as it is needed.

That said, because everything is open source and open data, anyone could take up running the service in the long run.