Impact of Sci-Hub on the subscription model #35

Open
StuartCT opened this issue Nov 30, 2017 · 7 comments

Comments

@StuartCT

In terms of your point that the subscription model is becoming unsustainable as a result of sci-hub, I wonder if that is really true? We have had sci-hub now for nearly six years and yet we still have pretty widespread subscriptions (apart from a few specific regional disputes with certain publishers).

Presumably this is due to the reluctance of librarians to sanction the use of illegal resources and cancel their subscriptions accordingly? Or maybe that (despite the Streisand Effect) sci-hub is still not all that widely known amongst researchers.

I’d be interested in your thoughts on that.

@dhimmel
Collaborator
dhimmel commented Dec 1, 2017

@StuartCT great question. Also very timely, given that a reviewer made a similar point:

The authors make one claim that seems to me not supported by the evidence. They claim that their paper shows that toll-based publishing is becoming unsustainable. But they also point to a recent study that estimates that the ratio of the number of times papers are downloaded from the publisher to downloaded from SciHub is 48:1 for Elsevier and 20:1 for Royal Society of Chemistry. This suggests that SciHub so far has very little influence on the subscription demand for journal articles.

So this issue is a great opportunity for us to really flesh out our stance and take a second look at the available data.

We have had sci-hub now for nearly six years and yet we still have pretty widespread subscriptions

That's right. While there have been large scale subscription reductions in the past decade, many libraries still subscribe to lots of toll access journals. As an aside, thanks to work by @Publicus, we'll soon know how extensive the University of Pennsylvania's coverage is.

Presumably this is due to the reluctance of librarians to sanction the use of illegal resources and cancel their subscriptions accordingly?

I agree. Most librarians are not going to cancel subscriptions that their members are commonly using and refer their users to Sci-Hub instead. But I think Sci-Hub usage does affect the decision making of librarians in two important ways:

  1. Subscription usage goes down as members use Sci-Hub for convenience despite having authorized access. Authorized access may become more burdensome as universities start requiring two-factor authentication, as Penn recently did for staff. Librarians rely on usage metrics to determine whether to renew subscriptions. If usage goes down, they may decide a journal is no longer worth the price because the cost per use is just too high. Publishers could of course reduce their cost per article proportionally, but we haven't seen recent reductions in subscription costs (quite the opposite, in fact).

  2. When libraries drop access to a journal, they no longer receive complaints from faculty. If most users are aware of Sci-Hub, it's easier for them to just start using Sci-Hub for the dropped journal than to complain to their librarians. Librarians may become more "trigger happy" regarding cancellations when they no longer encounter as much pressure to sustain them.
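To make the cost-per-use argument in point 1 concrete, here is a minimal sketch. All of the figures (price, download count, and the fraction of use that shifts to Sci-Hub) are hypothetical and chosen only for illustration; they do not come from any library's data.

```python
def cost_per_use(annual_price, authorized_downloads):
    """Subscription price divided by downloads recorded through authorized access."""
    if authorized_downloads <= 0:
        raise ValueError("authorized_downloads must be positive")
    return annual_price / authorized_downloads


# Hypothetical figures, for illustration only.
price = 10_000      # annual subscription price in dollars (assumed)
downloads = 2_000   # authorized downloads per year (assumed)
shifted = 0.25      # fraction of downloads that move to Sci-Hub (assumed)

before = cost_per_use(price, downloads)
after = cost_per_use(price, downloads * (1 - shifted))
print(f"cost per use: ${before:.2f} -> ${after:.2f}")
```

Under these assumed numbers, the journal looks a third more expensive per recorded use even though the price and the readers' actual demand are unchanged.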

Note that neither of the above factors requires librarians to condone or even be aware of Sci-Hub. So what determines how quickly these factors affect overall subscriptions? Primarily the overall awareness and usage of Sci-Hub.

While Sci-Hub has been around for 6 years, I'd argue it has only recently become common knowledge and has yet to become common practice among most library members. Most of my evidence here is anecdotal from talking to other researchers. However, some existing data agrees:

  • A study conducted from February to June 2016 found only 19.2% of medical students in Latin America were aware of Sci-Hub.

  • Search interest resulting from domain outages suggests recent growth in Sci-Hub usage:

    Ⓓ and Ⓛ correspond to suspended Sci-Hub domains. While it's hard to separate search interest generated by news coverage from search interest generated by frustrated users who cannot access a site, I think Ⓛ is most likely the latter. Sci-Hub has seen many recent large press events without sudden large upticks in search interest. Therefore, I suspect Ⓛ resulted from users who were confused when their habitual Sci-Hub domain no longer resolved. This unprecedented search interest therefore reflects recent growth in Sci-Hub usage.

  • The aforementioned licit to illicit download ratios from Table 1 of Shadow Libraries and You use the Sci-Hub access logs from around the start of 2016. This is now almost 2 years ago. About a year ago, Alexandra Elbakyan told Nature that Sci-Hub serves "3% of all downloads from science publishers worldwide".
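The reviewer's ratios and Elbakyan's 3% figure are easier to compare when both are expressed as shares. A minimal sketch, assuming the share is taken over combined publisher and Sci-Hub downloads (the function name is hypothetical):

```python
def illicit_share(licit_per_illicit):
    """Sci-Hub's share of combined downloads, given a licit:illicit ratio of N:1."""
    return 1 / (licit_per_illicit + 1)


# Ratios quoted by the reviewer (licit downloads per Sci-Hub download).
ratios = {"Elsevier": 48, "Royal Society of Chemistry": 20}
for publisher, ratio in ratios.items():
    print(f"{publisher}: {illicit_share(ratio):.1%}")
```

This gives roughly 2.0% for Elsevier and 4.8% for the Royal Society of Chemistry. Elbakyan's 3% refers to all science publishers worldwide and to a later period, so the numbers are only loosely comparable.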

@stevemclaugh do you know of any estimates of Sci-Hub adoption or awareness that I missed or that are more recent?

While Sci-Hub awareness and usage appear to be growing quickly, I think they're far from peaking. So the question is how long will it take Sci-Hub to displace subscription access? Will Sci-Hub or an alternative survive long enough to irreversibly and drastically affect subscriptions? @tamunro, a study coauthor with the most librarianship experience, believes that 2017 has been a turning point regarding large scale subscription cancellations.

In conclusion, our hypothesizing over Sci-Hub's effect on future subscriptions requires a bit of speculation. However, there are strong financial incentives to not subscribe to journals. I think the collapse of the subscription publishing model has been a long time in the making. Sci-Hub will be the force that pushes it over the brink.

@StuartCT
Author
StuartCT commented Dec 1, 2017

Thanks @dhimmel. Those are both excellent points about librarians.

I guess the other thing that may prevent sci-hub having the effect on the subscription model you propose is its robustness.

  1. It must be very heavy on server resources to operate, and that potentially limits the number of options for people seeking to mirror it.
  2. As each domain name gets blocked, Alexandra has to create another one and somehow get the message out to users that there's a different URL. This constant disruption is surely going to impede adoption. As soon as communities of users build, they will have to decamp and move somewhere else. Some may simply see a 404 and assume the whole project has died.
  3. If (as a result of 2.) sci-hub has to move purely to Tor, it's going to severely limit usage, as most people are not prepared to visit the dark web (even if they know about it).

@dhimmel
Collaborator
dhimmel commented Dec 1, 2017

Great points @StuartCT. It'll be interesting to see how committed users are to accessing content through Sci-Hub given the inconvenience.

Constantly changing domain names is disruptive to user experience. However, quickly looking up Sci-Hub's current domain on Wikipedia before each use may still be faster than authorized alternatives. Still, I agree that many users will not be extremely committed to using Sci-Hub. They may stop using it if it doesn't match their familiar internet usage patterns (e.g. if it requires Tor or other workarounds to DNS censorship).

The three weaknesses you mention are exactly what the decentralized web movement has been trying to solve. If IPFS becomes standard in a few years, then these issues may just be temporary. However, it's difficult to predict how quickly decentralized web technology will develop and whether it will be adopted.

As the Russia incident and recent legal DNS suspensions show, Sci-Hub is not robust, but instead fragile. This fragility is somewhat mitigated since Sci-Hub uploads downloaded articles to LibGen scimag. But Sci-Hub does appear to be the only entity providing access with a user experience/interface that appeals to the masses.

@tamunro
Collaborator
tamunro commented Dec 6, 2017

I think this is an important point. The reviewer's objection can be met with a bit of hedging - i.e. "these early signs suggest ... may in future become unsustainable" or similar. A good analogy here would be newspapers and record companies: revenues didn't plummet overnight, but they have steadily declined, and the end result is disastrous.

Some news on this topic: the Projekt DEAL cancellations in Germany will spread to hundreds of institutions at the end of the month. I make it 186. If we assume they're paying $500k/year on average, in the middle of Bergstrom's range for the Elsevier bundle, that's nearly $100 million/year, which is enough to cut into Elsevier's growth. Plus there's Taiwan, Peru, etc., but I haven't seen numbers.
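The back-of-the-envelope estimate can be written out as follows. The institution count and the average fee are the ones stated in the comment; the average fee is itself an assumption, so the total is only a rough order of magnitude.

```python
institutions = 186          # cancelling institutions, as counted in the comment
avg_annual_fee = 500_000    # assumed average Elsevier bundle price in USD per year

total = institutions * avg_annual_fee
print(f"${total / 1e6:.0f} million per year")
```

That comes to $93 million per year, which is the basis for "nearly $100 million".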

I'm confident that mass cancellations like this by leading universities, of all content from the biggest scientific publisher, are unprecedented. It seems to me it wouldn't have been possible in the past without crippling their research program. I don't know how to prove it though. I think it just needs to be phrased tentatively.

@tamunro
Collaborator
tamunro commented Dec 22, 2017

Here's a conference paper @dhimmel - very relevant to the question of what librarians are thinking:
Houle, L. (2017-08-16). Sci-Hub and LibGen: what if… why not?

Houle is director of library collections at McGill, Canada's highest-ranked and wealthiest university. He "has taken a major role in the negotiation of licenses ... at the institutional level, at the provincial level ... and including work across Canada through cooperative agreements with CRKN."

He asks "can we imagine ... substituting some or most of our journal collection funds with ... Sci-Hub and LibGen ...?" and finally concludes "Using Sci-Hub/LibGen or not should remain a personal decision". Both radical statements, I think.

The paper's mostly devoted to coverage calculations, nothing surprising; but of interest, p. 12: 98% of Nature articles and 100% of Science articles were available from Sci-Hub and LibGen within 24 h of publication; by contrast only 8% and 9% respectively were indexed by Google Scholar within that time!

@dhimmel
Collaborator
dhimmel commented Jan 4, 2018

Sci-Hub adoption growth

In my comment above, I looked at a few ways to assess growth in Sci-Hub adoption. I was responding to the reviewer comments and wrote a bit more on the issue, which I thought I should cross-post here. The relevant portions follow:

Based on the following quote (from the Nature's 10 article mentioned above), Sci-Hub usage appears to have increased by 79% from 2015 to 2016:

According to Elbakyan’s figures, the site … is likely to serve up more than 75 million downloads in 2016 — up from 42 million last year

We can also roughly assess Sci-Hub adoption from the Google Trends data in the updated Figure 1. We historically see a large spike in searches following domain outages, as existing users presumably Google how to access Sci-Hub. Following the suspension of sci-hub.org — Ⓓ in Figure 1, search interest peaked at 58 on the week of 2015-11-08, according to our data. When four of Sci-Hub's publicized domains were suspended in late 2017 — Ⓛ, search interest peaked at 215 for the week of 2017-12-10. Hence, we can estimate that Sci-Hub adoption increased by 88% annually over the 764-day period (Python calculation below).

>>> (215.01 / 57.53) ** (365 / 764) - 1
0.8773352629656139
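For comparison, the same compound-growth formula can be applied to both estimates. This is a minimal sketch: the function name is hypothetical, the 764-day interval is taken as stated above, and treating Elbakyan's two download figures as exactly 365 days apart is an assumption.

```python
def annualized_growth(start, end, days):
    """Compound annual growth rate between two values observed `days` apart."""
    return (end / start) ** (365 / days) - 1


# Google Trends peaks following the two domain suspensions (values from above).
trends = annualized_growth(57.53, 215.01, 764)

# Elbakyan's download figures for 2015 and 2016, assumed one year apart.
downloads = annualized_growth(42e6, 75e6, 365)

print(f"search interest: {trends:.1%}, downloads: {downloads:.1%}")
```

The two independent estimates, about 88% from search interest and about 79% from reported downloads, are of the same order.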

Houle 2017

On a slightly unrelated note, regarding "Sci-Hub and LibGen: what if… why not?": I came across that study before but was turned off by the improper pie chart on page 6. However, the lag time analysis @tamunro mentions is interesting, and I'll reconsider referencing it.

Migrating Sci-Hub domain names

@StuartCT and I discuss the effect of Sci-Hub domain censorship on adoption above. Sci-Hub is currently online at the domains below, listed with their creation dates according to their whois info (from the commit message in 8e39d11):

Domain          Creation date
sci-hub.hk      2017-11-23
sci-hub.la      2015-11-24
sci-hub.mn      2017-11-23
sci-hub.name    2017-11-23
sci-hub.tv      2017-11-23
sci-hub.tw      2017-11-23

as each domain name gets blocked, Alexandra has to create another one and somehow get the message out to users that there's a different URL

It appears that creating new domains was not much of an issue here. Interestingly sci-hub.la appears to have been registered by Alexandra following the sci-hub.org suspension resulting from the Elsevier case. However, I don't think this domain was publicized, and hence wasn't part of the November 2017 suspensions presumably resulting from the ACS suit judgement.

Anyways, Ⓛ above implies that users are Googling "Sci-Hub" to find how to access it. We've also seen a crazy spike in visitors to our Sci-Hub Stats Browser, mostly driven by Google according to our Piwik logs:

[Figure: Piwik logs for Sci-Hub Stats Browser]

4,848 pageviews a day seems a bit high for the number of users that are interested in our Stats Browser 😸. I'm not sure why there was a delay between the Sci-Hub domain suspensions and the traffic to our site. Perhaps Google was still showing suspended Sci-Hub domains for some time or started ranking us higher at some point.

Anyway, I think it's pretty clear that a good number of users are willing to click through search results in hopes of finding an operating Sci-Hub. What's not clear is what percentage of users give up prior to this point.

dhimmel added a commit that referenced this issue Jan 4, 2018
Refs discussion of Sci-Hub's impact on subscriptions in
#35
@tamunro
Collaborator
tamunro commented Jan 9, 2018

Related news: Elsevier has again granted free access to German universities who refused to renew their contracts under Project DEAL:

Günter Ziegler ... a member of the consortium's negotiating team, says that German researchers have the upper hand in the negotiations. “Most papers are now freely available somewhere on the Internet ... Clearly our negotiating position is strong. It is not clear that we want or need a paid extension of the old contracts.”

Note that some of these unis have already had free access for a year:

The same continued access applies to German institutions whose contracts expired at the beginning of 2017.


An interview with Bernhard Mittermaier, another DEAL negotiator. I've tweaked the shaky English translation from the German original.

LIBREAS: How common is the term "SciHub" during such a meeting?

BM: Currently, only rarely. ... if it is claimed that a research institution could not do without the publisher’s journals, we point out that the experience of the Elsevier dropouts teaches something different: they use various legal, alternative methods of document delivery to ensure access.

One can only hope that the interim agreements with Wiley and Springer Nature ... trigger some action from Elsevier. If this does not happen, there will be further escalation: more institutions will terminate contracts, editors will resign at regular intervals ... If there is still no progress to be seen, one must assume that Elsevier would rather forego sales in Germany than to question their business model. But even that would be very risky for the publisher: ultimately, this is a large field trial of whether you can live without Elsevier journals.

LIBREAS: From the DEAL team’s perspective – what is desirable?

BM: That the institutions hold their nerve. There has probably never been a campaign like this.


An anonymous interview with a purported volunteer for Sci-Hub, only in German.

He makes several quite explosive claims, if true. They seem a bit hard to believe.

Of course there is a lot of developer and hardware capacity behind Sci-Hub. The infrastructure is supported, for example, by a large foundation. And there are many volunteers like me. These are now increasingly criminalized. Only a few days ago, a colleague was arrested in Australia ...
In the foreseeable future, Sci-Hub would like to make the ~37 million books in Google Books available in a single full-text repository beyond Google's control ... Google currently seems to be relatively indifferent to what happens with this collection. Using a distributed structure, the full texts can be harvested comparatively easily.

I've emailed the magazine's contact address asking for more info on the arrest.

dhimmel added a commit that referenced this issue Jan 30, 2018
This build is based on
54c27eb.

This commit was created by the following Travis CI build and job:
https://travis-ci.org/greenelab/scihub-manuscript/builds/335256233
https://travis-ci.org/greenelab/scihub-manuscript/jobs/335256234

[ci skip]

The full commit message that triggered this build is copied below:

Report on 2017 Sci-Hub logs (#46)

Refs #35

Relevant source code commits with the analyses are:
greenelab/scihub@bbb5506
greenelab/scihub@64a01fc
greenelab/scihub@b4c5300
dhimmel added a commit that referenced this issue Feb 1, 2018
* Miscellaneous edits

* Update GitHub issues link

* Acknowledge Stuart Taylor
For input in #35
https://orcid.org/0000-0003-0862-163X
dhimmel added a commit that referenced this issue Feb 1, 2018
This build is based on
561763d.

This commit was created by the following Travis CI build and job:
https://travis-ci.org/greenelab/scihub-manuscript/builds/336045304
https://travis-ci.org/greenelab/scihub-manuscript/jobs/336045306

[ci skip]

The full commit message that triggered this build is copied below:

Miscellaneous edits (#51)

* Miscellaneous edits

* Update GitHub issues link

* Acknowledge Stuart Taylor
For input in #35
https://orcid.org/0000-0003-0862-163X