A free and open source web application that highlights the important sentences of any publication whose full text is available in PubMed Central (PMC), to facilitate and speed up knowledge curation from the scientific literature.
Submit PMCIDs, and highlights of these papers will be generated for you
For example: PMC5063010,PMC4936267
Search, select and submit; highlights of the selected papers will be generated for you
For example: parkinson's (based on Europe PMC web service)

How to use

The only input NapEasy needs is a list of comma-separated PMCIDs. Note that PMIDs (PubMed identifiers) are different from PMCIDs. Should you only have PMIDs to hand, an easy conversion tool is available online.
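
Before submitting, it can help to check that every entry really is a PMCID rather than a PMID. The following is a minimal sketch, not part of NapEasy itself: the function name `parse_pmcids` is hypothetical, and it relies only on the fact that PMCIDs are written as "PMC" followed by digits.

```python
import re

# PMCIDs are written as "PMC" followed by digits, e.g. PMC5063010.
PMCID_PATTERN = re.compile(r"PMC\d+")


def parse_pmcids(raw: str) -> list[str]:
    """Split a comma-separated string into a de-duplicated list of PMCIDs.

    Raises ValueError if an entry does not look like a PMCID
    (for example, a bare numeric PMID).
    """
    seen: set[str] = set()
    pmcids: list[str] = []
    for entry in raw.split(","):
        candidate = entry.strip().upper()
        if not candidate:
            continue  # tolerate trailing commas and empty entries
        if not PMCID_PATTERN.fullmatch(candidate):
            raise ValueError(f"not a PMCID: {entry.strip()!r}")
        if candidate not in seen:
            seen.add(candidate)
            pmcids.append(candidate)
    return pmcids


print(parse_pmcids("PMC5063010, PMC4936267"))
# prints ['PMC5063010', 'PMC4936267']
```

A bare PMID such as "27654321" is rejected with a ValueError, which is a cue to run it through the conversion tool first.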

There are two ways to use NapEasy:

    1) If you happen to have PMC IDs of papers you want to highlight, just click the "Use PMC IDs" tab, enter the relevant PMC IDs and submit.
    2) Otherwise, use the "Search PMC" functionality to look up and select papers by keyword search; the selected papers can then be submitted for highlighting straight away.
In its current implementation, the service does not provide highlight results on the fly. Instead, a unique job ID and a URI for the job are generated on submission (e.g., http://napeasy.org/ht.html?b201e1c9-5dcb-425a-8ddc-a5c4ce03ad0f), which allow you to check the result later (so please make a note of the URI if you do not want to provide an email address). If you do provide your email address during the submission process, you will receive a notification email from napeasy.noreply@gmail.com when your highlighting job has finished. (Normally, a one-paper job finishes within 10-30 minutes, depending on the number of sentences; jobs with more than one paper take longer.)
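
If you keep job URIs in a notebook or script, the job ID can be recovered from the URI. This is a small sketch under one assumption taken from the single example above: that the whole query string of the URI is the job ID. The helper name `job_id_from_uri` is hypothetical.

```python
import uuid
from urllib.parse import urlparse


def job_id_from_uri(job_uri: str) -> str:
    """Return the job ID carried in the query string of a job URI."""
    # Assumption: everything after "?" is the job ID, as in the example URI.
    query = urlparse(job_uri).query
    # uuid.UUID raises ValueError when the string is not a well-formed GUID.
    return str(uuid.UUID(query))


uri = "http://napeasy.org/ht.html?b201e1c9-5dcb-425a-8ddc-a5c4ce03ad0f"
print(job_id_from_uri(uri))
# prints b201e1c9-5dcb-425a-8ddc-a5c4ce03ad0f
```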

Highlight Result

When the job finishes, the result can be visualised directly by following the job result URI generated at submission (e.g., sample job result). From the same webpage, the highlighted sentences can also be downloaded in three different formats (XML, JSON, and plain text) for further analysis. The data structure of the downloaded result is explained below.

Result Data Structure

Essentially, for a given job, the result contains a list of highlighted sentences for each paper within the job. Here, we describe the result data structure using a sample job. Although we use the JSON format (see below) here, the structure applies to the XML and text formats as well.

At the top level, a result contains two attributes: jobid (a GUID identifying the job) and papers (an array in which each element is the highlight result of one paper). In this case, the job contains only one paper, so the papers array has just one element.

For each paper, the result is an object containing three attributes:
- pmcid: the PMCID of the paper
- total_sentences: the total number of sentences in the paper
- highlights: an array of highlighted sentences

Each highlighted sentence is an object having four attributes:
- sid: the sentence ID (a sequence number starting from 1 for the first sentence; note that the sample below serialises it as a string)
- text: the text content of the sentence
- type: the type of the sentence, which can be one of, or a combination of, three basic types (goal, findings, method); the sample below shows the combination "findings-method"
- score: a double-valued score indicating the importance of the sentence (from 0 to 10); the higher the value, the more important the sentence
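
For downstream analysis, the attributes listed above could be modelled with typed records. The sketch below is illustrative only: the class names are hypothetical, and it assumes that combined sentence types are joined with a hyphen, as the sample value "findings-method" suggests.

```python
from dataclasses import dataclass, field


@dataclass
class Highlight:
    sid: int      # sentence sequence number, starting from 1
    text: str     # text content of the sentence
    type: str     # e.g. "goal" or "findings-method"
    score: float  # importance, from 0 to 10

    def basic_types(self) -> set[str]:
        # Assumption: combined types are hyphen-joined basic types.
        return set(self.type.split("-"))


@dataclass
class PaperResult:
    pmcid: str
    total_sentences: int
    highlights: list[Highlight] = field(default_factory=list)


@dataclass
class JobResult:
    jobid: str
    papers: list[PaperResult] = field(default_factory=list)


# Placeholder text; only the type and score come from the documented sample.
h = Highlight(sid=27, text="(sentence text)", type="findings-method",
              score=2.045864771554322)
print(sorted(h.basic_types()))
# prints ['findings', 'method']
```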


{
  "jobid": "b201e1c9-5dcb-425a-8ddc-a5c4ce03ad0f",
  "papers": [
    {
      "pmcid": "PMC5089825",
      "total_sentences": 262,
      "highlights": [
        {
          "sid": "27",
          "text": "Anti-NMDAR encephalitis is the most common autoimmune encephalitis described so far,  9 with >900 cases identified worldwide since its first description in 2007.",
          "type": "findings-method",
          "score": 2.045864771554322
        },
        ...
      ]
    }
  ]
}
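
As an illustration of working with a downloaded JSON result, the sketch below loads a trimmed copy of the sample and lists the highest-scoring highlights of one basic type across all papers in the job. The function name `top_highlights` is hypothetical, the sentence text is shortened, and splitting the type on "-" is an assumption based on the sample value "findings-method".

```python
import json

# Trimmed copy of the sample result; the sentence text is shortened here.
sample = """
{
  "jobid": "b201e1c9-5dcb-425a-8ddc-a5c4ce03ad0f",
  "papers": [
    {
      "pmcid": "PMC5089825",
      "total_sentences": 262,
      "highlights": [
        {"sid": "27",
         "text": "Anti-NMDAR encephalitis is the most common autoimmune encephalitis ...",
         "type": "findings-method",
         "score": 2.045864771554322}
      ]
    }
  ]
}
"""


def top_highlights(result: dict, wanted_type: str, limit: int = 5) -> list[dict]:
    """Collect highlights of one basic type across all papers, best first."""
    matches = []
    for paper in result["papers"]:
        for h in paper["highlights"]:
            # Assumption: combined types are hyphen-joined basic types.
            if wanted_type in h["type"].split("-"):
                matches.append({"pmcid": paper["pmcid"], **h})
    matches.sort(key=lambda h: h["score"], reverse=True)
    return matches[:limit]


result = json.loads(sample)
for h in top_highlights(result, "findings"):
    print(h["pmcid"], h["sid"], round(h["score"], 2), h["text"])
```

Keeping the pmcid alongside each highlight makes it possible to rank sentences from several papers in one list without losing track of their source.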
        
NapEasy highlights important sentences in open access publications in PubMed Central. Rather than picking the sentences that best represent the paper's content, as most text summarisation tools do, NapEasy's highlights aim to represent the main characteristics of the study described in a publication, which are especially sought after when curating knowledge from the scientific literature.

NapEasy began as a collaboration between King's College London and University College London to develop a text mining tool for facilitating the curation of a knowledge base of Parkinson's and Alzheimer's diseases from the scientific literature (a recent development of the ApiNATOMY project). Inspired by the success of NapEasy in this case study, the team decided to make the tool available to a wider community, and hopefully benefit that community, by deploying it as a web server.

This website is free and open to all users, and there is no login requirement. The tool is open source and the GitHub repository is at https://github.com/KHP-Informatics/NapEasy. We are also keen to hear your comments and suggestions: honghan.wu@kcl.ac.uk.

People

Honghan Wu (1)
Anika Oellrich (1)
Christine Girges (2)
Bernard de Bono (2)
Tim JP Hubbard (1)
Richard JB Dobson (1, 2)

(1) King's College London
(2) University College London

 

NapEasy 2016-2017 | GitHub