Usage. Provide the DOI of a document, and the service will show the metadata of that document along with documents that have similar content. Similar documents are searched for in a copy of the PMC Open Access Subset database retrieved on 01/16/2014.
The service returns a result containing the following fields:
Machine-readable service. The service provides a RESTful API which can be accessed by external client services.
General description of the algorithm. Most of the processing behind the service is done offline. For each document stored in the database, words are transformed into terms (using stemming), and the importance of each term is measured with the TF-IDF weighting scheme. Only the top n terms of each document are selected and used to compute the similarity score. The method uses an inverted dictionary technique to calculate document similarity efficiently: it considers only pairs of documents that are related to each other (sharing at least one important term). The computed similarity score is based on cosine similarity (or its substitute normalized to the range [0, 1]). See our paper arXiv:1303.5367 for more details.
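Illustrative sketch. The steps described above can be sketched in a few lines of Python. This is not the service's implementation: it assumes whitespace tokenization, a crude suffix-stripping stand-in for a real stemmer, and IDF computed as log(N / df). The function names (stem, top_terms, similarities) and the sample documents are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter, defaultdict
from itertools import combinations


def stem(word):
    # Crude stand-in for a real stemmer (assumption): strip a few common suffixes.
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def top_terms(docs, n):
    """Map each document id to a unit-length TF-IDF vector over its top n terms."""
    term_counts = {
        doc_id: Counter(stem(w) for w in text.lower().split())
        for doc_id, text in docs.items()
    }
    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter(t for counts in term_counts.values() for t in counts)
    vectors = {}
    for doc_id, counts in term_counts.items():
        weights = {t: tf * math.log(len(docs) / doc_freq[t]) for t, tf in counts.items()}
        # Keep the n highest-weighted terms; ties are broken alphabetically.
        best = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))[:n]
        norm = math.sqrt(sum(w * w for _, w in best))
        vectors[doc_id] = {t: w / norm for t, w in best} if norm > 0 else {}
    return vectors


def similarities(vectors):
    """Cosine similarity for every pair of documents sharing at least one term."""
    # Inverted dictionary: term -> ids of the documents that kept this term.
    index = defaultdict(list)
    for doc_id, vec in vectors.items():
        for term in vec:
            index[term].append(doc_id)
    # Only pairs that co-occur under some term are ever visited.
    scores = defaultdict(float)
    for term, doc_ids in index.items():
        for a, b in combinations(sorted(doc_ids), 2):
            scores[(a, b)] += vectors[a][term] * vectors[b][term]
    return dict(scores)


docs = {
    "a": "protein folding simulation",
    "b": "protein folding experiment",
    "c": "galaxy cluster survey",
}
scores = similarities(top_terms(docs, n=3))
print(scores)
```

Because the vectors are normalized to unit length and TF-IDF weights are non-negative, each accumulated dot product is a cosine similarity in the range [0, 1]. In this example only the pair ("a", "b") receives a score, since document "c" shares no selected term with the others and is never compared. Lowering n shrinks the set of candidate pairs: with n=1 the two protein documents keep only their distinguishing term and no pair is scored at all.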