The Linked Web APIs dataset is a Linked Data dataset of semantic descriptions of Web APIs. It contains over 11,339 Web API descriptions, over 7,415 mashups and almost 7,717 mashup developer profiles. The data is retrieved from ProgrammableWeb.com, the largest Web service and mashup repository. In total, the dataset contains over half a million RDF triples.

We implemented a simple information extractor that captured the raw data, which was then converted into an RDF semantic model. The dataset is published as Linked Data using dereferenceable URIs and is provided through a SPARQL endpoint and an accompanying RDF dump.

News

Processing

Each web page describing a Web API, a mashup or a mashup developer was parsed and the valuable information was extracted. An example of such a web page is the one describing the Twitter API. For each Web API we extracted its title, a short summary of its functionality, its assigned tags and categories, technical information such as supported formats and protocols, and non-functional properties such as its homepage, usage limits, usage fees, security, etc. Similarly, for each mashup we extracted its title, a short free-text description of its functionality, its assigned tags and its homepage. From each page describing a developer we extracted the username, the homepage and a short bio. The developer's city and country of residence, given and family name, and gender were also extracted when this information was publicly available.

We also captured the relationships between Web APIs, mashups and developers: for each mashup we extracted the list of Web APIs it uses, and for each developer the list of mashups they created. The dataset also captures temporal aspects: the creation time of each Web API and mashup, and the time each user registered a profile.
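The extracted fields and relationships described above can be pictured as three record types. The following is only an illustrative sketch: the class names, field names and the helper mashups_by are hypothetical and are not taken from the extractor or from the Linked Web APIs Ontology.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class WebApi:
    """Fields extracted from a Web API page (names are illustrative)."""
    title: str
    summary: str = ""
    tags: List[str] = field(default_factory=list)
    categories: List[str] = field(default_factory=list)
    formats: List[str] = field(default_factory=list)
    protocols: List[str] = field(default_factory=list)
    homepage: Optional[str] = None
    created: Optional[date] = None  # temporal aspect: creation time


@dataclass
class Developer:
    """Fields extracted from a developer profile page."""
    username: str
    homepage: Optional[str] = None
    bio: str = ""
    city: Optional[str] = None      # only when publicly available
    country: Optional[str] = None   # only when publicly available
    registered: Optional[date] = None  # temporal aspect: registration time


@dataclass
class Mashup:
    """Fields extracted from a mashup page, plus its relationships."""
    title: str
    description: str = ""
    tags: List[str] = field(default_factory=list)
    homepage: Optional[str] = None
    created: Optional[date] = None
    uses: List[WebApi] = field(default_factory=list)  # Web APIs used by the mashup
    creator: Optional[Developer] = None               # developer who created it


def mashups_by(developer: Developer, mashups: List[Mashup]) -> List[Mashup]:
    """Return the mashups created by the given developer."""
    return [
        m for m in mashups
        if m.creator is not None and m.creator.username == developer.username
    ]
```

In this sketch the "uses" and "creator" fields carry the two relationships, and the date fields carry the temporal information.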

Data Model

Data Access

Linked Data Format

Using the Linked Web APIs Ontology (see the figure above), we modelled the available information and created an RDF version of the Linked Web APIs dataset. For all resources we mint URIs in our own namespace (http://linked-web-apis.fit.cvut.cz/resource/{name}), where {name} is the normalized form of the resource title (the Web API or mashup title).
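The exact normalization rule is not spelled out here. As a rough sketch of how a title could be turned into a resource URI, the example below assumes that normalization means stripping accents, lowercasing and collapsing every run of non-alphanumeric characters into an underscore. The functions normalize_title and mint_uri are hypothetical, and the URIs actually used in the dataset may follow a different rule.

```python
import re
import unicodedata

# Namespace taken from the text above.
BASE = "http://linked-web-apis.fit.cvut.cz/resource/"


def normalize_title(title: str) -> str:
    """Assumed normalization: ASCII-fold, lowercase, non-alphanumerics -> '_'."""
    folded = (
        unicodedata.normalize("NFKD", title)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    return re.sub(r"[^a-z0-9]+", "_", folded.lower()).strip("_")


def mint_uri(title: str) -> str:
    """Build a resource URI in the dataset namespace from a resource title."""
    return BASE + normalize_title(title)
```

Under this assumed rule, two titles that differ only in case or punctuation map to the same URI, so a real implementation would also need a way to handle collisions.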

The dataset conforms to the Linked Data principles. It contains owl:sameAs links to four central LOD datasets: DBpedia, Freebase, GeoNames and LinkedGeoData. It is available via a Virtuoso SPARQL endpoint within the http://linked-web-apis.fit.cvut.cz graph. All URIs are dereferenceable in RDF/XML and Turtle formats. An example of a Web API resource in the Linked Web APIs dataset is the one describing the Google Maps API.
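As a minimal sketch of the two access paths, the example below builds (but does not send) a SPARQL query request and a content-negotiated request for a resource URI, using only the Python standard library. The endpoint address in ENDPOINT is an assumption (Virtuoso installations commonly expose /sparql, but the actual address is not given here), and the function names are hypothetical. The query relies only on owl:sameAs, which the text above mentions.

```python
from urllib.parse import urlencode
from urllib.request import Request

GRAPH = "http://linked-web-apis.fit.cvut.cz"             # graph name from the text
ENDPOINT = "http://linked-web-apis.fit.cvut.cz/sparql"   # ASSUMED endpoint address

# List a few owl:sameAs links to external datasets.
SAME_AS_QUERY = """
PREFIX owl: <http://www.w3.org/2002/07/owl#>
SELECT ?resource ?target
WHERE { ?resource owl:sameAs ?target }
LIMIT 10
"""


def build_sparql_request(query: str, endpoint: str = ENDPOINT,
                         graph: str = GRAPH) -> Request:
    """Build a SPARQL Protocol GET request asking for JSON results."""
    params = urlencode({"default-graph-uri": graph, "query": query})
    return Request(
        f"{endpoint}?{params}",
        headers={"Accept": "application/sparql-results+json"},
    )


def build_dereference_request(uri: str, turtle: bool = True) -> Request:
    """Build a request that asks for a resource as Turtle or RDF/XML."""
    accept = "text/turtle" if turtle else "application/rdf+xml"
    return Request(uri, headers={"Accept": accept})
```

Passing either request to urllib.request.urlopen would perform the actual HTTP call; that step is left out here because the availability and address of the service are not confirmed.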

RDF Dump

An RDF dump of the data is available in the Apache file system (nt).
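Assuming "nt" refers to the line-based N-Triples serialization, a dump can be scanned one line at a time without loading it into memory. The sketch below is a simplified reader rather than a full N-Triples parser: it only splits each line into subject, predicate and object, and the function names are hypothetical. For real processing an RDF library would be a safer choice.

```python
import re
from collections import Counter
from typing import Iterable, Iterator, Tuple

# Simplified pattern: subject (IRI or blank node), predicate (IRI),
# object (everything up to the terminating dot).
TRIPLE_RE = re.compile(r"^(<[^>]*>|_:\S+)\s+(<[^>]*>)\s+(.+?)\s*\.\s*$")


def iter_triples(lines: Iterable[str]) -> Iterator[Tuple[str, str, str]]:
    """Yield (subject, predicate, object) strings, skipping blanks and comments."""
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        match = TRIPLE_RE.match(stripped)
        if match:
            yield match.group(1), match.group(2), match.group(3)


def count_predicates(lines: Iterable[str]) -> Counter:
    """Count how often each predicate occurs in the dump."""
    return Counter(predicate for _, predicate, _ in iter_triples(lines))
```

With a downloaded dump, opening the file and passing the file object to count_predicates gives a quick overview of which properties the dataset uses; the file name depends on where the dump was saved.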

Size Metrics, Publications


If you use, analyze or refer to the Linked Web APIs dataset in an academic context, please cite the latest dataset description paper:
[1] M. Dojchinovski and T. Vitvar
The Linked Web APIs Dataset: Web APIs Meet Linked Data
Semantic Web Journal (under review), August 2015
List of related papers using the Linked Web APIs dataset:
[2] M. Dojchinovski and T. Vitvar
Personalised Access to Linked Data
EKAW 2014, Linköping, Sweden, Springer Verlag, November 2014
[3] M. Dojchinovski, J. Kuchar, T. Vitvar and M. Zaremba
Personalised Graph-based Selection of Web APIs
ISWC 2012, Boston, USA, Springer Verlag, November 2012
[4] J. Kuchar, M. Dojchinovski and T. Vitvar
Time-aware Link Prediction in RDF Graphs
WEBIST 2015, Lisbon, Portugal, May 2015

Support and Feedback

If you find incorrect information in the dataset, you can report it as an issue at our GitHub repository. For any support or feedback, feel free to write us an email at milan.dojchinovski@fit.cvut.cz.

Project Team