Renku is a platform that bundles together various tools for reproducible and collaborative data analysis projects. It is aimed at independent researchers and data scientists as well as labs, collaborations, courses, and workshops. Renku can be used by anyone who deals with data, whether they are a researcher, data analyst, project owner, or data provider.

Renku promotes reproducibility by providing tools to track your analysis workflows and save them together with your versioned data, code, and environment specification. Every result can be replayed either to repeat a calculation or to re-execute on new data or with a different choice of parameters.

Renku encourages reusability by storing and querying the connections between datasets, code executions, and results in a Knowledge Graph. Producers and consumers of analysis artifacts can always recover the full provenance of a result, establishing trust and reducing boilerplate.

Renku stimulates collaboration among peers and across disciplines by guaranteeing that a media-rich discussion space and fully configured, shareable interactive computational environments are always just a click away. Collaborators can easily work on projects together or in parallel, combining their work in a systematic and safe manner.

Whole Tale

Whole Tale is an NSF-funded Data Infrastructure Building Block (DIBBS) initiative to build a scalable, open source, web-based, multi-user platform for reproducible research enabling the creation, publication, and execution of tales – executable research objects that capture data, code, and the complete software environment used to produce research findings.

A beta version of the system is available at


Biological Toolset
Search and analyse Biological Sequences disclosed in patents

Choose among the 5 apps available to you to search and analyse the DNA, RNA and protein sequences found in patents. The Lens’ unique open PatSeq facility allows you to search, analyse and share the biological sequences disclosed in patents. This is the world’s largest publicly available database with internal transparency metrics.


LOV stands for Linked Open Vocabularies. This name is derived from LOD, standing for Linked Open Data. Let’s assume that the reader is somewhat familiar with the latter concept, otherwise a visit to or will help to figure it out before further reading.

Data on the Web use properties (aka predicates) and classes (aka types) to describe people, places, products, events, and any kind of things whatsoever. In the data “Mary is a person, her family name is Watson, she lives in the city of San Francisco”, “Person” is the class of Mary, “City” is the class of San Francisco, “family name” and “lives in” are properties used to describe a person, the latter acting also as a link between a person and a place.
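The Mary example can be sketched as a small set of subject, predicate, object statements. The snippet below is only an illustration in plain Python, not how LOV or any RDF library stores data: the "ex:" prefix, the term names (ex:Person, ex:City, ex:familyName, ex:livesIn) and the helper functions are hypothetical and chosen for this example, while "rdf:type" is the usual RDF way of stating the class of a thing.

```python
# Each statement is a (subject, predicate, object) triple.
# "ex:" is a hypothetical prefix standing in for a real vocabulary namespace.
RDF_TYPE = "rdf:type"  # the predicate that links a thing to its class

triples = [
    ("ex:Mary", RDF_TYPE, "ex:Person"),            # Mary is a person
    ("ex:Mary", "ex:familyName", "Watson"),        # her family name is Watson
    ("ex:Mary", "ex:livesIn", "ex:SanFrancisco"),  # she lives in San Francisco
    ("ex:SanFrancisco", RDF_TYPE, "ex:City"),      # San Francisco is a city
]


def objects(subject, predicate):
    """Return every object asserted for the given subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def classes_of(subject):
    """Return the classes (types) asserted for a subject."""
    return objects(subject, RDF_TYPE)


mary_classes = classes_of("ex:Mary")
# "ex:livesIn" acts as a link: its object is itself a described thing.
mary_home = objects("ex:Mary", "ex:livesIn")[0]
home_classes = classes_of(mary_home)

print(mary_classes, mary_home, home_classes)
```

Following the "ex:livesIn" statement from Mary leads to a second described resource, which is what makes the property a link between a person and a place rather than a plain attribute such as the family name.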

A vocabulary in LOV gathers definitions of a set of classes and properties (together simply called terms of the vocabulary), useful to describe specific types of things, or things in a given domain or industry, or things at large but for a specific usage.

Terms of vocabularies also provide the links in linked data, in the above case between a Person and a City. The definitions of terms provided by the vocabularies bring clear semantics to descriptions and links, thanks to the formal language they use (some dialect of RDF such as RDFS or OWL). In short, vocabularies provide the semantic glue enabling Data to become meaningful Data.


OntoWiki facilitates the visual presentation of a knowledge base as an information map, with different views on instance data. It enables intuitive authoring of semantic content, with an inline editing mode for editing RDF content, similar to WYSIWYG for text documents.


Wikidata is a free and open knowledge base that can be read and edited by both humans and machines.

Wikidata acts as central storage for the structured data of its Wikimedia sister projects including Wikipedia, Wikivoyage, Wiktionary, Wikisource, and others.

Wikidata also provides support to many other sites and services beyond just Wikimedia projects! The content of Wikidata is available under a free license, exported using standard formats, and can be interlinked to other open data sets on the linked data web.


OpenRefine (formerly Google Refine) is a powerful tool for working with messy data: cleaning it; transforming it from one format into another; and extending it with web services and external data.
OpenRefine is available in English, Chinese, Spanish, French, Russian, Portuguese (Brazil), German, Japanese, Italian, Hungarian, Hebrew, Filipino, Cebuano, and Tagalog.