Tools: Data Wranglers

Bela Seeger of Open Knowledge Foundation Germany

Sep 15, 2017

The platform offers several applications tailored to the needs of data wranglers and domain experts. Complex data mining algorithms are complemented by state-of-the-art linked data tools that handle much of the complexity of working with the Resource Description Framework (RDF).

Data Mining Tool Collection

Are you looking to dive deep into your data? The Data Mining Tool Collection offers you many ways to do so. With it, you can apply time series algorithms, detect outliers, compute descriptive statistics, and run clustering and similarity learning.
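
To give a feel for two of these tasks, here is a minimal sketch of descriptive statistics and outlier detection in plain Python. It does not use the Data Mining Tool Collection itself. The function names, the sample values, and the choice of the interquartile range (IQR) rule are assumptions made for illustration only.

```python
import statistics

def describe(values):
    """Return basic descriptive statistics for a list of numbers."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
    }

def iqr_outliers(values, k=1.5):
    """Flag values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical spending amounts; the last one is deliberately extreme.
spending = [120.0, 130.0, 125.0, 128.0, 122.0, 127.0, 900.0]

summary = describe(spending)
outliers = iqr_outliers(spending)
print(summary["median"], outliers)
```

The IQR rule is only one of several common outlier heuristics. It is used here because it needs nothing beyond the standard library.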


ETL LinkedPipes

Create custom pipelines to source, process, and convert data from almost any source into a variety of formats.


Browse RDF Data

The RDF Browser is an open source content negotiator and HTML description generator for RDF resources. It is a PHP web application that can be deployed in most environments out of the box with minimal effort, lowering the barrier to publishing Linked Data on the Web.
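
Content negotiation means choosing a representation of a resource based on the HTTP Accept header sent by the client. The following is a rough sketch of that idea in Python, not the RDF Browser's own PHP code. The list of supported media types, the function names, and the fallback to HTML are assumptions made for illustration.

```python
# Assumed set of representations a Linked Data server might offer.
SUPPORTED = ["text/html", "text/turtle", "application/rdf+xml", "application/ld+json"]

def parse_accept(header):
    """Parse an Accept header into (media_type, q) pairs, highest q first."""
    entries = []
    for part in header.split(","):
        part = part.strip()
        if not part:
            continue
        pieces = [p.strip() for p in part.split(";")]
        media_type = pieces[0].lower()
        q = 1.0  # default quality when no q parameter is given
        for param in pieces[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0
        entries.append((media_type, q))
    # sorted() is stable, so entries with equal q keep their header order.
    return sorted(entries, key=lambda e: e[1], reverse=True)

def negotiate(header, supported=SUPPORTED, default="text/html"):
    """Pick the best supported media type, falling back to the default."""
    for media_type, q in parse_accept(header):
        if q <= 0:
            continue
        if media_type == "*/*":
            return default
        if media_type.endswith("/*"):
            prefix = media_type[:-1]  # e.g. "text/"
            for candidate in supported:
                if candidate.startswith(prefix):
                    return candidate
        elif media_type in supported:
            return media_type
    return default

print(negotiate("application/rdf+xml;q=0.8, text/turtle;q=0.9"))
```

A browser that sends text/html would receive the generated HTML description, while an RDF client asking for text/turtle would receive the data itself.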



An application for online, collaborative, system-aided manual entity linking. The tool can be used to manually create linksets between two knowledge graphs or to validate existing linksets.
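
As a rough illustration of what a linkset is, the sketch below treats it as a list of (source, target) URI pairs, keeps only the links a reviewer accepted, and serialises them as N-Triples. It is not the tool's own data model. The example.org URIs, the function names, and the use of owl:sameAs as the linking predicate are assumptions.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

def validate(candidates, decisions):
    """Keep only the candidate links that a reviewer explicitly accepted."""
    return [link for link in candidates if decisions.get(link) is True]

def to_ntriples(links, predicate=OWL_SAME_AS):
    """Serialise (source, target) URI pairs as N-Triples statements."""
    return "\n".join(f"<{s}> <{predicate}> <{t}> ." for s, t in links)

# Hypothetical candidate links between two knowledge graphs.
candidates = [
    ("http://example.org/graph-a/Berlin", "http://example.org/graph-b/berlin-city"),
    ("http://example.org/graph-a/Paris", "http://example.org/graph-b/paris-texas"),
]
# Hypothetical reviewer decisions: accept the first link, reject the second.
decisions = {candidates[0]: True, candidates[1]: False}

linkset = validate(candidates, decisions)
print(to_ntriples(linkset))
```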



Subsidystories is a database of beneficiaries of three of the five European Structural and Investment Funds (ESIF). The data has been carefully collected, cleaned, and standardised by the OpenBudgets team to allow journalists deep data dives in their search for salience and malpractice. The database has been accompanied by the Storyhunt, a series of workshops with a final expedition weekend into the depths of the data. It is a valuable resource for investigations into corruption and financial policy.


Upload & Explore

Upload & Explore are essential parts of the OpenBudgets platform: this is where budget and spending data are imported, visualised, and analysed. You may choose to upload your own dataset using the ‘Upload’ button, or you can dive straight into the existing datasets by clicking ‘Explore’.


The Explore view shows you the datasets that have already been uploaded to the platform. Selecting ‘view’ will open the OpenBudgets Viewer, with instant access to a series of single-click visualisations that you can customise and embed. ‘Analyse’ will open the suite of data mining algorithms you can run on the dataset, allowing for more advanced insights into your data.



Clicking on ‘Upload’ offers two ways to transfer data into the platform:


To upload a dataset from your hard drive, choose OpenSpending Packager and follow the instructions on the next page to upload and describe your dataset. Alternatively, choose LinkedPipes, an ETL (Extract, Transform, Load) tool aimed primarily at advanced users and domain experts. It lets you create individual pipelines for specific use cases in which data needs to be sourced from elsewhere, e.g. from a URL.
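
The three ETL stages can be sketched in a few lines of plain Python. This is an illustration of the general extract, transform, load pattern and not LinkedPipes' own interface. The sample CSV, its column names, and the JSON output target are hypothetical. An in-memory string stands in for data that a real pipeline might fetch from a URL.

```python
import csv
import io
import json

# Hypothetical sample input; a real pipeline might fetch this from a URL.
RAW_CSV = """beneficiary,amount,year
Example Project A,1200.50,2016
Example Project B,,2016
Example Project C,800,2017
"""

def extract(csv_text):
    """Extract: parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(csv_text)))

def transform(rows):
    """Transform: drop rows without an amount and convert field types."""
    cleaned = []
    for row in rows:
        if not row["amount"].strip():
            continue  # skip incomplete records
        cleaned.append({
            "beneficiary": row["beneficiary"].strip(),
            "amount": float(row["amount"]),
            "year": int(row["year"]),
        })
    return cleaned

def load(rows):
    """Load: serialise the cleaned rows as JSON (stand-in for a real target)."""
    return json.dumps(rows, indent=2)

def run_pipeline(csv_text):
    """Chain the three stages into one pipeline."""
    return load(transform(extract(csv_text)))

result = run_pipeline(RAW_CSV)
print(result)
```

Keeping each stage as a separate function mirrors the way a pipeline is assembled from individual steps, so a single step can be swapped without touching the others.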

For support requests, please use GitHub.