FLORA
Financial Linked Open Data Reasoning and Management for Web Science


Introduction

Linked Data is a recent technological trend in information management that has emerged within the general framework of the Semantic Web. The term refers to a method for exposing, sharing and connecting data through dereferenceable URIs on the Web.
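
As a rough illustration of what "connecting data through URIs" means, the sketch below represents two statements as subject/predicate/object triples and serializes them in N-Triples syntax. The namespace http://example.org/finance/, the company "acme" and the property names are assumptions made up for this example; they are not part of FLORA.

```python
from typing import List, NamedTuple


class Triple(NamedTuple):
    subject: str             # URI of the resource being described
    predicate: str           # URI of the property
    obj: str                 # URI of another resource, or a literal value
    obj_is_uri: bool = True  # False when obj is a plain literal


# Hypothetical namespace, used only for illustration.
EX = "http://example.org/finance/"

triples = [
    # The company links to its report: both ends are URIs.
    Triple(EX + "company/acme", EX + "vocab/hasReport", EX + "report/acme-2012"),
    # The report carries a literal value (invented figure).
    Triple(EX + "report/acme-2012", EX + "vocab/netIncome", "1500000", obj_is_uri=False),
]


def to_ntriples(t: Triple) -> str:
    """Serialize one triple as an N-Triples line."""
    if t.obj_is_uri:
        obj = f"<{t.obj}>"
    else:
        # Escape backslashes and double quotes inside the literal.
        escaped = t.obj.replace("\\", "\\\\").replace('"', '\\"')
        obj = f'"{escaped}"'
    return f"<{t.subject}> <{t.predicate}> {obj} ."


def linked_resources(start: str, data: List[Triple]) -> List[str]:
    """Return the URIs reachable in one hop from `start`."""
    return [t.obj for t in data if t.subject == start and t.obj_is_uri]


for t in triples:
    print(to_ntriples(t))
```

Because the object of the first triple is itself a URI, a program can follow it to the statements about the report, which is the "linking" that the term refers to.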

Behind this concept lies the ability to transform the information contained in documents or data sources that appear to be structured but in which only the content, not the data, is structured. The transformation takes a document that can be read, but whose data cannot be used automatically because they have not been interpreted beforehand, and turns it into a series of tagged data that can be processed automatically.

Another concept associated with Linked Data is "Linked Open Data". The term has a twofold reference. On the one hand, it refers to the Linked Open Data project, which is concerned with managing public (open) information with the aim of generating Linked Data that is available to everyone.

Building on Linked Data, the project intends to develop a series of scientific and technical procedures that isolate, categorize and classify elements of an economic and financial nature from the various available sources of information, so that the resulting data can be used in automated environments.

This process gives rise to the concept of Financial Linked Data. In this case, the "open" concept comes from acquiring information from both public and private sources in order to generate these data. The idea is, therefore, to transform the information contained in financial documents into financial data that can be read and interpreted automatically by a machine.

Aims

The information stored in documents can be read, interpreted and analyzed by humans with relative ease. The goal is to isolate and categorize the elements contained in the text so that the document and its contents go from being mere information to being data, where "data" means a set of alphanumeric elements that can be used by a programmable, non-human agent.

Obtaining these data is not a trivial task. Extracting specific data requires a comprehensive analysis of the documents using NLP (Natural Language Processing) techniques. In addition to this NLP step, a mapping process between the extracted concepts and their links is needed in order to generate the Linked Open Data.
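
The two steps described above, extraction followed by mapping, can be sketched as follows. This is a minimal illustration under strong assumptions: a single regular expression stands in for the NLP analysis (a real system would need far more than pattern matching), and the namespace, the vocabulary URIs, the sentence pattern and the sample figures are all hypothetical.

```python
import re
from decimal import Decimal
from typing import Dict, List, Tuple

# Hypothetical namespace and vocabulary, used only for illustration.
BASE = "http://example.org/finance/"
METRIC_URIS = {
    "net income": BASE + "vocab/netIncome",
    "revenue": BASE + "vocab/revenue",
}

# Stand-in for the NLP step: matches sentences such as
# "Acme Corp reported net income of 1.5 million euros in 2012".
FACT_PATTERN = re.compile(
    r"(?P<company>[A-Z]\w*(?: [A-Z]\w*)*) reported "
    r"(?P<metric>net income|revenue) of "
    r"(?P<amount>\d+(?:\.\d+)?) million euros in (?P<year>\d{4})"
)


def extract_facts(text: str) -> List[Dict[str, object]]:
    """Step 1: isolate financial facts found in free text."""
    facts = []
    for m in FACT_PATTERN.finditer(text):
        facts.append({
            "company": m["company"],
            "metric": m["metric"],
            # Decimal avoids floating-point rounding when scaling.
            "amount": int(Decimal(m["amount"]) * 1_000_000),
            "year": int(m["year"]),
        })
    return facts


def map_to_triples(facts: List[Dict[str, object]]) -> List[Tuple[str, str, str]]:
    """Step 2: map each extracted fact to a (subject, predicate, value) triple."""
    triples = []
    for f in facts:
        slug = str(f["company"]).lower().replace(" ", "-")
        subject = f"{BASE}company/{slug}/report/{f['year']}"
        triples.append((subject, METRIC_URIS[str(f["metric"])], str(f["amount"])))
    return triples


sample = "Acme Corp reported net income of 1.5 million euros in 2012."
for triple in map_to_triples(extract_facts(sample)):
    print(triple)
```

Keeping extraction and mapping as separate functions mirrors the text: the extraction step could be replaced by a full NLP pipeline without changing how the extracted concepts are mapped to URIs.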

The aim of this project, and the initial hypothesis it raises, is therefore to create a process that allows economic data and financial information to be extracted from documents.