
TITAN: A knowledge-based platform for Big Data workflow management

(Article already published)


Modern Big Data applications are moving beyond scalable solutions for data processing and analysis to provide advanced functionality that can exploit and understand the underlying knowledge. This change is promoting the development of tools at the intersection of data processing, data analysis, knowledge extraction and management. In this paper, we propose TITAN, a software platform for managing the entire life cycle of science workflows, from deployment to execution, in the context of Big Data applications. The platform is characterised by a design and operation mode driven by semantics at different levels: data sources, problem domain and workflow components. It is built upon an ontological framework of metadata that consistently manages processes and models and takes advantage of domain knowledge. TITAN comprises a well-grounded stack of Big Data technologies, including Apache Kafka for inter-component communication, Apache Avro for data serialisation and Apache Spark for data analytics. A series of use cases is conducted for validation, comprising workflow composition and semantic metadata management in the academic and real-world fields of human activity recognition and land use monitoring from satellite images.

Keywords:

Big Data analytics - Knowledge extraction - Semantics




