Databases and Information Systems Group

Contact

Prof. Dr. Kai-Uwe Sattler

Head

Phone +49 3677 69 4577

AnduIN Data Stream Engine

Sensor measurements, logging data, stock prices, or simple messages such as tweets can be considered a continuous and, in general, infinite stream of information. Often it is neither possible nor necessary to store all of the data for further processing. Far more interesting is the immediate detection of events or the meaningful aggregation of incoming data for later analysis.

AnduIN is an easy-to-use stream engine that analyzes streaming data on the fly (without storing the data on disk). Users describe tasks through a simple SQL-like interface (CQL). For example:

    select
        avg(temp) as avg_temp slide 1 minutes
    from
        sensordata [ range 5 minutes ];

The example query computes the average temperature over a sliding window of 5 minutes and returns a result every minute. In addition to the SQL-like interface, AnduIN offers a SPADE-like interface for describing queries (SPADE is a more powerful but also more complex query language for streaming data).
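
To make the window semantics concrete, here is a minimal Python sketch of a time-based sliding average with a 5-minute range and a 1-minute slide. It is not AnduIN code: the function name `sliding_avg`, the `(timestamp, value)` tuple layout, and the choice to emit at every slide boundary `t` over the interval `(t - range, t]` are assumptions made for illustration.

```python
from collections import deque


def sliding_avg(readings, range_s=300, slide_s=60):
    """Yield (emit_time, average) for time-ordered (timestamp_s, value) pairs.

    Assumption: a result is emitted at every multiple of slide_s and covers
    the readings with emit_time - range_s < timestamp <= emit_time.
    """
    window = deque()   # readings currently inside the window
    total = 0.0        # running sum of the values in `window`
    next_emit = None   # next slide boundary that still needs a result

    def result_at(t):
        nonlocal total
        # Evict readings that have fallen out of (t - range_s, t].
        while window and window[0][0] <= t - range_s:
            total -= window.popleft()[1]
        return (t, total / len(window)) if window else None

    for ts, value in readings:
        if next_emit is None:
            # First slide boundary at or after the first reading.
            next_emit = -(-ts // slide_s) * slide_s
        # Emit every boundary that lies strictly before the new reading.
        while next_emit < ts:
            out = result_at(next_emit)
            if out is not None:
                yield out
            next_emit += slide_s
        window.append((ts, value))
        total += value

    # End of a finite input: flush the boundary that is still pending.
    if next_emit is not None:
        out = result_at(next_emit)
        if out is not None:
            yield out
```

The running sum keeps each update constant-time per reading, which is the one-pass behavior a stream engine needs; a real engine would also have to deal with out-of-order timestamps, which this sketch ignores.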

Operators

AnduIN provides one-pass (non-blocking) implementations for most of the standard database operators, such as projection, filter, aggregation, and join. Additionally, AnduIN provides operators for the following tasks:

  • Data stream analytics: The system offers solutions for typical data mining problems, such as clustering or frequent pattern mining, that operate on data streams. To identify missing values or outliers, AnduIN implements operators for outlier, burst, and missing-value detection.

  • Spatial operators: Modern mobile devices such as cell phones or wireless sensor nodes deliver location-dependent information (e.g. GPS-based), so data originating from such devices often contain spatial information. AnduIN supports the analysis of these data with spatial operators (e.g. inside, nearest neighbor, within distance).

  • Complex event processing: A major task in data stream analysis is the identification of complex events. A complex event can be considered a pattern of events that occurs within a predefined time interval. AnduIN provides a set of operators for complex event processing (temporal sequence matching, followed-by, and time range limitation) that covers a large set of possible query scenarios.
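
As an illustration of the complex event processing operators listed above, the following Python sketch matches a "followed by" pattern under a time range limit. It is not AnduIN's implementation: the function name `followed_by`, the `(timestamp, kind)` event layout, and the rule that each pending first event is matched with the next second event and then consumed are assumptions.

```python
from collections import deque


def followed_by(events, first, second, within_s):
    """Yield (a, b) when a `first` event is followed by a `second` event
    no more than within_s seconds later.

    Events are time-ordered (timestamp_s, kind) pairs. Assumption: every
    pending `first` event is paired with the next `second` event and is
    consumed by that match.
    """
    pending = deque()  # `first` events still waiting for a partner
    for event in events:
        ts, kind = event
        # Time range limitation: drop candidates that are too old.
        while pending and ts - pending[0][0] > within_s:
            pending.popleft()
        if kind == second:
            while pending:
                yield (pending.popleft(), event)
        elif kind == first:
            pending.append(event)


events = [
    (0, "door_open"), (5, "alarm"),
    (20, "door_open"), (100, "alarm"),
    (101, "door_open"), (102, "door_open"), (110, "alarm"),
]
matches = list(followed_by(events, "door_open", "alarm", within_s=30))
```

Because expired candidates are discarded as soon as a newer event arrives, the state held by the operator stays bounded by the number of first events inside one time range.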

In addition to these operators, AnduIN provides functions for simple data processing and transformation. The stream engine includes the following:

  • Basic date and math functions

  • String functions (regular expressions, q-grams, semantic comparison using ontologies)

  • Integration of GeoNames (reverse geocoding, resolving locations into coordinates)
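
For readers unfamiliar with q-gram comparison, the sketch below shows one common way to score string similarity with q-grams. The source does not say which measure AnduIN uses; the Dice coefficient, the `#` padding character, the lower-casing, and the function names `qgrams` and `qgram_similarity` are assumptions chosen for illustration.

```python
from collections import Counter


def qgrams(text, q=2):
    """Return the multiset of q-grams of `text`, padded with '#' at both ends."""
    padded = "#" * (q - 1) + text.lower() + "#" * (q - 1)
    return Counter(padded[i:i + q] for i in range(len(padded) - q + 1))


def qgram_similarity(a, b, q=2):
    """Dice coefficient over the q-gram multisets of a and b (0.0 to 1.0)."""
    ga, gb = qgrams(a, q), qgrams(b, q)
    total = sum(ga.values()) + sum(gb.values())
    if total == 0:
        return 1.0  # two empty strings with q=1: treat as identical
    shared = sum((ga & gb).values())  # multiset intersection
    return 2.0 * shared / total
```

A score close to 1.0 indicates near-identical strings, so a filter such as `qgram_similarity(name, "Ilmenau") > 0.6` tolerates small typos like transposed letters.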

Processing

As in traditional database systems, the user defines only the goal; AnduIN decides autonomously how to achieve it most efficiently. Several optimization strategies have been implemented for this purpose. A simple static, rule-based optimizer prepares queries for their initial execution. Optionally, a cost-based optimizer can use statistics from previous query executions to determine an optimal query execution plan. Because data streams are dynamic and potentially infinite, AnduIN continuously observes and updates the statistics of running queries. If the running execution plan becomes suboptimal, the adaptive optimizer replaces it with a better one without any loss of data.
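
The idea of adapting a running plan to observed statistics can be sketched in a few lines of Python. This is a toy model and not AnduIN's optimizer: the class name `AdaptiveFilterChain`, the assumption that all predicates cost the same, and the policy of re-sorting a filter chain by observed pass rate every `reorder_every` tuples are all hypothetical.

```python
class AdaptiveFilterChain:
    """Conjunction of filter predicates whose order adapts to observed selectivity.

    Assumptions: all predicates have equal cost, and statistics are collected
    only for predicates that are actually evaluated on a tuple.
    """

    def __init__(self, predicates, reorder_every=100):
        self.order = list(predicates)              # current execution plan
        self.seen = {p: 0 for p in predicates}     # tuples evaluated per predicate
        self.passed = {p: 0 for p in predicates}   # tuples accepted per predicate
        self.reorder_every = reorder_every
        self.count = 0

    def selectivity(self, pred):
        """Observed pass rate; 1.0 while no statistics exist yet."""
        return self.passed[pred] / self.seen[pred] if self.seen[pred] else 1.0

    def process(self, item):
        """Return True if `item` passes every predicate."""
        self.count += 1
        result = True
        for pred in self.order:
            ok = bool(pred(item))
            self.seen[pred] += 1
            self.passed[pred] += ok
            if not ok:
                result = False
                break  # short-circuit: later predicates are skipped
        # Swap plans only between tuples, so no tuple is lost or seen twice.
        if self.count % self.reorder_every == 0:
            self.order.sort(key=self.selectivity)
        return result
```

Putting the most selective predicate first reduces the number of predicate evaluations per tuple. Because the plan is only replaced between two tuples, every tuple is processed by exactly one complete plan, which mirrors the "no loss of data" property described above.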

Benefits & Extensibility

AnduIN can be easily integrated into existing solutions. External applications can send data to AnduIN and receive data from it over simple socket connections. The system also contains operators to join streaming data with data from standard SQL databases.

In addition to these built-in operators, AnduIN provides an easy-to-use scripting interface for implementing new operators without recompilation. It is also possible to combine sets of operators into more complex predefined operators (compositions).

Query processing in AnduIN follows the well-known publish-subscribe pattern. Thanks to well-defined operator interfaces, it is also quite easy to add new specialized query operators.
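
A publish-subscribe operator pipeline can be sketched as follows. The class and method names (`Operator`, `subscribe`, `publish`, `process`, `Filter`, `Map`, `Collect`) are hypothetical and do not reflect AnduIN's actual operator interface; the sketch only shows why a small, uniform interface makes new operators easy to add.

```python
class Operator:
    """Base operator: receives tuples via process() and publishes to subscribers."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, downstream):
        self.subscribers.append(downstream)
        return downstream  # allows chaining: a.subscribe(b).subscribe(c)

    def publish(self, item):
        for sub in self.subscribers:
            sub.process(item)

    def process(self, item):
        self.publish(item)  # default behavior: pass the tuple through


class Filter(Operator):
    """Forward only the tuples that satisfy the predicate."""

    def __init__(self, predicate):
        super().__init__()
        self.predicate = predicate

    def process(self, item):
        if self.predicate(item):
            self.publish(item)


class Map(Operator):
    """Apply a function to every tuple and forward the result."""

    def __init__(self, fn):
        super().__init__()
        self.fn = fn

    def process(self, item):
        self.publish(self.fn(item))


class Collect(Operator):
    """Sink that stores every tuple it receives."""

    def __init__(self):
        super().__init__()
        self.items = []

    def process(self, item):
        self.items.append(item)


source = Operator()
sink = (source.subscribe(Filter(lambda t: t["temp"] > 25))
              .subscribe(Map(lambda t: t["sensor"]))
              .subscribe(Collect()))
for reading in [{"sensor": "a", "temp": 21}, {"sensor": "b", "temp": 27}]:
    source.process(reading)
```

A new operator only has to override `process` and call `publish`, and several downstream operators can subscribe to the same upstream operator to share its output.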

Publications

2012

  • Steffen Hirte, Eugen Schubert, Andreas Seifert, Stephan Baumann, Daniel Klan, Kai-Uwe Sattler
    Data3 - A Kinect Interface for OLAP using Complex Event Processing
    ICDE, April, Video: http://youtu.be/DROXI0_wDRM, Pages 1297-1300, 2012

2011

  • D. Klan, Th. Rohe, K.-U. Sattler
    Quantitatives Frequent Pattern Mining in drahtlosen Sensornetzen
    BTW Workshops, March, Pages 3-12, 2011
  • D. Klan, M. Karnstedt, K. Hose, L. Ribe-Baumann, K. Sattler
    Stream Engines Meet Wireless Sensor Networks: Cost-Based Planning and Processing of Complex Queries in AnduIN
    Distributed and Parallel Databases, Volume 29, Number 1, Pages 151-183, January, 2011
  • D. Klan, K.-U. Sattler
    AnduIN: Anwendungsentwicklung für drahtlose Sensornetzwerke
    Datenbank-Spektrum, Volume 11, Number 1, Pages 15-26, 2011

2010

  • D. Klan, K. Hose, M. Karnstedt, K. Sattler
    Power-Aware Data Analysis in Sensor Networks
    ICDE 2010, March, Pages 1125-1128, 2010
  • D. Klan, Th. Rohe, K. Sattler
    Quantitatives Frequent-Pattern Mining über Datenströmen
    KDML 2010, 2010

2009

  • C. Franke, M. Karnstedt, D. Klan, M. Gertz, K.-U. Sattler, W. Kattanek
    In-Network Detection of Anomaly Regions in Sensor Networks with Obstacles
    BTW 2009, March, 2009
  • E. Chervakova, D. Klan, T. Rossbach
    Energy-optimized Sensor Data Processing
    EUROSSC, Pages 35-38, 2009
  • C. Franke, M. Karnstedt, D. Klan, M. Gertz, K.-U. Sattler, E. Chervakova
    In-Network Detection of Anomaly Regions in Sensor Networks with Obstacles
    Computer Science - Research and Development, Special issue, 2009
  • Katja Hose, Daniel Klan, Kai-Uwe Sattler
    Online Tuning of Aggregation Tables for OLAP
    ICDE, Pages 1679-1686, 2009
  • D. Klan, K. Hose, K.-U. Sattler
    Developing and deploying sensor network applications with AnduIN
    DMSN, Pages 1-6, 2009
  • M. Karnstedt, D. Klan, Chr. Politz, K.-U. Sattler, C. Franke
    Adaptive burst detection in a stream engine
    SAC, Pages 1511-1515, 2009

2008

  • D. Klan, M. Karnstedt, Ch. Politz, K. Sattler
    Towards Burst Detection for Non-Stationary Stream Data
    KDML, Pages 57-60, 2008

2007

  • K. Hose, M. Karnstedt, D. Klan, K. Sattler, J. Quasebarth
    Incremental Mining for Facility Management
    KDML 2007: Knowledge Discovery, Data Mining, and Machine Learning, Pages 183-190, 2007