Applications of data graphs
|Title||Applications of data graphs|
|Institution||Department of Information Systems|
|Department||Faculty of Informatics|
|School||Eötvös Loránd University|
The Internet has become part of our everyday lives. We read news, listen to music, and contact our friends on web pages. Forums, online social networks, and web pages are available from anywhere, so there is no longer any physical limit on contact between people. We can encounter new things and new people that have an impact on our thinking.
In parallel with the transformation and growth of the Web, science and technology are evolving. More sophisticated and more precise measuring instruments are being created, with which we can observe the world around us more closely. In addition, computer simulation plays an important role in formulating theoretical models and in controlling and evaluating experimental results.
Due to these processes, the amount of available data has increased drastically over a short period of time. Storing and processing the data are serious challenges for engineers and researchers. As a result, new storage models have been developed alongside relational databases. Some of the new models limit the available operations to increase efficiency; others are specialized for areas such as storing semi-structured documents or graphs.
In this dissertation, data graphs are examined from several aspects. Data graphs are directed, labeled graphs whose vertices represent concepts or objects and whose edges describe the relationships among them. My theses are grouped around three main topics, each of which is related in some way to data graphs.
The focus of the first topic is on Semantic Web technologies, which are designed to link data available on the Web. For this purpose, a framework was created that allows heterogeneous data from various sources to be integrated and allows us to describe our knowledge about the world. The data sets given in this framework can be represented as directed, labeled graphs and can be stored as data graphs. For querying the data sets, a declarative pattern-based language was developed.
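To make the idea of pattern-based querying over a data graph concrete, the following is a minimal sketch, not the framework or query language referred to above. It assumes that the graph is stored as a set of (source, label, target) triples and that pattern terms starting with `?` are variables; the names `GRAPH`, `match_pattern`, and `query`, as well as the sample data, are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (source vertex, edge label, target vertex)
Binding = Dict[str, str]       # variable name -> matched vertex or label

# A tiny, made-up data graph stored as a set of labeled edges.
GRAPH: Set[Triple] = {
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "worksAt", "elte"),
}


def is_var(term: str) -> bool:
    """Assumption of this sketch: terms starting with '?' are variables."""
    return term.startswith("?")


def match_pattern(pattern: Triple, triple: Triple,
                  binding: Binding) -> Optional[Binding]:
    """Try to extend `binding` so that `pattern` matches `triple`."""
    result = dict(binding)
    for p, t in zip(pattern, triple):
        if is_var(p):
            # A variable must keep the value it was bound to earlier.
            if result.setdefault(p, t) != t:
                return None
        elif p != t:
            return None
    return result


def query(graph: Set[Triple], patterns: List[Triple]) -> List[Binding]:
    """Return every binding that satisfies all patterns at once."""
    bindings: List[Binding] = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for b in bindings
            for triple in graph
            if (extended := match_pattern(pattern, triple, b)) is not None
        ]
    return bindings


# Whom does alice know, and whom do they know in turn?
friends_of_friends = query(GRAPH, [("alice", "knows", "?x"),
                                   ("?x", "knows", "?y")])
```

The query is declarative in the sense that it only states which edges must exist; the nested-loop evaluation shown here is the simplest possible strategy and would not scale to large graphs.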
In this area, we explore how to exploit the benefits of visualization in understanding and characterizing data sets. We show what options and challenges the structure of the new framework presents for displaying the data. In addition, we review the advantages and disadvantages of visual query languages.
The second main topic is semantic matching, in which we have to select the best matching elements among many. Various matching functions are introduced that make it possible to decide how well the elements fit together. To define these functions, another Semantic Web technology, namely ontologies, is used. Ontologies provide vocabularies to describe the most important concepts of given domains and to organize the concepts into hierarchies and groups. They can be described in the above-mentioned framework and so can be represented as graphs. We also present matching functions based on probability models.
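As an illustration of how a concept hierarchy can drive a matching function, here is a minimal sketch. It is not one of the matching functions of the dissertation: it assumes a tree-shaped hierarchy given as a child-to-parent map and scores two concepts by the length of the path between them through their nearest common ancestor. The names `PARENT`, `match_score`, and `best_match`, and the toy hierarchy itself, are hypothetical.

```python
from typing import Dict, List, Optional

# Toy concept hierarchy: each concept points to its parent (None for the root).
PARENT: Dict[str, Optional[str]] = {
    "dog": "mammal",
    "cat": "mammal",
    "mammal": "animal",
    "sparrow": "bird",
    "bird": "animal",
    "animal": None,
}


def ancestors(concept: str) -> List[str]:
    """Return the concept followed by its ancestors up to the root."""
    chain: List[str] = []
    current: Optional[str] = concept
    while current is not None:
        chain.append(current)
        current = PARENT[current]
    return chain


def match_score(a: str, b: str) -> float:
    """Score in (0, 1]: 1.0 for identical concepts, lower for distant ones."""
    steps_from_b = {c: i for i, c in enumerate(ancestors(b))}
    for steps_from_a, concept in enumerate(ancestors(a)):
        if concept in steps_from_b:
            # Path length through the nearest common ancestor.
            distance = steps_from_a + steps_from_b[concept]
            return 1.0 / (1.0 + distance)
    return 0.0  # No common ancestor (cannot happen in a single tree).


def best_match(target: str, candidates: List[str]) -> str:
    """Select the candidate that fits the target best under match_score."""
    return max(candidates, key=lambda c: match_score(target, c))
```

With this hierarchy, `best_match("dog", ["sparrow", "cat"])` selects `"cat"`, because both concepts share the close ancestor `"mammal"`, whereas `"sparrow"` is related to `"dog"` only through the root.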
The third main topic is related to the analysis of social networks, which makes it easier to understand the structure of our society and people's behavior. We examine the most influential set of nodes in a social network, that is, the set through which the largest part of the network can be reached and influenced. Greedy algorithms are introduced to find the most influential set of nodes. We also present an extension for the RapidMiner data mining software, which allows network analysis algorithms to be used as part of data mining tasks. Moreover, we show how Semantic Web technologies can be used for examining social networks.
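The greedy approach to selecting an influential node set can be sketched as follows. This is only an illustration under a strong simplifying assumption: a node "influences" every node reachable from it along directed edges, which is not necessarily the spread model used in the dissertation. The function names and the sample network are hypothetical.

```python
from collections import deque
from typing import Dict, Iterable, List, Set, Tuple

Graph = Dict[str, List[str]]  # node -> list of nodes it can influence


def reachable(graph: Graph, seeds: Iterable[str]) -> Set[str]:
    """Breadth-first search: all nodes reachable from the seeds (inclusive)."""
    seen = set(seeds)
    queue = deque(seen)
    while queue:
        node = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


def greedy_seed_set(graph: Graph, k: int) -> Tuple[List[str], Set[str]]:
    """Pick up to k nodes, each time adding the one with the largest gain."""
    nodes = set(graph) | {n for targets in graph.values() for n in targets}
    chosen: List[str] = []
    covered: Set[str] = set()
    for _ in range(k):
        best, best_gain = None, 0
        # Sorted iteration makes tie-breaking deterministic.
        for node in sorted(nodes - set(chosen)):
            gain = len(reachable(graph, [node]) - covered)
            if gain > best_gain:
                best, best_gain = node, gain
        if best is None:  # Nothing left to gain.
            break
        chosen.append(best)
        covered |= reachable(graph, [best])
    return chosen, covered


network: Graph = {"a": ["b", "c"], "b": ["c"], "c": [], "d": ["e"], "e": []}
seeds, influenced = greedy_seed_set(network, 2)
```

In each round the algorithm commits to the node that adds the most not-yet-covered nodes and never revisits that choice, which is what makes it greedy; the result is therefore not guaranteed to be the best possible set of size k.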