Big Data & Information Analytics

April 2017, Volume 2, Issue 2

First steps in the investigation of automated text annotation with pictures
J. Kent Poots and Nick Cercone
2017, 2(2): 97-106 doi: 10.3934/bdia.2017001

We describe the investigation of automatic annotation of text with pictures, where knowledge extraction uses dependency parsing. Annotation of text with pictures, a form of knowledge visualization, can assist understanding. The problem statement is: given a corpus of images and a short passage of text, extract knowledge (or concepts), and then display that knowledge in pictures along with the text to help with understanding. A proposed solution framework includes a component to extract document concepts, a component to match document concepts with picture metadata, and a component to produce an amalgamated output of text and pictures. A proof-of-concept application based on the proposed framework provides encouraging results.

Rendering website traffic data into interactive taste graph visualizations
Ana Jofre, Lan-Xi Dong, Ha Phuong Vu, Steve Szigeti and Sara Diamond
2017, 2(2): 107-118 doi: 10.3934/bdia.2017003

We present a method for converting a large corpus of website traffic data into interactive and practical taste graph visualizations. The website traffic data lists individual visitors' level of interest in specific pages across the website; it is a tripartite list consisting of anonymized visitor ID, webpage ID, and a score that quantifies interest level. Taste graph visualizations expose psychological profiles by revealing connections between consumer tastes; for example, an individual with a taste for A may also have a taste for B. We describe here the method by which we map the web traffic data into a form that can be displayed as interactive taste graphs, and we describe design strategies for communicating the revealed information. In the context of the publishing industry, this interactive visualization is a tool that renders the large corpus of website traffic data into a form that is actionable for marketers and advertising professionals. It could equally be used as a method to personalize services in the domains of government services, education or health and wellness.
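As a rough illustration of the kind of mapping this abstract describes, the following Python sketch turns (visitor ID, webpage ID, score) triples into a weighted graph linking pages that the same visitors show interest in. The function name `build_taste_graph`, the `min_score` threshold, and the sample records are assumptions made for illustration; the abstract does not specify the authors' actual procedure.

```python
from collections import defaultdict
from itertools import combinations

def build_taste_graph(records, min_score=0.5):
    """Return {(page_a, page_b): weight}, where weight counts the visitors
    whose interest score is at least min_score for both pages.
    (Hypothetical helper; the threshold is an assumed parameter.)"""
    pages_by_visitor = defaultdict(set)
    for visitor_id, page_id, score in records:
        if score >= min_score:
            pages_by_visitor[visitor_id].add(page_id)

    edges = defaultdict(int)
    for pages in pages_by_visitor.values():
        # Sort so each undirected edge has one canonical key.
        for a, b in combinations(sorted(pages), 2):
            edges[(a, b)] += 1
    return dict(edges)

# Made-up sample data: (anonymized visitor ID, webpage ID, interest score).
records = [
    ("v1", "cooking", 0.9),
    ("v1", "travel", 0.7),
    ("v2", "cooking", 0.8),
    ("v2", "travel", 0.6),
    ("v2", "sports", 0.2),
    ("v3", "sports", 0.9),
]
graph = build_taste_graph(records)
print(graph)
```

In this sketch an edge weight is simply a count of shared visitors; a real pipeline would likely normalize for page popularity before visualizing.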

Proportional association based RoI model
Wenxue Huang, Yuanyi Pan and Lihong Zheng
2017, 2(2): 119-125 doi: 10.3934/bdia.2017004

Based on a local-to-global proportional association measure proposed by Huang, Shi and Wang [9], and assuming cost and revenue information is known, an association measure is proposed to maximize the expected RoI. A descriptive experiment with a synthetic data set is presented.

Big data collection and analysis for manufacturing organisations
Pankaj Sharma, David Baglee, Jaime Campos and Erkki Jantunen
2017, 2(2): 127-139 doi: 10.3934/bdia.2017002

Data mining applications are becoming increasingly important for a wide range of manufacturing and maintenance processes. During daily operations, large amounts of data are generated. This large volume and variety of data, arriving at a greater velocity, has its own advantages and disadvantages. On the negative side, the abundance of data often impedes the ability to extract useful knowledge. In addition, the large amounts of data stored in often unconnected databases make it impractical to manually analyse them for valuable decision-making information. However, the advent of a new generation of big data analytical tools has started to provide large-scale benefits for organizations. The paper examines the possible data inputs from machines, people and organizations that can be analysed for maintenance. Further, the paper explains the role of big data within maintenance and how, if not managed correctly, big data can create problems rather than provide solutions. The paper highlights the need for advanced mining techniques that convert data into information in an acceptable time frame and for modern analytical tools that extract value from big datasets.

Identifying electronic gaming machine gambling personae through unsupervised session classification
Maria Gabriella Mosquera and Vlado Keselj
2017, 2(2): 141-175 doi: 10.3934/bdia.2017015

The rising accessibility of gambling products, such as Electronic Gaming Machines (EGM), has increased interest in the effects of gambling; in particular, the potential for impulse control disorders, such as problem gambling. Nevertheless, empirical research on EGM gambling behaviour is scarce. In this exploratory study, we apply data mining techniques to 46,416 gambling sessions, collected in situ from 288 EGMs. Our research focused on identifying the at-risk behavioural markers of sessions to help distinguish gambling personae. Our data included measures of gambling involvement, out-of-pocket expense of sessions, amount won, and cost of gambling. This research discusses the methodology used to collect and analyze the required gambling measures, explains the criteria used for identifying valid sessions, and combines outlier mining methods to identify instances of heavily involved gambling (i.e., outliers). Our results suggest that sessions can be classified as potential non-problem, potential low-risk, potential moderate-risk, and potential problem gambling sessions. Further, outlier sessions were more heavily involved in terms of gambling intensity and amount redeemed, despite having low duration times. Finally, our methods suggest that the lack of player identification does not prevent one from identifying the potential incidence of problem gambling behaviour.

An ontological account of flow-control components in BPMN process models
Xing Tan, Yilan Gu and Jimmy Xiangji Huang
2017, 2(2): 177-189 doi: 10.3934/bdia.2017016

The Business Process Model and Notation (BPMN) has been widely adopted in recent years as one of the standard languages for the visual description of business processes. BPMN, however, does not include a formal semantics, which is required for formal verification and validation of the behaviors of BPMN models.

Towards bridging this gap using first-order logic, in this paper we present an ontological/formal account of flow-control components in BPMN, using Situation Calculus and Petri nets. More precisely, we use SCOPE (Situation Calculus Ontology of PEtri nets), developed in our previous work, to formally describe flow-control related basic components (i.e., events, tasks, and gateways) in BPMN as SCOPE-based procedures. These components are first mapped from BPMN onto Petri nets.

Our approach differs from other major approaches for assigning semantics to BPMN (e.g., the ones applying communicating sequential processes, or abstract state machines) in the following aspects. Firstly, the approach supports direct application of automated theorem proving for checking theory consistency or verifying dynamical properties of systems. Secondly, it defines concepts through aggregation of more basic concepts in a hierarchical way; thus the adoptability and extensibility of the models are presumably high. Thirdly, the Petri-net-based implementation is completely encapsulated, such that interfaces between the system and its users are defined completely within a BPMN context. Finally, the approach can easily be extended to adopt the concept of time.
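To make the BPMN-to-Petri-net step mentioned in this abstract more concrete, here is a minimal Python sketch of a Petri net with the standard firing rule, encoding a start event, a task, and an exclusive (XOR) split gateway. The encoding (one transition per event or task, two competing transitions for the XOR split) is a commonly used one and is assumed here for illustration; it is not taken from the paper, and the class, place, and transition names are hypothetical. The SCOPE / Situation Calculus layer is not modelled.

```python
class PetriNet:
    """Minimal place/transition net; every arc is assumed to have weight 1
    and each place is listed at most once per transition."""

    def __init__(self, marking):
        self.marking = dict(marking)   # place -> token count
        self.transitions = {}          # name -> (input places, output places)

    def add_transition(self, name, inputs, outputs):
        self.transitions[name] = (list(inputs), list(outputs))

    def enabled(self, name):
        # A transition is enabled when every input place holds a token.
        inputs, _ = self.transitions[name]
        return all(self.marking.get(p, 0) >= 1 for p in inputs)

    def fire(self, name):
        # Firing consumes one token per input place and produces one per output place.
        if not self.enabled(name):
            raise ValueError(f"transition {name!r} is not enabled")
        inputs, outputs = self.transitions[name]
        for p in inputs:
            self.marking[p] -= 1
        for p in outputs:
            self.marking[p] = self.marking.get(p, 0) + 1

# Assumed encoding of: start event -> task -> XOR split into branches A and B.
net = PetriNet({"p_init": 1})
net.add_transition("start_event", ["p_init"], ["p_before_task"])
net.add_transition("task", ["p_before_task"], ["p_at_gateway"])
net.add_transition("xor_to_a", ["p_at_gateway"], ["p_branch_a"])
net.add_transition("xor_to_b", ["p_at_gateway"], ["p_branch_b"])

net.fire("start_event")
net.fire("task")
both_enabled = net.enabled("xor_to_a") and net.enabled("xor_to_b")
net.fire("xor_to_a")   # taking branch A consumes the gateway token
print(both_enabled, net.enabled("xor_to_b"), net.marking["p_branch_a"])
```

Because both gateway transitions compete for the single token in `p_at_gateway`, firing one disables the other, which is how this encoding captures exclusive choice.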
