Data mining is used to explore increasingly large databases and to improve market segmentation by analysing the relationships between parameters such as customer age, gender and tastes. In marketing, data mining also predicts which users are likely to unsubscribe from a service, what interests them based on their searches, or what a mailing list should include to achieve a higher response rate.
Data mining also detects which offers customers value most and which of them increase sales at the checkout. Banks use data mining to better understand market risks, to learn more about customers' online preferences and habits in order to optimise the return on their marketing campaigns, to study the performance of sales channels, and to manage regulatory compliance obligations.
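As an illustration of the churn-prediction idea mentioned above (not taken from the article), here is a minimal decision-stump learner on toy customer records; the field names and data are invented:

```python
# Minimal, self-contained sketch of churn prediction with a decision stump.
# Field names and the toy data are illustrative, not from the article.

def train_stump(rows, label):
    """Find the single (feature, threshold) split that best separates labels."""
    features = [k for k in rows[0] if k != label]
    best = None  # (error_count, feature, threshold)
    for f in features:
        for t in sorted({r[f] for r in rows}):
            # predict churn=True when the feature value is >= threshold t
            err = sum((r[f] >= t) != r[label] for r in rows)
            if best is None or err < best[0]:
                best = (err, f, t)
    return best[1], best[2]

def predict(stump, row):
    f, t = stump
    return row[f] >= t

customers = [
    {"months_inactive": 0, "support_calls": 1, "churned": False},
    {"months_inactive": 1, "support_calls": 0, "churned": False},
    {"months_inactive": 4, "support_calls": 5, "churned": True},
    {"months_inactive": 6, "support_calls": 2, "churned": True},
]
stump = train_stump(customers, "churned")
print(stump)  # best single split found on the toy data
```

A real marketing pipeline would use richer models and far more features, but the principle is the same: learn a rule from historical behaviour and apply it to score current customers.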
In healthcare, having all of a patient's information, such as medical records, physical examinations and treatment patterns, allows more effective treatments to be prescribed.
Television and radio networks also gain valuable knowledge for their advertisers, who use this data to target their potential customers more accurately. Big data refers to the collection and storage of very large amounts of data.
Due to its volume, such data is impossible to process with conventional software; special tools are needed to capture, manage and process the information. From these data sets, reduced volumes of information are extracted to make predictions, and the quality of the information can vary considerably and affect the result of the analysis. The set of user devices forms the tier that provides universal access to the computing and information resources of the system.
The hypervisor manages the software agents and consolidates the computing resources of a multiprocessor system for distributed processing of big sensor data. Software agents operate in the sensor nodes and interact with the data acquisition modules, with other agents and with brokers. In this model, a computing agent is a software template for parallel processing. The intelligent agent responds to requests, decides which data processing functions to apply, and can clone itself and migrate to other nodes in the network.
The agents exchange messages with each other and with brokers, which send protected data to the central network coordinator and receive control commands from it. A distinguishing feature of the agents is that each realizes a behavior. The behavior is determined by a mathematical function that implements the steps of processing big sensor data. Other options determine the agent's behavior in particular situations, for example, when energy indicators fall outside acceptable limits.
Agent security can be provided in the following ways: prohibiting migration; disabling automatic firmware and OS updates; installing brokers that implement protection of the transmitted data; organizing protected communication channels; and using digital signatures, certification and key management procedures. A model of intelligent brokers is proposed for agent interaction with server applications at the data center (Fig.).
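As an illustration of protecting agent-to-broker messages: a real deployment would use the asymmetric digital signatures and certificates the text mentions, but the stdlib HMAC below stands in for the idea of integrity protection with a pre-provisioned key (all names are invented):

```python
import hmac
import hashlib

# Sketch of message integrity protection between agent and broker.
# HMAC with a pre-shared key stands in for real digital signatures here.

KEY = b"shared-secret"  # assumed pre-provisioned during node enrolment

def sign(payload: bytes) -> bytes:
    return hmac.new(KEY, payload, hashlib.sha256).digest()

def verify(payload: bytes, tag: bytes) -> bool:
    # compare_digest avoids timing side channels on the comparison
    return hmac.compare_digest(sign(payload), tag)

msg = b'{"node": 7, "temp": 21.4}'
tag = sign(msg)
print(verify(msg, tag), verify(b"tampered", tag))  # valid vs. tampered payload
```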
A broker is an agent that runs on routers and realizes the storage, protection and transmission of data. The Message Queuing Telemetry Transport (MQTT) protocol is used to implement information exchange under limited energy resources [27]. Cloud computing results arriving from ZigBee sensor network segments are collected by an MQTT broker loaded into the computing network's gateway cluster, which provides interaction between the ZigBee protocol stack [28] and the MQTT client.
The gateway is implemented on the central coordinator of the ZigBee network or on a modem pool of the cellular network. The availability of coordinators and routers simplifies the integration of MQTT brokers into the converged computing model. The Erlang programming language is used to develop the software for converged computing, including the broker software, agent software and server applications [29, 30]. The language includes means of spawning parallel processes and of ensuring their interaction through the exchange of asynchronous messages, in accordance with the multi-agent model. Erlang is designed for creating failsafe distributed computing systems operating in real time.
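The publish/subscribe exchange that the MQTT brokers provide can be illustrated with a minimal in-process sketch; this stands in for a real MQTT broker and client library, and the topic names are invented:

```python
# Minimal in-process sketch of MQTT-style publish/subscribe with topic
# filters ("+" matches one level, "#" matches the rest of the topic).

class Broker:
    def __init__(self):
        self.subs = []  # list of (topic_filter, callback)

    def subscribe(self, topic_filter, callback):
        self.subs.append((topic_filter, callback))

    def publish(self, topic, payload):
        for flt, cb in self.subs:
            if self._match(flt, topic):
                cb(topic, payload)

    @staticmethod
    def _match(flt, topic):
        f, t = flt.split("/"), topic.split("/")
        for i, part in enumerate(f):
            if part == "#":
                return True
            if i >= len(t) or (part != "+" and part != t[i]):
                return False
        return len(f) == len(t)

received = []
broker = Broker()
broker.subscribe("sensors/+/temp", lambda t, p: received.append((t, p)))
broker.publish("sensors/node7/temp", 21.4)
print(received)  # the subscriber saw the matching message
```

In the article's deployment the broker runs on the ZigBee coordinator or cellular gateway; this sketch only shows the topic-based decoupling between agents that the pattern gives.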
The Open Telecom Platform (OTP) framework includes plug-in libraries and behavior patterns. The class library has built-in capabilities for distributed computing: clustering, load balancing, adding nodes and servers, and increased computing reliability. Fault tolerance is achieved by using lightweight processes that are isolated from each other and related only through messaging and exit signals.
An Erlang program is translated into byte code executed by virtual machines on the network nodes. The use of lightweight processes in accordance with the model allows agents to run multiple simultaneous processes on distributed nodes with limited computing resources. Memory requirements are minimized because the virtual machine runs lightweight processes rather than OS-level processes.
To ensure information security, communication between processes and nodes is done using SSL and key management schemes. Server applications in Erlang are created using the set of behaviors of the OTP framework. This set of behaviors formalizes processes and allows OTP-based applications to be built on them. The OTP modules define design templates for parallel applications, such as the server, the supervisor, the finite state machine, the event handler, and others.
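Erlang's lightweight processes share nothing and communicate only by asynchronous messages. As a rough cross-language analogy (not the authors' Erlang code), the mailbox pattern can be sketched in Python with threads and queues:

```python
import queue
import threading

# Rough analogy of Erlang's message-passing model: each "process" owns a
# mailbox (queue) and reacts only to the messages it receives.

def worker(inbox, outbox):
    while True:
        msg = inbox.get()          # block until a message arrives
        if msg == "stop":
            break                  # analogous to a normal process exit
        outbox.put(("reply", msg * 2))

inbox, outbox = queue.Queue(), queue.Queue()
t = threading.Thread(target=worker, args=(inbox, outbox))
t.start()
inbox.put(21)                      # asynchronous send
reply = outbox.get()               # receive the worker's answer
inbox.put("stop")
t.join()
print(reply)
```

Unlike OS threads, Erlang processes are cheap enough to spawn per agent and are supervised, restarted on failure by a monitoring process, which is what the OTP behaviors formalize.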
The OTP behaviors are divided into worker processes, which perform query processing, and supervisor processes, which monitor the workers. A combination of an industrial data warehouse with a non-relational data storage system is proposed to improve the efficiency of data processing and result storage [31]. For this, the distributed non-relational Cassandra system is used together with an Oracle database to cache sections of the multidimensional storage. This improves the data sampling rate, fault tolerance and scalability.
The Cassandra system, built on the Java platform, includes a distributed hash-based store that scales as the amount of data grows. Thus, continuous removal and change of data in the storage is not required. The data is loaded only at the moments of polling the OPC (OLE for Process Control) servers and computing integral indicators for remote sites in the cloud computing environment.
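This load-on-poll scheme amounts to the classic cache-aside pattern. The sketch below uses plain dictionaries to stand in for the Oracle warehouse and the Cassandra cache; all names and values are illustrative:

```python
# Cache-aside sketch: slices of the multidimensional store are loaded into
# a key-value cache only when they are actually polled.

warehouse = {("node7", "2018-01"): [21.0, 22.5]}  # stands in for the SQL store
cache = {}                                        # stands in for Cassandra

def get_slice(key):
    if key not in cache:           # load only at interrogation time
        cache[key] = warehouse[key]
    return cache[key]              # later polls are served from the cache

print(get_slice(("node7", "2018-01")))
```

The benefit claimed in the article follows directly: hot slices are answered from the distributed cache, improving sampling rate and fault tolerance without continuously mirroring the whole warehouse.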
Each entry of the non-relational component caches a section of the Oracle relational database. The data mining tool in the system allows large arrays of sensor data to be processed for retrieval and analysis. It includes the following components. The visualization agent for multidimensional data is essential for the dispatcher and the decision maker (DM). It allows information from specific accounting and control devices to be viewed in the structured hypertable form. The hypertable is a nonstandard user-interface element of the monitoring system.
It combines the functionality of a classic table with a tree structure and controls to view, in real time, dynamic changes of the values displayed in its cells. The hypertable is a way to visualize hypercube data in which the data are grouped according to parameters and levels of aggregation. It provides the ability to navigate a multidimensional data structure. A distinctive feature of the hypertable is that the number of rows is not static, and the rows are not equal in character and functionality: some of them are aggregates.
The aggregate rows are nodal and show summary information for the relevant columns of the lower aggregation levels. In turn, aggregate rows may belong to rows of higher aggregation levels. A button is connected with each aggregate and works similarly to the anchor element of a tree list, i.e. it expands or collapses the subordinate rows. The actual number of hypertable rows therefore varies dynamically depending on the grouping of rows. Another feature of the hypertable is the ability to quickly view and analyze changes in heat and energy consumption indicators over time. When data in the MDB is updated, the hypertable also updates the values for all rows and for each time interval.
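A minimal sketch of how aggregate rows summarize the level below them; the field names are invented, not the system's schema:

```python
from collections import defaultdict

# Hypertable aggregation sketch: leaf rows are grouped and a node row
# (an "aggregate") summarizes one column of the level below.

leaf_rows = [
    {"district": "North", "building": "A", "kwh": 120},
    {"district": "North", "building": "B", "kwh": 80},
    {"district": "South", "building": "C", "kwh": 150},
]

def aggregate(rows, group_key, value_key):
    totals = defaultdict(float)
    for r in rows:
        totals[r[group_key]] += r[value_key]
    return dict(totals)

print(aggregate(leaf_rows, "district", "kwh"))  # one aggregate row per district
```

Collapsing a district row in the hypertable would hide its building rows while keeping this summary visible; expanding it restores the leaf rows, which is why the visible row count is dynamic.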
The time interval is determined from the analysis and forecasting requirements. Thus, one can view in the hypertable graphs of the change of any selected index over the period, as well as the predicted values for the specified forecast horizon. One can see the changes of power indicator values tabulated in the hypertable while moving along the timeline. Agents for creating and editing aggregation modes are needed to support working with the database in aggregation mode, for selecting and visualizing a data mart in the hypertable.
When an aggregation mode is set up, the hypertable view is defined: the values of visible columns, the number and nature of data grouping levels, and the color coding are determined. To add an aggregation mode, which will determine the form and content of the archival, current and forecast indicators in the hypertable, it is necessary to use the tools to define a set of aggregation parameters and save it under a unique name. When the mode is set up, the user should define the set of object properties (value columns) that will be shown in the hypertable.
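One hypothetical way such an aggregation-mode definition could be represented; every field name below is an assumption for illustration, not the system's actual schema:

```python
# Hypothetical representation of a saved aggregation mode. The unique name,
# visible columns, grouping levels and color coding mirror the settings
# described in the text; the concrete keys are invented.

aggregation_mode = {
    "name": "monthly_heat_by_district",          # unique mode name
    "columns": ["kwh", "gcal"],                  # visible value columns
    "grouping_levels": ["district", "building"], # hypertable aggregation levels
    "color_coding": {"kwh": "highlight_above_limit"},
}
print(aggregation_mode["name"])
```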
Available properties can be selected from a drop-down list. The agent for extracting the analyzed data mart from the hypercube allows choosing the data needed for the analysis of a concrete situation arising in the urban heat supply system. The data mart selection criteria can be quite complex.
For this purpose, the system uses multi-level queries and filters that limit the data displayed in summary tables and charts. The module allows personnel to create queries easily and clearly, in terms of the subject area, to choose the right information. In fact, the module is a constructor that implements methods of building data sampling queries with different requirements and criteria.
It allows the selection criteria to be grouped using logical operators within the structure of parameterized specifications. Data mining agents operate in a server GRID cluster. The server part of the monitoring system receives operational and archival data from the OPC server via a wireless heterogeneous network with sensor and cellular network segments.
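The idea of grouping selection criteria with logical operators can be sketched as composable predicates; this is an illustrative constructor with invented field names, not the system's actual module:

```python
# Query-constructor sketch: criteria are predicates that can be combined
# with logical operators into one parameterized specification.

def AND(*preds):
    return lambda row: all(p(row) for p in preds)

def OR(*preds):
    return lambda row: any(p(row) for p in preds)

def field(name, op, value):
    ops = {">": lambda a, b: a > b, "=": lambda a, b: a == b}
    return lambda row: ops[op](row[name], value)

# "district is North AND (kwh > 100 OR alarm is set)"
query = AND(field("district", "=", "North"),
            OR(field("kwh", ">", 100), field("alarm", "=", True)))

rows = [
    {"district": "North", "kwh": 120, "alarm": False},
    {"district": "North", "kwh": 90,  "alarm": False},
    {"district": "South", "kwh": 200, "alarm": True},
]
matched = [r["kwh"] for r in rows if query(r)]
print(matched)
```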
It provides tools for generating and updating a set of tables in the MDB, building SQL queries and processing responses. The open-source JBoss Application Server is used as the server platform in the monitoring system. To create the business logic of client applications and the user interface, we use the Adobe Flex platform, which reduces development time and cost.
It is an open-source development environment for creating desktop and mobile web applications. The technology extends the Flash capabilities, allowing the application interface to be described in an XML-based language for storing and transmitting structured data. To process very large data arrays, large corporations use computing clusters of several thousand servers and specialized software solutions based on the MapReduce model for distributing tasks between nodes and executing them in parallel. However, analytical processing of large amounts of sensor data can also be performed on modern GPU graphics cards.
Most modern video cards have an interface that provides access to the graphics processing unit (GPU), called the Compute Unified Device Architecture (CUDA). We consider a method for solving a sensor data processing problem in the urban heat supply system. Suppose we have obtained an archival data mart from the database of thermal accounting devices for the last year.
It is necessary to find out which facilities of the urban heat supply networks have the maximum energy consumption. In the first step, the input list is received by the master node of the cluster and distributed among the remaining nodes. In the second step, each node applies the predetermined mapping function to its part of the list, generating pairs whose key is the name of the object and whose value is the energy consumption. The mapping operations work independently of each other and can be performed in parallel by all nodes in the cluster.
In the reduce step, all nodes perform a given function in parallel that adds up all the values of the input list, creating for each monitoring object a single pair with the object's name as the key and the summed consumption from the original list as the value.
After that, the master node receives the data from the worker nodes and generates a result list, in which the records with the highest values are the desired objects. To support data mining and knowledge discovery procedures at the software and hardware level, a dual-platform cluster architecture is proposed for the ultrafast complex analytical processing of large amounts of sensor data, with the possibility of parallel processing on a set of GPU video cards with CUDA technology support.
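The map and reduce steps described above can be sketched as follows; this is a single-process sketch of the dataflow, with invented object names and readings, not the cluster implementation:

```python
from collections import defaultdict

# Single-process sketch of the described MapReduce job: map emits
# (object name, consumption) pairs, reduce sums them per object, and
# the maximum total identifies the sought facility.

readings = [("boiler_3", 120), ("substation_1", 80),
            ("boiler_3", 90), ("substation_1", 40)]

def map_phase(records):
    # Each node would emit key-value pairs from its slice of the list.
    return [(name, kwh) for name, kwh in records]

def reduce_phase(pairs):
    # Each node would sum the values arriving for its keys.
    totals = defaultdict(int)
    for name, kwh in pairs:
        totals[name] += kwh
    return dict(totals)

totals = reduce_phase(map_phase(readings))
top = max(totals, key=totals.get)  # the master node's final selection
print(totals, top)
```

In the cluster version, the map calls run in parallel on disjoint slices and the reduce calls in parallel per key, which is exactly what makes the scheme scale.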
The calculation performance for the data aggregation tasks reaches values comparable with the solution of similar problems on a single cluster platform of servers. The research on the synthesis and implementation of the converged computing model led to the following conclusions. The convergent approach to distributed computing is the convergence of distributed data processing technologies: GRID, cloud, fog and mobile computing. The model is designed for the collection, processing and integration of big sensor data obtained in the process of monitoring and control of spatially distributed objects and processes.
Its prospects are supported by the current state of wireless technologies and of sensor network software and hardware. The convergent model of distributed computing includes four levels of processing and two levels of storage. The first level is the level of fog computing: here, processing and aggregation of sensor data is realized by software agents migrating among the nodes of a ZigBee WSN and the controllers embedded in industrial automation and energy accounting devices. At the next level, sensor data and aggregates are integrated in a multidimensional cloud storage built on a combination of an industrial SQL (Oracle) store and the distributed non-relational Cassandra system for caching data slices.
The third level of data processing is implemented in the server cluster. The cluster includes the main server that controls the hypervisor, low-power network servers of the local network, and a set of GPU graphics-card servers with CUDA technology. The fourth level is implemented on mobile systems (smartphones and tablets), onto which agents are loaded that retrieve and visualize the results of intelligent analysis, with elements of augmented reality and the use of geo-information technologies.
Information interaction of the fog computing agents with each other, with the cloud storage and database, and with the server applications is provided by intelligent brokers through the MQTT protocol. Agents and brokers are managed by the network hypervisor, which consolidates distributed resources into a multiprocessor complex.
The functionality of agents and brokers is defined as a mathematical function that determines the sensor data processing actions and the selection of behaviors in response to emerging events; this functionality is implemented in the Erlang language. The benefits of big sensor data processing based on the convergent model of distributed computing include: a decreased load on the server cluster; a reduced volume of traffic in the sensor networks; increased battery life of the network and its components; a decreased volume of data in the repository; real-time monitoring of controllers and sensors; and obtaining integral indicators on mobile devices directly from the terminal nodes.
- Int J Appl Eng Res 10(3).
- Shcherbakov M, Kamaev V, Shcherbakova N. Automated electric energy consumption forecasting system based on decision tree approach. In: Proceedings of the IFAC conference on manufacturing modelling, management and control.
- Izvestia VSTU. Actual problems of management, computer science and informatics in technical systems 10(14).
- Soc Sci 10(2).
- Dargie W, Poellabauer C. Fundamentals of wireless sensor networks: theory and practice. Wiley, Hoboken.
- Gustafsson J. Integration of wireless sensor and actuator nodes with IT infrastructure using service-oriented architecture. Procedia Comput Sci.
- Kamaev V, Finogeev A, Finogeev A, Vinh Thai Quang, et al. Wireless sensor network for the supervisory control and data acquisition in the heating supply system. Int J Appl Eng Res 10(15).
- Comput Commun.
- Futur Gener Comput Syst 29(1).
- Commun ACM 53(4).
- Arunkumar G, Neelanarayanan V. A novel approach to address interoperability concern in cloud computing.
- Recommendations of the National Institute of Standards and Technology.
- Cisco and Microsoft extend relationship with new Cloud Platform. Accessed 16 Oct.
- Google Cloud Platform. A fast, economical and fully managed data warehouse for large-scale data analytics.
- IBM Cloud delivers a new way to work.
- Toshiba Asia Pacific Pte.
- Stojmenovic I, Wen S. The fog computing paradigm: scenarios and security issues.
- ZigBee Alliance.
- MQTT version 3.
- ZigBee Specification Overview.
- Larson J. Erlang for concurrent programming. ACM Queue.
- Concurr Comput Pract Exp 20(8).

All coauthors contributed significantly to the research and this paper; the lead author is the main contributor. All authors read and approved the final manuscript. Correspondence to Alexey G. Finogeev.

Research | Open Access | Published: 13 March. The convergence computing model for big sensor data mining and knowledge discovery. Alexey G. Finogeev, Danila S.