Data mining, also known as knowledge-discovery in databases (KDD), is the practice of automatically searching large stores of data for patterns. To do this, data mining uses computational techniques from statistics and pattern recognition. Data mining has been defined as "The nontrivial extraction of implicit, previously unknown, and potentially useful information from data". Although it is usually used in relation to analysis of data, data mining, like artificial intelligence, is an umbrella term and is used with varied meanings in a wide range of contexts. It is usually associated with a business or other organization's need to identify trends.
A simple example of data mining is its use in a retail sales department. If a store tracks the purchases of a customer and notices that the customer buys a lot of silk shirts, the data mining system will make a connection between that customer and silk shirts. The sales department will look at that information and may begin direct mail marketing of silk shirts to that customer, or may alternatively attempt to get the customer to buy a wider range of products. In this case, the data mining system used by the retail store discovered new information about the customer that was previously unknown to the company.
Another widely used hypothetical example is that of a very large North American chain of supermarkets. Through intensive analysis of the purchases made and the goods bought over a period of time, analysts found that beers and diapers were often bought together. Though explaining this might be difficult, taking advantage of it, on the other hand, should not be hard (e.g. placing the high-profit diapers next to the high-profit beers).
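The supermarket example can be sketched as a simple co-occurrence count: for every pair of items, measure the fraction of baskets that contain both. This is only a minimal illustration, not the method any particular retailer used. The function name `pair_support` and the toy baskets are hypothetical, invented for this sketch.

```python
from collections import Counter
from itertools import combinations


def pair_support(baskets):
    """Return {(item_a, item_b): fraction of baskets containing both items}."""
    if not baskets:
        return {}
    counts = Counter()
    for basket in baskets:
        # Deduplicate and sort so each unordered pair is counted once per basket.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    total = len(baskets)
    return {pair: n / total for pair, n in counts.items()}


# Hypothetical toy data, invented for illustration only.
baskets = [
    ["beer", "diapers", "chips"],
    ["beer", "diapers"],
    ["milk", "bread"],
    ["beer", "diapers", "milk"],
    ["bread", "chips"],
]

support = pair_support(baskets)
top_pair = max(support, key=support.get)
print(top_pair, support[top_pair])
```

With real data the number of item pairs grows quickly, so practical systems usually prune rare items before counting pairs.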
Introduction to data mining
What is data mining?
Data mining is a systematic process designed to look for hidden patterns
and/or analytic relationships between items or variables by exploring data
(usually large amounts of data, generally business or market related), and
then to validate these conclusions by applying the detected patterns to new
subsets of data. The goal of data mining is prediction - and predictive data
mining is the most used category of data mining and the one that is used by
several businesses as a statistical tool. For example, a supermarket can use
data mining to find its customers' behaviour patterns through the products
they buy. Many companies use data
mining as a technique for accumulative statistical purposes and to be able
to utilize data mining properly one must use data mining software and data
Data mining Software
Data mining software allows you to collect
and modify your current variables, such as customer records, for higher levels
of customer retention and loyalty to your company. First, you build
a customer database, which is very easy. Then, you expand your customer information
beyond contact data to include variables like purchasing statistics,
marketing statistics,
and demographic statistics. Leave your competitors behind by
using an affordable desktop database solution.
Microsoft Access as a data mining software tool
Access is a fairly new entry to the Microsoft set of products.
Access gives a small business the ability to organize and manage data for
mining and other database solutions. You can create a variety of reports and forms
in your database to connect with your customers and drive further profits.
It is well suited to growing company environments. Newer versions of the software
provide users with data support for XML, OLE, and ODBC.
FileMaker Pro as a data mining software tool
FileMaker Pro, a standalone data mining application, continues its product
leadership in the desktop database category. For the technically challenged, FileMaker
Pro provides an easy-to-use format to maximize your customer information. Database
templates to get you started are a nice feature. An added security
function allows limited access for employees and the read-only formats necessary
to protect private information.
Lotus Approach as a data mining software tool
Lotus Approach allows full modification
and statistical analysis of your customer data. This database data mining tool
can integrate items like forms, worksheets, and charts in Lotus Notes mail.
It provides an alternative to other data mining products.
What is the difference between statistics and data mining?
There is no easy answer as to what the difference is between "data mining" and
statistics.
The techniques and concepts used in data mining are basically the same as those
used in statistics. For example, both are used to predict data or
classify data and arrange them in categories. Note that many data
mining techniques, like CART or CHAID, were previously used in statistics.
But now let's get to the differences between "data mining" and statistics.
People get very excited when they hear the words "data mining", but the same doesn't
happen with "statistics". This is mainly because data mining techniques
are seen as friendlier and better at collecting data for the average user.
Another reason, of course, is the technology now available.
The use of computers for business data storage has changed things:
far more data can be stored in less space, and it is
more available to users than ever before. If there was no data we wouldn't
have to mine it. And as computers grow stronger each day, data mining becomes
more powerful, and the computer technique is the one we have in mind when
we hear "data mining". But the differences between "data mining" and
statistics are not many.
Data mining concepts & techniques
Nearest Neighbor and clustering data mining
Clustering and the nearest neighbor prediction technique are among the
oldest techniques used in data mining. Most people have an intuitive sense of
what clustering is - namely that like records are grouped or categorized
together. Nearest neighbor is a data mining technique that is quite similar
- its basis is that in order to predict the prediction value for one
record, you look for records with similar predictor values in the historical database
and use the prediction value from the record that is “nearest” to the unclassified
record.
A simple example of clustering data mining would be the clustering that most
people perform when they do the laundry - grouping the permanent press, dry
cleaning, whites and brightly colored clothes is important because they have similar characteristics.
And it turns out they have important attributes in common about the way they
behave (and can be ruined) in the wash. To cluster your laundry, most of your
decisions are relatively straightforward. There are of course difficult decisions
to be made about which cluster your white shirt with red stripes goes into
(since it is mostly white but has some color and is permanent press). When
clustering is used in business, the clusters are often much more dynamic - even
changing weekly to monthly - and many more of the decisions concerning which
cluster a record falls into can be difficult.
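The nearest neighbor idea described above can be sketched in a few lines: find the historical record whose predictor values are closest to the unclassified record and reuse its prediction value. This is a minimal sketch that assumes numeric predictors and Euclidean distance. The function name `nearest_neighbor_predict`, the customer records, and the segment labels are hypothetical, made up for illustration.

```python
import math


def nearest_neighbor_predict(history, query):
    """Predict by copying the outcome of the closest historical record.

    history: list of (predictor_values, outcome) pairs
    query:   predictor values of the unclassified record
    """
    if not history:
        raise ValueError("history must contain at least one record")
    # math.dist computes the Euclidean distance between two points.
    _, outcome = min(history, key=lambda rec: math.dist(rec[0], query))
    return outcome


# Hypothetical records: (age, purchases per year) -> customer segment.
history = [
    ((25, 2), "occasional"),
    ((31, 3), "occasional"),
    ((45, 20), "frequent"),
    ((52, 24), "frequent"),
]

print(nearest_neighbor_predict(history, (48, 18)))
```

In practice the predictors are usually rescaled to comparable ranges first, since otherwise the variable with the largest numeric range dominates the distance.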
Data preparation for data mining
Years of practical experience with data mining and its statistical analyses
have shown that
preparing data is the most time-consuming part of any data mining project.
Estimates of the amount of time and resources spent on data preparation
vary from at least 60% to upward of 80% (SPSS, 2002a). In spite of this, not
enough attention is given to this important task, thus perpetuating the
idea that the core of the data mining effort is the modeling process rather
all phases of the data mining life cycle. This article presents a synopsis
of the most crucial issues and considerations a company should take for