- Lipp, Brian.
- [S.l.] : Packt Publishing, 2020.
- Description
- Book — 1 online resource
- Shields, Ben, author.
- 1st edition. - MIT Sloan Management Review, 2018.
- Description
- Book — 1 online resource (1 page) Digital: text file.
- Summary
-
The successful use of analytics in sports, both on the field and off, comes down to integrating analytics within an organization. Three strategies - collaborative analytics, a common language, and accessible technology - are key.
- Ransbotham, Sam, author.
- 1st edition - MIT Sloan Management Review, 2017.
- Description
- Book — 1 online resource (3 pages)
- Summary
-
Plummeting data acquisition costs have been a big part of the surge in business analytics. We have much richer samples of data to use for insight. But more data doesn't inherently remove sampling bias; in fact, it may make it worse.
4. What's Your Data Worth? [2017]
- Short, James; Todd, Steve.
- [Place of publication not identified] MIT Sloan Management Review, 2017.
- Description
- Book — 1 online resource
- Summary
-
What is the value of data? Many businesses don't yet know the answer to that question. But going forward, companies will need to develop greater expertise at valuing their data assets.
5. Principles of data mining [2016]
- Bramer, M. A. (Max A.), 1948- author.
- Third edition. - London, United Kingdom : Springer, 2016.
- Description
- Book — 1 online resource (xv, 526 pages) : illustrations
- Summary
-
- Introduction to Data Mining
- Data for Data Mining
- Introduction to Classification: Naïve Bayes and Nearest Neighbour
- Using Decision Trees for Classification
- Decision Tree Induction: Using Entropy for Attribute Selection
- Decision Tree Induction: Using Frequency Tables for Attribute Selection
- Estimating the Predictive Accuracy of a Classifier
- Continuous Attributes
- Avoiding Overfitting of Decision Trees
- More About Entropy
- Inducing Modular Rules for Classification
- Measuring the Performance of a Classifier
- Dealing with Large Volumes of Data
- Ensemble Classification
- Comparing Classifiers
- Association Rule Mining I
- Association Rule Mining II
- Association Rule Mining III
- Clustering
- Text Mining
- Classifying Streaming Data
- Classifying Streaming Data II: Time-dependent Data
- Appendix A: Essential Mathematics
- Appendix B: Datasets
- Appendix C: Sources of Further Information
- Appendix D: Glossary and Notation
- Appendix E: Solutions to Self-assessment Exercises
- Index.
6. Designing and Operating a Data Reservoir [2015]
- Chessell, Mandy.
- [Place of publication not identified] : IBM Redbooks, 2015.
- Description
- Book — 1 online resource
- Summary
-
Together, big data and analytics have tremendous potential to improve the way we use precious resources, to provide more personalized services, and to protect ourselves from unexpected and ill-intentioned activities. To fully use big data and analytics, an organization needs a system of insight. This is an ecosystem where individuals can locate and access data, and build visualizations and new analytical models that can be deployed into the IT systems to improve the operations of the organization. The data that is most valuable for analytics is also valuable in its own right and typically contains personal and private information about key people in the organization such as customers, employees, and suppliers. Although universal access to data is desirable, safeguards are necessary to protect people's privacy, prevent data leakage, and detect suspicious activity. The data reservoir is a reference architecture that balances the desire for easy access to data with information governance and security. The data reservoir reference architecture describes the technical capabilities necessary for a system of insight, while being independent of specific technologies. Being technology independent is important, because most organizations already have investments in data platforms that they want to incorporate in their solution. In addition, technology is continually improving, and the choice of technology is often dictated by the volume, variety, and velocity of the data being managed. A system of insight needs more than technology to succeed. The data reservoir reference architecture includes description of governance and management processes and definitions to ensure the human and business systems around the technology support a collaborative, self-service, and safe environment for data use. 
The data reservoir reference architecture was first introduced in Governing and Managing Big Data for Analytics and Decision Makers, REDP-5120, which is available at: http://www.redbooks.ibm.com/redpieces/abstracts/redp5120.html. This IBM® Redbooks publication, Designing and Operating a Data Reservoir, builds on that material to provide more detail on the capabilities and internal workings of a data reservoir.
7. JMP Essentials, 2nd Edition [2014]
- Hinrichs, Curt, author.
- 1st edition - SAS Institute, 2014.
- Description
- Book — 1 online resource (358 pages)
- Summary
-
Grasp essential steps in order to generate meaningful results quickly with JMP. JMP Essentials: An Illustrated Step-by-Step Guide for New Users, Second Edition is designed for the new or occasional JMP user who needs to generate meaningful graphs or results quickly. Drawing on their own experience working with these customers, the authors provide essential steps for what new users typically need to carry out with JMP. This newest edition has all new instructions and screen shots reflecting the latest release of JMP software. In addition, it has eight new detailed sections and 10 new subsections that include creating maps, filtering data, creating dashboards, and working with Excel data, all of which highlight new, useful and basic level enhancements to JMP. The format of the book is unique. It adopts a show-and-tell design with essential step-by-step instructions and corresponding screen illustrations, which help users quickly see how to generate the desired results. In most cases, each section completes a JMP task, which maximizes the book's utility as a reference. In addition, each chapter contains a family of features that are carefully crafted to first introduce you to basic features and then on to more advanced ones. JMP Essentials: An Illustrated Step-by-Step Guide for New Users, Second Edition is the quickest and most accessible reference book available. This is part of the SAS Press program.
- Drake, Matthew, author.
- 1st edition. - PH Professional Business, 2014.
- Description
- Book — 1 online resource (8 pages) Digital: text file.
- Summary
-
This new business analytics case study challenges readers to optimize the logistics operation for a regional chain of discount stores, gaining the insight they need to consolidate routes and eliminate overspending. Crystallizing realistic analytical challenges faced by companies in many industries and markets, it exposes readers to the entire decision-making process, providing opportunities to perform analyses, interpret output, and recommend the best course of action. Author: Matthew J. Drake, Duquesne University.
9. Data mining : concepts and techniques [2012]
- Han, Jiawei.
- 3rd ed. - Waltham, MA : Morgan Kaufmann/Elsevier, ©2012.
- Description
- Book — 1 online resource (xxxv, 703 pages) : illustrations, facsimiles. Digital: text file.
- Summary
-
- Front Cover
- Data Mining: Concepts and Techniques
- Copyright
- Dedication
- Table of Contents
- Foreword
- Foreword to Second Edition
- Preface
- Acknowledgments
- About the Authors
- Chapter 1. Introduction
- 1.1 Why Data Mining?
- 1.2 What Is Data Mining?
- 1.3 What Kinds of Data Can Be Mined?
- 1.4 What Kinds of Patterns Can Be Mined?
- 1.5 Which Technologies Are Used?
- 1.6 Which Kinds of Applications Are Targeted?
- 1.7 Major Issues in Data Mining
- 1.8 Summary
- 1.9 Exercises
- 1.10 Bibliographic Notes
- Chapter 2. Getting to Know Your Data
- 2.1 Data Objects and Attribute Types
- 2.2 Basic Statistical Descriptions of Data
- 2.3 Data Visualization
- 2.4 Measuring Data Similarity and Dissimilarity
- 2.5 Summary
- 2.6 Exercises
- 2.7 Bibliographic Notes
- Chapter 3. Data Preprocessing
- 3.1 Data Preprocessing: An Overview
- 3.2 Data Cleaning
- 3.3 Data Integration
- 3.4 Data Reduction
- 3.5 Data Transformation and Data Discretization
- 3.6 Summary
- 3.7 Exercises
- 3.8 Bibliographic Notes
- Chapter 4. Data Warehousing and Online Analytical Processing
- 4.1 Data Warehouse: Basic Concepts
- 4.2 Data Warehouse Modeling: Data Cube and OLAP
- 4.3 Data Warehouse Design and Usage
- 4.4 Data Warehouse Implementation
- 4.5 Data Generalization by Attribute-Oriented Induction
- 4.6 Summary
- 4.7 Exercises
- 4.8 Bibliographic Notes
- Chapter 5. Data Cube Technology
- 5.1 Data Cube Computation: Preliminary Concepts
- 5.2 Data Cube Computation Methods
- 5.3 Processing Advanced Kinds of Queries by Exploring Cube Technology
- 5.4 Multidimensional Data Analysis in Cube Space
- 5.5 Summary
- 5.6 Exercises
- 5.7 Bibliographic Notes
- Chapter 6. Mining Frequent Patterns, Associations, and Correlations: Basic Concepts and Methods
- 6.1 Basic Concepts
- 6.2 Frequent Itemset Mining Methods
- 6.3 Which Patterns Are Interesting?-Pattern Evaluation Methods
- 6.4 Summary
- 6.5 Exercises
- 6.6 Bibliographic Notes
- Chapter 7. Advanced Pattern Mining
- 7.1 Pattern Mining: A Road Map
- 7.2 Pattern Mining in Multilevel, Multidimensional Space
- 7.3 Constraint-Based Frequent Pattern Mining
- 7.4 Mining High-Dimensional Data and Colossal Patterns
- 7.5 Mining Compressed or Approximate Patterns
- 7.6 Pattern Exploration and Application
- 7.7 Summary
- 7.8 Exercises
- 7.9 Bibliographic Notes
- Chapter 8. Classification: Basic Concepts
- 8.1 Basic Concepts
- 8.2 Decision Tree Induction
- 8.3 Bayes Classification Methods
- 8.4 Rule-Based Classification
- 8.5 Model Evaluation and Selection
- 8.6 Techniques to Improve Classification Accuracy
- 8.7 Summary
- 8.8 Exercises
- 8.9 Bibliographic Notes
- Chapter 9. Classification: Advanced Methods
- 9.1 Bayesian Belief Networks
- 9.2 Classification by Backpropagation
- 9.3 Support Vector Machines
- 9.4 Classification Using Frequent Patterns
- 9.5 Lazy Learners (or Learning from Your Neighbors)
- 9.6 Other Classification Methods
- 9.7 Additional Topics Regarding Classification
- 9.8 Summary
- 9.9 Exercises
- 9.10 Bibliographic Notes
- Chapter 10. Cluster Analysis: Basic Concepts and Methods
- 10.1 Cluster Analysis
- 10.2 Partitioning Methods
- 10.3 Hierarchical Methods
- 10.4 Density-Based Methods
- 10.5 Grid-Based Methods
- 10.6 Evaluation of Clustering
- 10.7 Summary
- 10.8 Exercises
- 10.9 Bibliographic Notes
- Chapter 11. Advanced Cluster Analysis
- 11.1 Probabilistic Model-Based Clustering
- 11.2 Clustering High-Dimensional Data
- 11.3 Clustering Graph and Network Data
- 11.4 Clustering with Constraints
- 11.5 Summary
- 11.6 Exercises
- 11.7 Bibliographic Notes
- Chapter 12. Outlier Detection
- 12.1 Outliers and Outlier Analysis
- 12.2 Outlier Detection Methods
- 12.3 Statistical Approaches
- 12.4 Proximity-Based Approaches
- 12.5 Clustering-Based Approaches
- 12.6 Classification-Based Approaches
- 12.7 Mining Contextual and Collective Outliers
- 12.8 Outlier Detection in High-Dimensional Data
- 12.9 Summary
- 12.10 Exercises
- 12.11 Bibliographic Notes
- Chapter 13. Data Mining Trends and Research Frontiers
- 13.1 Mining Complex Data Types
- 13.2 Other Methodologies of Data Mining
- 13.3 Data Mining Applications
- 13.4 Data Mining and Society
- 13.5 Data Mining Trends
- 13.6 Summary
- 13.7 Exercises
- 13.8 Bibliographic Notes
- Bibliography
- Index.
(source: Nielsen Book Data)
- Ferguson, Renee, author.
- 1st edition. - MIT Sloan Management Review, 2012.
- Description
- Book — 1 online resource (5 p.) Digital: text file.
- Summary
-
In a recent data and analytics survey conducted by MIT Sloan Management Review in partnership with SAS Institute Inc., the authors found a strong correlation between the value companies say they generate using analytics and the amount of data they use. Combining the responses to several survey questions, they identified five levels of analytics sophistication, with those at Level 5 being most sophisticated and innovative. These analytical innovators in Level 5 had several defining characteristics. First, they tended to use more data than other groups. In fact, they were three times more likely than the 8% of those respondents who fell into the Level 1 category to say they used a great deal or all of their data. Second, there was a strong correlation between driving competitive advantage and innovation with analytics and how effective a company is at managing what the authors term "the information transformation cycle." This cycle refers to the process of capturing data, analyzing information, aggregating and integrating data, using insights to guide future strategy and disseminating information and insights. Respondents who fell into the Level 5 category also had a stronger need for speed than other survey respondents. Eighty-seven percent reported that the ability to process and analyze data more quickly was very important. Utilizing speed fell into three separate areas: customer experience, pricing strategy and innovation. Another intriguing finding from the survey involved the cultural impact on organizations. Some respondents reported that the use of analytics is shifting the power structure within their organizations. Analytical innovators, as a group, tended to be more likely than other groups to say that analytics has started to shift the power structure in their organizations.
- [Place of publication not identified] : O'Reilly Media, 2011.
- Description
- Video — 1 online resource (1 streaming video file (49 hr., 6 min., 17 sec.))
- Summary
-
"At O'Reilly's Strata New York Conference in September 2011, developers and data professionals learned about the best tools and technologies for everything from gathering, cleaning, analyzing, and storing data to communicating data intelligence effectively. This video compilation gives you access to every session."--Resource description page.
12. Data mining methods and models [2006]
- Larose, Daniel T.
- Hoboken, NJ : Wiley-Interscience, c2006.
- Description
- Book — xvi, 322 p. : ill. ; 25 cm.
- Summary
-
- Preface
- 1. Dimension Reduction Methods. Need for Dimension Reduction in Data Mining. Principal Components Analysis. Factor Analysis. User-Defined Composites.
- 2. Regression Modeling. Example of Simple Linear Regression. Least-Squares Estimates. Coefficient of Determination. Correlation Coefficient. The ANOVA Table. Outliers, High Leverage Points, and Influential Observations. The Regression Model. Inference in Regression. Verifying the Regression Assumptions. An Example: The Baseball Data Set. An Example: The California Data Set. Transformations to Achieve Linearity.
- 3. Multiple Regression and Model Building. An Example of Multiple Regression. The Multiple Regression Model. Inference in Multiple Regression. Regression with Categorical Predictors. Multicollinearity. Variable Selection Methods. An Application of Variable Selection Methods. Mallows' Cp Statistic. Variable Selection Criteria. Using the Principal Components as Predictors in Multiple Regression.
- 4. Logistic Regression. A Simple Example of Logistic Regression. Maximum Likelihood Estimation. Interpreting Logistic Regression Output. Inference: Are the Predictors Significant? Interpreting the Logistic Regression Model. Interpreting a Logistic Regression Model for a Dichotomous Predictor. Interpreting a Logistic Regression Model for a Polychotomous Predictor. Interpreting a Logistic Regression Model for a Continuous Predictor. The Assumption of Linearity. The Zero-Cell Problem. Multiple Logistic Regression. Introducing Higher Order Terms to Handle Non-Linearity. Validating the Logistic Regression Model. WEKA: Hands-On Analysis Using Logistic Regression.
- 5. Naive Bayes and Bayesian Networks. The Bayesian Approach. The Maximum a Posteriori (MAP) Classification. The Posterior Odds Ratio. Balancing the Data. Naive Bayes Classification. Numeric Predictors for Naive Bayes Classification. WEKA: Hands-On Analysis Using Naive Bayes. Bayesian Belief Networks. Using the Bayesian Network to Find Probabilities. WEKA: Hands-On Analysis Using Bayes Net.
- 6. Genetic Algorithms. Introduction to Genetic Algorithms. The Basic Framework of a Genetic Algorithm. A Simple Example of Genetic Algorithms at Work. Modifications and Enhancements: Selection. Modifications and Enhancements: Crossover. Genetic Algorithms for Real-Valued Variables. Using Genetic Algorithms to Train a Neural Network. WEKA: Hands-On Analysis Using Genetic Algorithms.
- 7. Case Study: Modeling Response to Direct-Mail Marketing. The Cross-Industry Standard Process for Data Mining: CRISP-DM. Business Understanding Phase. Data Understanding and Data Preparation Phases. The Modeling Phase and the Evaluation Phase.
- Index.
- (source: Nielsen Book Data)
Engineering Library (Terman) | Status |
---|---|
Stacks | |
QA76.9 .D343 L378 2006 | Unknown |
13. Avro data [2011]
- Cutting, Doug.
- [United States?] : O'Reilly Media, 2011.
- Description
- Video — 1 online resource (1 streaming video file (47 min., 10 sec.))
- Summary
-
"Apache Avro provides an expressive, efficient standard for representing large data sets. Avro data is programming-language neutral and MapReduce-friendly. Hopefully it can replace gzipped CSV-like formats as a dominant format for data."--Resource description page.
- North, Matthew, speaker.
- [Place of publication not identified] : O'Reilly, [2016]
- Description
- Video — 1 online resource (1 streaming video file (1 hr., 56 min., 18 sec.)) : digital, sound, color
- Summary
-
"This course is designed for the person who is new to the science of data analytics, who has completed at least one college-level math class, and is comfortable with basic statistics. The course explains the core methods used in data analytics and how to apply those methods in conjunction with RapidMiner, a free and easy-to-use (no programming knowledge required) data analytics platform. You'll first learn about the features of RapidMiner, configuring it, and how to connect to a variety of data sets, and then move into a detailed survey of the analytical methods incorporated within the software. Topics covered include correlation, association rules, k-means clustering, k-nearest neighbors, discriminant analysis, Naive Bayes, linear and logistic regression, neural networks, decision trees, and text analysis."--Resource description page.
- IC3K (Conference) (10th : 2018 : Seville, Spain)
- Cham : Springer, 2020.
- Description
- Book — 1 online resource Digital: text file.PDF.
- Summary
-
- Knowledge Discovery and Information Retrieval.- Knowledge Engineering and Ontology Development.- Knowledge Management and Information Sharing.
- (source: Nielsen Book Data)
- Madsen, Mark.
- [Place of publication not identified] : O'Reilly, ©2011.
- Description
- Video — 1 online resource (1 streaming video file (45 min., 46 sec.))
- Summary
-
"There has been an explosion in database technology designed to handle big data and deep analytics from both established vendors and startups. This session will provide a quick tour of the primary technology innovations and systems powering the analytic database landscape."--Resource description page.
17. Data mining : a knowledge discovery approach [2007]
- New York ; New York : Springer, ©2007.
- Description
- Book — 1 online resource (xv, 606 pages) : illustrations Digital: text file.PDF.
- Summary
-
- Data Mining and Knowledge Discovery Process.- The Knowledge Discovery Process.- Data Understanding.- Data.- Concepts of Learning, Classification, and Regression.- Knowledge Representation.- Data Preprocessing.- Databases, Data Warehouses, and OLAP.- Feature Extraction and Selection Methods.- Discretization Methods.- Data Mining: Methods for Constructing Data Models.- Unsupervised Learning: Clustering.- Unsupervised Learning: Association Rules.- Supervised Learning: Statistical Methods.- Supervised Learning: Decision Trees, Rule Algorithms, and Their Hybrids.- Supervised Learning: Neural Networks.- Text Mining.- Data Models Assessment.- Assessment of Data Models.- Data Security and Privacy Issues.- Data Security, Privacy and Data Mining.
- (source: Nielsen Book Data)
18. Data mining for dummies [2014]
- Brown, Meta S., author.
- Hoboken : John Wiley & Sons, 2014.
- Description
- Book — 1 online resource
- Summary
-
- Introduction
- Part I: Getting Started with Data Mining
- Chapter 1: Catching the Data-Mining Train
- Chapter 2: A Day in Your Life as a Data Miner
- Chapter 3: Teaming Up to Reach Your Goals
- Part II: Exploring Data-Mining Mantras and Methods
- Chapter 4: Learning the Laws of Data Mining
- Chapter 5: Embracing the Data-Mining Process
- Chapter 6: Planning for Data-Mining Success
- Chapter 7: Gearing Up with the Right Software
- Part III: Gathering the Raw Materials
- Chapter 8: Digging into Your Data
- Chapter 9: Making New Data
- Chapter 10: Ferreting Out Public Data Sources
- Chapter 11: Buying Data
- Part IV: A Data Miner's Survival Kit
- Chapter 12: Getting Familiar with Your Data
- Chapter 13: Dealing in Graphic Detail
- Chapter 14: Showing Your Data Who's Boss
- Chapter 15: Your Exciting Career in Modeling
- Part V: More Data-Mining Methods
- Chapter 16: Data Mining Using Classic Statistical Methods
- Chapter 17: Mining Data for Clues
- Chapter 18: Expanding Your Horizons
- Part VI: The Part of Tens
- Chapter 19: Ten Great Resources for Data Miners
- Chapter 20: Ten Useful Kinds of Analysis That Complement Data Mining
- Appendix A: Glossary
- Appendix B: Data-Mining Software Sources
- Appendix C: Major Data Vendors
- Appendix D: Sources and Citations
- Index.
- (source: Nielsen Book Data)
- Sigman, Betsy Page, author.
- Birmingham, England : Packt Publishing Ltd, 2015.
- Description
- Book — 1 online resource (156 pages) : illustrations (some color) Digital: text file.
- Summary
-
- Cover; Copyright; Credits; About the Author; About the Reviewers; www.PacktPub.com; Table of Contents; Preface
- Chapter 1: Introducing Splunk; How to install Splunk; Splunk setup instructions; Setting up Splunk for Windows; Splunk for Mac; Starting up Splunk; The functions of Splunk; Splunk and big data; The three Vs; Other big data descriptors; Splunk data sources; Understanding events, event types, and fields in Splunk; Events; Event types; Sourcetypes; Fields; Getting data into Splunk; Summary
- Chapter 2: An Introduction to Indexing and Searching; Collecting data to search; Indexing data with Splunk; Using indexed data; Viewing a list of indexes; Bringing in indexed data; Specifying a sourcetype; What is Search Processing Language (SPL)?; Using pipes when processing data with Splunk; Types of SPL commands; Filter commands; The sort command; The grouping command; Reporting commands; Other commands; How to perform simple searches; Summary
- Chapter 3: More on Using Search; More on search; Doing a count; Creating a count broken down by field values; Other stat functions; Using the eval command; Combining stats with eval; Using the timechart command; Visualizations; Changing Format to Column Chart; The top command; Charting by the day of the week; Putting days of the week in an alphabetical order; Summary
- Chapter 4: Reports in Splunk; Getting data ready for reporting; Tagging; Setting event types; The field extractor; The Report Builder; Creating a dashboard; Adding a panel with a search string; Built-in search dashboards; Creating a bar chart; Creating a stacked bar chart; Changing the placement of a legend; Creating an area chart across time; How to make a sparkline panel; Creating a scattergram; Creating a transaction; Radial Gauge; Creating a Marker Gauge; Creating a pivot table; Summary
- Chapter 5: Splunk Applications; What are Splunk applications?; How to find Splunk apps; The wide range of Splunk applications; Apps versus add-ons; Types of apps; Splunk's app environment; Creating a Splunk application; How to install an app; How to manage apps; Splunk's Twitter Application; Installing Splunk's Twitter app; Obtaining a Twitter account; Obtaining a Twitter API Key; Summary
- Chapter 6: Using the Twitter App; Creating a Twitter index; Searching Twitter data; A simple search; Examining the Twitter event; The implied AND; The need to specify OR; Finding other words used; Using a lookup table; The built-in General Activity dashboard; The search code for the dashboard panels; Top Hashtags - last 15 minutes; Top Mentions - last 15 minutes; Time Tweet Zones - 15 minutes; Tweet Stream (First-Time Users) - last 30 seconds; The built-in per-user Activity dashboard; First panel - Users Tweeting about @user (Without Direct RTs or Direct Replies); Second panel - Users Replying to @user; Third panel - Users Retweeting @user; Fourth panel - Users Tweeting about #hashtag; Creating dashboard panels with Twitter data
(source: Nielsen Book Data)
20. Practical data mining [2012]
- Hancock, Monte.
- Boca Raton, FL : CRC Press, 2012.
- Description
- Book — 1 online resource (xxiii, 267 pages) : illustrations
- Summary
-
- What Is Data Mining and What Can It Do? The Data Mining Process. Problem Definition (Step 1). Data Evaluation (Step 2). Feature Extraction and Enhancement (Step 3). Prototyping Plan and Model Development (Step 4). Model Evaluation (Step 5). Implementation (Step 6). Supervised Learning Genre Section 1-Detecting and Characterizing Known Patterns. Forensic Analysis Genre Section 2-Detecting, Characterizing, and Exploiting Hidden Patterns. Genre Section 3-Knowledge: Its Acquisition, Representation, and Use.
- (source: Nielsen Book Data)
- What Is Data Mining and What Can It Do? Introduction A Brief Philosophical Discussion The Most Important Attribute of the Successful Data Miner: Integrity What Does Data Mining Do? What Do We Mean By Data? Data Complexity Computational Complexity Summary
- The Data Mining Process Introduction Discovery and Exploitation Eleven Key Principles of Information Driven Data Mining Key Principles Expanded Type of Models: Descriptive, Predictive, Forensic Data Mining Methodologies A Generic Data Mining Process RAD Skill Set Designators Summary
- Problem Definition (Step 1) Introduction Problem Definition Task 1: Characterize Your Problem Problem Definition Checklist Candidate Solution Checklist Problem Definition Task 2: Characterizing Your Solution Problem Definition Case Study Summary
- Data Evaluation (Step 2) Introduction Data Accessibility Checklist How Much Data Do You Need? Data Staging Methods Used for Data Evaluation Data Evaluation Case Study: Estimating the Information Content Features Some Simple Data Evaluation Methods Data Quality Checklist Summary
- Feature Extraction and Enhancement (Step 3) Introduction: A Quick Tutorial on Feature Space Characterizing and Resolving Data Problems Principal Component Analysis Synthesis of Features Degapping Summary
- Prototyping Plan and Model Development (Step 4) Introduction Step 4A: Prototyping Plan Prototyping Plan Case Study Step 4B: Prototyping/Model Development Model Development Case Study Summary
- Model Evaluation (Step 5) Introduction Evaluation Goals and Methods What Does Accuracy Mean? Summary
- Implementation (Step 6) Introduction Quantifying the Benefits of Data Mining Tutorial on Ensemble Methods Getting It Wrong: Mistakes Every Data Miner Has Made Summary
- Supervised Learning Genre Section 1-Detecting and Characterizing Known Patterns Introduction Representative Example of Supervised Learning: Building a Classifier Specific Challenges, Problems, and Pitfalls of Supervised Learning Recommended Data Mining Architectures for Supervised Learning Descriptive Analysis Predictive Modeling Summary
- Forensic Analysis Genre Section 2-Detecting, Characterizing, and Exploiting Hidden Patterns Introduction Genre Overview Recommended Data Mining Architectures for Unsupervised Learning Examples and Case Studies for Unsupervised Learning Tutorial on Neural Networks Making Syntactic Methods Smarter: The Search Engine Problem Summary
- Genre Section 3-Knowledge: Its Acquisition, Representation, and Use Introduction to Knowledge Engineering Computing with Knowledge Inferring Knowledge from Data: Machine Learning Summary
- References Glossary Index.
- (source: Nielsen Book Data)
Used by corporations, industry, and government to inform and fuel everything from focused advertising to homeland security, data mining can be a very useful tool across a wide range of applications. Unfortunately, most books on the subject are designed for the computer scientist and statistical illuminati and leave the reader largely adrift in technical waters. Revealing the lessons known to the seasoned expert, yet rarely written down for the uninitiated, Practical Data Mining explains the ins and outs of the detection, characterization, and exploitation of actionable patterns in data. This working field manual outlines the what, when, why, and how of data mining and offers an easy-to-follow, six-step spiral process. Catering to IT consultants, professional data analysts, and sophisticated data owners, this systematic, yet informal treatment will help readers answer questions, such as: What process model should I use to plan and execute a data mining project? How is a quantitative business case developed and assessed? What are the skills needed for different data mining projects? How do I track and evaluate data mining projects? How do I choose the best data mining techniques? Helping you avoid common mistakes, the book describes specific genres of data mining practice. Most chapters contain one or more case studies with detailed project descriptions, methods used, challenges encountered, and results obtained. The book includes working checklists for each phase of the data mining process. Your passport to successful technical and planning discussions with management, senior scientists, and customers, these checklists lay out the right questions to ask and the right points to make from an insider's point of view.
(source: Nielsen Book Data)