- Müller, M. E.
- Cambridge : Cambridge University Press, 2012.
- Description
- Book — 1 online resource (280 p.) : digital, PDF file(s).
- Summary
-
- 1. Introduction
- 2. Relational knowledge
- 3. From data to hypotheses
- 4. Clustering
- 5. Information gain
- 6. Knowledge and relations
- 7. Rough set theory
- 8. Inductive logic learning
- 9. Ensemble learning
- 10. The logic of knowledge
- 11. Indexes and bibliography
- Bibliography
- Index.
- (source: Nielsen Book Data)
2. From big data to smart data [2015]
- Iafrate, Fernando, author.
- London : John Wiley and Sons, Inc., 2015.
- Description
- Book — 1 online resource : illustrations.
- Summary
-
- PREFACE. LIST OF FIGURES AND TABLES. INTRODUCTION.
- CHAPTER 1. WHAT IS BIG DATA? 1.1. The four "V"s characterizing Big Data 1.1.1. V for "Volume" 1.1.2. V for "Variety" 1.1.3. V for "Velocity" 1.1.4. V for "Value", associated with Smart Data 1.2. The technology that supports Big Data
- CHAPTER 2. WHAT IS SMART DATA? 2.1. How can we define it? 2.1.1. More formal integration into business processes 2.1.2. A stronger relationship with transaction solutions 2.1.3. The mobility and the temporality of information 2.2. The structural dimension 2.2.1. The objectives of a BICC 2.3. The closed loop between Big Data and Smart Data
- CHAPTER 3. ZERO LATENCY ORGANIZATION 3.1. From Big Data to Smart Data for a zero latency organization 3.2. Three types of latency 3.2.1. Latency linked to data 3.2.2. Latency linked to analytical processes 3.2.3. Latency linked to decision-making processes 3.2.4. Action latency
- CHAPTER 4. SUMMARY BY EXAMPLE 4.1. Example 1: date/product/price recommendation 4.1.1. Steps "1" and "2" 4.1.2. Steps "3" and "4": enter the world of "Smart Data" 4.1.3. Step "5": the presentation phase 4.1.4. Step "6": the "Holy Grail" (the purchase) 4.1.5. Step "7": Smart Data 4.2. Example 2: yield/revenue management (rate controls) 4.2.1. How it works: an explanation based on the Tetris principle (see Figure 4.4) 4.3. Example 3: optimization of operational performance 4.3.1. General department (top management) 4.3.2. Operations departments (middle management) 4.3.3. Operations management (and operational players)
- CONCLUSION. BIBLIOGRAPHY. GLOSSARY. INDEX.
- (source: Nielsen Book Data)
- Bansal, Hanish, author.
- Birmingham, UK : Packt Publishing, [2016].
- Description
- Book — 1 online resource (1 volume) : illustrations
- Summary
-
Easy, hands-on recipes to help you understand Hive and its integration with frameworks that are widely used in today's big data world.
About This Book
- Grasp a complete reference of different Hive topics.
- Get to know the latest recipes in development in Hive, including CRUD operations.
- Understand Hive internals and the integration of Hive with different frameworks used in today's world.
Who This Book Is For
The book is intended for those who want to start with Hive or who have a basic understanding of the Hive framework. Prior knowledge of basic SQL commands is also required.
What You Will Learn
- Learn the different features and offerings of the latest Hive.
- Understand the workings and structure of the Hive internals.
- Get an insight into the latest developments in the Hive framework.
- Grasp the concepts of the Hive data model.
- Master key concepts such as partitions, buckets, and statistics.
- Know how to integrate Hive with other frameworks such as Spark and Accumulo.
In Detail
Hive was developed by Facebook and later open-sourced through the Apache community. Hive provides an SQL-like interface for running queries on Big Data frameworks. Its SQL-like syntax, called HiveQL, includes SQL capabilities such as analytical functions, which are much in demand in today's Big Data world. This book provides easy installation steps for the different types of metastores supported by Hive, along with simple, easy-to-learn recipes for configuring Hive clients and services. You will also learn different Hive optimizations, including partitions and bucketing. The book also explains the source code of the latest Hive version. Hive Query Language is used by other frameworks, including Spark, and towards the end you will cover the integration of Hive with these frameworks.
Style and approach
Starting with the basics and covering the core concepts with practical usage, this book is a complete guide to learning and exploring Hive's offerings.
(source: Nielsen Book Data)
- Larose, Daniel T., author.
- Second edition. - Hoboken, New Jersey : Wiley, [2014]
- Description
- Book — 1 online resource (xviii, 316 pages) : illustrations (some color)
- Summary
-
- PREFACE
- CHAPTER 1 AN INTRODUCTION TO DATA MINING 1.1 What is Data Mining? 1.2 Wanted: Data Miners 1.3 The Need for Human Direction of Data Mining 1.4 The Cross-Industry Standard Practice for Data Mining 1.4.1 CRISP-DM: The Six Phases 1.5 Fallacies of Data Mining 1.6 What Tasks Can Data Mining Accomplish? 1.6.1 Description 1.6.2 Estimation 1.6.3 Prediction 1.6.4 Classification 1.6.5 Clustering 1.6.6 Association. References. Exercises.
- CHAPTER 2 DATA PREPROCESSING 2.1 Why do We Need to Preprocess the Data? 2.2 Data Cleaning 2.3 Handling Missing Data 2.4 Identifying Misclassifications 2.5 Graphical Methods for Identifying Outliers 2.6 Measures of Center and Spread 2.7 Data Transformation 2.8 Min-Max Normalization 2.9 Z-Score Standardization 2.10 Decimal Scaling 2.11 Transformations to Achieve Normality 2.12 Numerical Methods for Identifying Outliers 2.13 Flag Variables 2.14 Transforming Categorical Variables into Numerical Variables 2.15 Binning Numerical Variables 2.16 Reclassifying Categorical Variables 2.17 Adding an Index Field 2.18 Removing Variables that are Not Useful 2.19 Variables that Should Probably Not Be Removed 2.20 Removal of Duplicate Records 2.21 A Word About ID Fields. The R Zone. References. Exercises. Hands-On Analysis.
- CHAPTER 3 EXPLORATORY DATA ANALYSIS 3.1 Hypothesis Testing Versus Exploratory Data Analysis 3.2 Getting to Know the Data Set 3.3 Exploring Categorical Variables 3.4 Exploring Numeric Variables 3.5 Exploring Multivariate Relationships 3.6 Selecting Interesting Subsets of the Data for Further Investigation 3.7 Using EDA to Uncover Anomalous Fields 3.8 Binning Based on Predictive Value 3.9 Deriving New Variables: Flag Variables 3.10 Deriving New Variables: Numerical Variables 3.11 Using EDA to Investigate Correlated Predictor Variables 3.12 Summary. The R Zone. Reference. Exercises. Hands-On Analysis.
- CHAPTER 4 UNIVARIATE STATISTICAL ANALYSIS 4.1 Data Mining Tasks in Discovering Knowledge in Data 4.2 Statistical Approaches to Estimation and Prediction 4.3 Statistical Inference 4.4 How Confident are We in Our Estimates? 4.5 Confidence Interval Estimation of the Mean 4.6 How to Reduce the Margin of Error 4.7 Confidence Interval Estimation of the Proportion 4.8 Hypothesis Testing for the Mean 4.9 Assessing the Strength of Evidence Against the Null Hypothesis 4.10 Using Confidence Intervals to Perform Hypothesis Tests 4.11 Hypothesis Testing for the Proportion. The R Zone. Reference. Exercises.
- CHAPTER 5 MULTIVARIATE STATISTICS 5.1 Two-Sample t-Test for Difference in Means 5.2 Two-Sample Z-Test for Difference in Proportions 5.3 Test for Homogeneity of Proportions 5.4 Chi-Square Test for Goodness of Fit of Multinomial Data 5.5 Analysis of Variance 5.6 Regression Analysis 5.7 Hypothesis Testing in Regression 5.8 Measuring the Quality of a Regression Model 5.9 Dangers of Extrapolation 5.10 Confidence Intervals for the Mean Value of y Given x 5.11 Prediction Intervals for a Randomly Chosen Value of y Given x 5.12 Multiple Regression 5.13 Verifying Model Assumptions. The R Zone. Reference. Exercises. Hands-On Analysis.
- CHAPTER 6 PREPARING TO MODEL THE DATA 6.1 Supervised Versus Unsupervised Methods 6.2 Statistical Methodology and Data Mining Methodology 6.3 Cross-Validation 6.4 Overfitting 6.5 Bias-Variance Trade-Off 6.6 Balancing the Training Data Set 6.7 Establishing Baseline Performance. The R Zone. Reference. Exercises.
- CHAPTER 7 k-NEAREST NEIGHBOR ALGORITHM 7.1 Classification Task 7.2 k-Nearest Neighbor Algorithm 7.3 Distance Function 7.4 Combination Function 7.4.1 Simple Unweighted Voting 7.4.2 Weighted Voting 7.5 Quantifying Attribute Relevance: Stretching the Axes 7.6 Database Considerations 7.7 k-Nearest Neighbor Algorithm for Estimation and Prediction 7.8 Choosing k 7.9 Application of k-Nearest Neighbor Algorithm Using IBM/SPSS Modeler. The R Zone. Exercises. Hands-On Analysis.
- CHAPTER 8 DECISION TREES 8.1 What is a Decision Tree? 8.2 Requirements for Using Decision Trees 8.3 Classification and Regression Trees 8.4 C4.5 Algorithm 8.5 Decision Rules 8.6 Comparison of the C5.0 and CART Algorithms Applied to Real Data. The R Zone. References. Exercises. Hands-On Analysis.
- CHAPTER 9 NEURAL NETWORKS 9.1 Input and Output Encoding 9.2 Neural Networks for Estimation and Prediction 9.3 Simple Example of a Neural Network 9.4 Sigmoid Activation Function 9.5 Back-Propagation 9.5.1 Gradient Descent Method 9.5.2 Back-Propagation Rules 9.5.3 Example of Back-Propagation 9.6 Termination Criteria 9.7 Learning Rate 9.8 Momentum Term 9.9 Sensitivity Analysis 9.10 Application of Neural Network Modeling. The R Zone. References. Exercises. Hands-On Analysis.
- CHAPTER 10 HIERARCHICAL AND k-MEANS CLUSTERING 10.1 The Clustering Task 10.2 Hierarchical Clustering Methods 10.3 Single-Linkage Clustering 10.4 Complete-Linkage Clustering 10.5 k-Means Clustering 10.6 Example of k-Means Clustering at Work 10.7 Behavior of MSB, MSE, and Pseudo-F as the k-Means Algorithm Proceeds 10.8 Application of k-Means Clustering Using SAS Enterprise Miner 10.9 Using Cluster Membership to Predict Churn. The R Zone. References. Exercises. Hands-On Analysis.
- CHAPTER 11 KOHONEN NETWORKS 11.1 Self-Organizing Maps 11.2 Kohonen Networks 11.2.1 Kohonen Networks Algorithm 11.3 Example of a Kohonen Network Study 11.4 Cluster Validity 11.5 Application of Clustering Using Kohonen Networks 11.6 Interpreting the Clusters 11.6.1 Cluster Profiles 11.7 Using Cluster Membership as Input to Downstream Data Mining Models. The R Zone. References. Exercises. Hands-On Analysis.
- CHAPTER 12 ASSOCIATION RULES 12.1 Affinity Analysis and Market Basket Analysis 12.1.1 Data Representation for Market Basket Analysis 12.2 Support, Confidence, Frequent Itemsets, and the A Priori Property 12.3 How Does the A Priori Algorithm Work? 12.3.1 Generating Frequent Itemsets 12.3.2 Generating Association Rules 12.4 Extension from Flag Data to General Categorical Data 12.5 Information-Theoretic Approach: Generalized Rule Induction Method 12.5.1 J-Measure 12.6 Association Rules are Easy to do Badly 12.7 How Can We Measure the Usefulness of Association Rules? 12.8 Do Association Rules Represent Supervised or Unsupervised Learning? 12.9 Local Patterns Versus Global Models. The R Zone. References. Exercises. Hands-On Analysis.
- CHAPTER 13 IMPUTATION OF MISSING DATA 13.1 Need for Imputation of Missing Data 13.2 Imputation of Missing Data: Continuous Variables 13.3 Standard Error of the Imputation 13.4 Imputation of Missing Data: Categorical Variables 13.5 Handling Patterns in Missingness. The R Zone. Reference. Exercises. Hands-On Analysis.
- CHAPTER 14 MODEL EVALUATION TECHNIQUES 14.1 Model Evaluation Techniques for the Description Task 14.2 Model Evaluation Techniques for the Estimation and Prediction Tasks 14.3 Model Evaluation Techniques for the Classification Task 14.4 Error Rate, False Positives, and False Negatives 14.5 Sensitivity and Specificity 14.6 Misclassification Cost Adjustment to Reflect Real-World Concerns 14.7 Decision Cost/Benefit Analysis 14.8 Lift Charts and Gains Charts 14.9 Interweaving Model Evaluation with Model Building 14.10 Confluence of Results: Applying a Suite of Models. The R Zone. Reference. Exercises. Hands-On Analysis.
- APPENDIX: DATA SUMMARIZATION AND VISUALIZATION. INDEX.
- (source: Nielsen Book Data)
- Kantardzic, Mehmed.
- 2nd ed. - Hoboken, NJ : Wiley-IEEE Press, c2011.
- Description
- Book — 1 online resource (xvii, 534 p.)
- Summary
-
- Preface to the Second Edition. Preface to the First Edition.
- 1 DATA-MINING CONCEPTS 1.1 Introduction 1.2 Data-Mining Roots 1.3 Data-Mining Process 1.4 Large Data Sets 1.5 Data Warehouses for Data Mining 1.6 Business Aspects of Data Mining: Why a Data-Mining Project Fails 1.7 Organization of This Book 1.8 Review Questions and Problems 1.9 References for Further Study
- 2 PREPARING THE DATA 2.1 Representation of Raw Data 2.2 Characteristics of Raw Data 2.3 Transformation of Raw Data 2.4 Missing Data 2.5 Time-Dependent Data 2.6 Outlier Analysis 2.7 Review Questions and Problems 2.8 References for Further Study
- 3 DATA REDUCTION 3.1 Dimensions of Large Data Sets 3.2 Feature Reduction 3.3 Relief Algorithm 3.4 Entropy Measure for Ranking Features 3.5 PCA 3.6 Value Reduction 3.7 Feature Discretization: ChiMerge Technique 3.8 Case Reduction 3.9 Review Questions and Problems 3.10 References for Further Study
- 4 LEARNING FROM DATA 4.1 Learning Machine 4.2 SLT 4.3 Types of Learning Methods 4.4 Common Learning Tasks 4.5 SVMs 4.6 kNN: Nearest Neighbor Classifier 4.7 Model Selection versus Generalization 4.8 Model Estimation 4.9 90% Accuracy: Now What? 4.10 Review Questions and Problems 4.11 References for Further Study
- 5 STATISTICAL METHODS 5.1 Statistical Inference 5.2 Assessing Differences in Data Sets 5.3 Bayesian Inference 5.4 Predictive Regression 5.5 ANOVA 5.6 Logistic Regression 5.7 Log-Linear Models 5.8 LDA 5.9 Review Questions and Problems 5.10 References for Further Study
- 6 DECISION TREES AND DECISION RULES 6.1 Decision Trees 6.2 C4.5 Algorithm: Generating a Decision Tree 6.3 Unknown Attribute Values 6.4 Pruning Decision Trees 6.5 C4.5 Algorithm: Generating Decision Rules 6.6 CART Algorithm & Gini Index 6.7 Limitations of Decision Trees and Decision Rules 6.8 Review Questions and Problems 6.9 References for Further Study
- 7 ARTIFICIAL NEURAL NETWORKS 7.1 Model of an Artificial Neuron 7.2 Architectures of ANNs 7.3 Learning Process 7.4 Learning Tasks Using ANNs 7.5 Multilayer Perceptrons (MLPs) 7.6 Competitive Networks and Competitive Learning 7.7 SOMs 7.8 Review Questions and Problems 7.9 References for Further Study
- 8 ENSEMBLE LEARNING 8.1 Ensemble-Learning Methodologies 8.2 Combination Schemes for Multiple Learners 8.3 Bagging and Boosting 8.4 AdaBoost 8.5 Review Questions and Problems 8.6 References for Further Study
- 9 CLUSTER ANALYSIS 9.1 Clustering Concepts 9.2 Similarity Measures 9.3 Agglomerative Hierarchical Clustering 9.4 Partitional Clustering 9.5 Incremental Clustering 9.6 DBSCAN Algorithm 9.7 BIRCH Algorithm 9.8 Clustering Validation 9.9 Review Questions and Problems 9.10 References for Further Study
- 10 ASSOCIATION RULES 10.1 Market-Basket Analysis 10.2 Algorithm Apriori 10.3 From Frequent Itemsets to Association Rules 10.4 Improving the Efficiency of the Apriori Algorithm 10.5 FP Growth Method 10.6 Associative-Classification Method 10.7 Multidimensional Association-Rules Mining 10.8 Review Questions and Problems 10.9 References for Further Study
- 11 WEB MINING AND TEXT MINING 11.1 Web Mining 11.2 Web Content, Structure, and Usage Mining 11.3 HITS and LOGSOM Algorithms 11.4 Mining Path-Traversal Patterns 11.5 PageRank Algorithm 11.6 Text Mining 11.7 Latent Semantic Analysis (LSA) 11.8 Review Questions and Problems 11.9 References for Further Study
- 12 ADVANCES IN DATA MINING 12.1 Graph Mining 12.2 Temporal Data Mining 12.3 Spatial Data Mining (SDM) 12.4 Distributed Data Mining (DDM) 12.5 Correlation Does Not Imply Causality 12.6 Privacy, Security, and Legal Aspects of Data Mining 12.7 Review Questions and Problems 12.8 References for Further Study
- 13 GENETIC ALGORITHMS 13.1 Fundamentals of GAs 13.2 Optimization Using GAs 13.3 A Simple Illustration of a GA 13.4 Schemata 13.5 TSP 13.6 Machine Learning Using GAs 13.7 GAs for Clustering 13.8 Review Questions and Problems 13.9 References for Further Study
- 14 FUZZY SETS AND FUZZY LOGIC 14.1 Fuzzy Sets 14.2 Fuzzy-Set Operations 14.3 Extension Principle and Fuzzy Relations 14.4 Fuzzy Logic and Fuzzy Inference Systems 14.5 Multifactorial Evaluation 14.6 Extracting Fuzzy Models from Data 14.7 Data Mining and Fuzzy Sets 14.8 Review Questions and Problems 14.9 References for Further Study
- 15 VISUALIZATION METHODS 15.1 Perception and Visualization 15.2 Scientific Visualization and Information Visualization 15.3 Parallel Coordinates 15.4 Radial Visualization 15.5 Visualization Using Self-Organizing Maps (SOMs) 15.6 Visualization Systems for Data Mining 15.7 Review Questions and Problems 15.8 References for Further Study
- Appendix A: A.1 Data-Mining Journals A.2 Data-Mining Conferences A.3 Data-Mining Forums/Blogs A.4 Data Sets A.5 Commercially and Publicly Available Tools A.6 Web Site Links
- Appendix B: Data-Mining Applications B.1 Data Mining for Financial Data Analysis B.2 Data Mining for the Telecommunications Industry B.3 Data Mining for the Retail Industry B.4 Data Mining in Health Care and Biomedical Research B.5 Data Mining in Science and Engineering B.6 Pitfalls of Data Mining
- Bibliography. Index.
- (source: Nielsen Book Data)
- Krishnan, Krish.
- Amsterdam : Morgan Kaufmann, an imprint of Elsevier, 2013.
- Description
- Book — 1 online resource.
- Summary
-
- Part 1 - Big Data
- Chapter 1 - Introduction to Big Data
- Chapter 2 - Complexity of Big Data
- Chapter 3 - Big Data Processing Architectures
- Chapter 4 - Big Data Technologies
- Chapter 5 - Big Data Business Value
- Part 2 - The Data Warehouse
- Chapter 6 - Data Warehouse
- Chapter 7 - Re-Engineering the Data Warehouse
- Chapter 8 - Workload Management in the Data Warehouse
- Chapter 9 - New Technology Approaches
- Part 3 - Extending Big Data into the Data Warehouse
- Chapter 10 - Integration of Big Data and Data Warehouse
- Chapter 11 - Data Driven Architecture
- Chapter 12 - Information Management and Lifecycle
- Chapter 13 - Big Data Analytics, Visualization and Data Scientist
- Chapter 14 - Implementing The "Big Data" Data Warehouse
- Appendix A - Customer Case Studies From Vendors
- Appendix B - Building The HealthCare Information Factory.
- (source: Nielsen Book Data)
- Els, Anton (Oracle Database certified master)
- New York : McGraw-Hill Education, [2017]
- Description
- Book — 1 online resource
- Summary
-
- Part 1: An Introduction to Multitenant
- Chapter 1: Introduction to Multitenant
- Chapter 2: Creating the Database
- Chapter 3: Single-Tenant, Multitenant, and Application Container
- Part 2: Multitenant Administration
- Chapter 4: Day-to-Day Management
- Chapter 5: Networking and Services
- Chapter 6: Security
- Part 3: Backup, Recovery, and Database Movement
- Chapter 7: Backup and Recovery
- Chapter 8: Flashback and Point-in-Time Recovery
- Chapter 9: Moving Data
- Part 4: Advanced Multitenant
- Chapter 10: Resource Manager
- Chapter 11: Data Guard
- Chapter 12: Sharing Data Across PDBs
- Chapter 13: Logical Replication.
- (source: Nielsen Book Data)
- Geller, Arie, author.
- New York : McGraw-Hill Education, 2017.
- Description
- Book — 1 online resource Digital: text file.
- Summary
-
- Chapter 1: Introduction to APEX
- Chapter 2: Getting Ready
- Chapter 3: APEX IDE: Quick Tour & Basic Concepts
- Chapter 4: APEX Applications - Concepts and Building Blocks
- Chapter 5: The Page Designer
- Chapter 6: APEX Wizards
- Chapter 7: Computations, Validations and Processes
- Chapter 8: Crafting a Powerful UI
- Chapter 9: Dynamic Actions
- Chapter 10: APEX Security
- Chapter 11: Packaging and Deployment.
- (source: Nielsen Book Data)
- Fritz, Mike, author.
- Amsterdam ; Boston : Morgan Kaufmann, an imprint of Elsevier, c2015.
- Description
- Book — 1 online resource : ill.
- Summary
-
- Chapter 1: The Changing World of UX
- Chapter 2: Data to the Rescue
- Chapter 3: Stats! It's Easier Than You Think
- Chapter 4: Unmoderated Remote Usability Tests
- Chapter 5: Surveys
- Chapter 6: Good Old Usability Tests
- Chapter 7: Persona Development
- Chapter 8: Field Studies
- Chapter 9: Live Website Data
- Chapter 10: Card Sorting Data
- Chapter 11: Case Studies - Tips from the Real World.
- (source: Nielsen Book Data)
11. Data mining : concepts and techniques [2012]
- Han, Jiawei.
- 3rd ed. - Amsterdam ; Boston : Elsevier/Morgan Kaufmann, ©2012.
- Description
- Book — 1 online resource (xxxv, 703 pages) : illustrations. Digital: text file.
- Summary
-
- 1. Introduction
- 2. Getting to Know Your Data
- 3. Preprocessing: Data Reduction, Transformation, and Integration
- 4. Data Warehousing and On-Line Analytical Processing
- 5. Data Cube Technology
- 6. Mining Frequent Patterns, Associations and Correlations: Concepts and Methods
- 7. Advanced Frequent Pattern Mining
- 8. Classification: Basic Concepts
- 9. Classification: Advanced Methods
- 10. Cluster Analysis: Basic Concepts and Methods
- 11. Cluster Analysis: Advanced Methods
- 12. Outlier Analysis
- 13. Trends and Research Frontiers in Data Mining.
- (source: Nielsen Book Data)
- Martinez, Wendy L., author.
- Third edition. - Boca Raton, FL : Chapman and Hall/CRC, an imprint of Taylor and Francis, 2015.
- Description
- Book — 1 online resource (759 pages) : 240 illustrations.
- Summary
-
- Introduction: What Is Computational Statistics? An Overview of the Book.
- Probability Concepts: Introduction. Probability. Conditional Probability and Independence. Expectation. Common Distributions.
- Sampling Concepts: Introduction. Sampling Terminology and Concepts. Sampling Distributions. Parameter Estimation. Empirical Distribution Function.
- Generating Random Variables: Introduction. General Techniques for Generating Random Variables. Generating Continuous Random Variables. Generating Discrete Random Variables.
- Exploratory Data Analysis: Introduction. Exploring Univariate Data. Exploring Bivariate and Trivariate Data. Exploring Multidimensional Data.
- Finding Structure: Introduction. Projecting Data. Principal Component Analysis. Projection Pursuit EDA. Independent Component Analysis. Grand Tour. Nonlinear Dimensionality Reduction.
- Monte Carlo Methods for Inferential Statistics: Introduction. Classical Inferential Statistics. Monte Carlo Methods for Inferential Statistics. Bootstrap Methods.
- Data Partitioning: Introduction. Cross-Validation. Jackknife. Better Bootstrap Confidence Intervals. Jackknife-after-Bootstrap.
- Probability Density Estimation: Introduction. Histograms. Kernel Density Estimation. Finite Mixtures. Generating Random Variables.
- Supervised Learning: Introduction. Bayes' Decision Theory. Evaluating the Classifier. Classification Trees. Combining Classifiers. Nearest Neighbor Classifier. Support Vector Machines.
- Unsupervised Learning: Introduction. Measures of Distance. Hierarchical Clustering. K-Means Clustering. Model-Based Clustering. Assessing Cluster Results.
- Parametric Models: Introduction. Spline Regression Models. Logistic Regression. Generalized Linear Models. Model Selection and Regularization. Partial Least Squares Regression.
- Nonparametric Models: Introduction. Some Smoothing Methods. Kernel Methods. Smoothing Splines. Nonparametric Regression-Other Details. Regression Trees. Additive Models. Multivariate Adaptive Regression Splines.
- Markov Chain Monte Carlo Methods: Introduction. Background. Metropolis-Hastings Algorithms. The Gibbs Sampler. Convergence Monitoring.
- Appendix A: MATLAB® Basics. Appendix B: Projection Pursuit Indexes. Appendix C: Data Sets. Appendix D: Notation.
- References
- Index
- MATLAB® Code, Further Reading, and Exercises appear at the end of each chapter.
- (source: Nielsen Book Data)
A Strong Practical Focus on Applications and Algorithms. Computational Statistics Handbook with MATLAB, Third Edition covers today's most commonly used techniques in computational statistics while maintaining the same philosophy and writing style of the bestselling previous editions. The text keeps theoretical concepts to a minimum, emphasizing the i.
(source: Nielsen Book Data)
- Song, Guo (Computer scientist), author.
- Cambridge, United Kingdom ; New York, NY : Cambridge University Press, 2022.
- Description
- Book — 1 online resource (x, 217 pages) : illustrations
- Summary
-
- 1. Introduction
- 2. Preliminary
- 3. Fundamental Theory and Algorithms of Edge Learning
- 4. Communication-Efficient Edge Learning
- 5. Computation Acceleration
- 6. Efficient Training with Heterogeneous Data Distribution
- 7. Security and Privacy Issues in Edge Learning Systems
- 8. Edge Learning Architecture Design for System Scalability
- 9. Incentive Mechanisms in Edge Learning Systems
- 10. Edge Learning Applications.
- (source: Nielsen Book Data)
- Chao, Lee, author.
- First edition. - Boca Raton, FL : Auerbach Publications, an imprint of Taylor and Francis, 2015.
- Description
- Book — 1 online resource (527 pages) : 607 illustrations
- Summary
-
- Overview on Cloud and Networking. Network Protocols. Network Concepts and Design. Network Directory Services. Dynamic Host Service and Name Service. Networking with Windows PowerShell. Internet Data Transaction Protection. Internet Protocol Security. Routing and Remote Access Service. Virtual Private Network. Hybrid Cloud.
- (source: Nielsen Book Data)
- Havewala, Porus Homi, author.
- New York : McGraw-Hill Education, [2017]
- Description
- Book — 1 online resource
- Summary
-
- Cover
- Title Page
- Copyright Page
- Dedication
- Contents at a Glance
- Contents
- Acknowledgments
- Introduction
- 1 Consolidation Planning for the Cloud
- Introducing Oracle Cloud Computing
- Consolidating to Physical Servers
- Running the Host Consolidation Planner
- Creating a Custom Consolidation Scenario for P2P
- Consolidating to Oracle Public Cloud Servers
- Consolidating to Virtual Machines
- Updating the Benchmark Rates
- Database Consolidation Workbench
- Creating a D2S Project and Scenario
- Creating a D2D Project and Scenario
- Implementing the Consolidation
- Self-Service: Request and Create a Snap Clone (CloneDB) Database Service
- Summary
- 3 Schema as a Service
- Create a Schema Pool
- Dealing with Quotas
- Creating a Data Profile
- Creating a Service Template for the HR Schema
- Creating a Service Template for User-Defined Schemas
- Request Settings
- Configuring Chargeback
- Self Service: Request and Create a Schema Service
- Self Service: Request and Create User-Defined Schemas
- Summary
- 4 Pluggable Database as a Service
- Creating a Pluggable Database Pool
- Setting Up Quotas
- Creating a Data Profile for PDB
- Creating a Service Template from a PDB Profile
- Creating a Service Template for an Empty PDB
- Setting Up Request Settings
- Configuring Chargeback
- Self Service: Request and Create a PDB Service for SALES PDB
- Self Service: Request and Create a PDB Service for Empty PDB
- Viewing Chargeback Results
- Administering the Cloud
- Summary
- 5 Hybrid Database Cloud
- Presetup for Hybrid Cloud
- Step 1: Register the Agent
- Step 2: Generate SSH Keys
- Step 3: Create a Named Credential
- Step 4: Changing SELINUX to Permissive
- Other Requirements
- Testing the Hybrid Cloud
- Installing the Cloud Agent
- Discovering the Database and Listener on the Cloud Server
- Comparing On-Premise and Cloud Database Configurations
- Cloning PDBs from On-Premise to Cloud
- Preserving the Agent Home When Patching the Cloud Database
- Cloning to Oracle Cloud
- Masking Results
- Verifying the APEX Version of CDBs
- Cloning from Oracle Cloud
- Summary
- 6 Using the Cloud REST API
- Installing the REST Client
- Presteps
- Step 1: Viewing Details
- Step 2: Creating a Database
- Checking the Progress of Database Creation via the Enterprise Manager API
- Boston, Massachusetts : Harvard Business Review Press, [2019]
- Description
- Book — 181 pages : illustrations ; 21 cm.
- Summary
-
- Artificial intelligence for the real world / by Thomas H. Davenport and Rajeev Ronanki
- Stitch Fix's CEO on selling personal style to the mass market / by Katrina Lake
- Algorithms need managers too / by Michael Luca, Jon Kleinberg, Sendhil Mullainathan
- Marketing in the age of Alexa / by Niraj Dawar
- Why every company needs an augmented reality strategy / by Michael Porter
- Drones go to work / by Chris Anderson
- The truth about blockchain / by Marco Iansiti and Karim R. Lakhani
- The 3-D printing playbook / by Richard D'Aveni
- Collaborative intelligence: humans and AI are joining forces / by H. James Wilson and Paul Daugherty
- When your boss wears metal pants / by Walter Frick
- Managing our hub economy / by Marco Iansiti and Karim R. Lakhani.
(source: Nielsen Book Data)
- Online
Business Library
Business Library | Status |
---|---|
Stacks | |
TA347.A78 H456 2019 | Missing |
17. Warranty fraud management : reducing fraud and other excess costs in warranty and service operations [2016]
- Kurvinen, Matti, 1961- author.
- Hoboken : Wiley, 2016.
- Description
- Book — 1 online resource.
- Summary
-
- Foreword; Preface; Acknowledgments; About the Authors
- Chapter 1: Overview. Warranties; Warranty Servicing; Warranty Costs; Warranty Fraud; Impact of Warranty Fraud; Warranty Fraud Management; Study of Warranty; Goals of the Book; Structure of the Book; Note
- Chapter 2: Products and Product Warranty. Products; Product Performance, Failure, and Reliability; Product Maintenance; Product Warranty; Maintenance Service Contracts; Insurances; Notes
- Chapter 3: Warranty Servicing. Parties in the Warranty Service Network; Warranty Service Process; Outsourcing of Warranty Service; Contracts; Notes
- Chapter 4: Warranty Costs. Different Perspectives; Factors Underlying Warranty Costs; Warranty Cost Metrics; Warranty Reserves and Accruals; Warranty Cost Control; Notes
- Chapter 5: Warranty Management. Evolution of Warranty Management; Service Life-Cycle Perspective; Product Life-Cycle Perspective; Organizational Structure; Warranty Management Systems; Warranty Management Maturity Models; Notes
- Chapter 6: Warranty Fraud. Fraud in General; Actors and Victims of Warranty Fraud; Classification of Warranty Fraud; Fraud Patterns; Consequences and Impacts of Warranty Fraud; Customer Fraud; Service Agent Fraud; Sales Channel Fraud; Warranty Administrator Fraud; Warranty Provider Fraud; Notes
- Chapter 7: Warranty Control Framework. Contracts; Transaction Controls; Analytics; Service Network Management
- Chapter 8: Customer Fraud Management. Customer Contract; Customer Entitlement; Material Returns Control; Analytics; Notes
- Chapter 9: Service Agent Fraud Management. Service Agent Contract; Entitlement and Repair Authorization Processes; Claim Validation Process; Analytics; Material Returns Control; Service Network Management; Notes
- Chapter 10: Fraud Management with Other Parties. Sales Channel Fraud Management; Warranty Administrator Fraud Management; Warranty Provider Fraud Management
- Chapter 11: Structures Influencing Warranty Fraud. Effective Service Process; Service Organization; Notes
- Chapter 12: Implementing a Warranty Control Framework. Assessing the Current Situation; Crafting an Improvement Plan; Defining Policies and Rules; Building the Capabilities; Deploying the Change; Business Case Considerations; Implementation Challenges; Achieving the Transformation
- Chapter 13: Epilogue. Opportunities to Improve Warranty Control; New Research into Warranty Fraud
- Appendix A: Detailed Claim Data; Appendix B: Agency Theory; Appendix C: Game Theory; Glossary; Acronyms; References; Index.
- (source: Nielsen Book Data)
18. Intelligent Modeling, Prediction, and Diagnosis from Epidemiological Data : COVID-19 and Beyond [2021]
- First edition - [Place of publication not identified] : Chapman and Hall/CRC, 2021
- Description
- Book — 1 online resource (xviii, 214 pages)
- Summary
-
- 1. Human Immune System and Infectious Disease / Faruk Bin Poyen
- 2. A Systematic Review of Predictive Models on COVID-19 with a Special Focus on CARD Modeling with SEI Formulation: An Indian Scenario / Sougata Mazumder, Debjit Majumder, and Prasun Ghosal
- 3. Data Analytics to Assess the Outbreak of the Coronavirus Epidemic: Opportunities and Challenges / Mourade Azrour and Jamal Mabrouki
- 4. Leveraging Artificial Intelligence (AI) during the Coronavirus Pandemic: Applications and Challenges / Prabha Susy Mathew, Anitha S. Pillai, and Bindu Menon
- 5. Early Prediction of Coronavirus Epidemic Outbreak Using Stacked Long Short-Term Memory Networks / Debanjan Konar, Siddhartha Bhattacharyya, Sourav De, Aparajita Das, Jan Platos, Sergey V. Gorbachev, and Khan Muhammad
- 6. Use of Satellite Sensors to Diagnose Changes in Air Quality in Africa Before and During the COVID-19 Pandemic / Loubna Bouhachlaf, Jamal Mabrouki, Fatimazahra Mousli, Souad El Hajjaji, and Driss Dhiba
- 7. Public Sentiments Analysis through Tweets on the COVID-19 Pandemic: A Comparative Study and Performance Assessment / Bhagwati Prasad Pande and Koushal Kumar
- 8. Exploring Twitter Data to Understand the Impact of COVID-19 Pandemic in India Using NLP and Deep Learning / Rahul Deb Das, Ananda Sankar Pal, Madorina Paul, and Anjan Mandal
- 9. Novel Coronavirus (COVID-19): Tracking, Health Care Precautions, Alerts, and Early Warnings / Anupam Mondal and Naba Kumar Mondal
- 10. Edge Computing-Based Smart Healthcare System for Home Monitoring of Quarantine Patients: Security Threat and Sustainability Aspects / Biswajit Debnath, Adrija Das, Ankita Das, Rohit Roy Chowdhury, Saswati Gharami, and Abhijit Das
- Larson, Brian, author.
- Fourth edition. - New York : McGraw-Hill Education, [2017]
- Description
- Book — 1 online resource
- Summary
-
- Part 1: Business Intelligence
- Chapter 1: Equipping the Organization for Effective Decision Making
- Chapter 2: Making the Most of What You've Got--Using Business Intelligence
- Chapter 3: Seeking for the Source--The Source of Business Intelligence
- Chapter 4: Two, Two, Two Model in One--The BI Semantic Model
- Chapter 5: First Steps--Beginning the Development of Business Intelligence
- Part 2: Defining Business Intelligence Structures
- Chapter 6: Building Foundations--Creating the Data Marts
- Chapter 7: Transformers--Integration Services Structure and Components
- Chapter 8: Fill'er Up--Using Integration Services for Populating Data Marts
- Part 3: Working with a Tabular BI Semantic Model
- Chapter 9: Setting the Table--Creating a Tabular BI Semantic Model
- Chapter 10: A Fancy Table--Tabular BI Semantic Model Advanced Features
- Part 4: Working with a Multidimensional BI Semantic Model
- Chapter 11: Cubism--Measures and Dimensions
- Chapter 12: Bells and Whistles--Special Features of OLAP Cubes
- Chapter 13: Writing a New Script--MDX Scripting
- Chapter 14: Pulling It Out and Building It Up--MDX Queries
- Part 5: Modeling and Visualization with Power BI
- Chapter 15: Power BI Modeling
- Chapter 16: Power BI Reporting
- Part 6: Delivering
- Chapter 17: Special Delivery--Microsoft Business Intelligence Client Tools
- Chapter 18: Let's Get Together--Integrating Business Intelligence with Your Applications.
- (source: Nielsen Book Data)
- Hoberman, Steve, 1968-
- 1st ed. - Basking Ridge, NJ : Technics Pub., 2014.
- Description
- Book — 1 online resource (1 volume) : illustrations
- Summary
-
Annotation: Congratulations! You completed the MongoDB application within the given tight timeframe and there is a party to celebrate your application's release into production. Although people are congratulating you at the celebration, you are feeling some uneasiness inside. Completing the project on time required making a lot of assumptions about the data, such as what terms meant and how calculations are derived. In addition, the poor documentation about the application will be of limited use to the support team, and not investigating all of the inherent rules in the data may lead to poorly performing structures in the not-so-distant future. Now, what if you had a time machine and could go back and read this book? You would learn that even NoSQL databases like MongoDB require some level of data modeling. Data modeling is the process of learning about the data, and regardless of technology, this process must be performed for a successful application. You would learn the value of conceptual, logical, and physical data modeling and how each stage increases our knowledge of the data and reduces assumptions and poor design decisions.