 Sugiyama, Masashi, 1974- author.
 Boca Raton, Florida : CRC Press, [2015]
 Description
 Book — 1 online resource
 Summary

 1. Introduction
 2. Model-free policy iteration
 3. Model-free policy search
 4. Model-based reinforcement learning
 Sugiyama, Masashi, 1974- author.
 Boca Raton, FL : CRC Press, [2015]
 Description
 Book — 1 online resource (xiii, 189 pages) : illustrations
 Summary

 Introduction to Reinforcement Learning. Model-Free Policy Iteration. Policy Iteration with Value Function Approximation. Basis Design for Value Function Approximation. Sample Reuse in Policy Iteration. Active Learning in Policy Iteration. Robust Policy Iteration. Model-Free Policy Search. Direct Policy Search by Gradient Ascent. Direct Policy Search by Expectation-Maximization. Policy-Prior Search. Model-Based Reinforcement Learning. Transition Model Estimation. Dimensionality Reduction for Transition Model Estimation.
 (source: Nielsen Book Data)
 Introduction to Reinforcement Learning: Reinforcement Learning; Mathematical Formulation; Structure of the Book (Model-Free Policy Iteration; Model-Free Policy Search; Model-Based Reinforcement Learning)
 MODEL-FREE POLICY ITERATION
 Policy Iteration with Value Function Approximation: Value Functions (State Value Functions; State-Action Value Functions); Least-Squares Policy Iteration (Immediate-Reward Regression; Algorithm; Regularization; Model Selection); Remarks
 Basis Design for Value Function Approximation: Gaussian Kernels on Graphs (MDP-Induced Graph; Ordinary Gaussian Kernels; Geodesic Gaussian Kernels; Extension to Continuous State Spaces); Illustration (Setup; Geodesic Gaussian Kernels; Ordinary Gaussian Kernels; Graph-Laplacian Eigenbases; Diffusion Wavelets); Numerical Examples (Robot-Arm Control; Robot-Agent Navigation); Remarks
 Sample Reuse in Policy Iteration: Formulation; Off-Policy Value Function Approximation (Episodic Importance Weighting; Per-Decision Importance Weighting; Adaptive Per-Decision Importance Weighting; Illustration); Automatic Selection of Flattening Parameter (Importance-Weighted Cross-Validation; Illustration); Sample-Reuse Policy Iteration (Algorithm; Illustration); Numerical Examples (Inverted Pendulum; Mountain Car); Remarks
 Active Learning in Policy Iteration: Efficient Exploration with Active Learning (Problem Setup; Decomposition of Generalization Error; Estimation of Generalization Error; Designing Sampling Policies; Illustration); Active Policy Iteration (Sample-Reuse Policy Iteration with Active Learning; Illustration); Numerical Examples; Remarks
 Robust Policy Iteration: Robustness and Reliability in Policy Iteration (Robustness; Reliability); Least Absolute Policy Iteration (Algorithm; Illustration; Properties); Numerical Examples; Possible Extensions (Huber Loss; Pinball Loss; Deadzone-Linear Loss; Chebyshev Approximation; Conditional Value-At-Risk); Remarks
 MODEL-FREE POLICY SEARCH
 Direct Policy Search by Gradient Ascent: Formulation; Gradient Approach (Gradient Ascent; Baseline Subtraction for Variance Reduction; Variance Analysis of Gradient Estimators); Natural Gradient Approach (Natural Gradient Ascent; Illustration); Application in Computer Graphics: Artist Agent (Sumie Painting; Design of States, Actions, and Immediate Rewards; Experimental Results); Remarks
 Direct Policy Search by Expectation-Maximization: Expectation-Maximization Approach; Sample Reuse (Episodic Importance Weighting; Per-Decision Importance Weight; Adaptive Per-Decision Importance Weighting; Automatic Selection of Flattening Parameter; Reward-Weighted Regression with Sample Reuse); Numerical Examples; Remarks
 Policy-Prior Search: Formulation; Policy Gradients with Parameter-Based Exploration (Policy-Prior Gradient Ascent; Baseline Subtraction for Variance Reduction; Variance Analysis of Gradient Estimators; Numerical Examples); Sample Reuse in Policy-Prior Search (Importance Weighting; Variance Reduction by Baseline Subtraction; Numerical Examples); Remarks
 MODEL-BASED REINFORCEMENT LEARNING
 Transition Model Estimation: Conditional Density Estimation (Regression-Based Approach; ε-Neighbor Kernel Density Estimation; Least-Squares Conditional Density Estimation); Model-Based Reinforcement Learning; Numerical Examples (Continuous Chain Walk; Humanoid Robot Control); Remarks
 Dimensionality Reduction for Transition Model Estimation: Sufficient Dimensionality Reduction; Squared-Loss Conditional Entropy (Conditional Independence; Dimensionality Reduction with SCE; Relation to Squared-Loss Mutual Information); Numerical Examples (Artificial and Benchmark Datasets; Humanoid Robot); Remarks
 References
 Index.
 (source: Nielsen Book Data)
Reinforcement learning is a mathematical framework for developing computer agents that can learn an optimal behavior by relating generic reward signals with their past actions. With numerous successful applications in business intelligence, plant control, and gaming, the RL framework is ideal for decision making in unknown environments with large amounts of data. Supplying an up-to-date and accessible introduction to the field, Statistical Reinforcement Learning: Modern Machine Learning Approaches presents fundamental concepts and practical algorithms of statistical reinforcement learning from the modern machine learning viewpoint. It covers various types of RL approaches, including model-based and model-free approaches, policy iteration, and policy search methods. It covers the range of reinforcement learning algorithms from a modern perspective, lays out the associated optimization problems for each reinforcement learning scenario covered, and provides a thought-provoking statistical treatment of reinforcement learning algorithms. The book covers approaches recently introduced in the data mining and machine learning fields to provide a systematic bridge between RL and data mining/machine learning researchers. It presents state-of-the-art results, including dimensionality reduction in RL and risk-sensitive RL. Numerous illustrative examples are included to help readers understand the intuition and usefulness of reinforcement learning techniques. This book is an ideal resource for graduate-level students in computer science and applied statistics programs, as well as researchers and engineers in related fields.
(source: Nielsen Book Data)
 Sugiyama, Masashi, 1974- author.
 Boca Raton, Florida : CRC Press, [2015]
 Description
 Book — 1 online resource : text file, PDF.
 Sugiyama, Masashi, 1974-
 New York : Cambridge University Press, 2012.
 Description
 Book — 1 online resource (xii, 329 pages) : illustrations
 Summary

 Part I. Density Ratio Approach to Machine Learning: 1. Introduction
 Part II. Methods of Density Ratio Estimation: 2. Density estimation
 3. Moment matching
 4. Probabilistic classification
 5. Density fitting
 6. Density-ratio fitting
 7. Unified framework
 8. Direct density-ratio estimation with dimensionality reduction
 Part III. Applications of Density Ratios in Machine Learning: 9. Importance sampling
 10. Distribution comparison
 11. Mutual information estimation
 12. Conditional probability estimation
 Part IV. Theoretical Analysis of Density Ratio Estimation: 13. Parametric convergence analysis
 14. Nonparametric convergence analysis
 15. Parametric two-sample test
 16. Nonparametric numerical stability analysis
 Part V. Conclusions: 17. Conclusions and future directions.
 (source: Nielsen Book Data)
 Sugiyama, Masashi, 1974-
 Cambridge, Mass. : MIT Press, ©2012.
 Description
 Book — 1 online resource (xiv, 261 pages) : illustrations.
 Summary

Theory, algorithms, and applications of machine learning techniques to overcome "covariate shift" nonstationarity. As the power of computing has grown over the past few decades, the field of machine learning has advanced rapidly in both theory and practice. Machine learning methods are usually based on the assumption that the data generation mechanism does not change over time. Yet real-world applications of machine learning, including image recognition, natural language processing, speech recognition, robot control, and bioinformatics, often violate this common assumption. Dealing with nonstationarity is one of modern machine learning's greatest challenges. This book focuses on a specific nonstationary environment known as covariate shift, in which the distributions of inputs (queries) change but the conditional distribution of outputs (answers) is unchanged, and presents machine learning theory, algorithms, and applications to overcome this variety of nonstationarity. After reviewing the state-of-the-art research in the field, the authors discuss topics that include learning under covariate shift, model selection, importance estimation, and active learning. They describe such real-world applications of covariate shift adaptation as brain-computer interface, speaker identification, and age prediction from facial images. With this book, they aim to encourage future research in machine learning, statistics, and engineering that strives to create truly autonomous learning machines able to learn under nonstationarity.
(source: Nielsen Book Data)
7. Machine learning in nonstationary environments : introduction to covariate shift adaptation [2012]
 Sugiyama, Masashi, 1974-
 Cambridge, Mass. : MIT Press, ©2012.
 Description
 Book — 1 online resource (xiv, 261 pages) : illustrations Digital: data file.
 Summary

 Foreword; Preface; I INTRODUCTION; 1 Introduction and Problem Formulation; 1.1 Machine Learning under Covariate Shift; 1.2 Quick Tour of Covariate Shift Adaptation; 1.3 Problem Formulation; 1.4 Structure of This Book; II LEARNING UNDER COVARIATE SHIFT; 2 Function Approximation; 2.1 Importance-Weighting Techniques for Covariate Shift Adaptation; 2.2 Examples of Importance-Weighted Regression Methods; 2.3 Examples of Importance-Weighted Classification Methods; 2.4 Numerical Examples; 2.5 Summary and Discussion; 3 Model Selection; 3.1 Importance-Weighted Akaike Information Criterion;
 3.2 Importance-Weighted Subspace Information Criterion; 3.3 Importance-Weighted Cross-Validation; 3.4 Numerical Examples; 3.5 Summary and Discussion; 4 Importance Estimation; 4.1 Kernel Density Estimation; 4.2 Kernel Mean Matching; 4.3 Logistic Regression; 4.4 Kullback-Leibler Importance Estimation Procedure; 4.5 Least-Squares Importance Fitting; 4.6 Unconstrained Least-Squares Importance Fitting; 4.7 Numerical Examples; 4.8 Experimental Comparison; 4.9 Summary; 5 Direct Density-Ratio Estimation with Dimensionality Reduction; 5.1 Density Difference in Hetero-Distributional Subspace;
 5.2 Characterization of Hetero-Distributional Subspace; 5.3 Identifying Hetero-Distributional Subspace by Supervised Dimensionality Reduction; 5.4 Using LFDA for Finding Hetero-Distributional Subspace; 5.5 Density-Ratio Estimation in the Hetero-Distributional Subspace; 5.6 Numerical Examples; 5.7 Summary; 6 Relation to Sample Selection Bias; 6.1 Heckman's Sample Selection Model; 6.2 Distributional Change and Sample Selection Bias; 6.3 The Two-Step Algorithm; 6.4 Relation to Covariate Shift Approach; 7 Applications of Covariate Shift Adaptation; 7.1 Brain-Computer Interface;
 7.2 Speaker Identification; 7.3 Natural Language Processing; 7.4 Perceived Age Prediction from Face Images; 7.5 Human Activity Recognition from Accelerometric Data; 7.6 Sample Reuse in Reinforcement Learning; III LEARNING CAUSING COVARIATE SHIFT; 8 Active Learning; 8.1 Preliminaries; 8.2 Population-Based Active Learning Methods; 8.3 Numerical Examples of Population-Based Active Learning Methods; 8.4 Pool-Based Active Learning Methods; 8.5 Numerical Examples of Pool-Based Active Learning Methods; 8.6 Summary and Discussion; 9 Active Learning with Model Selection;
 9.1 Direct Approach and the Active Learning/Model Selection Dilemma; 9.2 Sequential Approach; 9.3 Batch Approach; 9.4 Ensemble Active Learning; 9.5 Numerical Examples; 9.6 Summary and Discussion; 10 Applications of Active Learning; 10.1 Design of Efficient Exploration Strategies in Reinforcement Learning; 10.2 Wafer Alignment in Semiconductor Exposure Apparatus; IV CONCLUSIONS; 11 Conclusions and Future Prospects; 11.1 Conclusions; 11.2 Future Prospects; Appendix: List of Symbols and Abbreviations; Bibliography; Index.
(source: Nielsen Book Data)
9. Machine learning in nonstationary environments : introduction to covariate shift adaptation [2012]
 Sugiyama, Masashi, 1974-
 Cambridge, Mass. : MIT Press, ©2012
 Description
 Book — 1 online resource (xiv, 261 pages) : illustrations
 Summary

This volume focuses on a specific nonstationary environment known as covariate shift, in which the distributions of inputs (queries) change but the conditional distribution of outputs (answers) is unchanged, and presents machine learning theory, algorithms, and applications to overcome this variety of nonstationarity.
10. Machine learning in nonstationary environments : introduction to covariate shift adaptation [2012]
 Sugiyama, Masashi, 1974-
 Cambridge, Massachusetts : MIT Press, c2012 [Piscataway, New Jersey] : IEEE Xplore, [2012]
 Description
 Book — 1 online resource (xiv, 261 pages) : illustrations