In the second edition of this practical book, four Cloudera data scientists present a set of self-contained patterns for performing large-scale data analysis with Spark. The authors bring Spark, statistical methods, and real-world datasets together to teach you how to approach analytics problems by example.
Learn methods of data analysis and their application to real-world data sets. This updated second edition serves as an introduction to data mining methods and models, including association rules, clustering, neural networks, logistic regression, and multivariate analysis. The authors apply a unified “white box” approach to data mining methods and models. This approach is designed to walk readers through the operations and nuances of the various methods, using small data sets, so readers can gain insight into the inner workings of the method under review. Chapters provide hands-on analysis problems, giving readers an opportunity to apply their newly acquired data mining expertise to solving real problems using large, real-world data sets.
Data Mining and Predictive Analytics, Second Edition:
* Offers comprehensive coverage of association rules, clustering, neural networks, logistic regression, multivariate analysis, and the R statistical programming language
* Features over 750 chapter exercises, allowing readers to assess their understanding of the new material
* Provides a detailed case study that brings together the lessons learned in the book
* Includes access to the companion website, www.dataminingconsultant, with exclusive password-protected instructor content
Data Mining and Predictive Analytics, Second Edition will appeal to computer science and statistics students, as well as students in MBA programs and chief executives.
Daniel T. Larose is Professor of Mathematical Sciences and Director of the Data Mining programs at Central Connecticut State University. He has published several books, including Data Mining the Web: Uncovering Patterns in Web Content, Structure, and Usage (Wiley, 2007) and Discovering Knowledge in Data: An Introduction to Data Mining (Wiley, 2005). In addition to his scholarly work, Dr. Larose is a consultant in data mining and statistical analysis working with many high-profile clients, including Microsoft, Forbes Magazine, the CIT Group, KPMG International, Computer Associates, and Deloitte, Inc.
Chantal D. Larose is a Ph.D. candidate in Statistics at the University of Connecticut. Her research focuses on the imputation of missing data and model-based clustering. She has taught undergraduate statistics since 2011 and is a statistical consultant for DataMiningConsultant.com, LLC.
Big data has been described as the “new oil.” Data Science and Big Data Analytics is all about harnessing the power of data for new insights. EMC, the world-class information management company, has developed this book with you in mind: every concept and task can be completed using free or open-source software. This is also an approved study guide for the EMC Data Science Associate (EMCDSA) certification. You'll learn everything you need to participate in big data projects, including how to:
* Become an immediate contributor on a data science team
* Reframe a business challenge as an analytics challenge
* Deploy a structured lifecycle approach to data analytics problems
* Apply appropriate analytic techniques and tools to analyze big data
* Tell a compelling story with data to drive business action
* Use open source tools such as R, Hadoop, and PostgreSQL
* Prepare for EMC Proven Professional Data Scientist certification
Today's IT professionals, business analysts, and database administrators are expected to work with enormous datasets. After reading Data Science and Big Data Analytics, you'll be on the cutting edge of this exciting paradigm shift.
Data Science and Big Data Analytics is about harnessing the power of data for new insights. The book covers the breadth of activities, methods, and tools that data scientists use. The content focuses on concepts, principles, and practical applications that are applicable to any industry and technology environment, and the learning is supported and explained with examples that you can replicate using open-source software.
This book will help you:
* Become a contributor on a data science team
* Deploy a structured lifecycle approach to data analytics problems
* Apply appropriate analytic techniques and tools to analyze big data
* Tell a compelling story with data to drive business action
* Prepare for EMC Proven Professional Data Science Certification
Corresponding data sets are available at www.wiley.com/go/9781118876138. Get started discovering, analyzing, visualizing, and presenting data in a meaningful way today!
Practical Web Analytics for User Experience teaches you how to use web analytics to help answer the complicated questions facing UX professionals. Within this book, you'll find a quantitative approach for measuring a website's effectiveness and the methods for posing and answering specific questions about how users navigate a website. The book is organized according to the concerns UX practitioners face. Chapters are devoted to traffic, clickpath, and content use analysis; measuring the effectiveness of design changes, including A/B testing; building user profiles based on search habits; supporting usability test findings with reporting; and more. This is the must-have resource you need to start capitalizing on web analytics and analyze websites effectively.
* Discover concrete information on how web analytics data support user research and user-centered design
* Learn how to frame questions in a way that lets you navigate through massive amounts of data to get the answer you need
* Learn how to gather information for personas, verify behavior found in usability testing, support heuristic evaluation with data, analyze keyword data, and understand how to communicate these findings to business stakeholders
Judith S. Hurwitz is President and CEO of Hurwitz & Associates, a research and consulting firm focused on emerging technology, and is a leading strategy consultant. Marcia Kaufman is a Principal Analyst and COO of Hurwitz & Associates, with leadership in big data and advanced analytics, information management, and business strategy. Adrian Bowles is President and CEO of STORM Insights, Inc., a market analytics firm providing research and advisory services for buyers, sellers, and investors in emerging technology markets.