Offers for "Hadoop" (23 results)

Hadoop in the Enterprise: Architecture
37,99 € *
Plus shipping, if applicable
Vendor: buecher.de
As of: 17.05.2018
View offer
Integration von HADOOP in die Data-Warehouse-Ar...
12,99 € *
Plus shipping, if applicable

Student research paper from 2015 in the subject area Computer Science - Business Informatics, grade: 2.3, language: German, abstract: The aim of this paper is to present the fundamentals of Hadoop technology on the basis of the existing specialist literature. It explains how the data warehouse architecture can be extended and what deployment options Hadoop opens up. In its conclusion, the paper critically examines Hadoop technology and its possible uses.

Vendor: ciando eBooks
As of: 07.11.2017
View offer
Installation und Anwendung einer Support Vector...
16,99 € *
Plus shipping, if applicable

Bachelor's thesis from 2016 in the subject area Computer Science - Business Informatics, grade: 1.3, FOM Essen, Hochschule für Oekonomie & Management gemeinnützige GmbH, language: German, abstract: The aim of this thesis is to analyse hospitals' structured quality reports, published in XML format by the Federal Joint Committee (G-BA), using the Hadoop MapReduce framework and programming model. Chapter 2 describes the fundamentals of the Hadoop framework and explains the architecture of Yet Another Resource Negotiator (YARN), the structure and flow of the MapReduce programming model, and how the Hadoop Distributed File System (HDFS) works. After that, the mathematical model behind support vector machines (SVM) and the statistical software R are introduced. Chapter 3 describes the structured hospital quality reports to be analysed and explains how they are organised. Chapter 4 covers the setup for this thesis and describes the installation and administration of the servers and of Hadoop. Chapter 5 then describes how the analysis was carried out, focusing on the preliminary considerations and the implementation of the MapReduce programs. The analysis results are then discussed, together with possible further processing in R using the presented methods of k-means clustering and support vector regression (SVR). Chapter 6 discusses the advantages and disadvantages of using Hadoop for analysing quality reports. Chapter 7 draws a conclusion about the analysis approach used and gives an outlook on further technologies. The term Big Data stands for the steadily growing volumes of data generated every day that have to be stored and managed; new information and knowledge can be derived from these data. Since there is no single agreed definition, Big Data is most often described as unstructured data arriving in large volumes and in diverse formats that are difficult to fit into the rigid structures of relational database systems (RDBS). Entering the term Big Data into the Google search engine returns about 431 million results in 0.48 seconds. The most widely cited definition of Big Data was developed by Gartner in 2011; it builds on the 3-V model, which goes back to Doug Laney's 2001 research note "3D Data Management: Controlling Data Volume, Velocity, and Variety".
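
The MapReduce flow described in the abstract can be illustrated with a small Hadoop Streaming sketch in Python. This is not the thesis's own code; the XML tag name (Fachabteilung) and the one-fragment-per-line input format are assumptions made for illustration.

```python
#!/usr/bin/env python3
# mapper.py - a minimal Hadoop Streaming mapper sketch.
# Assumption: each input line holds one XML fragment from a quality report;
# the tag name "Fachabteilung" (medical department) is hypothetical.
import re
import sys

TAG = re.compile(r"<Fachabteilung>(.*?)</Fachabteilung>")

for line in sys.stdin:
    for department in TAG.findall(line):
        # Emit key<TAB>1 pairs; Hadoop sorts them by key before the reduce phase.
        print(f"{department.strip()}\t1")
```

```python
#!/usr/bin/env python3
# reducer.py - sums the per-key counts emitted by mapper.py.
import sys

current, count = None, 0
for line in sys.stdin:
    key, _, value = line.rstrip("\n").partition("\t")
    if key != current:
        if current is not None:
            print(f"{current}\t{count}")
        current, count = key, 0
    count += int(value)
if current is not None:
    print(f"{current}\t{count}")
```

Both scripts would be submitted with the hadoop-streaming JAR, along the lines of: hadoop jar hadoop-streaming-*.jar -files mapper.py,reducer.py -mapper mapper.py -reducer reducer.py -input <dir> -output <dir>.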

Vendor: ciando eBooks
As of: 07.11.2017
View offer
Getting Started with Impala
27,99 € *
Plus shipping, if applicable

Learn how to write, tune, and port SQL queries and other statements for a big data environment, using Impala, the massively parallel processing SQL query engine for Apache Hadoop. The best practices in this practical guide help you design database schemas that not only interoperate with other Hadoop components and are convenient for administrators to manage and monitor, but also accommodate future expansion in data size and evolution of software capabilities.
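
As a flavour of what such queries look like in practice, here is a minimal sketch of querying Impala from Python via the impyla client; the host, port, and the table web_logs are placeholder assumptions, not examples from the book.

```python
# A minimal sketch of running an Impala SQL query from Python via impyla.
# Host, port, and the table name "web_logs" are placeholder assumptions.
from impala.dbapi import connect

conn = connect(host="impala-host.example.com", port=21050)
cur = conn.cursor()
# Impala runs standard SQL, parallelized across the Hadoop cluster.
cur.execute(
    "SELECT status_code, COUNT(*) AS hits "
    "FROM web_logs GROUP BY status_code ORDER BY hits DESC"
)
for status_code, hits in cur.fetchall():
    print(status_code, hits)
conn.close()
```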

Vendor: buecher.de
As of: 08.05.2018
View offer
Network Data Analytics
98,99 € *
Plus shipping, if applicable

In order to carry out data analytics, we need powerful and flexible computing software. However, the software available for data analytics is often proprietary and can be expensive. This book reviews Apache tools, which are open source and easy to use. After providing an overview of the background of data analytics, covering the different types of analysis and the basics of using Hadoop as a tool, it focuses on the Hadoop ecosystem tools that serve different types of analysis, such as Apache Flume, Apache Spark, Apache Storm, and Apache Hive, along with R and Python. It then examines the machine learning techniques that are useful for data analytics, and how to visualize data with different graphs and charts. Presenting data analytics from a practice-oriented viewpoint, the book discusses useful tools and approaches, supported by concrete code examples. The book is a valuable reference resource for graduate students and professionals in related fields, and is also of interest to general readers with an understanding of data analytics.
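
To give a flavour of the kind of analysis the book describes, here is a minimal PySpark sketch of one such aggregation; the input path and column names are assumptions, not taken from the book.

```python
# A minimal PySpark sketch of a network-traffic aggregation.
# The HDFS path and the column names are illustrative assumptions.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("network-analytics").getOrCreate()

# Read network log records; the header/schema options are illustrative.
logs = spark.read.csv("hdfs:///data/netflow.csv", header=True, inferSchema=True)

# Total traffic volume per source address, largest senders first.
(logs.groupBy("src_ip")
     .agg(F.sum("bytes").alias("total_bytes"))
     .orderBy(F.desc("total_bytes"))
     .show(10))

spark.stop()
```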

Vendor: buecher.de
As of: 17.05.2018
View offer
Data Algorithms
62,99 € *
Plus shipping, if applicable

If you are ready to dive into the MapReduce framework for processing large datasets, this practical book takes you step by step through the algorithms and tools you need to build distributed MapReduce applications with Apache Hadoop or Apache Spark. Each chapter provides a recipe for solving a massive computational problem, such as building a recommendation system. You'll learn how to implement the appropriate MapReduce solution with code that you can use in your projects. Dr. Mahmoud Parsian covers basic design patterns, optimization techniques, and data mining and machine learning solutions for problems in bioinformatics, genomics, statistics, and social network analysis. The book also includes an overview of MapReduce, Hadoop, and Spark. Author Mahmoud Parsian, head of the big data team at Illumina, walks you through the design of machine learning algorithms such as Naive Bayes and Markov Chain, and shows you how to apply them to clinical and biological datasets using MapReduce design patterns, on clusters of commodity hardware handling gigabyte-, terabyte-, or petabyte-sized datasets. You will learn how to:
- Apply MapReduce algorithms to clinical and biological data, such as DNA-Seq and RNA-Seq
- Use the regression and analytical algorithms most relevant to different biological data types
- Apply t-test, join, top-10, and correlation algorithms using MapReduce/Hadoop and Spark
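
The top-10 pattern mentioned in that list can be sketched in plain Python, simulated locally in a single process rather than on a cluster; the sample data are made up for illustration.

```python
# A single-process sketch of the MapReduce "top-10" design pattern: each
# mapper keeps only its local top N, and one reducer merges the candidates.
import heapq
from itertools import chain

def mapper_top_n(records, n=10):
    # Each mapper emits only its local top-n (key, value) pairs.
    return heapq.nlargest(n, records, key=lambda kv: kv[1])

def reducer_top_n(candidate_lists, n=10):
    # The reducer merges all local winners and keeps the global top-n.
    return heapq.nlargest(n, chain.from_iterable(candidate_lists),
                          key=lambda kv: kv[1])

# Two "input splits", as if processed by two mapper tasks (made-up data).
split_a = [(f"gene_{i}", i * 3 % 17) for i in range(100)]
split_b = [(f"gene_{i}", i * 7 % 23) for i in range(100, 200)]

print(reducer_top_n([mapper_top_n(split_a), mapper_top_n(split_b)]))
```

The point of the pattern is that only n candidates per mapper cross the network, rather than the full dataset.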

Vendor: buecher.de
As of: 28.04.2018
View offer
Pro Docker
39,99 € *
Plus shipping, if applicable

In this fast-paced book on the Docker open standards platform for developing, packaging, and running portable distributed applications, Deepak Vohra discusses how to build, ship, and run applications on any platform such as a PC, the cloud, a data center, or a virtual machine. He describes how to install Docker and create Docker images, and the advantages of Docker containers. The remainder of the book is devoted to using Docker with important software solutions. He begins with a traditional RDBMS, using Oracle and MySQL. Next he moves on to NoSQL, with chapters on MongoDB, Cassandra, and Couchbase. Then he addresses the use of Docker in the Hadoop ecosystem, with complete chapters on utilizing not only Hadoop but also Hive, HBase, Sqoop, Kafka, Solr, and Spark. What You Will Learn:
- How to install a Docker image
- How to create a Docker container
- How to run an application in a Docker container
- How to use Docker with the Apache Hadoop ecosystem
- How to use Docker with NoSQL databases
- How to use Docker with an RDBMS
Who This Book Is For: Apache Hadoop developers, database developers, and NoSQL developers. Deepak Vohra is a consultant and a principal member of the NuBean-dot-com software company. Deepak is a Sun-certified Java programmer and web component developer. He has worked in the fields of XML, Java programming, and Java EE for over five years. Deepak is the coauthor of Pro XML Development with Java Technology (Apress, 2006). Deepak is also the author of JDBC 4.0 and Oracle JDeveloper for J2EE Development, Processing XML Documents with Oracle JDeveloper 11g, EJB 3.0 Database Persistence with Oracle Fusion Middleware 11g, and Java EE Development in Eclipse IDE (Packt). He also served as the technical reviewer on WebLogic: The Definitive Guide (O'Reilly Media, 2004) and Ruby Programming for the Absolute Beginner (Cengage Learning PTR, 2007).
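
The basic build-ship-run workflow reads roughly like this with the Docker SDK for Python; the book itself works at the command line, so this client library, the image, and the command are assumptions for illustration.

```python
# A minimal sketch of pulling an image and running a container with the
# Docker SDK for Python ("docker" package). Image and command are arbitrary.
import docker

client = docker.from_env()  # connects to the local Docker daemon

# Pull an image, then run a throwaway container and capture its output.
client.images.pull("alpine:latest")
output = client.containers.run(
    "alpine:latest", ["echo", "hello from docker"], remove=True
)
print(output.decode().strip())
```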

Vendor: ciando eBooks
As of: 02.04.2018
View offer
Kubernetes Microservices with Docker
46,99 € *
Plus shipping, if applicable

This book on Kubernetes, the container cluster manager, discusses all aspects of using Kubernetes in today's complex big data and enterprise applications, including Docker containers. Starting with installing Kubernetes on a single node, the book introduces Kubernetes with a simple Hello example and discusses using environment variables in Kubernetes. Next, the book discusses using Kubernetes with all major groups of technologies, such as relational databases, NoSQL databases, and the Apache Hadoop ecosystem. The book concludes with using multi-container Pods and installing Kubernetes on a multi-node cluster. No other book on the market goes beyond a simple introduction to using Kubernetes. What You Will Learn:
- How to install Kubernetes on a single node
- How to install Kubernetes on a multi-node cluster
- How to set environment variables
- How to create multi-container pods using Docker
- How to use volumes
- How to use Kubernetes with the Apache Hadoop ecosystem
- How to use Kubernetes with NoSQL databases
- How to use Kubernetes with an RDBMS
Who This Book Is For: application developers, including Apache Hadoop developers, database developers, and NoSQL developers. Deepak Vohra is a consultant and a principal member of the NuBean-dot-com software company. Deepak is a Sun-certified Java programmer and web component developer. He has worked in the fields of XML, Java programming, and Java EE for over five years. Deepak is the coauthor of Pro XML Development with Java Technology (Apress, 2006). Deepak is also the author of JDBC 4.0 and Oracle JDeveloper for J2EE Development, Processing XML Documents with Oracle JDeveloper 11g, EJB 3.0 Database Persistence with Oracle Fusion Middleware 11g, and Java EE Development in Eclipse IDE (Packt). He also served as the technical reviewer on WebLogic: The Definitive Guide (O'Reilly Media, 2004) and Ruby Programming for the Absolute Beginner (Cengage Learning PTR, 2007).
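
A first step such as listing the Pods in a cluster looks roughly like this with the official Kubernetes Python client; the book drives Kubernetes directly, so this library and the kubeconfig setup are assumptions for illustration.

```python
# A minimal sketch using the official Kubernetes Python client to list pods.
# Assumes a reachable cluster and a valid kubeconfig (e.g. from minikube).
from kubernetes import client, config

config.load_kube_config()  # reads ~/.kube/config
v1 = client.CoreV1Api()

# List pods across all namespaces, as "kubectl get pods -A" would.
for pod in v1.list_pod_for_all_namespaces().items:
    print(pod.metadata.namespace, pod.metadata.name, pod.status.phase)
```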

Vendor: ciando eBooks
As of: 02.04.2018
View offer
Usage-Driven Database Design - From Logical Dat...
36,99 € *
Plus shipping, if applicable

Design great databases, from logical data modeling through physical schema definition. You will learn a framework that finally cracks the problem of merging data and process models into a meaningful and unified design that accounts for how data is actually used in production systems. Key to the framework is a method for taking the logical data model, which is a static look at the definition of the data, and merging that static view with the process models describing how the data will be used in actual practice once a given system is implemented. The approach resolves the disconnect between the static definition of data in the logical data model and the dynamic flow of the data in the logical process models. The design framework in this book can be used to create operational databases for transaction processing systems or data warehouses in support of decision support systems. The information manager can be a flat file, Oracle Database, IMS, NoSQL, Cassandra, Hadoop, or any other DBMS. Usage-Driven Database Design emphasizes practical aspects of design and speaks to what works, what doesn't work, and what to avoid at all costs. Included in the book are lessons learned by the author over his 30+ years in the corporate trenches. Everything in the book is grounded in good theory, yet demonstrates a professional and pragmatic approach to design that can come only from decades of experience. The book:
- Presents an end-to-end framework from logical data modeling through physical schema definition
- Includes lessons learned, techniques, and tricks that can turn a database disaster into a success
- Applies to all types of database management systems, including NoSQL such as Cassandra and Hadoop, and mainstream SQL databases such as Oracle and SQL Server
What You'll Learn:
- Create logical data models that accurately reflect the real world of the user
- Create usage scenarios reflecting how applications will use a new database
- Merge static data models with dynamic process models to create resilient yet flexible database designs
- Support application requirements by creating responsive database schemas in any database architecture
- Cope with big data and unstructured data for transaction processing and decision support systems
- Recognize when relational approaches won't work, and when to turn to NoSQL solutions such as Cassandra or Hadoop
Who This Book Is For: system developers, including business analysts, database designers, database administrators, and application designers and developers who must design or interact with database systems. George Tillmann is a retired Booz Allen Hamilton partner; a former programmer, analyst, management consultant, and CIO who managed Booz Allen's global IT organization. He brings more than 30 years' experience as a database administrator, database consultant, and database product designer. He has written two books, was a Computerworld columnist, and has published articles in CIO, InfoWorld, Techworld, Data Base, The Standard, and Database Programming & Design. He is a former member of the ANSI/X3/SPARC Database Management Systems Study Group.

Vendor: ciando eBooks
As of: 02.04.2018
View offer
Pro MongoDB Development
39,99 € *
Plus shipping, if applicable

Pro MongoDB Development is about MongoDB, a NoSQL database based on the BSON (binary JSON) document model. The book discusses all aspects of using MongoDB in web applications. Java, PHP, Ruby, and JavaScript are the most commonly used programming and scripting languages, and the book discusses accessing a MongoDB database from each of them. The book also discusses using the Java EE frameworks Kundera and Spring Data with MongoDB. As NoSQL databases are commonly used alongside the Hadoop ecosystem, the book also discusses using MongoDB with Apache Hive. Migration from other NoSQL databases (Apache Cassandra and Couchbase) and from relational databases (Oracle Database) is also covered. What You'll Learn:
- How to use a Java client and the MongoDB shell
- How to use MongoDB with PHP, Ruby, and Node.js
- How to migrate Apache Cassandra tables and Couchbase data to MongoDB documents, and how to transfer data between Oracle and MongoDB
- How to use Kundera, Spring Data, and Spring XD with MongoDB
- How to load MongoDB data into Oracle Database and integrate MongoDB with Oracle Database in Oracle Data Integrator
Audience: NoSQL database developers, including Java, PHP, and Ruby developers. The book is suitable for an intermediate-level course in NoSQL databases. Deepak Vohra is an Oracle Certified Associate and a Sun Certified Java Programmer. Deepak is a Fellow of the British Computer Society. Deepak has published in Oracle Magazine, OTN, IBM developerWorks, ONJava, DevSource, WebLogic Developer's Journal, XML Journal, Java Developer's Journal, FTPOnline, and DevX.
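
The document model the book is built around can be sketched in a few lines of PyMongo; the connection string, database, and collection names are placeholder assumptions.

```python
# A minimal PyMongo sketch: connect, insert a BSON document, query it back.
# Connection string, database, and collection names are placeholders.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017/")
books = client["catalog"]["books"]

# Insert a document; MongoDB stores it as BSON (binary JSON).
books.insert_one({"title": "Pro MongoDB Development",
                  "topics": ["NoSQL", "BSON"]})

# Query by a value inside an embedded array.
for doc in books.find({"topics": "NoSQL"}):
    print(doc["title"])
```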

Vendor: ciando eBooks
As of: 12.12.2017
View offer