Offers for "Hadoop" (74 results)

Data Analytics with Hadoop
16,99 € *
plus shipping, if applicable

Provides a solid introduction to the world of clustered computing and analytics with Hadoop, focusing on the particular analyses users can build, the data warehousing techniques that Hadoop provides, and the higher-order data workflows it can produce.

Vendor: buecher.de
As of: 21.06.2017
View offer
Practical Hadoop Security
47,59 € *
plus shipping, if applicable

Practical Hadoop Security is an excellent resource for administrators planning a production Hadoop deployment who want to secure their Hadoop clusters. A detailed guide to the security options and configuration within Hadoop itself, author Bhushan Lakhe takes you through a comprehensive study of how to implement defined security within a Hadoop cluster in a hands-on way. You start with a detailed overview of all the security options available for Hadoop, including popular extensions like Kerberos and OpenSSH, and then delve into a hands-on implementation of user security (with illustrated code samples), using both built-in features and security extensions implemented by leading vendors. No security system is complete without a monitoring and tracing facility, so Practical Hadoop Security next steps you through audit logging and monitoring technologies for Hadoop, along with ready-to-use implementation and configuration examples, again with illustrated code samples. The book concludes with the most important aspect of Hadoop security: encryption. Both types of encryption, for data in transit and data at rest, are discussed at length, with coverage of leading open source projects that integrate directly with Hadoop at no licensing cost.

Practical Hadoop Security:
- Explains the importance of security, auditing, and encryption within a Hadoop installation
- Describes how the leading players have incorporated these features within their Hadoop distributions and provided extensions
- Demonstrates how to set up and use these features to your benefit and make your Hadoop installation secure without impacting performance or ease of use

Bhushan Lakhe is Senior Vice President of Information and Data Architecture at Ipsos, a global market research company headquartered in Paris. He has more than 25 years' experience in software development life cycle management, enterprise architecture design and framework implementation, service management, data warehousing, and Hadoop ecosystem (HDFS, HBase, Hive, Pig, Sqoop, MongoDB) implementation, having worked successively at Tata Consultancy Services, Fujitsu-ICIM, ICL, IBM, and Unisys Corporation, and as a database architecture consultant to clients such as Leo Burnett, ABN AMRO Bank, Abbott Laboratories, Motorola, JPMorgan Chase, and British Petroleum. He received IBM's 2012 Gerstner Award for his implementation of major big data and data warehouse projects. Lakhe is a Cloudera Certified Administrator for Apache Hadoop CDH4 and a Microsoft Certified Technology Specialist in SQL Server Implementation and Maintenance. He is active in the Chicago Hadoop community and speaks at technical meetups and industry conferences. Lakhe graduated from the Birla Institute of Technology and Science, Pilani.

Vendor: ciando eBooks
As of: 22.08.2016
View offer
Integration von HADOOP in die Data-Warehouse-Ar...
12,99 € *
plus shipping, if applicable

Seminar paper from 2015 in the subject area of computer science - business informatics, grade: 2.3, language: German. Abstract: The aim of this paper is to present the fundamentals of Hadoop technology on the basis of the available literature. It explains how the data warehouse architecture can be extended and where Hadoop can be put to use. The paper concludes by critically examining Hadoop technology and its possible applications.

Vendor: ciando eBooks
As of: 22.08.2016
View offer
Big Data Made Easy - A Working Guide to the Com...
35,69 € *
plus shipping, if applicable

Many corporations are finding that the size of their data sets is outgrowing the capability of their systems to store and process them. The data is becoming too big to manage and use with traditional tools. The solution: implementing a big data system. As Big Data Made Easy: A Working Guide to the Complete Hadoop Toolset shows, Apache Hadoop offers a scalable, fault-tolerant system for storing and processing data in parallel. It has a very rich toolset that allows for storage (Hadoop), configuration (YARN and ZooKeeper), collection (Nutch and Solr), processing (Storm, Pig, and MapReduce), scheduling (Oozie), moving (Sqoop and Avro), monitoring (Chukwa, Ambari, and Hue), testing (BigTop), and analysis (Hive). The problem is that the Internet offers IT pros wading into big data many versions of the truth and some outright falsehoods born of ignorance. What is needed is a book just like this one: a wide-ranging but easily understood set of instructions explaining where to get the Hadoop tools, what they can do, how to install them, how to configure them, how to integrate them, and how to use them successfully. And you need an expert who has worked in this area for a decade, someone just like author and big data expert Mike Frampton.

Big Data Made Easy approaches the problem of managing massive data sets from a systems perspective, explains the roles for each project (such as architect and tester), and shows how the Hadoop toolset can be used at each system stage. It explains, in an easily understood manner and through numerous examples, how to use each tool. The book also explains the sliding scale of tools available depending on data size, and when and how to use them.

Big Data Made Easy shows developers and architects, as well as testers and project managers, how to:
- Store big data
- Configure big data
- Process big data
- Schedule processes
- Move data among SQL and NoSQL systems
- Monitor data
- Perform big data analytics
- Report on big data processes and projects
- Test big data systems

Big Data Made Easy also explains the best part: this toolset is free. Anyone can download it and, with the help of this book, start to use it within a day. With the skills this book teaches under your belt, you will add value to your company or client immediately, not to mention your career.

Mike Frampton has been in the IT industry since 1990, working in many roles (tester, developer, support, QA) and in many sectors (telecoms, banking, energy, insurance). He has also worked for major corporations and banks, including IBM, HP, and JPMorgan Chase. The owner of Semtech Solutions, an IT/big data consultancy, Mike currently lives by the beach in Paraparaumu, New Zealand, with his wife and son.

Vendor: ciando eBooks
As of: 22.08.2016
View offer
iX Developer Big Data - Tools, Standards, Daten...
9,99 € *
plus shipping, if applicable

Big data is one of the central developments in the database field today. This special issue from heise Developer presents the theoretical foundations, explains the most important software solutions, from Apache Solr and Hadoop to Elasticsearch, and explores what this development means for the computer science profession.

Vendor: ciando eBooks
As of: 31.01.2017
View offer
Apache Drill
19,99 € *
plus shipping, if applicable

Apache Drill is a significant new tool in the Hadoop ecosystem that enables users to execute queries in a Hadoop cluster and get results quickly. This practical book provides a first introduction to Drill and its ability to handle large files containing data in flexible formats, with nested data structures and tables.

Vendor: buecher.de
As of: 01.06.2017
View offer
Programming Pig
24,99 € *
plus shipping, if applicable

Introduces new users to Pig, the open source engine for executing parallel data flows on Hadoop, and provides experienced users with comprehensive coverage of key features such as the Pig Latin scripting language, the Grunt shell, and user-defined functions (UDFs) for extending Pig.

Vendor: buecher.de
As of: 21.06.2017
View offer
Kubernetes Microservices with Docker
47,59 € *
plus shipping, if applicable

This book on Kubernetes, the container cluster manager, discusses all aspects of using Kubernetes in today's complex big data and enterprise applications, including Docker containers. Starting with installing Kubernetes on a single node, the book introduces Kubernetes with a simple Hello example and discusses using environment variables in Kubernetes. Next, the book discusses using Kubernetes with all major groups of technologies, such as relational databases, NoSQL databases, and the Apache Hadoop ecosystem. The book concludes with using multi-container pods and installing Kubernetes on a multi-node cluster. No other book available in the market covers using Kubernetes beyond a simple introduction.

What You Will Learn:
- How to install Kubernetes on a single node
- How to install Kubernetes on a multi-node cluster
- How to set environment variables
- How to create multi-container pods using Docker
- How to use volumes
- How to use Kubernetes with the Apache Hadoop ecosystem
- How to use Kubernetes with NoSQL databases
- How to use Kubernetes with RDBMS

Who This Book Is For: Application developers, including Apache Hadoop developers, database developers, and NoSQL developers.

Deepak Vohra is a consultant and a principal member of the NuBean.com software company. Deepak is a Sun-certified Java programmer and Web component developer. He has worked in the fields of XML, Java programming, and Java EE for over five years. Deepak is the coauthor of Pro XML Development with Java Technology (Apress, 2006). Deepak is also the author of JDBC 4.0 and Oracle JDeveloper for J2EE Development, Processing XML Documents with Oracle JDeveloper 11g, EJB 3.0 Database Persistence with Oracle Fusion Middleware 11g, and Java EE Development in Eclipse IDE (Packt Publishing). He also served as the technical reviewer on WebLogic: The Definitive Guide (O'Reilly Media, 2004) and Ruby Programming for the Absolute Beginner (Cengage Learning PTR, 2007).

Vendor: ciando eBooks
As of: 22.08.2016
View offer
Usage-Driven Database Design - From Logical Dat...
39,26 € *
plus shipping, if applicable

Design great databases, from logical data modeling through physical schema definition. You will learn a framework that finally cracks the problem of merging data and process models into a meaningful and unified design that accounts for how data is actually used in production systems.

Key to the framework is a method for taking the logical data model, which is a static look at the definition of the data, and merging that static look with the process models describing how the data will be used in actual practice once a given system is implemented. The approach solves the disconnect between the static definition of data in the logical data model and the dynamic flow of the data in the logical process models. The design framework in this book can be used to create operational databases for transaction processing systems, or data warehouses in support of decision support systems. The information manager can be a flat file, Oracle Database, IMS, NoSQL, Cassandra, Hadoop, or any other DBMS.

Usage-Driven Database Design emphasizes practical aspects of design and speaks to what works, what doesn't work, and what to avoid at all costs. Included in the book are lessons learned by the author over his 30+ years in the corporate trenches. Everything in the book is grounded in good theory, yet demonstrates a professional and pragmatic approach to design that can come only from decades of experience. The book:
- Presents an end-to-end framework from logical data modeling through physical schema definition
- Includes lessons learned, techniques, and tricks that can turn a database disaster into a success
- Applies to all types of database management systems, including NoSQL systems such as Cassandra and Hadoop, and mainstream SQL databases such as Oracle and SQL Server

What You'll Learn:
- Create logical data models that accurately reflect the real world of the user
- Create usage scenarios reflecting how applications will use a new database
- Merge static data models with dynamic process models to create resilient yet flexible database designs
- Support application requirements by creating responsive database schemas in any database architecture
- Cope with big data and unstructured data for transaction processing and decision support systems
- Recognize when relational approaches won't work, and when to turn toward NoSQL solutions such as Cassandra or Hadoop

Who This Book Is For: System developers, including business analysts, database designers, database administrators, and application designers and developers who must design or interact with database systems.

George Tillmann is a retired Booz Allen Hamilton partner; a former programmer, analyst, and management consultant; and a CIO who managed Booz Allen's global IT organization. He brings more than 30 years' experience as a database administrator, database consultant, and database product designer. He has written two books, was a Computerworld columnist, and has had articles published in CIO, InfoWorld, Techworld, Data Base, The Standard, and Database Programming & Design, and is a former member of the ANSI/X3/SPARC Data Base Management Systems Study Group.

Vendor: ciando eBooks
As of: 08.05.2017
View offer
Git - Dezentrale Versionsverwaltung im Team - G...
25,99 € *
plus shipping, if applicable

Git is one of the most popular version control systems. The sheer variety of commands, options, and configurations can seem intimidating at first, even though the underlying concepts are simple and you can work effectively with just a few of them. The authors of this book therefore begin with a compact introduction to the concepts and to the commands you really need in day-to-day development work. They then cover in detail the most important workflows in team software development and show how Git is used in each of them. The workflows covered include:
- Setting up a project
- Developing with feature branches
- Working together on one branch
- Releasing continuously
- Releasing periodically
- Splitting up large projects

In this book you will get to know all the important Git commands and features and learn how to apply them effectively, both on the command line and with tools such as Atlassian SourceTree. You will also learn how Git can be used with the Jenkins build server, and you will get to know advanced features such as submodules, subtrees, and worktrees. The fourth edition has been completely updated. Many projects today use Git on platforms such as GitHub or Bitbucket, where work is done with so-called pull requests; this is now reflected in the workflows described. A new workflow, "Developing with forks", which reflects how open source projects work, has been added. Also new is a workflow covering the LFS extension for versioning large binary files.

René Preißel works as a freelance software architect, developer, and trainer. He has spent many years developing applications and coaching teams. His work focuses on software architecture, Java development, and build management. More information at www.eToSquare.de. Bjørn Stachmann works as a software developer for Otto (GmbH & Co KG) in Hamburg. His focus areas are Java development, software architecture, and Hadoop. His current field of work is the Hadoop-based big data stack for the BI platform BRAIN.

Vendor: ciando eBooks
As of: 24.04.2017
View offer