Channel: Infosys-Oracle Blog

Comparative Study Between Oracle Big Data Cloud Service and Compute Engine

 

Comparative study between Oracle BDCS and Oracle Big Data Cloud Compute Engine.

 

1.             Oracle Big Data Cloud Service: Gives us access to the resources of a preinstalled Oracle Big Data environment. It comes with a complete installation of the Cloudera Distribution Including Apache Hadoop (CDH), which includes open source Apache Hadoop and Apache Spark. It can be used to analyze data generated from social media feeds, e-mail, smart meters, etc.

Oracle BDCS contains:

·         A cluster of 3 to 60 nodes. 3 is the minimum number of cluster nodes (OCPU) available to start with; the processing power and secondary memory of the cluster can be extended by adding cluster compute nodes ("bursting").

·         Linux Operating System Provided by Oracle

·         Cloudera Distribution with Apache Hadoop (CDH):

-          File System: HDFS to store different types of files

-          MapReduce Engine (YARN is default for resource management)

-          Administrative framework (Cloudera Manager is the default)

-          Apache projects, e.g. ZooKeeper, Oozie, Pig, Hive, Ambari

-          Cloudera applications: Cloudera Enterprise Data Hub Edition, Impala, Search, and Navigator

 

·         Built-in utilities for managing data and resources

·         Oracle Big Data Spatial and Graph

·         Oracle Big Data Connectors:

-          Oracle SQL Connector for HDFS

-          Oracle Loader for Hadoop

-          Oracle XQuery for Hadoop

-          Oracle R Advanced Analytics for Hadoop

-          Oracle Data Integrator (ODI) Enterprise Edition
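
The 3 to 60 node range in the list above can be captured in a small check. This is a minimal sketch, not an Oracle API: `MIN_NODES`, `MAX_NODES`, and `plan_cluster` are hypothetical names, and it assumes that the 60-node ceiling applies to permanent and bursting nodes combined, which the text does not state explicitly.

```python
MIN_NODES = 3   # minimum cluster size stated in the list above
MAX_NODES = 60  # maximum cluster size stated in the list above


def plan_cluster(permanent_nodes: int, burst_nodes: int = 0) -> int:
    """Return the total node count, or raise ValueError if the plan is out of range.

    Assumption: temporary (bursting) nodes count toward the 60-node ceiling.
    """
    if permanent_nodes < MIN_NODES:
        raise ValueError(f"a cluster needs at least {MIN_NODES} permanent nodes")
    if burst_nodes < 0:
        raise ValueError("burst_nodes cannot be negative")
    total = permanent_nodes + burst_nodes
    if total > MAX_NODES:
        raise ValueError(f"{total} nodes exceeds the {MAX_NODES}-node limit")
    return total
```

For example, `plan_cluster(5, burst_nodes=4)` returns 9, while `plan_cluster(2)` raises a `ValueError` because the cluster would start below the minimum.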

 

Typical workflow of Oracle BDCS: Purchase a subscription -> Create and manage users and their roles -> Create a service instance -> Create an SSH key pair -> Create a cluster -> Control network access to services -> Access and work with your cluster -> Add permanent nodes to a cluster -> Add temporary compute nodes to a cluster (bursting) -> Patch a cluster -> Manage storage providers and copy data
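
The workflow above can be written down as an ordered checklist. The following is only an illustrative sketch: `WORKFLOW` and `next_step` are hypothetical names, and the code assumes the steps are followed strictly in the listed order, whereas in practice the later steps (adding nodes, patching, managing storage) are probably performed as needed.

```python
from typing import List, Optional

# The steps exactly as listed in the typical workflow above.
WORKFLOW: List[str] = [
    "Purchase a subscription",
    "Create and manage users and their roles",
    "Create a service instance",
    "Create an SSH key pair",
    "Create a cluster",
    "Control network access to services",
    "Access and work with your cluster",
    "Add permanent nodes to a cluster",
    "Add temporary compute nodes to a cluster (bursting)",
    "Patch a cluster",
    "Manage storage providers and copy data",
]


def next_step(completed: List[str]) -> Optional[str]:
    """Return the step that follows the completed ones, or None when all are done.

    Assumption: steps are completed strictly in the listed order.
    """
    if completed != WORKFLOW[:len(completed)]:
        raise ValueError("steps must be completed in the listed order")
    if len(completed) == len(WORKFLOW):
        return None
    return WORKFLOW[len(completed)]
```

Calling `next_step(WORKFLOW[:4])` returns `"Create a cluster"`, which reflects that the SSH key pair has to exist before the cluster is created.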

odiff (Oracle Distributed Diff) is a tool developed by Oracle for comparing huge, sparsely stored data sets. It runs as a Spark application and is compatible with CDH 5.7.x. The maximum size of a file or directory that can be compared is 2 GB.
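
To make the idea of comparing two large data sets concrete, here is a minimal plain-Python sketch. It is not odiff and does not use its interface, which the text does not describe: `NUM_BUCKETS`, `bucket_of`, `partition`, and `diff_datasets` are hypothetical names. Records are hashed into buckets and each bucket is compared independently, which is the kind of work a Spark application could spread across executors.

```python
import hashlib
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

NUM_BUCKETS = 4  # hypothetical; stands in for the number of parallel partitions


def bucket_of(record: str) -> int:
    """Assign a record to a bucket using a stable hash of its content."""
    digest = hashlib.sha256(record.encode("utf-8")).digest()
    return digest[0] % NUM_BUCKETS


def partition(records: Iterable[str]) -> Dict[int, Set[str]]:
    """Group records by bucket so that equal records always land in the same bucket."""
    buckets: Dict[int, Set[str]] = defaultdict(set)
    for record in records:
        buckets[bucket_of(record)].add(record)
    return buckets


def diff_datasets(left: Iterable[str], right: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (records only in left, records only in right), each sorted."""
    left_buckets, right_buckets = partition(left), partition(right)
    only_left: List[str] = []
    only_right: List[str] = []
    # Each bucket is independent of the others, so this loop could run in parallel.
    for b in range(NUM_BUCKETS):
        l = left_buckets.get(b, set())
        r = right_buckets.get(b, set())
        only_left.extend(l - r)
        only_right.extend(r - l)
    return sorted(only_left), sorted(only_right)
```

Because the sketch stores records in sets, it reports which records differ but ignores duplicates and ordering; a real comparison tool may treat those differently.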

 

2.       

