Informatica Intros Big Data Management Solution for Apache Spark-Based Big Data Clouds
September 13, 2018
Informatica has introduced a data management solution for Apache
Spark-based big data cloud environments. These innovations, powered by
the CLAIRE engine, enable organizations to stream, ingest, process,
cleanse, protect, and govern even more big data with less effort.
"The Qubole and Informatica partnership enables organizations to design advanced data management pipelines including integration, data quality, masking, and more and execute them in a self-service cloud native platform for end-to-end big data processing," said Ashish Thusoo, co-founder and CEO, Qubole. "The partnership provides customers accelerated time to value and success adopting next-generation analytics."
"Big data management is going through a wave of innovation that empowers data operations teams to efficiently and effectively collaborate and interact with large volumes of company data for crucial analytics projects," said Ronen Schwartz, senior vice president and general manager, Cloud, Big Data and Data Integration, Informatica.
"The new Informatica innovations empower all levels of data users to interact with huge data sets to glean insights. For example, data engineers can now build serverless data pipelines running on Apache Spark in the Cloud and provide data scientists with advanced self-service data prep, powered by AI and machine learning. In addition, on September 11, 2018, Informatica won the Cloudera Partner Impact Award for driving customer insights, furthering the notion that our big data innovations are delivering transformational value for our customers."
The innovations include:
• Increasing data engineering productivity with even broader support for big data clouds such as Google Cloud Dataproc, and new advanced serverless Spark-based integrations with Qubole and Azure Databricks. Additionally, users will benefit from rapid development of IoT data pipelines with machine-learning-driven structure discovery of semi-structured datasets (e.g., machine data).
• Empowering data analyst and data science teams with advanced self-service data discovery and data preparation, including more than 50 new functions. Examples include statistical and windowing functions, fuzzy clustering, matching rules, more controlled access to data using data masking, and the ability to ingest logical models and business terms into a data catalog.
• Optimizing data operations through improved monitoring of data infrastructure, with machine-learning-driven operational management, proactive actions, and recommendations.