Apache Spark and In-Memory Computing

Optimizing Apache Spark™ with Memory1™

White Paper: Inspur Group Co. Ltd

Apache Spark is a fast and general engine for large-scale data processing. To handle increasing data rates and demanding user expectations, big data processing platforms like Apache Spark have emerged and quickly gained popularity.

This whitepaper on “Optimizing Apache Spark with Memory1” demonstrates that by leveraging Memory1 to maximize available memory, servers can do more work (a 75% efficiency improvement in Spark performance), unleashing the full potential of real-time big data processing.

This Technology Whitepaper Covers:

How does Spark work?
What are the issues that subvert the full potential of Apache Spark’s disaggregated approach?
How to simulate the critical demands of a typical Spark operations workload?
How to eliminate the hardware cost concerns traditionally faced in multi-server Spark deployments?
What efficiency metrics are involved in the Spark operations?
Key issues faced by Spark in traditional, DRAM-only deployments
How to avoid the high cost of DRAM-only implementations in Apache Spark architectures

Related White Papers

BIG DATA 2.0 - Cataclysm or Catalyst?

By: Actian

Four undeniable trends shape the way we think about data, big or small. While managing big data is rife with challenges, it continues to represent more than $15 trillion in untapped value. New analytics and next-generation platforms hold the key to unlocking that value. Understanding these four trends sheds light on the current shift from Big Data 1.0 to 2.0: a cataclysm for those who fail to see the shift and take action, a catalyst for those who embrace the new opportunity.

Key takeaways from this white paper:

5 Advancements of Big Data 1.0
5 Challenges of Big Data 1.0
Big Data 2.0 – Transformational Value
Big Data 2.0 Means Big Value Times Ten

Big Data Projects‐ Paving the path to success

By: Intersec Group

The advent of open‐source technologies fueled big data initiatives intended to materialize new business models. The goal of big data projects often revolves around solving problems as well as driving ROI and value across a business unit or an entire organization. It is often difficult to launch a big data project quickly because of competing business priorities, the myriad technology choices available, and the sheer size, volume, and velocity of data.

Key questions from this whitepaper:

What are the common questions and challenges that operators face when starting a Big Data project?
What are the best practices to avoid being trapped in a never‐ending big data project that fails to generate any revenue?
Should the big data project be carried out by the IT department, or should it be led by a dedicated organization under a new function such as a Chief Data Officer, distinct from traditional IT?
