STARWEST 2016 - Big Data
Thursday, October 6
The Four V’s of Big Data Testing: Variety, Volume, Velocity, and Veracity
The expression “garbage in, garbage out” emphasizes the need for thorough testing in any Big Data and analytics implementation. Big Data testing means ensuring the correctness and completeness of voluminous, often heterogeneous, data as it moves across different stages—ingestion, storage, analytics, and visualization—producing actionable insights. What should be our testing focus? Which of the four V’s—variety, volume, velocity, and veracity—are most important at which stage? For example, in the ingestion stage, testing needs to focus on variety of data rather than...
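To make the idea of variety-focused ingestion testing concrete, here is a minimal sketch of a schema check for heterogeneous records arriving at an ingestion stage. The source types, field names, and record layout are hypothetical, invented for illustration; the abstract does not specify the speaker's actual approach.

```python
# Hypothetical schemas for two heterogeneous source formats at ingestion.
EXPECTED_SCHEMAS = {
    "clickstream": {"user_id", "url", "timestamp"},
    "booking": {"user_id", "hotel_id", "checkin", "checkout"},
}

def validate_record(record: dict) -> list:
    """Return a list of problems found in one ingested record (empty if clean)."""
    problems = []
    source = record.get("source")
    if source not in EXPECTED_SCHEMAS:
        problems.append(f"unknown source type: {source!r}")
        return problems
    expected = EXPECTED_SCHEMAS[source]
    payload = record.get("payload", {})
    missing = expected - payload.keys()   # required fields absent from the record
    extra = payload.keys() - expected     # fields the schema does not know about
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")
    if extra:
        problems.append(f"unexpected fields: {sorted(extra)}")
    return problems

records = [
    {"source": "clickstream",
     "payload": {"user_id": 1, "url": "/home", "timestamp": 1475712000}},
    {"source": "booking",
     "payload": {"user_id": 1, "hotel_id": 42}},  # missing checkin/checkout
]
for r in records:
    print(r["source"], validate_record(r))
```

A variety-oriented test suite would run checks like this against every source type the pipeline accepts, so that a malformed feed is caught before it pollutes downstream analytics.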
Big Data, Big Trouble: Getting into the Flow of Hadoop Testing
Big Data, one of the latest buzzwords in our industry, involves working with petabytes of data captured by various systems and making sense of that data in some way. Maryam Umar has found that testing systems like Hadoop is very challenging because of the frequency with which the data arrives in the system, the number of jobs that run to process that data, and the interdependency of that data. Maryam describes some of the projects at Hotels.com that involve identifying multiple users and using that data to recommend hotels. Testing this is fairly...
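One common difficulty with interdependent batch jobs, as the abstract notes, is data quietly going missing between stages. Below is a minimal reconciliation sketch, with hypothetical IDs and job names not drawn from Hotels.com's actual pipeline, that compares what one job emitted against what the next job consumed.

```python
# A minimal sketch of a reconciliation test between two interdependent
# batch jobs: the output of an upstream job vs. the input seen downstream.

def reconcile(upstream_ids: set, downstream_ids: set) -> dict:
    """Compare record IDs emitted upstream against those consumed downstream."""
    return {
        "dropped": sorted(upstream_ids - downstream_ids),     # lost between stages
        "unexpected": sorted(downstream_ids - upstream_ids),  # appeared from nowhere
        "complete": upstream_ids == downstream_ids,
    }

raw_events = {"u1", "u2", "u3"}   # hypothetical IDs written by an ingestion job
aggregated = {"u1", "u2"}         # hypothetical IDs seen by a downstream job
print(reconcile(raw_events, aggregated))
```

In a real Hadoop deployment the ID sets would come from the jobs' output files on HDFS rather than in-memory literals, but the completeness check itself is the same.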