Big Data Feedback Architectures
Want to harness the real power of big data? Then you’ll need to build an architecture capable of closing the feedback loop through machine learning. In this presentation, I’ll share knowledge gathered from designing streaming big data systems for mobile advertising, where every minute taken off the feedback loop translates to real dollars.
The inherent challenge is balancing technology maturity, hardware cost, and the needs of machine learning. The streaming technology we used is similar to Apache Spark, and it gave us a serious competitive edge in dealing with several hundred thousand auctions per second. By combining the power of Hadoop, Cassandra, Hive, and Pig, we managed to build a cost-effective solution capable of handling massive incoming traffic, performing tens of thousands of user-data enrichments per second, and maintaining zero loss of business-critical data.