Big Data Resources

Published: 22 Sep 2015 Category: big_data


MIT 6.S897: Large-Scale Systems (Matei Zaharia)


Learning to Hash for Indexing Big Data - A Survey

Random Forests for Big Data

Big data analytics: a survey

A Comparison of Big Data Frameworks on a Layered Dataflow Model

A survey of machine learning for big data processing

A Big Data Analysis Framework Using Apache Spark and Deep Learning

  • intro: IEEE ICDM 2017 (International Conference on Data Mining) Workshop on Data Science and Big Data Analytics (DSBDA)
  • intro: University of Delhi & Manav Rachna University & CMU
  • arxiv:


Open Big Data Group

Open Big Data Group

  • intro: This website contains a collection of libraries for processing massive data sets in highly distributed and parallel environments
  • homepage:

PLDA: Parallel C++ implementation of Latent Dirichlet Allocation

PSVM: Parallelizing Support Vector Machines on Distributed Computers

PFP: Parallel FP-Growth for Query Recommendation

Pspectralclustering: A parallel C++ implementation of Parallel Spectral Clustering

Speedo: Parallelizing Stochastic Gradient Descent for Deep Convolutional Neural Network


Awesome Big Data Algorithms


Uncovering Big Bias with Big Data