Big Data Resources

Published: 22 Sep 2015 Category: big_data

Courses

MIT 6.S897: Large-Scale Systems (Matei Zaharia)

Papers

Learning to Hash for Indexing Big Data - A Survey

Random Forests for Big Data

Big data analytics: a survey

A Comparison of Big Data Frameworks on a Layered Dataflow Model

A survey of machine learning for big data processing

A Big Data Analysis Framework Using Apache Spark and Deep Learning

  • intro: IEEE ICDM 2017 (International Conference on Data Mining) Workshop on Data Science and Big Data Analytics (DSBDA)
  • intro: University of Delhi & Manav Rachna University & CMU
  • arxiv: https://arxiv.org/abs/1711.09279

Projects

Open Big Data Group

  • intro: A collection of libraries for processing massive data sets in highly distributed and parallel environments
  • homepage: http://openbigdatagroup.github.io/

PLDA: Parallel C++ implementation of Latent Dirichlet Allocation

PSVM: Parallelizing Support Vector Machines on Distributed Computers

PFP: Parallel FP-Growth for Query Recommendation

Pspectralclustering: A parallel C++ implementation of Parallel Spectral Clustering

Speedo: Parallelizing Stochastic Gradient Descent for Deep Convolutional Neural Network

Videos

Awesome Big Data Algorithms

Blog

Uncovering Big Bias with Big Data