forked from dcfjs/dcf

Yet another distributed compute framework

zzz08900/dcf

Distributed Computing Framework for Node.js

Early development stage: this project is still under early development, and many necessary features are not done yet. Use it at your own risk.

Document

API Reference

A Node.js version of Spark, without Hadoop or the JVM.

Read the tutorial first. After that, you can learn from Spark material and apply it with this project instead.

Async API & deferred API

Any API that takes an RDD and produces a result is async, such as count, take, max ... Any API that creates an RDD is a deferred API, which is not async, so you can chain them like this:

await dcc
  .parallelize([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
  .map(v => v + 1)
  .filter(v => v % 2 === 0)
  .take(10); // take is not a deferred API; it is async, so it is awaited
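
The split between deferred and async calls can be illustrated with a small self-contained toy model. This is only a sketch of the general idea and not dcf's actual implementation: `LazyDataset`, `stageRuns`, `withStage`, `run` and `demo` are hypothetical names invented for this example, and everything runs in a single process. Deferred methods only record a stage and return a new object, while async methods execute the recorded stages and resolve with a plain value.

```typescript
// Toy model of the deferred/async split. NOT dcf's implementation:
// `LazyDataset` and every member below are hypothetical names.
type Stage = (input: unknown[]) => unknown[];

class LazyDataset<T> {
  // Number of stage executions so far, to make the laziness observable.
  static stageRuns = 0;

  private constructor(
    private readonly source: readonly unknown[],
    private readonly stages: readonly Stage[],
  ) {}

  static parallelize<T>(items: readonly T[]): LazyDataset<T> {
    return new LazyDataset<T>(items, []);
  }

  // Deferred: records the transformation and returns a new dataset.
  map<U>(fn: (value: T) => U): LazyDataset<U> {
    return this.withStage<U>(input => input.map(v => fn(v as T)));
  }

  // Deferred: records the predicate and returns a new dataset.
  filter(pred: (value: T) => boolean): LazyDataset<T> {
    return this.withStage<T>(input => input.filter(v => pred(v as T)));
  }

  // Async: executes the recorded stages and resolves with a result.
  async take(n: number): Promise<T[]> {
    return this.run().slice(0, Math.max(0, n));
  }

  // Async: executes the recorded stages and resolves with a result.
  async count(): Promise<number> {
    return this.run().length;
  }

  private withStage<U>(stage: Stage): LazyDataset<U> {
    return new LazyDataset<U>(this.source, [...this.stages, stage]);
  }

  private run(): T[] {
    let current: unknown[] = [...this.source];
    for (const stage of this.stages) {
      LazyDataset.stageRuns += 1;
      current = stage(current);
    }
    return current as T[];
  }
}

async function demo(): Promise<number[]> {
  // Building the chain is synchronous and executes no stage.
  const chain = LazyDataset.parallelize([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
    .map(v => v + 1)
    .filter(v => v % 2 === 0);
  // Only the async call triggers execution.
  return chain.take(10);
}

demo().then(result => console.log(result)); // prints [ 2, 4, 6, 8, 10 ]
```

In this sketch, chaining is cheap because each deferred call only appends to a list of stages; the cost is paid once, when an async call such as `take` or `count` is awaited.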

Milestones

0.1.x: Basic

  • local master.
  • rdd & partition creation & release.
  • map & reduce
  • repartition & reduceByKey
  • disk storage partitions
  • cache
  • file loader & saver
  • export module to npm
  • decompressor & compressor
  • use debug module for information/error
  • provide a progress bar.
  • sampler
  • sort
  • object hash (for key) method
  • storage MEMORY_OR_DISK, and use it in sort
  • storage MEMORY_SER: storage in memory but off the V8 heap.
  • config default partition count.

0.2.x: Remote mode

  • distributed master
  • runtime sandbox
  • plugin system
  • remote dependency management
  • aliyun oss loader
  • hdfs loader

How to use

Install from npm (shell only)

npm install -g dcf
# or
yarn global add dcf

Then you can use the command: dcf-shell

Install from npm (as dependency)

npm install --save dcf
# or
yarn add dcf

Then you can use dcf with JavaScript or TypeScript.

Run samples & cli

Download this repo and install dependencies:

npm install
# or
yarn

Run samples:

npm run ts-node src/samples/tutorial-0.ts
npm run ts-node src/samples/repartition.ts

Run interactive cli:

npm start
