|
3 | 3 | Simple distributed data manipulation and processing routines for Julia. |
4 | 4 |
|
5 | 5 | This was originally developed for |
6 | | -[GigaSOM.jl](https://github.com/LCSB-BioCore/GigaSOM.jl), this package contains |
7 | | -the separated-out lightweight distributed-processing framework that can be used |
8 | | -with GigaSOM. |
| 6 | +[`GigaSOM.jl`](https://github.com/LCSB-BioCore/GigaSOM.jl); the DiDa.jl package |
| 7 | +contains the separated-out lightweight distributed-processing framework that |
| 8 | +was used in `GigaSOM.jl`. |
9 | 9 |
|
10 | 10 | ## Why? |
11 | 11 |
|
12 | | -This provides a very simple, imperative and straightforward way to move your |
13 | | -data around a cluster of Julia processes created by the `Distributed` package, |
| 12 | +DiDa.jl provides a very simple, imperative and straightforward way to move your |
| 13 | +data around a cluster of Julia processes created by the |
| 14 | +[`Distributed`](https://docs.julialang.org/en/v1/stdlib/Distributed/) package, |
14 | 15 | and run computation on the distributed data pieces. The main aim of the package |
15 | | -is to avoid anything complicated-- the first version used in GigaSOM had just |
16 | | -under 500 lines of relatively straightforward code with comments. |
17 | | - |
18 | | -Most importantly, distributed processing should be simple and accessible. |
| 16 | +is to avoid anything complicated; the first version used in |
| 17 | +[GigaSOM](https://github.com/LCSB-BioCore/GigaSOM.jl) had just under 500 lines |
| 18 | +of relatively straightforward code (including the doc-comments). |
| 19 | + |
| 20 | +Compared to the plain `Distributed` API, you get more straightforward data |
| 21 | +manipulation primitives, some extra control over the precise place where code |
| 22 | +is executed, and a few high-level functions. These include a distributed |
| 23 | +version of `mapreduce`, a simpler work-alike of the |
| 24 | +[DistributedArrays](https://github.com/JuliaParallel/DistributedArrays.jl) |
| 25 | +functionality, and easy-to-use distributed dataset saving and loading. |
| 26 | + |
| 27 | +Most importantly, the package is motivated by the idea that distributed |
| 28 | +processing should be simple and accessible. |
19 | 29 |
|
20 | 30 | ## Brief how-to |
21 | 31 |
|
|