Stores large files in a distributed manner.
The simplest way to use it (assuming you have Erlang installed on your machine) is the console.sh script:
chmod +x console.sh
./console.sh local
It starts the Erlang console and the application. In this mode you can access the application through the Erlang console.
To run the application in distributed mode, use:
chmod +x dist_foreground.sh
./dist_foreground.sh
Here you don't have to specify a profile (like local for console.sh). The profile names, which are also the release names, are generated by a for loop in the bash script.
The script starts 3 nodes.
There are Common Test suites for the application:
chmod +x test.sh
./test.sh test
chmod +x dist_test.sh
./dist_test.sh test
To check the service, send a file via curl:
curl -F test=123 -F 'file=@googlechrome.dmg' http://localhost:5551
To read the file back, open http://localhost:5551/?action=read&name=googlechrome.dmg in your favorite browser.
To delete the file, open http://localhost:5551/?action=delete&name=googlechrome.dmg
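The upload, read and delete calls above can also be wrapped in a few shell helpers, which is handy for a quick round-trip check. This is a minimal sketch: the function names and the STORE_URL variable are hypothetical, and it assumes the read action returns the raw file contents. File names are not URL-encoded here, so stick to names without spaces or special characters.

```shell
#!/bin/sh
# Base URL from the examples above; STORE_URL is a hypothetical override.
STORE_URL="${STORE_URL:-http://localhost:5551}"

# Build the URL for an action ("read" or "delete") on a stored file name.
action_url() {
    printf '%s/?action=%s&name=%s\n' "$STORE_URL" "$1" "$2"
}

# Upload a local file with the same multipart form fields as the curl example.
upload_file() {
    curl -F test=123 -F "file=@$1" "$STORE_URL"
}

# Download a stored file ($1) into a local path ($2).
download_file() {
    curl -s -o "$2" "$(action_url read "$1")"
}

# Delete a stored file by name.
delete_file() {
    curl -s "$(action_url delete "$1")"
}
```

With a node running, a round trip might look like: upload_file googlechrome.dmg, then download_file googlechrome.dmg copy.dmg, then cmp googlechrome.dmg copy.dmg, and finally delete_file googlechrome.dmg.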
The application starts a cowboy listener, and the application supervisor starts the chunk_controller module.
When cowboy gets an incoming request, it starts the API handler. The API handler uses the functions exported by chunk_controller to initialize chunk_handler modules. The controller uses asynchronous casts to send each piece of the file to the handler that matches its number. The handlers write the data to the file system.
The main idea is to avoid exhausting RAM when a large file is uploaded. Every handler process writes its piece of the file asynchronously, and because the controller receives the file in chunks, the whole file never has to sit in memory at once.
Reading, however, is done synchronously. There is room to improve the reading process with partial data preloading, but that is a topic for another discussion.
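The chunking idea can be illustrated outside Erlang with standard tools. The sketch below is only an analogy, not how chunk_controller or chunk_handler are implemented: it cuts a file into fixed-size pieces and later reads them back one by one in order. The function names are hypothetical.

```shell
#!/bin/sh
# Split a file into fixed-size pieces. split appends suffixes (aa, ab, ...)
# that sort in the order the pieces were cut.
# $1 = source file, $2 = chunk size in bytes, $3 = output prefix
store_chunks() {
    split -b "$2" "$1" "$3"
}

# Reassemble the pieces by reading them one after another in suffix order.
# $1 = chunk prefix, $2 = destination file
read_chunks() {
    cat "$1"* > "$2"
}
```

For example, store_chunks big.bin 1048576 chunk_ followed by read_chunks chunk_ copy.bin should give a copy.bin identical to big.bin. With the default two-letter suffixes, split can produce at most 676 pieces.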
There are some details to improve. First of all, I'd definitely improve the way the chunk metadata is stored. In this version all the chunk metadata is kept in the chunk_controller state, which basically means in RAM. I guess that's okay for test purposes, but in production it has to be kept in some persistent storage (e.g. Mnesia in disc_copies mode).
Don't forget to stop the running nodes, otherwise you'll get an error like Protocol 'inet_tcp': the name node_2@127.0.0.1 seems to be in use by another Erlang node, or some other error about epmd.
If you run into these errors, just kill the running Erlang nodes:
ps -ef | grep erl
You'll get something like
1692471576 17185 1 0 1:07 ?? 0:00.01 /usr/local/Cellar/erlang/21.3.3/lib/erlang/erts-10.3.2/bin/epmd -daemon
1692471576 17282 17232 0 1:07 ?? 0:00.00 erl_child_setup 256
1692471576 17340 17290 0 1:07 ?? 0:00.00 erl_child_setup 256
1692471576 17398 17348 0 1:07 ?? 0:00.00 erl_child_setup 256
...
The second value in each row is the process ID. Use
kill -9 17185
to kill each node that was found.
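Picking the process IDs out by hand gets tedious with several nodes. As a sketch, the second column can be extracted with awk; the function name erlang_pids is hypothetical, and the pattern assumes that every process whose ps line contains "erl" is one you want to stop.

```shell
#!/bin/sh
# Read `ps -ef` output on stdin and print the PID (second column) of every
# line that contains "erl". Writing the pattern as [e]rl keeps this awk
# command itself from matching when it shows up in the ps listing.
erlang_pids() {
    awk '/[e]rl/ { print $2 }'
}
```

It could then be used as: for pid in $(ps -ef | erlang_pids); do kill -9 "$pid"; done. Check the output of ps -ef | erlang_pids first, since the pattern will also match unrelated processes that happen to have "erl" in their command line.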