Seq2Seq_PlouffeRainbows

A TensorFlow implementation of Seq2SeqRegression, a novel open-source API for a wide range of automatic feature extraction tasks outside of NLP. This general-purpose Sequence-to-Sequence Regression model predicts a sequence of multidimensional vectors from previous observations. The system studied here is the Plouffe Graph, a graph by Canadian mathematician Simon Plouffe (1974-1979). More information about the Plouffe Graph can be found in "Times Tables, Mandelbrot and the Heart of Mathematics".
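To make "predict a sequence of multidimensional vectors based on previous observations" concrete, here is a minimal sketch of how a sequence could be split into observed inputs and regression targets. It is not taken from this repository: the function name `make_windows`, its parameters, and the toy data are assumptions made for illustration only.

```python
def make_windows(sequence, input_len, target_len):
    """Split a sequence of vectors into (input, target) pairs.

    Each pair holds `input_len` consecutive vectors as the observed input
    and the `target_len` vectors that follow them as the regression target.
    """
    pairs = []
    last_start = len(sequence) - input_len - target_len
    for start in range(last_start + 1):
        split = start + input_len
        pairs.append((sequence[start:split], sequence[split:split + target_len]))
    return pairs


# A toy sequence of 2-dimensional vectors (hypothetical data).
toy_sequence = [(float(t), float(t * t)) for t in range(6)]
pairs = make_windows(toy_sequence, input_len=3, target_len=2)
```

A sequence-to-sequence regression model would be trained to map the first element of each pair to the second; the window lengths here are arbitrary.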

Dataset

The Plouffe dataset requires no separate download: a dataset of multidimensional vectors representing the Plouffe Graph is constructed during training. The dataset can be configured in the plouffe.yml file inside the configs folder.
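As a rough illustration of what "multidimensional vectors that represent the Plouffe Graph" could look like, the sketch below places N nodes on the unit circle and joins node i to node (k * i) mod N, which is the usual times-table construction. The encoding of each edge as an (x1, y1, x2, y2) vector, and the names `plouffe_edges` and `plouffe_sequence`, are assumptions for this example and may differ from what the code in this repository builds.

```python
import math


def plouffe_edges(num_nodes, multiplier):
    """Return one (x1, y1, x2, y2) vector per node i, joining node i
    to node (multiplier * i) mod num_nodes on the unit circle."""
    def position(index):
        angle = 2.0 * math.pi * index / num_nodes
        return math.cos(angle), math.sin(angle)

    edges = []
    for i in range(num_nodes):
        j = (multiplier * i) % num_nodes
        edges.append(position(i) + position(j))  # tuple concatenation
    return edges


def plouffe_sequence(num_nodes, multipliers):
    """Build one graph per multiplier, giving a sequence of graphs."""
    return [plouffe_edges(num_nodes, k) for k in multipliers]


# Three graphs with 10 nodes each (values chosen only for illustration).
frames = plouffe_sequence(num_nodes=10, multipliers=[2, 3, 4])
```

Sweeping the multiplier yields a sequence of graphs, which is the kind of ordered data a sequence regression model can consume.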

IPython Notebook

An IPython Notebook for the Seq2Seq Regression model can be found inside the notebooks folder. This notebook complements the paper and walks you through the computational graph. It also provides background on the Plouffe Graph dataset.

To see the interactive graphics of the Seq2Seq Regression model's predictions, download the pre-trained model from this Google Drive link:

https://drive.google.com/open?id=0B86gEeQqfnjtMERTV2tjLWMwNnc

Create a logs directory in the root of the Seq2Seq_PlouffeRainbows folder.

After downloading, move or copy the downloaded lr0002 folder into the logs folder.

Launch IPython Notebook

cd notebooks
jupyter notebook

Note: The default iopub data rate limit is too low for this visualization-heavy project. To raise it, launch the IPython notebook as follows:

jupyter notebook --NotebookApp.iopub_data_rate_limit=10000000000

Installation

The program requires the following dependencies (easy to install using pip, Anaconda or Docker):

python 2.7
tensorflow API (tested with r1.0.0)
numpy
scipy
pandas
matplotlib
jupyter
networkx
tqdm
pyyaml
jupyterthemes
seaborn

Anaconda

Anaconda: Installation

To install DLFractalSequences in an Anaconda environment:

conda env create -f environment.yml

To activate the Anaconda environment:

source activate dlfractals-env

Anaconda: Train

Train the Seq2Seq Regression model on the local machine using the Plouffe dataset:

python train.py -c configs/plouffe.yml

Note: The training inputs (e.g. dataset parameters and hyperparameters) for training on a local machine can be modified in the plouffe.yml file inside the configs folder.
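For readers unfamiliar with the `-c` pattern, the sketch below shows one way a training script could accept a config path and read it with PyYAML (listed in the dependencies above). It is not the repository's train.py: the long option `--config` and the helper names `parse_args` and `load_config` are assumptions, and nothing here reflects the actual keys in plouffe.yml.

```python
import argparse


def parse_args(argv=None):
    """Parse the command line; only the -c option is modelled here."""
    parser = argparse.ArgumentParser(
        description="Train from a YAML config (illustrative sketch).")
    # "--config" is a hypothetical long form of the -c flag used above.
    parser.add_argument("-c", "--config", required=True,
                        help="Path to a YAML config file.")
    return parser.parse_args(argv)


def load_config(path):
    """Read a YAML file into a plain Python dictionary."""
    # PyYAML is imported lazily so argument parsing works without it.
    import yaml
    with open(path) as handle:
        return yaml.safe_load(handle)


# Equivalent to running: python train.py -c configs/plouffe.yml
args = parse_args(["-c", "configs/plouffe.yml"])
```

Calling `load_config(args.config)` would then return the settings as a dictionary, provided the file exists and PyYAML is installed.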

Docker

Docker: Installation

Prerequisites: Docker installed on your machine. If you don't have Docker installed, follow the Docker Setup guide first.

To build the Docker image:

docker build -t dlfractals:latest .

Docker: Train

To deploy and train in a Docker container:

docker run -it dlfractals:latest python train.py -c configs/plouffe.yml

Sharcnet

Use the Shared Hierarchical Academic Research Computing Network (SHARCNET) when you want to run multiple jobs.

Activate the TensorFlow Python 2.7 environment:

source /opt/sharcnet/testing/tensorflow/tensorflow-cp27-active

Note: If a package is missing, install it with:

pip install <missing_pkg> --user

Example:

pip install /opt/sharcnet/testing/tensorflow/tensorflow-1.0.0-cp27-cp27m-linux_x86_64.whl --user

Train multiple jobs using the Seq2Seq Regression model on the Plouffe dataset:

python train_manyjobs.py -c configs/plouffe_sharcnet.yml

Note: The training inputs (e.g. dataset parameters and hyperparameters) for training on a SHARCNET machine can be modified in the YAML config file passed with -c (plouffe_sharcnet.yml in the command above), inside the configs folder. You must set the train option inside the YAML config file to either copper or local when training on SHARCNET.

Future Work

Perform further analysis on the Plouffe Graph. We particularly want to analyze how arithmetic in embedding space corresponds to the group arithmetic in input space, and establish strong baselines in relation to that.
Add libraries that allow more experimentation with attention and external memory.
Explore more datasets (e.g. video sequences) which would leverage the automatic feature extraction functionality of the Seq2Seq Regression model.
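One possible reading of "group arithmetic in input space" in the first item, assuming integer multipliers, is the additive structure of the integers mod N: the times-table map i -> (k * i) mod N sends sums to sums. The sketch below checks that property exhaustively for small N. The function names are hypothetical and this is not analysis code from the repository.

```python
def times_table(multiplier, modulus):
    """Return the list whose i-th entry is (multiplier * i) mod modulus."""
    return [(multiplier * i) % modulus for i in range(modulus)]


def respects_addition(multiplier, modulus):
    """Check that the map sends (a + b) to the sum of the images of a
    and b, with all arithmetic taken mod `modulus`."""
    table = times_table(multiplier, modulus)
    return all(
        table[(a + b) % modulus] == (table[a] + table[b]) % modulus
        for a in range(modulus)
        for b in range(modulus)
    )


table = times_table(2, 5)
holds = respects_addition(2, 10)
```

Whether a learned embedding preserves a comparable structure is the open question the item describes; this check only covers the input side.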
