LBANN: Livermore Big Artificial Neural Network Toolkit

The Livermore Big Artificial Neural Network toolkit (LBANN) is an open-source, HPC-centric, deep learning training framework that is optimized to compose multiple levels of parallelism.

LBANN provides model-parallel acceleration through domain decomposition to optimize for strong scaling of network training. It also allows for composition of model parallelism with both data parallelism and ensemble training methods for training large neural networks with massive amounts of data. LBANN is able to take advantage of tightly coupled accelerators, low-latency high-bandwidth networking, and high-bandwidth parallel file systems.
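As a rough illustration of how model parallelism and data parallelism can be composed, the sketch below assigns each rank a data-parallel group (which samples it sees) and a spatial shard (which part of each sample it holds). The function `rank_layout` and the parameter `shards_per_sample` are hypothetical names chosen for this example; they are not part of LBANN or DistConv, and the real frameworks may lay out ranks differently.

```python
def rank_layout(world_size, shards_per_sample):
    """Map each rank to (data_parallel_group, spatial_shard).

    Hypothetical helper: ranks in the same group cooperate on the same
    samples (model parallelism); different groups process different
    samples (data parallelism).
    """
    if world_size % shards_per_sample != 0:
        raise ValueError("world_size must be divisible by shards_per_sample")
    return {rank: divmod(rank, shards_per_sample) for rank in range(world_size)}


# Example: 8 ranks, each sample split across 4 ranks, giving 2 groups.
layout = rank_layout(world_size=8, shards_per_sample=4)
for rank, (group, shard) in layout.items():
    print(f"rank {rank}: data-parallel group {group}, spatial shard {shard}")
```

With a fixed global batch size, raising `shards_per_sample` spreads each sample over more ranks, which is the strong-scaling direction described above.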

DistConv Repository

The DistConv repository contains a rewrite of the original DistConv algorithm, which was published using the LBANN C++ core, reimplemented using PyTorch 2.x DTensor objects.
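The idea behind distributing a convolution by domain decomposition can be sketched in plain Python, without PyTorch or DTensor. Each shard holds a contiguous slice of the input and borrows a few boundary values (a halo) from its neighbors before convolving locally. This is a minimal single-process sketch under simplifying assumptions (1D input, odd kernel length, zero padding, equal-sized shards); the function names are hypothetical and do not reflect this repository's API.

```python
def conv1d_same(x, w):
    """Reference: 1D cross-correlation with zero padding ('same' output size)."""
    r = len(w) // 2
    padded = [0.0] * r + list(x) + [0.0] * r
    return [sum(w[j] * padded[i + j] for j in range(len(w))) for i in range(len(x))]


def conv1d_sharded(x, w, num_shards):
    """Split x into contiguous shards, attach halos, and convolve each shard."""
    if len(w) % 2 == 0:
        raise ValueError("this sketch assumes an odd kernel length")
    if len(x) % num_shards != 0:
        raise ValueError("this sketch assumes equal-sized shards")
    r = len(w) // 2                  # halo width on each side
    size = len(x) // num_shards      # elements owned by each shard
    if size < r:
        raise ValueError("each shard must be at least as wide as the halo")
    shards = [list(x[p * size:(p + 1) * size]) for p in range(num_shards)]

    out = []
    for p, local in enumerate(shards):
        # "Halo exchange": copy boundary values from neighboring shards,
        # or use zeros at the outer edges of the global domain.
        left = shards[p - 1][size - r:] if p > 0 else [0.0] * r
        right = shards[p + 1][:r] if p < num_shards - 1 else [0.0] * r
        ext = left + local + right
        # Local convolution needs no further padding once halos are attached.
        out.extend(sum(w[j] * ext[i + j] for j in range(len(w))) for i in range(size))
    return out


x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
w = [0.25, 0.5, 0.25]
assert conv1d_sharded(x, w, num_shards=4) == conv1d_same(x, w)
```

In a real distributed setting the halo copies would be point-to-point communication between devices, and the cost of that exchange relative to local compute is what limits strong scaling.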

Publications

Yosuke Oyama, Naoya Maruyama, Nikoli Dryden, Erin McCarthy, Peter Harrington, Jan Balewski, Satoshi Matsuoka, Peter Nugent, Brian Van Essen. "The Case for Strong Scaling in Deep Learning: Training Large 3D CNNs with Hybrid Parallelism", under review for Special Session on Parallel and Distributed Computing Techniques for AI, ML and DL in Transactions on Parallel and Distributed Systems, July 2020.
- arXiv.org/abs/2007.12856

Nikoli Dryden, Naoya Maruyama, Tom Benson, Tim Moon, Marc Snir, Brian Van Essen. "Channel and Filter Parallelism for Large-Scale CNN Training", in SC '19: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, November 2019, Article No. 10, Pages 1-20, DOI: 10.1145/3295500.3356207.

Nikoli Dryden, Naoya Maruyama, Tom Benson, Tim Moon, Marc Snir, Brian Van Essen. "Improving Strong-Scaling of CNN Training by Exploiting Finer-Grained Parallelism", in Proceedings of IEEE International Parallel & Distributed Processing Symposium, 2019.
- arXiv.org/abs/1903.06681

    @INPROCEEDINGS{8820780,
      author={N. {Dryden} and N. {Maruyama} and T. {Benson} and T. {Moon} and M. {Snir} and B. {Van Essen}},
      booktitle={2019 IEEE International Parallel and Distributed Processing Symposium (IPDPS)},
      title={Improving Strong-Scaling of {CNN} Training by Exploiting Finer-Grained Parallelism},
      year={2019},
      volume={},
      number={},
      pages={210-220},
      doi={10.1109/IPDPS.2019.00031}}

A complete list of LBANN-related publications, presentations, and posters is available here.

Reporting issues

Issues, questions, and bugs can be raised on the GitHub issue tracker.
