Artificial Intelligence Toolkit – CRAY

Tensorflow

Introduction:

TensorFlow™ is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them. The flexible architecture allows you to deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device with a single API. TensorFlow was originally developed by researchers and engineers working on the Google Brain Team within Google’s Machine Intelligence research organization for the purposes of conducting machine learning and deep neural networks research, but the system is general enough to be applicable in a wide variety of other domains as well.

Available Version: 1.3.0

Built on: Cray Programming Environment-6.0.4

TensorFlow using Anaconda Python 3.5 (for CPU and GPU):

To load Python, use the command below:

module load anaconda/python3.5

Note: Use the python from this module, not other Anaconda Python versions.

To import TensorFlow and check its version, use the command below:

python -c 'import tensorflow as tf; print(tf.__version__)'
1.3.0

Torch

Introduction:

Torch is a scientific computing framework with wide support for machine learning algorithms that puts GPUs first. It is easy to use and efficient, thanks to an easy and fast scripting language, LuaJIT, and an underlying C/CUDA implementation.

Available Version: 7.0

Built on: Cray Programming Environment-6.0.4

To load Torch, use the command below:

module load torch/7.0

Urika Analytics

Introduction:

Cray Urika-XC is a high-performance big data software stack that is optimized for multiple workflows and runs on the Cray XC series systems. It features a comprehensive analytics software stack to help derive optimal business value from data. In addition, the software stack provides an optimized set of tools for capturing and organizing a wide variety of data types from different sources and for executing a variety of analytic jobs.

Available Version: 1.0

Built on: Cray Programming Environment-6.0.4

Currently available Urika analytics components:

  • Apache™ Spark™ – Spark is a general data processing framework that simplifies developing big data applications. It provides the means for executing batch, streaming, and interactive analytics jobs. In addition to the core Spark components, Urika-XC ships with a number of Spark ecosystem components.
  • Anaconda® Python and R – Anaconda is a distribution of the Python and R programming languages for large-scale data processing, predictive analytics, and scientific computing. It aims to simplify package management and deployment.
  • Dask and Dask Distributed – Dask is a parallel programming library that combines with the numeric Python ecosystem to provide parallel arrays, data frames, machine learning, and custom algorithms.
  • Intel BigDL – BigDL is a distributed deep learning library for Spark that can run directly on top of existing Spark or Apache Hadoop clusters. Deep learning applications can be written as Scala or Python programs.
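As a small illustration of the Dask programming model mentioned above (a minimal sketch, assuming the dask package from the Anaconda distribution is available):

```python
import dask.array as da

# Build a chunked array; the work is split into per-chunk tasks
# that Dask can schedule in parallel.
x = da.arange(10, chunks=5)   # two chunks of 5 elements each
total = (x * 2).sum()         # builds a lazy task graph, computes nothing yet

# compute() triggers execution of the task graph.
print(total.compute())        # 90
```

The same code scales from a laptop to a cluster when a Dask Distributed scheduler is attached.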

To use Urika Analytics, load the modules below:

module load shifter

module load analytics

Users are not allowed to run analytics jobs in interactive mode. Use the aprun command below in your batch script:

aprun -n 24 -N 24 -b shifter --image=analytics-1.00.0000.201708011334_0018-Urika-XC-1.0-latest <executable> <options>
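Putting these steps together, a batch script might look like the following. This is a sketch only; the PBS-style scheduler directives, resource request, and walltime are assumptions — adapt them to this system's batch system:

```shell
#!/bin/bash
# Hypothetical batch script for an Urika Analytics job.
#PBS -l select=1:ncpus=24
#PBS -l walltime=01:00:00

# Load the required modules.
module load shifter
module load analytics

# Launch the analytics executable inside the Urika-XC container image.
aprun -n 24 -N 24 -b shifter \
    --image=analytics-1.00.0000.201708011334_0018-Urika-XC-1.0-latest \
    <executable> <options>
```

Replace <executable> and <options> with the analytics application and its arguments.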

For more details on running different applications and commands, see the link below:
http://www.serc.iisc.in/facilities/wp-content/uploads/2017/11/Urika-XC_Analytic_Applications_Guide_10UP00_S-2589.pdf

For any problems with this software, contact helpdesk.serc@auto.iisc.ac.in or the system administrators in room no. 103, SERC.