Ashish Agarwal’s scientific contributions

What is this page?


This page lists works of an author who doesn't have a ResearchGate profile or hasn't added the works to their profile yet. It is automatically generated from public (personal) data to further our legitimate goal of comprehensive and accurate scientific recordkeeping. If you are this author and want this page removed, please let us know.

Publications (5)


Auto-Vectorizing TensorFlow Graphs: Jacobians, Auto-Batching And Beyond
  • Preprint

March 2019 · 92 Reads

Ashish Agarwal · Igor Ganichev

We propose a static loop-vectorization optimization on top of the high-level dataflow IR used by frameworks like TensorFlow. A new statically vectorized parallel-for abstraction is provided on top of TensorFlow and used for applications ranging from auto-batching and per-example gradients to Jacobian computation, optimized map functions, and input-pipeline optimization. We report large speedups compared to both loop-based implementations and the run-time batching adopted by the DyNet framework.
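The core idea of the paper can be sketched without TensorFlow: a per-example computation written as an explicit loop is rewritten into a single batched operation over the whole batch. The NumPy code below is only an illustration of that transformation; the names `W` and `X` are invented for this sketch and are not the paper's API.

```python
# Sketch of loop vectorization: the same per-example computation, written
# first as an explicit Python loop and then as one batched operation.
import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((4, 3))      # shared weights
X = rng.standard_normal((8, 3))      # batch of 8 examples, 3 features each

# Loop form: one matrix-vector product per example, results stacked.
loop_out = np.stack([W @ x for x in X])

# Vectorized form: the loop is replaced by a single batched matmul.
vec_out = X @ W.T

assert np.allclose(loop_out, vec_out)
```

The static rewrite in the paper performs this kind of loop-to-batched-op conversion on the dataflow graph itself, so the per-iteration ops never execute one at a time.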


TensorFlow Eager: A Multi-Stage, Python-Embedded DSL for Machine Learning

February 2019 · 157 Reads

Akshay Agrawal · Akshay Naresh Modi · Alexandre Passos · [...] · Shanqing Cai

TensorFlow Eager is a multi-stage, Python-embedded domain-specific language for hardware-accelerated machine learning, suitable for both interactive research and production. TensorFlow, which TensorFlow Eager extends, requires users to represent computations as dataflow graphs; this permits compiler optimizations and simplifies deployment but hinders rapid prototyping and run-time dynamism. TensorFlow Eager eliminates these usability costs without sacrificing the benefits furnished by graphs: It provides an imperative front-end to TensorFlow that executes operations immediately and a JIT tracer that translates Python functions composed of TensorFlow operations into executable dataflow graphs. TensorFlow Eager thus offers a multi-stage programming model that makes it easy to interpolate between imperative and staged execution in a single package.
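The multi-stage model described in the abstract (immediate imperative execution plus a tracer that stages Python functions into reusable graphs) can be illustrated with a toy tracer in plain Python. This mimics only the shape of the mechanism; the `Sym` class and `trace` function are invented for this sketch and are not TensorFlow Eager's API.

```python
# Toy multi-stage execution: the same Python function runs imperatively,
# or is "staged" by tracing it with a symbolic value into a replayable graph.

class Sym:
    """Symbolic value that records the ops applied to it."""
    def __init__(self, ops=None):
        self.ops = ops or []
    def __add__(self, c):
        return Sym(self.ops + [("add", c)])
    def __mul__(self, c):
        return Sym(self.ops + [("mul", c)])

def trace(fn):
    """Stage fn: run it once on a symbolic input to capture its op
    sequence, then return a function that replays that graph."""
    graph = fn(Sym()).ops
    def compiled(x):
        for op, c in graph:
            x = x + c if op == "add" else x * c
        return x
    return compiled

def f(x):
    return x * 3 + 1

assert f(2) == 7          # imperative: ops execute immediately
staged = trace(f)         # staged: graph is built once...
assert staged(2) == 7     # ...and replayed on concrete inputs
```

Interpolating between the two modes, as the abstract describes, amounts to choosing per call site whether to invoke `f` directly or through its traced form.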


Figure 1: Example TensorFlow code fragment
Figure 5: Gradients computed for graph in Figure 2
Figure 6: Before and after graph transformation for partial execution
Figure 10: TensorBoard graph visualization of a convolutional neural network model
Figure 12: EEG visualization of multi-threaded CPU operations (x-axis is time in µs)

TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems
  • Article
  • Full-text available

March 2016 · 9,805 Reads · 15,040 Citations

TensorFlow is an interface for expressing machine learning algorithms, and an implementation for executing such algorithms. A computation expressed using TensorFlow can be executed with little or no change on a wide variety of heterogeneous systems, ranging from mobile devices such as phones and tablets up to large-scale distributed systems of hundreds of machines and thousands of computational devices such as GPU cards. The system is flexible and can be used to express a wide variety of algorithms, including training and inference algorithms for deep neural network models, and it has been used for conducting research and for deploying machine learning systems into production across more than a dozen areas of computer science and other fields, including speech recognition, computer vision, robotics, information retrieval, natural language processing, geographic information extraction, and computational drug discovery. This paper describes the TensorFlow interface and an implementation of that interface that we have built at Google. The TensorFlow API and a reference implementation were released as an open-source package under the Apache 2.0 license in November 2015 and are available at www.tensorflow.org.
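The "express the computation, then execute it" model in the abstract can be sketched as a minimal dataflow-graph interpreter: nodes name their op and inputs, the graph is built up front, and a separate step evaluates a requested output. All names below (`graph`, `run`) are invented for this illustration and are not TensorFlow's API.

```python
# Minimal dataflow-graph sketch: y = w * x + b expressed as a graph of
# named nodes (op kind, constant value, input names), then executed.
graph = {
    "x":  ("const", 2.0, ()),
    "w":  ("const", 3.0, ()),
    "b":  ("const", 1.0, ()),
    "wx": ("mul", None, ("w", "x")),
    "y":  ("add", None, ("wx", "b")),
}

def run(graph, fetch):
    """Evaluate the fetched node, recursively resolving its inputs."""
    cache = {}
    def eval_node(name):
        if name in cache:
            return cache[name]
        kind, value, inputs = graph[name]
        if kind == "const":
            out = value
        else:
            a, b = (eval_node(i) for i in inputs)
            out = a * b if kind == "mul" else a + b
        cache[name] = out
        return out
    return eval_node(fetch)

assert run(graph, "y") == 7.0   # y = w * x + b = 3 * 2 + 1
```

Separating graph construction from execution is what lets a real system place nodes on different devices, optimize the graph, and run it with little or no change across heterogeneous hardware.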



TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems

January 2015 · 3,968 Reads · 8,491 Citations


Citations (3)


... The generator employs three-dimensional convolutional layers and leaky rectified linear unit (LReLU) activation functions. LReLUs are chosen for their enhanced computational efficiency compared to the standard rectified linear unit (ReLU) [51]. ...

Reference:

Parallel Implementation and Performance of Super-Resolution Generative Adversarial Network Turbulence Models for Large-Eddy Simulation
TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems
  • Citing Article
  • January 2016


TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems
  • Citing Technical Report
  • January 2015

... The classification of the data under analysis was based on what was taught by the training bank, and was collected following the criteria described for each category. We used the TensorFlow (Abadi et al. 2016), Matplotlib (Hunter 2007), Numpy (Harris et al. 2020), PIL (Clark 2015), and Keras (Chollet et al. 2015) packages. ...

TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems