
[WIP] Use MKL method to implement a 'lazy' gradient zero #1561

Open

frankfzw wants to merge 9 commits into base: master from frankfzw:zeros
Conversation

frankfzw (Contributor) commented Sep 14, 2017

What changes were proposed in this pull request?

  1. Remove the tensor.fill(0) from zeroGradParameters. Instead, only a flag named zeroGradFlag is set to true, and the method returns immediately.
  2. The actual clearing of the gradient parameters is deferred to accGradParameters: the MKL math functions are called with different parameters (beta = 0 vs. beta = 1) depending on zeroGradFlag, as shown in the sketch below.
  3. zeroGradFlag is reset to false after each accGradParameters call.
  4. Add corresponding unit tests.
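
The trick hinges on the BLAS-style updates BigDL's tensors already use: addmv computes this = beta * this + alpha * mat * vec, so passing beta = 0 overwrites the stale gradient in the same pass that computes the new one. A minimal sketch of the pattern, a simplified module reusing the flag and names from this PR's diffs (gradientBiasMT, onesBatch, ev), not the exact patch:

```scala
// Sketch only: the lazy-zero pattern proposed in this PR, module simplified.
private var zeroGradFlag = false

override def zeroGradParameters(): Unit = {
  zeroGradFlag = true // O(1): no tensor.fill(0) over the gradient buffers
}

override def accGradParameters(input: Tensor[T], gradOutput: Tensor[T]): Unit = {
  // addmv computes: gradBias = beta * gradBias + alpha * mat * vec.
  // beta = 0 overwrites the stale gradient (the deferred zeroing);
  // beta = 1 accumulates into it as before.
  val beta = if (zeroGradFlag) ev.zero else ev.one
  gradBias.addmv(beta, ev.fromType(1.0), gradientBiasMT.t, onesBatch)
  zeroGradFlag = false // the flag is consumed by the first accumulation
}
```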

How was this patch tested?

Unit tests.

frankfzw (Contributor, Author) commented Sep 14, 2017

@yiheng Please take a look at this.

@@ -793,13 +813,22 @@ class SpatialConvolution[T: ClassTag](
val sWFloat = ev.toType[Float](scaleW)
val sBFloat = ev.toType[Float](scaleB)
val gradBFloat = gradBias.asInstanceOf[Tensor[Float]]
val update = if (zeroGradFlag) {

yiheng (Contributor) reviewed Sep 25, 2017:

Do not use a function object in compute-intensive code.
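
To illustrate what this comment warns about (a sketch, not the PR's actual code): choosing a function object up front sends every element update through a closure's apply(), which the JIT may not inline, whereas hoisting the branch keeps the hot loop a plain array loop:

```scala
// Illustrative anti-pattern: per-element dispatch through a function object.
def accGradSlow(grad: Array[Float], delta: Array[Float], zeroGradFlag: Boolean): Unit = {
  val update: (Float, Float) => Float =
    if (zeroGradFlag) (_, d) => d else (g, d) => g + d
  var i = 0
  while (i < grad.length) { grad(i) = update(grad(i), delta(i)); i += 1 }
}

// Preferred: branch once outside the loop; each body is a monomorphic array op.
def accGradFast(grad: Array[Float], delta: Array[Float], zeroGradFlag: Boolean): Unit = {
  if (zeroGradFlag) {
    var i = 0; while (i < grad.length) { grad(i) = delta(i); i += 1 }
  } else {
    var i = 0; while (i < grad.length) { grad(i) += delta(i); i += 1 }
  }
}
```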

@@ -471,9 +471,14 @@ class SpatialConvolution[T: ClassTag](
val gradView = gradWeightMMInBatch.view(batchSize,
nOutputPlane * nInputPlane * kernelH * kernelW / nGroup).t
val grad = gradWeight.view(nOutputPlane * nInputPlane * kernelH * kernelW / nGroup)
grad.addmv(ev.fromType(1.0), ev.fromType(1.0), gradView, onesBatch)
val beta = if (zeroGradFlag) {
ev.fromType(0.0)

yiheng (Contributor) reviewed Sep 25, 2017:

ev.zero

val beta = if (zeroGradFlag) {
ev.fromType(0.0)
} else {
ev.fromType(1.0)

yiheng (Contributor) reviewed Sep 25, 2017:

ev.one
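
For readers following the diff: the beta switch works because addmv implements the BLAS gemv contract, y = beta * y + alpha * A * x. A tiny standalone demo of the two cases, assuming BigDL's Tensor API and the addmv(beta, alpha, mat, vec) overload used above:

```scala
import com.intel.analytics.bigdl.tensor.Tensor

val A = Tensor[Float](2, 2).fill(1f)
val x = Tensor[Float](2).fill(1f)
val y = Tensor[Float](2).fill(5f) // stale gradient left over from a previous pass

y.addmv(0f, 1f, A, x) // beta = 0: y becomes [2, 2]; the stale 5s are overwritten
y.addmv(1f, 1f, A, x) // beta = 1: y becomes [4, 4]; normal accumulation
```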

if (withBias) {
gradBias.addmv(ev.fromType(1.0), ev.fromType(1.0), gradientBiasMT.t, onesBatch)
gradBias.addmv(beta, ev.fromType(1.0), gradientBiasMT.t, onesBatch)

yiheng (Contributor) reviewed Sep 25, 2017:

ev.one

}

if (withBias && scaleB != 0) {
gradBias.addmv(ev.fromType[Double](scaleB), gradOutput.t, addBuffer)
if (zeroGradFlag) {
gradBias.addmv(ev.fromType[Double](0.0),

yiheng (Contributor) reviewed Sep 25, 2017:

ev.zero

}
}
else if (input.dim() == 2) {
if (scaleW != 0) {
gradWeight.addmm(ev.fromType[Double](scaleW), gradOutput.t, input)
if (zeroGradFlag) {
gradWeight.addmm(ev.fromType[Double](0.0),

yiheng (Contributor) reviewed Sep 25, 2017:

ev.zero

@@ -141,20 +141,38 @@ class Linear[T: ClassTag](

if (input.dim() == 1) {
if (scaleW != 0) {
gradWeight.addr(ev.fromType[Double](scaleW), gradOutput, input)
if (zeroGradFlag) {
gradWeight.addr(ev.fromType[Double](0.0), gradOutput, ev.fromType[Double](scaleW), input)

yiheng (Contributor) reviewed Sep 25, 2017:

ev.zero

}
}
else if (input.dim() == 2) {
if (scaleW != 0) {
gradWeight.addmm(ev.fromType[Double](scaleW), gradOutput.t, input)
if (zeroGradFlag) {
gradWeight.addmm(ev.zero,

frankfzw (Author, Contributor) commented Sep 26, 2017:

It seems this operation slows down the backward pass of Linear. @yiheng

yiheng (Contributor) commented Sep 26, 2017:

Why? Can you dig deeper? The two variants call the same underlying method, differing in only one parameter.

yiheng (Contributor) commented Sep 26, 2017:

You may need to run a micro-benchmark on this method.
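
A minimal micro-benchmark along the lines suggested; a sketch under assumptions (BigDL's Tensor API, the addmm(beta, alpha, mat1, mat2) overload from the diff, arbitrary shapes), with warm-up iterations so the JIT settles and timings that are indicative only:

```scala
import com.intel.analytics.bigdl.tensor.Tensor

def time(label: String)(body: => Unit): Unit = {
  (1 to 10).foreach(_ => body) // warm-up
  val start = System.nanoTime()
  (1 to 100).foreach(_ => body)
  println(s"$label: ${(System.nanoTime() - start) / 100 / 1000} us/call")
}

val gradWeight = Tensor[Float](512, 512).rand()
val gradOutput = Tensor[Float](64, 512).rand()
val input      = Tensor[Float](64, 512).rand()

time("accumulate (beta = 1)") { gradWeight.addmm(1f, 1f, gradOutput.t(), input) }
time("overwrite  (beta = 0)") { gradWeight.addmm(0f, 1f, gradOutput.t(), input) }
```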

@frankfzw frankfzw force-pushed the frankfzw:zeros branch from 248b29f to 5e2101b Oct 24, 2017
@frankfzw frankfzw force-pushed the frankfzw:zeros branch from 8fe7cbf to 5e2101b Oct 31, 2017
@frankfzw frankfzw force-pushed the frankfzw:zeros branch from 5e2101b to d67b640 Nov 2, 2017