Batched Outer Sum (linear-layer weight gradient) [medium]

The v1 outer product takes a single pair of vectors. The operation that appears frequently in neural-network backprop is:

$\nabla_W L = \sum_{n=1}^{N} \delta^{(n)} \otimes x^{(n)}$

That is, the sum of outer products over the whole batch. This operation is the weight gradient of the linear layer $y = Wx + b$.
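As a short derivation sketch of why an outer product shows up here, assume $\delta^{(n)}$ denotes $\partial L / \partial y^{(n)}$, the gradient of the loss with respect to the layer output for sample $n$ (the text implies this but does not state it). Writing the layer component-wise and applying the chain rule gives:

$y^{(n)}_o = \sum_i W_{oi}\, x^{(n)}_i + b_o \quad\Rightarrow\quad \frac{\partial L}{\partial W_{oi}} = \sum_{n=1}^{N} \frac{\partial L}{\partial y^{(n)}_o}\, \frac{\partial y^{(n)}_o}{\partial W_{oi}} = \sum_{n=1}^{N} \delta^{(n)}_o\, x^{(n)}_i$

The $(o, i)$ entry of $\delta^{(n)} \otimes x^{(n)}$ is exactly $\delta^{(n)}_o x^{(n)}_i$, so summing over $n$ recovers the formula above.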

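With the samples stacked as rows, so that `D` has shape (N, out_features) and `X` has shape (N, in_features), the sum can be computed in two equivalent ways.

loop: `sum(np.outer(D[n], X[n]) for n in range(N))`.
matrix product: `D.T @ X`. Shape check:
- `D.T` (out_features, N)
- `X` (N, in_features)
- result (out_features, in_features), which matches the shape of the weight matrix.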
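The loop and matrix-product forms above can be compared directly. The following is a minimal sketch, assuming `X` and `D` are 2-D NumPy arrays with one row per sample; the sizes and the random seed are arbitrary illustration values.

```python
import numpy as np

rng = np.random.default_rng(0)
N, in_features, out_features = 4, 3, 2
X = rng.standard_normal((N, in_features))   # inputs, one row per sample
D = rng.standard_normal((N, out_features))  # upstream gradients, one row per sample

# Loop form: accumulate one outer product per sample.
grad_loop = np.zeros((out_features, in_features))
for n in range(N):
    grad_loop += np.outer(D[n], X[n])

# Matrix form: a single product that contracts the batch axis.
grad_matmul = D.T @ X

print(grad_matmul.shape)                    # (2, 3)
print(np.allclose(grad_loop, grad_matmul))  # True
```

The two results agree up to floating-point rounding, which is why the comparison uses `np.allclose` rather than exact equality.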

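The two forms are equivalent because each batch sample contributes one outer product and the contributions are summed over n. The same contraction in Einstein notation is `np.einsum('no,ni->oi', D, X)`.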
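To make the index pattern `'no,ni->oi'` concrete, the sketch below spells out the contraction as explicit loops and compares it with `np.einsum`. The array sizes and the name `grad_index` are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
N, n_in, n_out = 5, 3, 2
X = rng.standard_normal((N, n_in))
D = rng.standard_normal((N, n_out))

# einsum: the repeated index n is summed away, o and i are kept.
grad_einsum = np.einsum('no,ni->oi', D, X)

# The same contraction, written index by index:
# grad[o, i] = sum over n of D[n, o] * X[n, i]
grad_index = np.zeros((n_out, n_in))
for n in range(N):
    for o in range(n_out):
        for i in range(n_in):
            grad_index[o, i] += D[n, o] * X[n, i]

print(np.allclose(grad_einsum, grad_index))  # True
print(np.allclose(grad_einsum, D.T @ X))     # True
```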

Complete the function `linear_weight_grad(X, D)`.
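One possible shape for the function is sketched below. It is not the reference solution: the input validation and the float conversion are assumptions added for illustration, and the small arrays are a made-up example chosen so the result can be checked by hand.

```python
import numpy as np

def linear_weight_grad(X, D):
    """Return sum_n outer(D[n], X[n]), shape (out_features, in_features)."""
    X = np.asarray(X, dtype=float)
    D = np.asarray(D, dtype=float)
    # Assumed validation: both inputs are 2-D and share the batch size N.
    if X.ndim != 2 or D.ndim != 2 or X.shape[0] != D.shape[0]:
        raise ValueError("expected X of shape (N, in) and D of shape (N, out)")
    return D.T @ X

X = np.array([[1.0, 2.0],
              [3.0, 4.0]])        # N=2, in=2
D = np.array([[1.0, 0.0, 2.0],
              [0.0, 1.0, 1.0]])   # N=2, out=3

# By hand:
#   outer(D[0], X[0]) = [[1, 2], [0, 0], [2, 4]]
#   outer(D[1], X[1]) = [[0, 0], [3, 4], [3, 4]]
#   sum               = [[1, 2], [3, 4], [5, 8]]
G = linear_weight_grad(X, D)
print(G)
```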

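| # | Name | Check |
|---|------|-------|
| 1 | Return shape `(out, in)` | |
| 2 | Matches the loop implementation | |
| 3 | Matches einsum | |
| 4 | Numerical example | Verified by hand calculation |
| 5 | Linearity: grad(cD) = c·grad(D) | |
| 6 | Linearity: grad of X1 and X2 stacked (with matching D1, D2) = sum of the individual grads | |
| 7 | Numerical differentiation: for $y = W X^\top$, L = sum(y), matches dL/dW | |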
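Checks 5 to 7 can be tried out with a short script. This is a sketch under stated assumptions, not the grader's actual test code: `linear_weight_grad` is taken to be the matrix-product form from the text, and the sizes, seed, step `eps`, and tolerance are illustrative choices. For check 7, L = sum(y) means every entry of dL/dy is 1, so the analytic gradient uses an all-ones `D`.

```python
import numpy as np

def linear_weight_grad(X, D):
    # Assumed implementation: the matrix-product form from the text.
    return D.T @ X

rng = np.random.default_rng(2)
N, n_in, n_out = 6, 4, 3
X = rng.standard_normal((N, n_in))
D = rng.standard_normal((N, n_out))

# Check 5: scaling D by c scales the gradient by c.
c = 2.5
scale_ok = np.allclose(linear_weight_grad(X, c * D),
                       c * linear_weight_grad(X, D))

# Check 6: stacking two batches adds their gradients.
X1, X2, D1, D2 = X[:3], X[3:], D[:3], D[3:]
stacked = linear_weight_grad(np.vstack([X1, X2]), np.vstack([D1, D2]))
stack_ok = np.allclose(
    stacked, linear_weight_grad(X1, D1) + linear_weight_grad(X2, D2))

# Check 7: central differences on L(W) = sum(W @ X.T).
W = rng.standard_normal((n_out, n_in))

def loss(W_):
    return np.sum(W_ @ X.T)

eps = 1e-6
numeric = np.zeros_like(W)
for o in range(n_out):
    for i in range(n_in):
        E = np.zeros_like(W)
        E[o, i] = eps
        numeric[o, i] = (loss(W + E) - loss(W - E)) / (2 * eps)

analytic = linear_weight_grad(X, np.ones((N, n_out)))  # dL/dy is all ones
fd_ok = np.allclose(numeric, analytic, atol=1e-5)

print(scale_ok, stack_ok, fd_ok)
```

Because this loss is linear in `W`, the central difference is exact up to rounding, so a loose absolute tolerance is enough.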
